Commit graph

7146 commits

Author SHA1 Message Date
Xinya Zhang
77cab7a3a5
[ROCm] Add AveragePool, GlobalAveragePool, MaxPool, GlobalMaxPool Ops (#11968)
* [ROCm] disable expected failure tests PoolTest.MaxPool_10_DilationPadding_?d

* [ROCm] Add AveragePool, GlobalAveragePool, MaxPool, GlobalMaxPool Ops

* (To squash after review) Replace rocm/nn/pool.cc with amd_hipify.py changes

* [ROCM] Replace miCompat with Helper functions

* (to squash) fix the compiling error of SetPoolingNdDescriptorHelper
2022-08-03 14:36:36 -07:00
Erick Muñoz
d1497bdf62
[oneDNN EP] Optimized DynamicQuantizeLinear operator (#12403)
* Removed unnecessary reorders
* Removed unnecessary element-wise clip
2022-08-03 12:36:42 -07:00
Baiju Meswani
7f58bd7236
Perform graph transformations during offline tooling (#12422) 2022-08-03 11:27:12 -07:00
Dmitri Smirnov
dc984a03d5
Container and memory allocation guidelines (#12387)
Container and memory allocation guidelines
  Re-org and add code samples
  Clarify the wording on returning gsl::span
2022-08-03 10:31:59 -07:00
Tianlei Wu
97a340bf48
Fix integer overflow in LongformerAttention (#12435)
fix integer overflow
2022-08-03 10:29:07 -07:00
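The entry for 97a340bf48 above does not say where the overflow occurred, so the following is only a generic sketch of how this class of bug tends to arise: a product of tensor dimensions is computed in 32-bit arithmetic and exceeds the int32 range. The function names and the shape (8, 16, 4096, 4096) are hypothetical and not taken from the LongformerAttention code. Python integers do not overflow, so two's-complement wraparound is simulated explicitly (in C++, signed overflow is undefined behavior; wraparound is merely the commonly observed result).

```python
def wrap_int32(value):
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


def element_count_int32(batch, heads, seq_len):
    """Multiply the dimensions as if every intermediate were an int32."""
    total = 1
    for dim in (batch, heads, seq_len, seq_len):
        total = wrap_int32(total * dim)
    return total


def element_count_wide(batch, heads, seq_len):
    """Same product with arbitrary-precision (i.e. wide enough) integers."""
    return batch * heads * seq_len * seq_len


# Hypothetical shape: 8 * 16 * 4096 * 4096 == 2**31, one past INT32_MAX.
narrow = element_count_int32(8, 16, 4096)
wide = element_count_wide(8, 16, 4096)
print(narrow, wide)
```

The usual remedy for this kind of bug is to widen the operands (for example to a 64-bit or size_t type) before multiplying, not after.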
Changming Sun
44ec2cf088
Update publish-python-apidocs.yml (#12433) 2022-08-03 10:17:00 -07:00
Ye Wang
b622e5fa9b
Support vocab_mask/prefix_vocab_mask/no_repeat_number in greedysearch op (#12327)
* support more inputs for greedy search

* fix docs

* refactor test

* lint

* review comments
2022-08-03 10:10:08 -07:00
Xinya Zhang
01f3a197d7
[ROCm] InstanceNormalization, BatchNormalization and LRN Ops (#11972)
* [ROCm] Add InstanceNormalization Op

* Enable InstanceNormBatch1_fp16 and InstanceNormBatch2_fp16 for ROCm

* [ROCm] Add BatchNormalization for fp32 and fp16

* Enable BatchNormTest for ROCm

* [ROCm] Add LRN Op

* [ROCM] replace miCompat functions with Helper functions
2022-08-02 23:14:26 -07:00
Vincent Wang
99d2a63e1a
Set Fixed Seed for SoftmaxCrossEntropyLoss Related UTs (#12432)
add seed
2022-08-03 13:29:30 +08:00
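As a rough illustration of why commit 99d2a63e1a above fixes the seed, the sketch below shows that a seeded generator yields identical test inputs on every run, which makes a failing unit test reproducible. It uses Python's standard `random` module rather than the ONNX Runtime test utilities; `make_test_inputs` and the seed value 2022 are hypothetical.

```python
import random


def make_test_inputs(seed, count=4):
    """Generate pseudo-random test inputs from a dedicated, seeded generator."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.random() for _ in range(count)]


# Two runs with the same seed see exactly the same inputs.
first = make_test_inputs(seed=2022)
second = make_test_inputs(seed=2022)
print(first == second)
```

Using a private `random.Random` instance instead of the module-level functions keeps one test's seed from affecting other tests.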
George Nash
26dc09417b
[oneDNN ep] matmulinteger postop fusion (#12354)
* MatMulInteger + post op fusion

This fuses MatMulInteger with up to 32 binary/elementwise
operators if running on the oneDNN execution provider.

Signed-off-by: George Nash <george.nash@intel.com>

* Remove the un-needed transformer

The MatMulIntegerToFloat transformer is not needed since
the transform done is handled by the MatMulIntegerBinaryEltwise
transformer code.

Signed-off-by: George Nash <george.nash@intel.com>

* Refactor of the post op transformer code

This separates the code that finds the post op
nodes for MatMul and MatMulInteger to reduce code
repetition.

Signed-off-by: George Nash <george.nash@intel.com>

* Minor cleanup based on cpplint

resolved unused-variable build failure

Signed-off-by: George Nash <george.nash@intel.com>
2022-08-02 20:42:34 -07:00
Changming Sun
5d610bc8eb
Disable CG task in PR pipelines (#12426) 2022-08-02 19:01:41 -07:00
Yulong Wang
feed5da435
[js] loosen test timeout (#12427)
Loosen the following test timeouts:

1. "Test Web Multi-Browsers" stage in "ONNX Runtime Web CI Pipeline": 30min -> 60min
2. Node.js binding default per-case timeout: 30 sec -> 90 sec
2022-08-02 19:01:19 -07:00
smrkatte
54d5e86981
Add cast before copy for dissimilar scalar type (#12391)
* Add proper cast/copy callflow for ORT and non-ORT devices
2022-08-02 18:32:58 -07:00
Yulong Wang
c9e0d0f8b6
[js/node] upgrade terser version (#12351) 2022-08-02 15:50:44 -07:00
Changming Sun
1a64b94f60
Fix a small issue in nuget packaging pipeline (#12405)
In #12358 I typed a wrong path in the yaml file.
2022-08-02 15:44:43 -07:00
Dmitri Smirnov
eebaf5f270
Adjust and fix abseil-cpp debugging visualization (#12415)
Move abseil-cpp.natvis file, add it to PDB, adjust visualization
2022-08-02 15:08:17 -07:00
shalvamist
ca6b4221fe
[js] Bug fix - permission issue with ensureSymlinkSync (#12369)
Using ensureSymlinkSync with the 'dir' type might cause permission issues; changed to 'junction' to avoid this.
If the folder generation fails, it will cause the test to fail as well.
2022-08-02 12:21:31 -07:00
Chi Lo
b39257a5e6
Enable support of multi-level nested control flow ops model for TRT EP (#12147)
* Make multiple-level nested control flow op model work

* find correct input index

* find correct input index (cont.)

* enable nested layer unit tests for TRT EP

* add comment

* add Scan op to current workaround support of control flow op
2022-08-01 23:57:30 -07:00
Chi Lo
de3a91d85d
Revert TRT EP cache refactoring (#12376)
* revert cache refactor

* fix conflicts when reverting
2022-08-01 23:57:05 -07:00
Yi Zhang
5d1173fe68
Run IOS pipeline concurrently (#12400)
split ios pipelines
2022-08-02 11:07:17 +08:00
Yi Zhang
63d64636f6
Add the comment linking to wiki (#12398)
add the comment
2022-08-02 10:09:16 +08:00
LironKesem
315e006532
adding a comment on nll_loss_forward.output that can not be implemented (#12406)
adding a comment on nll_loss_forward.output that can not be implemented
2022-08-01 19:12:35 -04:00
msftlincoln
62922f4c3c
Eager Mode generator: add comments, rename functions (#12385)
* eager generator: add comments, rename functions

* lint
2022-08-01 15:52:47 -04:00
Edward Chen
f77ab4fea6
Manually add optimization flag for Android Release builds. (#12390)
With recent versions of NDK (since 23), the `-O` optimization level compile flag is not being passed when building in the "Release" configuration.
More details here: https://github.com/android/ndk/issues/1740

Our "Release" Android builds have been built without the optimization flag since we upgraded from NDK 21.

This change is a workaround to manually add `-O3` for "Release" Android builds.
2022-08-01 12:49:03 -07:00
George Wu
6bb807ef74
add cuda compute 8.7 to Cmakelists.txt to support Nvidia Orin devices (#12377)
* add cuda arch 8.7 to cmakelists.txt to support Nvidia Orin devices

* add cuda version >= 11 check for orin support
2022-08-01 09:45:58 -07:00
Cheng
3f66297499
code clean (#12392)
* code clean

* misspelling fix
2022-08-01 14:12:35 +08:00
Valery Chernov
1a4868e5c4
[TVM EP] Hot fix of build on Windows of TVM EP with ipp-crypto (#12381)
fix of build on Windows with ipp-crypto. cmake warnings fix

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-31 14:36:54 +02:00
Yi Zhang
8b4ad77ea2
pipeline can use last run's artifacts (#12379)
* first step

* depends on stage

* temp change

* specific

* runId

* parameters

* fix typo

* fix typo

* add nnapi

* add nnapi

* fix typo

* minor fix

* condition on stage

* format

* format
2022-07-30 21:34:57 +08:00
pengwa
6d1eb9509e
Refine gradient accumulation (on device training) (#12363)
* a

(cherry picked from commit 43909cdd6e3daf30a82d584292286806d1172a0b)

* optimize inplace accumulator a bit

* fix inputs

* revert logging

* minor fix

* tune perf and resolve comments

* typo

* fix

* fix tests

* move threshold to constexpr.
2022-07-30 10:24:01 +08:00
Changming Sun
7b4ce0c1e1
Delete the build scripts that were copied from manylinux project (#12358)
1. Delete the build scripts that were copied from manylinux project. Use "git checkout" instead.
2. Update manylinux version to get python 3.11. Related issue: Python 3.11 support #12343
3. Change the CUDA version of the Linux GPU build job in the nuget packaging pipeline from CUDA 11.4 to CUDA 11.6 to match the TRT job within the same pipeline. (A lot of other places need to be updated as well, but I'd prefer to put them in another PR.)
4. Make dockerfile names static. For example, replace tools/ci_build/github/linux/docker/$(DockerFile) with tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cpu. The former relies on a runtime variable, $(DockerFile), but template parameters are expanded early in the processing of a pipeline run, when most variables are not yet available. It is like C++ macros vs. variables.
2022-07-29 18:24:19 -07:00
Hariharan Seshadri
d5a1c01b38
Add C++ Session ctor taking model bytes and OrtPrepackedWeightsContainer (#12333) 2022-07-29 12:32:43 -07:00
Nat Kershaw (MSFT)
df8dd41a8e
Automatically run workflows to generate API docs PRs (#11749) 2022-07-29 10:24:59 -07:00
msftlincoln
9559d25da9
ORT Eager Mode Generator - make smaller functions (#12371)
These changes resulted in no change to the generated outputs ort_aten.g.cpp and ort_customops.g.cpp.
2022-07-29 10:12:34 -04:00
Changming Sun
f1a04078e9
Update CODEOWNERS to add a line for yaml files (#12378) 2022-07-29 07:05:41 -07:00
Yateng Hong
c579497134
Fix TRT custom op issue (#12283)
* Pass schema registry on CreateModel.

* Fix ORT_MINIMAL_BUILD.

* Fix build issue.
2022-07-29 03:39:56 -07:00
pengwa
6514069749
Make memory profiler work with multiple session runs. (#12317)
* make memory profiler work with multiple session runs.

(cherry picked from commit 5b636b4dd6fe91b75c063696dc73eda33ec36c8d)

* minor fix

* fix build

* fix Windows build

* 1. fix cpplint issues;
2. give a unique file name to each session profiler result.
2022-07-29 18:36:31 +08:00
ytaous
e4bd41fb3b
[ROCm] Enable Einsum for inferencing perf (#12360)
* enable einsum

* address comments

* comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2022-07-28 20:26:25 -07:00
Justin Stoecker
9c0fa65110
Scope CreateFileMapping2 to valid API partitions (#12374) 2022-07-28 16:47:36 -07:00
Scott McKay
0e85af6990
Add MAUI csharp\sample\InferenceSample\ project (#12356)
Add csharp\sample\InferenceSample\Microsoft.ML.OnnxRuntime.InferenceSample.Maui so we have an equivalent setup for MAUI as for the other platforms.

This provides a setup to do some basic local testing of using an InferenceSession in a MAUI app.
2022-07-29 07:22:36 +10:00
sumitsays
805aa297fc
Remove preview keyword from DirectML package (#12368)
Remove preview keyword

Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
2022-07-28 14:18:58 -07:00
Jian Chen
7a7e372b9f
Remove training cuda 10.2 pipeline (#12347)
* update to 2022

* Update the VS version

* Rolling back to gcc 10

* Rolling back

* Update cuda home

* remove "CMAKE_CUDA_ARCHITECTURES=52"

* update cuda architecture to 70

* Delete cuda 10.2 training pipeline

* rolling back a mistake

* Update win-gpu-reduce-op-ci-pipeline.yml

* Update win-gpu-reduce-op-ci-pipeline.yml

* Update win-gpu-reduce-op-ci-pipeline.yml

* Delete tools/ci_build/github/linux/docker/scripts/training/ortmodule/stage1/requirements_torch1.10.0_cu10.2 directory

* Delete tools/ci_build/github/linux/docker/scripts/training/ortmodule/stage1/requirements_torch1.11.0_cu10.2 directory
2022-07-28 14:58:17 -04:00
Edward Chen
6e892a95b4
Use specific Android NDK version in CI builds. (#12350)
Current builds use an NDK version that happens to be on the build machine. The build machine environment may change in ways that are outside of our control.
This change installs a specific version of NDK (the current LTS version 25.0.8775105) and uses it.
2022-07-28 11:01:04 -07:00
Chen Fu
73919a6756
Convert DQ node with const weight tensor int8 to uint8 (#12331)
Convert DQ node with const weight tensor int8 to uint8

This is a follow-up to #12088, which converted weight tensors from int8 to uint8. Here we do the same thing in the DequantizeLinear node, so that we don't have to make the same changes for every single future operator.
2022-07-28 09:22:04 -07:00
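The int8-to-uint8 conversion described in commit 73919a6756 above can be illustrated with a minimal sketch. It assumes the standard linear dequantization formula, x = (q - zero_point) * scale, and assumes the conversion shifts both the quantized weights and the zero point by 128; the actual transformer code is not shown in this log, and the helper names and sample values below are hypothetical.

```python
def dequantize(quantized, zero_point, scale):
    """Linear dequantization: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]


def int8_to_uint8(quantized, zero_point):
    """Shift int8 data and its zero point into the uint8 range [0, 255].

    Adding 128 to a value in [-128, 127] is the same as flipping the top
    bit of its two's-complement byte.
    """
    return [q + 128 for q in quantized], zero_point + 128


# Hypothetical constant weight tensor with an int8 zero point.
weights_i8 = [-128, -3, 0, 5, 127]
zero_point_i8 = -2
scale = 0.05

weights_u8, zero_point_u8 = int8_to_uint8(weights_i8, zero_point_i8)

# The difference (q - zero_point) is unchanged, so the real values match.
before = dequantize(weights_i8, zero_point_i8, scale)
after = dequantize(weights_u8, zero_point_u8, scale)
print(before == after)
```

Because the shift cancels in (q - zero_point), doing the conversion once on the constant input of the dequantize node leaves downstream operators unaffected, which matches the motivation given in the commit message.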
Valery Chernov
e2423bb55c
[TVM EP] Build on Windows with ipp-crypto support (#12336)
* update TVM EP docs for ipp-crypto build conditions

* add ipp-crypto by ExternalProject_Add

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-28 15:40:19 +02:00
Vincent Wang
7a298c916a
Add Pow-15 for LayerNorm Fusion and FastGelu Fusion (#12314)
* op version check during fusion

* Revert "op version check during fusion"

This reverts commit cacc8f50ea36b08b73a98a81ca0ae3f84782435f.

* add pow-15 for fusion
2022-07-28 10:23:35 +08:00
Yulong Wang
186ba6e9f2
[js/rn] upgrade package react-native@^0.69.1 (#12155)
* [js/rn] upgrade package react-native@^0.69.1

* upgrade compile sdk to v31

* update ios version requirement

* update pod path for onnxruntime-react-native
2022-07-27 15:15:45 -07:00
Changming Sun
e6bb447101
Change native folder name for java macos arm64 (#12335) 2022-07-27 15:13:07 -07:00
101arrowz
148b1efe5e
[js/web] add ConvTranspose2D to WebGL backend (#11990)
* Add ConvTranspose

* Update docs + tests

* fix lint

* fix output shape calculations

* Revert "fix output shape calculations"

This reverts commit 8014fa9b33115f1d6a677fe2270a6da1b510ff67.

* fix format

* remove broken output_shape test
2022-07-27 13:57:12 -07:00
Justin Chu
d2b25a7c1c
Reduce CI noise from Python lint (#12270)
Description: Reduce CI noise from Python lint

Motivation and Context

Disable "missing-docstring" in pylint. This is usually noisy in tests
Show only added lint messages for pyright
2022-07-27 13:42:29 -07:00
msftlincoln
9cf6912bba
Fix ORT Eager Mode to work with Pytorch 1.12 (#12323) 2022-07-27 16:24:46 -04:00