9253 commits

Author SHA1 Message Date
Yi Zhang
c4e4b98fb2
replace one pool with onnxruntime-Win2022-GPU-T4 (#16953)
### Description
replace one pool

### Motivation and Context
onnxruntime-gpu-tensorrt8-winbuild-t4 will be deprecated.
2023-08-01 21:02:56 +08:00
Yulong Wang
6046456bb6
build break: apply formatter fix (#16947)
### Description
build break: apply formatter fix
2023-08-01 01:10:55 -07:00
Patrice Vignola
49512e558a
[DML EP] Add I/O binding and If operator (#16859)
Being able to leverage I/O binding for DML and registering `If` for the
DML EP allows us to avoid copying the past/present key/values back and
forth between the CPU and the GPU after every token.

This gives us a 25% performance increase for Dolly V2 with 128 tokens on
an RTX 4090.
2023-07-31 19:45:59 -07:00
Artyom Stepanishchev
ba23e5b234
[JS/Common] Fix malformed result of Tensor.fromImage(ImageBitmap) (#16919)
### Description

Set `canvas` dimensions to the `ImageBitmap` dimensions, thus fixing a
malformed Tensor creation.

### Motivation and Context

According to the [HTMLCanvasElement.drawImage()
spec](https://html.spec.whatwg.org/multipage/canvas.html#drawing-images):
> When the destination rectangle is outside the destination image (the
output bitmap), the pixels that land outside the output bitmap are
discarded, as if the destination was an infinite canvas whose rendering
was clipped to the dimensions of the output bitmap.

meaning that `ImageBitmap` pixels exceeding the canvas dimensions will
be discarded. Since no canvas dimensions are set for
`Tensor.fromImage(ImageBitmap)` if-case, the default 300x150px canvas
dimensions are used leading to the creation of malformed Tensors where
all the exceeding pixels are discarded and equal to `0, 0, 0, 0` during
the subsequent `pixels2DContext.getImageData()` call.
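
The clipping behavior described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not the onnxruntime-web code: `draw_image`, `get_image_data`, and the list-of-lists canvas are hypothetical stand-ins for the canvas API, and the 400x200 bitmap size is chosen only for illustration.

```python
# Toy model of canvas clipping. All names are hypothetical and only mimic
# the behavior described in the spec quote above.
DEFAULT_CANVAS_SIZE = (300, 150)  # default canvas width x height in pixels
TRANSPARENT = (0, 0, 0, 0)


def draw_image(bitmap, canvas_width, canvas_height):
    """Copy `bitmap` (rows of RGBA tuples) onto a transparent canvas.

    Pixels that land outside the canvas are discarded.
    """
    canvas = [[TRANSPARENT] * canvas_width for _ in range(canvas_height)]
    for y, row in enumerate(bitmap[:canvas_height]):
        for x, pixel in enumerate(row[:canvas_width]):
            canvas[y][x] = pixel
    return canvas


def get_image_data(canvas, width, height):
    """Read a width x height region; pixels outside the canvas read as 0, 0, 0, 0."""
    return [
        [canvas[y][x] if y < len(canvas) and x < len(canvas[0]) else TRANSPARENT
         for x in range(width)]
        for y in range(height)
    ]


# A 400x200 opaque white bitmap, larger than the default canvas.
bitmap = [[(255, 255, 255, 255)] * 400 for _ in range(200)]

# Without setting canvas dimensions: the default size clips the bitmap.
w, h = DEFAULT_CANVAS_SIZE
clipped = get_image_data(draw_image(bitmap, w, h), 400, 200)
lost = sum(pixel == TRANSPARENT for row in clipped for pixel in row)

# With canvas dimensions set to the bitmap dimensions: nothing is lost.
full = get_image_data(draw_image(bitmap, 400, 200), 400, 200)
kept_all = all(pixel == (255, 255, 255, 255) for row in full for pixel in row)
```

In this model, every pixel outside the 300x150 region comes back as `0, 0, 0, 0`, which matches the malformed tensor described above.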
2023-07-31 18:18:06 -07:00
Jiajia Qin
fa8487ea3a
[js/webgpu] Check profilingMode in each run (#16897)
### Description
This PR moves the profilingMode check from the initialization stage to
each run. This way, users can start or stop profiling at any time.
Previously, profiling only took effect at the very beginning and
couldn't be stopped.
2023-07-31 17:37:24 -07:00
kunal-vaishnavi
3c72f43f78
Extend saving models optimized by inference session (#16912)
### Description
This PR adds support for saving model optimizations after loading a
model that contains external data into an `InferenceSession`.



### Motivation and Context
This PR is a follow-up to a [previous
PR](https://github.com/microsoft/onnxruntime/pull/16716) for saving a
model optimized by an `InferenceSession`.
2023-07-31 16:39:35 -07:00
Changming Sun
73ddba964f
Update the MacOS/Linux build scripts that build/install protobuf from source (#16906)
### Description
1. As a follow-up to #16761, this PR allows building ORT on iOS/Android
without the need to explicitly specify a protoc path. #16761 is for
WASM; this one is for iOS/Android.
2. Update the MacOS/Linux build scripts that build/install protobuf from
source to make them more flexible. Add support for
RedHatEnterprise (ubi), which will be needed for upgrading the base image
from centos:7 to ubi:8.
3. Update tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile:
the Dockerfile's base image has protobuf preinstalled in /usr/local; we
should uninstall it to avoid conflicts.
2023-07-31 10:51:48 -07:00
Yi Zhang
28a099fca8
unify the steps of downloading cuda sdk and setup env (#16896)
### Description
The `%AGENT_TEMPDIRECTORY%\v11.8` is created in azcopy step.
So, the set env step should be after the azcopy step.

### Motivation and Context
Correct the previous logic
Unify the step since multiple jobs are using it.
2023-07-31 10:25:04 -07:00
Dmitri Smirnov
50764362ac
Update protobuf Natvis visualization (#16911)
### Description
Protobuf library update broke debug visualization.

### Motivation and Context
Hard to debug
2023-07-31 09:35:21 -07:00
satyajandhyala
77b2b618b2
[JS/WebGPU] Add Resize operator (#16680)
### Description
Implemented Resize operator support in JSEP



2023-07-31 09:35:06 -07:00
Hector Li
3fd1d3b9bd
Improve graph transformer DoubleQDQPairsRemover (#16910)
Improve graph transformer DoubleQDQPairsRemover

### Description
Improve DoubleQDQPairsRemover to not reset the scale & zero point if
the existing values are the same on the target DQ & Q nodes.

### Motivation and Context
Fix a bug where DoubleQDQPairsRemover resets the scale value while
removing unnecessary DQ & Q nodes.
2023-07-31 09:24:46 -07:00
satyajandhyala
dd24d52737
[JS/Web] Added Gelu contrib operator support to JSEP (#16909)
### Description
Added Gelu operator to JSEP


2023-07-31 09:18:58 -07:00
Tianlei Wu
92b6e10d37
skip test_smooth_quant to unblock Python Package Pipeline (#16914)
### Description
The Python Package Pipeline failed because an exception is raised in
test_smooth_quant (from #16288):
```
File "/home/cloudtest/.local/lib/python3.8/site-packages/onnxruntime/quantization/quantize.py", line 384, in quantize_static
    importlib.import_module("neural_compressor.adaptor.ox_utils.smooth_quant")
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/__init__.py", line 24, in <module>
    from .contrib import *
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/__init__.py", line 19, in <module>
    from .strategy import *
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/strategy/__init__.py", line 26, in <module>
    __import__(basename(f)[:-3], globals(), locals(), level=1)
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/strategy/sigopt.py", line 22, in <module>
    from neural_compressor.strategy.strategy import strategy_registry, TuneStrategy
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/strategy/__init__.py", line 20, in <module>
    from .strategy import STRATEGIES
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/strategy/strategy.py", line 41, in <module>
    from ..algorithm import AlgorithmScheduler, ALGORITHMS
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/algorithm/__init__.py", line 20, in <module>
    from .algorithm import ALGORITHMS, Algorithm, AlgorithmScheduler, algorithm_registry
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/algorithm/algorithm.py", line 21, in <module>
    from neural_compressor.utils.create_obj_from_config import get_algorithm
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/utils/create_obj_from_config.py", line 20, in <module>
    from neural_compressor.metric import METRICS
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/metric/__init__.py", line 30, in <module>
    __import__(basename(f)[:-3], globals(), locals(), level=1)
  File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/metric/coco_tools.py", line 54, in <module>
    from pycocotools import coco
  File "/usr/local/lib/python3.8/dist-packages/pycocotools/coco.py", line 52, in <module>
    from . import mask as maskUtils
  File "/usr/local/lib/python3.8/dist-packages/pycocotools/mask.py", line 3, in <module>
    import pycocotools._mask as _mask
  File "pycocotools/_mask.pyx", line 1, in init pycocotools._mask
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
```
The cause is that the pycocotools package uses "oldest-supported-numpy",
which might cause an older version of numpy to be used when building
pycocotools:

9e9164f979/PythonAPI/pyproject.toml (L4)

Related issue: https://github.com/cocodataset/cocoapi/issues/248

2023-07-29 11:24:28 -07:00
Tianlei Wu
742edec5e8
[CUDA] Add PackedMultiHeadAttention operator (#16779)
### Description
Add a new operator for MultiHeadAttention whose inputs have padding removed.
This only supports the packed QKV format.
2023-07-28 16:35:38 -07:00
Alexey Kamenev
7c05f7bab1
Fix IRFFT contrib op output dimension calculation (#15662)
### Description
Fixes the issue with IRFFT output dimension calculation as described in
#13236

### Motivation and Context
Please refer to #13236 for detailed description.

Specifically, [this code](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cuda/math/fft_ops.cc#L103) computes the output dimension as:
```
out_dim = in_dim * 2 - 1
```
while it should be this instead:
```
out_dim = 2 * (in_dim - 1)
```
(assuming the original signal has an even number of samples, of course).

For example, if the original signal has 4 samples, then the round trip should look something like:
```
4 -> (one-sided RFFT) -> 3 (complex) -> (one-sided IRFFT) -> 4
```
With the current code, the output would be a signal with 5 points.
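
The two formulas can be compared directly. This is a minimal sketch in plain Python, not the CUDA kernel code; the function names are hypothetical, and the one-sided RFFT length `n // 2 + 1` is the standard convention for a real signal of `n` samples.

```python
def rfft_output_len(n_samples: int) -> int:
    """Number of complex bins produced by a one-sided RFFT of a real signal."""
    return n_samples // 2 + 1


def irfft_len_before_fix(n_bins: int) -> int:
    """Output length using the formula quoted above as the current behavior."""
    return n_bins * 2 - 1


def irfft_len_after_fix(n_bins: int) -> int:
    """Corrected output length, assuming an even number of original samples."""
    return 2 * (n_bins - 1)


n = 4
bins = rfft_output_len(n)
print(bins, irfft_len_before_fix(bins), irfft_len_after_fix(bins))  # prints "3 5 4"
```

For every even signal length, the corrected formula restores the original length, while the old formula is one sample too long.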

---------

Co-authored-by: Alexey Kamenev <akamenev@nvidia.com>
Co-authored-by: Nick Geneva <nicholasgeneva@gmail.com>
2023-07-28 15:52:37 -07:00
Yulong Wang
1743e9a615
[js] enable formatter for more file types (#16888)
### Description
enable formatter for .js/.json/.jsonc/.md files
2023-07-28 15:46:58 -07:00
Paul Willot
65534ff9ef
Update setup.py to add py311 (#16899)
### Description
Update setup.py to add python 3.11

### Motivation and Context
Python 3.11 has been supported since release 1.15.
2023-07-28 13:04:50 -07:00
Scott McKay
21a71d52bd
Enable CodeQL for Android build as per 1CS requirement. (#16875)
### Description
Split stages for CPU and CPU+NNAPI builds as CodeQL is enabled at the
stage level.
We run it for CPU+NNAPI as that covers all the Android code. 
We don't want to run it for both as duplicate issues would be created
for a problem in code included in both builds.

2023-07-28 17:54:23 +10:00
Yi Zhang
9f21f694cf
stop support to VS 2019 (#16892)
### Description
Remove VS 2019 code.

2023-07-28 13:09:35 +08:00
pengwa
a021cb1b6e
Allow creating ConstantScalarNode for double type (#16797)
### Allow creating ConstantScalarNode for double type

Allow creating a ConstantScalarNode for the double type. It looks like the
double type is not respected when creating the constant, so fix it.

```
onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder*, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Sub) bound to different types (tensor(double) and tensor(float) in node (/_original_module/_original_model/gpt_neox/layers.0/input_layernorm/Pow_Grad/Sub_1).
```
2023-07-28 12:41:22 +08:00
Wanming Lin
634a3f2f28
[WebNN EP] Support Max and Min ops (#16858)
2023-07-27 18:22:03 -07:00
Changming Sun
9dcbcf1d2f
Delete unused files (#16887)
### Description
These yaml files and docker files are not used by any pipeline. If I
am wrong, feel free to submit a PR to restore the wrongly deleted file
from git history (git keeps everything forever).
2023-07-27 16:46:09 -07:00
Changming Sun
161a9d1d6d
Add some safety check for conv op (#16839)
### Description
Add some safety checks for the conv op.
They validate that the attributes coming from a conv op are in a valid
range (neither too large nor too small).
2023-07-27 16:37:55 -07:00
satyajandhyala
e67547b978
[JS/WebGPU] Added Flatten operator support. (#16860)
### Description
Added Flatten operator support to JSEP.



2023-07-27 12:50:45 -07:00
Hector Li
ec935a5533
[QNN EP] Check the axis attribute for LayerNorm for HTP (#16872)
Check the axis attribute for LayerNorm for HTP

### Description
Add code to check the axis value for LayerNorm for HTP explicitly to
make sure only the last dimension is allowed.
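
The kind of check described above can be sketched as follows. This is a minimal illustration in Python, not the QNN EP code (which is C++); the function name and the use of a known input rank are assumptions.

```python
def layernorm_axis_is_last_dim(axis: int, rank: int) -> bool:
    """Return True if `axis` refers to the last dimension of a rank-`rank` input.

    Hypothetical helper: negative axes count from the end, so both
    `-1` and `rank - 1` denote the last dimension.
    """
    if rank < 1:
        raise ValueError("rank must be at least 1")
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    return axis % rank == rank - 1


print(layernorm_axis_is_last_dim(-1, 4))  # prints "True"
print(layernorm_axis_is_last_dim(1, 4))   # prints "False"
```

Normalizing the axis with `axis % rank` handles the negative and positive spellings of the last dimension in one comparison.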
2023-07-27 09:59:46 -07:00
Hector Li
2748f51603
[QNN EP] Remove duplicate string define (#16877)
[QNN EP] Remove duplicate string define

### Description
Remove duplicate string definitions for QNN op parameters and use the
ones from the QNN header file directly.
2023-07-27 09:08:55 -07:00
Prathik Rao
779fba1666
ORT Cache (#16744)
### Description

This PR adds support for caching the exported training/evaluation ONNX
model in `ORTModule`. On future runs, instead of exporting the model
again, we can pick up the model from a location on disk and run
`ORTModule` training/evaluation.
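
The general export-once, reuse-later pattern can be sketched as follows. This is a minimal illustration and not the `ORTModule` implementation: `load_or_export`, the SHA-256 file naming, and the string cache key are all assumptions made for the example, and the PR may identify cached models differently.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def load_or_export(cache_dir: Path, cache_key: str,
                   export_fn: Callable[[], bytes]) -> bytes:
    """Return the cached model bytes if present; otherwise export and cache them.

    Hypothetical helper: `cache_key` stands in for whatever identifies a model.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(cache_key.encode("utf-8")).hexdigest()
    path = cache_dir / f"{digest}.onnx"
    if path.exists():
        return path.read_bytes()  # cache hit: skip the export
    model_bytes = export_fn()     # cache miss: export once
    path.write_bytes(model_bytes)
    return model_bytes


export_calls = []


def fake_export() -> bytes:
    """Stand-in for an expensive ONNX export."""
    export_calls.append(1)
    return b"fake-onnx-bytes"


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_export(Path(tmp), "model-v1", fake_export)
    second = load_or_export(Path(tmp), "model-v1", fake_export)
```

The second call returns the bytes stored on disk, so the stand-in export function runs only once.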

### Motivation and Context

ORT Training DRI Contribution

---------

Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: pengwa <pengwa@microsoft.com>
2023-07-27 09:00:43 -07:00
Yi Zhang
bd95a8ea77
update onnxruntime-gpu-winbuild-T4 to onnxruntime-Win2022-GPU-T4 (#16838)
### Motivation and Context
It's also used to upgrade visual studio to VS2022.
onnxruntime-gpu-winbuild-T4 and onnxruntime-gpu-tensorrt8-winbuild-t4
are using the image based on one dev branch and VS2019

To avoid breaking the current CIs, we move jobs running on
onnxruntime-gpu-winbuild-T4/onnxruntime-gpu-tensorrt8-winbuild-t4 to
onnxruntime-Win2022-GPU-T4.
2023-07-27 08:38:20 -07:00
Adam Pocock
340f4ded73
[java] Fills out the javadoc so there are no more documentation warnings (#16776)
### Description
Adds javadoc for all protected and public members, methods and classes.

### Motivation and Context
The javadoc warnings were annoying me when running the builds. Also,
those types should have been documented.

---------

Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
2023-07-27 16:17:03 +10:00
Wang, Mengni
fe463d4957
Support SmoothQuant for ORT static quantization (#16288)
### Description

Support SmoothQuant for ORT static quantization via Intel Neural
Compressor

> Note:
Please use neural-compressor==2.2 to try SmoothQuant function.

### Motivation and Context
For large language models (LLMs) with a very large number of parameters,
the systematic outliers make quantization of activations difficult. As a
training-free post-training quantization (PTQ) solution, SmoothQuant
migrates this difficulty offline from activations to weights with a
mathematically equivalent transformation. Integrating SmoothQuant into
ORT quantization can benefit the accuracy of INT8 LLMs.

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
2023-07-26 18:56:45 -07:00
Justin Chu
eeef157888
Format c++ code under winml/ (#16660)
winml/ was previously excluded from lintrunner config. This change
includes the directory and adds the clang-format config file specific to
winml/ that fits existing style.

---------

Signed-off-by: Justin Chu <justinchu@microsoft.com>
2023-07-25 21:56:50 -07:00
Patrice Vignola
649930142f
[DML EP] Add NCHW and float16 gamma/beta support for GroupNorm (#16814)
This will remove transposes that are not needed in the DML kernel. To
keep backward compatibility, the default behavior is to set NHWC when no
attribute is set.
2023-07-25 21:43:29 -07:00
pengwa
39fca225ea
ORTModule log clean up (#16795)
### ORTModule log clean up

ORTModule log level: WARNING (the default) is for end users; INFO and
VERBOSE are for internal ORT training developers.

A few issues:
1. ONNX export outputs many WARNING messages like "The shape
inference of
com.microsoft::SoftmaxCrossEntropyLossInternal/ATen/PythonOp type is
missing", which are useless for us and for end users.

![image](https://github.com/microsoft/onnxruntime/assets/10530022/f2409480-32e1-483d-bd18-f14149f0588d)

2. ORT also prints messages like
"CleanUnusedInitializersAndNodeArgs] Removing
initializer" and "ReverseBFSWithStopGradient] Skip building gradient for",
which are also useless for us and for end users most of the time.

![image](https://github.com/microsoft/onnxruntime/assets/10530022/ff3feaf1-3cb2-4392-b087-86b30b72994c)


3. Every rank outputs logs, so ORT developers and end users feel that
there are too many logs, which are usually not useful until an
investigation is needed.

A few improvements for these issues:
1. For ONNX export, there are two kinds of logs: (a) the export verbose
log; (b) other logs printed by the torch C++ backend. This PR makes the
following change:
- VERBOSE -> FULL export verbose log + FULL torch other logs from stdout
and stderr (C++ backend)
- INFO -> FULL export verbose log + FILTERED torch other logs from
stdout and stderr (C++ backend)
- WARNING/ERROR -> [Rank 0] NO export verbose log + FILTERED torch other
logs from stdout and stderr (C++ backend)

For example, at the VERBOSE level, all logs are printed as usual; at the
INFO level, the verbose export log is printed together with filtered logs
from the torch C++ backend (removing messages like "The shape inference of
com.microsoft::SoftmaxCrossEntropyLossInternal/ATen/PythonOp type is
missing"). At higher levels, only rank 0 logs.

2. For ORT gradient graph build and session creation, also suppress and
filter out these messages when the log level is >= INFO.

3. When the log level is > INFO, only rank 0 logs, for a cleaner user
experience.
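
The level-to-behavior mapping above can be written down as a small table in code. This is a minimal sketch, not the ORTModule logger: the `LogLevel` values, `ExportLogPlan`, and both functions are hypothetical names introduced for illustration.

```python
from enum import IntEnum
from typing import NamedTuple


class LogLevel(IntEnum):
    """Hypothetical log levels, ordered from most to least verbose."""
    VERBOSE = 0
    INFO = 1
    WARNING = 2
    ERROR = 3


class ExportLogPlan(NamedTuple):
    export_verbose_log: bool         # print the exporter's verbose log
    filter_torch_backend_logs: bool  # filter stdout/stderr from the torch C++ backend
    rank_zero_only: bool             # only rank 0 emits logs


def export_log_plan(level: LogLevel) -> ExportLogPlan:
    """Map a log level to the export logging behavior listed above."""
    return ExportLogPlan(
        export_verbose_log=level <= LogLevel.INFO,
        filter_torch_backend_logs=level >= LogLevel.INFO,
        rank_zero_only=level > LogLevel.INFO,
    )


def should_emit(level: LogLevel, rank: int) -> bool:
    """Decide whether a given rank emits logs at this level."""
    return rank == 0 or not export_log_plan(level).rank_zero_only


for level in LogLevel:
    print(level.name, export_log_plan(level))
```

Keeping the mapping in one function makes the three cases easy to compare against the list above.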


This is the log for a BLOOM model training run after the change: there
are only a limited number of warnings.


![image](https://github.com/microsoft/onnxruntime/assets/10530022/f270b8d5-2944-49d2-a253-c07057d641a0)
2023-07-26 12:42:50 +08:00
Dmitri Smirnov
bf006d34a9
Used feature macro for if constexpr in a public header (#16836)
### Description
Use feature macro for `if constexpr`

### Motivation and Context
We still do not require customers to use a C++17 compiler.
2023-07-25 21:42:30 -07:00
Wanming Lin
d0df83e408
[WebNN EP] Support rest Reduce* ops (#16824)
Add ReduceL1, ReduceL2, ReduceLogSum, ReduceLogSumExp, ReduceMin,
ReduceProd, ReduceSum, ReduceSumSquare.
2023-07-25 17:26:48 -07:00
Justin Chu
0c1a5098dc
Disable PERF* rules in ruff to allow better readability (#16834)
### Description

Disable two PERF* rules in ruff to allow better readability. The
rationale is commented inline. This change also removes the noqa
directives that became unused because of the rule change.

### Motivation and Context

Readability
2023-07-25 15:38:22 -07:00
Yulong Wang
53c771f215
[js/common] add unit tests for onnxruntime-common (#16812)
### Description
"onnxruntime-common" is getting more and more complicated, so it's a
good idea to add unit tests for it.

Includes the following changes:
- Move `mocha` from each subfolder (js/web/, js/node/) to the root (js/), so
that it is installed once and all subfolders can use it.
- Add folder `test` in js/common/ as the root folder for ort-common tests.
- Add subfolder `type-tests`. This folder contains a few TypeScript
source files, which are excluded from tsconfig.json, so they are not
compiled by default. Instead, the file `type-tests.ts` calls the TypeScript
compiler (tsc) to check whether the compilation result for each file
under this folder is as expected. If tsc compiles a file successfully
when a failure is expected, this is considered a failed test.
- Add subfolder `unit-tests`. Files under this folder are compiled
by default. We use the default mode of mocha (using `describe()` and `it()`)
to set up test groups and cases.
- Update eslint rules accordingly.
2023-07-25 14:37:41 -07:00
satyajandhyala
03ce0a5693
[Web/JS] Added Slice operator in JSEP. (#16811)
### Description
Added Slice operator support to JSEP.



2023-07-25 14:19:20 -07:00
Adam Pocock
a1bb670536
[java] Fp16 fix for android/react native (#16832)
### Description
This PR splits out the FP16 conversions into a separate package we can
override in the android build with a version which works on old versions
of Android.

I'm not sure the android build system changes are correct as I haven't
got an android build environment configured on my workstation.
@YUNQIUGUO if the CI build fails we should follow up offline to get my
environment configured so I can iterate on it.

### Motivation and Context
Fixes the CI failure after #16703.
2023-07-25 12:31:32 -07:00
Edward Chen
e01365f80b
Update upload_pod_archive_and_update_podspec.sh to take path pattern (#16810)
Update upload_pod_archive_and_update_podspec.sh to take a pod archive path glob pattern. The actual pod archive path has a version suffix which changes.
2023-07-25 08:55:31 -07:00
Yi Zhang
38db5eca65
replace onnxruntime-Win-CPU-2019 with onnxruntime-Win-CPU-2022 (#16844)



### Motivation and Context
upgrade to VS2022
2023-07-25 23:05:34 +08:00
Luis Rios
feeb0b50f9
Improve TreeNodeElementId hash function (#16459)
### Description
This PR improves the `TreeNodeElementId` hash function by employing the [Elegant
Pairing function](http://szudzik.com/ElegantPairing.pdf). In a few words,
the Elegant Pairing function maps two non-negative integers to a
non-negative integer that is uniquely associated with that pair. This
drastically reduces collisions and therefore reduces the time
required to create a session for a large tree ensemble
model.
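
The pairing function from the linked paper can be sketched in a few lines. This is a minimal Python illustration and not the C++ hash in the PR; the function name and the idea of passing a tree index and a node index as the two integers are assumptions for the example.

```python
def elegant_pair(x: int, y: int) -> int:
    """Szudzik's Elegant Pairing: map two non-negative integers to a unique one."""
    if x < 0 or y < 0:
        raise ValueError("both inputs must be non-negative")
    # When x is not the maximum, use y^2 + x; otherwise use x^2 + x + y.
    return y * y + x if x < y else x * x + x + y


# Hypothetical usage: combine a tree index and a node index into one key.
print(elegant_pair(3, 5))  # prints "28"
print(elegant_pair(5, 3))  # prints "33"

# Every pair in a 50 x 50 grid maps to a distinct value in [0, 2500).
values = {elegant_pair(x, y) for x in range(50) for y in range(50)}
```

Because distinct pairs never share a value, a hash built on this function avoids the collisions that come from simpler ways of combining the two integers, as long as the result fits in the hash's integer type.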

### Motivation and Context
We use ONNX Runtime to serve our models as part of a Triton backend. We
noticed that it was taking around 2 minutes to load a model which is a
large tree ensemble model (around 5k trees with around 3 million nodes
in total). After investigating the issue, it was clear that the
`TreeNodeElementId` hash function was not able to map keys to
buckets of the C++ `unordered_map` without a significant number of
collisions (in some cases 700 items per bucket).

The following picture shows graphically the improvement obtained by the
proposed change. We used the `onnx_test_runner` command.

![flamegraph](https://github.com/microsoft/onnxruntime/assets/3594678/2588e87c-125b-4a4b-8f03-55e00ae25e08)

#### Before
```
$> time ./onnx_test_runner -v ~/folder_with_model
result:
	Models: 1
	Total test cases: 0
		Succeeded: 0
		Not implemented: 0
		Failed: 0
	Stats by Operator type:
		Not implemented(0):
		Failed:
Failed Test Cases:

real	0m55.695s
user	0m52.919s
sys	0m0.760s
```

#### After
```
$> time ./onnx_test_runner -v ~/folder_with_model
result:
	Models: 1
	Total test cases: 0
		Succeeded: 0
		Not implemented: 0
		Failed: 0
	Stats by Operator type:
		Not implemented(0):
		Failed:
Failed Test Cases:

real	0m17.152s
user	0m14.318s
sys	0m0.619s
```
2023-07-25 14:25:50 +02:00
BoarQing
daef133982
update onnxruntime_perftest's README.md as vitisai is supported on v1.15.1 (#16827)
### Description
Updating README.md to add vitisai for onnxruntime_perftest


### Motivation and Context
The perftest tool does support vitisai, but the README.md does not
list it. This has caused some internal confusion about whether vitisai
is supported. See https://github.com/microsoft/onnxruntime/pull/15673 for
context.
2023-07-25 13:39:26 +02:00
Yi Zhang
f88f0d8e36
Upgrade 4 stages in nuget pipeline to VS2022 (#16825)

### Motivation and Context
Continue upgrading to VS2022

### Verification

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331377&view=results

N.B.
In practice, SDLNativeRules@3 doesn't support VS2019.
2023-07-25 14:22:39 +08:00
Yulong Wang
8b30dc11d7
Update run_CIs_for_external_pr.py to skip passed checks (#16808)
### Description
Update run_CIs_for_external_pr.py to skip passed checks
2023-07-25 16:11:53 +10:00
Yi Zhang
2e214d6e27
Workaround to upgrade VS2022 for Windows ARM build (#16826)


### Motivation and Context
It should be reverted when VS2022 is upgraded to 17.7 or above.

### Verification

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331401&view=logs&j=7517abfd-115a-5c61-78a0-7ba3c9e3a88d
2023-07-25 08:35:52 +08:00
pengwa
f2c0470436
Fix slice upstream - Incompatible dimensions (#16818)
### Fix slice upstream - (MatMul) [ShapeInferenceError] Incompatible
dimensions

```
     2023-07-22 14:58:16.918478478 [I:onnxruntime:Default, constant_sharing.cc:256 ApplyImpl] Total shared scalar initializer count: 10
        2023-07-22 14:58:16.919494252 [W:onnxruntime:Default, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'onnx::Cast_424' source:{-1,31,-1,-1} target:{-1,32,-1,-1}. Falling back to lenient merge.
        2023-07-22 14:58:16.921014114 [W:onnxruntime:Default, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'onnx::MatMul_425' source:{-1,31,-1,-1} target:{-1,32,-1,-1}. Falling back to lenient merge.

Traceback (most recent call last):
  File "examples/onnxruntime/training/language-modeling/run_clm.py", line 594, in <module>
    main()
  File "examples/onnxruntime/training/language-modeling/run_clm.py", line 542, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 454, in train
    return inner_training_loop(
  File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 755, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/transformers/trainer.py", line 2735, in training_step
    loss = self.compute_loss(model, inputs)
  File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 363, in compute_loss
    return model_with_loss(dict_inputs, return_outputs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1724, in forward
    loss = self.module(*inputs, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_utils.py", line 384, in _forward
    return ortmodule._torch_module.forward(*inputs, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_utils.py", line 364, in _forward
    return torch_module_ort._execution_manager(torch_module_ort.is_training()).forward(*inputs, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 345, in forward
    self._fallback_manager.handle_exception(
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_fallback.py", line 157, in handle_exception
    raise exception
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 280, in forward
    self._build_graph(graph_transformer_config)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_logger.py", line 218, in wrapper
    result = func(graph_execution_manager, *args, **kwargs)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 360, in _build_graph
    super()._build_graph(graph_transformer_config)
  File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_graph_execution_manager.py", line 186, in _build_graph
    self._graph_builder.build(config)
RuntimeError: /bert_ort/pengwa/onnxruntime/orttraining/orttraining/python/orttraining_pybind_state.cc:823 onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder*, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Node (MatMul_403) Op (MatMul) [ShapeInferenceError] Incompatible dimensions

 
```

The code incorrectly used an `axis` attribute for the `Slice` op, so it
is changed to use the `axes` input instead.

2023-07-25 08:21:46 +08:00
Wei-Sheng Chin
b0279b14d8
[DORT] Enable Dynamic Shape in DORT and Use Different InferenceSession's when Inputs Are Not Compatible (#16753)
Sometimes, the ONNX exporter generates rank- or shape-dependent sub-graphs.
Thus, errors can occur when running the ONNX model with different
inputs. This PR
([78e736d](78e736d857))
addresses this problem by
- if needed, exporting multiple ONNX models with different inputs for
the same GraphModule.
- implementing a naive mechanism to determine if existing ONNX models
(and the associated InferenceSession) can be reused.
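
The idea of reusing an exported model only for compatible inputs can be sketched as follows. This is a minimal illustration and not the DORT implementation: `SessionCache`, `input_signature`, and the compatibility rule (same dtype and rank per input) are assumptions chosen for the example, and the PR's actual check may differ.

```python
from typing import Callable, Dict, Sequence, Tuple

TensorSpec = Tuple[str, Tuple[int, ...]]  # (dtype name, shape)
Signature = Tuple[Tuple[str, int], ...]   # (dtype name, rank) per input


def input_signature(inputs: Sequence[TensorSpec]) -> Signature:
    """Reduce inputs to the properties assumed to matter for reuse."""
    return tuple((dtype, len(shape)) for dtype, shape in inputs)


class SessionCache:
    """Hypothetical cache keeping one exported session per input signature."""

    def __init__(self, export_fn: Callable[[Sequence[TensorSpec]], str]) -> None:
        self._export_fn = export_fn
        self._sessions: Dict[Signature, str] = {}

    def get(self, inputs: Sequence[TensorSpec]) -> str:
        key = input_signature(inputs)
        if key not in self._sessions:
            # No compatible session yet: export a new model for these inputs.
            self._sessions[key] = self._export_fn(inputs)
        return self._sessions[key]


exports = []


def fake_export(inputs: Sequence[TensorSpec]) -> str:
    """Stand-in for exporting a model and creating a session."""
    exports.append(input_signature(inputs))
    return f"session-{len(exports)}"


cache = SessionCache(fake_export)
a = cache.get([("float32", (2, 3))])
b = cache.get([("float32", (5, 7))])     # same dtype and rank: reused
c = cache.get([("float32", (2, 3, 4))])  # different rank: new export
```

Here the first two calls share one session, and the rank-3 input triggers a second export.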
 
On the other hand, in the second commit
[b5a9b5f](b5a9b5f849),
this PR also enables dynamic shapes in DORT by
- passing dynamic_shapes = True to exporter (see how
DEFAULT_DYNAMIC_BACKEND is created)
- calling torch._dynamo.optimize(dynamic_ort_aot, dynamic=True) (see how
dynamic_ort_aot is created).
2023-07-24 16:54:01 -07:00
dependabot[bot]
4b6d9fa851
Bump actions/deploy-pages from 1 to 2 (#16402)
2023-07-24 16:13:59 -07:00
Maximilian Müller
d8d8349a1b
fix: add missing nullptr of SessionOptions V2 (#16794)
/builds/devtechproviz/dl/ort-builder/onnxruntime/onnxruntime/python/onnxruntime_pybind_state.cc:388:14:
error: missing initializer for member
'OrtTensorRTProviderOptionsV2::trt_cuda_graph_enable'
[-Werror=missing-field-initializers]
  388 |             0};
      |

2023-07-24 15:17:11 -07:00