onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-26 03:00:54 +00:00

Author	SHA1	Message	Date
Yi Zhang	c4e4b98fb2	replace one pool with onnxruntime-Win2022-GPU-T4 (#16953 ) ### Description replace one pool ### Motivation and Context onnxruntime-gpu-tensorrt8-winbuild-t4 would be deprecated	2023-08-01 21:02:56 +08:00
Yulong Wang	6046456bb6	build break: apply formatter fix (#16947 ) ### Description build break: apply formatter fix	2023-08-01 01:10:55 -07:00
Patrice Vignola	49512e558a	[DML EP] Add I/O binding and `If` operator (#16859 ) Being able to leverage I/O binding for DML and registering `If` for the DML EP allows us to avoid copying the past/present key/values back and forth between the CPU and the GPU after every token. This gives us a 25% performance increase for Dolly V2 with 128 tokens on an RTX 4090.	2023-07-31 19:45:59 -07:00
Artyom Stepanishchev	ba23e5b234	[JS/Common] Fix malformed result of Tensor.fromImage(ImageBitmap) (#16919 ) ### Description Set `canvas` dimensions to the `ImageBitmap` dimensions, thus fixing a malformed Tensor creation. ### Motivation and Context According to the [HTMLCanvasElement.drawImage() spec](https://html.spec.whatwg.org/multipage/canvas.html#drawing-images): > When the destination rectangle is outside the destination image (the output bitmap), the pixels that land outside the output bitmap are discarded, as if the destination was an infinite canvas whose rendering was clipped to the dimensions of the output bitmap. meaning that `ImageBitmap` pixels exceeding the canvas dimensions will be discarded. Since no canvas dimensions are set for `Tensor.fromImage(ImageBitmap)` if-case, the default 300x150px canvas dimensions are used leading to the creation of malformed Tensors where all the exceeding pixels are discarded and equal to `0, 0, 0, 0` during the subsequent `pixels2DContext.getImageData()` call.	2023-07-31 18:18:06 -07:00
Jiajia Qin	fa8487ea3a	[js/webgpu] Check profilingMode in each run (#16897 ) ### Description This PR moves the profilingMode check from the initialization stage to each run, so users can start or stop profiling at any time. Previously, profiling only took effect at the very beginning and could not be stopped.	2023-07-31 17:37:24 -07:00
kunal-vaishnavi	3c72f43f78	Extend saving models optimized by inference session (#16912 ) ### Description This PR adds support for saving model optimizations after loading a model that contains external data into an `InferenceSession`. ### Motivation and Context This PR is a follow-up to a [previous PR](https://github.com/microsoft/onnxruntime/pull/16716) for saving a model optimized by an `InferenceSession`.	2023-07-31 16:39:35 -07:00
Changming Sun	73ddba964f	Update the MacOS/Linux build scripts that build/install protobuf from source (#16906 ) ### Description 1. As a follow-up to #16761, this PR allows building ORT on iOS/Android without explicitly specifying a protoc path. #16761 covers WASM; this one covers iOS/Android. 2. Update the MacOS/Linux build scripts that build/install protobuf from source to make them more flexible, and add support for RedHatEnterprise (ubi), which will be needed for upgrading the base image from centos:7 to ubi:8. 3. Update tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile: the Dockerfile's base image has protobuf preinstalled in /usr/local, so we uninstall it to avoid conflicts.	2023-07-31 10:51:48 -07:00
Yi Zhang	28a099fca8	unify the steps of downloading cuda sdk and setup env (#16896 ) ### Description The `%AGENT_TEMPDIRECTORY%\v11.8` is created in azcopy step. So, the set env step should be after the azcopy step. ### Motivation and Context Correct the previous logic Unify the step since multiple jobs are using it.	2023-07-31 10:25:04 -07:00
Dmitri Smirnov	50764362ac	Update protobuf Natvis visualization (#16911 ) ### Description Protobuf library update broke debug visualization. ### Motivation and Context Hard to debug	2023-07-31 09:35:21 -07:00
satyajandhyala	77b2b618b2	[JS/WebGPU] Add Resize operator (#16680 ) ### Description Implemented Resize operator support in JSEP ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-31 09:35:06 -07:00
Hector Li	3fd1d3b9bd	Improve graph transformer DoubleQDQPairsRemover (#16910 ) Improve graph transformer DoubleQDQPairsRemover ### Description Improve DoubleQDQPairsRemover so that it does not reset the scale and zero point when the existing values are the same on the target DQ and Q nodes. ### Motivation and Context Fix a bug where DoubleQDQPairsRemover reset the scale value while removing unnecessary DQ and Q nodes.	2023-07-31 09:24:46 -07:00
satyajandhyala	dd24d52737	[JS/Web] Added Gelu contrib operator support to JSEP (#16909 ) ### Description Added Gelu operator to JSEP ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-31 09:18:58 -07:00
Tianlei Wu	92b6e10d37	skip test_smooth_quant to unblock Python Package Pipeline (#16914 ) ### Description Python Package Pipeline failed since there is exception raised in test_smooth_quant (from #16288): ``` File "/home/cloudtest/.local/lib/python3.8/site-packages/onnxruntime/quantization/quantize.py", line 384, in quantize_static importlib.import_module("neural_compressor.adaptor.ox_utils.smooth_quant") File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/__init__.py", line 24, in <module> from .contrib import * File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/__init__.py", line 19, in <module> from .strategy import * File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/strategy/__init__.py", line 26, in <module> __import__(basename(f)[:-3], globals(), locals(), level=1) File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/strategy/sigopt.py", line 22, in <module> from neural_compressor.strategy.strategy import strategy_registry, TuneStrategy File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/strategy/__init__.py", line 20, in <module> from .strategy import STRATEGIES File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/strategy/strategy.py", line 41, in <module> from ..algorithm import AlgorithmScheduler, ALGORITHMS File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/algorithm/__init__.py", line 20, in <module> from .algorithm import ALGORITHMS, Algorithm, AlgorithmScheduler, algorithm_registry File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/algorithm/algorithm.py", line 21, in <module> from neural_compressor.utils.create_obj_from_config import get_algorithm File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/utils/create_obj_from_config.py", line 20, in <module> from neural_compressor.metric import METRICS File 
"/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/metric/__init__.py", line 30, in <module> __import__(basename(f)[:-3], globals(), locals(), level=1) File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/metric/coco_tools.py", line 54, in <module> from pycocotools import coco File "/usr/local/lib/python3.8/dist-packages/pycocotools/coco.py", line 52, in <module> from . import mask as maskUtils File "/usr/local/lib/python3.8/dist-packages/pycocotools/mask.py", line 3, in <module> import pycocotools._mask as _mask File "pycocotools/_mask.pyx", line 1, in init pycocotools._mask ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject ``` The cause is pycocotools package uses "oldest-supported-numpy", which might cause older version numpy in build pycocotools: `9e9164f979/PythonAPI/pyproject.toml (L4)` Related issue: https://github.com/cocodataset/cocoapi/issues/248 ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-29 11:24:28 -07:00
Tianlei Wu	742edec5e8	[CUDA] Add PackedMultiHeadAttention operator (#16779 ) ### Description Add new operator for MultiHeadAttention with inputs removed padding. This only supports packed QKV format.	2023-07-28 16:35:38 -07:00
Alexey Kamenev	7c05f7bab1	Fix IRFFT contrib op output dimension calculation (#15662 ) ### Description Fixes the issue with IRFFT output dimension calculation as described in #13236 ### Motivation and Context Please refer to #13236 for detailed description. Specifically, [this code](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cuda/math/fft_ops.cc#L103) computes the output dimension as: ``` out_dim = in_dim * 2 - 1 ``` while it should be this instead: ``` out_dim = 2 * (in_dim - 1) ``` (assuming the original signal has even number of samples, of course). For example, if the original signal has 4 samples, then the round trip should look something like: ``` 4 -> (one-sided RFFT) -> 3 (complex) -> (one-sided IRFFT) -> 4 ``` with the current code the output will be a signal with 5 points. --------- Co-authored-by: Alexey Kamenev <akamenev@nvidia.com> Co-authored-by: Nick Geneva <nicholasgeneva@gmail.com>	2023-07-28 15:52:37 -07:00
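The length arithmetic in the IRFFT fix above can be restated as a small sketch. The helper names are hypothetical and are not part of ONNX Runtime; they only encode the formulas quoted in the commit message, assuming a one-sided transform of a real signal with an even number of samples.

```python
def rfft_output_len(n_samples: int) -> int:
    """Number of complex bins produced by a one-sided RFFT of a real signal."""
    return n_samples // 2 + 1


def irfft_output_len_old(n_bins: int) -> int:
    """Formula quoted in the commit message as the previous (incorrect) one."""
    return n_bins * 2 - 1


def irfft_output_len_fixed(n_bins: int) -> int:
    """Formula quoted in the commit message as the corrected one.

    Assumes the original real signal had an even number of samples.
    """
    return 2 * (n_bins - 1)


# Round trip from the commit message: 4 samples -> 3 complex bins -> 4 samples.
n = 4
bins = rfft_output_len(n)
print(bins, irfft_output_len_fixed(bins), irfft_output_len_old(bins))
```

With the corrected formula the round trip returns the original length for any even `n`, whereas the previous formula always yields one extra sample.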
Yulong Wang	1743e9a615	[js] enable formatter for more file types (#16888 ) ### Description enable formatter for .js/.json/.jsonc/.md files	2023-07-28 15:46:58 -07:00
Paul Willot	65534ff9ef	Update setup.py to add py311 (#16899 ) ### Description Update setup.py to add python 3.11 ### Motivation and Context Python 3.11 is supported since release 1.15	2023-07-28 13:04:50 -07:00
Scott McKay	21a71d52bd	Enable CodeQL for Android build as per 1CS requirement. (#16875 ) ### Description <!-- Describe your changes. --> Split stages for CPU and CPU+NNAPI builds as CodeQL is enabled at the stage level. We run it for CPU+NNAPI as that covers all the Android code. We don't want to run it for both as duplicate issues would be created for a problem in code included in both builds. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-28 17:54:23 +10:00
Yi Zhang	9f21f694cf	stop support to VS 2019 (#16892 ) ### Description Remove VS 2019 code. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-28 13:09:35 +08:00
pengwa	a021cb1b6e	Allow creating ConstantScalarNode for double type (#16797 ) ### Allow creating ConstantScalarNode for double type Allow creating a ConstantScalarNode for the double type. It looks like the double type is not respected when creating the constant, so fix it. ``` onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder*, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Sub) bound to different types (tensor(double) and tensor(float) in node (/_original_module/_original_model/gpt_neox/layers.0/input_layernorm/Pow_Grad/Sub_1). ```	2023-07-28 12:41:22 +08:00
Wanming Lin	634a3f2f28	[WebNN EP] Support Max and Min ops (#16858 )	2023-07-27 18:22:03 -07:00
Changming Sun	9dcbcf1d2f	Delete unused files (#16887 ) ### Description These yaml files and docker files are not used by any pipeline. If I were wrong, feel free to submit a PR to get the wrongly deleted file back from git history (git keeps everything forever).	2023-07-27 16:46:09 -07:00
Changming Sun	161a9d1d6d	Add some safety check for conv op (#16839 ) ### Description Add some safety check for conv op. It is to validate if the attributes coming from a conv op are in a valid range. (shouldn't be too large or too small).	2023-07-27 16:37:55 -07:00
satyajandhyala	e67547b978	[JS/WebGPU] Added Flatten operator support. (#16860 ) ### Description Added Flatten operator support to JSEP. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-27 12:50:45 -07:00
Hector Li	ec935a5533	[QNN EP] Check the axis attribute for LayerNorm for HTP (#16872 ) Check the axis attribute for LayerNorm for HTP ### Description Add code to check the axis value for LayerNorm for HTP explicitly to make sure only the last dimension is allowed.	2023-07-27 09:59:46 -07:00
Hector Li	2748f51603	[QNN EP] Remove duplicate string define (#16877 ) [QNN EP] Remove duplicate string define ### Description Remove duplicate string define for QNN op parameters, use them directly from QNN header file.	2023-07-27 09:08:55 -07:00
Prathik Rao	779fba1666	ORT Cache (#16744 ) ### Description <!-- Describe your changes. --> This PR adds support to cache the exported training/evaluation ONNX model in `ORTModule`. On future runs, instead of exporting the model again, we can pick up the model from a location on disc and run `ORTModule` training/evaluation. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> ORT Training DRI Contribution --------- Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com> Co-authored-by: pengwa <pengwa@microsoft.com>	2023-07-27 09:00:43 -07:00
Yi Zhang	bd95a8ea77	update onnxruntime-gpu-winbuild-T4 to onnxruntime-Win2022-GPU-T4 (#16838 ) ### Description ### Motivation and Context It's also used to upgrade visual studio to VS2022. onnxruntime-gpu-winbuild-T4 and onnxruntime-gpu-tensorrt8-winbuild-t4 are using the image based on one dev branch and VS2019 To avoid breaking the current CIs, we move jobs running on onnxruntime-gpu-winbuild-T4/onnxruntime-gpu-tensorrt8-winbuild-t4 to onnxruntime-Win2022-GPU-T4.	2023-07-27 08:38:20 -07:00
Adam Pocock	340f4ded73	[java] Fills out the javadoc so there are no more documentation warnings (#16776 ) ### Description Adds javadoc for all protected and public members, methods and classes. ### Motivation and Context The javadoc warnings were annoying me when running the builds. Also, those types should have been documented. --------- Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-07-27 16:17:03 +10:00
Wang, Mengni	fe463d4957	Support SmoothQuant for ORT static quantization (#16288 ) ### Description Support SmoothQuant for ORT static quantization via intel neural compressor > Note: Please use neural-compressor==2.2 to try SmoothQuant function. ### Motivation and Context For large language models (LLMs) with gigantic parameters, the systematic outliers make quantification of activations difficult. As a training free post-training quantization (PTQ) solution, SmoothQuant offline migrates this difficulty from activations to weights with a mathematically equivalent transformation. Integrating SmoothQuant into ORT quantization can benefit the accuracy of INT8 LLMs. --------- Signed-off-by: Mengni Wang <mengni.wang@intel.com>	2023-07-26 18:56:45 -07:00
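The "mathematically equivalent transformation" mentioned in the SmoothQuant commit above can be illustrated with a minimal pure-Python sketch. It follows the per-channel scaling from the SmoothQuant paper, s_j = max|X_j|^alpha / max|W_j|^(1 - alpha), and is not the neural-compressor or ONNX Runtime implementation; the function names, the toy matrices, and the choice alpha = 0.5 are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain nested-list matrix product: (m x k) @ (k x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def smooth(x, w, alpha=0.5):
    """Move activation outliers into the weights, channel by channel.

    x: activations, shape (tokens, channels)
    w: weights, shape (channels, out_features)
    Assumes every channel has a nonzero maximum in both x and w.
    """
    n_channels = len(w)
    scales = []
    for j in range(n_channels):
        act_max = max(abs(row[j]) for row in x)
        weight_max = max(abs(v) for v in w[j])
        scales.append(act_max ** alpha / weight_max ** (1 - alpha))
    # Divide activations by s_j and multiply the matching weight row by s_j.
    x_smooth = [[row[j] / scales[j] for j in range(n_channels)] for row in x]
    w_smooth = [[scales[j] * v for v in w[j]] for j in range(n_channels)]
    return x_smooth, w_smooth, scales


# Toy data: channel 1 of the activations carries a large outlier.
x = [[1.0, 100.0], [-2.0, 50.0]]
w = [[0.5, -0.25], [0.1, 0.2]]

x_s, w_s, s = smooth(x, w)
print(matmul(x, w))      # original product
print(matmul(x_s, w_s))  # same product up to floating-point error
print(max(abs(row[1]) for row in x_s))  # outlier channel now has a smaller range
```

Because the scaling cancels inside the matrix product, the output is unchanged while the activation range that has to be quantized becomes narrower; the weights absorb the difference.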
Justin Chu	eeef157888	Format c++ code under `winml/` (#16660 ) winml/ was previously excluded from lintrunner config. This change includes the directory and adds the clang-format config file specific to winml/ that fits existing style. --------- Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-07-25 21:56:50 -07:00
Patrice Vignola	649930142f	[DML EP] Add NCHW and float16 gamma/beta support for GroupNorm (#16814 ) This will remove transposes that are not needed in the DML kernel. To keep backward compatibility, the default behavior is to set NHWC when no attribute is set.	2023-07-25 21:43:29 -07:00
pengwa	39fca225ea	ORTModule log clean up (#16795 ) ### ORTModule log clean up ORTModule log levels: WARNING (default) is for end users; INFO and VERBOSE are for internal ORT training developers. A few issues: 1. ONNX export outputs lots of WARNING messages like "The shape inference of com.microsoft::SoftmaxCrossEntropyLossInternal/ATen/PythonOp type is missing", which are useless for us or end users. ![image](https://github.com/microsoft/onnxruntime/assets/10530022/f2409480-32e1-483d-bd18-f14149f0588d) 2. ORT also prints some information like "CleanUnusedInitializersAndNodeArgs] Removing initializer" and "ReverseBFSWithStopGradient] Skip building gradient for", which is also useless for us or end users most of the time. ![image](https://github.com/microsoft/onnxruntime/assets/10530022/ff3feaf1-3cb2-4392-b087-86b30b72994c) 3. Different ranks all output logs, which makes ORT developers or end users feel there are too many logs, and the logs are usually not useful until we need to investigate. A few improvements for these issues: 1. For ONNX export logs, there are two kinds of logs: a. the export verbose log; b. other logs printed by the torch C++ backend. So this PR makes the following change: # VERBOSE -> FULL export verbose log + FULL torch other logs from stdout and stderr (C++ backend) # INFO -> FULL export verbose log + FILTERED torch other logs from stdout and stderr (C++ backend) # WARNING/ERROR -> [Rank 0] NO export verbose log + FILTERED torch other logs from stdout and stderr (C++ backend) e.g. for the verbose level, print all logs as usual; for the info level, print the verbose export log and filtered logs from the torch C++ backend (removing messages like "The shape inference of com.microsoft::SoftmaxCrossEntropyLossInternal/ATen/PythonOp type is missing"). For higher levels, only log on rank 0. 2. For ORT gradient graph build and session creation, also suppress and filter out these messages when log level >= INFO. 3. When log level > INFO, only logs on rank 0 are emitted, for a cleaner user experience. This is the log for a BLOOM model training after the change: there are only a limited number of warnings. ![image](https://github.com/microsoft/onnxruntime/assets/10530022/f270b8d5-2944-49d2-a253-c07057d641a0)	2023-07-26 12:42:50 +08:00
Dmitri Smirnov	bf006d34a9	Used feature macro for if constexpr in a public header (#16836 ) ### Description Use feature macro for `if constexpr` ### Motivation and Context We still do not require customers to use C++17 compiler.	2023-07-25 21:42:30 -07:00
Wanming Lin	d0df83e408	[WebNN EP] Support rest Reduce* ops (#16824 ) Add ReduceL1, ReduceL2, ReduceLogSum, ReduceLogSumExp, ReduceMin, ReduceProd, ReduceSum, ReduceSumSquare.	2023-07-25 17:26:48 -07:00
Justin Chu	0c1a5098dc	Disable PERF* rules in ruff to allow better readability (#16834 ) ### Description Disable two PERF* rules in ruff to allow better readability. The rationale is commented inline. This change also removes the noqa directives left unused by the rule change. ### Motivation and Context Readability	2023-07-25 15:38:22 -07:00
Yulong Wang	53c771f215	[js/common] add unit tests for onnxruntime-common (#16812 ) ### Description "onnxruntime-common" is getting more and more complicated, so it's a good idea to add unit tests for it. Includes the following changes: - move `mocha` from each subfolder (js/web/, js/node/) to the root (js/), so that it is installed once and all subfolders can use it. - add folder `test` in js/common/ as the root folder for ort-common tests. - add subfolder `type-tests`. This folder contains a few TypeScript source files, which are excluded from tsconfig.json and are not compiled by default. Instead, the file `type-tests.ts` calls the TypeScript compiler (tsc) to check whether the compilation result for each file under this folder is as expected. If tsc compiles a file successfully when a failure is expected, this is considered a failed test. - add subfolder `unit-tests`. Files under this folder are compiled by default. We use the default mode of mocha (using `describe()` and `it()`) to set up test groups and cases. - update eslint rules accordingly.	2023-07-25 14:37:41 -07:00
satyajandhyala	03ce0a5693	[Web/JS] Added Slice operator in JSEP. (#16811 ) ### Description Added Slice operator support to JSEP. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-25 14:19:20 -07:00
Adam Pocock	a1bb670536	[java] Fp16 fix for android/react native (#16832 ) ### Description This PR splits out the FP16 conversions into a separate package we can override in the android build with a version which works on old versions of Android. I'm not sure the android build system changes are correct as I haven't got an android build environment configured on my workstation. @YUNQIUGUO if the CI build fails we should follow up offline to get my environment configured so I can iterate on it. ### Motivation and Context Fixes the CI failure after #16703.	2023-07-25 12:31:32 -07:00
Edward Chen	e01365f80b	Update upload_pod_archive_and_update_podspec.sh to take path pattern (#16810 ) Update upload_pod_archive_and_update_podspec.sh to take a pod archive path glob pattern. The actual pod archive path has a version suffix which changes.	2023-07-25 08:55:31 -07:00
Yi Zhang	38db5eca65	replace onnxruntime-Win-CPU-2019 with onnxruntime-Win-CPU-2022 (#16844 ) ### Description <!-- Describe your changes. --> ### Motivation and Context upgrade to VS2022	2023-07-25 23:05:34 +08:00
Luis Rios	feeb0b50f9	Improve TreeNodeElementId hash function (#16459 ) ### Description This PR improves the `TreeNodeElementId` hash function by employing the [Elegant Pairing function](http://szudzik.com/ElegantPairing.pdf). In a few words, the Elegant Pairing function maps two non-negative integers to a non-negative integer that is uniquely associated with that pair. This drastically reduces collisions and therefore reduces the time required to create a session for a large tree ensemble model. ### Motivation and Context We use ONNX Runtime to serve our models as part of a Triton backend. We noticed that it was taking around 2 minutes to load a large tree ensemble model (around 5k trees with around 3 million nodes in total). After investigating the issue, it was clear that the `TreeNodeElementId` hash function was unable to map keys to buckets of the C++ `unordered_map` without a significant number of collisions (in some cases 700 items per bucket). The following picture shows graphically the improvement obtained by the proposed change. We used the `onnx_test_runner` command. ![flamegraph](https://github.com/microsoft/onnxruntime/assets/3594678/2588e87c-125b-4a4b-8f03-55e00ae25e08) #### Before ``` $> time ./onnx_test_runner -v ~/folder_with_model result: Models: 1 Total test cases: 0 Succeeded: 0 Not implemented: 0 Failed: 0 Stats by Operator type: Not implemented(0): Failed: Failed Test Cases: real 0m55.695s user 0m52.919s sys 0m0.760s ``` #### After ``` $> time ./onnx_test_runner -v ~/folder_with_model result: Models: 1 Total test cases: 0 Succeeded: 0 Not implemented: 0 Failed: 0 Stats by Operator type: Not implemented(0): Failed: Failed Test Cases: real 0m17.152s user 0m14.318s sys 0m0.619s ```	2023-07-25 14:25:50 +02:00
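The pairing function referenced in the commit above can be sketched in a few lines. This is the function as defined in the linked Elegant Pairing paper, written in Python for illustration; it is not the C++ code from the PR, and the names `elegant_pair` and `elegant_unpair` are hypothetical.

```python
from math import isqrt


def elegant_pair(x: int, y: int) -> int:
    """Map two non-negative integers to a single non-negative integer.

    Each pair gets a distinct result, so distinct (x, y) keys never collide
    before the value is reduced to a bucket index.
    """
    if x < 0 or y < 0:
        raise ValueError("both inputs must be non-negative")
    return y * y + x if x < y else x * x + x + y


def elegant_unpair(z: int) -> tuple[int, int]:
    """Inverse of elegant_pair, shown only to demonstrate uniqueness."""
    s = isqrt(z)
    r = z - s * s
    return (r, s) if r < s else (s, r - s)


# Every pair in a 20 x 20 grid maps to a different integer.
codes = {elegant_pair(x, y) for x in range(20) for y in range(20)}
print(len(codes))  # 400 distinct values for 400 pairs
print(elegant_pair(2, 1), elegant_unpair(7))
```

For inputs below some bound `n`, the results stay below `n * n`, so the codes fill the range `0 .. n*n - 1` with no gaps, which is what makes the function attractive as a hash over two integer fields.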
BoarQing	daef133982	update onnxruntime_perftest's README.md as vitisai is supported on v1.15.1 (#16827 ) ### Description Update README.md to list vitisai for onnxruntime_perftest. ### Motivation and Context The perftest tool supports vitisai, but the README.md does not list it. This creates confusion internally about whether vitisai is supported. See https://github.com/microsoft/onnxruntime/pull/15673 for context.	2023-07-25 13:39:26 +02:00
Yi Zhang	f88f0d8e36	Upgrade 4 stages in nuget pipeline to VS2022 (#16825 ) ### Description ### Motivation and Context Continue upgrading to VS2022 ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331377&view=results N.B. In practice, SDLNativeRules@3 doesn't support VS2019.	2023-07-25 14:22:39 +08:00
Yulong Wang	8b30dc11d7	Update run_CIs_for_external_pr.py to skip passed checks (#16808 ) ### Description Update run_CIs_for_external_pr.py to skip passed checks	2023-07-25 16:11:53 +10:00
Yi Zhang	2e214d6e27	Workaround to upgrade VS2022 for Windows ARM build (#16826 ) ### Description ### Motivation and Context It should be reverted when VS2022 is upgraded to 17.7 or above. ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331401&view=logs&j=7517abfd-115a-5c61-78a0-7ba3c9e3a88d	2023-07-25 08:35:52 +08:00
pengwa	f2c0470436	Fix slice upstream - Incompatible dimensions (#16818 ) ### Fix slice upstream - (MatMul) [ShapeInferenceError] Incompatible dimensions ``` 2023-07-22 14:58:16.918478478 [I:onnxruntime:Default, constant_sharing.cc:256 ApplyImpl] Total shared scalar initializer count: 10 2023-07-22 14:58:16.919494252 [W:onnxruntime:Default, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'onnx::Cast_424' source:{-1,31,-1,-1} target:{-1,32,-1,-1}. Falling back to lenient merge. 2023-07-22 14:58:16.921014114 [W:onnxruntime:Default, graph.cc:108 MergeShapeInfo] Error merging shape info for output. 'onnx::MatMul_425' source:{-1,31,-1,-1} target:{-1,32,-1,-1}. Falling back to lenient merge. Traceback (most recent call last): File "examples/onnxruntime/training/language-modeling/run_clm.py", line 594, in <module> main() File "examples/onnxruntime/training/language-modeling/run_clm.py", line 542, in main train_result = trainer.train(resume_from_checkpoint=checkpoint) File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 454, in train return inner_training_loop( File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 755, in _inner_training_loop tr_loss_step = self.training_step(model, inputs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/transformers/trainer.py", line 2735, in training_step loss = self.compute_loss(model, inputs) File "/bert_ort/pengwa/optimum/optimum/onnxruntime/trainer.py", line 363, in compute_loss return model_with_loss(dict_inputs, return_outputs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(input, kwargs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn ret_val = func(args, *kwargs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1724, in forward loss = self.module(inputs, *kwargs) File 
"/bert_ort/pengwa/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl return forward_call(input, *kwargs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_utils.py", line 384, in _forward return ortmodule._torch_module.forward(inputs, *kwargs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_utils.py", line 364, in _forward return torch_module_ort._execution_manager(torch_module_ort.is_training()).forward(inputs, *kwargs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 345, in forward self._fallback_manager.handle_exception( File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_fallback.py", line 157, in handle_exception raise exception File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 280, in forward self._build_graph(graph_transformer_config) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_logger.py", line 218, in wrapper result = func(graph_execution_manager, args, *kwargs) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 360, in _build_graph super()._build_graph(graph_transformer_config) File "/bert_ort/pengwa/py38/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_graph_execution_manager.py", line 186, in _build_graph self._graph_builder.build(config) RuntimeError: /bert_ort/pengwa/onnxruntime/orttraining/orttraining/python/orttraining_pybind_state.cc:823 onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Node (MatMul_403) Op (MatMul) 
[ShapeInferenceError] Incompatible dimensions ``` Missed using `axis` attribute for `Slice` op, so change to use `axes` inputs instead. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-25 08:21:46 +08:00
Wei-Sheng Chin	b0279b14d8	[DORT] Enable Dynamic Shape in DORT and Use Different InferenceSession's when Inputs Are Not Compatible (#16753 ) Sometimes, ONNX exporter generates rank- or shape-dependent sub-graphs. Thus, error could occur when running the ONNX model with different inputs. This PR ([78e736d](`78e736d857`)) addresses this problem by - if needed, exporting multiple ONNX models with different inputs for the same GraphModule. - implementing a naive mechanism to determine of existing ONNX models (and the associated InferenceSession) can be reused. On the other hand, in the second commit [b5a9b5f](`b5a9b5f849`), this PR also enables dynamic shapes in DORT by - passing dynamic_shapes = True to exporter (see how DEFAULT_DYNAMIC_BACKEND is created) - calling torch._dynamo.optimize(dynamic_ort_aot, dynamic=True) (see how dynamic_ort_aot is created).	2023-07-24 16:54:01 -07:00
dependabot[bot]	4b6d9fa851	Bump actions/deploy-pages from 1 to 2 (#16402 )	2023-07-24 16:13:59 -07:00
Maximilian Müller	d8d8349a1b	fix: add missing nullptr of SessionOptions V2 (#16794 ) /builds/devtechproviz/dl/ort-builder/onnxruntime/onnxruntime/python/onnxruntime_pybind_state.cc:388:14: error: missing initializer for member 'OrtTensorRTProviderOptionsV2::trt_cuda_graph_enable' [-Werror=missing-field-initializers] 388 \| 0}; \| ### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-24 15:17:11 -07:00

9253 commits