onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-20 02:07:56 +00:00

Author	SHA1	Message	Date
Baiju Meswani	fb85b31fac	Remove protobuf pin from training requirements (#13695 )	2022-11-22 12:27:18 -08:00
Yulong Wang	2bebe6189a	set node schema when apply NHWC transformer (#13660 ) ### Description set node schema when apply NHWC transformer ### Motivation and Context The implementation in `IExecutionProvider::GetCapability()` checks node schema to determine the capability of the current EP. If NHWC graph transformer created a new channel last `Conv` node to replace the channel first `Conv` node, we need to assign the schema to the replaced node.	2022-11-22 12:26:52 -08:00
Patrice Vignola	ce460f9cdb	[DML EP] Return device removal reason when D3D12 device gets removed (#13727 ) ### Description Before this change, when the D3D12 device was getting removed, we were returning a generic device removed error, which can be harder to investigate. ### Motivation and Context It makes it easier to debug and investigate device removal failures.	2022-11-22 10:38:56 -08:00
Patrice Vignola	6c5333e1a7	[DML EP] Enable more DML tests (#13726 ) ### Description Enables more DML tests. ### Motivation and Context It increases test coverage that was missing for the DML EP	2022-11-22 10:35:16 -08:00
Adam Pocock	dd2c031d95	[java] Sparse tensor support (#10653 ) Description: Adds support for creating and receiving sparse tensors in the ORT Java API. CSRC and COO tensors are tested as inputs, but there is no op that accepts a block sparse tensor to test. COO tensors are tested as outputs, but there is no op that emits a CSRC or block sparse tensor to test. Motivation and Context: Request to expose ORT sparse tensor support in Java. cc @yuslepukhin	2022-11-22 10:29:24 -08:00
Tianlei Wu	8b0e0f4927	Add RemovePadding and RestorePadding for BERT model (#13701 ) Add two operators, RemovePadding and RestorePadding, based on the idea of effective transformer (https://github.com/bytedance/effective_transformer) to improve large batch size inference for the BERT model.	2022-11-22 10:00:23 -08:00
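The padding-removal idea behind this commit can be illustrated with a minimal plain-Python sketch. The function names, arguments, and list-based data layout below are hypothetical and chosen only for illustration; they are not the signatures of the ORT `RemovePadding` and `RestorePadding` operators.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not the actual ORT operator interfaces.

def remove_padding(batch, lengths):
    """Pack the real tokens of every sequence into one flat list.

    Returns the packed tokens plus the (batch index, position) of each
    token so that the padded layout can be rebuilt later.
    """
    packed, offsets = [], []
    for b, (seq, n) in enumerate(zip(batch, lengths)):
        for t in range(n):
            packed.append(seq[t])
            offsets.append((b, t))
    return packed, offsets


def restore_padding(packed, offsets, batch_size, max_len, pad=0):
    """Scatter packed tokens back into a padded batch_size x max_len layout."""
    out = [[pad] * max_len for _ in range(batch_size)]
    for value, (b, t) in zip(packed, offsets):
        out[b][t] = value
    return out


batch = [[11, 12, 0, 0], [21, 22, 23, 0]]  # 0 marks padding
lengths = [2, 3]                           # real token count per sequence
packed, offsets = remove_padding(batch, lengths)
restored = restore_padding(packed, offsets, batch_size=2, max_len=4)
```

Work done between the two steps touches only the 5 real tokens instead of all 8 slots, which is where the saving for large, heavily padded batches would come from.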
guyang3532	ba9a585fcc	Fix the tensor save for backward release problem (#13679 ) Motivation: PythonOp saves its inputs for backward, which is risky because the ONNX Runtime backend is not aware of this: the tensor buffer may be "released" by ORT and then modified by other operators before the backward function executes. Fix: This PR simply clones all inputs of PythonOp before forward is invoked. This may incur high overhead; it is only a workaround until a better fix is available.	2022-11-22 17:32:19 +08:00
pengwa	947aab0ae0	Make HF converge with lighting native amp (#13616 ) ### Fix training convergence issues
#### Problem
Environment:
- Huggingface Transformers: 4.22.0
- PyTorch Lightning: 1.6.3
- PyTorch: v1.12.1, cuda 11.6
- ORT: main branch, cuda 11.6
- Model: RobertaForSequenceClassification @ models/roberta/modeling_roberta.py
- Mixed precision training with `torch.autocast`: `a64e1dfd7d/pytorch_lightning/plugins/precision/native_amp.py (L99)`

The forward pass and the loss computation run under this AMP autocast context. Here is a snippet of the loss computation.
```
if labels is not None:
    ...
    if self.config.problem_type == "regression":
        loss_fct = MSELoss()
        if self.num_labels == 1:
            ...
    elif self.config.problem_type == "single_label_classification":
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
    elif self.config.problem_type == "multi_label_classification":
        ...
return SequenceClassifierOutput(
    loss=loss,
    logits=logits,
    hidden_states=outputs.hidden_states,
    attentions=outputs.attentions,
)
```
After the forward run, the loss is 1.0850 in float16, which looks good. The loss is then scaled up here: `a64e1dfd7d/pytorch_lightning/plugins/precision/native_amp.py (L62)`; the scaler is 65536. This gives a scaled loss of 71104 in float32 (the float16 loss multiplied by the fp32 scaler is promoted to fp32). Backward then starts with an initial gradient of 1, and the backward step computes 1 (float32) * 65536 (float32) but generates a float16 gradient, which overflows to `inf`. This is where the problem occurs. The backward pass feeds the `inf` into the crossentropygradient op, generating `nan`s, and then all gradients become `nan` during back propagation. As a result, training with ORTModule almost always reports `overflow`, and the loss does not drop much compared with PyTorch.
#### Analysis for the UT (when autocast enabled)
The PyTorch trace graph looks like this:
```
graph(%0 : Float(16, 3, strides=[3, 1], requires_grad=0, device=cuda:0),
      %target : Long(16, strides=[1], requires_grad=0, device=cuda:0),
      %2 : Float(3, 3, strides=[3, 1], requires_grad=1, device=cuda:0)):
  %9 : int = prim::Constant[value=5]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %10 : bool = prim::Constant[value=0]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %11 : bool = prim::Constant[value=0]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %12 : NoneType = prim::Constant()
  %13 : Half(3, 3, strides=[3, 1], requires_grad=0, device=cuda:0) = aten::to(%2, %9, %10, %11, %12) # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %14 : int = prim::Constant[value=5]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %15 : bool = prim::Constant[value=0]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %16 : bool = prim::Constant[value=0]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %17 : NoneType = prim::Constant()
  %18 : Half(16, 3, strides=[3, 1], requires_grad=0, device=cuda:0) = aten::to(%0, %14, %15, %16, %17) # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %19 : NoneType = prim::Constant()
  %input : Half(16, 3, strides=[3, 1], requires_grad=0, device=cuda:0) = aten::linear(%18, %13, %19) # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
  %21 : NoneType = prim::Constant()
  %22 : int = prim::Constant[value=1]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/functional.py:3014:0
  %23 : int = prim::Constant[value=-100]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/functional.py:3014:0
  %24 : float = prim::Constant[value=0.]() # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/functional.py:3014:0
  %data : Float(requires_grad=0, device=cuda:0) = aten::cross_entropy_loss(%input, %target, %21, %22, %23, %24) # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/functional.py:3014:0
  %27 : Float(requires_grad=0, device=cuda:0) = ^_OutputIdentityOp()(%data) # /opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_io.py:430:0
  return (%27)
```
The most important lines:
- %target : Long(16, strides=[1], requires_grad=0, device=cuda:0)
- %input : _Half_(16, 3, strides=[3, 1], requires_grad=0, device=cuda:0) = aten::linear(%18, %13, %19) # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/modules/linear.py:114:0
- %data : _Float_(requires_grad=0, device=cuda:0) = aten::cross_entropy_loss(%_input_, %target, %21, %22, %23, %24) # /opt/conda/envs/ptca/lib/python3.8/site-packages/torch/nn/functional.py:3014:0

`aten::cross_entropy_loss` takes a Half input and returns a Float output. As stated in the doc (https://pytorch.org/docs/stable/amp.html#cuda-ops-that-can-autocast-to-float32), `cross_entropy` in autocast mode runs in fp32: it converts its input to fp32 (if it is not already), does the computation, and returns an fp32 result. On the other hand, ORT's `SoftmaxCrossEntropyLossInternal` takes the same type for input and output, and our code at `31cb3cb254/orttraining/orttraining/python/training/ortmodule/_custom_op_symbolic_registry.py (L68)` assumed this when exporting `aten::cross_entropy_loss` and set the output to fp16 as well. This is the cause of the problem.
#### Possible Fixes
1. Enhance `SoftmaxCrossEntropyLossInternal` to support different types of input and output.
2. Check the input and output types when exporting, and add the input cast explicitly if there is type promotion from input to output.

This PR uses the 2nd approach. We can start on the 1st approach later if needed. TODO: revisit all other exporter functions, add the checks, etc.	2022-11-22 15:08:30 +08:00
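The overflow arithmetic described in this commit message can be reproduced with a small standard-library sketch. The helper `to_fp16` is a hypothetical name introduced here; it rounds a value to IEEE 754 half precision through `struct`'s `e` format and maps values too large for that format to infinity. It is meant only to illustrate the numbers quoted above, not how ORT or PyTorch perform the cast.

```python
import math
import struct


def to_fp16(x):
    """Round x to IEEE 754 half precision; too-large values become +/-inf.

    Hypothetical helper for illustration. struct raises OverflowError when
    a value does not fit the 'e' (binary16) format.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


scale = 65536.0                  # loss scaler from the commit message
loss_fp16 = to_fp16(1.0850)      # forward loss as stored in float16
scaled_loss = loss_fp16 * scale  # promoted to a wider type, so still finite

initial_grad = 1.0 * scale       # backward seed multiplied by the scaler
grad_fp16 = to_fp16(initial_grad)  # does not fit in float16

print(loss_fp16, scaled_loss, grad_fp16)
```

The largest finite float16 value is 65504, so a gradient of 65536 cannot be represented once the output type is forced to fp16, while the same value is harmless in fp32.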
Changming Sun	67e46a873a	Add '-DCMAKE_OSX_ARCHITECTURES=x86_64;arm64' when build protobuf from source on MacOS (#13720 ) ### Description Add '-DCMAKE_OSX_ARCHITECTURES=x86_64;arm64' when building protobuf from source on MacOS, because later on we will link the built library with the other parts of onnxruntime to generate libonnxruntime.dylib, and if the target CPU ARCH of libonnxruntime.dylib is not x86_64, it will fail. ### Motivation and Context To fix a packaging pipeline failure that was introduced by #13694	2022-11-21 21:59:34 -08:00
PeixuanZuo	8f3c6ea0df	[ROCm] Add GemmFastGelu TunableOp (#13589 ) ### Description 1. Update the rules for GemmFastGelu fusion: MatMul input x should have at least two dimensions, and the input weight should have exactly two dimensions. 2. Add GemmFastGelu fusion test. 3. Add GemmFastGelu TunableOp, which contains only the original implementation (Gemm + FastGelu). Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>	2022-11-22 12:58:01 +08:00
PeixuanZuo	45a895cdc3	[ROCm] Fix static TunableOp (#13668 ) ### Description 1. Re-add staticSelectionOp for FastGelu. 2. Call TunableOp when tuning is enabled; call StaticSelectionOp when tuning is disabled. Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>	2022-11-22 10:51:54 +08:00
Yulong Wang	f1b5e4f1c9	[js] [deps] upgrade @xmldom/xmldom@0.7.9 (#13705 ) ### Description upgrade @xmldom/xmldom@0.7.9 ### Motivation and Context
```
yarn audit
yarn audit v1.22.19
┌───────────────┬──────────────────────────────────────────────────────────────┐
│ critical      │ xmldom allows multiple root nodes in a DOM                   │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Package       │ @xmldom/xmldom                                               │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Patched in    │ >=0.7.7                                                      │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Dependency of │ @expo/config-plugins                                         │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ Path          │ @expo/config-plugins > @expo/plist > @xmldom/xmldom          │
├───────────────┼──────────────────────────────────────────────────────────────┤
│ More info     │ https://www.npmjs.com/advisories/1084900                     │
└───────────────┴──────────────────────────────────────────────────────────────┘
1 vulnerabilities found - Packages audited: 952
Severity: 1 Critical
Done in 3.51s.
```	2022-11-21 17:01:42 -08:00
Seungwon Jeong	307ad1413a	[js/web] support 'pytorch_half_pixel' mode for WebGL kernel 'Resize' (#11208 ) Description:
1. Add the pytorch_half_pixel interpolation mode in resize-packed.ts. Changes: add the following case in the createPackedResizeProgramInfo function:
```
case 'pytorch_half_pixel':
  getSourceFracIndex = `
    vec4 getSourceFracIndex(ivec4 coords) {
      vec4 fcoords = vec4(coords);
      return vec4(
        ${outputWidth}.0 > 1.0 ? (fcoords.x + 0.5) / scaleWHWH.x - 0.5 : 0.0,
        ${outputHeight}.0 > 1.0 ? (fcoords.y + 0.5) / scaleWHWH.y - 0.5 : 0.0,
        ${outputWidth}.0 > 1.0 ? (fcoords.z + 0.5) / scaleWHWH.z - 0.5 : 0.0,
        ${outputHeight}.0 > 1.0 ? (fcoords.w + 0.5) / scaleWHWH.w - 0.5 : 0.0
      );
    }
  `;
  break;
```
2. Fix the "unrecognized input '' for node: Resize_$num" error that occurs when inputs like [input_tensor, None, scale_factor] (roiInput not given) are fed into the resize layer. Changes: update the input handling logic in upsample.ts and the node scanning logic in graph.ts.

Motivation and Context: Before this fix, we weren't able to use the WebGL backend when the neural network contained PyTorch resize layers. This fix adds 'pytorch_half_pixel' interpolation mode support and makes it possible to use the WebGL backend for more kinds of computer vision networks. This commit solves: #10430 Co-authored-by: neo <neo@icode-lab.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2022-11-21 12:03:48 -08:00
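The per-component expression in the shader above follows the `pytorch_half_pixel` coordinate mapping: an output index is mapped back to a fractional source coordinate, with a special case when the output has length 1. A minimal scalar sketch in Python follows; the function and parameter names are hypothetical and do not come from the onnxruntime-web source.

```python
def pytorch_half_pixel(out_index, scale, out_length):
    """Map an output index to a fractional source coordinate.

    Mirrors the shader expression:
        out_length > 1 ? (x + 0.5) / scale - 0.5 : 0.0
    Names are illustrative assumptions, not onnxruntime-web identifiers.
    """
    if out_length > 1:
        return (out_index + 0.5) / scale - 0.5
    return 0.0


# Upscaling a length-2 axis to length 4 (scale = 2.0):
coords = [pytorch_half_pixel(i, 2.0, 4) for i in range(4)]
print(coords)
```

The shader evaluates the same expression four times per call because the packed WebGL kernel handles width and height coordinates for two texels at once.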
shalvamist	3119381011	ORT Web build script (#12643 ) Description: Adds a few scripts that let users build ORT Web in a simpler way. Instructions: Under the ROOT\js folder there are 2 scripts: 1. "Build_web.bat" for Windows users 2. "Build_web.sh" for Linux users. The default build configuration is "Release"; to change it, add the flag "--config <Desired configuration>" to the script call. For example: ``` build_web.bat --config Debug ``` Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2022-11-21 11:08:39 -08:00
Changming Sun	a5c2047dd1	Fix the remaining Prefast warnings in CPU EP (#13707 ) ### Description Fix the remaining Prefast warnings in CPU EP.	2022-11-21 10:21:38 -08:00
cloudhan	8de5381e84	Add IsSupported support to Op functor (#13692 ) Sometimes it is a bit risky to call the Op directly to check whether the impl supports consuming the param. This gives the user a way to actually implement `IsSupported` for checking in non-compact way.	2022-11-21 19:22:00 +08:00
shalvamist	4a2a857030	Bug Fix - WASM build break (#13699 ) ### Description Using the build flag "--cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1" with WASM results in a build break. Since we are comparing a const vs. a non-const T type, the added cast resolves the issue.	2022-11-20 23:30:31 -08:00
PeixuanZuo	da2bd3ad4d	[ROCm] Build ROCm CI with Release config and enable kernel explorer test (#13687 ) ### Description 1. Build ROCm CI with Release config to save time. 2. Use 32 threads to build; the new CI machine has 256 threads. 3. Enable the ROCm kernel explorer test. Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>	2022-11-21 10:04:10 +08:00
dependabot[bot]	8472876155	Bump socket.io-parser from 4.0.4 to 4.0.5 in /js/web (#13608 ) Bumps [socket.io-parser](https://github.com/socketio/socket.io-parser) from 4.0.4 to 4.0.5. Release notes (sourced from socket.io-parser's releases, https://github.com/socketio/socket.io-parser/releases): 4.0.5 (2022-06-27), Bug Fixes: check the format of the index of each attachment (b559f05). The upstream changelog (https://github.com/socketio/socket.io-parser/blob/main/CHANGELOG.md) also lists 4.2.0 (2022-04-17), Features: allow the usage of custom replacer and reviver (#112) (b08bc1a); 4.1.2 (2022-02-17), Bug Fixes: allow objects with a null prototype in binary packets (#114) (7f6b262); 4.1.1 (2021-10-14); and 4.1.0 (2021-10-11), Features: provide an ESM build with and without debug (388c616). Commits: f3329eb chore(release): 4.0.5; b559f05 fix: check the format of the index of each attachment. Full diff: https://github.com/socketio/socket.io-parser/compare/4.0.4...4.0.5 Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-19 12:55:21 -08:00
Nat Kershaw (MSFT)	43a7b520e4	Convert label config to one line regexes (#13702 )	2022-11-19 11:38:29 -08:00
Yulong Wang	2d732e9729	[js] [deps] upgrade minimatch@3.1.2 (#13703 ) ### Description upgrade minimatch@3.1.2 ### Motivation and Context ``` # npm audit report minimatch <3.0.5 Severity: high minimatch ReDoS vulnerability - https://github.com/advisories/GHSA-f8q6-p94x-37v3 ```	2022-11-18 22:27:57 -08:00
Hariharan Seshadri	c7329e004d	Improve fp16 performance of GPT-2's logits MatMul while using BeamSearch (#13686 )	2022-11-18 18:50:19 -08:00
dependabot[bot]	c358d64b0e	Bump loader-utils from 2.0.0 to 2.0.4 in /js/web (#13666 ) Bumps [loader-utils](https://github.com/webpack/loader-utils) from 2.0.0 to 2.0.4. Release notes (sourced from loader-utils's releases, https://github.com/webpack/loader-utils/releases): v2.0.4 (2022-11-11), Bug Fixes: ReDoS problem (#225) (ac09944); v2.0.3 (2022-10-20), Bug Fixes: security: prototype pollution exploit (#217) (a93cf6f); v2.0.2 (2021-11-04), Bug Fixes: base64 generation and unicode characters (#197) (8c2d24e); v2.0.1 (2021-10-29), Bug Fixes: md4 support on Node.js v17 (#193) (1069f61). Commits: 6688b50 chore(release): 2.0.4; ac09944 fix: ReDoS problem (#225); 7162619 chore(release): 2.0.3; a93cf6f fix(security): prototype polution exploit (#217); 90c7c4b chore(release): 2.0.2; 8c2d24e fix: base64 generation and unicode characters (#197); 5fb5562 chore(release): 2.0.1; 1069f61 fix: md4 support on Node.js v17 (#193). Full diff: https://github.com/webpack/loader-utils/compare/v2.0.0...v2.0.4 Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-11-18 18:01:25 -08:00
Edward Chen	4901987d1d	Remove SafeInt dependency from Objective-C API. (#13698 )	2022-11-18 17:06:12 -08:00
Changming Sun	3e9e5e9d6d	Patch Protobuf and ONNX's cmake files and enforce BinSkim check (#13694 ) Patch Protobuf and ONNX's cmake files and enforce BinSkim check. This PR overlaps with #13523 . I would prefer to get this one merged first so that we can finish the BinSkim work, and I tried to make this PR as small as possible.	2022-11-18 10:09:47 -08:00
Wei-Sheng Chin	6160ba0692	Fix aten::_to_copy in DORT (#13682 ) `aten::_to_copy` is not exportable to ONNX, so in DORT it is replaced in `_replace_to_copy_with_to`. This replacement logic became incorrect with the latest PyTorch commit, and this PR is a fix. Basically, we examine more keyword attributes passed to `aten::_to_copy`, and if they lead to a type casting operator (i.e., one mapped to ONNX's Cast), we replace that `aten::_to_copy` with `aten::to`. Unsupported attributes are removed (with a low risk of breaking the FX graph's assumptions).	2022-11-18 09:31:18 -08:00
Vincent Wang	07812a2fa6	Fix UT Failure on AMD for ORTModule's Conv Test (#13688 ) Currently provider option conv_algo_search is for CUDA only, so remove the checking for ROCm EP.	2022-11-18 17:52:22 +08:00
Changming Sun	7a57976d1a	Make natvis files work better (#13665 ) ### Description After this change, the GSL.natvis and wil.natvis files are added to every onnxruntime_xxx project. Like this: ![image](https://user-images.githubusercontent.com/856316/202081013-314145a8-7a0f-4f45-bf85-f9ed0e247c63.png) This is because in onnxruntime_common.cmake we have:
```cmake
if (MSVC)
  set(ABSEIL_NATVIS_FILE "abseil-cpp.natvis")
  target_sources(
    onnxruntime_common
    INTERFACE $<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/external/${ABSEIL_NATVIS_FILE}>)
endif()
```
It sets a property, INTERFACE_SOURCES, on the target "onnxruntime_common". Then if anyone else uses:
```
target_link_libraries(mytarget PRIVATE onnxruntime_common)
```
the natvis file will be added to `mytarget`. However, in this project we don't use `target_link_libraries` this way for targets that are static libraries. For example, onnxruntime_graph is a static library. Instead, we use the `onnxruntime_add_include_to_target` function to explicitly control what we want to propagate. The function was written before we started to have natvis files, so it doesn't pass a source file from one static library to another. Now we have the need, probably only on Windows. ### Motivation and Context Add natvis files to every project.	2022-11-17 19:13:40 -08:00
Ye Wang	38a74af45d	Support position_ids broadcasting in EmbedLayerNorm (#13677 ) ### Motivation and Context fix https://github.com/microsoft/onnxruntime/issues/13508	2022-11-17 17:56:27 -08:00
Adrian Lizarraga	abfdb63e31	Update protobuf-java to version 3.21.7 (#13630 ) ### Description Update protobuf-java to version 3.21.7. This change only impacts tests. ### Motivation and Context The current version is affected by CVE-2022-3509	2022-11-17 15:04:42 -08:00
pengwa	d5721b3464	Fix wrong import path in docs (#13680 ) ### Fix wrong import path in docs	2022-11-17 18:15:02 +08:00
PeixuanZuo	a50877ac99	[ROCm] Add ROCm5.3.2 to python package pipeline (#13664 ) ### Description Add ROCm5.3.2 to the python package pipeline. We build the rocm/dev-centos-7:x.x.x stage ourselves to avoid depending on AMD's release. Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>	2022-11-17 16:10:49 +08:00
Sunny Shukla	de77c60e6e	[oneDNN ep] SLN performance improvement for bias (#13620 ) ### Description SkipLayerNorm performance improvement when bias is present as input ### Motivation and Context - For SkipLayerNorm op, adding bias tensor using post-op to the add primitive adding input and skip tensors is causing drastic performance degradation. - Hence the post-op is removed and instead, two add primitives are used in series, adding input and skip, and then adding bias to the result of input and skip. - This change has shown a significant amount of performance gain for SkipLayerNorm operator.	2022-11-16 21:25:00 -08:00
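The "two adds in series" ordering described in this commit can be sketched in plain Python for a single row of hidden values. This is only an illustration of the arithmetic (add input and skip, then add bias, then layer-normalize); the function name, argument order, and the `eps` default are assumptions and do not reflect the oneDNN EP implementation.

```python
import math


def skip_layer_norm(x, skip, gamma, beta, bias=None, eps=1e-12):
    """Layer-normalize (x + skip + bias) over one row of values.

    Illustrative sketch; eps and the argument names are assumptions.
    """
    # First add: input + skip.
    summed = [a + b for a, b in zip(x, skip)]
    # Second add: bias applied to the result of the first add.
    if bias is not None:
        summed = [s + c for s, c in zip(summed, bias)]
    mean = sum(summed) / len(summed)
    var = sum((s - mean) ** 2 for s in summed) / len(summed)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(s - mean) * inv_std * g + b for s, g, b in zip(summed, gamma, beta)]


out = skip_layer_norm(
    x=[1.0, 2.0], skip=[1.0, 0.0],
    gamma=[1.0, 1.0], beta=[0.0, 0.0],
    bias=[0.0, 2.0],
)
print(out)
```

Mathematically the result is the same whether the bias is fused into the first add or applied as a separate step; the commit reports that the separate step was faster in the oneDNN EP.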
cloudhan	b731cf397d	Make static analysis happy (#13655 ) Just suppress some warning by changing code.	2022-11-17 09:07:20 +08:00
Yi Zhang	116079749e	Fix Mac CI in Packaging pipeline (#13671 ) ### Description The default Python on Mac was upgraded to 3.11, which is not supported yet, so use Python 3.8 instead. ### Motivation and Context Fix MacOS CI in Zip-Nuget-Java-Nodejs Packaging Pipeline ### Test Run https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=249020&view=logs&j=ded01483-6627-58ac-64dc-d4a232827e5d	2022-11-17 08:12:30 +08:00
Jian Chen	8442d9df2c	Cjian/c4244 round 6 (#13663 ) ### Description Fix round 6	2022-11-16 16:26:11 -05:00
Rachel Guo	2efd2878ab	[rn] Add uint8 typedArray support for react native android (#13622 ) ### Description - Add missing uint8 typedArray case - Add createInputTensor_uint8 unit test in TensorHelperTest.java file ### Motivation and Context Detected an inferencesession.run() call error when running a React Native app with a uint8array input ORT tensor. Add the missing support to fix it.	2022-11-16 12:37:47 -08:00
shalvamist	359091f64a	XNNPACK - GEMM & MATMUL integration (#13126 ) ### Description - Added support for XNNPACK GEMM & MATMUL ops. ### Motivation and Context Documented ~5% performance improvement on mobileBert using XNNpack Gemm operation Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2022-11-16 09:47:35 -08:00
Dwayne Robinson	55fb790d88	DML EP allow squeeze-13 axes to be empty (#13635 ) ### Description [ONNX Squeeze-13](https://github.com/onnx/onnx/blob/main/docs/Operators.md#Squeeze) treats empty `axes` as if all axes had been given. This works for [earlier Squeeze versions](https://github.com/microsoft/onnxruntime/pull/12649), but Squeeze-13 checks for axes as a dynamic input tensor, which means it needs to be checked for existence before accessing. ### Motivation and Context - Why is this change required? What problem does it solve? Fixes a customer model. Makes ORT DML EP consistent with spec.	2022-11-15 11:03:21 -08:00
Jian Chen	3201a1f841	Cjian/c4244 round 5 (#13645 ) ### Description Round 5 of the fixes; there are 192 to go.	2022-11-15 13:48:21 -05:00
Abhishek Udupa	9c6c219949	Enable shape-sensitive analysis in ProfileExplorer for GPU kernels (#13647 ) ### Description Improve the profile explorer by enabling shape sensitivity for GPU kernels. ### Motivation and Context Due to problems with the ROCM profiler, it was previously challenging to retrieve the shapes corresponding to a GPU kernel event. [PR 13546](https://github.com/microsoft/onnxruntime/pull/13549) addresses these problems, so it's now possible to retrieve shapes from the ORT ROCM/CUDA profilers. This PR leverages [PR 13546](https://github.com/microsoft/onnxruntime/pull/13549) to enable shape-sensitive GPU kernel ranking. Co-authored-by: Abhishek Udupa <abhishek.udupa@microsoft.com>	2022-11-15 10:05:40 -08:00
Yulong Wang	4cd8b4269a	ignore dirty state of submodule XNNPACK (#13648 ) ### Description ignore dirty state of submodule XNNPACK ### Motivation and Context The ONNX Runtime WebAssembly build applies a patch to XNNPACK, so the submodule is considered to be in a 'dirty' state. We want to ignore this when checking the workspace using `git status`.	2022-11-15 00:38:46 -08:00
cloudhan	9e649d1ac4	Allow CUDA EP enable or disable TunableOp via session options and environment variable (#13601 ) This ports #13116 from ROCm EP to CUDA EP	2022-11-15 14:43:54 +08:00
JiCheng	2490cf84c9	[QLinearSoftmax]remove input_shape check in Ctor (#13489 ) ### Description In some cases, we can't get the node's shape to do pre-processing.	2022-11-15 12:02:17 +08:00
Changming Sun	ad31ac466b	Delete cpu-esrp-pipeline.yml (#13623 ) The content has been moved to "Zip-Nuget-Java-Nodejs Packaging Pipeline".	2022-11-14 19:00:40 -08:00
Jeff Bloomfield	b1169635cc	Ensure graph resolve occurs after free dimension is overridden (#13634 ) ### Description This ensures that the graph is re-resolved after a free dimension shape is overridden according to session options. ### Motivation and Context This ensures that shape inference occurs, which is necessary to apply the optimization and ensure the session is compatible with bound shapes. This bug seems to only have affected a small fraction of models.	2022-11-14 18:39:29 -08:00
Guenther Schmuelling	6f6560a7b9	fix to reduce peak memory usage in ort-web (#13323 ) fix to reduce peak memory usage in ort-web	2022-11-14 12:18:02 -08:00
Justin Chu	197191e58c	Update pylint config to include valid short names (#13631 ) ### Description Update pylint config to include valid short names. Also disabled `too-many-arguments` and `too-many-locals`. ### Motivation and Context Refine config to reduce lint noise	2022-11-14 10:00:25 -08:00
Jian Chen	f0ff2c5de9	Cjian/c4244 round 4 (#13632 ) ### Description Round 4; there are 436 more to go.	2022-11-14 12:20:26 -05:00
cloudhan	369a822409	Share TunableOp between CUDA and ROCM EP (#13560 ) Make TunableOp support CUDA kernel authoring and add the corresponding support for kernel explorer	2022-11-11 13:56:44 +08:00

7730 commits