onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-20 19:12:24 +00:00

Author	SHA1	Message	Date
Scott McKay	d1b2b35cd2	Various fixes to the CSharp setup (#15782 ) ### Description Various fixes to the CSharp setup - fix warnings - fix invalid tests - update test sdk nuget package - enables testing on linux - fixes issue with some unit tests not running in CI - run unit tests in linux pipeline using dotnet ### Motivation and Context Unit tests weren't breaking in CIs for both Windows and Linux builds and should have been.	2023-05-05 14:27:30 +10:00
BoarQing	272aab4afa	Fix issues on Windows for Vitis AI (#15810 ) ### Description Fix two errors that are only encountered on Windows ### Motivation and Context onnxruntime::VitisAIProviderFactoryCreator::Create caused a compile error. if (it == provider_options_map.end()) caused an error but executed as normal. Co-authored-by: Zhang <yueqingz@amd.com>	2023-05-04 14:42:19 -07:00
cloudhan	412d05a1d2	[ROCm] Update cmake (#15807 ) Followup of #15775	2023-05-04 11:20:56 -07:00
dependabot[bot]	58ee076750	Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799 ) Bumps [engine.io](https://github.com/socketio/engine.io) from 6.4.1 to 6.4.2. Release notes for 6.4.2 (2023-05-02): this release contains an important security fix. A malicious client could send a specially crafted HTTP request, triggering an uncaught exception and killing the Node.js process: `TypeError: Cannot read properties of undefined (reading 'handlesUpgrades') at Server.onWebSocket (build/server.js:515:67)`. Bug fixes: - include error handling for Express middlewares (engine.io #674) - prevent crash when provided with an invalid query param - typings: make clientsCount public (engine.io #675) - uws: prevent crash when using with middlewares. Dependencies: ws@~8.11.0 (no change). Full diff: https://github.com/socketio/engine.io/compare/6.4.1...6.4.2 Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-04 10:06:01 -07:00
Yulong Wang	4712009f8a	[js/web] add target ort.webgpu.min.js (#15780 ) ### Description Add target ort.webgpu.min.js. WebGPU is an experimental feature, so I don't want to put webgpu into the ort.min.js file. This change adds 2 ways for users to access ort-web with webgpu: - using a script tag: by URL `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js` (this URL is not ready yet) - using `import`: use `import { Tensor, InferenceSession } from 'onnxruntime-web/webgpu';`, that is, 'onnxruntime-web/webgpu' instead of 'onnxruntime-web'	2023-05-04 10:05:39 -07:00
RandySheriffH	8e610f25d8	Implement lite custom op API (#15778 ) Implement a set of new APIs for lightweight custom ops registration, to save efforts from schema-composing. A few highlights: - Support build-time type inference; - Support function-as-op for "stateless" ops; - Support structure-as-op for "stateful" ops; - Support varied input/output forms such as span, scalar, and tensors, either optional or non-optional. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-04 09:49:17 -07:00
Changming Sun	34fcdd83c8	Update softmax_grad_impl.cu: add constexpr (#15794 ) ### Description Add a "constexpr" keyword to fix a static analysis warning	2023-05-04 08:10:17 -07:00
Yulong Wang	df7424e11f	[JSEP] fix constructor for OrtDevice (#15805 ) ### Description Add the missing `OrtDevice` initialization in JSEP introduced by #15618	2023-05-04 07:51:17 -07:00
Yulong Wang	33d1372729	[wasm] revert emsdk to v3.1.19 (#15793 ) ### Description The multi-threaded build generated by the latest emsdk sometimes crashes for an unknown reason (error: memory access out of bounds). We don't want to break existing ort-web users, so revert emsdk to 3.1.19 (the same version ort v1.14.0 uses).	2023-05-04 01:15:01 -07:00
dependabot[bot]	422606ea76	Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (#15798 )	2023-05-04 05:43:40 +00:00
Baiju Meswani	e464588a0e	Avoid generating training documentation during packaging (#15795 )	2023-05-03 19:09:07 -07:00
Jian Chen	5eedd884c8	Adding support for conv fp16 fusion on Resnet50v1 (#15474 ) ### Description Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1 ### Motivation and Context Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1	2023-05-03 15:48:06 -07:00
Changming Sun	1fb2f2605b	Update VERSION_NUMBER (#15773 ) ### Description 1. Update VERSION_NUMBER for preparing the upcoming release. This PR's commit will not be included in the 1.15 release branch 2. Delete package/rpm/onnxruntime.spec since it was not used in past years. ### Motivation and Context Preparing the release. Fixed [AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)	2023-05-03 15:07:34 -07:00
Baiju Meswani	ba7b83ff3c	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776 )	2023-05-03 13:08:35 -07:00
Changming Sun	41c082fdde	Add a Github workflow for Prefast (#15763 )	2023-05-03 11:42:51 -07:00
Changming Sun	d53324d4a7	Update cmake version in a few places (#15775 ) ### Description They were missed in #15707 , because they are not in common places for Dockerfiles. Though this commit updated tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile, it won't automatically take effect. The image needs to be manually generated and pushed to a place, and before doing that our CMakeLists.txt also needs to be tweaked a little bit.	2023-05-02 22:56:28 -07:00
RandySheriffH	e3ec2b3a8e	Exclude cases from reduced build (#15779 ) Exclude cases from reduced build to unblock pipeline. Fixed [AB#15326](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15326) Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-02 21:05:54 -07:00
Yulong Wang	ef1f17f3dc	[wasm/JSEP] add threaded build to artifacts (#15777 ) ### Description This is the first part of creating WebAssembly artifacts for the ort-web webgpu EP (wasm build). Follow-up steps will consume the artifacts in the web build.	2023-05-02 17:53:44 -07:00
Baiju Meswani	2d519d21af	Python documentation for onnxruntime-training (#15765 )	2023-05-02 16:58:16 -07:00
Jian Chen	abdd4f518a	Update TRT Windows Cuda 11.6 to 11.8 (#15746 ) ### Description Update TRT Windows CUDA 11.6 to 11.8 ### Motivation and Context We are adopting a newer version of CUDA system-wide.	2023-05-02 12:23:13 -07:00
Changming Sun	328cabb194	Download protoc from Github Release instead of Nuget (#15731 ) ### Description Download protoc from Github Release instead of Nuget to avoid depending on nuget.exe on Linux ### Motivation and Context To avoid depending on nuget.exe on Linux. Many users' build environments do not have nuget or dotnet.	2023-05-02 12:18:59 -07:00
Nat Kershaw (MSFT)	e901cdbf54	Add wildcard paths to the API docs generation workflows (#15313 )	2023-05-02 10:43:45 -07:00
Chen Fu	4b8025e492	Parallelize fp16 pooling operators (#15766 ) ### Description Parallelize fp16 pooling operators	2023-05-02 08:48:56 -07:00
Chen Fu	bc58fd5413	fix compilation error in no absl build (#15769 ) ### Description Fix no-absl build error:	2023-05-02 08:20:49 -07:00
Sohaib Iftikhar	92309187b3	Allow compilation using clang when using cuda. (#15672 ) ### Description Currently compiling with clang + cuda leads to: ``` /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:33:6: error: call to function 'operator<<' that is neither visible in the template definition nor found by argument-dependent lookup ss << t; ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:39:3: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<gsl::span<const long, 18446744073709551615>>' requested here MakeStringImpl(ss, args...); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:46:3: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<const char , gsl::span<const long, 18446744073709551615>>' requested here MakeStringImpl(ss, args...); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:93:18: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<const char , gsl::span<const long, 18446744073709551615>>' requested here return detail::MakeStringImpl(detail::if_char_array_make_ptr_t<Args const&>(args)...); ^ /code/build/_deps/onnxruntime-src/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc:73:12: note: in instantiation of function template specialization 'onnxruntime::MakeString<char[39], gsl::span<const long, 18446744073709551615>>' requested here return ORT_MAKE_STATUS(ONNXRUNTIME, INVALID_ARGUMENT, "Shape not meet clean tile requirement!", dims); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/common.h:188:48: note: expanded from macro 'ORT_MAKE_STATUS' ::onnxruntime::MakeString(__VA_ARGS__)) ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/framework/tensor_shape.h:201:15: note: 'operator<<' should be declared prior to the call site or in namespace 'gsl' 
std::ostream& operator<<(std::ostream& out, const TensorShape& shape); ^ 1 error generated. ```	2023-05-02 01:12:39 -07:00
Changming Sun	034698cf6a	Revert "Implement lite custom op API (#15590 )" (#15768 ) This reverts commit `cdf4fc49fc` because it breaks the "debug_node_input_output" build in "Post Merge" pipeline	2023-05-02 01:10:10 -07:00
Prathik Rao	090312af71	add local state dict option (#15759 ) ### Description Adds an option to load local state dictionary for whisper model export. ### Motivation and Context This is useful to demonstrate workflow of using ORT Training to get model weights, downloading said weights onto a local gpu-enabled device, exporting the custom model using `convert_to_onnx.py`, and then nicely feeding the .onnx file into ORT InferenceSession.	2023-05-01 22:08:11 -07:00
Ye Wang	391f897983	Bring back SLN cuda kernel and use provider options to switch to standard implementation (#15660 )	2023-05-01 18:35:26 -07:00
Nat Kershaw (MSFT)	9219615471	Fix python AP docs generation (#15760 ) Docs are failing on the operator generation step. Remove this temporarily so that we can publish.	2023-05-01 18:31:59 -07:00
Changming Sun	5352f6d9b0	Make "--cuda_version" build arg optional (#15758 ) ### Description This change will allow us to build the CUDA EP without installing the CUDA SDK on Windows. ### Motivation and Context Nvidia's CUDA installer comes with a VS extension. In the past, we required installing the extension. That is a little inconvenient because: 1. Visual Studio must be installed before the CUDA SDK. CUDA's installer will not install the extension if your machine doesn't have Visual Studio. 2. We need to install the CUDA SDK on our build machines, instead of just downloading it and using it. After this change, we will not need to install the CUDA SDK on our build machines, so it will be easier to add support for a different CUDA version. Also, fix two PreFast warnings.	2023-05-01 18:00:47 -07:00
Ye Wang	7f293d065a	Remove some D2D memcpy in T5 BeamSearch (#15706 ) ### Description replace D2D memcpy with cuda kernels in creating decoder initial feeds before: ![image](https://user-images.githubusercontent.com/52801275/234746326-209110e3-1cd4-4e8c-a488-515dd34c06c7.png) after: ![image](https://user-images.githubusercontent.com/52801275/234736352-0bd027a0-e382-47d7-a1c5-ae9fbc9f1c9d.png) --------- Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>	2023-05-01 13:03:21 -07:00
Ashwini Khade	0ffae8073b	Creating Nuget and Android packages for Training (#15712 ) ### Description This PR creates Nuget and Android for Training. ### Motivation and Context These packages are intended to be released in ORT 1.15 to enable On-Device Training Scenarios. ## Packaging Story for Learning On The Edge Release ### Nuget Packages: 1. New Native package -> Microsoft.ML.OnnxRuntime.Training (Native package will contain binaries for: win-x86, win-x64, win-arm, win-arm64, linux-x64, linux-arm64, android) 2. C# bindings will be added to existing package -> Microsoft.ML.OnnxRuntime.Managed ### Android Package published to Maven: 1. New package for training (full build) -> onnxruntime-training-android-full-aar ### Python Package published to PyPi: 1. Python bindings and offline tooling will be added to the existing ort training package -> onnxruntime-training	2023-05-01 12:59:56 -07:00
Sumit Agarwal	4c4f688a93	[DML EP] Fix dml_external_project (#15656 ) ### Description When building ORT for the DML EP with the `dml_EXTERNAL_PROJECT` flag, two variables (`DML_SHARED_LIB`, `DML_PACKAGE_DIR`) are not set properly.	2023-05-01 12:02:56 -07:00
liqun Fu	62fc6ed5a8	[Feature Request] Support Resize opset 19 (#15633 )	2023-05-01 10:49:17 -07:00
cao lei	d58fa9805b	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618 ) ### Description ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice ### Motivation and Context Currently “Location” is represented as OrtMemoryInfo, which is OrtDevice + OrtMemType, while OrtDevice is represented as DeviceType + DeviceId + MemType. This hierarchy has an unnecessary layer, so the proposal is to give Location a clear definition by using OrtDevice as its abstraction. --------- Co-authored-by: Lei Cao <leca@microsoft.com>	2023-05-01 10:06:00 -07:00
Baiju Meswani	bb33285ec2	C# training api updates for on device training (#15720 )	2023-05-01 10:01:38 -07:00
shalvamist	c10a6a9d17	Tensor <--> image - Adding per channel compute for Norm mean & Bias (#14705 ) ### Description Enabled the use of per channel Bias and Mean normalization when converting an image <--> tensor. Added a few bug fixes and updates to the relevant E2E tests. --------- Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2023-05-01 09:37:50 -07:00
Scott McKay	0770cf3699	Remove C# SessionOptions.RegisterCustomOpsUsingFunction. (#15754 ) Symbol visibility from DllImport is inconsistent across platforms resulting in the symbol not necessarily being visible to ORT native code that tries to look it up by name. Best solution is to use DllImport to load the library and to call the registration function directly. That requires the native SessionOptions handle and OrtApiBase struct. We could either make those public, or provide a helper where the user passes in a delegate from their DllImport. Can add when needed.	2023-05-01 09:14:21 -07:00
RandySheriffH	cdf4fc49fc	Implement lite custom op API (#15590 ) Implement a set of new APIs for lightweight custom ops registration, to save efforts on schema-composing. A few highlights: 1. Support build-time type inference; 2. Support function-as-op for "stateless" ops; 3. Support structure-as-op for "stateful" ops; 4. Support varied input/output forms such as span, scalar, and tensors, either optional or non-optional. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-01 08:45:26 -07:00
Chen Fu	0e9472d391	NHWC graph optimizer (#15724 ) ### Description Augment nhwc graph optimizer to accommodate fp16 operators. ### Motivation and Context With new fp16 conv operator added. This operator prefers NHWC data layout. We need to augment existing graph optimizers to better utilize the new operator.	2023-05-01 08:44:07 -07:00
Chunye Wang@AMD	d35850c142	[VitisAI]Update VitisAI EP to be compatible with VitisAI 3.5 (#15673 ) ### Description Originally the VitisAI EP only worked with an old version of the VitisAI release. ### Motivation and Context Update the VitisAI EP so that it works with the current VitisAI 3.5 and future versions of VitisAI. We try our best to make it forward compatible. --------- Co-authored-by: Wang Chunye <chunywan@xilinx.com> Co-authored-by: mingyue <mingyue@amd.com> Co-authored-by: mingyueliuh <131847423+mingyueliuh@users.noreply.github.com> Co-authored-by: liumingyue <mingyue@xilinx.com> Co-authored-by: moore-ch <129165652+moore-ch@users.noreply.github.com> Co-authored-by: shoucair <shoucai.ren@amd.com> Co-authored-by: zz002 <zhenze.wang@amd.com> Co-authored-by: BoarQing <yuz75@Pitt.edu> Co-authored-by: Yueqing Zhang <yueqingz@amd.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-01 08:28:26 -07:00
Jeff Bloomfield	3df3a85114	Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered (#15725 ) This addresses a performance regression in some INT8 models with the DirectML EP by defaulting kOrtSessionOptionsDisableQuantQDQ to 1 when the EP is registered. This regression occurred due to the introduction of the QDQ propagation transformer, which is controlled by this session option. That transformer maximizes the number of nodes which are executed as quantized by logically propagating quantize operators upstream and dequantize operators downstream. However, it does this simply by inserting QDQ pairs, with an expectation that something will recognize sequences of DQ->Op->Q. This logic and related L2 transformers are not currently enabled for the DirectML EP. This change also removes a noisy warning when the session option for memory pattern is overridden as the DirectML EP is registered.	2023-05-01 08:26:03 -07:00
Tianlei Wu	10dff4f665	only add type info from symbolic shape inference for fp16 conversion (#15617 ) ### Description Workaround for https://github.com/microsoft/onnxruntime/issues/15521.	2023-04-30 23:22:11 -07:00
Chi Lo	6e652d0554	Support explicit TRT profiles from provider options (#15546 ) Previously, the TRT EP set TRT optimization profiles for dynamic shape inputs based on input tensor values, and users couldn't explicitly specify the profiles. This PR lets users specify min/max/opt profiles through three newly added provider options: `trt_profile_min_shapes`, `trt_profile_max_shapes` and `trt_profile_opt_shapes`, with the format "input1:dim1xdim2...,input2:dim3xdim4...". (Note: It's similar to --minShapes, --maxShapes and --optShapes of trtexec command-line [flags](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#trtexec-flags)) For example, if you are using onnxruntime_perf_test, you can try this: `./onnxruntime_perf_test -e tensorrt -r 1 -i "trt_profile_min_shapes\|imgs:1x3x384x288 trt_profile_max_shapes\|imgs:32x3x384x288 trt_profile_opt_shapes\|imgs:16x3x384x288" your_model_path` If the engine cache is enabled, you still need to provide these three explicit provider options in order to use this feature. ORT TRT will compare the min/max/opt profile shapes with the ones saved in the .profile file to decide whether to rebuild the engine. Constraints on using these provider options: (1) min/max/opt profile shapes must be specified for all dynamic shape inputs. This feature was also requested by other users: https://github.com/microsoft/onnxruntime/issues/13851	2023-04-30 22:30:26 -07:00
Scott McKay	31e7d3d7d4	Disable TestRegisterCustomOpsWithFunction on Linux (#15747 ) ### Description Disable new test that is failing on linux. Not required for this release. Will fix in the next week. Marshal.Prelink can be used on Windows to make the symbol available but Linux appears to work differently. Also need to update the pre-checkin tests so this is tested early as it's only failing in the E2E tests run in the packaging pipeline. ### Motivation and Context Fix packaging pipeline error.	2023-04-30 14:39:02 +10:00
Changming Sun	176161348e	Revert "make nuget workflow easy to debug. (#15693 )" (#15744 ) This reverts commit `53ff50d19a` because it makes the nuget pipeline fail.	2023-04-29 19:05:01 -07:00
kunal-vaishnavi	7ae01cec15	Update wheel path to Whisper custom export script (#15739 ) ### Description This PR updates the documentation for using the Whisper custom export scripts via the wheel. ### Motivation and Context The path should say `onnxruntime.transformers.models.whisper.convert_to_onnx` instead of `onnxruntime.transformers.models.convert_to_onnx`.	2023-04-29 17:32:34 -07:00
sfatimar	4fbc08e3c2	VPUX config fix and dynamic_shape bug fixed. (#15737 ) Dynamic shapes were not working with a serialized model, so we are switching to the compile model method ### Motivation and Context Dynamic shapes were not working with a serialized model Signed-off-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: MaajidKhan <n.maajid.khan@intel.com>	2023-04-29 15:48:34 -07:00
Adrian Lizarraga	d32c540b2d	[QNN EP] Support LRN operator (#15741 ) ### Description Adds support for the LRN operator to QNN EP. ### Motivation and Context Enables basic models like googlenet and alexnet to run entirely on QNN EP.	2023-04-29 13:23:42 -07:00
Changming Sun	65020d433e	Prefast fixes for CUDA EP (#15726 ) ### Description 1. Adjust cmake flags. Do not modify CMAKE_CXX_FLAGS globally. Only apply the flags to ORT code. 2. Fix some SDL warnings.	2023-04-29 12:43:12 -07:00
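
Note on commit 6e652d0554 (explicit TRT profiles): the three provider options take strings in the format "input1:dim1xdim2...,input2:dim3xdim4...". The following is a minimal Python sketch of building and checking such strings. The helper names (`format_profile`, `parse_profile`, `build_trt_options`) are hypothetical and not part of onnxruntime, and the min <= opt <= max check is an assumption based on how TensorRT optimization profiles are generally constrained; the commit message itself only requires that all three shapes be given for every dynamic shape input.

```python
# Sketch only: these helpers are hypothetical, not an onnxruntime API.

def format_profile(shapes):
    """Encode {"imgs": (1, 3, 384, 288)} as "imgs:1x3x384x288"."""
    return ",".join(
        f"{name}:{'x'.join(str(d) for d in dims)}"
        for name, dims in shapes.items()
    )


def parse_profile(text):
    """Decode "a:1x2,b:3x4" into {"a": (1, 2), "b": (3, 4)}."""
    shapes = {}
    for entry in text.split(","):
        # Split on the last ":" so input names may themselves contain ":".
        name, sep, dims = entry.rpartition(":")
        if not sep or not name:
            raise ValueError(f"malformed profile entry: {entry!r}")
        shapes[name] = tuple(int(d) for d in dims.split("x"))
    return shapes


def build_trt_options(min_shapes, opt_shapes, max_shapes):
    """Return the three provider options named in the commit message."""
    # Every dynamic input needs all three shapes (stated in the commit).
    if not (min_shapes.keys() == opt_shapes.keys() == max_shapes.keys()):
        raise ValueError("min/opt/max must cover the same inputs")
    for name in min_shapes:
        lo, opt, hi = min_shapes[name], opt_shapes[name], max_shapes[name]
        if not (len(lo) == len(opt) == len(hi)):
            raise ValueError(f"rank mismatch for input {name!r}")
        # Assumption: each dimension satisfies min <= opt <= max.
        if any(not (a <= b <= c) for a, b, c in zip(lo, opt, hi)):
            raise ValueError(f"need min <= opt <= max for input {name!r}")
    return {
        "trt_profile_min_shapes": format_profile(min_shapes),
        "trt_profile_opt_shapes": format_profile(opt_shapes),
        "trt_profile_max_shapes": format_profile(max_shapes),
    }


# Same shapes as the onnxruntime_perf_test example in the commit message.
options = build_trt_options(
    {"imgs": (1, 3, 384, 288)},
    {"imgs": (16, 3, 384, 288)},
    {"imgs": (32, 3, 384, 288)},
)
```

The resulting dictionary holds one string per option; how it is handed to the TensorRT EP (for example as provider options when creating a session) is left out here because the commit message only shows the onnxruntime_perf_test form.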


8738 commits