onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00

Author	SHA1	Message	Date
Jian Chen	34cb293c6b	Remove unused ADO YML pipeline template (#15857 ) ### Description Remove unused ADO YML pipeline template ### Motivation and Context Clean up and reduce our codebase.	2023-05-09 09:15:04 -07:00
JiCheng	686d42e6c8	layer_norm_fix (#15844 ) ### Description Fix bugs in Layernorm Fusion. - More checks on ReduceMean axes - Separate out layernorm transform_test ### Motivation and Context Our layernorm fusion pattern currently works only for axis=-1. - For the training scenario: the pattern produced incorrect results because it didn't handle "axes" and simply assumed the default value. - For inference: ~~We lost some opportunities to fuse layernorm.~~ ReduceMean has default axes 0, which means reducing over all dimensions.	2023-05-09 22:09:00 +08:00
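To illustrate why the fusion above must check the ReduceMean axes, here is a minimal sketch in plain Python (not the ONNX Runtime implementation; the helper names are hypothetical). It compares a mean taken over the last axis, which is what an axis=-1 layer normalization expects, with a mean taken over every element, which is what ReduceMean computes when `axes` is omitted.

```python
from statistics import fmean

def reduce_mean_last_axis(x):
    """Mean over the last axis of a 2-D list: one value per row."""
    return [fmean(row) for row in x]

def reduce_mean_all(x):
    """Mean over every element, as ReduceMean does when axes is omitted."""
    return fmean(v for row in x for v in row)

x = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]

per_row = reduce_mean_last_axis(x)
overall = reduce_mean_all(x)

# Centering step of layer normalization under each interpretation.
centered_last = [[v - m for v in row] for row, m in zip(x, per_row)]
centered_all = [[v - overall for v in row] for row in x]

# The two subgraphs are only interchangeable if they agree.
fusable = centered_last == centered_all
```

Because the two centered tensors differ, a pattern that ignores `axes` and fuses both cases into an axis=-1 LayerNormalization would change the model's output.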
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR enables the WebNN EP in ONNX Runtime Web. It translates ONNX nodes through the [WebNN API](https://webmachinelearning.github.io/webnn/); the EP is implemented in C++ and uses the Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). It temporarily uses the preferred layout NHWC for WebNN graph partitions because of a restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in the WebNN spec about whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access the WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on the XNNPack backend, and has been available on Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN is currently only available on Chrome Canary/Dev with the XNNPack backend for Linux and Windows. This is an open question for reviewers: please help identify which GitHub CI should involve the WebNN EP and guide me to enable it. Thanks!	2023-05-08 21:25:10 -07:00
Chen Fu	685e5b00f6	NhwcFusedConv: Add before Activation (#15837 ) ### Description Fp16 FusedConv and NhwcFusedConv. Fused Add operator should be performed BEFORE the activation operator. ### Motivation and Context Previous understanding of fused conv is incorrect.	2023-05-08 21:02:35 -07:00
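The ordering fixed in the commit above matters because adding before or after the activation gives different results. A minimal sketch, assuming a ReLU activation and hypothetical helper names (this is not the MLAS kernel, only an element-wise model of the two orderings):

```python
def relu(v):
    """ReLU activation on a single value."""
    return max(v, 0.0)

def add_then_activate(conv_out, z):
    """Ordering described in the commit: Add first, activation second."""
    return [relu(c + s) for c, s in zip(conv_out, z)]

def activate_then_add(conv_out, z):
    """The previous (incorrect) ordering: activation first, Add second."""
    return [relu(c) + s for c, s in zip(conv_out, z)]

conv_out = [-2.0, 1.0]   # hypothetical convolution results
z = [1.0, -3.0]          # hypothetical tensor fed to the fused Add

correct = add_then_activate(conv_out, z)
previous = activate_then_add(conv_out, z)
```

With these inputs the Add-first ordering clamps both sums to zero, while the activation-first ordering lets a negative value through.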
pengwa	003c7d3e4d	Add CPU allocation test for multiple GPU distributed run (#15829 ) ### Add CPU allocation test for non-CPU devices distributed run When the CUDA EP is enabled in distributed training, CPU memory is still used for some node outputs. We already had distributed run test coverage, but it did not cover the case where some nodes use CPU devices for storing tensor output. As a result, we hit regressions twice in the past months: - https://github.com/microsoft/onnxruntime/pull/14050 - https://github.com/microsoft/onnxruntime/pull/15823 This test is added to avoid future regressions. The test graph looks like this: ![image](https://user-images.githubusercontent.com/10530022/236594940-70c68a55-18bf-4e09-bbf5-8a64895d3045.png)	2023-05-09 10:27:19 +08:00
Rachel Guo	817d70a63b	[js/rn] Fix extensions header include issue (#15800 ) ### Description Identified the cause of a `redefinition compilation error` that occurred in a React Native Expo app with ort-extensions enabled when running the iOS side. Fixed the include path, so the temporary forward declaration in the OnnxruntimeModule.mm file can be removed. ### Motivation and Context Fix implementation detail. --------- Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2023-05-08 17:12:10 -07:00
Yulong Wang	0457fd0b40	upgrade emsdk to 3.1.37 (#15817 ) ### Description upgrade emsdk to 3.1.37 WIP branch to debug the mystery memory issue in web assembly multi-thread build.	2023-05-08 16:49:47 -07:00
Tianlei Wu	191ee1d3c0	Fix symbolic shape infer empty value_info (#15842 ) ### Description When a node output is optional, symbolic shape inference might add an empty value_info item. Add checks to avoid this. ### Motivation and Context - A Stable Diffusion optimized model reported invalid data type 0 during inference.	2023-05-08 16:18:35 -07:00
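The kind of check described in the commit above can be sketched as follows. In ONNX, an omitted optional output is written as an empty string in the node's output list, so a pass that records one value_info per output has to skip those names. The function name and arguments below are hypothetical and are not taken from the symbolic shape inference script.

```python
def outputs_needing_value_info(node_outputs, known_names):
    """Return output names that should get a new value_info entry.

    Empty names stand for omitted optional outputs and are skipped,
    as are names that already have an entry.
    """
    return [name for name in node_outputs if name and name not in known_names]

# Hypothetical node with an omitted optional second output.
pending = outputs_needing_value_info(["y", "", "mask"], {"mask"})
```

Without the `if name` guard, the empty string would produce a value_info entry with no name and no type, which matches the "invalid data type 0" symptom reported in the commit.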
Yi Zhang	045c623415	Make Nuget workflow easy to debug (#15808 ) ### Description Fix the bug in #15693	2023-05-08 20:53:08 +08:00
Nat Kershaw (MSFT)	5e9b42326c	Fix packaging pipeline for nightly builds (#15839 )	2023-05-07 20:42:38 -07:00
PeixuanZuo	2735e0d031	[ROCm] simplify ck data type Adaptor (#15734 ) DataTypeAdaptor is defined repeatedly in every file that integrates CK. This PR refactors the code to put DataTypeAdaptor in a header file.	2023-05-06 17:39:52 +08:00
Ted Themistokleous	42d62b8f2b	Fixes to get stable diffusion benchmark running (#15755 ) ### Description Added changes to the MIGraphX EP to support stable diffusion 1. Added parameterized input dimensions to not trigger a precompile to set input parameters in the EP 2. Removed input checking for the Resize operator in the EP, as MIGraphX already performs these checks 3. Added support to the benchmark script to use the MIGraphX execution provider 4. Added support for an odd-valued batch size (3) that was seen on other benchmarks we were comparing against. ### Motivation and Context These changes are required to get stable diffusion models to run on MIGraphX through the EP. Without these changes we see the following incorrect behavior. 1. Resize operators are pushed onto the CPU EP instead of MIGraphX, causing a significant slowdown during runs 2. Precompile operations incorrectly parse the input_ids parameter for our text model, with a 1, which breaks during MIGraphX Compile of onnx. This in turn throws an error and stops any setup before inference. 3. Selecting the correct EP in the benchmark script, which was previously missing the MIGraphX option 4. Suppressed an error we keep seeing with pthread_set_affinity - this is a quality of life change when using the MIGraphX EP This was tested with the benchmark.py script using stable diffusion v2 located in onnxruntime/onnxruntime/python/tools/transformers/models/stable_diffusion/ --------- Co-authored-by: Ted Themistokleous <tthemist@amd.com>	2023-05-06 17:35:21 +08:00
PeixuanZuo	41457885e0	[ROCm] add rocm5.5 to python package pipeline (#15820 ) add rocm5.5 to python packaging pipeline. https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=306082&view=results TODO: Remove version 5.2.3, 5.3.2 and 5.4 in the next PR.	2023-05-06 10:21:15 +08:00
Nat Kershaw (MSFT)	ed31e4b737	Add nuget release version suffix to support publishing rcs to nuget.org (#15791 )	2023-05-05 18:18:24 -07:00
pengwa	dfac096501	Fix segfault for multiple GPU run (regression) (#15823 ) ### Fix segfault for multiple GPU run https://github.com/microsoft/onnxruntime/pull/15618 introduced `GetOrtDeviceByMemType`. The intention was to handle the CPU device differently in the if branch, but it mistakenly passes the default non-CPU device id. ``` OrtDevice CUDAExecutionProvider::GetOrtDeviceByMemType(OrtMemType mem_type) const { if (mem_type == OrtMemTypeCPUInput || mem_type == OrtMemTypeCPUOutput) { return OrtDevice(OrtDevice::CPU, OrtDevice::MemType::CUDA_PINNED, default_device_.Id()); } return default_device_; } ``` We observed a segmentation fault when running multiple GPU training: ` CUDA_LAUNCH_BLOCKING=1 python -m torch.distributed.launch --nproc_per_node=2 examples/onnxruntime/training/language-modeling/run_mlm.py --model_name_or_path distilbert-base-uncased --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 --num_train_epochs 10 --per_device_train_batch_size 8 --per_device_eval_batch_size 8 --do_train --do_eval --overwrite_output_dir --output_dir ./outputs222/ --seed 1137 --fp16 --report_to none --optim adamw_ort_fused --max_steps 400 --logging_steps 1 ` GPU0 works fine, but GPU1 throws a segmentation fault. Looking further, a Shape node tries to allocate its output tensor and fetches the corresponding allocator with ORTDevice(Device:[DeviceType:0 MemoryType:1 DeviceId:1]); because the CPU device does not have device id = 1, no allocator is returned. When we then call `AsStreamBasedAllocator` on that allocator, the segfault happens because no null check is done there.	2023-05-06 08:48:53 +08:00
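The failure mode described in the commit above can be modelled with a small lookup table. This is a toy sketch in Python, not ONNX Runtime code: the allocator registry, the tuple layout, and every function name are hypothetical, and pinning the CPU device id to 0 is shown as one plausible repair rather than as the exact change in the PR.

```python
# Hypothetical registry keyed by (device type, memory type, device id).
ALLOCATORS = {
    ("CPU", "CUDA_PINNED", 0): "pinned_cpu_allocator",
    ("GPU", "DEFAULT", 0): "gpu0_allocator",
    ("GPU", "DEFAULT", 1): "gpu1_allocator",
}

def cpu_output_device_buggy(default_gpu_id):
    """Mirrors the regression: the GPU's device id leaks into the CPU device."""
    return ("CPU", "CUDA_PINNED", default_gpu_id)

def cpu_output_device_fixed(default_gpu_id):
    """Assumed repair: the CPU device always uses id 0."""
    return ("CPU", "CUDA_PINNED", 0)

def find_allocator(device):
    """Return the registered allocator, or None when nothing matches."""
    return ALLOCATORS.get(device)

def as_stream_based(allocator):
    """Null-checked stand-in for the call that crashed on a missing allocator."""
    if allocator is None:
        raise LookupError("no allocator registered for this device")
    return f"stream({allocator})"

on_gpu0 = find_allocator(cpu_output_device_buggy(0))   # found by coincidence
on_gpu1 = find_allocator(cpu_output_device_buggy(1))   # missing on the second GPU
repaired = find_allocator(cpu_output_device_fixed(1))
```

The sketch shows why rank 0 kept working: its GPU id happens to equal the only valid CPU id, so the lookup succeeds by coincidence, while rank 1 asks for a CPU device that was never registered.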
Sheil Kumar	2b7f26af7c	Add GridSample implementation to DirectML (#15788 ) Add GridSample implementation to DirectML EP. Temporary add HLSL shader in the DirectML EP to handle GridSample until officially added to DirectML.	2023-05-05 15:59:33 -07:00
Adrian Lizarraga	45f5c27632	[QNN EP] Update default QNN SDK to version 2.10.0 (#15818 ) ### Description - Updates the default QNN SDK for CI pipelines to version 2.10.0. - Disables convolution op tests that run on the QNN CPU backend due to a potential bug with QNN SDK 2.10.0. ### Motivation and Context Allows us to test the latest QNN SDK in default CI pipeline runs.	2023-05-05 13:01:21 -07:00
Chen Fu	2f6df99b89	Register fp16 FusedConv to CPU EP (#15824 ) ### Description Fp16 FusedConv Operator registered to CPU EP as contribOP ### Motivation and Context Should have done it earlier in https://github.com/microsoft/onnxruntime/pull/15498 but I forgot.	2023-05-05 12:48:53 -07:00
Guenther Schmuelling	5a43828b3d	update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688 ) needed to get tokenizers/decode for whisper --------- Co-authored-by: Shalva Mist <shalvamist@microsoft.com>	2023-05-05 09:48:07 -07:00
Yulong Wang	41a19ae1b5	[js/webgpu] fix Transpose with non-float tensor (#15819 ) ### Description fix Transpose with non-float tensor. only register float type for Transpose.	2023-05-05 08:29:19 -07:00
Changming Sun	e139ae238b	Add a codesign step to Windows AI nuget pipeline (#15816 ) ### Description Add a codesign step to Windows AI nuget pipeline to sign the nuget package	2023-05-04 22:07:44 -07:00
Scott McKay	d1b2b35cd2	Various fixes to the CSharp setup (#15782 ) ### Description Various fixes to the CSharp setup - fix warnings - fix invalid tests - update test sdk nuget package - enables testing on linux - fixes issue with some unit tests not running in CI - run unit tests in linux pipeline using dotnet ### Motivation and Context Unit tests weren't breaking in CIs for both Windows and Linux builds and should have been.	2023-05-05 14:27:30 +10:00
BoarQing	272aab4afa	Fix issues on Windows for Vitis AI (#15810 ) ### Description Fix two errors that are only encountered on Windows ### Motivation and Context onnxruntime::VitisAIProviderFactoryCreator::Create caused a compile error. if (it == provider_options_map.end()) caused an error but executed as normal. Co-authored-by: Zhang <yueqingz@amd.com>	2023-05-04 14:42:19 -07:00
cloudhan	412d05a1d2	[ROCm] Update cmake (#15807 ) Followup of #15775	2023-05-04 11:20:56 -07:00
dependabot[bot]	58ee076750	Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799 ) Bumps [engine.io](https://github.com/socketio/engine.io) from 6.4.1 to 6.4.2. Release notes (sourced from [engine.io's releases](https://github.com/socketio/engine.io/releases) and [changelog](https://github.com/socketio/engine.io/blob/main/CHANGELOG.md)): [6.4.2](https://github.com/socketio/engine.io/compare/6.4.1...6.4.2) (2023-05-02) ⚠️ This release contains an important security fix ⚠️ A malicious client could send a specially crafted HTTP request, triggering an uncaught exception and killing the Node.js process: `TypeError: Cannot read properties of undefined (reading 'handlesUpgrades') at Server.onWebSocket (build/server.js:515:67)` Please upgrade as soon as possible. Bug Fixes: - include error handling for Express middlewares ([#674](https://redirect.github.com/socketio/engine.io/issues/674)) (9395782) - prevent crash when provided with an invalid query param (fc480b4) - typings: make clientsCount public ([#675](https://redirect.github.com/socketio/engine.io/issues/675)) (bd6d471) - uws: prevent crash when using with middlewares (8b22162) Other commits: - 95e2153 chore(release): 6.4.2 - 0141951 refactor(types): ensure compatibility with Express middlewares - 911d0e3 refactor: return HTTP 400 upon invalid request overlap Dependencies: [ws@~8.11.0](https://github.com/websockets/ws/releases/tag/8.11.0) (no change) Credits: Huge thanks to [@tyilo](https://github.com/tyilo) and [@cieldeville](https://github.com/cieldeville) for helping! Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-04 10:06:01 -07:00
Yulong Wang	4712009f8a	[js/web] add target ort.webgpu.min.js (#15780 ) ### Description add target ort.webgpu.min.js WebGPU is an experimental feature, so I don't want to put webgpu into the ort.min.js file. This change adds 2 ways for users to access ort-web with webgpu: - using a script tag: by URL `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js` ( this URL is not ready yet ) - using `import()`: use `import { Tensor, InferenceSession } from 'onnxruntime-web/webgpu';`, that is, 'onnxruntime-web/webgpu' instead of 'onnxruntime-web'	2023-05-04 10:05:39 -07:00
RandySheriffH	8e610f25d8	Implement lite custom op API (#15778 ) Implement a set of new APIs for lightweight custom ops registration, to save efforts from schema-composing. A few highlights: - Support build-time type inference; - Support function-as-op for "stateless" ops; - Support structure-as-op for "stateful" ops; - Support varied input/output forms such as span, scalar, and tensors, either optional or non-optional. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-04 09:49:17 -07:00
Changming Sun	34fcdd83c8	Update softmax_grad_impl.cu: add constexpr (#15794 ) ### Description Add a "constexpr" keyword to fix a static analysis warning	2023-05-04 08:10:17 -07:00
Yulong Wang	df7424e11f	[JSEP] fix constructor for OrtDevice (#15805 ) ### Description Add the missing `OrtDevice` initialization in JSEP introduced by #15618	2023-05-04 07:51:17 -07:00
Yulong Wang	33d1372729	[wasm] revert emsdk to v3.1.19 (#15793 ) ### Description The multi-thread build generated by the latest emsdk sometimes crashes for an unknown reason ( error: memory access out of bounds ). We don't want to break existing ort-web users, so revert emsdk back to 3.1.19 (the same version that ort v1.14.0 uses)	2023-05-04 01:15:01 -07:00
dependabot[bot]	422606ea76	Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (#15798 )	2023-05-04 05:43:40 +00:00
Baiju Meswani	e464588a0e	Avoid generating training documentation during packaging (#15795 )	2023-05-03 19:09:07 -07:00
Jian Chen	5eedd884c8	Adding support for conv fp16 fusion on Resnet50v1 (#15474 ) ### Description Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1 ### Motivation and Context Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1	2023-05-03 15:48:06 -07:00
Changming Sun	1fb2f2605b	Update VERSION_NUMBER (#15773 ) ### Description 1. Update VERSION_NUMBER for preparing the upcoming release. This PR's commit will not be included in the 1.15 release branch 2. Delete package/rpm/onnxruntime.spec since it was not used in past years. ### Motivation and Context Preparing the release. Fixed [AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)	2023-05-03 15:07:34 -07:00
Baiju Meswani	ba7b83ff3c	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776 )	2023-05-03 13:08:35 -07:00
Changming Sun	41c082fdde	Add a Github workflow for Prefast (#15763 )	2023-05-03 11:42:51 -07:00
Changming Sun	d53324d4a7	Update cmake version in a few places (#15775 ) ### Description They were missed in #15707 , because they are not in common places for Dockerfiles. Though this commit updated tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile, it won't automatically take effect. The image needs to be manually generated and pushed to a place, and before doing that our CMakeLists.txt also needs to be tweaked a little bit.	2023-05-02 22:56:28 -07:00
RandySheriffH	e3ec2b3a8e	Exclude cases from reduced build (#15779 ) Exclude cases from reduced build to unblock pipeline. Fixed [AB#15326](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15326) Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-02 21:05:54 -07:00
Yulong Wang	ef1f17f3dc	[wasm/JSEP] add threaded build to artifacts (#15777 ) ### Description This is the first part of creating WebAssembly artifacts for the ort-web webgpu EP (wasm build). Follow-up steps will consume the artifacts in the web build	2023-05-02 17:53:44 -07:00
Baiju Meswani	2d519d21af	Python documentation for onnxruntime-training (#15765 )	2023-05-02 16:58:16 -07:00
Jian Chen	abdd4f518a	Update TRT Windows Cuda 11.6 to 11.8 (#15746 ) ### Description Update TRT Windows cuda 11.6 to 11.8 ### Motivation and Context We are adopting a newer version of cuda systemwide.	2023-05-02 12:23:13 -07:00
Changming Sun	328cabb194	Download protoc from Github Release instead of Nuget (#15731 ) ### Description Download protoc from Github Release instead of Nuget to avoid having dependency on nuget.exe on Linux ### Motivation and Context To avoid having dependency on nuget.exe on Linux. Many users' build environment do not have nuget or dotnet.	2023-05-02 12:18:59 -07:00
Nat Kershaw (MSFT)	e901cdbf54	Add wildcard paths to the API docs generation workflows (#15313 )	2023-05-02 10:43:45 -07:00
Chen Fu	4b8025e492	Parallelize fp16 pooling operators (#15766 ) ### Description Parallelize fp16 pooling operators	2023-05-02 08:48:56 -07:00
Chen Fu	bc58fd5413	fix compilation error in no absl build (#15769 ) ### Description Fix no-absl build error:	2023-05-02 08:20:49 -07:00
Sohaib Iftikhar	92309187b3	Allow compilation using clang when using cuda. (#15672 ) ### Description Currently compiling with clang + cuda leads to: ``` /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:33:6: error: call to function 'operator<<' that is neither visible in the template definition nor found by argument-dependent lookup ss << t; ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:39:3: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<gsl::span<const long, 18446744073709551615>>' requested here MakeStringImpl(ss, args...); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:46:3: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<const char , gsl::span<const long, 18446744073709551615>>' requested here MakeStringImpl(ss, args...); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:93:18: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<const char , gsl::span<const long, 18446744073709551615>>' requested here return detail::MakeStringImpl(detail::if_char_array_make_ptr_t<Args const&>(args)...); ^ /code/build/_deps/onnxruntime-src/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc:73:12: note: in instantiation of function template specialization 'onnxruntime::MakeString<char[39], gsl::span<const long, 18446744073709551615>>' requested here return ORT_MAKE_STATUS(ONNXRUNTIME, INVALID_ARGUMENT, "Shape not meet clean tile requirement!", dims); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/common.h:188:48: note: expanded from macro 'ORT_MAKE_STATUS' ::onnxruntime::MakeString(__VA_ARGS__)) ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/framework/tensor_shape.h:201:15: note: 'operator<<' should be declared prior to the call site or in namespace 'gsl' 
std::ostream& operator<<(std::ostream& out, const TensorShape& shape); ^ 1 error generated. ```	2023-05-02 01:12:39 -07:00
Changming Sun	034698cf6a	Revert "Implement lite custom op API (#15590 )" (#15768 ) This reverts commit `cdf4fc49fc` because it breaks the "debug_node_input_output" build in "Post Merge" pipeline	2023-05-02 01:10:10 -07:00
Prathik Rao	090312af71	add local state dict option (#15759 ) ### Description Adds an option to load local state dictionary for whisper model export. ### Motivation and Context This is useful to demonstrate workflow of using ORT Training to get model weights, downloading said weights onto a local gpu-enabled device, exporting the custom model using `convert_to_onnx.py`, and then nicely feeding the .onnx file into ORT InferenceSession.	2023-05-01 22:08:11 -07:00
Ye Wang	391f897983	Bring back SLN cuda kernel and use provider options to switch to standard implementation (#15660 )	2023-05-01 18:35:26 -07:00
Nat Kershaw (MSFT)	9219615471	Fix python AP docs generation (#15760 ) Docs are failing on the operator generation step. Remove this temporarily so that we can publish.	2023-05-01 18:31:59 -07:00


8759 commits