onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-08 17:17:15 +00:00

Author	SHA1	Message	Date
Baiju Meswani	8a3de16d14	Temporary fix to make the training pipeline green (#16353 )	2023-06-14 13:11:35 -07:00
Baiju Meswani	ed2482667b	Fix training pipeline (#16342 )	2023-06-13 15:06:38 -07:00
zesongw	c5176ed122	[WebNN EP] Add several new unary Ops (Ceil, Exp, Identity, Reciprocal, Tan) (#16302 ) ### Description - Add new Ops: Ceil, Exp, Identity, Reciprocal, Tan. - Set MinSupportedOpSet for unary Ops. ### Motivation and Context Support more Ops for other models. The legacy optimization attribute "consumed_inputs" is not supported in WebNN EP.	2023-06-13 08:14:55 -07:00
Edward Chen	4f23577cb5	[React Native] Publish E2E test logs on build failure too. (#16327 ) ### Description Publish E2E test logs on build failure too. ### Motivation and Context Get more information about intermittent test failures.	2023-06-12 17:56:46 -07:00
Yulong Wang	e3e4926d00	[js/common] allow import onnxruntime-common as ESM and CJS (#15772 ) ### Description allow import onnxruntime-common as ESM and CJS.	2023-06-12 12:05:11 -07:00
Sheil Kumar	0df9e42960	User/sheilk/register div nonzero (#16309 ) [DML EP] The NonZero supported datatypes list has an incorrect number of template datatypes: 2 should be 1	2023-06-12 10:11:59 -07:00
satyajandhyala	889f80082f	[js/web] Added Reduce operators support (#16122 ) ### Description Added support for ReduceL1, ReduceL2, ReduceMean, ReduceMin, ReduceMax, ReduceSum, ReduceLogSum, ReduceLogSumExp, ReduceProd and ReduceSumSquare. --------- Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com> Co-authored-by: guschmue <guschmue@microsoft.com>	2023-06-12 07:46:27 -07:00
pengwa	40bcc0441b	Enhance StatisticsSubscriber (#16098 ) ### Enhance StatisticsSubscriber There are a few improvements for `StatisticsSubscriber`: - Reduce the peak memory impact for large tensors (tensors with a very large number of elements consumed too much GPU memory and caused the original recipe run to fail with OOM) by splitting the statistics computation into two phases (split into buckets, then merge the results across buckets). - Allow dumping intermediate tensors. Originally only the return values of nn.Module forward() were dumped; sometimes we want to inspect a specific intermediate tensor inside the forward() function, and this is now supported. - Add documentation for collecting dumps on multiple ranks. Docs link on this branch for better view: https://github.com/microsoft/onnxruntime/blob/pengwa/conv_tool_v2/docs/ORTModule_Convergence_Notes.md --------- Co-authored-by: mindest <30493312+mindest@users.noreply.github.com>	2023-06-12 18:32:08 +08:00
JiCheng	eed02a3f78	Xnnpack QDQ test (#16281 ) ### Description A few QDQ tests failed on XNNPACK EP. The likely reason is that the range of input_data does not fit the scale and zero_point.	2023-06-12 14:00:42 +08:00
zhangsibo1129	97751ad516	[CANN] Fix registration of Identity operator (#16210 ) ### Description This [PR](`e726151b5c (diff-6957596681c25d78e7f3f56485f307fb7e66369309523240209a62c8fa21646b)`) introduces a missing registration of Identity operator for version greater than 14. ### Motivation and Context It broke the CANN CI. I added the registration of identity operator.	2023-06-10 17:23:21 -07:00
JiCheng	5ab51694ab	Gather op with scalar indices in NNAPI EP (#16279 ) ### Description NNAPI doesn't support a scalar indices input for Gather. This change works around that limitation.	2023-06-10 09:32:07 +08:00
Yulong Wang	59f42cccb8	[js/common] refactor tensor type in onnxruntime-common (#15843 ) ### Description refactor tensor type in onnxruntime-common. ### Motivation and Context The major motivation is that I am doing a local change to address the API part of #15312. And I am doing a refactoring of onnxruntime-common anyway (#15772). The `tensor.ts` and `tensor-impl.ts` are too large, so I split contents into multiple files to make the type declarations clearer. The original target of this change is for API only (i.e. do not refactor any implementation). However, there are a few type/implementation inconsistencies so I also made minimal changes to fix them. ### Changes - extract `TensorUtils` for non-template interfaces - extract `TensorFactory` for all overloads of `Tensor.fromImage()` - refactor the options type used for `Tensor.fromImage()` - fix JSDoc comments to make option descriptions consistent with actual type declarations - fix an inconsistency for `options.format` and `options.bitmapFormat`; change all `bitmapFormat` to `format` - extract `ConversionUtils` for `tensor.toDataURL()` and `tensor.toImageData()` - put implementations into multiple files from `tensor-impl.ts` - fix a bug that caused a unit test to fail; add comments for a future fix.	2023-06-09 16:19:29 -07:00
Yulong Wang	f274bbb0c8	[js] add API that allows to get package version (#16207 ) ### Description Add an API for users to get version of current package. example usage: ```js import { env } from 'onnxruntime-node'; console.log(env.versions.node); // output "1.16.0" ``` ```js import { env } from 'onnxruntime-web'; console.log(env.versions.web); // output "1.16.0" console.log(env.versions.common); // output "1.16.0" console.log(env.versions.node); // output "undefined" ``` #16156	2023-06-09 16:18:53 -07:00
Yi Zhang	3b5a8352c1	CodeSign Mac packages in nuget pipeline (#16291 ) ### Description 1. Updated Mac package workflow for easier debugging. 2. Changed Archive type from tgz to zip since zip is supported by ESRP. 3. .../dylib.dSYM/Contents/Resources/DWARF/libonnxruntime.1.16.0.dylib is a debug symbol file, so it couldn't be signed. ### Motivation and Context It's required by VS Code. Mac binaries in nuget should be signed.	2023-06-10 06:35:47 +08:00
Adrian Lizarraga	1a22d245e2	[QNN EP] Fix auto_pad handling for Conv operator (#16299 ) ### Description Correctly sets padding when the `auto_pad` attribute is specified for Conv operator. ### Motivation and Context Needed to correctly translate ONNX Conv to QNN Conv2d.	2023-06-09 09:23:08 -07:00
Edward Chen	b668a6da96	Treat Objective-C static analysis warnings as errors (#16293 ) - Update Objective-C static analysis check to fail on warnings. - Address warning. - Clean up build definition.	2023-06-09 08:51:49 -07:00
Scott McKay	443f553782	Fix native onnxruntime library not loading in Azure App Service (#16286 ) ### Description SetThreadDescription isn't available in an Azure App Service sandbox. #15219 removed a check that it was available, making it a hard dependency. When it's not available the dll load fails with a 'procedure not found' error. Add back the check. ### Motivation and Context #15375 - although note this has nothing to do with the original issue. This is just for https://github.com/microsoft/onnxruntime/issues/15375#issuecomment-1579464889	2023-06-09 18:40:51 +10:00
Hector Li	a9d47f72a4	[QNN EP] Add model description into context binary file metadata for validation (#16248 ) ### Description Add model description into context binary file metadata for validation ### Motivation and Context Dump more information for validation --------- Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>	2023-06-08 22:13:43 -07:00
Hector Li	d1e8d4a261	[QNN EP] Fix an issue for Conv with dynamic weights (#16235 ) ### Description Fix an issue for Conv with dynamic weights. Root cause: the Conv op builder creates the weight input tensor with the wrong name. With dynamic weights, a Transpose node is inserted, and the Conv op builder should use the new name, which is the Transpose output. Using the wrong name causes the weight producer to have the wrong output shape.	2023-06-08 17:09:35 -07:00
Jhen-Jie Hong	ac8444f299	[js/rn] Implement dispose native method (#16131 ) ### Description Implement `dispose` react native method. ### Motivation and Context Currently we cannot release the memory used by a model in the JS runtime when we no longer need it; the only way is to reload the app in debug or restart the app in release.	2023-06-09 09:17:33 +10:00
Adrian Lizarraga	b48628f1cd	[QNN EP] Add tests for large inputs that trigger memory alloc errors (#16223 ) ### Description Adds tests for operators that return error 1002 (QNN_COMMON_ERROR_MEM_ALLOC) when the call to graphFinalize() fails. This seems to happen for large input sizes. Operators: - Sub - Div - Conv - MaxPool ### Motivation and Context This documents bugs that need to be addressed with unit tests.	2023-06-08 15:47:51 -07:00
Changming Sun	b72fe664c1	Refactor prepack buffer code (#16280 ) ### Description 1. Use IAllocatorUniquePtr to replace BufferUniquePtr. It will ensure the deleter is always right. 2. Change some std::unique_ptr to std::optional 3. Bypass Arena allocator when allocating the prepack buffers for mlas. In this special case, Arena doesn't help any. And this change is just an internal implementation change, it doesn't affect our public interface.	2023-06-08 14:42:02 -07:00
Sheil Kumar	9d52632da9	[DML EP] Register Div with int64 and NonZero with bool (#16276 ) [DML] Register Div with int64 and NonZero with bool These data types are supported by DML	2023-06-08 13:49:39 -07:00
kunal-vaishnavi	79e0230002	Add vocab masks to Whisper export with beam search (#16180 ) ### Description This PR adds flags for exporting Whisper with vocab masks for logits processing. This PR also sets `input_features` back to FP32 precision for the user and casts `input_features` to FP16 precision when needed. ### Motivation and Context This helps enable specific logits processing for the exported Whisper model.	2023-06-08 12:36:35 -07:00
Yuriy Chernyshov	a3a443c804	Support re2 == 2023-06-02 (#16257 ) ### Description google/re2 [was switched](`49d776b9d2`) to absl::string_view in version 2023-06-02. As `absl::string_view` is a drop-in replacement for `std::string_view` it does not have `as_string()` method. This PR ensures the forward compatibility with the newest versions of re2 library.	2023-06-08 11:26:26 -07:00
Scott McKay	b07b647f66	Fix some issues with NNAPI Softmax (#16095 ) ### Description Update NNAPI Softmax to coerce to 2D when opset is < 13. This prevents the layout change to NHWC from breaking the implementation, as well as making it work correctly when the ONNX node's axis != 1. Add check for opset 13+ that axis is inner-most dimension as we don't currently handle any other value correctly. Update tests to add model to check NHWC layout, as well as 4D tests. We didn't notice the issues with the NNAPI EP as it was only processing input shapes that were 2D or 4D (which was overly restrictive as well). ### Motivation and Context #15949	2023-06-08 13:56:06 +10:00
Artur	dc1312cfb1	[web] fix: Provide typings for exports (#16249 ) ### Description Adds typings to be compatible with `moduleResolution: bundler` ### Motivation and Context Fixes #16242	2023-06-07 14:52:36 -07:00
Changming Sun	fe0cc8ce62	Remove some usages of CUDA_VERSION macro (#16199 ) ### Description We should avoid using the macro since the value of the macro is inaccurate. For example, our prebuilt packages are built with CUDA 11.8 but people may run the binaries with CUDA 11.4. (The minimal CUDA version we support is CUDA 11.4) A runtime function should be used to determine CUDA version. Like: ```C++ int cuda_runtime_version = 0; CUDA_CALL_THROW(cudaRuntimeGetVersion(&cuda_runtime_version)); ORT_ENFORCE(cuda_runtime_version >= 11040, "ONNX Runtime needs cuda runtime higher than 11.4"); ```	2023-06-07 14:34:22 -07:00
Dmitri Smirnov	908e940660	[CPP Api] Remove deprecated CustomOp API (#16256 ) ### Description Custom Op API has been deprecated in 1.15 release. We are removing it.	2023-06-07 14:03:13 -07:00
Vrajang Parikh	67f4a4fd16	Objective-C binding for ORT training (#16127 ) ### Description Implement Objective-C binding for `ORTCheckPoint`. Additionally, - Modify `onnxruntime_objectivec.cmake` to only include training header and sources when training flag is enabled - Enable objective-c binding for `orttraining-mac-ci-pipeline` ### Motivation and Context This PR is part of implementing Objective-C bindings for training API. It implements objective-c binding for ORTCheckPoint class. The objective-C API closely resembles the C++ API. Note: The test for saving checkpoint is skipped as it requires use of training session. It will be added when the objective-c binding for `ORTTrainingSession` is added.	2023-06-07 14:01:30 -07:00
Adam Pocock	bca49d62a0	Fixing CoreML in Java (#16231 ) ### Description The name of the flag we set when compiling the JNI binding to enable the CoreML EP changed at some point in the past. This PR fixes it by updating the flag in the JNI. I also added a quick smoke test for the CoreML provider to make sure it doesn't crash and can be enabled. ### Motivation and Context All the EPs should work as expected in Java. Fixes #16230.	2023-06-07 12:24:57 -07:00
Edward Chen	1261d0b8ba	Fix some build issues on MacOS with Xcode 14.3. (#15878 ) - Fix flatbuffers flatc warning, unused-but-set-variable. - Address `-Wshorten-64-to-32` warnings (fix in our code, allow in dependencies' code). - Update CI builds to use Xcode 14.3. - Update minimum iOS version to 12.0. - Update Mac hosted agents to MacOS 13 where possible.	2023-06-07 12:07:11 -07:00
Adrian Lizarraga	b8858f034e	[QNN EP] Increase conv test tolerance for Windows x64 (#16241 ) ### Description Increases allowable accuracy tolerance for specific Conv op test on QNN CPU backend (Windows x64). ### Motivation and Context Allow QNN NuGet pipeline to run. PR https://github.com/microsoft/onnxruntime/pull/15975 introduced a failing test on Windows x64.	2023-06-07 10:52:56 -07:00
Wanming Lin	a8c2f24ae0	[WebNN EP] Merge support for segment anything into main branch (#16208 ) We implemented a number of new ops and data types to support running segment anything model on Chromium WebNN DML backend (POC) in a forked branch https://github.com/honry/onnxruntime/tree/stable-diffusion In this PR, we migrate the changes in the forked branch to main branch, including: - 22 new ops - New tensor data types: bool, int32, uint32, uint64, int64, float16 (As JavaScript hasn't shipped Float16Array, we use Uint16Array as a workaround) - Handle empty input tensors and duplicated outputs - Fixed some nits	2023-06-07 09:56:37 -07:00
cloudhan	05bea0d3c3	Add new cases for non biased mha tests (#16097 ) 1. Add new test data GetSelfAttentionData_WithPastAndPresent_HeadSize8_NoMask_NoRelPosBias, also added non-biased data 2. Add new test data GetCrossAttentionData_DiffSequenceLengths_HeadSize8, also added non-biased data 3. Disabled the new tests for CUDA EP because qkv is not correctly transposed.	2023-06-07 15:04:27 +08:00
cloudhan	3373160863	[CPU EP] Refactor CPU mha (#16247 ) Followup of #16075	2023-06-07 14:41:14 +08:00
cloudhan	f013965831	Add non qkv biased version mha unittests (#16075 ) 1. Add nonbiased mha unittests data 2. Update CPU and CUDA EP to accept inputs with `qkv_bias`	2023-06-06 09:18:41 +08:00
Adam Pocock	3c2a11f2f1	[java] Allow the creation of boolean tensors from ByteBuffer (#15556 ) ### Description The tensor creation code now allows the creation of boolean tensors from non-direct `ByteBuffer` instances. It previously only allowed them from arrays and direct `ByteBuffer` instances and this fixes that inconsistency. The boolean tensor test has been updated to cover all three cases. ### Motivation and Context Fixes #15509.	2023-06-05 09:58:50 -07:00
PeixuanZuo	a95f8ae53c	[ROCm] Update ROCm/MIGraphX CI pipeline (#16215 ) MIGraphX CI - Change docker container user name to `onnxruntimedev` ROCm CI - Build docker image every job instead of using prebuild image. - Every job create a container with only one GPU with command `docker run -it --device=/dev/kfd --device=/dev/dri/renderDxxx` - Remove tests that are unstable or use outdated interfaces. - Enable training ortmodule test.	2023-06-05 10:28:10 +08:00
ashari4	18c97381cd	Detect fake tensor mode if it has already been created. (#16220 ) ### Description Detect fake tensor mode if it has already been created. Follows this example in pytorch: `86c7652503/torch/_inductor/compile_fx.py (L280)` ### Motivation and Context As of torch nightly 6/2/23, when trying to run a torch dynamo graph on the ORT backend, we observe ``` E torch._dynamo.exc.BackendCompilerFailed: backend='compiler_fn' raised: E AssertionError: Mixing fake modes NYI E E E You can suppress this exception and fall back to eager by setting: E import torch._dynamo E torch._dynamo.config.suppress_errors = True ``` The issue is that `ort_backend.py` creates a new fake tensor mode even though one has already been created by torch.	2023-06-02 23:17:49 -07:00
Somdev Sangwan	2e66bc8669	prevent object destruction compile error (#16134 ) ### Description The proposed fix is to store the result of AsBlockSparse() in a variable to ensure the object isn't destroyed until the end of the current scope. ### Motivation and Context "own_buffer_tensor" is a temporary object that is destroyed at the end of the expression and causes a compile error.	2023-06-02 11:19:53 -07:00
Changming Sun	6b5b79872b	Avoid taking dependency on dl.fedoraproject.org (#16202 ) ### Description 1. Avoid taking dependency on dl.fedoraproject.org The website is not very stable. Our build pipelines often fail to fetch packages from there. 2. Update manylinux to the latest version	2023-06-02 07:41:46 -07:00
Changming Sun	7686193c40	Fix DNNL build (#16201 )	2023-06-02 09:46:03 +08:00
Yulong Wang	319a0dc6aa	[js/doc] allow deduplicate opset version (#16182 ) ### Description allow deduplicate opset version in generated document webgpu-operators.md	2023-06-01 17:28:08 -07:00
Dale Phurrough	6e1c3003ff	DML EP and MLAS buffer allocator - increase alignment to 64 bytes for AVX-512 processing (#15141 ) Fixes #13119 top concerns by * using `onnxruntime::AllocatorDefaultAlloc` instead of `malloc` * set `MLAS_DEFAULT_PREFERRED_BUFFER_ALIGNMENT=64` which cascades that value to several members and functions not directly related to MLAS. ### Motivation and Context * Fixes #13119 top concerns. Otherwise, alignment is to 16 bytes circa 1990s 👴 * Does not yet enable flexible alignment. Instead fixed at 64 (64 x 8 bits=512 bits) for modern NN hardware like AVX-512	2023-06-01 16:32:55 -07:00
Adrian Lizarraga	5a4c3b7937	[QNN EP] Support Equal, Less, LessOrEqual, Greater, GreaterOrEqual operators on HTP backend (#16171 ) ### Description - Updates QDQ transformer to handle QDQ logical operators (Equal, Less, LessOrEqual, Greater, GreaterOrEqual). - Expects 2 DQ inputs and no Qs in the output, which is boolean. ### Motivation and Context This is needed to enable QDQ models with logical comparison operators to run on QNN EP.	2023-06-01 15:07:15 -07:00
Hector Li	f72dc198c6	[QNN EP]Add UT for cached Qnn context binary (#16184 ) ### Description 1. Add UT for cached Qnn context binary 2. Minor change: set model path to "" if model_path is not available since the model could be loaded from buffer instead of Onnx file ### Motivation and Context support more scenario --------- Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>	2023-06-01 14:28:46 -07:00
Changming Sun	5bfa1183d1	Add a Memory Profiling build job in post merge pipeline (#16172 ) ### Description 1. Add a Memory Profiling build job 2. Remove no absl build job since the feature will be removed 3. Simplify post-merge-jobs.yml by unifying the pool names ### Motivation and Context To catch build errors in #16124	2023-06-01 13:00:44 -07:00
Alexander Visheratin	e6c6184fee	[JS/WebGPU] Unsqueeze operator implementation (#16138 ) ### Description This PR adds an implementation of the Unsqueeze operator to WebGPU JSEP. The implementation follows the [operator schema](https://github.com/onnx/onnx/blob/main/docs/Operators.md#Unsqueeze). To implement the `Unsqueeze` operator in the same fashion as the `Squeeze`, I added the `ComputeOutputShape()` method to the `UnsqueezeBase` class and made some slight modifications. Please let me know if it is a bad idea and if I should move this method to the JS implementation. I also uncommented test case lines in the `suite-test-list.jsonc` file for both Squeeze and Unsqueeze operators following @hariharans29's [comment](https://github.com/microsoft/onnxruntime/pull/16024#issuecomment-1565113633). ### How was it tested 1. I created a model with only one operator: ```Python import onnx.helper node = onnx.helper.make_node( "Unsqueeze", inputs=["T", "axes"], outputs=["y"], ) graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 4, 5]), onnx.helper.make_tensor_value_info("axes", 7, [2])], [onnx.helper.make_tensor_value_info("y", 1, [3, 1, 4, 5, 1])]) onnx.save(onnx.helper.make_model(graph), "unsqueeze.onnx") ``` 2. I compiled the runtime using @fs-eire's [instructions](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce). 3. I ran the test models in the browser using this minimal setup: ```HTML <html> <script src=".\dist\ort.webgpu.min.js"></script> <script> async function run() { const session = await ort.InferenceSession.create('unsqueeze.onnx', {executionProviders: ['webgpu']}); console.log(session); const input = new ort.Tensor('float32', new Float32Array(60), [3, 4, 5]); const dim = new ort.Tensor('int64', [1n, 4n], [2]); const output = await session.run({ "T": input, "axes": dim }); console.log(output); } run(); </script> </html> ``` ### Motivation and Context Improve operator coverage for WebGPU JSEP.	2023-06-01 12:23:02 -07:00
Changming Sun	5b08176314	Exclude shufflenet from DNNL's model tests (#16126 )	2023-06-01 10:56:24 -07:00
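
The two-phase approach mentioned in the StatisticsSubscriber commit (#16098), splitting a tensor into buckets and then merging the per-bucket results, can be illustrated with a small sketch. This is not the code from that PR: the names `BucketStats`, `bucket_stats`, `merge_stats` and `summarize` are hypothetical, and plain Python lists stand in for GPU tensors. It only shows why per-bucket partial results are enough to recover global statistics without materializing large temporaries at once.

```python
import math
from typing import Dict, Iterable, NamedTuple, Sequence


class BucketStats(NamedTuple):
    """Partial statistics for one bucket (hypothetical helper type)."""
    count: int
    minimum: float
    maximum: float
    total: float
    total_sq: float


def bucket_stats(values: Sequence[float]) -> BucketStats:
    # Phase 1: summarize a single bucket; only this slice is touched at a time.
    return BucketStats(
        count=len(values),
        minimum=min(values),
        maximum=max(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def merge_stats(parts: Iterable[BucketStats]) -> BucketStats:
    # Phase 2: combine the small per-bucket summaries into one global summary.
    parts = list(parts)
    return BucketStats(
        count=sum(p.count for p in parts),
        minimum=min(p.minimum for p in parts),
        maximum=max(p.maximum for p in parts),
        total=sum(p.total for p in parts),
        total_sq=sum(p.total_sq for p in parts),
    )


def summarize(values: Sequence[float], bucket_size: int) -> Dict[str, float]:
    # Assumes a non-empty input and bucket_size >= 1.
    buckets = (values[i:i + bucket_size] for i in range(0, len(values), bucket_size))
    merged = merge_stats(bucket_stats(b) for b in buckets)
    mean = merged.total / merged.count
    # Population variance from the merged sums; clamp tiny negative rounding errors.
    variance = max(merged.total_sq / merged.count - mean * mean, 0.0)
    return {
        "min": merged.minimum,
        "max": merged.maximum,
        "mean": mean,
        "std": math.sqrt(variance),
    }
```

For example, `summarize([1.0, 2.0, 3.0, 4.0], bucket_size=3)` processes the buckets `[1.0, 2.0, 3.0]` and `[4.0]` separately and gives the same min, max and mean as a single pass over all four values.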


8958 commits