onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-17 18:40:28 +00:00

Author	SHA1	Message	Date
Yulong Wang	ae6dcc839e	Revert "[js/webgpu] disable failed tests temporarily (#23127 )" (#23130 ) ### Description This reverts commit `9115682d69`. ### Motivation and Context	2024-12-18 18:07:50 -08:00
Prathik Rao	31e6e1010c	gather elements webgpu implementation (#23137 ) Increases operator coverage for WebGPU EP.	2024-12-18 16:29:26 -08:00
Changming Sun	5d7030e4c6	Revert DML pipeline changes (#23135 ) ### Description Previously we wanted to add DirectML EP to existing onnxruntime Windows CUDA packages. After careful consideration, we will postpone the change. This PR reverts some pipeline changes previously made by @mszhanyi and @jchen351 .	2024-12-18 10:42:10 -08:00
Changming Sun	e76bd2f5e9	Update CODEOWNERS: remove onnxruntime-es (#21677 ) Removing this restriction for now.	2024-12-17 13:39:13 -08:00
Wanming Lin	a5b60ec03f	[WebNN] Add limit to QDQ ops (#23076 ) WebNN requires the `scale_shape` to be a subsample of the `input_shape`.	2024-12-17 12:52:08 -08:00
Enrico Galli	54edb43e77	[WebNN] Fixes MLTensor caching across different contexts (#23100 ) We weren't checking that MLTensors were from the same context before reusing them. Found while debugging microsoft/webnn-developer-preview#69	2024-12-17 12:51:16 -08:00
Tianlei Wu	5afab787db	Update python version metadata (remove 3.7, 3.8, 3.9; add 3.13). (#23067 ) ### Description * Update python version metadata to be in sync with latest python packages (onnxruntime, onnxruntime-gpu and onnxruntime-qnn). * Update black format target-version to 3.10, and use lintrunner to format all files. * Update the lintrunner installation command line to be consistent. * Include `requirements-lintrunner.txt` in `requirements-dev.txt` to avoid duplicated settings. ### Motivation and Context https://github.com/microsoft/onnxruntime/issues/22993 Python support by numpy: https://numpy.org/neps/nep-0029-deprecation_policy.html#drop-schedule ``` On Apr 05, 2024 drop support for Python 3.9 On Apr 04, 2025 drop support for Python 3.10 ```	2024-12-17 10:59:20 -08:00
Jiajia Qin	0981bbf4ca	[webgpu] Optimize matmulnbits with M > 1 (#23102 ) This is the webgpu native ep implementation of #23092. I used https://github.com/fs-eire/ort-webgpu-nodejs-chatapp-prototype to test. Meanwhile, applied https://github.com/fs-eire/ort-webgpu-nodejs-chatapp-prototype/pull/2 to print the first token time. The result is like below: The latest main branch: Intel Arc Graphics ``` 659 tokens in 24.8sec, 26.57 tokens/sec Decoding first token with input 449 tokens: 13.0 sec Decoding remaining 210 tokens: 11.8 sec 17.79 tokens/sec ``` NV RTX 2000 ``` 659 tokens in 14.4sec, 45.85 tokens/sec Decoding first token with input 449 tokens: 7.3 sec Decoding remaining 210 tokens: 7.0 sec 29.81 tokens/sec ``` ------------------------------------------------------------------------- With this PR: Intel Arc Graphics ``` 657 tokens in 20.6sec, 31.92 tokens/sec Decoding first token with input 449 tokens: 8.5 sec Decoding remaining 208 tokens: 12.1 sec 17.23 tokens/sec ``` NV RTX 2000 ``` 659 tokens in 11.4sec, 57.93 tokens/sec Decoding first token with input 449 tokens: 4.1 sec Decoding remaining 210 tokens: 7.2 sec 28.98 tokens/sec ``` From above data, you can see that with this PR, both intel (13s -> 8.5s) and NV (7.3s -> 4.1s) GPUs for the first token time are performing better.	2024-12-16 20:47:40 -08:00
Yulong Wang	9115682d69	[js/webgpu] disable failed tests temporarily (#23127 ) ### Description Those test cases started to fail for unknown reasons. To unblock the CI, I disabled those tests temporarily to buy time to investigate the root cause.	2024-12-16 15:35:47 -08:00
Dmitri Smirnov	ae97068137	Fix Pybind memory leak (#23105 ) ### Description Array GETITEM returns a new reference, which was being leaked. ### Motivation and Context Address https://github.com/microsoft/onnxruntime/issues/22271	2024-12-16 10:38:23 -08:00
tianf-fff	a4eb8f27b6	[VitisAI] Add profiler interface for vitisai (#23032 ) ### Description Add common interfaces for the Vitis EP profiler. ### Motivation and Context The Vitis EP can collect and record API and kernel timestamps to a file when the onnxruntime '-p' option is enabled.	2024-12-16 09:09:48 -08:00
Changming Sun	2ff66b80e0	Fix a deadlock bug in EigenNonBlockingThreadPool.h (#23098 ) ### Description This PR fixes a deadlock bug in EigenNonBlockingThreadPool.h. It only happens on platforms with weakly ordered memory model, such as ARM64.	2024-12-16 09:05:12 -08:00
Yulong Wang	3a0b958586	add 2 CMake build options of Dawn (#23096 ) ### Description This change adds the following CMake build options for Dawn: - onnxruntime_BUILD_DAWN_MONOLITHIC_LIBRARY - OFF by default - when enabled, builds Dawn as a monolithic library (webgpu_dawn.dll) - onnxruntime_ENABLE_DAWN_BACKEND_VULKAN - OFF by default - when enabled, build with Vulkan backend for Dawn on Windows - onnxruntime_ENABLE_DAWN_BACKEND_D3D12 - ON by default - when enabled, build with DirectX 12 backend for Dawn on Windows ### File Size Comparison (Windows) \| Build \| cmdline \| File Size \| \|---\|---\|---\| \| Baseline \| --config Release<br/> --build_shared_lib \| `12,755,456 onnxruntime.dll` \| \| WebGPU D3D12 (default) \| --use_webgpu<br/> --config Release<br/> --build_shared_lib \| `17,082,368 dxcompiler.dll`<br/>` 1,508,472 dxil.dll`<br/>`18,708,480 onnxruntime.dll` \| \| WebGPU D3D12+Vulkan \| --use_webgpu<br/> --config Release<br/> --build_shared_lib<br/> --cmake_extra_defines<br/> onnxruntime_ENABLE_DAWN_BACKEND_D3D12=1<br/> onnxruntime_ENABLE_DAWN_BACKEND_VULKAN=1 \| `17,081,344 dxcompiler.dll`<br/>` 1,508,472 dxil.dll`<br/>`19,388,416 onnxruntime.dll` \| \| WebGPU Vulkan \| --use_webgpu<br/> --config Release<br/> --build_shared_lib<br/> --cmake_extra_defines<br/> onnxruntime_ENABLE_DAWN_BACKEND_D3D12=0<br/> onnxruntime_ENABLE_DAWN_BACKEND_VULKAN=1 \| `17,615,872 onnxruntime.dll` \| \| Monolithic \| --use_webgpu<br/> --config Release<br/> --build_shared_lib<br/> --cmake_extra_defines<br/> onnxruntime_BUILD_DAWN_MONOLITHIC_LIBRARY=1 \| `17,082,368 dxcompiler.dll`<br/>` 1,508,472 dxil.dll`<br/>`13,277,696 onnxruntime.dll`<br/>` 5,616,640 webgpu_dawn.dll` \| \| External Dawn \| --use_webgpu<br/> --config Release<br/> --build_shared_lib<br/> --cmake_extra_defines<br/> onnxruntime_USE_EXTERNAL_DAWN=1<br/> --skip_tests \| `17,081,344 dxcompiler.dll`<br/>` 1,508,472 dxil.dll`<br/>`13,277,184 onnxruntime.dll`	2024-12-13 16:05:48 -08:00
genmingz@AMD	62e7e24f17	Add attrProto.release_s interface (#22977 ) ### Description Add AttributeProto.release_s interface, which is used to obtain the string in the attribute using move semantics instead of copying it ### Motivation and Context The ep_context node stores a lot of information in attributes, which may cause the memory usage to increase. Use this interface to avoid memory waste --------- Co-authored-by: GenMing Zhong <genmingz@xlnx.xilinx.com> Co-authored-by: genmingz <genmingz@amd.com>	2024-12-12 21:13:43 -08:00
Hector Li	2a36fd4f6e	Fix the ctx_gen tool to make sure all generated ctx.onnx have max_size (#23097 ) ### Description Fix the qnn_ctx_gen tool to make sure all generated ctx.onnx have max_size	2024-12-12 21:12:02 -08:00
Hector Li	f43f40facf	Backward compatible with old QNN version (#23095 ) ### Description Make QNN EP compatible with old QNN versions	2024-12-12 17:04:20 -08:00
Yulong Wang	01539ee7ab	[js/webgpu] fix Conv2DMatMul shader's out-of-bound read (#23085 ) ### Description Fix a bug caused by potential out-of-bound reads of `W` in the Conv2DMatMul shader. ### Motivation and Context Fixes #22983	2024-12-12 11:33:53 -08:00
Dmitri Smirnov	890a719c91	Remove deprecated static from Eigen that contributes to size increase (#23084 ) ### Description This patches Eigen source to remove an unused deprecated static var. ### Motivation and Context Internal customer request.	2024-12-12 10:19:47 -08:00
Ankit Maheshkar	1f88284f96	OVEP 1.21.0 Development Updates (#23080 ) ### Description OVEP development changes for ORT 1.21 Release ### Motivation and Context - Has Critical Bug Fixes - Improved Performance optimizations for both memory & inference latency (https://github.com/intel/onnxruntime/pull/513) - Enabled Model Compilation using NPUW (https://github.com/intel/onnxruntime/pull/508) - Fixed support for EPContext embed mode 0 for lower memory utilization - Updated NuGet package name as `Intel.ML.OnnxRuntime.OpenVino` - Fixed QDQ Stripping logic on NPU	2024-12-11 22:26:32 -08:00
Hector Li	ebb968d34a	disable the EP context embed model by default in session option (#23070 ) change the default value for session option ep.context_embed_mode to 0 to avoid the model loading memory overhead	2024-12-11 17:26:29 -08:00
Yulong Wang	e605870783	[js/web] Update API for `ort.env.webgpu` (#23026 ) ### Description This PR is a replacement of #21671. It offers a new way for accessing the following: - `ort.env.webgpu.adapter`: - deprecating. There is no point to get the value of it. Once `GPUDevice.adapterInfo` is supported, there is no point to set the value too. - `ort.env.webgpu.device`: - set value of `GPUDevice` if user created it. Use at user's own risk. - get value of `Promise<GPUDevice>`. if not exist, create a new one. if exist return it. - `ort.env.webgpu.powerPreference`: - deprecating. encouraging users to set `ort.env.webgpu.device` if necessary. - `ort.env.webgpu.forceFallbackAdapter`: - deprecating. encouraging users to set `ort.env.webgpu.device` if necessary.	2024-12-11 10:24:14 -08:00
sushraja-msft	8800830a44	Implement 2d tiled matmulnbits specialized for prefill (#23058 ) ### Description This change implements matmul4bits with tiling both for A and B. This is beneficial for prefill scenarios on Intel integrated GPUs, because each row of A has to run through the same set of shared rows of B. This change should improve core occupancy and model_benchmark does indicate improvements for prefill. The same shader is not used for generation because when A has just a single row, the other threads in the workgroup get unused and that hurts performance. ``` -- Baseline run on an Alderlake GPU -- C:\onnxruntime>C:\model_benchmark\model_benchmark.exe -i C:\Phi-3.5-mini-instruct-onnx-web\Phi-3.5-mini-instruct-onnx-web -l 500 Batch size: 1, prompt tokens: 501, tokens to generate: 128 Prompt processing (time to first token): avg (us): 1.72338e+07 avg (tokens/s): 29.0707 << p50 (us): 1.72548e+07 stddev (us): 57012.8 n: 5 * 501 token(s) Token generation: avg (us): 79227.5 avg (tokens/s): 12.6219 p50 (us): 79284.4 stddev (us): 2109.72 n: 635 * 1 token(s) Token sampling: avg (us): 15.8198 avg (tokens/s): 63211.8 p50 (us): 14.3 stddev (us): 8.67178 n: 640 * 1 token(s) E2E generation (entire generation loop): avg (ms): 27297.8 p50 (ms): 27269.8 stddev (ms): 89.4322 n: 5 Peak working set size (bytes): 5490987008 WebGPU device lost (2): Device was destroyed. 
----------------------------------- With Prefill Optimization ---- C:\onnxruntime>C:\model_benchmark\model_benchmark.exe -i C:\Phi-3.5-mini-instruct-onnx-web\Phi-3.5-mini-instruct-onnx-web -l 500 Batch size: 1, prompt tokens: 501, tokens to generate: 128 Prompt processing (time to first token): avg (us): 1.2135e+07 avg (tokens/s): 41.2856 << p50 (us): 1.21288e+07 stddev (us): 21282.1 n: 5 * 501 token(s) Token generation: avg (us): 78945.3 avg (tokens/s): 12.667 p50 (us): 78900.7 stddev (us): 2232.43 n: 635 * 1 token(s) Token sampling: avg (us): 20.5608 avg (tokens/s): 48636.3 p50 (us): 18.7 stddev (us): 19.0409 n: 640 * 1 token(s) E2E generation (entire generation loop): avg (ms): 22163.8 p50 (ms): 22160.1 stddev (ms): 31.3122 n: 5 Peak working set size (bytes): 5478862848 WebGPU device lost (2): Device was destroyed. ```	2024-12-10 17:07:11 -08:00
amancini-N	d8de3c4096	[CUDA EP] Fix BeamSearch on T5 with sequence_as_input_ids (#20667 ) (#20668 ) ### Description Change the implementation of BeamSearch op when using CUDA EP: in case of T5 model, and in case the decoder input_ids are sequences, copy the sequences device-to-device instead of host-to-device ### Motivation and Context - Fixes #20667	2024-12-10 16:20:47 -08:00
shiyi	02f0af0d08	[WebNN] Improve data type check of slice op (#22988 ) A follow-up of [[WebNN] Support negative steps for slice](https://github.com/microsoft/onnxruntime/pull/22871#discussion_r1847929774). Slice op is emulated by reverse+slice when steps < 0 so `SliceOpBuilder::HasSupportedInputsImpl()` should also check the supported data types of reverse. --------- Co-authored-by: Wanming Lin <wanming.lin@intel.com>	2024-12-10 15:48:16 -08:00
Edward Chen	fa6ad202aa	Minor updates to onnxruntime_java.cmake (#23068 ) - Use `ANDROID` instead of `CMAKE_SYSTEM_NAME STREQUAL "Android"`. - Put common gradle arguments into `COMMON_GRADLE_ARGS` to make them easier to reuse.	2024-12-10 15:44:36 -08:00
Jiajia Qin	defcc4f819	[webgpu] Optimize Expand (#23052 ) ### Description Use components = 4 if possible. This is the webgpu native implementation from #22752	2024-12-10 14:58:57 -08:00
Misha Chornyi	bf4d3e1a5b	Update vcpkg.json - lock flatbuffer version (#23046 ) ### Description Locking version introduced in: `03ea5dc495/onnxruntime/core/flatbuffers/schema/ort_training_checkpoint.fbs.h (L11-L13)` ### Motivation and Context Resolve issue for version `>=1.20.` https://github.com/microsoft/onnxruntime/issues/22666	2024-12-10 11:23:01 -08:00
Jian Chen	5f7b9d0245	Upgrade gradle to 8.7 (#23016 ) ### Description This PR only upgrades the gradle version and the `com.android.tools.build:gradle` version in build.gradle. It only updates the react-native library gradle version, not the e2e test.	2024-12-10 10:49:03 -08:00
A-Satti	b14b4ec703	Restore Qspectre flag (#23060 ) Restore a removed Qspectre flag and update comment ### Motivation and Context Adjustment for PR `f5293d253c`	2024-12-09 21:52:21 -08:00
Scott McKay	708ee8556e	Reduce default logger usage (#23030 ) ### Description We have use cases where multiple sessions are created concurrently. Minimizing the usage of the default logger is important for these scenarios. Wire through the session logger to as many places as possible. The EP logger can also be used once the session is created (can't be used during EP construction/kernel registration but can be used in GetCapability and Compile). ### Motivation and Context Improve logging when there are concurrent sessions.	2024-12-10 12:54:14 +11:00
wejoncy	e12421be30	[CoreML] more performace flag (#22975 ) ### Description Refactor the unsqueeze implementation. Add more flags to boost performance. Add a profile flag. --------- Co-authored-by: jicwen <jicwen@YiMacBook-Pro.local> Co-authored-by: wejoncy <wejoncy@.com> Co-authored-by: Scott McKay <skottmckay@gmail.com>	2024-12-10 09:35:05 +08:00
amancini-N	8f3384b4c1	Fix BeamSearch T5 if initializers are on outer scope (#23044 ) ### Description This PR adds the logic needed to consider only the needed implicit inputs on BeamSearch op in case of T5 model (encoder/decoder, 2 graphs). The logic added is similar to what happens in the _If_ kernel setup. ### Motivation and Context Fixes #23043	2024-12-09 15:15:20 -08:00
Scott McKay	2f2c73bdde	Miscellaneous cleanups (#23048 ) ### Description - fix some missing end of version markers and since_version info - fix include to use onnx_protobuf.h which handles minimal builds - we should always prefer that header over directly using the onnx ones	2024-12-10 09:24:16 +11:00
Yulong Wang	22ae97c7dc	[webgpu] Add Alias def for Flatten (#23038 ) ### Description Add `Alias` definition for Flatten in WebGPU EP. also add int32/uint32 in type constraint T.	2024-12-09 14:19:43 -08:00
Wanming Lin	6d9636f07c	[WebNN] Allow ops to handle ignoring an empty tensor as input (#22972 ) ### Description Some ops should allow an empty tensor as input, e.g. the roi and scales inputs in Resize. ### Motivation and Context It avoids some unexpected fallbacks for optional inputs with an empty tensor. For example, roi and scales are both optional inputs in Resize; in some models they have a non-empty name but an empty initializer presented as `[0]`. WebNN currently falls back on all nodes with a 0 dimension, which is not expected.	2024-12-06 17:58:15 -08:00
A-Satti	f5293d253c	Update Intel Thread Counts (#22894 ) ### Description The default thread count methodology in onnxruntime did not account for upcoming Intel microarchitectures, leading to a suboptimal thread count. Optimizing the thread count for new Intel microarchitectures reveals gains on the majority of models across datatypes, with speedups of up to ~1.5x. ### Motivation and Context Applications should run on Intel with the most performant thread configuration for the majority of models. With new microarchitectures, adjusting the thread count methodology is required to take advantage of their differences.	2024-12-06 13:56:50 -08:00
Jing Fang	bd5a759d0c	[ARM CPU] Add rotary embedding fp16 kernel (#23013 ) ### Description Add fp16 kernel to rotary embedding to boost performance. ### Motivation and Context Part of performance optimization work for group query attention	2024-12-06 13:25:48 -08:00
Hector Li	401d16c671	Enable QNN HTP spill fill buffer setting to save RAM usage. (#22853 ) ### Description Enable QNN HTP spill fill buffer setting to save RAM usage. This feature is available after QNN 2.28. The QNN context binary needs to be regenerated. https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_backend.html#qnn-htp-backend-api Requirements: 1. Regenerate the ONNX model with the QNN context binary by setting the EP option enable_htp_spill_fill_buffer = 1. 2. Works for a model with multiple context binaries. The 2 ONNX models with context binaries need to be merged manually into 1 ONNX model. 3. Requires a Linux platform if the context binary is generated offline, since the QnnSystem lib is not available for the Windows x86_64 platform. Nothing extra is needed when running model inference. The generated EPContext node will have a max_size attribute with the maximum spill fill buffer size for the context binary. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-06 11:36:52 -08:00
dependabot[bot]	d27fecd3d3	Bump cross-spawn from 6.0.5 to 6.0.6 in /js/web (#23019 ) Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 6.0.5 to 6.0.6. Changelog (sourced from [cross-spawn's changelog](https://github.com/moxystudio/node-cross-spawn/blob/v6.0.6/CHANGELOG.md)): [6.0.6](https://github.com/moxystudio/node-cross-spawn/compare/v6.0.5...v6.0.6) (2024-11-18) Bug Fixes - disable regexp backtracking ([#160](https://redirect.github.com/moxystudio/node-cross-spawn/issues/160)) ([ba5aaef](https://github.com/moxystudio/node-cross-spawn/commit/ba5aaef)) - **core:** support worker threads ([#127](https://redirect.github.com/moxystudio/node-cross-spawn/issues/127)) ([f4af31c](https://github.com/moxystudio/node-cross-spawn/commit/f4af31c)) Commits - `d35c865` chore(release): 6.0.6 - `5a37e19` chore: update package.json and package.lock - `ba5aaef` fix: disable regexp backtracking ([#160](https://redirect.github.com/moxystudio/node-cross-spawn/issues/160)) - `f4af31c` fix(core): support worker threads ([#127](https://redirect.github.com/moxystudio/node-cross-spawn/issues/127)) - See full diff in [compare view](https://github.com/moxystudio/node-cross-spawn/compare/v6.0.5...v6.0.6) Dependabot will merge this PR once it's up-to-date and CI passes on it, as requested by @fs-eire. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-12-05 10:07:08 -08:00
Yi Zhang	6ed77cc374	Deprecate macos-12 (#23017 ) ### Motivation and Context The ESRP code-sign task now supports .NET 8, so we can remove macos-12	2024-12-05 14:07:21 +08:00
Yulong Wang	1c79a4c9dd	[js/common] use TS type inference to eliminate `unknown` (#23012 ) ### Description This change uses a TypeScript trick to infer global types in onnxruntime-common. Thanks to the strong type system of TypeScript, we are able to refer to types that may not be available in the context. This keeps onnxruntime-common free of dependencies like "@webgpu/types" while still allowing those types to be used in the declaration. See comments of `TryGetGlobalType` in `type-helper.ts`.	2024-12-04 19:01:26 -08:00
Jian Chen	f340b3cad3	Adding DML to python cuda package (#22606 )	2024-12-04 21:20:12 -05:00
Yulong Wang	3234487385	[js] remove more unused training types (#22753 ) ### Description remove more unused training types	2024-12-04 16:44:09 -08:00
dependabot[bot]	3975e79303	Bump axios from 1.6.1 to 1.7.9 in /js/node (#23009 )	2024-12-04 23:52:24 +00:00
Wanming Lin	cacd97dba3	[WebNN] Improve the util function of creating WebNN constant MLOperand (#22935 ) Merge the util functions to create or retrieve: - A WebNN constant MLOperand filled with the specified value, data type, and shape. - A WebNN scalar constant MLOperand with the specified value and data type.	2024-12-04 15:09:54 -08:00
Jing Fang	fbe22fdac7	[ARM CPU] Fix flaky hqnbitgemm UT (#23010 ) ### Description Increase fp16 qnbitgemm UT tol and use fixed seeds.	2024-12-04 14:55:52 -08:00
Yulong Wang	7b0fa407eb	fix requirements.txt path (#22946 ) ### Description #22380 removes the file `tools/ci_build/github/linux/docker/inference/x86_64/python/cpu/scripts/requirements.txt` but it is still used in `dockerfiles/Dockerfile.cuda`. This change updates the file path of requirements.txt. Fixes #22945.	2024-12-04 13:08:29 -08:00
Yulong Wang	d0dde4f7d4	[wasm/test] update packages versions (#23008 ) ### Description Upgrade packages version to resolve the following dependabot alerts: - https://github.com/microsoft/onnxruntime/security/dependabot/269 - https://github.com/microsoft/onnxruntime/security/dependabot/268 - https://github.com/microsoft/onnxruntime/security/dependabot/275 - https://github.com/microsoft/onnxruntime/security/dependabot/306 ``` # npm audit report braces <3.0.3 Severity: high Uncontrolled resource consumption in braces - https://github.com/advisories/GHSA-grv7-fg5c-xmjg fix available via `npm audit fix` node_modules/braces cookie <0.7.0 cookie accepts cookie name, path, and domain with out of bounds characters - https://github.com/advisories/GHSA-pxg6-pf52-xh8x fix available via `npm audit fix` node_modules/cookie engine.io 0.7.8 - 0.7.9 \|\| 1.8.0 - 6.6.1 Depends on vulnerable versions of cookie Depends on vulnerable versions of ws node_modules/engine.io socket.io 1.6.0 - 4.7.5 Depends on vulnerable versions of engine.io node_modules/socket.io ws 8.0.0 - 8.17.0 Severity: high ws affected by a DoS when handling a request with many HTTP headers - https://github.com/advisories/GHSA-3h5v-q93c-6h6q fix available via `npm audit fix` node_modules/ws socket.io-adapter 2.5.2 - 2.5.4 Depends on vulnerable versions of ws node_modules/socket.io-adapter 6 vulnerabilities (1 low, 1 moderate, 4 high) ```	2024-12-04 13:08:13 -08:00
Yulong Wang	fdf5ffe2cf	[js/node] fix TypeScript declaration in onnxruntime-node (#23000 ) ### Description fix TypeScript declaration in onnxruntime-node ### Motivation and Context Fixes #22978	2024-12-04 11:29:27 -08:00
Xu Xing	c19617a24a	[js/webgpu] Add GatherND (#22847 )	2024-12-04 09:57:32 -08:00
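
The first-token improvement quoted in commit 0981bbf4ca (#23102) can be rechecked with a few lines of arithmetic. The sketch below is not part of the repository; it only copies the timings printed in that commit message, and the names `RUNS`, `first_token_speedup`, `decode_rate` and `summarize` are made up for illustration.

```python
"""Recompute the first-token speedups quoted in commit 0981bbf4ca (#23102)."""

# Timings copied from the benchmark output in the commit message.
# Each run: (first-token seconds, remaining-decode seconds, remaining tokens).
RUNS = {
    "Intel Arc Graphics": {"main": (13.0, 11.8, 210), "pr": (8.5, 12.1, 208)},
    "NV RTX 2000": {"main": (7.3, 7.0, 210), "pr": (4.1, 7.2, 210)},
}


def first_token_speedup(before, after):
    """Ratio of first-token latency before vs. after (>1 means faster)."""
    return before[0] / after[0]


def decode_rate(run):
    """Tokens per second for the tokens decoded after the first one."""
    _, rest_seconds, rest_tokens = run
    return rest_tokens / rest_seconds


def summarize(runs):
    """Return {gpu: (first-token speedup, main decode rate, PR decode rate)}."""
    return {
        gpu: (
            first_token_speedup(r["main"], r["pr"]),
            decode_rate(r["main"]),
            decode_rate(r["pr"]),
        )
        for gpu, r in runs.items()
    }


if __name__ == "__main__":
    for gpu, (speedup, main_rate, pr_rate) in summarize(RUNS).items():
        print(f"{gpu}: first token {speedup:.2f}x faster; "
              f"decode {main_rate:.2f} -> {pr_rate:.2f} tokens/sec")
```

Because the log rounds the seconds to one decimal place, decode rates recomputed this way differ slightly from the tokens/sec figures printed in the commit message; the first-token ratios are the quantity the commit's conclusion rests on.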


12111 commits