onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-03 03:58:54 +00:00

Author	SHA1	Message	Date
Misha Chornyi	bf4d3e1a5b	Update vcpkg.json - lock flatbuffer version (#23046 ) ### Description Locking version introduced in: `03ea5dc495/onnxruntime/core/flatbuffers/schema/ort_training_checkpoint.fbs.h (L11-L13)` ### Motivation and Context Resolve issue for version `>=1.20.` https://github.com/microsoft/onnxruntime/issues/22666	2024-12-10 11:23:01 -08:00
Jian Chen	5f7b9d0245	Upgrade gradle to 8.7 (#23016 ) ### Description This PR only upgrades the gradle version and the `com.android.tools.build:gradle` version in build.gradle. It only updates the react-native library's gradle version, not the e2e test.	2024-12-10 10:49:03 -08:00
A-Satti	b14b4ec703	Restore Qspectre flag (#23060 ) Restore a removed Qspectre flag and update comment ### Motivation and Context Adjustment for PR `f5293d253c`	2024-12-09 21:52:21 -08:00
Scott McKay	708ee8556e	Reduce default logger usage (#23030 ) ### Description We have use cases where multiple sessions are created concurrently. Minimizing the usage of the default logger is important for these scenarios. Wire through the session logger to as many places as possible. The EP logger can also be used once the session is created (can't be used during EP construction/kernel registration but can be used in GetCapability and Compile). ### Motivation and Context Improve logging when there are concurrent sessions.	2024-12-10 12:54:14 +11:00
wejoncy	e12421be30	[CoreML] more performance flags (#22975 ) ### Description Refactor Unsqueeze's implementation, add more flags to boost performance, and add a profile flag. --------- Co-authored-by: jicwen <jicwen@YiMacBook-Pro.local> Co-authored-by: wejoncy <wejoncy@.com> Co-authored-by: Scott McKay <skottmckay@gmail.com>	2024-12-10 09:35:05 +08:00
amancini-N	8f3384b4c1	Fix BeamSearch T5 if initializers are on outer scope (#23044 ) ### Description This PR adds the logic needed to consider only the needed implicit inputs on BeamSearch op in case of T5 model (encoder/decoder, 2 graphs). The logic added is similar to what happens in the _If_ kernel setup. ### Motivation and Context Fixes #23043	2024-12-09 15:15:20 -08:00
Scott McKay	2f2c73bdde	Miscellaneous cleanups (#23048 ) ### Description - fix some missing end of version markers and since_version info - fix include to use onnx_protobuf.h which handles minimal builds - we should always prefer that header over directly using the onnx ones	2024-12-10 09:24:16 +11:00
Yulong Wang	22ae97c7dc	[webgpu] Add Alias def for Flatten (#23038 ) ### Description Add `Alias` definition for Flatten in WebGPU EP. also add int32/uint32 in type constraint T.	2024-12-09 14:19:43 -08:00
Wanming Lin	6d9636f07c	[WebNN] Allow ops to handle ignoring an empty tensor as input (#22972 ) ### Description Some ops should allow an empty tensor as input, e.g. the roi and scales inputs of Resize. ### Motivation and Context This avoids unexpected fallback for optional inputs backed by an empty tensor. For example, roi and scales are both optional inputs of Resize; in some models they have a non-empty name but an empty initializer presented as `[0]`. WebNN currently falls back for all nodes with a 0 dimension, which is not expected here.	2024-12-06 17:58:15 -08:00
A-Satti	f5293d253c	Update Intel Thread Counts (#22894 ) ### Description The default thread count methodology in onnxruntime did not account for new, upcoming Intel microarchitectures, leading to a suboptimal thread count. Optimizing the thread count for new Intel microarchitectures shows gains on the majority of models across datatypes, with speedups of up to ~1.5x. ### Motivation and Context Applications should run on Intel with the most performant thread configuration for the majority of models. With new microarchitectures, adjusting the thread count methodology is required to take advantage of their differences.	2024-12-06 13:56:50 -08:00
Jing Fang	bd5a759d0c	[ARM CPU] Add rotary embedding fp16 kernel (#23013 ) ### Description Add fp16 kernel to rotary embedding to boost performance. ### Motivation and Context Part of performance optimization work for group query attention	2024-12-06 13:25:48 -08:00
Hector Li	401d16c671	Enable QNN HTP spill fill buffer setting to save RAM usage. (#22853 ) ### Description Enable the QNN HTP spill fill buffer setting to save RAM usage. This feature is available after QNN 2.28. The QNN context binary needs to be re-generated. https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_backend.html#qnn-htp-backend-api Requirements: 1. Re-generate the Onnx model with the QNN context binary by setting the EP option enable_htp_spill_fill_buffer = 1. 2. Works for a model with multiple context binaries. The 2 Onnx models with context binaries need to be manually merged into 1 Onnx model. 3. Requires a Linux platform if generating the context binary offline, since the QnnSystem lib is not available for the Windows x86_64 platform. Nothing extra is needed when running model inference. The generated EPContext node will have a max_size attribute with the maximum spill fill buffer size for the context binary. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-06 11:36:52 -08:00
dependabot[bot]	d27fecd3d3	Bump cross-spawn from 6.0.5 to 6.0.6 in /js/web (#23019 ) Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 6.0.5 to 6.0.6. Bug fixes in 6.0.6 (2024-11-18), per cross-spawn's changelog: disable regexp backtracking (cross-spawn #160); **core:** support worker threads (cross-spawn #127). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-12-05 10:07:08 -08:00
Yi Zhang	6ed77cc374	Deprecate macos-12 (#23017 ) ### Motivation and Context The ESRP code-sign task now supports .NET 8, so we can remove macos-12.	2024-12-05 14:07:21 +08:00
Yulong Wang	1c79a4c9dd	[js/common] use TS type inference to eliminate `unknown` (#23012 ) ### Description This change uses a TypeScript trick to infer global types in onnxruntime-common. Thanks to the strong type system of TypeScript, we are able to refer to types that may not be available in the context. This lets onnxruntime-common avoid including dependencies like "@webgpu/types" while still being able to use those types in its declarations. See the comments on `TryGetGlobalType` in `type-helper.ts`.	2024-12-04 19:01:26 -08:00
Jian Chen	f340b3cad3	Adding DML to python cuda package (#22606 )	2024-12-04 21:20:12 -05:00
Yulong Wang	3234487385	[js] remove more unused training types (#22753 ) ### Description remove more unused training types	2024-12-04 16:44:09 -08:00
dependabot[bot]	3975e79303	Bump axios from 1.6.1 to 1.7.9 in /js/node (#23009 )	2024-12-04 23:52:24 +00:00
Wanming Lin	cacd97dba3	[WebNN] Improve the util function of creating WebNN constant MLOperand (#22935 ) Merge the util functions to create or retrieve: - A WebNN constant MLOperand filled with the specified value, data type, and shape. - A WebNN scalar constant MLOperand with the specified value and data type.	2024-12-04 15:09:54 -08:00
Jing Fang	fbe22fdac7	[ARM CPU] Fix flaky hqnbitgemm UT (#23010 ) ### Description Increase fp16 qnbitgemm UT tol and use fixed seeds.	2024-12-04 14:55:52 -08:00
Yulong Wang	7b0fa407eb	fix requirements.txt path (#22946 ) ### Description #22380 removes the file `tools/ci_build/github/linux/docker/inference/x86_64/python/cpu/scripts/requirements.txt` but it is still used in `dockerfiles/Dockerfile.cuda`. This change updates the path to requirements.txt. Fixes #22945.	2024-12-04 13:08:29 -08:00
Yulong Wang	d0dde4f7d4	[wasm/test] update packages versions (#23008 ) ### Description Upgrade packages version to resolve the following dependabot alerts: - https://github.com/microsoft/onnxruntime/security/dependabot/269 - https://github.com/microsoft/onnxruntime/security/dependabot/268 - https://github.com/microsoft/onnxruntime/security/dependabot/275 - https://github.com/microsoft/onnxruntime/security/dependabot/306 ``` # npm audit report braces <3.0.3 Severity: high Uncontrolled resource consumption in braces - https://github.com/advisories/GHSA-grv7-fg5c-xmjg fix available via `npm audit fix` node_modules/braces cookie <0.7.0 cookie accepts cookie name, path, and domain with out of bounds characters - https://github.com/advisories/GHSA-pxg6-pf52-xh8x fix available via `npm audit fix` node_modules/cookie engine.io 0.7.8 - 0.7.9 \|\| 1.8.0 - 6.6.1 Depends on vulnerable versions of cookie Depends on vulnerable versions of ws node_modules/engine.io socket.io 1.6.0 - 4.7.5 Depends on vulnerable versions of engine.io node_modules/socket.io ws 8.0.0 - 8.17.0 Severity: high ws affected by a DoS when handling a request with many HTTP headers - https://github.com/advisories/GHSA-3h5v-q93c-6h6q fix available via `npm audit fix` node_modules/ws socket.io-adapter 2.5.2 - 2.5.4 Depends on vulnerable versions of ws node_modules/socket.io-adapter 6 vulnerabilities (1 low, 1 moderate, 4 high) ```	2024-12-04 13:08:13 -08:00
Yulong Wang	fdf5ffe2cf	[js/node] fix TypeScript declaration in onnxruntime-node (#23000 ) ### Description fix TypeScript declaration in onnxruntime-node ### Motivation and Context Fixes #22978	2024-12-04 11:29:27 -08:00
Xu Xing	c19617a24a	[js/webgpu] Add GatherND (#22847 )	2024-12-04 09:57:32 -08:00
Yulong Wang	a615bd6688	Bump version of Dawn to 12a3b24c4 (#23002 ) ### Description Upgrade version of Dawn. Removed dawn.patch, because all patches are included in upstream. Updated code affected by API changes (`const char*` -> `WGPUStringView`).	2024-12-04 09:47:16 -08:00
Yulong Wang	50b38ca9d5	[js/web] update default export to include webgpu (#22754 ) ### Description This PR changes the following exports: - `onnxruntime-web` is now the same as `onnxruntime-web/webgpu`. - `onnxruntime-web/webgpu` is being deprecated. ### Migration instructions: - use `onnxruntime-web` instead of `onnxruntime-web/webgpu`. - use `onnxruntime-web/wasm` if you want to use onnxruntime-web without webgpu/webnn. ### Export table \| file name \| export entry \| includes WASM \| includes JSEP (WebGPU & WebNN) \| includes WebGL \| ------------- \| ------------- \| ----- \| ----- \| ----- \| ort.all.min.js<br/>ort.all.js<br/>ort.all.min.mjs<br/>ort.all.mjs \| `onnxruntime-web/all` \| ✔️\| ✔️\| ✔️ \| ort.min.js<br/>ort.js<br/>ort.min.mjs<br/>ort.mjs \| `onnxruntime-web` \| ✔️\| ❌ --> ✔️\| ✔️ -->❌ \| ort.webgpu.min.js<br/>ort.webgpu.js<br/>ort.webgpu.min.mjs<br/>ort.webgpu.mjs \| `onnxruntime-web/webgpu` \| ✔️ \| ✔️ \|❌ \| ort.wasm.min.js<br/>ort.wasm.js<br/>ort.wasm.min.mjs<br/>ort.wasm.mjs \| `onnxruntime-web/wasm` \| ✔️ \| ❌ \|❌	2024-12-04 09:46:45 -08:00
Chi Lo	9b9f881475	[TensorRT EP] Use TRT/CUDA/ORT version from runtime instead of build time to generate hash value (#22921 ) Use the TensorRT and CUDA versions fetched at runtime to compute the hash value that determines the cache name. Previously the versions were taken at compile/build time, which can cause issues in some cases. For example, the TRT EP uses the TRT version that we or users built against at compile time, but users can switch to a different TRT version at run time. Because the TRT EP always checked the "fixed" build-time TRT version rather than the TRT version actually in use, it could end up using an incompatible TRT engine cache. See the GitHub issue here: https://github.com/microsoft/onnxruntime/issues/22382#issuecomment-2404140754	2024-12-03 21:58:43 -08:00
dependabot[bot]	bd701e4f33	Bump cross-spawn from 7.0.3 to 7.0.6 in /js (#23003 )	2024-12-04 05:07:21 +00:00
Yulong Wang	06526af346	[js/webgpu] fix a bug in transpose shader (#22997 ) ### Description Fix a bug in transpose shader, when input/output rank is 1. ### Motivation and Context Fixes #22994	2024-12-03 20:21:08 -08:00
Yulong Wang	e84b8e7bd5	allow specify a custom local source path for Dawn (#22999 ) ### Description Allows building ONNX Runtime with a custom local path to Dawn's source code. Usage: ```sh build --use_webgpu --cmake_extra_defines "onnxruntime_CUSTOM_DAWN_SRC_PATH=C:/src/dawn" ```	2024-12-03 19:25:22 -08:00
dependabot[bot]	4497c97d54	Bump cross-spawn from 7.0.3 to 7.0.6 in /js/node (#22998 ) Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 7.0.3 to 7.0.6. Bug fixes, per cross-spawn's changelog: 7.0.6 (2024-11-18) update cross-spawn version to 7.0.5 in package-lock.json; 7.0.5 (2024-11-07) fix escaping bug introduced by backtracking; 7.0.4 (2024-11-07) disable regexp backtracking (cross-spawn #160). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-12-03 18:48:22 -08:00
Yulong Wang	d3bc3180d8	[js/node] fix CUDA artifact installation script for Linux/x64 (#22984 ) ### Description This PR updates installation script to fix it for CUDA v12. However, it may be difficult for CUDA v11 since the steps are quite complicated to automate. Added a few lines of instructions instead. fixes #22877	2024-12-03 16:07:43 -08:00
Prathik Rao	5c644d3747	[WebGPU EP] Flatten implementation (#22964 ) Implements flatten operator for native webgpu.	2024-12-03 14:40:57 -08:00
Jian Chen	9ed0c7fe26	Redo "Update Gradle version 8.7 and java version 17 within onnxruntime/java" (#22923 )	2024-12-02 18:34:25 -08:00
Edward Chen	e2356a0403	Use UTF8 string encoding in ORTSaveCodeAndDescriptionToError(). (#22982 ) Update from ASCII to UTF8 string encoding when creating the `NSString` description.	2024-12-02 17:41:52 -08:00
Kee	8c52fa3924	[VSINPU]Split/Pad and some element-wise OPs support (#22916 ) ### Description -Add split/pad/neg/not/ceil/round/min/max op support -Fix conv2d op default pads value issue -Add VSINPU EP to support python bindings ### Motivation and Context -New OPs support for VSINPU EP --------- Signed-off-by: Kee <xuke537@hotmail.com>	2024-12-02 13:57:30 -08:00
Satya Kumar Jandhyala	e8bf46a70e	[WebGPU EP] Support GroupQueryAttention (#22658 ) ### Description Support GroupQueryAttention operator for native webgpu ep. ### Motivation and Context This is required for inferencing some LLMs.	2024-12-02 12:40:03 -08:00
Jian Chen	6c2ff5fc55	Refactor emulator start and stop functions for clarity and efficiency (#22861 ) ### Description This pull request introduces several enhancements and new functionalities to the `tools/python/util/android/android.py` file, focusing on improving the management of Android emulators. The most important changes include adding a timeout parameter to the `start_emulator` function, adding checks to prevent multiple emulators from running simultaneously, and introducing new utility functions to manage emulator processes more effectively. Enhancements to `start_emulator` function: * Added a `timeout_minutes` parameter to the `start_emulator` function to make the startup timeout configurable. * Added a check to prevent starting a new emulator if one with the same AVD name is already running. * Included additional emulator arguments `-verbose` for better control and debugging. * Added a final verification step to ensure the emulator has started successfully. New utility functions for managing emulator processes: * Introduced `check_emulator_running_using_avd_name`, `check_emulator_running_using_process`, and `check_emulator_running_using_pid` to check if an emulator is running based on AVD name, process instance, or PID, respectively. * Added `stop_emulator_by_proc` and `stop_emulator_by_pid` functions to stop the emulator process using a `subprocess.Popen` instance or PID, with a configurable timeout. * Updated the `stop_emulator` function to use the new utility functions for stopping the emulator process. These changes enhance the robustness and flexibility of the emulator management utilities, making it easier to handle different scenarios in CI environments and development workflows. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Scott McKay <skottmckay@gmail.com>	2024-12-02 09:29:17 -08:00
Chi Lo	e234023d11	[TensorRT EP] Fix wrong input order when generating IndexedSubGraph (#22857 ) The input order of generated indexedSubGraph needs to be consistent with the input order of original graph. This PR will also fix the github issue https://github.com/microsoft/onnxruntime/issues/22729	2024-12-02 01:45:29 -08:00
Chi Lo	49a80df77f	Keep the model metadata on the generated EP context model (use bridge api) (#22860 ) In addition to the [PR](https://github.com/microsoft/onnxruntime/pull/22825) which directly uses internal graph api, this PR updates the bridge api for the case of TRT EP and OpenVINO EP.	2024-12-01 21:57:45 -08:00
Vincent Wang	1128882bfd	Quantize Bias for Conv/Gemm on Quantized Model (#22889 ) Some quantized models leave the bias of Conv/Gemm nodes in float instead of quantizing it. This PR creates a sub-graph to quantize the bias for Conv/Gemm nodes with scale = scale_input_0 * scale_input_1 and zp = 0. We only do this for bias initializers, so that ConstantFolding will fold the sub-graph into a real quantized int32 bias initializer during the next round of graph optimization.	2024-11-28 10:10:24 +08:00
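The bias quantization rule in #22889 (scale = scale_input_0 * scale_input_1, zp = 0, int32 result) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `quantize_bias` is hypothetical, the scales are per-tensor scalars, and the int32 saturation and round-half-to-even rounding are choices made for this sketch rather than details taken from the PR.

```python
# Hypothetical helper for illustration only; not the ONNX Runtime implementation.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def quantize_bias(bias, scale_input_0, scale_input_1):
    """Quantize a float bias to int32 using zero point 0.

    Assumes per-tensor (scalar) scales for the activation and the weight.
    """
    # The bias scale is the product of the two input scales, as stated in the PR.
    bias_scale = scale_input_0 * scale_input_1
    quantized = []
    for b in bias:
        q = round(b / bias_scale)  # Python's round() rounds half to even
        quantized.append(max(INT32_MIN, min(INT32_MAX, q)))  # saturate to int32
    return quantized, bias_scale


q_bias, bias_scale = quantize_bias(
    [1.0, -0.5, 0.3], scale_input_0=0.5, scale_input_1=0.25
)
```

Because the zero point is 0, dequantizing is just `q * bias_scale`, which is why the quantized bias can be added directly to an int32 accumulator that shares the same scale.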
Vincent Wang	42ecb05080	[QNN] ReduceL2 Support (#22636 ) Add ReduceL2 support to QNN EP. Some of the QNN AI Hub models contain ReduceL2, such as openai_clip_CLIPTextEncoder and openai_clip_CLIPIamgeEncoder. Without this PR, the ReduceL2 node is assigned to CPU and the graph is split into 2 QNN graphs; with this PR, all nodes are in the QNN EP.	2024-11-28 10:09:13 +08:00
Jing Fang	08abab0b14	[CPU] Fix matmulnbits accuracy level (#22963 ) ### Description Fix matmulnbits accuracy level	2024-11-27 17:40:04 -08:00
wejoncy	a24723df16	[CoreML ] ML Program more operators support [3/N] (#22710 ) ### Description - Erf - Round - Max - ReduceMax - ReduceMean - ReduceSum - Unsqueeze - Squeeze - Softmax --------- Co-authored-by: Scott McKay <skottmckay@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-28 09:21:02 +08:00
Yi Zhang	b930b4ab5b	Limit PipAuthenticate in Private Project now (#22954 ) ### Description Fixes regression in post merge pipeline caused by #22612 ### Motivation and Context So far, the Public Project does not have artifactFeeds.	2024-11-27 13:32:35 +08:00
Wanming Lin	fe749a88a5	[WebNN EP] Fixed bug in usage of Array.reduce() (#22944 ) In JS, calling reduce on an empty array with no initial value throws an error. Fix it by checking the array length first.	2024-11-26 19:03:44 -08:00
wejoncy	c284a686f2	[CoreML] Create EP by AppendExecutionProvider (#22675 ) ### Description AppendExecutionProvider("CoreML", {{"MLComputeUnits","MLProgram"}}) --------- Co-authored-by: Scott McKay <skottmckay@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-27 09:26:31 +08:00
Chen Feiyue	487184fa42	[VSINPU] update crosscompiling patch (#22937 ) ### Description Update this patch because the original file has changed	2024-11-26 14:35:16 -08:00
amancini-N	8826e39a81	#22890 Fix profiling on empty Optional (#22891 ) ### Description Fix sequential_executor.cc to avoid segfault when profiling is used on model with empty Optional ### Motivation and Context Fixes #22890	2024-11-26 11:18:47 -08:00
shiyi	afbb53937c	[WebNN] Support negative steps for slice (#22871 ) Slice with negative steps can be emulated by reverse+slice.	2024-11-25 23:06:23 -08:00
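The reverse+slice emulation mentioned in #22871 can be illustrated on a one-dimensional Python list. This is a sketch of the index arithmetic only, not the WebNN EP code: the function name `slice_negative_step` is hypothetical, and Python's `slice.indices` is used here as a stand-in for the start/end clamping that a Slice operator performs.

```python
def slice_negative_step(data, start, stop, step):
    """Emulate data[start:stop:step] for step < 0 via reverse + positive-step slice.

    Hypothetical helper for illustration; works on a 1-D Python list.
    """
    if step >= 0:
        raise ValueError("this sketch only handles negative steps")
    n = len(data)
    # Normalize and clamp start/stop for a negative step (stop may become -1).
    s, e, _ = slice(start, stop, step).indices(n)
    # Step 1: reverse the axis. Index i in `data` becomes n - 1 - i.
    reversed_data = list(reversed(data))
    # Step 2: slice the reversed data with the mirrored bounds and a positive step.
    return reversed_data[n - 1 - s : n - 1 - e : -step]


values = [0, 1, 2, 3, 4, 5]
picked = slice_negative_step(values, 4, 0, -2)
```

The key point is that the bounds are mirrored (`n - 1 - i`) and the step changes sign, so the positive-step slice visits the same elements in the same order as the original negative-step slice.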

12085 commits