In this change
1. Vectorization of k is updated to 4.
2. Tile_A and Tile_B are stored transposed in shared memory, which improves
memory locality for our access pattern.
3. Lane output is switched to individual vectors and its loop is unrolled;
this solves the problem where lane_output was not kept in registers before
(see the sketch below).
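A C++-flavored sketch of the lane-output idea, purely for illustration (the actual change is in the WGSL matmul shader and the names below are made up): a dynamically indexed array may be spilled to local memory, while individually named accumulators with an unrolled loop can stay in registers.
```cpp
// Illustrative sketch only (assumed names); the actual change is in WGSL.
struct Vec4 { float x, y, z, w; };

// Before (conceptually): lane_output[i] was a dynamically indexed array that
// the compiler could spill to local memory between ALU runs.
// After (conceptually): each accumulator is a named value and the loop is
// fully unrolled, so the values stay in registers across the whole k loop.
inline Vec4 MulAdd4(Vec4 acc, Vec4 a, float b) {
  acc.x += a.x * b;
  acc.y += a.y * b;
  acc.z += a.z * b;
  acc.w += a.w * b;
  return acc;
}
```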
Performance improvements from this change are not very consistent. On a Tiger Lake
GPU with driver 32.0.101.6460 (the latest Intel drivers):
```
Baseline
model_benchmark.exe -i C:\Phi-3.5-mini-instruct-onnx-web\Phi-3.5-mini-instruct-onnx-web\ -l 1000
Batch size: 1, prompt tokens: 1001, tokens to generate: 128
Prompt processing (time to first token):
avg (us): 7.36557e+06 <<<<
avg (tokens/s): 135.903
p50 (us): 7.35498e+06
stddev (us): 27599
n: 5 * 1001 token(s)
With Change
model_benchmark.exe -i C:\Phi-3.5-mini-instruct-onnx-web\Phi-3.5-mini-instruct-onnx-web\ -l 1000
Batch size: 1, prompt tokens: 1001, tokens to generate: 128
Prompt processing (time to first token):
avg (us): 6.52302e+06 <<<<
avg (tokens/s): 153.457
p50 (us): 6.52224e+06
stddev (us): 10407.3
n: 5 * 1001 token(s)
```
However, comparing the before and after profiles in Intel GPA, one can
clearly see straight runs of ALU work that are no longer interspersed with
writebacks to the local memory that previously held lane_output.

There is a crash in the WebGPU CI pipeline. It happens at process
shutdown when unloading onnxruntime_pybind11_state.pyd.
Here is the callstack:
```
dxil.dll!DxcSwapThreadMalloc() Unknown
dxil.dll!DxcThreadMalloc::DxcThreadMalloc(struct IMalloc *) Unknown
dxil.dll!DxcValidator::Release(void) Unknown
[Inline Frame] webgpu_dawn.dll!Microsoft::WRL::ComPtr<IDxcValidator>::InternalRelease() Line 235 C++
[Inline Frame] webgpu_dawn.dll!Microsoft::WRL::ComPtr<IDxcValidator>::{dtor}() Line 290 C++
webgpu_dawn.dll!dawn::native::d3d12::Backend::`scalar deleting destructor'(unsigned int) C++
webgpu_dawn.dll!`eh vector destructor iterator'(void * ptr, unsigned __int64 size, unsigned __int64 count, void(*)(void *) destructor) C++
webgpu_dawn.dll!dawn::native::InstanceBase::~InstanceBase() Line 197 C++
webgpu_dawn.dll!dawn::native::InstanceBase::`scalar deleting destructor'(unsigned int) C++
webgpu_dawn.dll!dawn::native::InstanceBase::DeleteThis() Line 218 C++
ucrtbase.dll!<lambda>(void)() Unknown
ucrtbase.dll!__crt_seh_guarded_call<int>::operator()<<lambda_7777bce6b2f8c936911f934f8298dc43>,<lambda>(void) &,<lambda_3883c3dff614d5e0c5f61bb1ac94921c>>() Unknown
ucrtbase.dll!_execute_onexit_table() Unknown
onnxruntime_pybind11_state.pyd!dllmain_crt_process_detach(const bool is_terminating) Line 182 C++
> onnxruntime_pybind11_state.pyd!dllmain_dispatch(HINSTANCE__ * const instance, const unsigned long reason, void * const reserved) Line 293 C++
ntdll.dll!LdrpCallInitRoutine() Unknown
ntdll.dll!LdrShutdownProcess() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
kernel32.dll!ExitProcessImplementation() Unknown
ucrtbase.dll!exit_or_terminate_process() Unknown
ucrtbase.dll!common_exit() Unknown
python312.dll!00007ff9cab3ec8d() Unknown
python312.dll!00007ff9cab3efbf() Unknown
python312.dll!00007ff9cab3edee() Unknown
python312.dll!00007ff9cab57f4c() Unknown
python312.dll!00007ff9cab57579() Unknown
python312.dll!00007ff9cab573be() Unknown
python312.dll!00007ff9cab5729b() Unknown
python312.dll!00007ff9cabacfcb() Unknown
python312.dll!00007ff9cabacd7d() Unknown
python312.dll!00007ff9cab99e2d() Unknown
python.exe!00007ff78a641230() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
```
It might be caused by an incorrect destruction order of some global variables:
the DX DLLs were destroyed earlier than the WebGPU instance held by our code
in onnxruntime_pybind11_state.pyd.
### Description
(1) Update BiasGelu fusion to support onnx Gelu-20.
Since onnx Gelu-20 supports float/double/bf16/fp16, we also update the
related ops to support these data types in the CUDA and ROCm execution
providers:
(2) Add double support for the Gelu/FastGelu ops in the CUDA/ROCm execution
providers
(3) Add BFloat16 support for the Gelu ops in the CUDA execution provider
(4) Add unit tests
(5) Update the operator documents
### Motivation and Context
https://github.com/microsoft/onnxruntime/issues/23491
### Description
Add details about how to access the BrowserStack logs
### Motivation and Context
- A BrowserStack link on its own is confusing to people who don't have
context.
Let me know if you have suggestions to make the text clearer or more
informative.
NDK has two toolchain cmake files as you can see in
https://android.googlesource.com/platform/ndk/+/refs/heads/main/build/cmake
By default, the NDK uses the legacy one to provide the best compatibility.
We don't need that, so this PR changes the build to use the new one.
The new toolchain cmake file uses standard cmake variables like
CMAKE_ANDROID_RTTI to control C++ features.
### Description
This PR will enable python dlpack interface by default.
### Motivation and Context
The dlpack Python interface is useful in inference mode, not only in training
mode, since some pre-processing of inference results may be written in torch
and unnecessary device transfers should be reduced in those cases.
Closes https://github.com/microsoft/onnxruntime/issues/15963
Closes https://github.com/microsoft/onnxruntime/issues/22061
TODOs:
- [x] Add tests like
5407c69028/orttraining/orttraining/test/python/orttraining_test_ortvalue.py
that are unrelated to the training feature
---------
Co-authored-by: Xavier Dupré <xadupre@users.noreply.github.com>
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
Add overload of `TryParseStringWithClassicLocale()` that uses `std::from_chars()` for certain types.
Reduce binary size. It recently increased after PR #23526.
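A minimal sketch of what a `std::from_chars()`-based parse can look like for integral types; the helper name and signature below are assumptions for illustration, not the actual overload added in this change.
```cpp
#include <charconv>
#include <string>
#include <type_traits>

// Hedged sketch: a from_chars-based parse for integral types. The real
// TryParseStringWithClassicLocale() overload may differ in name and shape.
template <typename T, typename = std::enable_if_t<std::is_integral_v<T>>>
bool TryParseWithFromChars(const std::string& s, T& value) {
  const char* first = s.data();
  const char* last = s.data() + s.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  // Require that the whole string was consumed and no overflow occurred.
  return ec == std::errc{} && ptr == last;
}
```
Compared with locale-based stream parsing, `std::from_chars()` does no allocation and no locale lookups, which helps both speed and binary size.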
Fix the issue that the newly generated EP context model is not able to find external data.
### Description
The newly generated EP context model was not able to find the external data file because it lost track of the source model path, which is used to locate the external initializers.
Related issue: https://github.com/microsoft/onnxruntime/issues/23358
### Description
After some investigation and debugging, I decided to follow the recommended
workaround suggested in https://github.com/vitejs/vite/issues/8427.
### Motivation and Context
There is a known issue with Vite 5.x when using a WebAssembly package.
Detailed information is in https://github.com/vitejs/vite/issues/8427.
There were previous attempts to fix this problem (#23487). I tried
various ways to make it work out of the box for Vite users, but none
of them worked: some "fixes" did fix the Vite usage but broke other
use cases/bundlers, and some introduced other issues. Eventually I figured
out that there is no good way to fix this inside ONNX Runtime.
Considering that the root cause is inside Vite and it may be fixed in Vite
v6, I think the best way for now is to follow the recommended workaround.
Fix tensor external data info length parsing issue.
The old implementation was parsing a `size_t` value with `strtol` (via `OrtStrToPtrDiff`) on ARM64 MSVC.
bf023ab3d5/onnxruntime/core/platform/path_lib.h (L74)
If we have `sizeof(size_t) == 8` and `sizeof(long) == 4` (as is the case for x64 and ARM64 MSVC), `strtol` will return a maximum value of `2^31-1` even for a larger, valid `size_t` value. `strtol` will also set `errno` to `ERANGE`, but we weren't checking that.
Updated to use `ParseStringWithClassicLocale` which will parse directly to the target type.
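For illustration only (not the codebase's actual helper), the truncation can be reproduced with a small standalone snippet; the value below is just an example above `LONG_MAX` on MSVC targets where `long` is 32-bit.
```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <string>

int main() {
  // Example offset larger than 2^31-1; with a 32-bit long (MSVC), strtol
  // saturates at LONG_MAX and sets errno to ERANGE.
  const std::string offset = "2147483648";
  errno = 0;
  long narrow = std::strtol(offset.c_str(), nullptr, 10);
  std::printf("strtol -> %ld, ERANGE=%d\n", narrow, errno == ERANGE);

  // Parsing into a 64-bit type (as the updated code does via
  // ParseStringWithClassicLocale) preserves the full value.
  unsigned long long wide = std::strtoull(offset.c_str(), nullptr, 10);
  std::printf("strtoull -> %llu\n", wide);
  return 0;
}
```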
Added some tests.
Remove inline default transposeHelper and ensure we use the proper check
via CanUse_hipBlasTransposeHelper_MLFloat16
Related to change in ROCm Onnxruntime repo:
https://github.com/ROCm/onnxruntime/pull/82
### Description
Required to correctly limit the grid size of the transpose helper kernel.
### Motivation and Context
Compilation was defaulting to the inline constructor that was removed
instead of using the overloaded case with the proper checks.
Removed the inline default "true" case, as this is incorrect for newer
AMD cards/targets.
Co-authored-by: Ted Themistokleous <tedthemistokleous@amd.com>
### Description
When the user dumps the EP context model, if some nodes are not partitioned to the EP and they have external initializers, the dumped model still points to the old external data file. It does not make sense for the newly generated model to still point to the old external data file.
Example: a model has nodes A, B, C, D, which all have external initializers in ext.bin, so ext.bin contains data for A, B, C, D.
After dumping the EP context model, node A is on CPU, and nodes B, C, D are on the EP and dumped as an EPContext node. If A's data is still in ext.bin, then the newly generated model has to depend on the old ext.bin, which contains all external data for the old model and is a big overhead.
Fix:
For the newly generated model, the user should have the option to specify a new external data file, so that the newly generated model either packs all initializers into the Onnx model or has all initializers in the new external data file.
Add the option ep.context_model_external_initializers_file_name to specify the new external data file and size threshold. All initializers will be placed in the external data file if the option is specified; otherwise all initializers will be embedded inside the EP context Onnx model.
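For context, a rough sketch of how such a session configuration entry could be set through the C++ API; the `ep.context_enable` key and the value shown for the new option are illustrative assumptions, so check the ORT documentation for the exact keys and value format (including how the size threshold is expressed).
```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::SessionOptions so;
  // Enable EP context model dumping and direct initializers of the newly
  // generated model into a new external data file. The file name below is
  // illustrative; consult the ORT docs for the full set of EP-context options.
  so.AddConfigEntry("ep.context_enable", "1");
  so.AddConfigEntry("ep.context_model_external_initializers_file_name",
                    "new_model_ext.bin");
  return 0;
}
```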
### Motivation and Context
Fix the issue https://github.com/microsoft/onnxruntime/issues/23358
### Description
Allow importing the `.mjs` and `.wasm` files.
When using Vite, this enables a web app to consume ORT-web with a simpler
setup:
```js
import * as ort from 'onnxruntime-web';
import wasmFileUrl from 'onnxruntime-web/.wasm?url';
ort.env.wasm.wasmPaths = { wasm: wasmFileUrl };
```
### Description
- Add a new build flag in build.py to build onnxruntime.dll supporting
interfaces for all primary EPs (QNN, TensorRT, OpenVINO, VitisAI).
- Modify onnxruntime.dll/onnxruntime_shared.dll build settings to remove
the dependency on the IHV SDK toolset being installed on the system.
- Change CMake variables to be explicit about building an EP vs. ORT, e.g.
onnxruntime_USE_TENSORRT vs. onnxruntime_USE_TENSORRT_INTERFACE, to
evolve the build system to build ORT independently of EPs.
### Motivation and Context
Changes in the build system required to evolve the repo to build the
components independently while removing unnecessary dependencies
---------
Co-authored-by: Lei Cao <jslhcl@gmail.com>
Co-authored-by: Karim Vadsariya <kvadsariya@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description
Delete Prefast workflow until the build failure is fixed
### Motivation and Context
Right now the pipelines are failing due to an environment change from
GitHub.
### Description
Added gradient computation support for the GlobalMaxPool node.
### Motivation and Context
Improve the training capabilities of ONNX Runtime.
### Description
Remove thrust::unary_function which is deprecated in later versions of
CUDA.
### Motivation and Context
Addresses issue: https://github.com/microsoft/onnxruntime/issues/23499
### Description
This PR updates the version of Dawn to
`b9b4a37041dec3dd62ac92014a6cc1aece48d9f3` (ref:
[chromium](67f86f01dd/DEPS (399)))
in the `deps.txt` file.
The newer version of Dawn includes the previous changes from dawn.patch,
so we can remove the patch file.
There are a few interface changes, and the code is updated correspondingly.
### Description
This change avoids creating a copy of the loop variable. GCC 13.3 suggests
using a reference type to prevent copying.
### Motivation and Context
While building onnxruntime 1.20.1 with the latest changes using GCC 13.3, I
get a build error like:
```
onnxruntime-1.20.1/onnxruntime/core/optimizer/selectors_actions/selector_action_transformer.cc: In function 'onnxruntime::common::Status onnxruntime::MatchAndProcess(Graph&, const GraphViewer&, Node&, bool&, const logging::Logger&, const std::string&, const SelectorActionRegistry&, const SatRuntimeOptimizationSaveContext*)':
onnxruntime-1.20.1/onnxruntime/core/optimizer/selectors_actions/selector_action_transformer.cc:150:23: error: loop variable 'op_schema' creates a copy from type 'const gsl::not_null<const onnx::OpSchema*>' [-Werror=range-loop-construct]
150 | for (const auto op_schema : action_saved_state.produced_node_op_schemas) {
| ^~~~~~~~~
onnxruntime-1.20.1/onnxruntime/core/optimizer/selectors_actions/selector_action_transformer.cc:150:23: note: use reference type to prevent copying
150 | for (const auto op_schema : action_saved_state.produced_node_op_schemas) {
| ^~~~~~~~~
| &
```
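A minimal illustration of the fix the compiler note suggests (stand-in types, not the actual ORT code):
```cpp
#include <vector>

struct OpSchemaRef { const void* schema; };  // stand-in for gsl::not_null<const OpSchema*>

void Process(const std::vector<OpSchemaRef>& produced_node_op_schemas) {
  // Before: `const auto op_schema` copied the wrapper on every iteration and
  // trips -Werror=range-loop-construct on GCC 13.3.
  // After: bind by const reference, as the compiler note suggests.
  for (const auto& op_schema : produced_node_op_schemas) {
    (void)op_schema;  // ... use op_schema ...
  }
}
```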
### Description
Adds the new System.Numerics.Tensors as an input/output type when using
dotnet 8.0 and up. It does not change or remove any of the existing APIs;
it only adds additional ones.
### Motivation and Context
Now that C#/Dotnet has an official tensor type built into the language,
we want to expand the places that it can be used.
### Description
Fix shape infer of onnx GroupNorm.
### Motivation and Context
Unable to run shape inference for onnx `GroupNorm`.
[model.onnx](https://raw.githubusercontent.com/onnx/onnx/refs/heads/main/onnx/backend/test/data/node/test_group_normalization_example/model.onnx)
> python D:\source\cognition\onnxruntime\onnxruntime\python\tools\symbolic_shape_infer.py --input model.onnx
Traceback (most recent call last):
  File "D:\source\cognition\onnxruntime\onnxruntime\python\tools\symbolic_shape_infer.py", line 2999, in <module>
    out_mp = SymbolicShapeInference.infer_shapes(
  File "D:\source\cognition\onnxruntime\onnxruntime\python\tools\symbolic_shape_infer.py", line 2935, in infer_shapes
    raise Exception("Incomplete symbolic shape inference")
### Description
Enable coremltools for Linux build. In order to do this, I did:
1. Add uuid-devel to the Linux images and regenerate them.
2. Patch the coremltools code a little bit to add some missing header
files.
### Motivation and Context
To make the code simpler. Later on I will create another PR to remove
the COREML_ENABLE_MLPROGRAM C/C++ macro.
Also, after this PR I will bring more changes to
onnxruntime_provider_coreml.cmake to make it work with vcpkg.
Microsoft.ML.OnnxRuntime is not built with the Release configuration but
with RelWithDebInfo, which is not recognized by the MSBuild SDK. Consequently,
the optimizations are not enabled. A fix would be to simply force the
configuration to Release when building the .NET code even if it was
set to RelWithDebInfo in the command-line arguments, but I could not find
an easy way to do that. Instead, I try to mimic the behavior of the
Release configuration by setting the optimize property.
I can see a 15% performance improvement using this simple model summing
up the 3 inputs:
```csharp
using System.Buffers;
using System.Collections.Frozen;
using System.Net;
using System.Net.Sockets;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Text;
using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;
using Microsoft.ML.OnnxRuntime;
var config = DefaultConfig.Instance; //.WithOptions(ConfigOptions.DisableOptimizationsValidator);
BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args, config);
public class OnnxBench
{
    private const int Iterations = 100_000;
    private const int BatchSize = 50;

    private InferenceSession _session = default!;
    private string[] _inputNames = default!;
    private OrtValue[] _inputValues = default!;
    private RunOptions _runOptions = default!;

    [GlobalSetup]
    public void GlobalSetup()
    {
        using SessionOptions sessionOptions = new();
        sessionOptions.InterOpNumThreads = 1;
        sessionOptions.IntraOpNumThreads = 1;
        sessionOptions.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL;
        sessionOptions.ExecutionMode = ExecutionMode.ORT_SEQUENTIAL;
        _session = new InferenceSession(
Convert.FromBase64String("CAo6cAoOCgFBCgFCEgFEIgNBZGQKDgoBQwoBRBIBWCIDQWRkEgJscloRCgFBEgwKCggBEgYKAAoCCAFaEQoBQhIMCgoIARIGCgAKAggBWhEKAUMSDAoKCAESBgoACgIIAWIRCgFYEgwKCggBEgYKAAoCCAFCBAoAEBU="),
            sessionOptions);
        _inputNames = ["A", "B", "C"];
        _inputValues =
        [
            OrtValue.CreateTensorValueFromMemory(new float[BatchSize], [BatchSize, 1]),
            OrtValue.CreateTensorValueFromMemory(new float[BatchSize], [BatchSize, 1]),
            OrtValue.CreateTensorValueFromMemory(new float[BatchSize], [BatchSize, 1]),
        ];
        _runOptions = new RunOptions();
    }

    [Benchmark(OperationsPerInvoke = Iterations)]
    public float Run()
    {
        var inputValues0Span = _inputValues[0].GetTensorMutableDataAsSpan<float>();
        var inputValues1Span = _inputValues[1].GetTensorMutableDataAsSpan<float>();
        var inputValues2Span = _inputValues[2].GetTensorMutableDataAsSpan<float>();
        for (int i = 0; i < BatchSize; i += 1)
        {
            inputValues0Span[i] = Random.Shared.NextSingle();
            inputValues1Span[i] = Random.Shared.NextSingle();
            inputValues2Span[i] = Random.Shared.NextSingle();
        }

        float sum = 0f;
        for (int i = 0; i < Iterations; i += 1)
        {
            using var output = _session.Run(_runOptions, _inputNames, _inputValues, _session.OutputNames);
            ReadOnlySpan<float> outputData = output[0].GetTensorDataAsSpan<float>();
            for (int j = 0; j < outputData.Length; j += 1)
            {
                sum += outputData[j];
            }
        }
        return sum;
    }
}
```
| Method | Mean | Error | StdDev |
|------- |---------:|----------:|----------:|
| Before | 5.003 us | 0.0318 us | 0.0297 us |
| After | 4.325 us | 0.0568 us | 0.0503 us |
Fix #16203
Prior to this PR, if `ceil_mode` is on, the calculation of a value
would divide by the kernel size even when the number of remaining pixels is
less than the kernel size, which causes the difference in this operator
between ORT and torch.
However, this fix only applies to the change in #15597, which only
supports AvgPool since opset 19. Older opset versions remain the same,
as they use the mlas files.
Also, the PR fixes the shape mismatch caused by the sliding window starting
from padding. More detail: https://github.com/onnx/onnx/pull/6650 (this
PR is also validated with the tests added in
https://github.com/onnx/onnx/pull/6650).
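A simplified 1-D sketch (not the actual ORT kernel) of the corrected averaging behavior described above:
```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified sketch: when ceil_mode lets the last window run past the input,
// divide by the number of elements actually inside the input instead of the
// full kernel size. Output-length bookkeeping is deliberately simplified.
std::vector<float> AvgPool1D(const std::vector<float>& x, size_t kernel, size_t stride) {
  std::vector<float> y;
  for (size_t start = 0; start < x.size(); start += stride) {
    const size_t end = std::min(start + kernel, x.size());  // clip window at the edge
    float sum = 0.f;
    for (size_t i = start; i < end; ++i) sum += x[i];
    y.push_back(sum / static_cast<float>(end - start));  // divide by the valid count
  }
  return y;
}
```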
### Description
Adds `from __future__ import annotations` to the Python script to support
newer annotation syntax on Python 3.8.
### Motivation and Context
The pipeline that runs this script uses Ubuntu 20.04's default Python
version (3.8), which does not support the newer annotation syntax unless it
is imported from __future__.
### Description
Fixes QNN EP builds due to missing function in provider bridge API:
`logging::LoggingManager::HasDefaultLogger()`
### Motivation and Context
A [recent PR](https://github.com/microsoft/onnxruntime/pull/23120) made
QNN EP a shared library. A [different
PR](https://github.com/microsoft/onnxruntime/pull/23435) added use of a
new function to QNN EP that was not part of the provider bridge API. The
CI did not catch it because main was not merged into the first PR before
merging.
### Description
- Makes QNN EP a shared library **by default** when building with
`--use_qnn` or `--use_qnn shared_lib`. Generates the following build
artifacts:
- **Windows**: `onnxruntime_providers_qnn.dll` and
`onnxruntime_providers_shared.dll`
- **Linux**: `libonnxruntime_providers_qnn.so` and
`libonnxruntime_providers_shared.so`
- **Android**: Not supported. Must build QNN EP as a static library.
- Allows QNN EP to still be built as a static library with `--use_qnn
static_lib`. This is primarily for the Android QNN AAR package.
- Unit tests run for both the static and shared QNN EP builds.
### Detailed changes
- Updates Java bindings to support both shared and static QNN EP builds.
- Provider bridge API:
- Adds logging sink ETW to the provider bridge. Allows EPs to register
ETW callbacks for ORT logging.
- Adds a variety of methods for onnxruntime objects that are needed by
QNN EP.
- QNN EP:
- Adds `ort_api.h` and `ort_api.cc` that encapsulates the API provided
by ORT in a manner that allows the EP to be built as either a shared or
static library.
- Adds custom function to transpose weights for Conv and Gemm (instead
of adding util to provider bridge API).
- Adds custom function to quantize data for LeakyRelu (instead of adding
util to provider bridge API).
- Adds custom ETW tracing for QNN profiling events:
- shared library: defines its own TraceLogging provider handle
- static library: uses ORT's TraceLogging provider handle and existing
telemetry provider.
- ORT-QNN Packages:
- **Python**: Pipelines build QNN EP as a shared library by default.
User can build a local python wheel with QNN EP as a static library by
passing `--use_qnn static_lib`.
- **NuGet**: Pipelines build QNN EP as a shared library by default.
`build.py` currently enforces that QNN EP is built as a shared library.
Support for building a QNN NuGet package with a static QNN EP can be added
later if deemed necessary.
- **Android**: Pipelines build QNN EP as a **static library**.
`build.py` enforces QNN EP to be built as a static library. Packaging
multiple shared libraries into an Android AAR package is not currently
supported due to the added need to also distribute a shared libcpp.so
library.
### Motivation and Context
### Description
Add custom vcpkg ports for the following packages:
1. cpuinfo
2. onnx
3. pthreadpool
4. xnnpack
Because:
- The cpuinfo/pthreadpool/xnnpack packages in the official vcpkg repo
are too old.
- XNNPack's version is updated from 2022-12-22 to 2025-01-17
- CPUINFO's version is updated from 2022-07-19 to 2024-12-09
- Pthreadpool's version is updated from 2020-04-10 to 2024-12-17, and
the source code location is changed from
https://github.com/Maratyszcza/pthreadpool to
https://github.com/google/pthreadpool
- The onnx package in the official repo requires building python from
source, which then requires a lot of additional dependencies to be
installed. This PR removes them.
- Added a disable_gcc_warning.patch file for xnnpack for addressing the
issue reported in https://github.com/google/XNNPACK/issues/7650. I will
remove this patch when the issue is fully addressed.
- Added " -DONNX_DISABLE_STATIC_REGISTRATION=ON" to ONNX's config
options.
-
### Description
This PR updates the triplets files that manage the compile flags for
vcpkg packages.
All the changes are autogenerated except for the gen.py file in this PR.
Main changes:
1. Enable debug info for all Linux build configs (Release and Debug).
2. Set CMAKE_CXX_STANDARD in each triplet. The value is set to 20 for
macOS targets and 17 for the others.
3. Only set _FORTIFY_SOURCE in release build. This is to address a build
issue on some platforms with the following glibc change:
"Warn if user requests __FORTIFY_SOURCE but it is disabled"
https://sourceware.org/git/?p=glibc.git;a=commit;f=include/features.h;h=05c2c9618f583ea4acd69b3fe5ae2a2922dd2ddc
### Motivation and Context
Address a Linux build error.
### Description
Add test project that will perform an automated UI test that runs the
unit tests on Android.
### Motivation
- Enables end-to-end on-device MAUI unit testing which we want to add to
the packaging pipelines
### Context
Microsoft.ML.OnnxRuntime.Tests.MAUI uses DeviceRunners.VisualRunners to
allow running the unit tests (found in
Microsoft.ML.OnnxRuntime.Tests.Common) across multiple devices.
DeviceRunners.VisualRunners provides a simple UI with a button that will
run the unit tests and a panel with the unit test results.
In order to automate the process of running the unit tests across mobile
devices, Appium is used for UI testing orchestration (it provides a way
to interact with the UI), and BrowserStack automatically runs these
Appium tests across different mobile devices.
This project does not include the capability to start an Appium server
locally or attach to a local emulator or device.
## Build & run instructions
### Requirements
* A BrowserStack account with access to App Automate
* You can set BrowserStack credentials as environment variables as shown
[here](https://www.browserstack.com/docs/app-automate/appium/getting-started/c-sharp/nunit/integrate-your-tests#CLI)
* ONNXRuntime NuGet package
1. You can either download the [stable NuGet
package](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime) then
follow the instructions from [NativeLibraryInclude.props
file](../Microsoft.ML.OnnxRuntime.Tests.Common/NativeLibraryInclude.props)
to use the downloaded .nupkg file
2. Or follow the [build
instructions](https://onnxruntime.ai/docs/build/android.html) to build
the Android package locally
* The dotnet workloads for maui and maui-android, which do not always
install correctly automatically
1. `dotnet workload install maui`
2. `dotnet workload install maui-android`
* [Appium](https://appium.io/docs/en/latest/quickstart/) and the
[UiAutomator2
driver](https://appium.io/docs/en/latest/quickstart/uiauto2-driver/)
### Run instructions
1. Build the Microsoft.ML.OnnxRuntime.Tests.MAUI project into a signed
APK.
1. Run the following: `dotnet publish -c Release -f net8.0-android` in
the Microsoft.ML.OnnxRuntime.Tests.MAUI directory.
2. Search for the APK files generated. They should be located in
`bin\Release\net8.0-android\publish`.
3. If they're in a different location, edit the `browserstack.yml` file
to target the path to the signed APK.
2. Ensure you've set the BrowserStack credentials as environment
variables.
3. Run the following in the
Microsoft.ML.OnnxRuntime.Tests.Android.BrowserStack directory: `dotnet
test`
4. Navigate to the [BrowserStack App Automate
dashboard](https://app-automate.browserstack.com/dashboard/v2/builds) to
see your test running!
BUG #23273
This PR does below optimizations:
1. When the number of output channels is one, 1) calculate the offset before the
in-channel loop to reduce index-to-offset calculations, and 2) split
`inputChannelsPerGroup` into `inputChannelsPerGroupInt` and
`inputChannelsRemainder` parts so that we can always access 4 values at a time
for `inputChannelsPerGroupInt` (see the sketch after the perf numbers below).
2. Use a precise initial value to reduce useless loop iterations. Thanks to
@jiangzhaoming for the suggestion on this.
With this PR, ConvTranspose goes from 8.4s to 3.7s on Intel Meteor Lake.
On an NV RTX 2000 Ada, it goes from 2.7s to 1.6s.
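A hedged C++-flavored sketch of the split described in optimization 1 (the real code is a WGSL shader operating on vec4 data; the names follow the description above):
```cpp
#include <cstddef>

// Sketch only: split a channel loop into a part that always reads 4 values at
// a time (inputChannelsPerGroupInt) plus a 0-3 element remainder
// (inputChannelsRemainder), so the main loop never needs bounds handling.
float DotSplit(const float* a, const float* b, size_t inputChannelsPerGroup) {
  const size_t inputChannelsPerGroupInt = inputChannelsPerGroup & ~size_t{3};
  const size_t inputChannelsRemainder = inputChannelsPerGroup - inputChannelsPerGroupInt;
  float sum = 0.f;
  // Main part: always safe to read 4 values at a time.
  for (size_t i = 0; i < inputChannelsPerGroupInt; i += 4) {
    sum += a[i] * b[i] + a[i + 1] * b[i + 1] + a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3];
  }
  // Remainder part: handle the last 0-3 channels individually.
  for (size_t i = 0; i < inputChannelsRemainder; ++i) {
    sum += a[inputChannelsPerGroupInt + i] * b[inputChannelsPerGroupInt + i];
  }
  return sum;
}
```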
### Description
Use onnx_protobuf.h to suppress some GCC warnings.
All the changes are autogenerated by a shell command.
```bash
find . -type f -exec sed -i 's/#include\s\+<onnx\/onnx_pb.h>/#include "core\/graph\/onnx_protobuf.h"/g' {} \;
```
### Motivation and Context
This PR is needed for making vcpkg work (without disabling all warnings).
This PR is split from another bigger PR per request from a reviewer.
### Description
Suppress some strict-aliasing related warnings in WebGPU EP
For example:
```
/home/chasun/src/onnxruntime/onnxruntime/core/providers/webgpu/math/unary_elementwise_ops.cc:208:30: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
208 | float encoded_value = *reinterpret_cast<const float*>(attr);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
This PR does not really fix the problems; it just suppresses the
warnings to make the build pass. Some issues related to strict aliasing could
be fixed by using std::bit_cast, which however requires C++20.
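For reference, the usual well-defined alternatives look like the sketch below; this is not what the PR does (it only suppresses the warnings), just the direction hinted at above:
```cpp
#include <cstdint>
#include <cstring>

// Well-defined ways to reinterpret the bytes of a 32-bit value as float,
// instead of dereferencing a type-punned pointer:
float DecodeAttrMemcpy(uint32_t attr) {
  float value;
  std::memcpy(&value, &attr, sizeof(value));  // allowed by the aliasing rules
  return value;
}

// C++20 only, which is why the PR cannot use it yet:
// float DecodeAttrBitCast(uint32_t attr) { return std::bit_cast<float>(attr); }
```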
### Motivation and Context
Building the code on Azure Linux 3 fails. To reproduce the issue, you may
get an Azure Linux 3 machine and run:
```
python3 tools/ci_build/build.py --update --build --build_wheel --use_xnnpack --build_nodejs --use_webgpu --build_dir b --skip_submodule_sync --parallel --use_binskim_compliant_compile_flags --build_shared_lib --config Release
```
The WebNN CPU device type may now target different backends, such as
CoreML. Legacy special workarounds for the TFLite backend should be
removed, and any resulting failures should be allowed to surface as is, since
these are implementation issues. Additionally, the WebNN EP should adhere to
WebNN API conformance. We assume all the WebNN ops should be supported, so the
per-device-type WebNN op support status in webnn-operators.md is removed as well.
### Description
### Motivation and Context
### Description
Re-implementation of https://github.com/microsoft/onnxruntime/pull/23320
(which was reverted).
- Cleans up QNN logging resources if an error occurs during
initialization.
- Updates `QnnLogging()`, which is a logging callback called by QNN
libs, to handle situations in which ORT logging is unavailable, thus
avoiding a segmentation fault.
- Updates `QnnBackendManager::CreateHtpPowerCfgId()` and
`QnnBackendManager::SetHtpPowerConfig()` to check that backend setup is
complete. These functions get called in QNN EP's `OnRunStart()` even if
QNN backend setup failed and the model is assigned to a different EP.
This prevents a segmentation fault. Our Android tests ran into this
issue because the QNN backend setup failed, the model was then assigned
to CPU EP, and the QNN EP's `OnRunStart()` was still called with an
invalid backend.
### Motivation and Context
If QNN initialization fails at any point, we have to properly clean up
the logging resources so that QNN does not call our `QnnLogging()`
callback after the EP has been destroyed.
Bumps [clang-format](https://github.com/ssciwr/clang-format-wheel) from
19.1.6 to 19.1.7.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f865928dd2"><code>f865928</code></a>
Bump to v19.1.7</li>
<li>See full diff in <a
href="https://github.com/ssciwr/clang-format-wheel/compare/v19.1.6...v19.1.7">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description
Moving the Android E2E test steps from macOS 13 to Ubuntu 22.04.
### Motivation and Context
Reduced the dependency on macOS, which is deprecating the x64 version.