onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
jingyanwangms	4a5d66c15f	Default value 10.2->10.3 in linux-gpu-tensorrt-daily-perf-pipeline.yml (#21823 ) ### Description Fix default value 10.2->10.3 in linux-gpu-tensorrt-daily-perf-pipeline.yml ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-09-10 15:26:16 -07:00
George Wu	31ae11788a	[QNN EP] Update QNN SDK to 2.26 (#22037 ) * update default QNN SDK version to 2.26 * enable layernorm implicit bias workaround for QNN 2.26 * update artifact names for py win arm64 and arm64ec to re-enable ort-qnn-nightly arm64 python packages	2024-09-10 14:03:06 -07:00
Sophie Schoenmeyer	e7107f41de	Decrease API docs artifact retention days (#22003 ) ### Description When API docs workflows fail, we typically don't catch the issue until the most recently generated artifact expires. The current artifact retention is 60 days, so by decreasing to 30 days, we can ensure that we're resolving the workflow failures more quickly. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-09-10 10:44:08 -07:00
Erick Muñoz	7489bfee53	Enable AVX NE CONVERT for FP16 to FP32 cast (#21183 ) ### Description Implementation of a new cast assembly kernel that uses AVX_NE_CONVERT instructions to accelerate casting from FP16 to FP32. Added CPUID checks to determine support of the ISA. ### Motivation and Context Currently FP16 models executed on systems that lack complete FP16 operator support use single precision on every node to run the model, this means the original FP16 weights have to be casted to FP32 in order to run the model properly, this change aims to accelerate the casting by using upconvert instructions and therefore improve performance.	2024-09-09 21:19:31 -07:00
Jake Mathern	d4d419f789	fix more dml warnings (#21980 ) ### Description Fixes more warnings in DML execution provider that lead to security issues in binskim ### Motivation and Context OS components that include ORT must treat certain warnings as errors, and cannot disable critical compiler warnings https://github.com/microsoft/binskim/blob/main/src/BinSkim.Rules/PERules/BA2007.EnableCriticalCompilerWarnings.cs	2024-09-09 17:50:17 -07:00
Jian Chen	93c4c9cb6a	Using wostringstream only on Windows (#21938 ) ### Description Using wostringstream only on Windows ### Motivation and Context From line [62](https://github.com/microsoft/onnxruntime/pull/21938/files#diff-47776d020ac08134de4059eab473550237f4999c598ab56afad3676d2f193edcR62), currently, `stream_` can be either `wostringstream` or `ostringstream` depending on the OS; however, on Unix-like systems, `stream_` should be `ostringstream` instead of `wostringstream`.	2024-09-09 13:20:17 -07:00
Adrian Lizarraga	c7ae9b977a	[Quantization] Apply workaround for crash when using histogram-based calibrators (#21972 ) ### Description - Applies a workaround that prevents the histogram-based calibrators (percentile, entropy, distribution) from crashing. The workaround involves copying inference outputs that come directly from model inputs. A description of the bug is here: https://github.com/microsoft/onnxruntime/issues/21922. This PR does not fix the root bug, but instead provides a workaround to _unblock_ users using histogram-based calibration. - Adds a unit test that runs all histogram-based calibrators to help catch future regressions. We didn't have unit tests that ran these calibration methods. ### Motivation and Context Trying to quantize a model with the percentile, entropy, or distribution calibration methods raises an exception: ```shell File "/.../site-packages/onnxruntime/quantization/quantize.py", line 691, in quantize quantize_static( File "/.../site-packages/onnxruntime/quantization/quantize.py", line 525, in quantize_static calibrator.collect_data(calibration_data_reader) File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 571, in collect_data self.collector.collect(clean_merged_dict) File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 746, in collect return self.collect_value(name_to_arr) File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 836, in collect_value hist, hist_edges = np.histogram(data_arr, self.num_bins, range=(-threshold, threshold)) File "<__array_function__ internals>", line 180, in histogram File ".../site-packages/numpy/lib/histograms.py", line 793, in histogram bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights) File "/.../site-packages/numpy/lib/histograms.py", line 426, in _get_bin_edges first_edge, last_edge = _get_outer_edges(a, range) File "/.../site-packages/numpy/lib/histograms.py", line 315, in _get_outer_edges raise ValueError( ValueError: supplied range of [nan, nan] is not finite ``` The calibrators create an augmented model with all tensors (including model inputs) set as model outputs. The data for outputs that are also model inputs is corrupted as described in https://github.com/microsoft/onnxruntime/issues/21922. The corrupted data sometimes contains `NaN` values that cause numpy's histogram utilities to raise an exception.	2024-09-09 12:05:41 -07:00
Peishen Yan	2cdc05f189	Move Gelu and LayerNorm fusion to L1 optimization (#21332 ) According to https://github.com/microsoft/onnxruntime/issues/20915, we move the Gelu and LayerNorm fusion to L1 with a condition on the ONNX opset the model imports (LayerNorm requires opset 16+ and Gelu requires opset 20+.) If the opset version doesn't meet the requirements, the fusion is delayed to L2 optimization since the internal contrib op doesn't have a requirement for any specific ONNX opset. --------- Co-authored-by: Scott McKay <Scott.McKay@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-09-09 13:27:52 +10:00
Yi Zhang	de7a02beef	Add parameter for flexdownload (#22009 ) ### Description <!-- Describe your changes. --> ### Motivation and Context Thus, we can run the Nuget_Packaging_GPU stage directly	2024-09-08 14:17:55 +08:00
Wanming Lin	ad9afbb042	[WebNN EP] Remove workaround for CPU op supported list (#21962 ) We assume all WebNN ops are supported across all backends.	2024-09-06 22:14:52 -07:00
Edward Chen	f3725b9f06	Use output variable from InstallAppleProvisioningProfile task to set provisioning profile UUID. (#22018 ) This is more flexible than hardcoding the provisioning profile name or UUID. The name shouldn't usually change but it is not guaranteed to remain constant.	2024-09-06 18:00:34 -07:00
zz002	28b550f091	[VitisAI] Add processing for sessionOptions.AppendExecutionProvider("VitisAI", options) (#21839 ) ### Description <!-- Describe your changes. --> [VitisAI] Add processing for sessionOptions.AppendExecutionProvider("VitisAI", options) ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> --------- Co-authored-by: Zhenze Wang <zhenzew@xilinx.com>	2024-09-06 14:06:33 -07:00
Arne H Juul	493159b481	near-zero negative values must convert to 0 not NAN (#18473 ) for the Float8 types with unsigned zero, we must clear the sign bit when rounding to zero; otherwise we end up with 0x80 which is the encoding for NAN. ### Description Handle all zero and near-zero values the same way, rounding to positive zero. Note that I removed one "if" level but did not re-indent the code in this PR, to make it easier to see what the actual changes are. ### Motivation and Context For the two new 8-bit floating point types Float8E4M3FNUZ and Float8E5M2FNUZ, converting from a near-zero negative value would end up with the sign bit set only; this bit pattern is not negative zero but instead means NAN.	2024-09-06 11:41:48 -07:00
Arne H Juul	605a84ffc9	remove unused and confusing float16 constants (#21999 ) ### Description Remove unused and confusing special constants in MLFloat16 and BFloat16 types. ### Motivation and Context While looking at adding a specialization for std::numeric_limits for the 16-bit floating point types, I found that there are various special constants in those types that are confusing or just wrong. MLFloat16::Epsilon is not an epsilon at all, but approximates "e". Looks like a copy-paste bug. BFloat16::Epsilon does not correspond to `numeric_limits::epsilon()`, nor even to the C# Float.Epsilon. Instead, it corresponds to `numeric_limits::min()` which was really confusing to me. The "MinValue" constant does correspond to the C# `Float.MinValue` constant, but this is C++ so it would be better renamed to "LowestValue" since it corresponds to `numeric_limits::lowest()`. As it was unused except for some unit tests I have replaced it with the equivalent `MaxValue.Negate()` here. There's also an unused `kSignaling_NaNBits` constant which is just wrong (has the same value as `kPositiveInfinityBits` instead of a NaN).	2024-09-05 22:00:48 -07:00
Edward Chen	970ebc2ccf	Fix typo in coreml_supported_mlprogram_ops.md (#22004 ) ### Description <!-- Describe your changes. --> Fix typo: ai:onnx -> ai.onnx ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Typo.	2024-09-06 12:50:56 +10:00
Edward Chen	0c398b3e52	Update Android NDK version to 27.0.12077973. (#21989 ) Upgrade to newer version. r26 will be unsupported soon.	2024-09-05 17:57:24 -07:00
Adrian Lizarraga	b011f6fbf6	[TransposeOptimizer] Support Unsqueeze/Transpose of input consumed by per-axis DQ (#21821 ) ### Description Follow-up to: https://github.com/microsoft/onnxruntime/pull/21793 - Support looking past a per-axis DQ to do in-place Unsqueeze/Transpose of initializers - Support looking past a per-axis DQ to cancel a Transpose or Squeeze. ### Test models For all test models, the transpose optimizer pushes a Transpose through a Mul's input[0]. The Mul's input[1] is optionally unsqueezed and then transposed. ### I. Test in-place unsqueeze and transpose of per-axis quantized weight Original model has input[1] with shape (3,) <details><summary>click to expand model image</summary> <img src="https://github.com/user-attachments/assets/37b6f60c-77d2-4bd3-8ca2-58dc7c88a304" /> </details> Optimized model has input[1] with shape (1, 3, 1, 1). The initializer was unsqueezed and transposed in-place. <details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/adb72757-a164-400c-bfef-2a05f0e35825" /> </details> ### II. Test canceling existing Squeeze before per-axis DQ Original model has input[1] that is squeezed. <details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/f27e6742-b563-42a9-ad06-bb3178b0ceb8" /> </details> Optimized model unsqueezed and transposed input[1]. The original squeeze was removed due to the unsqueeze, leaving only the Transpose. <details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/e56261d4-eba6-4a9f-847b-dcd33548dd07" /> </details> ### III. Test canceling existing Transpose before per-axis DQ Original model has input[1] that is transposed. <details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/f157e04a-572a-479d-8e3b-cf57954df5c0" /> </details> Optimized model transposed input[1], thus canceling the existing transpose. 
<details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/63d742ce-3762-4ab2-bdb0-1b507886da9d" /> </details> ### IV. Test QDQ fix-up of Transpose/Unsqueeze for per-axis quantization Original model has input[1] that can be broadcasted. <details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/96c0092c-22ec-486d-882e-e2cb59ffe324" /> </details> The main transpose optimization loop inserts float32 Unsqueeze and Transpose after the DQ. The qdq fix-up pass inserts new per-axis Q/DQ ops after the inserted nodes. <details><summary>click expand model image</summary> <img src="https://github.com/user-attachments/assets/b6f89c11-974d-4b35-922f-11effdf06883" /> </details> ### Motivation and Context Enables the TransposeOptimizer to support more models with per-axis QDQ nodes. Per-axis quantization can improve model accuracy and is used by EPs like QNN. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-09-05 17:26:17 -07:00
Wanming Lin	23f6604c39	[WebNN EP] Use identity for one input of Max/Min (#21974 ) Now WebNN supports `identity` op, use it for `Max` and `Min` ops with only one input.	2024-09-05 16:47:40 -07:00
Scott McKay	20c802afd4	Add better native nuget package readme (#21889 ) ### Description <!-- Describe your changes. --> Request from Nuget team to add a better readme to the nuget package so it is displayed nicely on nuget.org. Previously we were using the ORT repo readme.md but that a) doesn't display correctly due to limited markdown support on nuget.org, and b) has a lot of irrelevant info like build pipeline status. - Created a generic readme.md that includes the ORT description from the main readme, includes the ORT logo via an acceptable link, and lists the native nuget packages so the file can be included in any of them as-is. - Updated the nuget packaging script to add the `readme` tag and use this file. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Request from MS Nuget team to MS package owners to add.	2024-09-06 08:28:14 +10:00
Tianlei Wu	c7d0ded079	[CUDA] Update Dockerfile.cuda with cuda 12.5.1 and cudnn 9 (#21987 ) ### Description Previous image is based on cuda 12.1 and cudnn 8, which is out of date since we have moved to cudnn 9 since 1.19 release. (1) Upgrade base image to cuda 12.5.1 and cudnn 9. (2) Update CMAKE_CUDA_ARCHITECTURES from 52;60;61;70;75;86 to 61;70;75;80;86;90 to support A100 and H100 (3) Make the build faster: exclude unit test; use ninja etc. (4) upgrade some packages (like packaging etc) before building to avoid build error. ### Motivation and Context https://github.com/microsoft/onnxruntime/issues/21792 https://github.com/microsoft/onnxruntime/issues/21532	2024-09-05 15:25:40 -07:00
0xdr3dd	2dae8aaced	[Fuzzer] Add fuzzer support for linux (#21996 ) ### Description Added some change in fuzzer project code to support linux also. How to test on linux: 1. Make sure you have installed clang/llvm. 2. run below command to build asan instrumented project: ``` CFLAGS="-g -fsanitize=address -shared-libasan -fprofile-instr-generate -fcoverage-mapping" CXXFLAGS="-g -shared-libasan -fsanitize=address -fprofile-instr-generate -fcoverage-mapping" CC=clang CXX=clang++ ./build.sh --update --build --config Debug --compile_no_warning_as_error --build_shared_lib --skip_submodule_sync --skip_tests --use_full_protobuf --parallel --fuzz_testing --build_dir build/ ``` 3. run fuzzer for some time, it will generate .profraw file: ``` LLVM_PROFILE_FILE="%p.profraw" ./build/Debug/onnxruntime_security_fuzz /t /v onnxruntime/test/testdata/bart_tiny.onnx 1 m ``` 4. Get the cov by running below cmd: ``` llvm-profdata merge -sparse .profraw -o default.profdata llvm-cov report ./build/Debug/onnxruntime_security_fuzz -instr-profile=default.profdata ``` <img width="1566" alt="Screenshot 2024-09-05 at 4 25 08 PM" src="https://github.com/user-attachments/assets/2aa0bb83-6634-4d33-b026-3535e97df431"> ### Motivation and Context 1. Currently fuzzer only supports windows and MSVC, we can't generate the code coverage using MSVC. With clang/llvm we can try and use clang instrumentation and llvm tools like llvm-cov. 2. In future we can add coverage guided fuzzer (libfuzzer) in same project. (Working on it)	2024-09-05 11:52:15 -07:00
Yueqing Zhang	f4d62eeb2e	[VitisAI] remove unused header (#21890 ) ### Description <!-- Describe your changes. --> Removed unused headers ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> This would cause compile error on machine that didn't install nlohmann. Co-authored-by: Yueqing Zhang <yueqingz@amd.com>	2024-09-05 08:37:15 -07:00
Javier Martinez	840f896c5f	Uncomment line in OVEP that was commented out in error (#21973 ) ### Description One line change to re-enable a line incorrectly commented out in an earlier commit ### Motivation and Context Fix issue introduced with [PR 21872](https://github.com/microsoft/onnxruntime/pull/21872#discussion_r1736744441)	2024-09-05 08:34:55 -07:00
Scott McKay	8b661f7157	Fix DML packaging CIs (#21997 ) ### Description <!-- Describe your changes. --> The DML CIs build native and C# as well as sign DLLs in the same CI. Some parts of that require .net 8 and some .net 6. Update to use .net 8 in general, and revert to .net 6 for the signing. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Fix packaging pipeline.	2024-09-05 22:30:40 +08:00
Scott McKay	5e24c5d5f8	Fix C# doc generation workflow (#21988 ) ### Description <!-- Describe your changes. --> - Update docfx usage. - The docfx cli is now a dotnet tool. - Split some commands up so it's easier to debug failures - Update to .net8. - Exclude mobile targets from build as the workloads aren't available and it doesn't change the generated documentation. - The mobile specific APIs (e.g. enable CoreML EP) still exist in this case as we check in the implementation if it's valid to use them or not, so the workloads are not required to generate complete API documentation. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Fix doc gen.	2024-09-05 13:54:17 +10:00
Yulong Wang	2e83541eba	fix one build warning in MSVC (#21983 ) ### Description Fix one MSVC warning member not initialized ``` Warning C26495 Variable 'onnxruntime::ITuningContext::allocators_' is uninitialized. Always initialize a member variable (type.6). C:\code\onnxruntime\onnxruntime\core\framework\tuning_context.h 22 ```	2024-09-04 17:51:14 -07:00
Jiajia Qin	3580e01348	[js/webgpu] Optimize grouped conv (#21892 ) ### Description <!-- Describe your changes. --> #21618 This PR optimizes grouped conv by 1) more sequential memory access in gpu 2) reusing input's data to reduce global memory access times. See `Conv\|GroupedConv` op in [Wav2Vec2](https://huggingface.co/facebook/wav2vec2-base-960h) becomes 92 ms from 1058 ms on iGPUs with 32 EU. For the whole model on my iGPUs with 32 EU, wav2vec2 model becomes 982ms from 1942 ms. squeezebert-uncased model becomes 71.86ms from 431.77ms. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-09-04 17:16:35 -07:00
mindest	30f07758a2	Add packaging version constraint. (#21814 ) ### Description Newer `setuptools` requires newer version of `packaging`, due to function update. ### Motivation and Context Fixes #21792	2024-09-04 16:57:04 -07:00
Prathik Rao	ed232dc1ef	Sets enable_windows_arm64ec_qnn to false in training CI (#21981 ) ### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-09-04 16:01:14 -07:00
Scott McKay	44fc7b443c	Update C# test projects (#21631 ) ### Description <!-- Describe your changes. --> Update various test projects to .net8 from EOL frameworks. Replace the Xamarin based Android and iOS test projects with a MAUI based project that uses .net 8. Add new CoreML flags to C# bindings ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Remove usage of EOL frameworks.	2024-09-05 08:21:23 +10:00
Scott McKay	8632e67dc3	Update C# E2E project's test package versions (#21975 ) ### Description <!-- Describe your changes. --> Update C# test package dependencies to match #21913 This csproj isn't included in the main sln and was overlooked. We need the newer xunit version for Assert.Fail which is used in shared unit test source that is included here as well. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Fix CI failure	2024-09-05 07:53:53 +10:00
Jian Chen	09d786fc14	Rename ios_packaging.requirements.txt to ios_packaging/requirements.txt (#21936 ) ### Description Rename ios_packaging.requirements.txt to ios_packaging/requirements.txt ### Motivation and Context By doing this, the packages within ios_packaging/requirements.txt can be scanned by the CG task	2024-09-04 13:18:05 -07:00
Jiajia Qin	a80bfed5b4	[js/webgpu] Optimize transpose (#21964 ) ### Description <!-- Describe your changes. --> Fix bugs in previous implementation and add more situations to go the optimized path. The following situations will go to the optimized path. 1. 2d inputs or squeezed 2d inputs 2. channels last or channels first transpose. For example, channel last transpose: [1, 256, 512, 512] -> [1, 512, 512, 256] For this case, the transpose becomes [256, 512x512] -> [512x512, 256] ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> For SD Turbo demo, the total transpose time becomes 39.98ms from 122.09ms. And the corresponding percentage becomes 3.89% from 11.05% in this demo. This PR will also help #21618, the total transpose time in that demo becomes 17.32 ms from 70.25 ms on my iGPUs.	2024-09-04 12:04:04 -07:00
Hector Li	190588bb64	Enable QNN weight sharing (#21077 ) ### Description Enable QNN weight sharing across graphs in single context Create tool to generate QNN context cache model with weight sharing enabled.	2024-09-04 11:20:33 -07:00
Yueqing Zhang	9031112c8e	[VitisAI] add registered custom op for perf test (#21336 ) ### Description <!-- Describe your changes. --> Register for custom op when testing the performance ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> This is needed for providers to test their implementation	2024-09-04 11:13:35 -07:00
zz002	bf8a8e7e36	[VitisAI] Bug fixes in model_clone (#21950 ) ### Description <!-- Describe your changes. --> VitisAI bug fixes in model clone ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Co-authored-by: Zhenze Wang <zhenzew@xilinx.com>	2024-09-04 10:29:17 -07:00
Edward Chen	cbf3c50d75	Improve stability of Android ReactNative E2E test (#21969 ) - Remove redundant `OnnxruntimeModuleExampleE2ETest CheckOutputComponentExists` test - Attempt to close any Application Not Responding (ANR) dialog prior to running Android test - Add `--take-screenshots failing` option to detox test commands to save screenshots on failure	2024-09-04 08:41:07 -07:00
Chen Feiyue	d4290f6e7f	Update vsinpu ep cross-compiling patch (#21963 ) - Block the bf16 && ummla gemm functions because we cannot support these features yet	2024-09-03 22:54:43 -07:00
Yueqing Zhang	dd2425932d	[VitisAI] Fix model path (#21911 ) ### Description <!-- Describe your changes. --> Change the .data path so it is on the same path as the model path. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> This would fix the issue if a model has .data file, the executable can't read the data if the model is in another directory.	2024-09-03 22:42:01 -07:00
Yulong Wang	decb3852a0	refactor: extract shared util function ComputeBroadcastOutputShape (#21940 ) ### Description This is used in multiple places.	2024-09-03 18:21:36 -07:00
Tianlei Wu	628c0a8f0e	Remove unused find_cudnn_supported_cuda_versions (#21620 ) ### Description The function find_cudnn_supported_cuda_versions is not used anymore. Remove it. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-09-03 14:38:33 -07:00
sfatimar	8dba8e3e24	Memory Optimization for Compilation in OVEP (#21872 ) Calling Split API Calls Read+Model in lieu of unified Compile Model call for export compile flow to ensure memory optimization. Freeing up model proto and serialized string and read model ov ir later to free up memory for the ahead pipeline Optimization during EpCtxt flow All the Graph related operations require all the Node Attributes to be set while dealing with model instances internally with them, in the existing implementation these attributes make a copy when constructing a Graph dynamically during runtime. Propose to use these attributes in place without creating a copy to avoid memory allocation / copy while calling these Graph related functions. Changes to ensure the bug fixes related to openvino version and epctxt file path. Moving Compiler version to C++20 for getting r-value mem optimizations benefit ### Motivation and Context This change is required because memory consumption during the compilation flow is too high. --------- Co-authored-by: saurabhkale17 <saurabh1.kale@intel.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: Vishnudas Thaniel S <vishnudas.thaniel.s@intel.com> Co-authored-by: Javier E. Martinez <javier.e.martinez@intel.com> Co-authored-by: jatinwadhwa921 <110383850+jatinwadhwa921@users.noreply.github.com> Co-authored-by: ankitm3k <ankit.maheshkar@intel.com> Co-authored-by: jatinwadhwa921 <jatin.wadhwa@intel.com>	2024-09-03 13:52:31 -07:00
Yi Zhang	4962252c8f	Enable xnnpack ep works in current windows xnn ci (#21951 ) ### Description The EP wasn't added in session option in onnxruntime_test_all. ### Motivation and Context After this PR onnxruntime_test_all --gtest_filter=\xnnpack\maxpool\* can step into `8c5336449d/onnxruntime/core/providers/xnnpack/nn/max_pool.cc (L209)` --------- Co-authored-by: Yi Zhang <your@email.com>	2024-09-03 10:02:00 -07:00
Chester Liu	5c74539ab7	Fix copying ORT dylib into wheel on macOS (#21931 ) Fix #21223 on macOS --------- Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2024-09-03 11:08:25 +08:00
Yulong Wang	257792225f	revert forceinline for MakeString (#21943 ) ### Description revert forceinline for MakeString. This change reverts https://github.com/microsoft/onnxruntime/pull/21893. The forceinline was introduced for performance considerations, however it turns out to have some notable binary size increase, which is a concern for some binary size sensitive platforms like Android. I made a few tests locally and found it is not related to whether or not have used the template struct `if_char_array_make_ptr_t` trick. So I have to revert this back.	2024-09-02 19:01:08 -07:00
Scott McKay	e788b3d30e	Fix C# warnings. (#21913 ) ### Description <!-- Describe your changes. --> Update some testing dependencies. Fix various warnings. Mainly around documentation (existing) and unit test usage (mainly resulting from xunit update). Invalid angle brackets for generics in documentation were changed to use curly braces based on https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/xmldoc/ > To refer to generic identifiers in code reference (cref) elements, you can use either the escape characters (for example, cref="List<T>") or braces (cref="List{T}"). As a special case, the compiler parses the braces as angle brackets to make the documentation comment less cumbersome to the author when referring to generic identifiers. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-09-03 10:08:29 +10:00
Yulong Wang	bad00a3657	Add dependency dawn into deps.txt (#21910 ) ### Description Add dependency dawn into deps.txt. This is a preparation for introducing WebGPU EP.	2024-09-02 04:24:28 -07:00
Kyle	b1ae43cbcb	Add Files Signature Validation after Signed by ESRP (#21949 ) ### Description <!-- Describe your changes. --> Validate file signatures after signing by ESRP. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> - Add validation after the ESRP process. - Make sure the files matching the target pattern/suffix are signed successfully by ESRP. - If a signature is not valid, the following stages fail.	2024-09-02 17:16:59 +08:00
Yulong Wang	8c5336449d	Stop VSCode appending file associations to settings.json (#21944 ) ### Description If you open onnxruntime source code using VSCode with C/C++ extension, it keeps adding file associations for C/C++ headers into this settings.json. This is annoying when staging/committing changes. Add a configuration to disable this behavior. see: - https://stackoverflow.com/questions/65220185/how-to-stop-vs-code-to-keep-adding-standard-c-libraries-to-the-file-associatio - https://github.com/microsoft/vscode-cpptools/issues/722#issuecomment-480329005	2024-08-31 19:04:12 -07:00
mingyueliuh	047f32c79d	[VitisAI] Remove shape infer from bridge ort (#21331 ) ### Description Vitis AI EP's custom op are completely self contained within Vitis AI EP implementation (rather than needing to add static functions in provider_bridge). --------- Co-authored-by: liumingyue <mingyue@xilinx.com>	2024-08-31 08:57:23 -07:00
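
Note on 493159b481 (near-zero negative values must convert to 0 not NAN): the sketch below illustrates the bit-level rule that commit describes and is not the ONNX Runtime implementation. It assumes the Float8E4M3FNUZ layout as I understand it from the ONNX float8 documentation (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 8, `0x80` meaning NaN). The function names are hypothetical, and the encoder uses a brute-force nearest-value search instead of real rounding logic.

```python
import math

NAN_BITS = 0x80  # in the FNUZ formats, "sign bit only" encodes NaN, not -0


def decode_e4m3fnuz(bits: int) -> float:
    """Decode one Float8E4M3FNUZ byte (assumed layout: 1/4/3 bits, bias 8)."""
    if bits == NAN_BITS:
        return math.nan
    sign = -1.0 if bits & 0x80 else 1.0
    exponent = (bits >> 3) & 0x0F
    mantissa = bits & 0x07
    if exponent == 0:  # zero and subnormal values
        return sign * (mantissa / 8.0) * 2.0 ** -7
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 8)


# Magnitudes of the 128 non-negative codes 0x00..0x7F.
_MAGNITUDES = [decode_e4m3fnuz(b) for b in range(0x80)]


def encode_e4m3fnuz(x: float) -> int:
    """Hypothetical encoder: nearest representable value, saturating at the max."""
    if math.isnan(x) or math.isinf(x):
        return NAN_BITS  # simplification: infinities are mapped to NaN here
    magnitude = min(range(0x80), key=lambda b: abs(_MAGNITUDES[b] - abs(x)))
    if magnitude == 0:
        # The rule from the commit: when the value rounds to zero, clear the
        # sign bit. Keeping it would produce 0x80, which decodes as NaN.
        return 0x00
    return magnitude | (0x80 if x < 0 else 0x00)


tiny_negative = -1e-9
print(hex(encode_e4m3fnuz(tiny_negative)))  # positive zero, not the NaN pattern
```

Without the `magnitude == 0` branch, a tiny negative input would be packed as sign bit plus zero magnitude, which is exactly the NaN pattern in these formats.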


11626 commits
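
Note on a80bfed5b4 ([js/webgpu] Optimize transpose): that commit states that a channels-last transpose such as [1, 256, 512, 512] -> [1, 512, 512, 256] can be treated as a 2D transpose [256, 512x512] -> [512x512, 256]. The sketch below checks that equivalence on a small row-major buffer in plain Python. It is only an illustration of the index arithmetic, not the WebGPU shader, and both helper names are hypothetical.

```python
def transpose_2d(flat, rows, cols):
    """Transpose a row-major [rows, cols] buffer into a row-major [cols, rows] buffer."""
    return [flat[r * cols + c] for c in range(cols) for r in range(rows)]


def nchw_to_nhwc(flat, n, c, h, w):
    """Reference 4D transpose with perm (0, 2, 3, 1) on a row-major buffer."""
    out = []
    for ni in range(n):
        for hi in range(h):
            for wi in range(w):
                for ci in range(c):
                    out.append(flat[((ni * c + ci) * h + hi) * w + wi])
    return out


# Small stand-in for [1, 256, 512, 512]; the batch dimension must be 1.
n, c, h, w = 1, 4, 3, 5
data = list(range(n * c * h * w))

# With N == 1, the 4D permutation is the 2D transpose [C, H*W] -> [H*W, C].
assert nchw_to_nhwc(data, n, c, h, w) == transpose_2d(data, c, h * w)
```

The equivalence holds because H and W stay adjacent and in order, so they can be merged into a single axis of size H*W before transposing.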