onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-29 23:06:41 +00:00

Author	SHA1	Message	Date
Jian Chen	4e75605eec	Replace inline pip install with pip install from requirements.txt (#21106 ) ### Description Replace inline pip install with pip install from requirements.txt ### Motivation and Context so that CG can recognize ### Dependency - [x] https://github.com/microsoft/onnxruntime/pull/21085	2024-07-22 12:39:10 -07:00
Scott McKay	34cd2e8ed8	Add CoreML ML Program Resize (#21370 ) ### Description Add CoreML ML Program Resize - refactor existing logic to try and simplify and share between NeuralNetwork and MLProgram checks - add handling for some new attributes - antialias and axes - should have been done when setting the CoreML EP max opset to 21 ### Motivation and Context Support priority models	2024-07-20 09:35:05 +10:00
Changming Sun	9140d9b1ff	Update azure-kusto-data and azure-kusto-ingest (#21409 ) A vulnerability has been found in the Kusto SDK. We need to update it to latest to address a security alert.	2024-07-18 14:26:26 -07:00
Yifan Li	bb76ead96c	[TensorRT EP] support TensorRT 10.2-GA (#21395 ) ### Description * promote trt version to 10.2.0.19 * EP_Perf CI: clean config of legacy TRT<8.6, promote test env to trt10.2-cu118/cu125 * skip two tests as Float8/BF16 are supported by TRT>10.0 but TRT CIs are not hardware-compatible on these: ``` 1: [ FAILED ] 2 tests, listed below: 1: [ FAILED ] IsInfTest.test_isinf_bfloat16 1: [ FAILED ] IsInfTest.test_Float8E4M3FN ```	2024-07-18 12:11:52 -07:00
kailums	1b38c05544	change ci docker image to rocm6.1 (#21296 ) ### Description There is a bug in kernels running on rocm6.0, so change the CI docker image to rocm6.1. For the torch installed in the docker image, switch to the rocm repo when the version is not 6.0.	2024-07-18 14:50:01 +08:00
vraspar	fa287042ca	Add ML Program support for transpose op (#21364 ) ### Description Add support for transpose op ### Motivation and Context Enable support for Autodesk model	2024-07-16 16:34:58 -07:00
vraspar	218301403d	Add ML Program support for basic activation ops (#21326 ) ### Description Add support for: - Sigmoid - Relu - Tanh ### Motivation and Context Enable support for Autodesk model	2024-07-15 22:30:20 -07:00
George Wu	4005d12ed4	add vitisai ep build stage to Windows CPU Pipeline (#21361 ) To prevent VitisAI EP build breaks, add a stage in the Windows CPU CI Pipeline to build the Vitis AI EP on Windows. There are no external dependencies for builds. Tests have to be disabled, though, as the EP has external SW/HW dependencies. This will at least allow us to prevent build breaks, which have happened on multiple occasions recently. Tested in https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1432346&view=results and it seems to run fine.	2024-07-15 19:34:08 -07:00
Jian Chen	c03e6fff4c	Combining android build and test step into one job (#21340 ) ### Description Combining android build and test step into one job ### Motivation and Context Reduce runtime by removing additional machine allocation, and artifact uploading and downloading. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-07-15 14:44:03 -07:00
Yi Zhang	f2ebd1cd6b	[Fix] Exception in iosDynamicFramework Post-Merge workflow (#21262 ) ### Description The exception was caused by `3dd6fcc089`. I added skip_macos_test because there is a new exception in https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1425579&view=logs&j=c90c5af3-67d5-5936-5a62-71c93ebfca65&t=01038f35-8e78-5801-1aa1-d9647bb65858 ``` 2024-07-05T14:41:09.3864740Z mkdir -p /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest/Contents/Frameworks 2024-07-05T14:41:09.3933430Z mkdir: /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest: Operation not permitted 2024-07-05T14:41:09.3996760Z /var/folders/0f/b0mzpg5d31z074x3z5lzkdxc0000gn/T/tmp97ycvwq5/apple_package_test/Pods/Target Support Files/Pods-macos_package_testUITests/Pods-macos_package_testUITests-frameworks.sh: line 7: realpath: command not found 2024-07-05T14:41:09.4003170Z :18: error: Unexpected failure 2024-07-05T14:41:11.1323470Z error: Sandbox: mkdir(72212) deny(1) file-write-create /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest (in target 'macos_package_testUITests' from project 'apple_package_test') 2024-07-05T14:41:11.1325620Z 2024-07-05T14:41:11.8731110Z 2024-07-05T14:41:11.8733040Z Test session results, code coverage, and logs: 2024-07-05T14:41:11.8734820Z /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Logs/Test/Test-macos_package_test-2024.07.05_14-40-38-+0000.xcresult 2024-07-05T14:41:11.8735530Z 2024-07-05T14:41:11.8906210Z Testing failed: 2024-07-05T14:41:11.8911060Z Sandbox: mkdir(72212) deny(1) file-write-create 
/Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest 2024-07-05T14:41:11.8912570Z Unexpected failure 2024-07-05T14:41:11.8913690Z Testing cancelled because the build failed. 2024-07-05T14:41:11.8914380Z 2024-07-05T14:41:11.8914970Z TEST FAILED 2024-07-05T14:41:11.8915480Z 2024-07-05T14:41:11.8915780Z 2024-07-05T14:41:11.8916750Z The following build commands failed: 2024-07-05T14:41:11.8919280Z PhaseScriptExecution [CP]\ Embed\ Pods\ Frameworks /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Intermediates.noindex/apple_package_test.build/Debug/macos_package_testUITests.build/Script-059136A7770CA5376C30F2FD.sh (in target 'macos_package_testUITests' from project 'apple_package_test') 2024-07-05T14:41:11.8922180Z (1 failure) ``` I also found that the macOS test is skipped in `9ef28f092f/tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml (L119-L127)` as well, so it may be a known issue.	2024-07-12 09:24:12 -07:00
Ted Themistokleous	4ac4cd2668	Migraphx ep windows build (#21284 ) ### Description Repeat of #21084 with removal of policy CMP0144 to suppress warnings which uses CMake 3.27.0. ### Motivation and Context Already approved PR: https://github.com/microsoft/onnxruntime/pull/21084 Removed the added policy from CMake 3.27.0.	2024-07-11 21:21:38 -07:00
Edward Chen	33e7c7f6ec	Enable Android CI build stages to run in parallel. (#21314 ) Enable Android CI build stages to run in parallel to possibly reduce total build time.	2024-07-11 10:09:09 -07:00
Yi Zhang	41ea47be1e	Move QNN nuget package stages out of the big Nuget packaging pipeline. (#21306 ) ### Description 1. remove QNN stages from the big packaging pipeline 2. Add publish nightly package in the current [QNN Nuget pipeline](https://dev.azure.com/aiinfra/Lotus/_builddefinitionId=1234]) ### Motivation and Context Reduce the complexity of the big Nuget packaging pipelines. --------- Co-authored-by: Yi Zhang <your@email.com>	2024-07-11 09:07:23 -07:00
Changming Sun	fe6ef404b5	Enable LTO for Android build (#21243 ) ### Description Enable LTO for Android build, which can reduce binary size by 6%.	2024-07-10 18:44:17 -07:00
Changming Sun	8749fa381e	Update absl (#21300 ) ### Description Our macOS pipelines are failing because of a build error in absl. However, the bug fix we need is not available in the latest ABSL release. Here is the issue: https://github.com/abseil/abseil-cpp/pull/1536 And here is the fix: `779a3565ac` GTest uses ABSL, but this ABSL target also depends on GTest, so it is a circular dependency. We should be able to avoid that by not building tests for ABSL. However, the version we are using has a problem with that: it has a cmake target that still depends on GTest even when testing is disabled. It's strange that we suddenly hit this problem and it only happens on macOS.	2024-07-10 11:14:15 -07:00
Jian Chen	d1c19e79ea	Update OpenVino CI Ubuntu to 22.04 (#21127 ) ### Description [Update OpenVino CI Ubuntu to 22.04](`312fab5b3f`) ### Motivation and Context Ubuntu 22.04 is needed for linux C++20	2024-07-09 09:56:44 -07:00
Baiju Meswani	0bbd061a54	Exclude azure ep from gen_def.cc (#21250 ) Addresses python packaging pipeline failure.	2024-07-04 10:50:27 -07:00
Yi Zhang	30b6e82e7d	Make ROCm packaging stages to a single workflow (#21235 ) ### Description Move the current ROCm packaging stages into a single workflow. This reduces the chance that one failed stage prevents all nightly packages from being generated. ### Motivation and Context Our plan is to reduce the complexity of the current zip-nuget pipeline to improve the stability and performance of nightly package generation. The ROCm packaging stages have no dependencies on other packaging jobs, and they are the most time-consuming route. After this change, the duration of the most used CPU/CUDA/Mobile packaging workflow can be reduced from roughly 3h20m to 2h30m.	2024-07-04 11:07:04 +08:00
cloudhan	f39ee14b46	Add GQA support for ROCm (#21032 )	2024-07-03 14:55:31 +08:00
Baiju Meswani	116398c1a4	onnxruntime shared lib inside python package (#21223 )	2024-07-02 15:37:50 -07:00
Yi Zhang	beb2496748	Templatize publishing nuget package (#21199 ) ### Description This is a prerequisite step for reducing the complexity of the current zip-nuget pipeline. Some packaging tasks can be cut from the most complex nuget pipeline and published easily.	2024-07-02 09:24:19 +08:00
Chen Feiyue	56b36a58ba	Initial PR for VSINPU execution provider (#20903 ) ### Description - This is an initial PR for the VSINPU execution provider ### Motivation and Context - Adds support for VeriSilicon hardware - TIM-VX (Tensor Interface Module) (https://github.com/VeriSilicon/TIM-VX) is an integrated software solution by VeriSilicon for our hardware (A311D/i.MX 8M Plus etc.) design; the VSINPU execution provider makes it easy to use VeriSilicon’s hardware by connecting onnxruntime to the TIM-VX API.	2024-06-28 21:48:34 -07:00
Jian Chen	9007ede102	Update upstream packaging pipeline name to make it more meaningful. (#21154 ) ### Description Update upstream packaging pipeline name to make it more meaningful. ### Motivation and Context The upstream pipeline used to build only Nuget packages, but now it also builds Zip and Java. Changing the name makes it more meaningful.	2024-06-28 21:40:09 -07:00
Jian Chen	0cbe7eec5e	Update nuget to Use Nuget 6.10.x (#21209 ) ### Description Update nuget to Use Nuget 6.10.x	2024-06-28 19:49:54 -07:00
Preetha Veeramalai	6baaaf5165	OVEP options to disable CPU fallback at compile time (#21166 ) ### Description Provide user level options to control the fallback on CPU for models not supported on Intel's NPU hardware. ### Motivation and Context - Current workflow of OVEP allows safe fallback from OV NPU to OV CPU on compilation failures. Also supports MLAS CPU fallback in presence of unsupported custom ops. - The PR provides a build-time option to disable fallback from OV NPU to OV CPU. - The session Option "kOrtSessionOptionsDisableCPUEPFallback" disables OV CPU and MLAS CPU fallback. - Also has bug fix for proto creation. --------- Co-authored-by: jatinwadhwa921 <jatin.wadhwa@intel.com> Co-authored-by: ankitm3k <ankit.maheshkar@intel.com>	2024-06-28 08:31:02 -07:00
Yi Zhang	587e92c279	Add FP32 and INT4 test in Llama2 (#21187 )	2024-06-28 06:18:26 +08:00
Changming Sun	d1ab94c2b0	Add compatibility for NumPy 2.0 (#21085 ) ### Description As suggested by SciPy's doc, we will `Build against NumPy 2.0.0, then it will work for all NumPy versions with the same major version number (NumPy does maintain backwards ABI compatibility), and as far back as NumPy 1.19 series at the time of writing` I think it works because in [numpyconfig.h#L64](https://github.com/numpy/numpy/blob/main/numpy/_core/include/numpy/numpyconfig.h#L64) there is a macro NPY_FEATURE_VERSION. By default it is set to NPY_1_19_API_VERSION. And the NPY_FEATURE_VERSION macro controls ABI. This PR only upgrades the build-time dependency; when a user installs ONNX Runtime, they can still use numpy 1.x. ### Motivation and Context Recently numpy published a new version, 2.0.0, which is incompatible with the latest ONNX Runtime release.	2024-06-27 13:50:53 -07:00
PeixuanZuo	446aa986a1	[ROCm] Extend the Pipeline restriction time (#21158 ) ROCm EP builds are taking longer.	2024-06-27 15:36:04 +08:00
Jian Chen	f81c0ec32a	Remove warning suppression from Java Packaging pipeline. (#21010 ) ### Description Remove warning suppression from Java Packaging pipeline. ### Motivation and Context We want the CI step not to produce warnings.	2024-06-24 16:46:21 -07:00
aciddelgado	ebd0368bb0	Make Flash Attention work on Windows (#21015 ) ### Description Previously, Flash Attention only worked on Linux systems. This PR will make it work and enable it to be built and run on Windows. Limitations of Flash Attention in Windows: Requires CUDA 12. ### Motivation and Context This will significantly increase the performance of Windows-based LLM's with hardware sm>=80. To illustrate the improvement of Flash Attention over Memory Efficient Attention, here are some average benchmark numbers for the GQA operator, run with configurations based on several recent models (Llama, Mixtral, Phi-3). The benchmarks were obtained on RTX4090 GPU using the test script located at (onnxruntime/test/python/transformers/benchmark_gqa_windows.py). * Clarifying Note: These benchmarks are just for the GQA operator, not the entire model. ### Memory Efficient Attention Kernel Benchmarks: \| Model Name \| Max Sequence Length \| Inference Interval (ms) \| Throughput (samples/second) \| \|----------------------------------------\|---------------------\|-------------------------\|-----------------------------\| \| Llama3-8B (Average Prompt) \| 8192 \| 0.19790525 \| 13105.63425 \| \| Llama3-8B (Average Token) \| 8192 \| 0.207775538 \| 12025.10172 \| \| Llama3-70B (Average Prompt) \| 8192 \| 0.216049167 \| 11563.31185 \| \| Llama3-70B (Average Token) \| 8192 \| 0.209730731 \| 12284.38149 \| \| Mixtral-8x22B-v0.1 (Average Prompt) \| 32768 \| 0.371928785 \| 7031.440056 \| \| Mixtral-8x22B-v0.1 (Average Token) \| 32768 \| 0.2996659 \| 7607.947159 \| \| Phi-3-mini-128k (Average Prompt) \| 131072 \| 0.183195867 \| 15542.0852 \| \| Phi-3-mini-128k (Average Token) \| 131072 \| 0.198215688 \| 12874.53494 \| \| Phi-3-small-128k (Average Prompt) \| 65536 \| 2.9884929 \| 2332.584142 \| \| Phi-3-small-128k (Average Token) \| 65536 \| 0.845072406 \| 2877.85822 \| \| Phi-3-medium-128K (Average Prompt) \| 32768 \| 0.324974429 \| 8094.909517 \| \| Phi-3-medium-128K (Average Token) \| 
32768 \| 0.263662567 \| 8978.463687 \| ### Flash Attention Kernel Benchmarks: \| Model Name \| Max Sequence Length \| Inference Interval (ms) \| Throughput (samples/second) \| \|--------------------------------------\|---------------------\|-------------------------\|-----------------------------\| \| Llama3-8B (Average Prompt) \| 8192 \| 0.163566292 \| 16213.69057 \| \| Llama3-8B (Average Token) \| 8192 \| 0.161643692 \| 16196.14715 \| \| Llama3-70B (Average Prompt) \| 8192 \| 0.160510375 \| 17448.67753 \| \| Llama3-70B (Average Token) \| 8192 \| 0.169427308 \| 14702.62043 \| \| Mixtral-8x22B-v0.1 (Average Prompt) \| 32768 \| 0.164121964 \| 15618.51301 \| \| Mixtral-8x22B-v0.1 (Average Token) \| 32768 \| 0.1715865 \| 14524.32273 \| \| Phi-3-mini-128k (Average Prompt) \| 131072 \| 0.167527167 \| 14576.725 \| \| Phi-3-mini-128k (Average Token) \| 131072 \| 0.175940594 \| 15762.051 \| \| Phi-3-small-128k (Average Prompt) \| 65536 \| 0.162719733 \| 17824.494 \| \| Phi-3-small-128k (Average Token) \| 65536 \| 0.14977525 \| 16749.19858 \| \| Phi-3-medium-128K (Average Prompt) \| 32768 \| 0.156490786 \| 17679.2513 \| \| Phi-3-medium-128K (Average Token) \| 32768 \| 0.165333833 \| 14932.26079 \| Flash Attention is consistently faster for every configuration we benchmarked, with improvements in our trials ranging from ~20% to ~650%. In addition to these improvements in performance, Flash Attention has better memory usage. For example, Memory Efficient Attention cannot handle a max sequence length higher than 32,768, but Flash Attention can handle max sequence lengths at least as high as 131,072. --------- Co-authored-by: Tianlei Wu <tlwu@microsoft.com>	2024-06-24 09:43:49 -07:00
Yi Zhang	5b5ce0bfb0	Add UsePython Task in Nuget Publish workflow (#21144 ) ### Description Otherwise it would fail in `b95982e588/tools/ci_build/github/azure-pipelines/publish-nuget.yml (L78-L81)` ### Motivation and Context The Windows CPU image is migrated to managed image ### Verification Link https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1313	2024-06-24 13:36:13 +08:00
Changming Sun	f5625b8858	Revert "[MIGraphX EP] enable compilation and execution on Windows (21084)" (#21132 ) ### Description This reverts commit `1d7bf56947` because it broke the AMD GPU CI pipeline. Sorry, when I reviewed the PR I forgot to run the AMD GPU CI pipeline. I will revert the PR first and then ask the author to fix the issue.	2024-06-21 01:01:07 -07:00
Yi Zhang	69d522f4e9	[Fix] use cmdline in Final Jar Testing Stage for new managed Windows Image (#21130 ) ### Description There is no bash command in the managed Windows image. Use a CmdLine step instead. ### Verified Link https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=491902&view=logs&j=f1f8e11e-a9fa-53e5-cd29-3ba2c1988550	2024-06-21 12:41:06 +08:00
Ted Themistokleous	1d7bf56947	[MIGraphX EP] enable compilation and execution on Windows (#36 ) (#21084 )	2024-06-20 16:21:11 -07:00
Changming Sun	efcaa835b1	Update generate_nuspec_for_native_nuget.py for training (#21112 ) ### Description Similar to #21096 , but this one is for ORT training nuget package.	2024-06-20 16:13:31 -07:00
Changming Sun	bd3a9ee99d	Add UsePythonVersion (#21109 ) ### Description The machine has multiple python installations and none of them is in PATH. Therefore we should explicitly set python version via this task to avoid having surprises. ### Motivation and Context Similar to #21095	2024-06-19 20:47:21 -07:00
Changming Sun	27f3ac78d4	Delete RoslynAnalyzers (#21104 ) ### Description Delete RoslynAnalyzers. Use CodeQL instead. ### Motivation and Context We already have CodeQL, which is modern and also covers C# code. The RoslynAnalyzers one is not in our pull request pipelines. The "RoslynAnalyzers@2" task is outdated and needs to be upgraded. I will delete it for now since we already have CodeQL.	2024-06-19 20:11:15 -07:00
Changming Sun	be423747b1	Delete pyop (#21094 ) ### Description Remove the "--enable_language_interop_ops" build flag, because the code is incompatible with the latest numpy, and the build flag is not used anywhere except a macOS CI pipeline. It does not seem to have a ship plan. ### Motivation and Context The build error was: ``` onnxruntime/core/language_interop_ops/pyop/pyop.cc:122:85: error: no member named 'elsize' in '_PyArray_Descr' static_cast<int64_t>(PyArray_DescrFromType(type)->elsize), ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ```	2024-06-19 16:21:33 -07:00
Clément Péron	8ab8e649a7	tools: build: fix typo (#21052 ) ### Description Typo in the python build script	2024-06-19 16:14:58 -07:00
Adrian Lizarraga	3ae5df1d18	[QNN EP] Update QNN SDK to 2.23.0 (#21008 ) ### Description - Updates CI pipelines to use QNN SDK 2.23.0 by default. - QNN SDK adds support for int64 Cast. This allows QNN EP to support ONNX ArgMax/ArgMin/TopK operators that generate an int64 graph output. Example translation of ArgMax: - ONNX: input --> ArgMax --> output (int64) - QNN: input --> ArgMax --> Cast (int32 to int64) --> output (int64) ### Motivation and Context Update onnxruntime to use the latest QNN SDK.	2024-06-19 12:37:42 -07:00
Jian Chen	6a0d64e65c	Component Gov round 7 (#21051 ) ### Description ignoreDirectories does not recursively include sub folders like we thought it would. We need to add additional sub folders. ### Motivation and Context Fix CG : 1. https://aiinfra.visualstudio.com/Lotus/_componentGovernance/218239/alert/11474679?typeId=25427568 2. https://aiinfra.visualstudio.com/Lotus/_componentGovernance/218239/alert/11475140?typeId=25421034&pipelinesTrackingFilter=0	2024-06-19 11:07:02 -07:00
Yi Zhang	cc3168bcbb	Add UsePython task in Nuget_Packaging_CPU stage (#21095 ) ### Description supplement of https://github.com/microsoft/onnxruntime/pull/21062 ### Motivation and Context	2024-06-19 21:09:37 +08:00
Scott McKay	5fc60f36f2	Update to the net8 MAUI targets. Remove Xamarin. (#21062 ) ### Description Xamarin is EOL so remove support. The MAUI targets are EOL and need updating. https://dotnet.microsoft.com/en-us/platform/support/policy/maui Other cleanups: - netcoreapp3.1 is EOL - the net6 macos target was added in the mistaken belief that was for MAUI mac support, but that is actually via the mac-catalyst target which we recently added support for. - some CIs that were using the old build setup of splitting pre-net6 targets. The ORT C# bindings csproj was updated last year and the `PreNet6` and `SelectedTargets` properties no longer exist as they were replaced by the simpler `IncludeMobileTargets` property. ### Motivation and Context Remove EOL components. #21058	2024-06-19 16:20:58 +10:00
Jian Chen	1ad2c0a4b2	fix Window_CI in Github Action (#21070 ) ### Description fix Window_CI in Github Action	2024-06-18 23:14:08 -07:00
cloudhan	ddd4ce3cb7	[ROCm] Update ck to use ck_tile (#21030 )	2024-06-19 14:06:10 +08:00
Changming Sun	ffb8e8eb0e	Update build.py: add a comment (#20993 ) ### Description Update build.py: add a comment ### Motivation and Context See the comment.	2024-06-18 13:52:34 -07:00
Yi Zhang	809cb26ace	Use A100 for LLama2 model test (#21068 )	2024-06-18 11:04:02 +08:00
Changming Sun	9ef4f1b789	Update pybind11 (#21072 ) ### Description Upgrade pybind11 to the latest as suggested by @gnought in #21063 ### Motivation and Context Recently numpy released a new version, which caused a compatibility issue between the latest numpy version and the latest ONNX Runtime version.	2024-06-17 19:50:57 -07:00
Scott McKay	159fe9d4f3	Update to mobile model usability checker (#19843 ) ### Description - Add check for CoreML MLProgram supported ops - Only check usability with ORT Mobile package if requested - this package will be deprecated so info is a) of minimal value and b) can be confusing. - Output more things at INFO level - a lot of meaningful info was only output at DEBUG level. The default INFO level is more useful - dump full partition info at DEBUG level - Check subgraphs fully - CoreML can handle a subgraph - TBD if we want to add support for adding a subgraph to the parent graph for Loop and If nodes - most likely will be required for simple If nodes to be performant - Check 5D CoreML limitation ### Motivation and Context Improve helper tools --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-06-18 07:50:33 +10:00
Nikolai Svakhin	7b3fff650a	Updated build script for CUDA case (#20987 ) ### Description In the CUDA case, use the cuda_home variable to set CMake's CUDA compiler to the correct version of NVCC. Otherwise, an NVCC from the current PATH would be picked up, which could be from a different version of CUDA. ### Motivation and Context I had CUDA 11.8 installed as my main CUDA version and wanted to build against 12.5, so I downloaded and unpacked it into a separate directory and passed it as the `--cuda-home` parameter. However, the ONNX builder was still picking the NVCC compiler from 11.8. This would fix the issue https://github.com/microsoft/onnxruntime/issues/20928 cc @gedoensmax	2024-06-17 14:41:43 -07:00
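
The last entry in the list above (#20987) describes pointing CMake at the NVCC that lives under the requested CUDA home instead of whichever `nvcc` is first on PATH. The following is a minimal sketch of that idea, not the actual change to the build script: the helper names `nvcc_path` and `cuda_cmake_args` are hypothetical, and the `bin/nvcc` layout is an assumption about a typical CUDA toolkit install. `CMAKE_CUDA_COMPILER` is the standard CMake variable for selecting the CUDA compiler.

```python
import os
import sys


def nvcc_path(cuda_home: str) -> str:
    """Return the nvcc expected under cuda_home (assumes the usual bin/ layout)."""
    exe = "nvcc.exe" if sys.platform == "win32" else "nvcc"
    return os.path.join(cuda_home, "bin", exe)


def cuda_cmake_args(cuda_home: str) -> list:
    """Build the CMake argument that pins the CUDA compiler explicitly.

    Without this, CMake may pick up an nvcc from PATH that belongs to a
    different CUDA version than the one requested.
    """
    return ["-DCMAKE_CUDA_COMPILER=" + nvcc_path(cuda_home)]


if __name__ == "__main__":
    # Hypothetical install location, used only for illustration.
    print(cuda_cmake_args("/opt/cuda-12.5"))
```

Passing the compiler explicitly keeps the build independent of PATH ordering, which is the failure mode the commit message describes (an 11.8 `nvcc` being used for a 12.5 build).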


2408 commits