onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-15 20:50:42 +00:00

Author	SHA1	Message	Date
Scott McKay	bcc01ac123	Updates to apple packaging (#21611) ### Description Add ability to test packaging without rebuilding every time. Add ability to comment out some platforms/architectures without the scripts to assemble the c/obj-c packages breaking. Update a couple of commands to preserve symlinks. ### Motivation and Context Make debugging packaging issues faster. Creates correct package for mac-catalyst and doesn't require setting symlinks via bash script.	2024-08-06 08:50:56 +10:00
vraspar	88c811b638	Restructure MacOS framework package to fix malformed Framework errors (#21536) ### Description Refactor framework directory structure for MacOS packages. ### Motivation and Context Apple started enforcing a specific [framework structure](https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/FrameworkAnatomy.html) for MacOS packages. We need to change how we package for MacOS to follow the guidelines. Fixes the following issue: [Malformed Framework](https://github.com/microsoft/onnxruntime-swift-package-manager/issues/19)	2024-08-04 12:47:16 -07:00
Julius Tischbein	1391354265	Adding CUDNN Frontend and use for CUDA NN Convolution (#19470) ### Description Added CUDNN Frontend and used it for NHWC convolutions, with optional activation fusion. #### Backward compatible - Existing models that contain FusedConv can still run. - If ORT is built with cuDNN 8, the cuDNN frontend will not be built into the binary. Old kernels (using cudnn backend APIs) are used. #### Major Changes - For cuDNN 9, we will enable the cudnn frontend to fuse convolution and bias when the provider option `fuse_conv_bias=1` is set. - Remove the fusion of FusedConv from the graph transformer for the CUDA provider, so FusedConv will no longer be added to the graph for the CUDA EP in the future. - Update cmake files regarding cudnn settings. The search order for the CUDNN installation during a build is as follows: * environment variable `CUDNN_PATH` * `onnxruntime_CUDNN_HOME` cmake extra defines. If a build starts from build.py/build.sh, the user can pass it through the `--cudnn_home` parameter, or through the environment variable `CUDNN_HOME` if `--cudnn_home` is not used. * cudnn python package installation directory like python3.xx/site-packages/nvidia/cudnn * CUDA installation path #### Potential Issues - If ORT is built with cuDNN 8, FusedConv fusion is no longer done automatically, so some models might have a performance regression. If users still want the FusedConv operator for performance reasons, they have multiple ways to work around this: use an older version of onnxruntime, or use an older version of ORT to save the optimized onnx model and then run it with the latest version of ORT. We believe that the majority of users will have moved to cudnn 9 by the 1.20 release (since cudnn 9 will have been the default in ORT and PyTorch for 3 months by the 1.20 release), so the impact is small. - cuDNN graph uses TF32 by default, and users cannot disable TF32 through the use_tf32 cuda provider option. If a user encounters an accuracy issue (like in testing), they have to set the environment variable `NVIDIA_TF32_OVERRIDE=0` to disable TF32. The documentation of use_tf32 needs to be updated later. #### Follow ups This is one of the PRs that aim to enable NHWC convolution in the CUDA EP by default if the device supports it. Other changes will follow to make that possible. (1) Enable `prefer_nhwc` by default for devices with sm >= 70. (2) Change `fuse_conv_bias=1` to be the default after more testing. (3) Add other NHWC operators (like Resize or UpSample). ### Motivation and Context The new CUDNN Frontend library provides the functionality to fuse operations and provides new heuristics for kernel selection. Here it fuses the convolution with the pointwise bias operation. On the [NVIDIA ResNet50](https://pytorch.org/hub/nvidia_deeplearningexamples_resnet50/) we get a performance boost from 49.1144 ms to 42.4643 ms per inference on a 2560x1440 input (`onnxruntime_perf_test -e cuda -I -q -r 100-d 1 -i 'prefer_nhwc\|1' resnet50.onnx`). --------- Co-authored-by: Tianlei Wu <tlwu@microsoft.com> Co-authored-by: Maximilian Mueller <maximilianm@nvidia.com>	2024-08-02 15:16:42 -07:00
dependabot[bot]	3b73ef2bf7	Bump torch from 1.13.1 to 2.2.0 in /tools/ci_build/github/windows/eager (#21505) Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. Release notes, sourced from [torch's releases](https://github.com/pytorch/pytorch/releases): PyTorch 2.2: FlashAttention-v2, AOTInductor. We are excited to announce the release of PyTorch® 2.2! PyTorch 2.2 offers ~2x performance improvements to `scaled_dot_product_attention` via FlashAttention-v2 integration, as well as AOTInductor, a new ahead-of-time compilation and deployment tool built for non-python server-side deployments. This release also includes improved torch.compile support for Optimizers, a number of new inductor optimizations, and a new logging mechanism called TORCH_LOGS. **Please note that we are [deprecating macOS x86 support](https://redirect.github.com/pytorch/pytorch/issues/114602), and PyTorch 2.2.x will be the last version that supports macOS x64.** Along with 2.2, we are also releasing a series of updates to the PyTorch domain libraries. More details can be found in the library updates blog. This release is composed of 3,628 commits and 521 contributors since PyTorch 2.1. We want to sincerely thank our dedicated community for your contributions. As always, we encourage you to try these out and report any issues as we improve 2.2. More information about how to get started with the PyTorch 2-series can be found at our [Getting Started](https://pytorch.org/get-started/pytorch-2.0/) page. Summary: - `scaled_dot_product_attention` (SDPA) now supports FlashAttention-2, yielding around 2x speedups compared to previous versions. - PyTorch 2.2 introduces a new ahead-of-time extension of TorchInductor called AOTInductor, designed to compile and deploy PyTorch programs for non-python server-side. - `torch.distributed` supports a new abstraction for initializing and representing ProcessGroups called device_mesh. - PyTorch 2.2 ships a standardized, configurable logging mechanism called TORCH_LOGS. - A number of torch.compile improvements are included in PyTorch 2.2, including improved support for compiling Optimizers and improved TorchInductor fusion and layout optimizations. - Please note that we are deprecating macOS x86 support, and PyTorch 2.2.x will be the last version that supports macOS x64. - `torch.ao.quantization` now offers a prototype `torch.export` based flow ... (truncated) Commits: - `8ac9b20` Run docker release build on final tag (#117131) (#117182) - `2490352` Fix cuInit test on Windows (#117095) - `3a44bb7` [CI] Test that cuInit is not called during import (#117043) - `1c8ba38` [CI] Use jemalloc for CUDA builds (#116900) (#116988) - `96d2ddb` Store user model to simplify ONNXProgram.{adapt_torch_*,__call__} APIs (#1152... - `738b4a5` Update ONNX's IO Adapter to support FakeTensor with ExportedProgram (#114407)... - `4cf10bf` [Cherry-pick] [Quant] [PT2] Enable batchnorm in _move_exported_model_to_eval ... - `7e97e4b` [AARCH64] Fall back to GEMM if mkldnn_matmul fails (#115936) (#116666) - `1a3e3c7` [CUDA] baddmm should fall back to addmm for batch=1 (#114992) (#116518) - `ab7505f` Fix broken PyYAML 6.0 on MacOS x86 (#115956) (#116551) - Additional commits viewable in [compare view](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-08-01 04:28:43 -07:00
vraspar	07d3be5b0e	CoreML: Add ML Program Split Op (#21456 ) ### Description Add support for Split Op ### Motivation and Context Address operator gaps in high priority model. --------- Co-authored-by: Scott McKay <skottmckay@gmail.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-07-30 14:04:47 +10:00
Yifan Li	5d78b9a17b	[TensorRT EP] Update TRT OSS Parser to 10.2 (#21552) ### Description Update TRT OSS Parser to [latest 10.2-GA branch](`f161f95883`)	2024-07-29 17:27:38 -07:00
Jian Chen	79537d0523	Remove tools/ci_build/github/android/run_nnapi_code_coverage.sh (#21371 ) ### Description Remove tools/ci_build/github/android/run_nnapi_code_coverage.sh ### Motivation and Context This file is no longer needed	2024-07-29 10:00:52 -07:00
Jian Chen	bc3713206d	Update QNN pipeline pool (#21482) ### Description Update QNN pipeline pool ### Motivation and Context Make all our pipelines use the latest NDK version	2024-07-29 10:00:21 -07:00
Yi Zhang	05cef469e8	Move on-device training packages publish step (#21539) ### Description Since the on-device training CPU packaging has been split into a separate pipeline, its NuGet package publishing step must be moved as well. ### Motivation and Context Fixes the exception in the Nuget Publishing Packaging Pipeline caused by #21485	2024-07-29 09:59:46 -07:00
Jian Chen	7e23212de9	Delete tools/ci_build/github/azure-pipelines/win-gpu-ci-pipeline.yml (#21529) ### Description Delete tools/ci_build/github/azure-pipelines/win-gpu-ci-pipeline.yml ### Motivation and Context This CI pipeline has been divided into 4 different pipelines.	2024-07-27 15:58:12 -07:00
maggie1059	10b4a3b90b	Fix conda failure for onnxruntime-directml (#21526 ) The change in #21005 works for directly building wheels with `build.py`, but ort-nightly-directml wheels, as well as the 1.18.1 release of the onnxruntime-directml python wheel, still do not work with conda since they're built from the `py-win-gpu.yml` pipeline, which uses `install_third_party_deps.ps1` to set compile flags.	2024-07-26 22:26:38 -07:00
Scott McKay	5af423c7c0	Set version and other info in the C# dll (#21517) ### Description Set version and other info in the Microsoft.ML.OnnxRuntime C# dll by setting GenerateAssemblyInfo to true and passing in ORT version in the CI. Minor re-org of the order of properties so related things are grouped a little better. ### Motivation and Context #21475	2024-07-27 13:22:57 +10:00
Jian Chen	7db7c4e5c8	Separating all GPU stages into different Pipelines (#21521 ) ### Description Separating all GPU stages into different Pipelines	2024-07-26 14:54:45 -07:00
Prathik Rao	278f0f5cd2	disables qnn in ort training cpu pipeline (#21510) ### Description `enable_windows_arm64_qnn` and `enable_windows_x64_qnn` are true by default but unnecessary for training. This change explicitly sets these parameters to false for training pipeline. ### Motivation and Context ORT 1.19 Release Preparation	2024-07-26 17:23:35 +08:00
Scott McKay	b0e1f7f798	CoreML: Aggregated changes to add all required ops for priority model (#21472) ### Description Add these changes to one PR to simplify checkin - Add Concat (#21423) - Add DepthToSpace (#21426) - Add LeakyRelu (#21453) - Add test scripts (#21427) - Add ability to set coreml flags from python (#21434) Other changes - updated partitioning utils to support dropping constant initializers from a ComputeCapability's inputs. - noticed that the list of inputs to the coreml model was unexpectedly long due to this - we copy constant initializers to a CoreML model so don't need the originals, and if they remain as inputs ORT can't free them as they appear to be in use.	2024-07-26 08:29:33 +10:00
Scott McKay	3cdf4b917b	Fix Android CI Pipeline code coverage failure (#21504) ### Description Current failure is due to a version mismatch. Use llvm-cov from the Android NDK instead of the system gcov so that the version is correct. Also comment out publishing to the Azure dashboard to simplify the setup. The CI prints out the stats for review by developers. ### Motivation and Context Fix CI pipeline	2024-07-26 07:36:23 +10:00
Changming Sun	4167b68abf	Split ondevice training cpu packaging pipeline to a separated pipeline (#21485) ### Description Right now our "Zip-Nuget-Java-Nodejs Packaging Pipeline" is too big. The OnDevice training part is independent of the others, so it can be split out. Then our NPM Packaging pipeline will no longer depend on the training stuff. ### Motivation and Context Similar to #21235. This PR also fixes a problem: the "NuGet_Test_Linux_Training_CPU" job downloads artifacts from "onnxruntime-linux-x64" to get the customop shared libs, but the job did not declare that it depends on "Linux_C_API_Packaging_CPU_x64", which produces the artifact. Such problems can be hard to find when a pipeline grows large.	2024-07-25 10:58:34 -07:00
Yifan Li	ebcb7075eb	Set CUDA12 as default in GPU packages (#21438 ) ### Description * Swap cuda version 11.8/12.2 in GPU CIs * Set CUDA12 as default version in yamls of publishing nuget/python/java GPU packages * Suppress warnings as errors of flash_api.cc during ort win-build	2024-07-25 10:17:16 -07:00
Adrian Lizarraga	eb9b377306	[QNN EP] Update to QNN SDK 2.24.0 (#21463) ### Description - Update pipelines to use QNN SDK 2.24 by default - Update QNN_Nuget_Windows pipeline to build csharp solution without mobile projects (fixes errors). - Implement workaround for QNN 2.24 validation bug for LayerNorm ops without an explicit bias input. - Enable Relu unit test, which now passes due to the fact Relu is no longer fused into QuantizeLinear for QNN EP. - Fix bug where a negative quantization axis is not properly normalized for per-channel int4 conv. ### Motivation and Context Update QNN SDK.	2024-07-24 10:17:12 -07:00
Changming Sun	b04adcc381	Update copy_strip_binary.sh: use "make install" instead (#21464) ### Description Before this change, copy_strip_binary.sh manually copied each file from onnx runtime's build folder to an artifact folder. That is hard to get right when dealing with symbolic links for shared libraries. This PR changes the packaging pipelines to run "make install" first, before packaging shared libs. ### Motivation and Context Recently, because of feature request #21281, we changed libonnxruntime.so's SONAME. Now every package that contains this shared library must also contain libonnxruntime.so.1. Therefore we need to change the packaging scripts to include this file. Instead of manually constructing the symlink layout, using `make install` is much easier and will make things more consistent because it is a standard way of making packages. Breaking change: After this change, our inference tarballs that are published to our Github release pages will not contain ORT training headers.	2024-07-24 10:02:00 -07:00
Scott McKay	2580d935cb	CoreML: Add ML Program ConvTranspose (#21416) ### Description Add ML Program ConvTranspose - some limitations to simplify the implementation for now - some limitations due to flaky CoreML output Added support for non-contiguous MLMultiArray output as we see that with some unit tests when the CPU-only flag is not set (e.g. innermost dim has min size of 16 but test output only has 8 values). - support only one non-contiguous dim to keep it simple - manually tested as we don't have a setup that can test objective-c code - test code is in model.mm and can be enabled via ifdef if we need to validate any future changes ### Motivation and Context Address operator gaps in high priority model. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-07-24 16:08:20 +10:00
Scott McKay	1df9aa2f08	CoreML: Add GridSample ML Program support (#21431) ### Description Add GridSample ML Program support. One combination of inputs has diffs between the pytorch generated unit test data and CoreML. Disabling until needed as investigation may take a while. ### Motivation and Context High priority models	2024-07-24 11:04:48 +10:00
George Wu	c65afcea55	fix python qnn pipelines issues (#21462 ) build_py_params wasn't plumbed through for python qnn pipelines. incorporate fixes for deprecated numpy version option from https://github.com/microsoft/onnxruntime/pull/21459	2024-07-23 15:54:44 -07:00
Changming Sun	f70215d4e6	Update C++ dependencies (#21410 ) 1. Update google benchmark from 1.8.3 to 1.8.5 2. Update google test from commit in main branch to tag 1.15.0 3. Update pybind11 from 2.12.0 to 2.13.1 4. Update pytorch cpuinfo to include the support for Arm Neoverse V2, Cortex X4, A720 and A520. 5. Update re2 from 2024-05-01 to 2024-07-02 6. Update cmake to 3.30.1 7. Update Linux docker images 8. Fix a warning in test/perftest/ort_test_session.cc:826:37: error: implicit conversion loses integer precision: 'streamoff' (aka 'long long') to 'const std::streamsize' (aka 'const long') [-Werror,-Wshorten-64-to-32]	2024-07-23 10:00:36 -07:00
Scott McKay	0f1f3b7705	CoreML: ML Program Slice (#21433) ### Description Add support for Slice ### Motivation and Context High priority models.	2024-07-23 20:21:55 +10:00
mindest	5b9369e93c	Fix typos according to reviewdog report. (#21335 ) ### Description Fix typos based on reviewdog report but with some exceptions/corrections.	2024-07-22 13:37:32 -07:00
Jian Chen	4e75605eec	Replace inline pip install with pip install from requirements.txt (#21106 ) ### Description Replace inline pip install with pip install from requirements.txt ### Motivation and Context so that CG can recognize ### Dependency - [x] https://github.com/microsoft/onnxruntime/pull/21085	2024-07-22 12:39:10 -07:00
Scott McKay	34cd2e8ed8	Add CoreML ML Program Resize (#21370) ### Description Add CoreML ML Program Resize - refactor existing logic to try and simplify and share between NeuralNetwork and MLProgram checks - add handling for some new attributes - antialias and axes - should have been done when setting the CoreML EP max opset to 21 ### Motivation and Context Support priority models	2024-07-20 09:35:05 +10:00
Changming Sun	9140d9b1ff	Update azure-kusto-data and azure-kusto-ingest (#21409 ) A vulnerability has been found in the Kusto SDK. We need to update it to latest to address a security alert.	2024-07-18 14:26:26 -07:00
Yifan Li	bb76ead96c	[TensorRT EP] support TensorRT 10.2-GA (#21395) ### Description * promote trt version to 10.2.0.19 * EP_Perf CI: clean config of legacy TRT<8.6, promote test env to trt10.2-cu118/cu125 * skip two tests as Float8/BF16 are supported by TRT>10.0 but TRT CIs are not hardware-compatible on these: ``` 1: [ FAILED ] 2 tests, listed below: 1: [ FAILED ] IsInfTest.test_isinf_bfloat16 1: [ FAILED ] IsInfTest.test_Float8E4M3FN ```	2024-07-18 12:11:52 -07:00
kailums	1b38c05544	change ci docker image to rocm6.1 (#21296) ### Description There is a bug affecting kernels running on rocm6.0, so change the ci docker image to rocm6.1. For the torch installed in the docker image, switch to the rocm repo when the version is not 6.0.	2024-07-18 14:50:01 +08:00
vraspar	fa287042ca	Add ML Program support for transpose op (#21364 ) ### Description Add support for transpose op ### Motivation and Context Enable support for Autodesk model	2024-07-16 16:34:58 -07:00
vraspar	218301403d	Add ML Program support for basic activation ops (#21326 ) ### Description Add support for: - Sigmoid - Relu - Tanh ### Motivation and Context Enable support for Autodesk model	2024-07-15 22:30:20 -07:00
George Wu	4005d12ed4	add vitisai ep build stage to Windows CPU Pipeline (#21361) We need to prevent VitisAI EP build breaks, so add a stage in the Windows CPU CI Pipeline to build Vitis AI EP on Windows. There are no external dependencies for builds. Tests have to be disabled, though, as the EP has external SW/HW dependencies. This will at least allow us to prevent build breaks, which have happened on multiple occasions recently. Tested https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1432346&view=results and it seems to run fine.	2024-07-15 19:34:08 -07:00
Jian Chen	c03e6fff4c	Combining android build and test step into one job (#21340 ) ### Description Combining android build and test step into one job ### Motivation and Context Reduce runtime by removing additional machine allocation, and artifact uploading and downloading. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-07-15 14:44:03 -07:00
Yi Zhang	f2ebd1cd6b	[Fix] Exception in iosDynamicFramework Post-Merge workflow (#21262) ### Description The exception was caused by `3dd6fcc089`. I added skip_macos_test because there is a new exception in https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1425579&view=logs&j=c90c5af3-67d5-5936-5a62-71c93ebfca65&t=01038f35-8e78-5801-1aa1-d9647bb65858 ``` 2024-07-05T14:41:09.3864740Z mkdir -p /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest/Contents/Frameworks 2024-07-05T14:41:09.3933430Z mkdir: /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest: Operation not permitted 2024-07-05T14:41:09.3996760Z /var/folders/0f/b0mzpg5d31z074x3z5lzkdxc0000gn/T/tmp97ycvwq5/apple_package_test/Pods/Target Support Files/Pods-macos_package_testUITests/Pods-macos_package_testUITests-frameworks.sh: line 7: realpath: command not found 2024-07-05T14:41:09.4003170Z :18: error: Unexpected failure 2024-07-05T14:41:11.1323470Z error: Sandbox: mkdir(72212) deny(1) file-write-create /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest (in target 'macos_package_testUITests' from project 'apple_package_test') 2024-07-05T14:41:11.1325620Z 2024-07-05T14:41:11.8731110Z 2024-07-05T14:41:11.8733040Z Test session results, code coverage, and logs: 2024-07-05T14:41:11.8734820Z /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Logs/Test/Test-macos_package_test-2024.07.05_14-40-38-+0000.xcresult 2024-07-05T14:41:11.8735530Z 2024-07-05T14:41:11.8906210Z Testing failed: 2024-07-05T14:41:11.8911060Z Sandbox: mkdir(72212) deny(1) file-write-create /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest 2024-07-05T14:41:11.8912570Z Unexpected failure 2024-07-05T14:41:11.8913690Z Testing cancelled because the build failed. 2024-07-05T14:41:11.8914380Z 2024-07-05T14:41:11.8914970Z TEST FAILED 2024-07-05T14:41:11.8915480Z 2024-07-05T14:41:11.8915780Z 2024-07-05T14:41:11.8916750Z The following build commands failed: 2024-07-05T14:41:11.8919280Z PhaseScriptExecution [CP]\ Embed\ Pods\ Frameworks /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Intermediates.noindex/apple_package_test.build/Debug/macos_package_testUITests.build/Script-059136A7770CA5376C30F2FD.sh (in target 'macos_package_testUITests' from project 'apple_package_test') 2024-07-05T14:41:11.8922180Z (1 failure) ``` I also found that the macOS test is skipped in `9ef28f092f/tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml (L119-L127)` as well, so it may be a known issue.	2024-07-12 09:24:12 -07:00
Edward Chen	33e7c7f6ec	Enable Android CI build stages to run in parallel. (#21314 ) Enable Android CI build stages to run in parallel to possibly reduce total build time.	2024-07-11 10:09:09 -07:00
Yi Zhang	41ea47be1e	Move QNN nuget package stages out of the big Nuget packaging pipeline. (#21306 ) ### Description 1. remove QNN stages from the big packaging pipeline 2. Add publish nightly package in the current [QNN Nuget pipeline](https://dev.azure.com/aiinfra/Lotus/_builddefinitionId=1234]) ### Motivation and Context Reduce the complexity of the big Nuget packaging pipelines. --------- Co-authored-by: Yi Zhang <your@email.com>	2024-07-11 09:07:23 -07:00
Changming Sun	fe6ef404b5	Enable LTO for Android build (#21243 ) ### Description Enable LTO for Android build, which can reduce binary size by 6%.	2024-07-10 18:44:17 -07:00
Changming Sun	8749fa381e	Update absl (#21300) ### Description Our macOS pipelines are failing because of a build error in absl. However, the bug fix we need is not available in the latest ABSL release. Here is the issue: https://github.com/abseil/abseil-cpp/pull/1536 And here is the fix: `779a3565ac` GTest uses ABSL, but this ABSL target also depends on GTest, so it is a circular dependency. We should be able to avoid that by not building tests for ABSL. However, the version we are using has a problem with that: it has a cmake target that still depends on GTest even when testing is disabled. It's strange that we suddenly hit this problem and that it only happens on macOS.	2024-07-10 11:14:15 -07:00
Jian Chen	d1c19e79ea	Update OpenVino CI Ubuntu to 22.04 (#21127 ) ### Description [Update OpenVino CI Ubuntu to 22.04](`312fab5b3f`) ### Motivation and Context Ubuntu 22.04 is needed for linux C++20	2024-07-09 09:56:44 -07:00
Yi Zhang	30b6e82e7d	Make ROCm packaging stages to a single workflow (#21235) ### Description Move the current ROCm packaging stages into a single workflow. This reduces the chance that one failed stage prevents all nightly packages from being generated. ### Motivation and Context Our plan is to reduce the complexity of the current zip-nuget pipeline to improve the stability and performance of nightly package generation. The ROCm packaging stages have no dependencies on other packaging jobs, and they are the most time-consuming route. After this change, the duration of the most used CPU/CUDA/Mobile packaging workflow can be reduced from roughly 3h20m to 2h30m.	2024-07-04 11:07:04 +08:00
cloudhan	f39ee14b46	Add GQA support for ROCm (#21032 )	2024-07-03 14:55:31 +08:00
Yi Zhang	beb2496748	Templatize publishing nuget package (#21199) ### Description This is a prerequisite step for reducing the complexity of the current zip-nuget pipeline. Some packaging tasks can then be cut from the most complex nuget pipeline and easily be published.	2024-07-02 09:24:19 +08:00
Jian Chen	9007ede102	Update upstream packaging pipeline name to make it more meaningful. (#21154) ### Description Update upstream packaging pipeline name to make it more meaningful. ### Motivation and Context The upstream pipeline used to build only Nuget packages, but now it also builds Zip and Java, so changing the name will make it more meaningful.	2024-06-28 21:40:09 -07:00
Jian Chen	0cbe7eec5e	Update nuget to Use Nuget 6.10.x (#21209) ### Description Update nuget to Use Nuget 6.10.x	2024-06-28 19:49:54 -07:00
Yi Zhang	587e92c279	Add FP32 and INT4 test in Llama2 (#21187)	2024-06-28 06:18:26 +08:00
Changming Sun	d1ab94c2b0	Add compatibility for NumPy 2.0 (#21085) ### Description As suggested by SciPy's doc, we will `Build against NumPy 2.0.0, then it will work for all NumPy versions with the same major version number (NumPy does maintain backwards ABI compatibility), and as far back as NumPy 1.19 series at the time of writing`. I think it works because in [numpyconfig.h#L64](https://github.com/numpy/numpy/blob/main/numpy/_core/include/numpy/numpyconfig.h#L64) there is a macro NPY_FEATURE_VERSION. By default it is set to NPY_1_19_API_VERSION, and the NPY_FEATURE_VERSION macro controls the ABI. This PR only upgrades the build-time dependency; when a user installs ONNX Runtime, they can still use numpy 1.x. ### Motivation and Context Recently numpy published a new version, 2.0.0, which is incompatible with the latest ONNX Runtime release.	2024-06-27 13:50:53 -07:00
PeixuanZuo	446aa986a1	[ROCm] Extend the Pipeline restriction time (#21158 ) ROCm EP builds are taking longer.	2024-06-27 15:36:04 +08:00
Jian Chen	f81c0ec32a	Remove warning suppression from Java Packaging pipeline. (#21010) ### Description Remove warning suppression from Java Packaging pipeline. ### Motivation and Context We want the CI step not to produce warnings.	2024-06-24 16:46:21 -07:00


2015 commits