onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-07 00:13:17 +00:00

Author	SHA1	Message	Date
Changming Sun	65ef270e06	Update Aten pipeline's docker file to use UBI8 (#20856 ) ### Description Now it uses CentOS 7 which is EOL. This PR updates it to UBI8. ### Motivation and Context To deprecate CentOS 7 .	2024-05-30 07:38:15 -07:00
Jian Chen	228713f635	adding publishing stage to publish java CUDA 12 pkg to ado (#20834 )	2024-05-29 16:24:23 -07:00
Vincent Wang	e77f238dc6	Update Torch Version to Fix ATen CPU Pipeline Failure (#20845 ) Update Torch Version to Fix ATen CPU Pipeline Failure.	2024-05-29 16:04:18 +08:00
Edward Chen	535e9d7114	Update package_release_tasks.py (#20835 ) 1. Move azcopy environment variables out of script and into an Azure DevOps variable group. Move towards consolidating the managed identity client ID definition in one place. 2. Disable azcopy overwrite. We don't want to accidentally change the files for a released package.	2024-05-28 17:50:25 -07:00
Adrian Lizarraga	e78b18a2fb	Increase ComponentDetection timeout for React Native CI (#20800 ) ### Description Runs of the React Native CI are timing out during ComponentDetection after 8 minutes. This increases the timeout value. ### Motivation and Context Runs of the React Native CI are timing out during ComponentDetection.	2024-05-28 08:36:38 -07:00
Jian Chen	b1b8cb05dc	Adding java build and packaging stage to cuda-packaging-pipeline.yml (#20812 ) ### Description Adding java build/packaging stage to `cuda-packaging-pipeline.yml` ### Motivation and Context This way we can enable publishing the Java Cuda 12 along with Nuget CUDA 12	2024-05-27 07:59:19 -07:00
Changming Sun	439ed92b96	Remove TVM EP's pipeline (#20813 ) ### Description Temporarily remove TVM EP's pipeline until someone helps us upgrade TVM to a newer version which is compatible with the latest ONNX. ### Motivation and Context The ONNX version that TVM EP uses has a known security vulnerability. We cannot continue using it in our hosted build environment. This change is temporary	2024-05-25 20:42:41 -07:00
Jian Chen	fe24006425	Fix Nuget Cuda pipeline package pipeline (#20741 ) ### Description This PR adds protoc.exe to the Nuget CUDA pipeline, which also allows it to build Java for various CUDA versions.	2024-05-24 09:15:57 -07:00
Changming Sun	535a030b1e	Remove manylinux build scripts from python packaging pipeline (#20786 ) ### Description Use a common set of prebuilt manylinux base images to build the packages, to avoid building the manylinux part again and again. The base images can be used in GenAI and other projects too. This PR also updates the GCC version for inference python CUDA11/CUDA12 builds from 8 to 11. Later on I will update all other CUDA pipelines to use GCC 11, to avoid the issue described in https://github.com/onnx/onnx/issues/6047 and https://github.com/microsoft/onnxruntime-genai/issues/257 . ### Motivation and Context To extract the common part as a reusable build infra among different ONNX Runtime projects.	2024-05-24 08:18:22 -07:00
Jian Chen	884acd4598	Fix Nuget-Cuda publish pipeline (#20794 ) ### Description Previously all feeds were set to nightly; the official release feed ID was not set.	2024-05-23 18:27:46 -07:00
Changming Sun	b522df0ae4	Update RE2 to the latest (#20775 ) Update RE2 to the latest. To keep the components up to date.	2024-05-23 14:30:15 -07:00
Yi Zhang	fa8670fe5b	Add a test image for stable diffusion (#20780 )	2024-05-23 08:50:23 -07:00
Jian Chen	d4fe4b5b51	Replace ubuntu-latest with onnxruntime-Ubuntu2204-AMD-CPU (#20736 )	2024-05-22 13:36:02 -07:00
Jian Chen	0a10a3003a	component-governance fix round 4 (#20754 )	2024-05-22 11:05:24 -07:00
Jian Chen	372974e5d6	Using CPU pool to build Linux GPU C API Package (#20648 )	2024-05-20 15:25:14 -07:00
Jian Chen	ddafbf2224	Component Governance fix round 3 (#20689 )	2024-05-20 13:39:09 -07:00
Jian Chen	11df22b59b	Reenabling Nuget Cuda Packaging Pipeline (#20688 )	2024-05-20 10:37:15 -07:00
Edward Chen	fefae0cd04	Add Mac CI GitHub Actions workflow (#20717 ) Add a new GitHub Actions workflow, `.github/workflows/mac.yml`. It contains these jobs: - ARM64 MacOS CI build. - Objective-C static analysis build. This was moved over from another Azure DevOps pipeline to make it more visible.	2024-05-20 10:27:03 -07:00
Yulong Wang	036fcd93d4	[js/web] optimize module export and deployment (#20165 ) ### Description This PR make numbers of optimizations to onnxruntime-web's module export and deployment. See each section below for more details. #### Preview > [onnxruntime-web@1.19.0-esmtest.20240513-a16cd2bd21](https://www.npmjs.com/package/onnxruntime-web/v/1.19.0-esmtest.20240513-a16cd2bd21) > ~~onnxruntime-web@1.19.0-esmtest.20240430-c7edbcc63d~~ > ~~onnxruntime-web@1.18.0-esmtest.20240428-624c681c83~~ > ~~onnxruntime-web@1.18.0-esmtest.20240411-1abb64e894~~ <details> <summary><h4>Breaking changes</h4></summary> There is no code change required, but there are a few differences regarding code import, flags, bundler config and deployment steps. #### Importing: Import table is changed. See following for details. <details> <summary><h5>Current import table:</h5></summary> \| Target Name \| Path for "import" or "require" \| WebGL \| JSEP \| wasm \| Proxy \| Training \| \|------\|-----\|-----\|-----\|-----\|-----\|-----\| \| `ort` (default) \| `onnxruntime-web` \| ✔️ \| ❌ \| ✔️ \| ✔️ \| ❌ \| \| `ort.all` \| `onnxruntime-web/experimental` \| ✔️ \| ✔️ \| ✔️ \| ✔️ \| ❌ \| \| `ort.node` \| `onnxruntime-web` \| ❌ \| ❌ \| ✔️ \| ❌ \| ❌ \| \| `ort.training` \| `onnxruntime-web/training` \| ❌ \| ❌ \| ✔️ \| ✔️<sup>\[1]</sup> \| ✔️ \| \| `ort.wasm` \| `onnxruntime-web/wasm` \| ❌ \| ❌ \| ✔️ \| ✔️ \| ❌ \| \| `ort.wasm-core` \| `onnxruntime-web/wasm-core` \| ❌ \| ❌ \| ✔️ \| ❌ \| ❌ \| \| `ort.webgl` \| `onnxruntime-web/webgl` \| ✔️ \| ❌ \| ❌ \| ✔️<sup>\[2]</sup> \| ❌ \| \| `ort.webgpu` \| `onnxruntime-web/webgpu` \| ❌ \| ✔️ \| ✔️ \| ✔️ \| ❌ \| * [1] didn't test. may not actually work. * [2] not working. this is a mistake in build config. 
</details> <details> <summary><h5>Proposed update:</h5></summary> \| Target Name \| Path for "import" or "require" \| WebGL \| JSEP \| wasm \| Proxy \| Training \| \|------\|-----\|-----\|-----\|-----\|-----\|-----\| \| `ort` (default) \| `onnxruntime-web` \| ✔️ \| ❌ \| ✔️ \| ✔️ \| ❌ \| \| `ort.all` \| ~~`onnxruntime-web/experimental`~~<br/>`onnxruntime-web/all` \| ✔️ \| ✔️ \| ✔️ \| ✔️ \| ❌ \| \| `ort.node` \| `onnxruntime-web` \| ❌ \| ❌ \| ✔️ \| ❌ \| ❌ \| \| `ort.training` \| `onnxruntime-web/training` \| ❌ \| ❌ \| ✔️ \| ✔️ \| ✔️ \| \| `ort.wasm` \| `onnxruntime-web/wasm` \| ❌ \| ❌ \| ✔️ \| ✔️ \| ❌ \| \| ~~`ort.wasm-core`~~ \| ~~`onnxruntime-web/wasm-core`~~ \| ~~❌~~ \| ~~❌~~ \| ~~✔️~~ \| ~~❌~~ \| ~~❌~~ \| \| `ort.webgl` \| `onnxruntime-web/webgl` \| ✔️ \| ❌ \| ❌ \| ~~✔️~~ ❌ \| ❌ \| \| `ort.webgpu` \| `onnxruntime-web/webgpu` \| ❌ \| ✔️ \| ✔️ \| ✔️ \| ❌ \| </details> #### Flags: The following flags are deprecated: - `env.wasm.simd` (boolean): will be ignored. SIMD is always enabled in build. The following flags changed their type: - `env.wasm.wasmPaths`: When using this flag as a string (for the URL prefix), nothing is changed. When using this flag as an object (for per-file path override), the type changed: ```diff - export interface Old_WasmFilePaths{ - 'ort-wasm.wasm'?: string; - 'ort-wasm-threaded.wasm'?: string; - 'ort-wasm-simd.wasm'?: string; - 'ort-training-wasm-simd.wasm'?: string; - 'ort-wasm-simd-threaded.wasm'?: string; - }; + export interface New_WasmFilePaths { + /** + * Specify the override path for the main .wasm file. + * + * This path should be an absolute path. + * + * If not modified, the filename of the .wasm file is: + * - `ort-wasm-simd-threaded.wasm` for default build + * - `ort-wasm-simd-threaded.jsep.wasm` for JSEP build (with WebGPU and WebNN) + * - `ort-training-wasm-simd-threaded.wasm` for training build + */ + wasm?: URL\|string; + /** + * Specify the override path for the main .mjs file. 
+ * + * This path should be an absolute path. + * + * If not modified, the filename of the .mjs file is: + * - `ort-wasm-simd-threaded.mjs` for default build + * - `ort-wasm-simd-threaded.jsep.mjs` for JSEP build (with WebGPU and WebNN) + * - `ort-training-wasm-simd-threaded.mjs` for training build + */ + mjs?: URL\|string; + } ``` #### Bundler compatibility: Config changes are needed for bundlers. See usage examples in /js/web/test/e2e/ for Webpack, parcel and rollup. #### Deployment: - if consuming from a CDN, there is no breaking change. - if consuming from a local server, you need to copy all `ort-*.wasm` and `ort-*.mjs` files (6 files in total) in the dist folder. (Previously you only needed to copy the `ort-*.wasm` files.) </details> <details> <summary><h4>Problems</h4></summary> There are a few problems with the current module export and deployment: - Script URL cannot be correctly inferred when imported as ESM. - Workers are forcefully encoded using Blob URL, which prevents onnxruntime-web from working in CSP environments and Node.js when using the proxy or multi-threading feature. - Generated JS code (by Emscripten) is encoded using `function.toString()`, which is unstable and error-prone. - When running with a different Emscripten build, the build step is always needed, making it difficult to swap artifacts in development/debug. </details> <details> <summary><h4>Goals</h4></summary> - Full ESM support - Support a variety of ways to import. 
Including: - import from HTML's `<script>` tag (IIFE format, exporting to global variable `ort`) ```html <script src="https://example.com/cdn-path-to-onnxruntime-web/dist/ort.min.js"></script> ``` - import from source code inside `<script type="module">` tag (ESM) ```html <script type="module"> import * as ort from "https://example.com/cdn-path-to-onnxruntime-web/dist/ort.min.mjs"; // using 'ort' </script> ``` - import in a CommonJS project (CJS format, resolve from package.json "exports" field) ```js // myProject/main.js const ort = require('onnxruntime-web'); ``` - import in an ESM project (ESM format, resolve from package.json "exports" field) ```js // myProject/main.js (or main.mjs) import * as ort from 'onnxruntime-web'; ``` - Support popular bundlers when importing onnxruntime-web into a CJS/ESM project. - webpack (esm requires extra post-process step) - rollup - parcel (esm requires extra post-process step) - More bundlers TBD - Multi-threading support for Node.js NOTE: keeping single JavaScript file (the all-in-one bundle) is no longer a goal. This is because technically there is a conflict with the other requirements. </details> <details> <summary><h4>Important Design Decisions</h4></summary> - Drop support of single JavaScript output. - The current onnxruntime-web distribution uses a single JavaScript file to include all code. While there are a few benefits, it also creates problems as mentioned above. Since ESM is being used more and more widely, and browsers are making more restricted security checks and requirement, the old Blob based solution is going to be replaced. - To achieve the requirement, specifically, the CSP environment support, we have to offer a non Blob based solution. Therefore, we have to distribute multiple files and drop the single file solution. - Do not run parser/postprocess on Emscripten generated JavaScript. 
- Emscripten is evolving quickly, so we should only depend on what's in its documentation instead of on particular implementation details. (For example, currently we patch its code to deal with a special variable `_scriptDir`.) - Keeping the generated files as-is also helps to: - reduce the size of ort.min.js - make it easier to replace build artifacts when in development/debug - Drop support for non-SIMD and non-MultiThread. This helps to reduce the number of artifacts in distribution. - (fixed-sized) SIMD is supported in any mainstream JS environment. - Multi-threading as a WebAssembly feature is supported in any mainstream JS environment. In some environments the feature is guarded by a cross-origin policy, but it can still work if not trying to create any worker. - Use ESM output for Emscripten generated JavaScript. - There are 2 ways to dynamically import classic (umd) modules and neither of them is recommended: - dynamically creating a <script> tag. This changes the HTML structure and has quite a lot of compatibility issues - use `fetch()` and `eval()`. However, avoiding `eval` is strongly suggested because there is a great perf hit. - importing ESM is super easy - just use the `import()` call. Considering ESM is widely supported in modern browsers and Node.js, this is the better option. - Add Blob based solution as a fallback for cross-origin workers. - Importing onnxruntime-web from a CDN is still a widespread use case. In this usage, make it possible to create a worker by using `fetch()`+`Blob` to create a same-origin Blob URL. </details> <details> <summary><h4>Distribution File Manifest</h4></summary> The distribution folder contains the following files: - WebAssembly artifacts. These files are the result of compiling the ONNX Runtime C++ code to WebAssembly by Emscripten. 
\| File Name \| Build Flags \| \|------\|-----\| \| ort-wasm-simd-threaded.mjs <br/> ort-wasm-simd-threaded.wasm \| `--enable_wasm_simd` <br/> `--enable_wasm_threads` \| \| ort-training-wasm-simd-threaded.mjs <br/> ort-training-wasm-simd-threaded.wasm \| `--enable_training_apis` <br/> `--enable_wasm_simd` <br/> `--enable_wasm_threads` \| \| ort-wasm-simd-threaded.jsep.mjs <br/> ort-wasm-simd-threaded.jsep.wasm \| `--enable_wasm_simd` <br/> `--enable_wasm_threads` <br/> `--use_jsep` <br/> `--use_webnn` \| - onnxruntime-web JavaScript artifacts. These files are generated by ESBuild as the entry point for onnxruntime-web. There are multiple build targets for different use cases: \| Target Name \| Path for "import" or "require" \| Description \| \|------\|-----\|-----\| \| `ort` \| `onnxruntime-web` \| The default target. \| \| `ort.all` \| `onnxruntime-web/all` \| The target including webgl. \| \| `ort.node` \| `onnxruntime-web` \| The default target for Node.js. \| \| `ort.training` \| `onnxruntime-web/training` \| The target including training APIs \| \| `ort.wasm` \| `onnxruntime-web/wasm` \| The target including only WebAssembly (CPU) EP \| \| `ort.webgl` \| `onnxruntime-web/webgl` \| The target including only WebGL EP \| For each target, there are multiple files generated: \| File Name \| Description \| \|------\|-----\| \| [target].js \| The entry point for the target. IIFE and CommonJS format. \| \| [target].mjs \| The entry point for the target. ESM format. \| \| [target].min.js <br/> [target].min.js.map \| The entry point for the target. Minimized with sourcemap. IIFE and CommonJS format. \| \| [target].min.mjs <br/> [target].min.mjs.map \| The entry point for the target. Minimized with sourcemap. ESM format. \| \| [target].proxy.mjs \| (if applicable) The proxy ESM module for the target. \| \| [target].proxy.min.mjs <br/> [target].proxy.min.mjs.map \| (if applicable) The proxy ESM module for the target. Minimized with sourcemap. 
\| </details> <details> <summary><h4>Dynamic Import Explained</h4></summary> - Local Served \| No Proxy: ``` [Bundle or ort.min.js] \| + import()--> [ort-wasm-simd-threaded.mjs] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] \| + new Worker()--> [ort-wasm-simd-threaded.mjs (worker)] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] ``` - Local Served \| Proxy: ``` [Bundle or ort.min.js] \| + import()--> [ort.proxy.min.mjs] \| + new Worker()--> [ort.proxy.min.mjs (worker)] \| + import()--> [ort-wasm-simd-threaded.mjs] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] \| + new Worker()--> [ort-wasm-simd-threaded.mjs (worker)] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] ``` - Cross Origin \| No Proxy: ``` [Bundle or ort.min.js] \| + fetch('ort-wasm-simd-threaded.mjs') \| + URL.createObjectURL(res.blob()) \| + import()--> [blob:... (ort-wasm-simd-threaded)] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] \| + new Worker()--> [blob:... (ort-wasm-simd-threaded) (worker)] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] ``` - Cross Origin \| Proxy ``` [Bundle or ort.min.js] \| + fetch('ort.proxy.min.mjs') \| + URL.createObjectURL(res.blob()) \| + import()--> [blob:... (ort.proxy)] \| + new Worker()--> [blob:... (ort.proxy) (worker)] \| + fetch('ort-wasm-simd-threaded.mjs') \| + URL.createObjectURL(res.blob()) \| + import()--> [blob:... (ort-wasm-simd-threaded)] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] \| + new Worker()--> [blob:... (ort-wasm-simd-threaded) (worker)] \| + WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm] ``` </details>	2024-05-20 09:51:16 -07:00
Edward Chen	e81c8676e3	MatMulNBits + Add fusion (#20587 ) - Add MatMulNBits Bias input - Add graph transformer to fuse MatMulNBits + Add	2024-05-16 11:00:59 -07:00
Yifan Li	47a178b518	[EP Perf] Fix on EP Perf (#20683 ) ### Description <!-- Describe your changes. --> * Partially revert [previous change](https://github.com/microsoft/onnxruntime/pull/19804), and * Redo concurrency_test_result parser outside of post.py * Add support of syncing memtest result to db ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> To fix the error when CI is running on two model groups. - When running on two model groups, the [previous change](https://github.com/microsoft/onnxruntime/pull/19804) wrongly navigates two levels up in the directory after running one model group, while one level is needed. After that, the script can't find another model group. - Running on one model group can't repro the issue	2024-05-15 21:38:52 -07:00
Jian Chen	d1e66f0446	Increase NPM ComponentDetection.Timeout: 1200 (#20681 )	2024-05-15 13:41:59 -07:00
Jian Chen	87ed1e3e3f	Component governance fix round 2 (#20679 )	2024-05-14 17:15:15 -07:00
Edward Chen	113aa2992f	Update React Native CI (#20673 ) - Move iOS package build to separate job so it can run in parallel with Android AAR build and be decoupled from the test stage. The test stage fails sometimes (not infrequently) and may need to be re-run. - Update stop iOS simulator step so it doesn't fail if the start step doesn't run.	2024-05-14 14:10:56 -07:00
Jian Chen	83a871f890	Fix critical and High issues from Component Governance (#20611 )	2024-05-14 09:17:23 -07:00
Hector Li	0e11d0c4f8	Enable Qnn nuget nightly (#20662 ) ### Description Enable Qnn nuget nightly	2024-05-13 21:28:43 -07:00
Yi Zhang	c131ea89e1	Nuget Publish pipelines should be triggered by rel-* automatically too. (#20652 ) ### Description Also set allowPackageConflicts = True `#allowPackageConflicts: false # boolean. Optional. Use when command = push && nuGetFeedType = internal. Allow duplicates to be skipped. Default: false.` https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/reference/nuget-command-v2?view=azure-pipelines If the publish partially fails, we don't need to rerun the whole package generation workflow.	2024-05-13 13:18:16 -07:00
Edward Chen	90d49ccb9a	Allow path pattern to be specified in package_release_tasks.py. (#20650 ) Do more in the Python helper script so the Bash code in the release definition can be simplified.	2024-05-13 09:16:04 -07:00
Jian Chen	4fe565a62a	Java CUDA 12 support (#20583 ) ### Description - This PR combines all CUDA 12 stages into the Zip-nuget-... pipeline. - It also enables CUDA 12 support.	2024-05-10 14:16:22 -07:00
George Wu	a0c4bd4da7	[qnn ep] sign onnxruntime.dll/pyd for qnn packages (#20634 ) sign only onnxruntime.dll and onnxruntime_pybind11_state.pyd in packages.	2024-05-09 20:45:44 -07:00
Yi Zhang	5a18818e1d	Migrate training storage from SAS to managed identity (#20618 ) ### Description orttrainingtestdatascus only stores the mnist data, which is only 64M in size, in Azure File. To meet security requirements and reduce maintenance cost, move the test data to lotusscus and save it in Azure blob.	2024-05-09 15:44:29 -07:00
Jian Chen	d1cbb3e076	The time for nuget pkg should be consistent (#20522 ) This pull request primarily involves changes to the build scripts in the `tools/ci_build/github/azure-pipelines` directory. The changes add build date and time information to the build process. This is achieved by introducing two new parameters, `BuildDate` and `BuildTime`, and incorporating them into the `msbuildArguments` in multiple locations. Addition of new parameters: * [`tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml`](diffhunk://#diff-00815920cc190d10fdebceac0c3a4b8a59e408684ae38177dfe7f96cae276c59R309-R310): Added `BuildDate` and `BuildTime` parameters using the pipeline's start time. Incorporation of new parameters in `msbuildArguments`: * [`tools/ci_build/github/azure-pipelines/c-api-noopenmp-packaging-pipelines.yml`](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL947-R948): Added `CurrentDate` and `CurrentTime` parameters to `msbuildArguments` in multiple locations. [[1]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL947-R948) [[2]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL1092-R1093) [[3]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL1114-R1115) [[4]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL1137-R1138) * [`tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml`](diffhunk://#diff-00815920cc190d10fdebceac0c3a4b8a59e408684ae38177dfe7f96cae276c59L446-R448): Incorporated the `CurrentDate` and `CurrentTime` parameters into `msbuildArguments`.### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-05-09 11:35:45 -07:00
Edward Chen	a0db2187ee	Update CocoaPods package release script. (#20608 ) - Update method for uploading to Azure storage to use managed identity. - Allow helper script tasks to be split across different calls. - Rewrite helper script in Python. Motivation: Recently the Azure storage account configuration was changed and now the old way of uploading to it no longer works.	2024-05-08 16:17:26 -07:00
Changming Sun	08b637350a	Remove an extra space in azure_scale_set_vm_mount_test_data.sh (#20584 )	2024-05-08 09:46:50 -07:00
Scott McKay	8d09baf49f	Clarify when protobuf dependency builds protoc (#20542 ) ### Description Currently, figuring out whether the protobuf dependency is building protoc is a little obtuse and inconsistent: * in some places we directly set protobuf_BUILD_PROTOC_BINARIES to OFF to indicate the protobuf dependency is not building protoc * e.g. macOS/iOS/visionOS builds * for a user-provided protoc path we don't set protobuf_BUILD_PROTOC_BINARIES, and the logic inside protobuf_function.cmake determines whether `protobuf::protoc` is added as a dependency or not * `0dda8b0c44/cmake/external/protobuf_function.cmake (L40-L45)` To be more consistent/explicit, set protobuf_BUILD_PROTOC_BINARIES to OFF when ONNX_CUSTOM_PROTOC_EXECUTABLE is set and valid. Remove the outdated script that built an external protoc binary which was used in later builds. The build setup will fetch a pre-built protoc so there's no need for this additional build. ### Motivation and Context Make it easier to figure out if protoc is coming from the protobuf dependency.	2024-05-08 08:30:11 +10:00
aciddelgado	4e27841bdb	fix gqa cpu nan bug (#20521 ) ### Description There was a bug with gqa on cpu where on token case, with batch_size > 1, and with past_present_share_buffer off, the output would occasionally contain nans. this pr fixes that. it also updates documentation and fixes posid gen for rotary in cuda in prompt case. ### Motivation and Context this pr solves the GQA CPU bug as well as updates the documentation and makes seqlens_k irrelevant for prompt case, which is useful to prevent user error.	2024-05-07 15:19:26 -07:00
Adrian Lizarraga	0dda8b0c44	[QNN EP] Update QNN SDK to 2.21 (#20534 ) ### Description - Updates QNN pipelines to use QNN SDK 2.21 - Downloads QNN SDK from Azure storage to avoid having to rebuild images when a new version is released. ### Motivation and Context Test with the latest QNN SDK.	2024-05-01 20:17:35 -07:00
Scott McKay	f9febc4f35	Remove usage of 'required reason' iOS API from protobuf (#20529 ) ### Description Using certain APIs is about to require a [privacy manifest](https://developer.apple.com/documentation/bundleresources/privacy_manifest_files/describing_use_of_required_reason_api) to be added to a package. Our version of protobuf uses `mach_absolute_time`. Patch as per https://github.com/protocolbuffers/protobuf/pull/15662/ to remove usage. ### Motivation and Context Usage of API will require a privacy manifest for an iOS app to be accepted as of 5/1/2024 #20519	2024-05-02 08:21:08 +10:00
Yifan Li	29417762f7	[TensorRT EP] support TensorRT 10-GA (#20506 ) ### Description This branch is based on rel-1.18.0 and supports TensorRT 10-GA.	2024-05-01 11:10:53 -07:00
Hector Li	755aaea9a6	Qnn nuget update (#20527 ) ### Description Update Qnn nuget package to include Qnn libs and license file	2024-04-30 22:12:53 -07:00
Yi Zhang	91baeb8495	Reduce downloads to NodeJS to mitigate random connection exception. (#20518 ) ### Description There was connection exception in docker build in package pipeline ``` 48.26 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail 456.0 curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2) ``` https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=453140&view=logs&j=f9f5b320-fa10-56c4-debe-61ea69c74793&t=1656e225-defa-5b12-8935-2a0a93e76a67&s=3c85d903-a183-5028-775e-d63999fcc9ae In fact, docker image shouldn't be rebuilt this time. Checked the code, The docker image tag in Linux_C_API_Packaging_GPU_x64 of onnxruntimecuda${{ variables.CUDA_VERSION_MAJOR }}build was same as the image tag of Linux-gpu-ci-pipeline, but their docker files are different. So changing the Linux GPU pipeline's image tag to avoid packaging pipeline docker image overridden unexpectedly.	2024-05-01 09:04:56 +08:00
Rachel Guo	8c31f27dd1	Catalyst nuget package .NET changes only (#20424 ) ### Description https://github.com/microsoft/onnxruntime/pull/20418 Add back Catalyst changes only for now. Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2024-04-29 15:39:48 -07:00
Scott McKay	923b0ef323	Run fuzz testing before the CG task cleans up the build directory (#20500 ) ### Description Update order of steps ### Motivation and Context Fix CI	2024-04-29 16:02:53 +10:00
Rachel Guo	ff505b9f44	Follow up fix for #20472 (#20484 ) ### Description Error: Artifact name input: e2e_test_logs_1364625_$(Date:yyyyMMddHHmmss) ##[error]Artifact name is not valid: e2e_test_logs_1364625_$(Date:yyyyMMddHHmmss). It cannot contain '\', /', "', ':', '<', '>', '\|', '*', and '?' The date is not correctly showing up in the artifact name. Use the predefined pipeline variable BuildNumber instead, which similarly serves as a timestamp. ### Motivation and Context RN CI failure --------- Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local> Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2024-04-27 13:42:24 +10:00
Rachel Guo	88904b9220	Add unique identifier to e2e_test_logs artifacts in react-native-ci.yml (#20472 ) ### Description As title.	2024-04-26 22:20:10 +10:00
Scott McKay	aa27dadd1c	Use download.onnxruntime.ai in podspec (#20474 ) ### Description Update to more generic url	2024-04-26 20:28:54 +10:00
Yi Zhang	464f199b95	Extend mac package jobs time out limit (#20459 )	2024-04-25 10:13:13 -07:00
Yi Zhang	e5947f5729	Two improvements in pipelines (#20449 ) ### Description 1. Update the image name so that the docker image isn't overwritten. There was a mistake: variables.CUDA_VERSION_MAJOR is always empty `14fcf0a52d/tools/ci_build/github/azure-pipelines/stages/nuget-linux-cuda-packaging-stage.yml (L120)` 2. Set one artifact name as a variable to make the job rerunnable	2024-04-25 10:15:40 +08:00
Scott McKay	a46bab6364	Update podspec url to use AFD hostname (#20452 ) Update to use AFD url when generating podspec	2024-04-24 09:37:24 -07:00
Rachel Guo	14fcf0a52d	Support visionos build (#20365 ) ### Description This PR supports building onnxruntime.xcframework for xros/xrsimulator (visionos) via the build command `python3 tools/ci_build/github/apple/build_apple_framework.py --config Release/Debug tools/ci_build/github/apple/default_vision_os_framework_build_settings.json`. Officially including visionos in the iOS CocoaPods package and testing it in CI would require separate work to upgrade the Xcode version and upgrade the macOS CI agent to macos-13-arm64 or higher. ### Motivation and Context visionos support: https://github.com/microsoft/onnxruntime/discussions/19313 --------- Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>	2024-04-23 18:15:07 -07:00
Yulong Wang	5055dc0aa8	[js/web] add diagnose log for chrome (#20439 ) ### Description Add logs to further diagnose the pipeline issue.	2024-04-23 17:18:54 -07:00
Edward Chen	76461c8f4d	Increase timeout for iOS packaging pipeline jobs. (#20434 )	2024-04-23 11:55:55 -07:00
Yi Zhang	7ebc653f04	Revert "Nuget .NET changes for Mac Catalyst (#19923 )" (#20418 ) This reverts commit `f396748ed6`.	2024-04-23 15:08:12 +08:00
Adrian Lizarraga	e6a677f6b7	[QNN EP] Download QNN SDK from azure blob in packaging pipelines (#20359 ) ### Description - Updates Windows QNN Nuget and Python packaging pipelines to download QNN SDK from blob storage. - Makes the QNN SDK version configurable when launching the python packaging pipeline. ### Motivation and Context Removes the need to rebuild images to update QNN SDK. Only applies to Windows pipelines. Linux pipelines still get the SDK from disk.	2024-04-22 22:32:55 -07:00
Yi Zhang	197b3f1d90	Enable Whisper Test with OMP_FFMPEG (#20402 ) ### Description Install OMP_FFMPEG in the docker image and re-add the Whisper test. Download OMP_FFMPEG from a restricted-access Azure blob.	2024-04-22 10:55:56 -07:00
Yulong Wang	a457c1df80	upgrade emsdk to 3.1.57 (#20295 ) ### Description upgrade emsdk to 3.1.57	2024-04-19 23:05:18 -07:00
Rachel Guo	f396748ed6	Nuget .NET changes for Mac Catalyst (#19923 ) ### Description Add Nuget package changes for adding new 'net6.0-maccatalyst' platform. The output ORT Nuget package was manually tested and verified in a .NET MAUI app setup. --------- Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Yi Zhang <zhanyi@microsoft.com> Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>	2024-04-19 14:20:03 -07:00
sfatimar	4d1963c2a2	OpenVINO EP Rel 1.18 Changes (#20337 ) ### Description These changes include: - Support for OpenVINO 2024.1 - Import PreCompiled Blobs with EPContext Blob - Separate Device/Precision as input - Deprecate CPU_FP32, GPU_FP32 terminology, introduce CPU, GPU - AUTO GPU, CPU will only create GPU Blob and not CPU Blob. ### Motivation and Context - OpenVINO 2024.1 will be out soon - Import Precompiled Blob can greatly reduce FEIL/FIL Time. - Separating Device/Precision will make the input cleaner --------- Co-authored-by: Suryaprakash Shanmugam <suryaprakash.shanmugam@intel.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>	2024-04-19 00:31:38 -07:00
Yulong Wang	3577a4bd02	[Node.js binding] Allow installation to download CUDA binaries via script (#20364 ) ### Description Currently we try to include all prebuilt binaries in the NPM packages. This worked until we added libonnxruntime_providers_cuda.so (>400MB) to the NPM package. The NPM registry refuses to accept the new package publication because the file is too large. To make the new NPM package work, we have to remove the large file from the package and add a new script that runs on package installation. This script will try to dynamically install the onnxruntime CUDA dynamic library for Linux/x64.	2024-04-18 13:44:42 -07:00
Yi Zhang	4d2b98155f	More fixes on random connection exception in Mac Build. (#20328 ) ### Description Supplement to #20322 ### Motivation and Context Fixes random connection exceptions in Mac build in Python Packaging Pipeline https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=443617&view=logs&j=5849a411-e258-5ce5-39bd-7b65d44961a0&t=ccb871c8-76d9-5e80-55b0-4279efd5567f and iOS full xcframework https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=443458&view=logs&j=370fd1a2-3dec-5916-4d2c-8aae58c72d28&t=686352ba-ee61-5ad4-8739-e8abd07372a4&s=e9aa87c8-a9ad-51f7-3b12-045ecc319776	2024-04-17 08:37:56 +08:00
Yi Zhang	caf692e626	[Fix] Random connection exceptions in MacOS_C_API_Packaging_CPU stage (#20322 ) ### Description Add download_deps to reduce downloading from 3rd party websites. ### Motivation and Context Fix frequent random exception like ``` CMake Error at abseil_cpp-subbuild/abseil_cpp-populate-prefix/src/abseil_cpp-populate-stamp/download-abseil_cpp-populate.cmake:162 (message): Each download failed! error: downloading 'https://github.com/abseil/abseil-cpp/archive/refs/tags/20240116.0.zip' failed status_code: 35 status_string: "SSL connect error" log: --- LOG BEGIN --- Trying 20.29.134.23:443... Connected to github.com (20.29.134.23) port 443 ALPN: curl offers h2,http/1.1 (304) (OUT), TLS handshake, Client hello (1): [315 bytes data] CAfile: /etc/ssl/cert.pem CApath: none Recv failure: Operation timed out LibreSSL/3.3.6: error:02FFF03C:system library:func(4095):Operation timed out Closing connection ``` https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=443278&view=logs&j=006a7a04-d43b-5fe1-df02-ecafb79c4d6e&t=110edd38-9f3b-50cf-b328-8ed0f915e5c1 --------- Co-authored-by: Yi Zhang <your@email.com>	2024-04-16 13:28:18 +08:00
Edward Chen	287ecea2f1	Fix binary size check build publish step. (#20298 ) Add `--user` option to pip install command. Error: ``` ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/usr/local/bin/f2py' Consider using the `--user` option or check the permissions. ``` See #19877.	2024-04-15 10:15:42 -07:00
liqun Fu	cd7112f800	Integration with ONNX 1.16.0 (#19745 ) ### Description update with ONNX 1.16.0 branch according to https://github.com/microsoft/onnxruntime/blob/main/docs/How_To_Update_ONNX_Dev_Notes.md ONNX 1.16.0 release notes: https://github.com/onnx/onnx/releases/tag/v1.16.0 #### Updated ops for CPU EP: - DequantizeLinear(21) - Added int16 and uint16 support + various optimizer tests - Missing int4 and uint4 support - Missing block dequantization support - QuantizeLinear(21) - Added int16 and uint16 support + various optimizer tests - Missing int4 and uint4 support - Missing block quantization support - Cast(21) - Missing int4 and uint4 support - CastLike(21) - Missing int4 and uint4 support - ConstantOfShape(21) - Missing int4 and uint4 support - Identity(21) - Missing int4 and uint4 support - If(21) - Missing int4 and uint4 support - Loop(21) - Missing int4 and uint4 support - Reshape(21) - Missing int4 and uint4 support - Scan(21) - Missing int4 and uint4 support - Shape(21) - Missing int4 and uint4 support - Size(21) - Missing int4 and uint4 support - Flatten(21) - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support - Pad(21) - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support - Squeeze(21) - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support - Transpose(21) - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support - Unsqueeze(21) - Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4 support #### Unimplemented opset 21 features/ops - int4 and uint4 data type - QLinearMatMul(21) - GroupNormalization(21) - ai.onnx.ml.TreeEnsemble(5) ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. 
--> ### Disabled tests #### ORT Training orttraining/orttraining/test/python/orttraining_test_ort_apis_py_bindings.py - test_ort_custom_ops: Potential shape inference bug for custom ops #### Python quantization unit tests test/onnx/python/quantization (shape inference bug) - test_op_conv_transpose.py: test_quantize_conv_transpose_u8u8_fp16 - test_op_conv_transpose.py: test_quantize_conv_transpose_s8s8_fp16 - test_op_gemm.py: test_quantize_qop_gemm_s8s8 - test_op_gemm.py: test_quantize_qop_gemm_e4m3fn_same - test_op_gemm.py: test_quantize_qop_gemm_e4m3fn_p3 - test_op_matmul.py: test_quantize_matmul_u8u8_f16 - test_op_matmul.py: test_quantize_matmul_s8s8_f16 - test_op_matmul.py: test_quantize_matmul_s8s8_f16_entropy - test_op_matmul.py: test_quantize_matmul_s8s8_f16_percentile - test_op_matmul.py: test_quantize_matmul_s8s8_f16_distribution - test_op_relu.py: test_quantize_qop_relu_s8s8 #### ONNX tests - test_maxpool_2d_ceil_output_size_reduce_by_one: ONNX 1.16.0 fixed a maxpool output size bug and added this test. Enable this test when [ORT PR](https://github.com/microsoft/onnxruntime/pull/18377) is merged. Refer to original [ONNX PR](https://github.com/onnx/onnx/pull/5741). 
- test_ai_onnx_ml_tree_ensemble_set_membership_cpu: new unimplemented op ai.onnx.ml.TreeEnsemble - test_ai_onnx_ml_tree_ensemble_single_tree_cpu: same - test_ai_onnx_ml_tree_ensemble_set_membership_cuda: same - test_ai_onnx_ml_tree_ensemble_single_tree_cuda: same - test_cast_INT4_to_FLOAT_cpu: ORT Cast(21) impl doesn't support int4 yet - test_cast_INT4_to_INT8_cpu: same - test_cast_UINT4_to_FLOAT_cpu: same - test_cast_UINT4_to_UINT8_cpu: same - test_cast_INT4_to_FLOAT_cuda - test_cast_INT4_to_INT8_cuda - test_cast_UINT4_to_FLOAT_cuda - test_cast_UINT4_to_UINT8_cuda - test_constantofshape_float_ones_cuda: ConstantOfShape(21) not implemented for cuda - test_constantofshape_int_shape_zero_cuda: same - test_constantofshape_int_zeros_cuda: same - test_flatten_axis0_cuda: Flatten(21) not implemented for cuda - test_flatten_axis1_cuda: same - test_flatten_axis2_cuda: same - test_flatten_axis3_cuda: same - test_flatten_default_axis_cuda: same - test_flatten_negative_axis1_cuda: same - test_flatten_negative_axis2_cuda: same - test_flatten_negative_axis3_cuda: same - test_flatten_negative_axis4_cuda: same - test_qlinearmatmul_2D_int8_float16_cpu: QLinearMatMul(21) for onnx not implemented in ORT yet - test_qlinearmatmul_2D_int8_float32_cpu: same - test_qlinearmatmul_2D_uint8_float16_cpu: same - test_qlinearmatmul_2D_uint8_float32_cpu: same - test_qlinearmatmul_3D_int8_float16_cpu: same - test_qlinearmatmul_3D_int8_float32_cpu: same - test_qlinearmatmul_3D_uint8_float16_cpu: same - test_qlinearmatmul_3D_uint8_float32_cpu: same - test_qlinearmatmul_2D_int8_float16_cuda: same - test_qlinearmatmul_2D_int8_float32_cuda: same - test_qlinearmatmul_2D_uint8_float16_cuda: same - test_qlinearmatmul_2D_uint8_float32_cuda: same - test_qlinearmatmul_3D_int8_float16_cuda: same - test_qlinearmatmul_3D_int8_float32_cuda: same - test_qlinearmatmul_3D_uint8_float16_cuda: same - test_qlinearmatmul_3D_uint8_float32_cuda: same - test_size_cuda: Size(21) not implemented for cuda - 
test_size_example_cuda: same - test_dequantizelinear_blocked: Missing implementation for block dequant for DequantizeLinear(21) - test_quantizelinear_blocked_asymmetric: Missing implementation for block quant for QuantizeLinear(21) - test_quantizelinear_blocked_symmetric: Missing implementation for block quant for QuantizeLinear(21) --------- Signed-off-by: liqunfu <liqun.fu@microsoft.com> Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> Co-authored-by: Ganesan Ramalingam <grama@microsoft.com> Co-authored-by: George Wu <jywu@microsoft.com> Co-authored-by: adrianlizarraga <adlizarraga@microsoft.com>	2024-04-12 09:46:49 -07:00
Yifan Li	9577fe454d	[EP Perf] Customize onnx-tensorrt commit id when init CI tasks (#20175 ) ### Description <!-- Describe your changes. --> Customize commit id of onnx-tensorrt in EP Perf CI variables when testing OSS parsers in different versions ### To Verify ![image](https://github.com/microsoft/onnxruntime/assets/109183385/9dc650d8-377d-4223-8951-f0849b1fe984) After assigning `onnxTensorrtCommitId` in EP Perf CI Variables, CI would prompt during the step of [Build latest ORT Image with TensorRT OSS parser](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=438217&view=logs&j=b6bfa4e2-8141-507f-8ca1-59b3f929fa71&t=fc64e110-ab59-54e4-1c37-853e84a52a7e&l=396450): ``` Updated deps.txt with new commit id a43ce67187bab219520fd80f21af8bbd4354bc8c and hash 572535aefef477050f86744dfab1fef840198035 ``` And CI would [overwrite the line of onnx_tensorrt in deps.txt](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=438217&view=logs&j=b6bfa4e2-8141-507f-8ca1-59b3f929fa71&t=fc64e110-ab59-54e4-1c37-853e84a52a7e&l=396451) which was assigned as: ``` onnx_tensorrt;`a43ce67187`.zip;572535aefef477050f86744dfab1fef840198035 ``` ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> To save time of modifying deps.txt and manually calculating zip hash	2024-04-10 09:46:05 -07:00
Yi Zhang	0acde1157a	Set parallel count to avoid OOM in training GPU packaging pipeline (#20255 ) ### Description Make the compilation work on the Azure CPU agent by reducing the parallel count. ### Motivation and Context The OOM issue mentioned in #20244 was caused by the low memory/parallel_count.	2024-04-10 14:05:53 +08:00
Yi Zhang	14d7872ce9	Reuse T4 for Cuda12.2 training packaging pipeline. (#20244 ) ### Description It always has been out of memory in training CUDA 12.2 packaging pipeline https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1308&_a=summary since the PR #19910 I tried other CPU agents for example, D64as_v5(256G memory) and D32as_v4(128G memory and 256 G SSD temp storage), which are still out of memory like the below image ![image](https://github.com/microsoft/onnxruntime/assets/16190118/5acde9ef-674f-4b6d-a1b3-b54647645083) But it works on T4, though T4 only has 4 vCPUs, 28G memory and 180G temp storage, and it takes much more time. ### Motivation and Context Restore CUDA 12.2 training packaging pipeline first. More time is needed to investigate the root cause ### Other Clues. These 2 compilation steps take nearly 6 minutes with Cuda 12.2 on T4 And it runs out of memory on CPU machine. @ajindal1 cuda12.2 on T4 ``` 2024-03-14T05:39:08.7726865Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o 2024-03-14T05:45:01.3223393Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o 2024-03-14T05:46:07.9218003Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim96_fp16_sm80.cu.o 2024-03-14T05:52:59.2387051Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/group_query_attention_impl.cu.o ``` But they could be finished in about one minute with Cuda 11.8 on CPU ``` cuda11.8 on CPU 2024-04-09T11:34:35.0849836Z [ 90%] Building CUDA object 
CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o 2024-04-09T11:35:53.6648154Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o cuda11.8 on GPU 024-03-13T12:16:33.4102477Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o 2024-03-13T12:19:58.8268272Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o ```	2024-04-10 09:21:40 +08:00
Adrian Lizarraga	05d97e8d18	Update QNN python packages to use QNN SDK version 2.19.2 (#20213 ) ### Description Update QNN python packages to use QNN SDK version 2.19.2. ### Motivation and Context Our CI builds already use QNN SDK version 2.19.2. We should make sure the ort-nightly-qnn python packages are also built with the same QNN SDK version.	2024-04-05 17:15:25 -07:00
Yi Zhang	23a5d0a305	Extend time out in Windows GPU packaging jobs (#20207 ) ### Description Extend the Windows GPU packaging job build timeout to 6 hours, and the test stage timeout to 3 hours. ### Motivation and Context There are still a few timeout issues after refactoring. The probability is about 20% in https://dev.azure.com/aiinfra/Lotus/_build?definitionId=84. I found that the build can finish in 4 hours when it becomes slow, https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=434340&view=logs&j=0c6ee496-b38e-55a9-3699-12934156e90f, although in most cases it only takes about 30 minutes. This is unlike before, when the build could not complete at all. So, in this PR, I extend the timeout to 6 hours. One interesting thing: if one Windows GPU job becomes slow, all other Windows GPU jobs in the same run become slow too. So I suspect it has something to do with ADO or virtualization. That is, it's not completely random. https://dev.azure.com/aiinfra/Lotus/_build?definitionId=841	2024-04-06 08:03:42 +08:00
Yi Zhang	4ea54b82f9	[Fix] Upload training CUDA daily wheel (#20183 )	2024-04-03 13:18:26 +08:00
Yi Zhang	523ef04240	enable lto in Python-CUDA-Packaging Pipeline (#20164 ) ### Description Except for the [Python-CUDA-Packaging pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1299&_a=summary), all Windows CUDA packaging jobs have been running well now. A comparison shows that enable_lto is not set in that pipeline, which might be one root cause of the random hang.	2024-04-01 15:42:28 +08:00
Jeff Bloomfield	2f31560430	Enable generic feature level devices in DML EP (#20114 ) ### Description Enable NPUs supporting DXCORE_ADAPTER_ATTRIBUTE_D3D12_GENERIC_ML and D3D_FEATURE_LEVEL_1_0_GENERIC with DML EP. This also begins ingesting DX headers through the DirectX-Headers repo. Note that this includes an update to cgamanifest.json for onnx-tensorrt, which is triggered during re-generation due to prior changes to deps.txt.	2024-03-29 14:37:30 -07:00
Adam Pocock	2f82400b13	[java] Java 21 build support (#19876 ) ### Description Bump spotless and the Gradle wrapper to 6.25.0 and 8.6 respectively to allow compiling ORT on Java 21. The build still targets Java 8. I'm not sure if there will be CI changes necessary to use this PR, specifically for the Gradle version as I don't know if that is cached somewhere earlier in the CI build process. The new Gradle version adds a warning that using `--source` and `--target` to select the Java language version is obsolete which is annoying, we can fix it if we decide to only allow building on newer versions of Java, while still supporting running on Java 8. ### Motivation and Context Java 21 is the latest LTS release of Java and ORT should be able to build on it.	2024-03-28 15:51:22 -07:00
Yi Zhang	f7b52d2e3e	[Fix] Only copy java files when build_java is True (#20121 ) ### Description ### Motivation and Context Fix error in Nuget-CUDA-Packaging-Pipeline	2024-03-28 14:06:28 -07:00
Yi Zhang	c5d7310f1b	Remove TSA upload in testing stage (#20115 ) --------- Co-authored-by: Yi Zhang <your@email.com>	2024-03-28 13:15:03 +08:00
Yi Zhang	8f069f81c4	Split more windows GPU workflow into 2 stages, building and testing, to make them more stable (#20080 ) ### Description Refactor win-ci.yml to solve the random hang issue in more GPU workflows; move nuget-zip package and Python CUDA 12 package building to a CPU machine. --------- Co-authored-by: Yi Zhang <your@email.com>	2024-03-28 12:55:44 +08:00
Dmitri Smirnov	b95fd4e644	Enable CUDA EP unit testing on Windows (#20039 ) ### Description Address build issues and source code discrepancies. Fix cuda_test_provider gtest argument stack corruption. ### Motivation and Context The `OpTester` class that is widely used for kernel testing is not suitable for testing internal classes for EPs that are built as shared objects. Currently, CUDA EP tests run only on Linux. We want to enable testing and development on Windows, and create a usable pattern for testing other EPs' internals. Alternatives considered: Abstracting EP unit tests into a separate test executable such as `onnxruntime_test_all`. This alternative was rejected as it would create a lot more changes in the established patterns, and potentially interfere with CUDA functionality with more complex source code maintenance.	2024-03-27 13:32:36 -07:00
Yi Zhang	ab2eaedfaa	Install ONNX by building source code in Windows DML stage (#20079 ) ### Description In #20073, I pinned the onnx version to unblock the whole PR CI. In fact, we can use the onnx installed by building from source code, whose version is controlled by deps.txt. For historical reasons, the DML stage installed onnx from PyPI. Now onnx can be installed the same way as in other stages. Add an option to skip installing onnx in win-ci-prebuild-step.	2024-03-27 12:29:34 -07:00
Yi Zhang	4df9d16f98	[Fix] TSAUpload task must be in building stage (#20098 ) ### Description In #20085, TSAUpload was in testing stage so main branch failed.	2024-03-27 12:20:57 -07:00
Yulong Wang	47903e701a	fix condition in web CI YAML (#20095 ) ### Description fix condition in web CI YAML	2024-03-27 10:35:43 -07:00
Yi Zhang	0561b9576e	Fix and Refactor Python Packaging Pipeline (#20085 ) ### Description Make Windows GPU Packaging stage in Python Packaging pipeline run on CPU machine as well ### Test Link https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=430961&view=results	2024-03-27 12:17:22 +08:00
Yulong Wang	0313dd1f65	Update Web CI to use data dir under Agent.TempDirectory (#20074 ) ### Description Update Web CI to use data dir under Agent.TempDirectory This change fixes the random failure caused by unstable access to karma temp directory (which is under AppData\Local\Temp) on CI pipeline	2024-03-26 13:16:59 -07:00
Baiju Meswani	40efbd6c37	Fix training and macos ci pipelines (#20034 )	2024-03-26 12:20:11 -07:00
sfatimar	eab35c20fc	Ort openvino npu 1.17 master (#19966 ) ### Description Add NPU to list of device supported. Added changes for Support to OV 2024.0 Nuget packages removes packaging of OpenVINO DLL Bug Fixes with Python API Reverted Dockerfiles not being maintained. ### Motivation and Context NPU Device has been introduced by Intel in latest client systems OpenVINO 2024.0 release is out. --------- Co-authored-by: Suryaprakash Shanmugam <suryaprakash.shanmugam@intel.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: Ubuntu <ubuntu@ubuntu-118727.iind.intel.com> Co-authored-by: hmamidix <hemax.sowjanya.mamidi@intel.com> Co-authored-by: vthaniel <vishnudas.thaniel.s@intel.com> Co-authored-by: saurabhkale17 <saurabh1.kale@intel.com>	2024-03-21 18:44:00 -07:00
Yi Zhang	cd6d3aea45	Refactor Python CUDA packaging pipeline to fix random hangs in building (#19989 ) ### Description 1. Move building to a CPU machine. 2. Optimize the pipeline. 3. Since there isn't an official ONNX package for Python 3.12, the Python 3.12 test stage uses the packages built from ONNX source in the build stage. ### Motivation and Context 1. Resolve the random hang in compilation. 2. Save a lot of GPU resources. ---------	2024-03-22 09:16:00 +08:00
Yi Zhang	30a0d80925	Fix exception in Publish unit test results step (#20007 ) ### Description Test result files are all in the RelWithDebInfo\RelWithDebInfo directory. It's not necessary to stat the _deps directory. ### Motivation and Context Recently this exception in the zip-nuget pipeline has occurred many times. `##[error]Error: Failed find: EPERM: operation not permitted, stat 'D:\a\_work\1\b\RelWithDebInfo\_deps\flatbuffers-src\java\src\test\java\DictionaryLookup'` https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=426981&view=logs&j=75fc0348-fe99-522b-3acb-90fd80ac5271&t=5d4ebcc1-bcde-574d-6f4e-8abd0f04ae4b	2024-03-22 06:53:59 +08:00
Yi Zhang	175f149b30	Remove downloading deps in CUDA package test stage (#19993 ) ### Motivation and Context Downloading deps is not needed in the test stage. Remove it to reduce random download errors.	2024-03-21 10:01:03 +08:00
Yufeng Li	15219e2e71	turn on neural_speed by default (#19627 ) ### Description The crash caused by neural_speed turns out to be a corner case. Turn it on by default.	2024-03-20 12:49:58 -07:00
Rachel Guo	6b305f95e0	Support xcframework for mac catalyst builds. (#19534 ) ### Motivation and Context MAUI on macOS uses mac-catalyst which requires a different native binary. --------- Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Scott McKay <skottmckay@gmail.com>	2024-03-20 10:55:19 -07:00
Yi Zhang	8adbc09314	[Fix] Error Python Packaging Pipeline (Training CPU) (#19992 ) ### Description fix the error caused by https://github.com/microsoft/onnxruntime/pull/19973	2024-03-20 09:02:50 -07:00
mindest	3dfe4a5e6d	[ROCm] Remove MPI dependency and collectives to use NCCL (#19830 ) ### Description * Remove MPI dependency to use NCCL AllReduce, etc. * Exclude unsupported collectives in hipify	2024-03-19 17:35:18 -07:00
Hariharan Seshadri	cd6ec50b50	Switch a portion of CI/packaging jobs to MacOS12 (#19908 )	2024-03-19 14:54:58 -07:00
Yi Zhang	d4c8bc359e	Fix Training CPU docker image name to avoid unnecessary rebuilding (#19973 ) ### Description The docker image name was fixed, but the docker arguments differed between jobs, which triggered a rebuild of the docker image almost every time.	2024-03-19 09:33:24 -07:00
Yulong Wang	b29849a287	[js/common] fix typedoc warnings (#19933 ) ### Description Fix a few warnings in typedoc (for generating JS API): ``` [warning] The signature TrainingSession.loadParametersBuffer has an @param with name "buffer", which was not used. [warning] NonTensorType, defined in ./lib/onnx-value.ts, is referenced by OnnxValue but not included in the documentation. [warning] TensorFactory, defined in ./lib/tensor-factory.ts, is referenced by Tensor but not included in the documentation. [warning] ExternalDataFileType, defined in ./lib/onnx-model.ts, is referenced by InferenceSession.SessionOptions.externalData but not included in the documentation. [warning] TensorToDataUrlOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toDataURL.toDataURL.options but not included in the documentation. [warning] TensorToImageDataOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toImageData.toImageData.options but not included in the documentation. [warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.adapter. [warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.device. ``` Changes highlighted: - Merge `CoreMlExecutionProviderOption` and `CoreMLExecutionProviderOption`. They expose 2 different sets of options for React Native and the ORT Node.js binding. This should be fixed in the future. - Fix a few name inconsistencies between JSDoc and parameters - Fix broken type links - Exclude trace functions	2024-03-15 19:01:50 -07:00
Yifan Li	0b2a75b274	[EP Perf] Add concurrency test (#19804 ) ### Description * Add concurrency test to EP Perf CI panel (impl. by onnx_test_runner) * Model: FasterRCNN-10 model within CI image * `-c` param configurable via CI panel when kicking off CI tasks * Auto-replicate test input/outputs according to `-c` param * By default, the model test will be executed in 100 iterations (~2min added to T4 CI task load overall) ### Motivation and Context To monitor potential concurrency issues of ORT-TRT	2024-03-15 07:41:21 -07:00
Justin Chu	bcf47d3546	Update install_deps_lort.sh to fix onnxscript installation (#19922 ) Install onnxscript correctly with `pip install`. Dev dependencies are not required. ### Motivation and Context Fix build breaks.	2024-03-14 17:05:50 -07:00
Adam Louly	32558134a9	[On-Device-Training] Upgrade Flatbuffers to Support 2GB+ Checkpoints. (#19770 ) ### Description Modifications to support 2GB+ checkpoint & Upgrading Flatbuffers ### Motivation and Context This PR includes changes that will make ort handle 2GB+ checkpoints. To do that we need to upgrade flatbuffers to 23.5.9 - https://github.com/google/flatbuffers/pull/7945 - Modified the commitHash and the hash for the new version - Removed the patch for rust generator's unused variable warning as it is no longer producing this - [Check it out here](`d121e09d89/src/idl_gen_rust.cpp`) - Updated the VerifyField calls with alignment values that were introduced in the new version. --------- Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>	2024-03-14 16:36:24 -07:00
Yi Zhang	87a9f77c56	Refactor Python Packaging Pipeline (Training Cuda 11.8) (#19910 ) ### Description 1. Use stages to organize the pipeline and split building and testing. 2. Move compilation to a CPU machine. 3. The test stage can leverage existing artifacts. 4. Check the wheel size; a warning is given if the size is above 300M. 5. The docker image name wasn't changed even when the argument changed, which caused the docker image to always be rebuilt. Updating the docker image name according to the argument saves docker build time. Pipeline duration reduced by 60% (2 hours -> 50 minutes). Compilation time reduced by 75% (1.5 hours -> 20 minutes). GPU time reduced by 87% (8 hours -> 1 hour). For debugging, the GPU time could be reduced by more than 95%, because we can choose to run only one test stage and skip building. ### Motivation and Context Make the pipeline efficient. Optimized https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=424177&view=results Current https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=422393&view=results ---------	2024-03-15 06:47:41 +08:00
Changming Sun	8b766bd24e	Change nuget pipeline's "Windows_Packaging_combined_GPU" job to download TRT binaries in every build (#19919 ) ### Description Change nuget pipeline's "Windows_Packaging_combined_GPU" job to download TRT binaries in every build. Now all the other build jobs are already doing this. This is the only one left. Similar to #19909 ### Motivation and Context As a follow up of #19118	2024-03-14 15:07:56 -07:00
Changming Sun	ea4a5eea18	Change nuget pipeline's "Final_Jar_Testing_Windows_GPU" job to download TRT binaries in every build (#19909 ) ### Description Change nuget pipeline's "Final_Jar_Testing_Windows_GPU" job to download TRT binaries in every build. Now all the other build jobs are already doing this. This is the only one left. ### Motivation and Context As a follow up of #19118	2024-03-14 07:55:00 -07:00
Yulong Wang	e771a763c3	[js/test] align web test runner flags with ort.env (#19790 ) ### Description the `npm test` flags are difficult to memorize, because they are different to the `ort.env` flags. This change makes those flags align with ort JS API. eg. `--wasm-enable-proxy` became `--wasm.proxy`. Old flags are marked as deprecated except `-x` (as a shortcut of `--wasm.numThreads`)	2024-03-13 12:00:36 -07:00

1975 commits