onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-31 23:27:43 +00:00

Author	SHA1	Message	Date
Yi Zhang	bd95a8ea77	update onnxruntime-gpu-winbuild-T4 to onnxruntime-Win2022-GPU-T4 (#16838 ) ### Description ### Motivation and Context It's also used to upgrade Visual Studio to VS2022. onnxruntime-gpu-winbuild-T4 and onnxruntime-gpu-tensorrt8-winbuild-t4 use an image based on one dev branch and VS2019. To avoid breaking the current CIs, we move jobs running on onnxruntime-gpu-winbuild-T4/onnxruntime-gpu-tensorrt8-winbuild-t4 to onnxruntime-Win2022-GPU-T4.	2023-07-27 08:38:20 -07:00
Wang, Mengni	fe463d4957	Support SmoothQuant for ORT static quantization (#16288 ) ### Description Support SmoothQuant for ORT static quantization via Intel Neural Compressor. > Note: Please use neural-compressor==2.2 to try the SmoothQuant function. ### Motivation and Context For large language models (LLMs) with gigantic parameters, systematic outliers make quantization of activations difficult. As a training-free post-training quantization (PTQ) solution, SmoothQuant offline migrates this difficulty from activations to weights with a mathematically equivalent transformation. Integrating SmoothQuant into ORT quantization can benefit the accuracy of INT8 LLMs. --------- Signed-off-by: Mengni Wang <mengni.wang@intel.com>	2023-07-26 18:56:45 -07:00
Justin Chu	0c1a5098dc	Disable PERF* rules in ruff to allow better readability (#16834 ) ### Description Disable two PERF* rules in ruff to allow better readability. Rationale is commented inline. This change also removes the noqa directives that became unused because of the rule change. ### Motivation and Context Readability	2023-07-25 15:38:22 -07:00
Edward Chen	e01365f80b	Update upload_pod_archive_and_update_podspec.sh to take path pattern (#16810 ) Update upload_pod_archive_and_update_podspec.sh to take a pod archive path glob pattern. The actual pod archive path has a version suffix which changes.	2023-07-25 08:55:31 -07:00
Yi Zhang	38db5eca65	replace onnxruntime-Win-CPU-2019 with onnxruntime-Win-CPU-2022 (#16844 ) ### Description ### Motivation and Context upgrade to VS2022	2023-07-25 23:05:34 +08:00
Yi Zhang	f88f0d8e36	Upgrade 4 stages in nuget pipeline to VS2022 (#16825 ) ### Description ### Motivation and Context Continue upgrading to VS2022 ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331377&view=results N.B. In practice, SDLNativeRules@3 doesn't support VS2019.	2023-07-25 14:22:39 +08:00
Yulong Wang	8b30dc11d7	Update run_CIs_for_external_pr.py to skip passed checks (#16808 ) ### Description Update run_CIs_for_external_pr.py to skip passed checks	2023-07-25 16:11:53 +10:00
PeixuanZuo	8ede2f139e	[ROCm] Optimize ROCm CI pipeline 2 (#16691 ) - Set `KERNEL_EXPLORER_TEST_USE_CUPY=1` to replace numpy with cupy on kernel explorer test. KERNEL_EXPLORER_TEST_USE_CUPY=0 The CPU utilization is shown as below: ![image](https://github.com/microsoft/onnxruntime/assets/94887879/91724b78-0b4e-4cbd-ad88-83cad9976472) KERNEL_EXPLORER_TEST_USE_CUPY=1 The CPU utilization is shown as below: ![image](https://github.com/microsoft/onnxruntime/assets/94887879/58239911-667c-4d5f-bb78-deca60d0266f) - Use `Bash@3`. - Update shell script.	2023-07-24 13:57:48 +08:00
Chi Lo	21ef14476b	Bug fix for nested control flow ops for TRT EP (#16343 ) The current TRT EP supports models with nested control flow ops (multiple levels of subgraphs), but it fails when a subgraph uses an outer scope value that is defined several levels up in the top-level graph; in this case, the outer scope value is an input of the top-level graph. The outer scope values are not properly handled during the TRT EP's subgraph reconstruction stage, which fails at `graph.resolve()`. ORT gets capability from EPs bottom-up, meaning the innermost subgraph is handled first. The TRT EP reconstructs each subgraph level by level, and the following modifications fix the outer scope values issue: - `SetGraphOuterScopeValuesAndInputs()` and `SetAllGraphInputs()` are added to handle outer scope values and add those values as graph inputs if needed in order to make `graph.resolve()` happy. - Change to use `GetNodeArgIncludingParentGraphs` so that when creating the fused TRT node for some subgraphs in `Graph::CreateFusedSubGraphNode()`, it can get the NodeArgs for outer scope values from the top-level graph. This PR fixes https://github.com/microsoft/onnxruntime/issues/16217	2023-07-23 16:16:17 -07:00
Yi Zhang	3252ff2cb7	Change DML GPU pool in Windows GPU workflow use Visual Studio 2022 (#16784 ) ### Description 1. use the pool with VS2022 2. upgrade System.Memory to 4.5.5 ### Motivation and Context Solve the build error while using VS2022: `[Failure] Msbuild failed when processing the file 'D:\a\_work\1\s\csharp\src\Microsoft.ML.OnnxRuntime\Microsoft.ML.OnnxRuntime.csproj' with message: Method not found: 'System.ReadOnlySpan`1<Char> Microsoft.IO.Path.GetFileName(System.ReadOnlySpan`1<Char>)'` Ref: https://stackoverflow.com/questions/73399777/azure-build-failing-due-to-method-not-found-system-readonlyspan1char-micros	2023-07-23 10:07:21 +08:00
Justin Chu	d79515041c	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789 ) Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #16789 Bump ruff to 0.0.278 and fix new lint errors. I added noqa to all existing RUF012 errors which requires mutable class variables to be annotated with `ClassVar`, as well as all PERF issues. Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-07-21 12:53:41 -07:00
Baiju Meswani	538d2412ef	Objective-C Add Support to Create and Query String ORTValues (#16764 ) This pull request contains a few changes: 1. Adds support for string ort values. 2. Fixes the training minimal build (that was broken with #16601) by putting custom op registration behind #ifdefs 3. Fixes the iOS pod package generation (that was again broken with #16601) by explicitly providing paths to be copied during pod creation.	2023-07-20 17:39:29 -07:00
Adrian Lizarraga	a8c263f92c	[QNN EP] Update QNN SDK to 2.12 (#16750 ) ### Description - Updates the default QNN SDK to 2.12 for CI pipelines - Adds a disabled InstanceNormalization test for regression on QNN SDK 2.12 - Cleans up logs for unsupported ops. ### Motivation and Context Test with the latest QNN SDK.	2023-07-20 16:22:14 -07:00
Xavier Dupré	2bc9fbb621	Fix url in the code documentation (graph optimizations) (#16770 ) ### Description Fix a wrong url in the documentation as mentioned in issue #16678. ### Motivation and Context Better documentation.	2023-07-20 07:02:22 -07:00
Yi Zhang	c314d7724f	Update dml gpu pool to onnxruntime-Win2022-GPU-dml-A10 (#16765 ) ### Description onnxruntime-Win2022-GPU-dml-A10 is using VS2022. ### Motivation and Context 1. Upgrade VS2019 to VS2022 to fix prefast issue.	2023-07-20 16:52:13 +08:00
Edward Chen	fc1f463ff1	[ios] Enable training package in packaging pipeline (#16683 ) Build iOS training package in packaging pipeline. Refactor iOS packaging pipeline to build different package variants in parallel.	2023-07-19 19:55:00 -07:00
saurabh	24566058b3	ovep dockerfile and wheel docs changes (#16482 ) ### Description This PR includes changes to the documentation in the _readmeOV.rst_ file and changes to the dockerfile that enable building ORT with the latest OpenVINO 2023.0.0. ### Motivation and Context Modified the dockerfile to incorporate the latest version of OpenVINO (2023.0.0) for building Onnxruntime. The changes in the PR aim to improve the overall user experience by providing accurate and up-to-date documentation while leveraging the latest OpenVINO 2023.0.0	2023-07-19 09:01:09 -07:00
Scott McKay	ad90352a68	Add MAUI test app that can be used to test model loading and performance (#16658 ) ### Description MAUI test app with tooling to add model and generated or provided input test data. The app will load the model and validate the output. It can also run a specified number of iterations to provide basic performance information. <img width="401" alt="image" src="https://github.com/microsoft/onnxruntime/assets/979079/daf3af13-fb22-4cbb-9159-486b483a7485"> ### Motivation and Context Primarily to make it easier to test an arbitrary model on iOS. A MAUI app allows testing on all platforms. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-07-18 08:21:18 +10:00
Edward Chen	df8843c4a7	Upgrade old Python version in packaging pipeline (#16667 ) - Upgrade from Python 3.6 to 3.8 in packaging pipeline. - Raise build.py minimum required Python version.	2023-07-17 08:24:47 -07:00
Adrian Lizarraga	19169afe30	[QNN EP] Add option to skip unit tests in the QNN NuGet packaging pipeline (#16164 ) Add option to skip unit tests in the QNN NuGet packaging pipeline.	2023-07-14 10:52:05 -07:00
Yi Zhang	36b121d8c2	add more check to Web CI on cache restore (#16689 ) ### Description ### Motivation and Context Make sure the data is correct.	2023-07-14 10:00:13 +08:00
Scott McKay	a3fc04ba74	Fix CodeCoverage pipeline (#16684 ) ### Description Delete second reference to onnxruntime_api_tests_without_env in the code coverage commands. One was removed in #16373 and the duplicate wasn't noticed. ### Motivation and Context Fix pipeline.	2023-07-14 07:47:04 +10:00
PeixuanZuo	ebc311365b	[ROCm] Optimize ROCm CI to reduce time (#16620 ) This PR mainly optimizes the ROCm CI tests to reduce time and CPU utilization. - use smaller batch size on strided_batched_gemm/batched_gemm test - disable cpu training test - fix occasional test_e2e_padding_elimination failures on ROCm.	2023-07-13 10:58:03 +08:00
Yi Zhang	f3b40abe29	Use pipeline cache to cache onnx node test data. (#16659 ) ### Description Use pipeline cache instead of reading data from the image. ### Motivation and Context 1. To reduce the browser dependency of custom image. 2. The onnx node test data is less than 30M and the cache download time is very short.	2023-07-13 09:26:27 +08:00
PeixuanZuo	596dbe277e	[ROCm] add upgrade to fix security issue (#16668 )	2023-07-12 17:57:18 +08:00
Edward Chen	1b8d5c43c2	Fix builds (#16646 ) - Fix some more `shorten-64-to-32` warnings - Move minimum build.py Python version back to 3.6	2023-07-11 19:21:25 -07:00
Scott McKay	ce68a4c06a	Fix Linux build failure when onnxruntime_DISABLE_ABSEIL=ON (#16373 ) ### Description Add ort_value.h to session_options.h so OrtValue is defined. Update a unit test binary to add required include paths. Adding ort_value.h pulls in more data type headers. ### Motivation and Context #16193	2023-07-12 11:23:18 +10:00
mindest	347c963d5c	[ROCm] Add ROCm Triton TunableOp for GroupNorm (#16196 ) ### Description - Refactor existing Triton TunableOp-related code (based on work in #15862) - Add GroupNorm Triton implementation	2023-07-11 13:55:30 +08:00
Yulong Wang	5b6c1394cb	[js/test] CI: use pre-downloaded testdata in image (#16562 ) ### Description update web CI to use pre-downloaded testdata in image	2023-07-10 22:22:14 -07:00
PeixuanZuo	2fd5e1cc39	[ROCm] fix shell bug (#16641 ) With `set -ex`, the script exits when `grep` doesn't match any string, because `grep` then returns a non-zero status.	2023-07-10 17:31:27 +08:00
PeixuanZuo	cb4bf4f5c8	[ROCm] Move ROCm build step on CPU only machine (#16596 ) - Move ROCm build step on CPU only machine - Add the performance data of the huggingface bert-large model on the MI200 - At the beginning of the test step, check the agent's GPU usage and kill the threads occupying the GPU, which may be left over from previous tasks that exited abnormally. - Use different docker images during the build and test steps. The difference is the `uid` and `user` when build docker image and create docker container.	2023-07-10 11:55:10 +08:00
Xavier Dupré	47a0289ee6	[CI] Removes type2 in process_registration and fix Windows GPU Reduced Ops CI Pipeline (#16530 ) ### Description Windows GPU Reduced Ops CI Pipeline is broken due to the introduction of a second template type in registered kernels. The python code checking the registration is broken due to that. This PR addresses this issue on the python side by keeping only one type equal to the concatenation of the two types.	2023-07-07 18:21:06 +02:00
Edward Chen	6be7b03e53	Enable `-Wshorten-64-to-32` warning if available. (#16524 ) - Fix some warnings from Xcode build (`-Wshorten-64-to-32`). - Enable `-Wshorten-64-to-32` warning if available. Currently it's not fully enabled for `onnxruntime_test_all` and `onnxruntime_providers_xnnpack` yet. - Some clean up in build.py including setting CMake generator more consistently.	2023-07-07 08:11:44 -07:00
Edward Chen	e22b0836e7	[objc] Update docs and fix static analysis build (#16617 ) - Update some documentation comments. - Use onnxruntime_training.h as the umbrella header so training API docs are included in generated docs. - Fix static analysis build.	2023-07-07 07:58:54 -07:00
Scott McKay	697dd12f6e	Re-organize the transpose optimization and layout transformation files. (#16246 ) ### Description Split out the more basic changes from #15552 for easier review. Re-organize to clarify the structure - Separate out generic base functionality from ORT specific components - pass in handlers for internal ORT ops to Optimize - Split out layout transformation from transpose optimization - Separate out level 1 transpose optimizer - Cleanup some naming to try and clarify things like an optimizer vs. general optimization code Most of the changes are from this movement of code. Two implementation changes: - the extended handlers are queried first in GetHandler - allows the extended handlers to override the default behaviour for an ONNX operator - simplify the Optimize function to remove OptimizerMode. - `can_modify_node` is used instead of `mode` and `ignore_assigned_nodes` and a long description of the current usage is added. I don't _think_ that changes the current behavior and hopefully clarifies what happens and when, and makes the base transpose optimizer implementation more generic. ### Motivation and Context Create a cleaner separation to support adding EP specific logic next, to cleanly handle cases where an EP has additional layout sensitive behaviour required (e.g. its Resize implementation only handles one layout).	2023-07-07 08:24:47 +10:00
Yi Zhang	fed08e070a	Add compiler cache in linux wasm build (#16579 ) ### Description Add compiler cache in wasm build to accelerate web ci ### Motivation and Context It could reduce the pipeline duration by 30 minutes. web ci could be completed in 2 hours with cache. https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1053219&view=results	2023-07-06 06:58:48 +08:00
Vrajang Parikh	fd8ad9b950	Enable iOS packaging for training (#16525 ) ### Description Enable support for building iOS packages/CocoaPods with training API - Add `Training` Package variant and config files in current iOS packaging utilities to enable creation of training packages ### Motivation and Context This PR introduces new `Training` variant in `build_and_assemble_ios_pods.py` script which allows creating pods for iOS with training API enabled. The sample script to build training pods: ``` python3 tools/ci_build/github/apple/build_and_assemble_ios_pods.py --variant Training \ --build-settings-file tools/ci_build/github/apple/default_full_ios_training_framework_build_settings.json \ -b=-- path_to_protoc_exe=<path/to/protoc> ``` Note: build settings file should have `--enable_training` as a build parameter. Simply adding training packaging increases the duration of the Azure pipeline for packaging by 70 minutes. To address this issue, we need to parallelize pod creation. In order not to further strain the pipeline, the changes for training packaging will be added in another PR, which optimizes the packaging pipeline. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-07-05 13:27:59 -07:00
PeixuanZuo	e2526714e2	[ROCm] Move MIGraphX build step on CPU only machine (#16582 ) - Move MIGraphX build step on CPU only machine - Use ccache on build step - Not pass host uid into docker build process.	2023-07-05 13:55:28 +08:00
Wei-Sheng Chin	a0a5f57581	[DORT] Use new FX-to-ONNX exporter (#16450 ) The ONNX exporter in DORT has been moved to PyTorch as a formal feature. We therefore switch to consuming the exporter from PyTorch instead of maintaining two duplicates.	2023-07-04 13:13:04 -07:00
PeixuanZuo	d540c7da0f	[ROCm] Add ROCm5.6 to python package pipeline (#16572 ) Add ROCm5.6 to python package pipeline.	2023-07-04 18:18:12 +08:00
pengwa	ac100ebb64	Fix orttraining-ortmodule-distributed CI (#16569 ) ### Fix orttraining-ortmodule-distributed CI pydantic (https://pypi.org/project/pydantic/#history) released version 2.0 on 1st July, and DeepSpeed has a known issue with the newer version (https://github.com/microsoft/DeepSpeed/issues/3280). Fix this by adding a check similar to the one DeepSpeed added in https://github.com/microsoft/DeepSpeed/pull/3290	2023-07-03 13:18:59 +08:00
Scott McKay	2fd25de360	Use verbose logging in Android emulator in React Native CI (#16528 ) ### Description Set emulator logging to verbose to see if it helps with intermittent React Native CI failures when the emulator crashes at startup ### Motivation and Context	2023-06-30 11:51:20 +10:00
Yi Zhang	fb7e1f133f	[Fix] TSA Upload failed in nuget pipeline. (#16476 ) ### Description Partially revert PR #16244. ### Motivation and Context The npm pipeline can't be triggered if the nuget pipeline status is warning. ### Test Run https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=321873&view=logs&s=b17bed5b-cc14-5026-390a-fb2feea063f2	2023-06-28 06:40:49 +08:00
Rachel Guo	892b1b19ea	[js/rn] limit x86_64 arch in detox xcodebuild for react native e2e test (#16460 ) ### Description As title. ### Motivation and Context Works with local onnxruntime-c pod in js/rn/e2e test.	2023-06-27 09:45:04 -07:00
guyang3532	4768ac5f30	Fix onnxruntime-CI-nightly-ort-pipeline Failure (#16495 ) The image for the onnxruntime-CI-nightly-ort-pipeline is too old. The ort package in the image is older than the latest test code in the nightly CI, which causes the nightly CI to fail.	2023-06-27 23:19:23 +08:00
Yi Zhang	6e9541046e	extend react native ci timeout limit (#16469 ) ### Description ### Motivation and Context 2 consecutive runs in the npm pipeline failed due to timeout	2023-06-27 08:44:03 +08:00
Yifan Li	e2c214d81f	[TensorRT EP] TRT 8.6 minor version update (#16475 ) ### Description * Minor version update: TRT 8.6.0.12->8.6.1.6 * CI pipeline ymls/dockerfiles are updated * cgmanifest.json/deps.txt/download-deps.yml are updated; Win trt binaries uploaded to [win img 307029](https://aiinfra.visualstudio.com/AI%20Infra%20Management/_build/results?buildId=307029&view=results) * Re-enable unit tests which failed in 8.6.0 and regained support in 8.6.1	2023-06-26 10:44:27 -07:00
PeixuanZuo	7e211f0e03	[ROCm] Move mount data step into docker container (#16471 ) Some CI jobs may be interrupted unexpectedly without executing the umount data step. The data left on the host device then causes `device or resource busy` and makes subsequent CI jobs fail. With the mount data step moved into the docker container, the host machine is not left occupied when CI jobs exit incorrectly.	2023-06-26 10:25:06 +08:00
Rachel Guo	04dbdc96bf	[js/rn] Fix React Native CI pipeline E2E test (#16447 ) ### Description Based on this kindly provided quick fix: https://github.com/microsoft/onnxruntime/pull/16411 See more description in the above linked pr about bumping AGP version, etc. Also fixed import header file path in detox e2e test. ### Motivation and Context Good build: https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1041757&view=logs&j=de302ec2-2305-57e0-e8c6-cd89c569f2a3&t=9894c870-b8ce-548d-51ff-8f44d21a4117&l=18	2023-06-22 14:33:49 -07:00
Yi Zhang	8e8840f1de	Enable Web CI on Linux (#16419 ) ### Description 1. Enable Web ci on Linux ### Motivation and Context 1. Speed up web ci: the duration can be reduced from 160 minutes to 130 minutes, a time saving of about 20%. The total computation time is 455 minutes now. Moved to Linux, it could be reduced to 336 minutes. 2. It's the first step to enable compilation cache for emscripten 3. per Yulong's request, build_web stages are still using windows pool ![image](https://github.com/microsoft/onnxruntime/assets/16190118/c9496408-74bd-45ea-b4ae-a4dd2a574d17) https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1038382&view=results	2023-06-22 15:42:58 +08:00
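
The SmoothQuant commit above (fe463d4957) describes migrating quantization difficulty from activations to weights "with a mathematically equivalent transformation". As a rough illustration only, the sketch below applies the per-channel scaling from the SmoothQuant method, `s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)`, in plain Python. It is not the neural-compressor or ORT implementation; the function names, the toy matrices, and the choice `alpha=0.5` are assumptions made for this example, and every channel is assumed to have a non-zero maximum.

```python
# Illustrative sketch of SmoothQuant-style smoothing; NOT the neural-compressor code.
# All names and values here are hypothetical and chosen for the example.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def smooth(activations, weights, alpha=0.5):
    """Rescale input channels so that X @ W is unchanged.

    activations: tokens x channels, weights: channels x outputs.
    Assumes every channel has a non-zero maximum in both matrices.
    """
    n_channels = len(weights)
    act_max = [max(abs(row[j]) for row in activations) for j in range(n_channels)]
    w_max = [max(abs(v) for v in weights[j]) for j in range(n_channels)]
    # Per-channel smoothing factor s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).
    scales = [(a ** alpha) / (w ** (1 - alpha)) for a, w in zip(act_max, w_max)]
    # Divide activation channel j by s_j and multiply weight row j by s_j.
    smoothed_x = [[row[j] / scales[j] for j in range(n_channels)] for row in activations]
    smoothed_w = [[v * scales[j] for v in weights[j]] for j in range(n_channels)]
    return smoothed_x, smoothed_w, scales


# Toy data: the second activation channel carries the outliers.
X = [[1.0, 64.0], [-2.0, 48.0]]
W = [[0.5, -0.25], [0.125, 0.5]]

X_s, W_s, s = smooth(X, W)
reference = matmul(X, W)
smoothed = matmul(X_s, W_s)
```

Because each activation channel is divided by the same factor that multiplies the matching weight row, `smoothed` equals `reference` up to floating-point error, while the largest activation magnitude in the outlier channel shrinks.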


2046 commits
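
Commit d79515041c above mentions ruff's RUF012 rule, which asks for mutable class variables to be annotated with `ClassVar`. The sketch below shows what such an annotation looks like and why the distinction matters. `OpRegistry` and its attributes are hypothetical names invented for this example and do not come from the onnxruntime code base, where the commit says `noqa` was added to existing violations instead.

```python
from typing import ClassVar, Dict, List


class OpRegistry:
    """Hypothetical class used only to illustrate the ClassVar annotation."""

    # Mutable values shared by every instance: annotating them with ClassVar
    # makes the sharing explicit, which is what RUF012 asks for.
    supported_ops: ClassVar[List[str]] = ["Add", "Mul"]
    aliases: ClassVar[Dict[str, str]] = {}

    def __init__(self, name: str) -> None:
        self.name = name
        # Per-instance mutable state is created in __init__ instead.
        self.local_ops: List[str] = []


first = OpRegistry("first")
second = OpRegistry("second")

first.supported_ops.append("Relu")  # mutates the list shared by the class
first.local_ops.append("Gemm")      # mutates only this instance's list
```

After these calls `second.supported_ops` also contains `"Relu"`, while `second.local_ops` stays empty. That accidental sharing is the kind of surprise the annotation is meant to make visible.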