onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-29 23:06:41 +00:00

Author	SHA1	Message	Date
Yifan Li	d6ce43db5e	[EP Perf] MemTest: Add Valgrind and fix addressSanitizer (#16930 ) ### Description 1. Add valgrind to existing ep_perf CI MemTest and parse ORT-TRT memLeak details 1. General Valgrind logs and logs related to ORT-TRT will be parsed in [CI artifacts](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=334122&view=artifacts&pathAsName=false&type=publishedArtifacts) 1. Logic: 1. Run valgrind with `onnxruntime-perf-test -e tensorrt` and export log to `valgrind.log` 2. Identify if any `definitely lost` memleak happened 1. For log paragraphs which show `definitely lost`, parse if they have keyword `TensorrtExecutionProvider`. 2. If so, extract these details to `ort_trt_memleak_detail.log`, and return `build failure` to EP Perf CI 3. Fix existing addressSanitizer and sync the squeezenet testcase with latest update from [ort-inference-example](https://github.com/microsoft/onnxruntime-inference-examples/blob/main/c_cxx/squeezenet/main.cpp) 1. Updates in short: Upgrade main.cpp to be using OrtTensorRTProviderOptionsV2 4. Reorder the 7-min-MemTest to be ahead of 9-hr-model-tests, and enable MemTest by default	2023-08-04 16:58:57 -07:00
Yulong Wang	5af8774a0b	[build] do init and precheck first (#16961 ) ### Description This change lets Web CI run some checks as the first step, so that if there are errors it won't launch the heavy task of building WebAssembly. Checks include: - "npm ci" in /js, /js/common and /js/web, which implicitly includes: - TypeScript compiler in /js - TypeScript compiler in /js/common - webpack build in /js/common - TypeScript compiler in /js/web - ESLint on TypeScript files - clang-format formatter (.js, .ts, .cc, .h, .mm) - Prettier formatter (.json, .jsonc, .md) --------- Co-authored-by: Caroline Zhu <carolinezhu@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-08-04 16:44:45 -07:00
Yi Zhang	555414f1aa	Set PR trigger rules (#16987 ) ### Description Add a script to insert the trigger rules into the workflow YAML files. As a first step, skip the Windows GPU and Linux GPU workflows when there are only doc changes. ### Motivation and Context Make it easy to skip workflows for doc-only changes. [AB#18201](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/18201)	2023-08-04 08:21:07 -07:00
Edward Chen	06096fcb31	Hardcode xcodebuild destination iOS simulator OS to 16.4. (#16982 )	2023-08-03 14:49:54 -07:00
Dmitri Smirnov	bd4d011142	[C#] Rename unreleased API, add utilities (#16806 ) ### Description 1. Rename OrtValue.FillStringTensorElement to StringTensorSetElementAt. To the API user I think we're conceptually setting the string at an offset in the tensor, which is roughly equivalent to `List<string> list ... list[index] = "value"`. 2. While working on new inference examples, I noticed that I am still inclined to use `DenseTensor` for N-D indexing. Added `GetStrides()` and `GetIndex()` from strides for long dims, so the user can obtain strides and translate N-D indices into a flat index to operate directly on the native `OrtValue` buffers. Expose these functions to the user. 3. Make sure we generate docs for C# public static functions.	2023-08-02 10:06:42 -07:00
Yulong Wang	4a2a248dd7	remove unused comments in mac CI yml file (#16964 ) ### Description remove unused comments in mac CI yml file	2023-08-01 20:52:12 -07:00
Yulong Wang	afac67bcc3	[build] fix the CI pipeline (#16962 ) ### Description There are currently multiple failures blocking the CI pipelines, so this PR combines all of the fixes needed to pass the CI. Any single fix on its own would still fail the CI. Includes: #16960 #16958 Please help make sure this PR gets merged once CI has passed. @snnn @carzh @guschmue Fixed: [AB#18118](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/18118) --------- Co-authored-by: Caroline Zhu <carolinezhu@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-08-01 16:22:45 -07:00
Changming Sun	e412d93b00	Add lsb-release package to android custom build (#16944 ) ### Description Add lsb-release package to android custom build ### Motivation and Context To fix a build issue: /workspace/onnxruntime/tools/ci_build/github/linux/docker/inference/x64/python/cpu/scripts/install_protobuf.sh: line 27: lsb_release: command not found	2023-08-01 11:27:29 -07:00
Yulong Wang	969c95f73f	[js/common] a few fixes/revises to onnxruntime-common (#16853 ) ### Description - enable unit test for js/common in CI - add debug config in js/.vscode/launch.json - enable source map for js/common/test for debugging purposes; add source map files to ignore list - ignore js/common/test folder for npm packaging	2023-08-01 11:17:39 -07:00
Yi Zhang	c4e4b98fb2	replace one pool with onnxruntime-Win2022-GPU-T4 (#16953 ) ### Description replace one pool ### Motivation and Context onnxruntime-gpu-tensorrt8-winbuild-t4 would be deprecated	2023-08-01 21:02:56 +08:00
Changming Sun	73ddba964f	Update the MacOS/Linux build scripts that build/install protobuf from source (#16906 ) ### Description 1. As a follow-up of #16761, this PR allows building ORT on iOS/Android without the need to explicitly specify a protoc path. #16761 is for WASM. This one is for iOS/Android. 2. Update the MacOS/Linux build scripts that build/install protobuf from source. Make them more flexible. Add support for RedHatEnterprise (ubi), which will be needed for upgrading the base image from centos:7 to ubi:8. 3. Update tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile: the Dockerfile's base image has protobuf preinstalled in /usr/local; we should uninstall it to avoid conflicts.	2023-07-31 10:51:48 -07:00
Yi Zhang	28a099fca8	unify the steps of downloading cuda sdk and setup env (#16896 ) ### Description The `%AGENT_TEMPDIRECTORY%\v11.8` is created in azcopy step. So, the set env step should be after the azcopy step. ### Motivation and Context Correct the previous logic Unify the step since multiple jobs are using it.	2023-07-31 10:25:04 -07:00
Scott McKay	21a71d52bd	Enable CodeQL for Android build as per 1CS requirement. (#16875 ) ### Description Split stages for CPU and CPU+NNAPI builds as CodeQL is enabled at the stage level. We run it for CPU+NNAPI as that covers all the Android code. We don't want to run it for both as duplicate issues would be created for a problem in code included in both builds.	2023-07-28 17:54:23 +10:00
Yi Zhang	9f21f694cf	stop support to VS 2019 (#16892 ) ### Description Remove VS 2019 code.	2023-07-28 13:09:35 +08:00
Changming Sun	9dcbcf1d2f	Delete unused files (#16887 ) ### Description These yaml files and docker files are not used by any pipeline. If I am wrong, feel free to submit a PR to restore the wrongly deleted file from git history (git keeps everything forever).	2023-07-27 16:46:09 -07:00
Yi Zhang	bd95a8ea77	update onnxruntime-gpu-winbuild-T4 to onnxruntime-Win2022-GPU-T4 (#16838 ) ### Description ### Motivation and Context It's also used to upgrade visual studio to VS2022. onnxruntime-gpu-winbuild-T4 and onnxruntime-gpu-tensorrt8-winbuild-t4 are using the image based on one dev branch and VS2019 To avoid breaking the current CIs, we move jobs running on onnxruntime-gpu-winbuild-T4/onnxruntime-gpu-tensorrt8-winbuild-t4 to onnxruntime-Win2022-GPU-T4.	2023-07-27 08:38:20 -07:00
Wang, Mengni	fe463d4957	Support SmoothQuant for ORT static quantization (#16288 ) ### Description Support SmoothQuant for ORT static quantization via Intel Neural Compressor > Note: Please use neural-compressor==2.2 to try the SmoothQuant function. ### Motivation and Context For large language models (LLMs) with very large parameter counts, systematic outliers make quantization of activations difficult. As a training-free post-training quantization (PTQ) solution, SmoothQuant migrates this difficulty offline from activations to weights with a mathematically equivalent transformation. Integrating SmoothQuant into ORT quantization can benefit the accuracy of INT8 LLMs. --------- Signed-off-by: Mengni Wang <mengni.wang@intel.com>	2023-07-26 18:56:45 -07:00
Justin Chu	0c1a5098dc	Disable PERF* rules in ruff to allow better readability (#16834 ) ### Description Disable two PERF* rules in ruff to allow better readability. The rationale is commented inline. This change also removes the noqa directives that became unused because of the rule change. ### Motivation and Context Readability	2023-07-25 15:38:22 -07:00
Edward Chen	e01365f80b	Update upload_pod_archive_and_update_podspec.sh to take path pattern (#16810 ) Update upload_pod_archive_and_update_podspec.sh to take a pod archive path glob pattern. The actual pod archive path has a version suffix which changes.	2023-07-25 08:55:31 -07:00
Yi Zhang	38db5eca65	replace onnxruntime-Win-CPU-2019 with onnxruntime-Win-CPU-2022 (#16844 ) ### Motivation and Context Upgrade to VS2022	2023-07-25 23:05:34 +08:00
Yi Zhang	f88f0d8e36	Upgrade 4 stages in nuget pipeline to VS2022 (#16825 ) ### Description ### Motivation and Context Continue upgrading to VS2022 ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331377&view=results N.B. In practice, SDLNativeRules@3 doesn't support VS2019.	2023-07-25 14:22:39 +08:00
Yulong Wang	8b30dc11d7	Update run_CIs_for_external_pr.py to skip passed checks (#16808 ) ### Description Update run_CIs_for_external_pr.py to skip passed checks	2023-07-25 16:11:53 +10:00
PeixuanZuo	8ede2f139e	[ROCm] Optimize ROCm CI pipeline 2 (#16691 ) - Set `KERNEL_EXPLORER_TEST_USE_CUPY=1` to replace numpy with cupy on kernel explorer test. KERNEL_EXPLORER_TEST_USE_CUPY=0 The CPU utilization is shown as below: ![image](https://github.com/microsoft/onnxruntime/assets/94887879/91724b78-0b4e-4cbd-ad88-83cad9976472) KERNEL_EXPLORER_TEST_USE_CUPY=1 The CPU utilization is shown as below: ![image](https://github.com/microsoft/onnxruntime/assets/94887879/58239911-667c-4d5f-bb78-deca60d0266f) - Use `Bash@3`. - Update shell script.	2023-07-24 13:57:48 +08:00
Chi Lo	21ef14476b	Bug fix for nested control flow ops for TRT EP (#16343 ) The current TRT EP supports models that have nested control flow ops (multi-level subgraphs). But it fails in the case where a subgraph has an outer scope value that is defined several levels up in the top-level graph; in this case, the outer scope value is an input of the top-level graph. The outer scope values are not properly handled during TRT EP's subgraph reconstruction stage, so it fails at `graph.resolve()`. ORT gets capability from EPs bottom-up, meaning the innermost subgraph gets handled first. TRT EP reconstructs each subgraph level by level, and the following modifications are made to fix the outer scope values issue: - `SetGraphOuterScopeValuesAndInputs()` and `SetAllGraphInputs()` are added to handle outer scope values and add those values as graph inputs if needed in order to make `graph.resolve()` happy. - Change to use `GetNodeArgIncludingParentGraphs` so that when creating the fused TRT node for some subgraphs in `Graph::CreateFusedSubGraphNode()`, it can get the NodeArgs for outer scope values from the top-level graph. This PR fixes https://github.com/microsoft/onnxruntime/issues/16217	2023-07-23 16:16:17 -07:00
Yi Zhang	3252ff2cb7	Change DML GPU pool in Windows GPU workflow use Visual Studio 2022 (#16784 ) ### Description 1. use the pool with VS2022 2. upgrade System.Memory to 4.5.5 ### Motivation and Context Solve the build error while using VS2022: `[Failure] Msbuild failed when processing the file 'D:\a\_work\1\s\csharp\src\Microsoft.ML.OnnxRuntime\Microsoft.ML.OnnxRuntime.csproj' with message: Method not found: 'System.ReadOnlySpan`1<Char> Microsoft.IO.Path.GetFileName(System.ReadOnlySpan`1<Char>)'` Ref: https://stackoverflow.com/questions/73399777/azure-build-failing-due-to-method-not-found-system-readonlyspan1char-micros	2023-07-23 10:07:21 +08:00
Justin Chu	d79515041c	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789 ) Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #16789 Bump ruff to 0.0.278 and fix new lint errors. I added noqa to all existing RUF012 errors which requires mutable class variables to be annotated with `ClassVar`, as well as all PERF issues. Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-07-21 12:53:41 -07:00
Baiju Meswani	538d2412ef	Objective-C Add Support to Create and Query String ORTValues (#16764 ) This pull request contains a few changes: 1. Adds support for string ort values. 2. Fixes the training minimal build (that was broken with #16601) by putting custom op registration behind #ifdefs 3. Fixes the iOS pod package generation (that was again broken with #16601) by explicitly providing paths to be copied during pod creation.	2023-07-20 17:39:29 -07:00
Adrian Lizarraga	a8c263f92c	[QNN EP] Update QNN SDK to 2.12 (#16750 ) ### Description - Updates the default QNN SDK to 2.12 for CI pipelines - Adds a disabled InstanceNormalization test for regression on QNN SDK 2.12 - Cleans up logs for unsupported ops. ### Motivation and Context Test with the latest QNN SDK.	2023-07-20 16:22:14 -07:00
Xavier Dupré	2bc9fbb621	Fix url in the code documentation (graph optimizations) (#16770 ) ### Description Fix a wrong url in the documentation as mentioned in issue #16678. ### Motivation and Context Better documentation.	2023-07-20 07:02:22 -07:00
Yi Zhang	c314d7724f	Update dml gpu pool to onnxruntime-Win2022-GPU-dml-A10 (#16765 ) ### Description onnxruntime-Win2022-GPU-dml-A10 is using VS2022. ### Motivation and Context 1. Upgrade VS2019 to VS2022 to fix prefast issue.	2023-07-20 16:52:13 +08:00
Edward Chen	fc1f463ff1	[ios] Enable training package in packaging pipeline (#16683 ) Build iOS training package in packaging pipeline. Refactor iOS packaging pipeline to build different package variants in parallel.	2023-07-19 19:55:00 -07:00
saurabh	24566058b3	ovep dockerfile and wheel docs changes (#16482 ) ### Description This PR includes changes to the documentation in the _readmeOV.rst_ file and changes to the dockerfile that enable building ORT with the latest OpenVINO 2023.0.0 ### Motivation and Context Modified the dockerfile to incorporate the latest version of OpenVINO (2023.0.0) for building Onnxruntime. The changes in the PR aim to improve the overall user experience by providing accurate and up-to-date documentation while leveraging the latest OpenVINO 2023.0.0	2023-07-19 09:01:09 -07:00
Scott McKay	ad90352a68	Add MAUI test app that can be used to test model loading and performance (#16658 ) ### Description MAUI test app with tooling to add model and generated or provided input test data. The app will load the model and validate the output. It can also run a specified number of iterations to provide basic performance information. <img width="401" alt="image" src="https://github.com/microsoft/onnxruntime/assets/979079/daf3af13-fb22-4cbb-9159-486b483a7485"> ### Motivation and Context Primarily to make it easier to test an arbitrary model on iOS. A MAUI app allows testing on all platforms. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-07-18 08:21:18 +10:00
Edward Chen	df8843c4a7	Upgrade old Python version in packaging pipeline (#16667 ) - Upgrade from Python 3.6 to 3.8 in packaging pipeline. - Raise build.py minimum required Python version.	2023-07-17 08:24:47 -07:00
Adrian Lizarraga	19169afe30	[QNN EP] Add option to skip unit tests in the QNN NuGet packaging pipeline (#16164 ) Add option to skip unit tests in the QNN NuGet packaging pipeline.	2023-07-14 10:52:05 -07:00
Yi Zhang	36b121d8c2	add more check to Web CI on cache restore (#16689 ) ### Motivation and Context Make sure the data is correct.	2023-07-14 10:00:13 +08:00
Scott McKay	a3fc04ba74	Fix CodeCoverage pipeline (#16684 ) ### Description Delete second reference to onnxruntime_api_tests_without_env in the code coverage commands. One was removed in #16373 and the duplicate wasn't noticed. ### Motivation and Context Fix pipeline.	2023-07-14 07:47:04 +10:00
PeixuanZuo	ebc311365b	[ROCm] Optimize ROCm CI to reduce time (#16620 ) This PR mainly optimizes the ROCm CI tests to reduce time and CPU utilization. - use a smaller batch size in the strided_batched_gemm/batched_gemm tests - disable the CPU training test - fix occasional test_e2e_padding_elimination failures on ROCm.	2023-07-13 10:58:03 +08:00
Yi Zhang	f3b40abe29	Use pipeline cache to cache onnx node test data. (#16659 ) ### Description Use pipeline cache instead of reading data from the image. ### Motivation and Context 1. To reduce the browser dependency of custom image. 2. The onnx node test data is less than 30M and the cache download time is very short.	2023-07-13 09:26:27 +08:00
PeixuanZuo	596dbe277e	[ROCm] add upgrade to fix security issue (#16668 )	2023-07-12 17:57:18 +08:00
Edward Chen	1b8d5c43c2	Fix builds (#16646 ) - Fix some more `shorten-64-to-32` warnings - Move minimum build.py Python version back to 3.6	2023-07-11 19:21:25 -07:00
Scott McKay	ce68a4c06a	Fix Linux build failure when onnxruntime_DISABLE_ABSEIL=ON (#16373 ) ### Description Add ort_value.h to session_options.h so OrtValue is defined. Update a unit test binary to add required include paths. Adding ort_value.h pulls in more data type headers. ### Motivation and Context #16193	2023-07-12 11:23:18 +10:00
mindest	347c963d5c	[ROCm] Add ROCm Triton TunableOp for GroupNorm (#16196 ) ### Description - Refactor existing Triton TunableOp-related code (based on work in #15862) - Add GroupNorm Triton implementation	2023-07-11 13:55:30 +08:00
Yulong Wang	5b6c1394cb	[js/test] CI: use pre-downloaded testdata in image (#16562 ) ### Description update web CI to use pre-downloaded testdata in image	2023-07-10 22:22:14 -07:00
PeixuanZuo	2fd5e1cc39	[ROCm] fix shell bug (#16641 ) With `set -ex`, the script exits when `grep` matches nothing, because `grep` then returns a non-zero status.	2023-07-10 17:31:27 +08:00
PeixuanZuo	cb4bf4f5c8	[ROCm] Move ROCm build step on CPU only machine (#16596 ) - Move the ROCm build step to a CPU-only machine - Add the performance data of the huggingface bert-large model on the MI200 - At the beginning of the test step, check the agent's GPU usage and kill the threads occupying the GPU, which may be left over from previous tasks that exited abnormally. - Use different docker images during the build and test steps. The difference is the `uid` and `user` used when building the docker image and creating the docker container.	2023-07-10 11:55:10 +08:00
Xavier Dupré	47a0289ee6	[CI] Removes type2 in process_registration and fix Windows GPU Reduced Ops CI Pipeline (#16530 ) ### Description Windows GPU Reduced Ops CI Pipeline is broken due to the introduction of a second template type in registered kernels. The python code checking the registration is broken due to that. This PR addresses this issue on the python side by keeping only one type equal to the concatenation of the two types.	2023-07-07 18:21:06 +02:00
Edward Chen	6be7b03e53	Enable `-Wshorten-64-to-32` warning if available. (#16524 ) - Fix some warnings from Xcode build (`-Wshorten-64-to-32`). - Enable `-Wshorten-64-to-32` warning if available. Currently it's not fully enabled for `onnxruntime_test_all` and `onnxruntime_providers_xnnpack` yet. - Some clean up in build.py including setting CMake generator more consistently.	2023-07-07 08:11:44 -07:00
Edward Chen	e22b0836e7	[objc] Update docs and fix static analysis build (#16617 ) - Update some documentation comments. - Use onnxruntime_training.h as the umbrella header so training API docs are included in generated docs. - Fix static analysis build.	2023-07-07 07:58:54 -07:00
Scott McKay	697dd12f6e	Re-organize the transpose optimization and layout transformation files. (#16246 ) ### Description Split out the more basic changes from #15552 for easier review. Re-organize to clarify the structure - Separate out generic base functionality from ORT specific components - pass in handlers for internal ORT ops to Optimize - Split out layout transformation from transpose optimization - Separate out level 1 transpose optimizer - Cleanup some naming to try and clarify things like an optimizer vs. general optimization code Most of the changes are from this movement of code. Two implementation changes: - the extended handlers are queried first in GetHandler - allows the extended handlers to override the default behaviour for an ONNX operator - simplify the Optimize function to remove OptimizerMode. - `can_modify_node` is used instead of `mode` and `ignore_assigned_nodes` and a long description of the current usage is added. I don't _think_ that changes the current behavior and hopefully clarifies what happens and when, and makes the base transpose optimizer implementation more generic. ### Motivation and Context Create a cleaner separation to support adding EP specific logic next to cleanly handle where an EP has additional layout sensitive behaviour required (e.g. its Resize implementation only handles one layout).	2023-07-07 08:24:47 +10:00
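
Commit bd4d011142 in the list above mentions obtaining strides and translating N-D indices into a flat index. The arithmetic behind that idea can be sketched roughly as follows. This is a Python illustration only, assuming a dense row-major layout; `get_strides` and `get_index` are hypothetical names chosen for this sketch and are not the C# API added by that commit.

```python
from typing import List, Sequence


def get_strides(dims: Sequence[int]) -> List[int]:
    """Row-major strides (in elements) for a dense tensor with the given dims.

    Hypothetical helper for illustration; assumes a contiguous layout.
    """
    strides = [1] * len(dims)
    # Walk from the second-to-last dimension back to the first.
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides


def get_index(strides: Sequence[int], indices: Sequence[int]) -> int:
    """Translate N-D indices into a flat element offset using the strides."""
    if len(strides) != len(indices):
        raise ValueError("strides and indices must have the same rank")
    return sum(s * i for s, i in zip(strides, indices))


if __name__ == "__main__":
    dims = [2, 3, 4]
    strides = get_strides(dims)
    # Offset of element [1, 2, 3] in a flat buffer holding a 2x3x4 tensor.
    print(strides, get_index(strides, [1, 2, 3]))
```

With the strides computed once per shape, each element lookup is a single dot product, which is what makes it practical to work directly on a flat buffer.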

2061 commits