onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-03 03:58:54 +00:00

Author	SHA1	Message	Date
PeixuanZuo	cb4bf4f5c8	[ROCm] Move ROCm build step on CPU only machine (#16596 ) - Move ROCm build step on CPU only machine - Add the performance data of the huggingface bert-large model on the MI200 - At the beginning of the test step, check the agent's GPU usage and kill the threads occupying the GPU, which may be left over from previous tasks that exited abnormally. - Use different docker images during the build and test steps. The difference is the `uid` and `user` when build docker image and create docker container.	2023-07-10 11:55:10 +08:00
Xavier Dupré	47a0289ee6	[CI] Removes type2 in process_registration and fix Windows GPU Reduced Ops CI Pipeline (#16530 ) ### Description Windows GPU Reduced Ops CI Pipeline is broken due to the introduction of a second template type in registered kernels. The python code checking the registration is broken due to that. This PR addresses this issue on the python side by keeping only one type equal to the concatenation of the two types.	2023-07-07 18:21:06 +02:00
Edward Chen	6be7b03e53	Enable `-Wshorten-64-to-32` warning if available. (#16524 ) - Fix some warnings from Xcode build (`-Wshorten-64-to-32`). - Enable `-Wshorten-64-to-32` warning if available. Currently it's not fully enabled for `onnxruntime_test_all` and `onnxruntime_providers_xnnpack` yet. - Some clean up in build.py including setting CMake generator more consistently.	2023-07-07 08:11:44 -07:00
Edward Chen	e22b0836e7	[objc] Update docs and fix static analysis build (#16617 ) - Update some documentation comments. - Use onnxruntime_training.h as the umbrella header so training API docs are included in generated docs. - Fix static analysis build.	2023-07-07 07:58:54 -07:00
Scott McKay	697dd12f6e	Re-organize the transpose optimization and layout transformation files. (#16246 ) ### Description <!-- Describe your changes. --> Split out the more basic changes from #15552 for easier review. Re-organize to clarify the structure - Separate out generic base functionality from ORT specific components - pass in handlers for internal ORT ops to Optimize - Split out layout transformation from transpose optimization - Separate out level 1 transpose optimizer - Cleanup some naming to try and clarify things like an optimizer vs. general optimization code Most of the changes are from this movement of code. Two implementation changes: - the extended handlers are queried first in GetHandler - allows the extended handlers to override the default behaviour for an ONNX operator - simplify the Optimize function to remove OptimizerMode. - `can_modify_node` is used instead of `mode` and `ignore_assigned_nodes` and a long description of the current usage is added. I don't _think_ that changes the current behavior and hopefully clarifies what happens and when, and makes the base transpose optimizer implementation more generic. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Create a cleaner separation to support adding EP specific logic next to cleanly handle where an EP has additional layout sensitive behaviour required (e.g. its Resize implementation only handles one layout).	2023-07-07 08:24:47 +10:00
Yi Zhang	fed08e070a	Add compiler cache in linux wasm build (#16579 ) ### Description Add compiler cache in wasm build to accelerate web ci ### Motivation and Context It could reduce the pipeline duration by 30 minutes. web ci could be completed in 2 hours with cache. https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1053219&view=results	2023-07-06 06:58:48 +08:00
Vrajang Parikh	fd8ad9b950	Enable iOS packaging for training (#16525 ) ### Description Enable support for building iOS packages/CocoaPods with training API - Add `Training` Package variant and config files in current iOS packaging utilities to enable creation of training packages ### Motivation and Context This PR introduces new `Training` variant in `build_and_assemble_ios_pods.py` script which allows creating pods for iOS with training API enabled. The sample script to build training pods: ``` python3 tools/ci_build/github/apple/build_and_assemble_ios_pods.py --variant Training \ --build-settings-file tools/ci_build/github/apple/default_full_ios_training_framework_build_settings.json \ -b=-- path_to_protoc_exe=<path/to/protoc> ``` Note: build settings file should have `--enable_training` as a build parameter. Simply adding training packaging increases the duration of the Azure pipeline for packaging by 70 minutes. To address this issue, we need to parallelize pod creation. In order not to further strain the pipeline, the changes for training packaging will be added in another PR, which optimizes the packaging pipeline. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-07-05 13:27:59 -07:00
PeixuanZuo	e2526714e2	[ROCm] Move MIGraphX build step on CPU only machine (#16582 ) - Move MIGraphX build step on CPU only machine - Use ccache on build step - Not pass host uid into docker build process.	2023-07-05 13:55:28 +08:00
Wei-Sheng Chin	a0a5f57581	[DORT] Use new FX-to-ONNX exporter (#16450 ) The ONNX exporter in DORT has been moved to PyTorch as a formal feature. We therefore switch to consuming the exporter from PyTorch instead of maintaining two duplicates.	2023-07-04 13:13:04 -07:00
PeixuanZuo	d540c7da0f	[ROCm] Add ROCm5.6 to python package pipeline (#16572 ) Add ROCm5.6 to python package pipeline.	2023-07-04 18:18:12 +08:00
pengwa	ac100ebb64	Fix orttraining-ortmodule-distributed CI (#16569 ) ### Fix orttraining-ortmodule-distributed CI pydantic (https://pypi.org/project/pydantic/#history) released version 2.0 on 1st July, and DeepSpeed has a known issue with newer versions of it (https://github.com/microsoft/DeepSpeed/issues/3280). Fix this by adding a check similar to the one DS added in https://github.com/microsoft/DeepSpeed/pull/3290	2023-07-03 13:18:59 +08:00
Scott McKay	2fd25de360	Use verbose logging in Android emulator in React Native CI (#16528 ) ### Description <!-- Describe your changes. --> Set emulator logging to verbose to see if it helps with intermittent React Native CI failures when emulator crashes at startup ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-06-30 11:51:20 +10:00
Yi Zhang	fb7e1f133f	[Fix] TSA Upload failed in nuget pipeline. (#16476 ) ### Description Partially revert PR #16244. ### Motivation and Context The npm pipeline couldn't be triggered if the nuget pipeline status is warning. ### Test Run https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=321873&view=logs&s=b17bed5b-cc14-5026-390a-fb2feea063f2	2023-06-28 06:40:49 +08:00
Rachel Guo	892b1b19ea	[js/rn] limit x86_64 arch in detox xcodebuild for react native e2e test (#16460 ) ### Description <!-- Describe your changes. --> As title. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Works with local onnxruntime-c pod in js/rn/e2e test.	2023-06-27 09:45:04 -07:00
guyang3532	4768ac5f30	Fix onnxruntime-CI-nightly-ort-pipeline Failure (#16495 ) The image for the onnxruntime-CI-nightly-ort-pipeline is too old. The ort package in the image is older than the latest test code in nightly CI, which causes the nightly CI to fail.	2023-06-27 23:19:23 +08:00
Yi Zhang	6e9541046e	extend react native ci timeout limit (#16469 ) ### Description <!-- Describe your changes. --> ### Motivation and Context 2 consecutive runs in npm pipeline failed due to time out	2023-06-27 08:44:03 +08:00
Yifan Li	e2c214d81f	[TensorRT EP] TRT 8.6 minor version update (#16475 ) ### Description * Minor version update: TRT 8.6.0.12->8.6.1.6 * CI pipeline ymls/dockerfiles are updated * cgmanifest.json/deps.txt/download-deps.yml are updated; Win trt binaries uploaded to [win img 307029](https://aiinfra.visualstudio.com/AI%20Infra%20Management/_build/results?buildId=307029&view=results) * Re-enable unit tests which were failed in 8.6.0 and re-gained support in 8.6.1	2023-06-26 10:44:27 -07:00
PeixuanZuo	7e211f0e03	[ROCm] Move mount data step into docker container (#16471 ) Some CI jobs may be interrupted unexpectedly without executing the umount data step. The data left on the host device then causes `device or resource busy` and makes subsequent CI jobs fail. With the mount data step moved into the docker container, the host machine is not left occupied when CI jobs exit incorrectly.	2023-06-26 10:25:06 +08:00
Rachel Guo	04dbdc96bf	[js/rn] Fix React Native CI pipeline E2E test (#16447 ) ### Description <!-- Describe your changes. --> Based on this kindly provided quick fix: https://github.com/microsoft/onnxruntime/pull/16411 See more description in the above linked pr about bumping AGP version, etc. Also fixed import header file path in detox e2e test. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Good build: https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1041757&view=logs&j=de302ec2-2305-57e0-e8c6-cd89c569f2a3&t=9894c870-b8ce-548d-51ff-8f44d21a4117&l=18	2023-06-22 14:33:49 -07:00
Yi Zhang	8e8840f1de	Enable Web CI on Linux (#16419 ) ### Description 1. Enable Web CI on Linux ### Motivation and Context 1. Speed up web CI: the duration can be reduced from 160 minutes to 130 minutes, a time saving of about 20%. The total computation time is 455 minutes now. Moved to Linux, it could be reduced to 336 minutes. 2. It's the first step to enable compilation cache for emscripten 3. Per Yulong's request, build_web stages are still using the Windows pool ![image](https://github.com/microsoft/onnxruntime/assets/16190118/c9496408-74bd-45ea-b4ae-a4dd2a574d17) https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1038382&view=results	2023-06-22 15:42:58 +08:00
Baiju Meswani	42489a8a24	Add ability to create ort format models from training offline utility (#16360 )	2023-06-21 18:51:43 -07:00
yf711	0ad0d6ebbf	Unblock Linux MultiGPU TensorRT CI (#16446 ) ### Description Revert docker base image to nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04@sha256:b754c43fe9d62e88862d168c4ab9282618a376dbc54871467870366cacfa456e ### Motivation and Context The default image environment of nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04 has had a minor upgrade, which makes Linux MultiGPU TensorRT CI (NV12 instance with Maxwell GPU) fail on three CApiTestGlobalThreadPoolsWithProvider tests (these three tests have errors above the tolerance). That minor upgrade includes cudnn 8.7.0->8.9.0, which might be a factor in the Maxwell GPU generating higher error. CIs with T4 GPU are not affected.	2023-06-21 17:15:39 -07:00
Rachel Guo	961fa7274a	[NNAPI doc] add reducemean to supported op list (#16414 ) ### Description <!-- Describe your changes. --> As title. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-06-21 00:29:20 -07:00
Rachel Guo	b4b126ffb0	Set onnxruntime-c local pod path environment variable for react native e2e tests on ci (#16431 ) ### Description <!-- Describe your changes. --> Set onnxruntime-c local pod path environment variable for react native e2e tests on react-native-ci.yml ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Previously the E2E test project is not properly consuming a local built onnxruntime-c version pod. https://github.com/microsoft/onnxruntime/pull/16411#issuecomment-1598512816	2023-06-21 00:27:36 -07:00
PeixuanZuo	470d6c1cce	[ROCm] Delete unused file to fix Component Governance Alert (#16407 ) Delete unused file to fix Component Governance Alert	2023-06-19 11:28:32 -07:00
PeixuanZuo	1418d8728c	[ROCm] Fix CI Pipeline (#16409 ) 1. add `set -ex` before commands. 2. update ccache.	2023-06-19 15:22:13 +08:00
Yi Zhang	8b9eab093b	keep symlinks in maven package (#16376 ) ### Description 1. Keep symlink in the package. 2. keep the artifact package format ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-06-19 09:41:39 +08:00
saurabh	a6ce7b339f	Enable model subgraph execution in OVEP and setting the OpenVINO dll's to the path from the OpenVINO pypi package in OVEP and fix OVEP windows io buffer sample (#16147 ) ### Description This PR enables execution of subgraphs in OVEP. Currently, when OVEP developers install the onnxruntime-openvino package on Windows from pypi, they have to additionally download the OpenVINO Windows binaries and run the setupvars.bat script, which sets the environment PATH to locate the OV dll's. This PR also fixes issues in the OVEP Windows io buffer sample. ### Motivation and Context Fix: We want to make the user experience easy for OVEP Python developers on the Windows platform. This fix introduces a function add_openvino_libs_to_path at the location tools/python/util/add_openvino_win_libs.py. OVEP Python users can call this function in their application code, and it takes care of setting the OpenVINO dll's to the path from the installed OpenVINO pypi package (openvino). This change also makes sure that the add_openvino_libs_to_path() function is added to the onnxruntime python package only when it is built for OpenVINO Execution Provider for ONNXRuntime and not for default ORT python package builds. New user experience for Python OVEP developers on the Windows platform: step 1: pip install onnxruntime-openvino step 2: pip install openvino step 3: <Add these 2 lines in the application code> import onnxruntime.tools.add_openvino_win_libs as utils utils.add_openvino_libs_to_path() --------- Signed-off-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: Suryaprakash Shanmugam <suryaprakash.shanmugam@intel.com>	2023-06-16 19:47:09 -07:00
dependabot[bot]	dd660c054e	Bump transformers from 4.24.0 to 4.30.0 in /tools/ci_build (#16331 )	2023-06-16 13:08:46 -07:00
Changming Sun	188d5f5398	Fix Linux Multi GPU build pipeline (#16368 ) ### Description The build pipeline runs on Azure NV12 machines that will be deprecated soon because the SKU is too old. So this PR will move the pipeline to a Windows machine with two A10 GPUs.	2023-06-15 16:24:46 -07:00
Yi Zhang	3e99e43a1d	extend Final AAR testing timeout limit (#16340 ) ### Description <!-- Describe your changes. --> ### Motivation and Context improve nuget pipeline stability	2023-06-15 17:27:45 +08:00
PeixuanZuo	097346be9d	[ROCm] Add clean step for ROCm CI pipeline (#16336 ) 1. Add clean step for ROCm CI pipeline 2. Fix error "device or resource busy" bug by setting umount dataset step as `always()` step.	2023-06-15 13:44:12 +08:00
Baiju Meswani	5eec24837f	Fix for AMD GPU pipeline (#16357 )	2023-06-14 20:36:16 -07:00
Changming Sun	dbc7a195b1	Update win-ci-pipeline.yml: enable xnnpack tests (#16244 ) 1. Enable xnnpack test 2. Change TSA database name from onnxruntime_master to onnxruntime_main. This is a leftover of renaming the "master" branch to "main" 3. Add two static analysis jobs for WinML and DML 4. Rename the machine pool "aiinfra-dml-winbuild" to "onnxruntime-Win2019-GPU-dml-A10", so that the internal and public ADO instances use the same machine pool name. 5. Move Windows GPU CI build pipeline from "onnxruntime-Win2022-GPU-T4" to "onnxruntime-Win2022-GPU-A10" machine pool, because we do not have enough T4 GPUs.	2023-06-14 19:12:42 -07:00
Baiju Meswani	8a3de16d14	Temporary fix to make the training pipeline green (#16353 )	2023-06-14 13:11:35 -07:00
Edward Chen	4f23577cb5	[React Native] Publish E2E test logs on build failure too. (#16327 ) ### Description <!-- Describe your changes. --> Publish E2E test logs on build failure too. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Get more information about intermittent test failures.	2023-06-12 17:56:46 -07:00
JiCheng	eed02a3f78	Xnnpack QDQ test (#16281 ) ### Description A few QDQ tests failed on XNNPACK EP. The reason should be the range of input_data doesn't fit for scale and zero_point. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-06-12 14:00:42 +08:00
Yulong Wang	f274bbb0c8	[js] add API that allows to get package version (#16207 ) ### Description Add an API for users to get version of current package. example usage: ```js import { env } from 'onnxruntime-node'; console.log(env.versions.node); // output "1.16.0" ``` ```js import { env } from 'onnxruntime-web'; console.log(env.versions.web); // output "1.16.0" console.log(env.versions.common); // output "1.16.0" console.log(env.versions.node); // output "undefined" ``` #16156	2023-06-09 16:18:53 -07:00
Yi Zhang	3b5a8352c1	CodeSign Mac packages in nuget pipeline (#16291 ) ### Description 1. Updated Mac package workflow for easier debugging. 2. Changed archive type from tgz to zip since zip is supported by ESRP. 3. .../dylib.dSYM/Contents/Resources/DWARF/libonnxruntime.1.16.0.dylib is a debug symbol file, so it couldn't be signed. ### Motivation and Context It's required by VS Code. Mac binaries in nuget should be signed	2023-06-10 06:35:47 +08:00
Edward Chen	b668a6da96	Treat Objective-C static analysis warnings as errors (#16293 ) - Update Objective-C static analysis check to fail on warnings. - Address warning. - Clean up build definition.	2023-06-09 08:51:49 -07:00
Vrajang Parikh	67f4a4fd16	Objective-C binding for ORT training (#16127 ) ### Description Implement Objective-C binding for `ORTCheckPoint`. Additionally, - Modify `onnxruntime_objectivec.cmake` to only include training header and sources when training flag is enabled - Enable objective-c binding for `orttraining-mac-ci-pipeline` ### Motivation and Context This PR is part of implementing Objective-C bindings for training API. It implements objective-c binding for ORTCheckPoint class. The objective-C API closely resembles the C++ API. Note: The test for saving checkpoint is skipped as it requires use of training session. It will be added when the objective-c binding for `ORTTrainingSession` is added.	2023-06-07 14:01:30 -07:00
Edward Chen	1261d0b8ba	Fix some build issues on MacOS with Xcode 14.3. (#15878 ) - Fix flatbuffers flatc warning, unused-but-set-variable. - Address `-Wshorten-64-to-32` warnings (fix in our code, allow in dependencies' code). - Update CI builds to use Xcode 14.3. - Update minimum iOS version to 12.0. - Update Mac hosted agents to MacOS 13 where possible.	2023-06-07 12:07:11 -07:00
PeixuanZuo	a95f8ae53c	[ROCm] Update ROCm/MIGraphX CI pipeline (#16215 ) MIGraphX CI - Change docker container user name to `onnxruntimedev` ROCm CI - Build docker image every job instead of using prebuild image. - Every job create a container with only one GPU with command `docker run -it --device=/dev/kfd --device=/dev/dri/renderDxxx` - Remove tests that are unstable or use outdated interfaces. - Enable training ortmodule test.	2023-06-05 10:28:10 +08:00
Changming Sun	6b5b79872b	Avoid taking dependency on dl.fedoraproject.org (#16202 ) ### Description 1. Avoid taking dependency on dl.fedoraproject.org The website is not very stable. Our build pipelines often fail to fetch packages from there. 2. Update manylinux to the latest version	2023-06-02 07:41:46 -07:00
Changming Sun	5bfa1183d1	Add a Memory Profiling build job in post merge pipeline (#16172 ) ### Description 1. Add a Memory Profiling build job 2. Remove no absl build job since the feature will be removed 3. Simplify post-merge-jobs.yml by unifying the pool names ### Motivation and Context To catch build errors in #16124	2023-06-01 13:00:44 -07:00
Yi Zhang	e0199cfbd9	extend mac packaging timeout limit (#16173 ) ### Description ### Motivation and Context MacOS_py_wheels often fails due to timeout	2023-05-31 18:31:28 +08:00
Baiju Meswani	7edc4b105d	Copy missing training header files to the package archive (#16119 )	2023-05-30 16:45:40 -07:00
Sunghoon	bf05d4ec26	Fix nightly ort CI pipeline (#16162 ) This PR changes the [nightly ort CI pipeline](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=198) to pick up the latest nightly ACPT image, which was changed from torch 2.0.0.dev to torch 2.1.0.dev.	2023-05-30 14:00:34 -07:00
Xavier Dupré	e726151b5c	Introduce float 8 types (#14731 ) ### Description The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ as described in PR https://github.com/onnx/onnx/pull/4805. It uses CUDA API to cast float/half to float8 if CUDA>=11.8, a custom implementation if CUDA<11.8. * It implements Cast, QuantizeLinear, DequantizeLinear for all types on CPU, only for types FloatE4M3FN, FloatE5M2 on CUDA. * It extends the supported types for control flow operators, Shape, Reshape, Identity, If, Loop, Scan * It implements Equal(19). * Cast, QuantizeLinear, DequantizeLinear operators now support a parameter `saturate` only valid for float 8 types. It is true by default. In that case, any value out of range is converted into the maximum float 8 value. If false, it is infinite. * QuantizeLinear, DequantizeLinear now support multiple scales on CUDA (and ROCm by extension), scale = 1D tensor with one scale per channel ### Motivation and Context Supports latest onnx version. Fixes [AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395) --------- Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-30 13:25:58 -07:00
Yi Zhang	31fc25d2c2	[Fix] Check if CUDA is downloaded in AGENT_TEMPDIRECTORY (#16142 ) ### Description supplement of #15915 ### Motivation and Context fix nuget pipeline exception in the stage of Final_Jar_Testing_Windows_GPU ``` JUnit Jupiter:ProviderOptionsTest:testCUDAOptions() MethodSource [className = 'ai.onnxruntime.providers.ProviderOptionsTest', methodName = 'testCUDAOptions', methodParameterTypes = ''] => ai.onnxruntime.OrtException: Error code - ORT_RUNTIME_EXCEPTION - message: D:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1131 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\cloudtest\AppData\Local\Temp\onnxruntime-java17193857285260738736\onnxruntime_providers_cuda.dll" ``` ### Verification https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=313476&view=results	2023-05-30 13:14:08 +08:00
Yi Zhang	73584f9360	More fixes on nuget pipeline (#16091 ) ### Description 1. Parameters couldn't be compared as strings; change them to boolean. 2. Windows_CI_GPU_DML_DEV_arm64 on the pool onnxruntime-Win-CPU-2022 failed to pass the prefast step; change the pool to aiinfra-dml-winbuild. 3. Skipped test_zfnet512; it fails in Nuget_Test_Win_Training_CPU. Todo: only Final_Jar_Testing_Windows_GPU fails now. https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=313042&view=logs&s=d66543d5-16de-5a48-6ecb-a36e21ff8d4d&j=d9489789-5e39-5a05-13ab-9aaf7b4d386f	2023-05-27 08:59:12 +08:00
Scott McKay	5e41d1600a	Add new QNN CIs to azp run tool (#16109 ) ### Description <!-- Describe your changes. --> Add 2 new QNN CIs to tools/python/run_CIs_for_external_pr.py ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Update tool so it runs all current CIs	2023-05-27 08:46:16 +10:00
Changming Sun	60bb07307b	Fix the TRT GPU build job in python packaging pipeline (#16073 ) 1. Cherry-pick #16054 back to the main branch 2. Replace onnxruntime-gpu-winbuild-t4 with onnxruntime-Win2022-GPU-T4. The latter one has VS2022. --------- Co-authored-by: Patrice Vignola <vignola.patrice@gmail.com>	2023-05-25 00:09:08 -07:00
Yi Zhang	76fd9aa745	[Fix] Some pipelines have to be using VS2019 (#16034 ) ### Description ### Motivation and Context Fix nuget and python package pipeline. 1. ARM 32 build isn't officially supported by VS2022. https://developercommunity.visualstudio.com/t/Compilation-Error-with-VS2022-ARM/10285309 2. onnxruntime-gpu-winbuild-T4 and onnxruntime-gpu-winbuild-tensorrt8-T4 don't have VS 2022	2023-05-25 09:55:35 +08:00
yf711	84f1af7ff5	ort build flag fix (#16072 ) ### Description * Sync and clean build flag `--use_tensorrt_builtin_parser` from existing CI config as this becomes default flag * cuda version update	2023-05-24 12:32:10 -07:00
Guenther Schmuelling	20857c4ff2	workaround test failure in ci (#16070 ) don't run wasm proxy test on debug build to unblock ci. Needs some longer debugging.	2023-05-24 21:01:06 +08:00
Shukant Pal	f316bc57c4	[CoreML EP] Implement Unary & Reduce operators (#15532 ) ### Description This change is a follow-up to #15327. It adds Unary operators (Sqrt, Reciprocal) and Reduce operators (ReduceSum, ReduceMean). I've tried to follow existing patterns in the code :-) ### Motivation and Context This reduces fragmentation across EPs when using CoreML on macOS, thereby speeding up execution. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-05-24 18:16:59 +10:00
RandySheriffH	d35361bf9d	Fix python pipeline for AzureEP without using root (#16023 ) Fix python pipeline for AzureEP without using root, this is for 1.15. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-22 16:38:47 -07:00
Changming Sun	0204594f90	Cleanup WASM cmake code (#15996 ) ### Description Remove the "onnxruntime_BUILD_WEBASSEMBLY" cmake option. Use `if (CMAKE_SYSTEM_NAME STREQUAL "Emscripten")` instead. It makes some code look more natural. For example, ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR onnxruntime_BUILD_WEBASSEMBLY) ``` becomes ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR CMAKE_SYSTEM_NAME STREQUAL "Emscripten") ```	2023-05-20 18:07:39 -07:00
Hector Li	4324d2173b	[QNN EP] Enable Qnn context cache to save model initialization time (#15815 ) ### Description Enable the Qnn context cache feature to save model initialization time. Provider options: qnn_context_cache_enable\|1 to enable the cache feature; qnn_context_cache_path to set the cache path, which is set to model_file.onnx.bin by default. ### Motivation and Context Model initialization takes a long time because of the cost of converting the Onnx model to a Qnn model. Qnn has a feature to serialize the Qnn context to a file, so that next time the user can load it from the cached context and execute the graph, saving that cost. --------- Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>	2023-05-19 10:52:17 -07:00
RandySheriffH	4dfb89b3ad	Implement mutex-free spin lock for task queue (#14834 ) Implemented "lock-free" spinlock to save CPU usage on context switching. The change has been tested on queene service of Ads team, the lock-free version of ort (40 threads) saves CPU usage on gen8 (128 logical processors on 8 numa nodes) windows by nearly half, from 65% to 35%. For 32 cores, the curve is flat: Anubis, 32 vCPU, windows, hugging face models, 95 percentile E2E latency in ms: model \| mutex(ms) \| mutex-free --- \| --- \| --- alvert_base_v2 \| 34.21 \| 34.09 bert_large_uncased \| 116.27\| 117.84 bart_base \| 72.06 \| 71.99 distilgpt2 \| 25.43 \| 25.02 vit_base_patch16_224 \| 37.33 \| 37.76 Anubis, 32 vCPU win, Linux, 1st party models, 95 percentile E2E latency in ms: model \| mutex(ms) \| mutex-free --- \| --- \| --- deepthink_v2 \| 24.35 \| 22.95 bing_feeds \| 36.96 \| 36.48 deep_writes \| 14.46 \| 14.32 keypoints \| 9.34 \| 7.69 model11 \| 1.71 \| 1.66 model12 \| 1.82 \| 1.44 model2 \| 4.21 \| 3.95 model6 \| 1.08 \| 1.05 agiencoder \| 0.99 \| 0.93 geminet_transformer \| 5.32 \| 5.24 --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-19 10:12:10 -07:00
Patrice Vignola	310b22aa0c	[DML EP] Update DirectML version to 1.12.0 (#16011 )	2023-05-18 19:37:12 -07:00
PeixuanZuo	d78bbf5ef2	[ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004 ) remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline.	2023-05-19 10:29:01 +08:00
Edward Chen	6d46007028	Add explicit 'set +x' before printing a vso[] command to avoid output getting parsed again with a trailing quote. (#15986 ) Here's the motivating issue: https://github.com/microsoft/azure-pipelines-tasks/issues/10331 Noticed some problems in other repos so also updating usages in ORT. We may be fine now without it, but this change adds some safeguard against future additions of 'set -x' for debugging.	2023-05-17 19:30:28 -07:00
Changming Sun	d98763473a	Change CUDA pipelines to download CUDA SDK in every build job (#15915 ) ### Description Change CUDA pipelines to download CUDA SDK in every build job ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-17 17:31:51 -07:00
cloudhan	856afa49dd	[C#] Add missing rocm csharp api (#15540 )	2023-05-18 08:15:19 +08:00
Yi Zhang	6d43d51eb0	[Fix] No test result report while not using ctest (#15976 ) ### Description 1. Set gtest output while ctest is set to empty. 2. onnx_src in _deps shouldn't be removed because onnx_test_pytorch_converted and onnx_test_pytorch_converted need to read data from onnx/backend/test/data/.. ### Motivation and Context Test result report is important to find the flaky tests. ### To do Test execution is inconsistent: if ctest_path is empty, onnx_test_pytorch_converted and onnx_test_pytorch_converted will not be executed; if it is not empty, onnxruntime_mlas_test will not be executed. `270c09a37f/tools/ci_build/build.py (L1743-L1753)`	2023-05-17 08:31:16 -07:00
Jian Chen	2881d849d4	Update Win-CPU-2021 to onnxruntime-Win-CPU-2022 (#15967 ) ### Description After this PR, the following pools still need to be updated. old\|new\|note ---\|---\|--- onnxruntime-Win2019-GPU-dml-A10\|tbd\| onnxruntime-Win2019-GPU-T4\|onnxruntime-Win2022-GPU-T4\| onnxruntime-Win2019-GPU-training-T4\|onnxruntime-Win2022-GPU-T4\|Same as the above because we do not have many T4 GPUs onnxruntime-tensorrt8-winbuild-T4\|tbd\| aiinfra-dml-winbuild\|tbd\| ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-17 08:29:27 -07:00
kailums	f62f722c70	integrate triton into ort (#15862 ) ### Description In some scenarios, triton-written kernels are more performant than CK or other handwritten kernels, so we implement a framework that lets onnxruntime use these triton-written kernels. This PR integrates triton into ort, so that ort can use kernels that are written and compiled by triton. The main change focuses on two parts: 1. a build part to compile triton-written kernels and combine these kernels into libonnxruntime_providers_rocm.so 2. a loader and launcher in C++, for loading and launching triton-written kernels. #### Build To compile triton-written kernels, add a script `tools/ci_build/compile_triton.py`. This script dynamically loads all kernel files, compiles them, and generates `triton_kernel_infos.a` and `triton_kernel_infos.h`. `triton_kernel_infos.a` contains all compiled kernel instructions; this file is combined into libonnxruntime_providers_rocm.so using the --whole-archive flag. `triton_kernel_infos.h` defines a const array that contains all the metadata for each compiled kernel. This metadata is used for load and launch, so this header file is included by 'triton_kernel.cu', which defines the load and launch functions. Add a build flag in build.py and CMakeList.txt; when building the rocm provider, it calls the triton_kernel build command and generates all necessary files. #### C++ Load and Launch On the C++ side, we implement load and launch functions in triton_kernel.cu and triton_kernel.h. These two files are located in `providers/cuda`, and when compiling rocm, they are hipified, so this part supports both cuda and rocm. Currently, however, we only call triton kernels in rocm. We also implement a softmax triton op as an example. Because many kernels are generated for the different input shapes of softmax, we use TunableOp to select the best one. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-17 09:35:28 +08:00
Jian Chen	780442b9f6	Change windows machine pools to use VS2022 (#15806 ) ### Description Old pool -> new pool (notes): onnxruntime-Win-CPU-2019 -> onnxruntime-Win-CPU-2022; onnxruntime-Win2019-CPU-training -> onnxruntime-Win2022-CPU-training-AMD; onnxruntime-Win2019-CPU-training-AMD -> onnxruntime-Win2022-CPU-training-AMD (same as the above); onnxruntime-Win2019-GPU-dml-A10 -> needs to be created (you need to create a new image for it first); onnxruntime-Win2019-GPU-T4 -> onnxruntime-Win2022-GPU-T4; onnxruntime-Win2019-GPU-training-T4 -> onnxruntime-Win2022-GPU-T4 (same as the above because we do not have many T4 GPUs); onnxruntime-tensorrt8-winbuild-T4 -> TBD (TBD); Win-CPU-2021 -> onnxruntime-Win-CPU-2022 (will do it in next PR); Win-CPU-2019 -> onnxruntime-Win2022-Intel-CPU (Intel CPU needed for win-ci-pipeline.yml -> `stage: x64_release_dnnl`). ### Motivation and Context With VS2022 we can take advantage of the 64-bit compiler. It also has better C++20 support.	2023-05-16 10:34:34 -07:00
RandySheriffH	7faad53632	Set default option for package name and build arg options (#15958 ) Set default value for parameters in nuget-zip pipeline, and only apply the configurations when they are not "NONE". --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-16 09:07:38 -07:00
cloudhan	dc383ed4ce	Basic CSharp packaging support for ROCm EP (#15535 ) This PR mainly fixes build errors when trying to build nupkg for ROCm EP. It also slightly improves the packaging logic so that developers can produce the nupkg on linux natively.	2023-05-16 07:27:38 +08:00
yf711	825d691617	Unify cuda & trt version on few CIs (#15943 ) ### Description The cuda & trt version of some CIs didn't sync with the majority. Unifying cuda version as 11.8 and trt version as 8.6 on these CIs	2023-05-15 09:54:30 -07:00
Rachel Guo	18133ddadb	[doc] add LeakyRelu to coreml supported ops (#15944 )	2023-05-15 09:46:30 -07:00
Adrian Lizarraga	5542e70dd1	[QNN EP] Update default QNN SDK version to 2.10 for QNN NuGet pipeline (#15899 ) ### Description Updates the default QNN SDK version to 2.10 for the QNN NuGet pipeline. ### Motivation and Context Ensures that the daily QNN NuGet pipeline builds ORT using the latest QNN SDK by default.	2023-05-15 09:17:42 -07:00
PeixuanZuo	af6cb2af87	[ROCm] update ROCm/MIGraphX CI to ROCm5.5 (#15905 ) Update ROCm/MIGraphX CI to ROCm 5.5. TODO: two PRs to fix failures in orttraining/orttraining/test/python/orttraining_test_ortmodule_api.py - test_gradient_correctness_minmax/test_gradient_correctness_argmax_unfold/test_gradient_correctness_argmax_diagonal (https://github.com/microsoft/onnxruntime/pull/15903) - test_ortmodule_attribute_name_collision_warning (https://github.com/microsoft/onnxruntime/pull/15884)	2023-05-15 10:28:15 +08:00
Yi Zhang	b20d5e85d5	Update Cuda to 11.8 in 2 Linux GPU workflows. (#15925 ) ### Description use template variable for cuda version ### Motivation and Context	2023-05-14 12:51:25 +08:00
RandySheriffH	7c4e8267e7	Implement openAI endpoint invoker for nuget (#15797 ) Implement openAI audio endpoint, and enable nuget packaging. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-11 22:04:02 -07:00
Yi Zhang	0e7ae13e74	Run Linux GPU tests in docker container (#15872 ) ### Description Run Linux GPU tests in docker container ### Motivation and Context	2023-05-12 06:29:22 +08:00
Jian Chen	1a73d61829	Update eigen to 3.4 and remove the eigen from git submodule (#15875 ) ### Description Update eigen to 3.4 and remove the eigen from git submodule ### Motivation and Context We need to have eigen 3.4 for c++20	2023-05-11 11:56:59 -07:00
Changming Sun	7c58d013aa	Remove Ubuntu 18.04 usages (#15781 ) ### Description Remove Ubuntu 18.04 usages because it will be EOL this month. ### Motivation and Context	2023-05-11 11:44:00 -07:00
Yulong Wang	756cf3a76f	increase web CI timeout (#15876 ) ### Description The CI is extremely slow at downloading source code (~1MB/sec), so the web CI timed out. This is blocking the PR checks. Increase the timeout temporarily.	2023-05-11 11:17:46 -07:00
liqun Fu	ac9ae9f7c5	update onnx release 1.14 for docker files (#15680 ) ### Description this is for ort 1.15 release to work with onnx 1.14 It shall be merged after onnx 1.14 release and before ort 1.15 release. ### Motivation and Context --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-05-10 13:15:56 -07:00
Nat Kershaw (MSFT)	36c9ae0f58	Fix release version suffix for RC builds (#15865 )	2023-05-09 23:06:08 -07:00
Sumit Agarwal	b473e3f3c6	[DML EP] Update DirectML version to 1.11.0 (#15858 ) ### Description - Update DML version to 1.11.0 - Disable Gemm+Softmax fusion	2023-05-09 12:48:15 -07:00
Jian Chen	34cb293c6b	Remove unused ADO YML pipeline template (#15857 ) ### Description Remove unused ADO YML pipeline template ### Motivation and Context Clean up and reduce our codebase.	2023-05-09 09:15:04 -07:00
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR enables the WebNN EP in ONNX Runtime Web. It translates ONNX nodes to [WebNN API](https://webmachinelearning.github.io/webnn/) calls; the EP is implemented in C++ and uses the Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). NHWC is temporarily used as the preferred layout for WebNN graph partitions because of a restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in the WebNN spec about whether WebNN should support both 'NHWC' and 'NCHW' layouts. There is no native WebNN EP; this is for Web only. Motivation and Context: Allow ONNXRuntime Web developers to access WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on the XNNPack backend, and has been available on Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN is currently only available on Chrome Canary/Dev with the XNNPack backend for Linux and Windows. This is an open question for reviewers: please help identify which GitHub CI should involve the WebNN EP and guide me to enable it. Thanks!	2023-05-08 21:25:10 -07:00
Yulong Wang	0457fd0b40	upgrade emsdk to 3.1.37 (#15817 ) ### Description upgrade emsdk to 3.1.37 WIP branch to debug the mystery memory issue in web assembly multi-thread build.	2023-05-08 16:49:47 -07:00
Yi Zhang	045c623415	Make Nuget workflow easy to debug (#15808 ) ### Description Fix the bug in #15693	2023-05-08 20:53:08 +08:00
Nat Kershaw (MSFT)	5e9b42326c	Fix packaging pipeline for nightly builds (#15839 )	2023-05-07 20:42:38 -07:00
PeixuanZuo	41457885e0	[ROCm] add rocm5.5 to python package pipeline (#15820 ) add rocm5.5 to python packaging pipeline. https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=306082&view=results TODO: Remove version 5.2.3, 5.3.2 and 5.4 in the next PR.	2023-05-06 10:21:15 +08:00
Nat Kershaw (MSFT)	ed31e4b737	Add nuget release version suffix to support publishing rcs to nuget.org (#15791 )	2023-05-05 18:18:24 -07:00
Adrian Lizarraga	45f5c27632	[QNN EP] Update default QNN SDK to version 2.10.0 (#15818 ) ### Description - Updates the default QNN SDK for CI pipelines to version 2.10.0. - Disables convolution op tests that run on the QNN CPU backend due to a potential bug with QNN SDK 2.10.0. ### Motivation and Context Allows us to test the latest QNN SDK in default CI pipeline runs.	2023-05-05 13:01:21 -07:00
Guenther Schmuelling	5a43828b3d	update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688 ) needed to get tokenizers/decode for whisper --------- Co-authored-by: Shalva Mist <shalvamist@microsoft.com>	2023-05-05 09:48:07 -07:00
Scott McKay	d1b2b35cd2	Various fixes to the CSharp setup (#15782 ) ### Description Various fixes to the CSharp setup - fix warnings - fix invalid tests - update test sdk nuget package - enables testing on linux - fixes issue with some unit tests not running in CI - run unit tests in linux pipeline using dotnet ### Motivation and Context Unit tests weren't breaking in CIs for both Windows and Linux builds and should have been.	2023-05-05 14:27:30 +10:00
Yulong Wang	4712009f8a	[js/web] add target ort.webgpu.min.js (#15780 ) ### Description add target ort.webgpu.min.js WebGPU is an experimental feature, so I don't want to put webgpu into the ort.min.js file. This change adds 2 ways for users to access ort-web with webgpu: - using script tag: by URL `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js` (this URL is not ready yet) - using `import()`: use `import { Tensor, InferenceSession } from 'onnxruntime-web/webgpu';` - 'onnxruntime-web/webgpu' instead of 'onnxruntime-web'	2023-05-04 10:05:39 -07:00
Yulong Wang	33d1372729	[wasm] revert emsdk to v3.1.19 (#15793 ) ### Description The multi-thread build generated by the latest emsdk sometimes crashes for an unknown reason (error: memory access out of bounds). We don't want to break existing ort-web users, so revert emsdk back to 3.1.19 (the same version that ort v1.14.0 uses).	2023-05-04 01:15:01 -07:00
Baiju Meswani	e464588a0e	Avoid generating training documentation during packaging (#15795 )	2023-05-03 19:09:07 -07:00
Changming Sun	1fb2f2605b	Update VERSION_NUMBER (#15773 ) ### Description 1. Update VERSION_NUMBER for preparing the upcoming release. This PR's commit will not be included in the 1.15 release branch 2. Delete package/rpm/onnxruntime.spec since it was not used in past years. ### Motivation and Context Preparing the release. Fixed [AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)	2023-05-03 15:07:34 -07:00
Baiju Meswani	ba7b83ff3c	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776 )	2023-05-03 13:08:35 -07:00

2066 commits