### Description
Add the ability to run graph
### Motivation and Context
A brief description is as follows:
1) If the whole graph is supported, then will be processed by the graph
engine, directly.
2) If the whole graph is not supported, the whole graph will be divided
into subgraphs and single operators; The sub-graphs will be run on graph
engine, and the single operators will fallback to the traditional mode.
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
For compilation in container, ADO Cache task doesn't work directly.
The workaround is to mount the cache directory to the container, and let
CCache in container to read/write cache data.
In short, we just leverage ADO API to download/upload cache data.
The Post-jobs works in stack-mode, So the PostBuildCleanUp Tasks should
be defined first.
Thus, The PostBuildCleanUp would be executed lastly.
Else, Cache Task would fail to upload cache because the Agent Directory
is cleaned.
Integrate TensorRT 8.5
- Update TensorRT EP to support TensorRT 8.5
- Update relevant CI pipelines
- Disable known non-supported ops for TensorRT
- Make timeout configurable.
We observe more than [20
hours](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=256729&view=logs&j=71ce39d8-054f-502a-dcd0-e89fa9931f40)
of running unit tests with TensorRT 8.5 in package pipelines. Because we
can't use placeholder to significantly reduce testing time (c-api
application test will deadlock) in package pipelines, we only run
subsets of model tests and unit tests that are related to TRT (add new
build flag--test_all_timeout and set it to 72000 seconds by package
pipelines). Just to remember, we still run all the tests in TensorRT CI
pipelines to have full test coverage.
- include https://github.com/microsoft/onnxruntime/pull/13918 to fix
onnx-tensorrt compile error.
Co-authored-by: George Wu <jywu@microsoft.com>
### Description
This PR enables building nuget packages locally for on device training
using --build_nuget arg.
This PR also enables the C# bindings by default in the managed package.
If a user triggers any training apis when the native binary is not built
for training, an exception with message "Training is disabled in the
current build. Please build ONNXRuntime from source with the build flags
enable_training and enable_training_on_device. " is thrown.
Build command for creating nuget packes for on device training:
build.bat --enable_training --enable_training_on_device --build_nuget
2 Nuget packages are built
1. Microsoft.ML.OnnxRuntime.Managed
2. Microsoft.ML.OnnxRuntime.Training OR
Microsoft.ML.OnnxRuntime.Training.Gpu
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
## Description
1. Convert some git submodules to cmake external projects
2. Update nsync from
[1.23.0](https://github.com/google/nsync/releases/tag/1.23.0) to
[1.25.0](https://github.com/google/nsync/releases/tag/1.25.0)
3. Update re2 from 2021-06-01 to 2022-06-01
4. Update wil from an old commit to 1.0.220914.1 tag
5. Update gtest to a newer commit so that it can optionally leverage
absl/re2 for parsing command line flags.
The following git submodules are deleted:
1. FP16
2. safeint
3. XNNPACK
4. cxxopts
5. dlpack
7. flatbuffers
8. googlebenchmark
9. json
10. mimalloc
11. mp11
12. pthreadpool
More will come.
## Motivation and Context
There are 3 ways of integrating 3rd party C/C++ libraries into ONNX
Runtime:
1. Install them to a system location, then use cmake's find_package
module to locate them.
2. Use git submodules
6. Use cmake's external projects(externalproject_add).
At first when this project was just started, we considered both option 2
and option 3. We preferred option 2 because:
1. It's easier to handle authentication. At first this project was not
open source, and it had some other non-public dependencies. If we use
git submodule, ADO will handle authentication smoothly. Otherwise we
need to manually pass tokens around and be very careful on not exposing
them in build logs.
2. At that time, cmake fetched dependencies after "cmake" finished
generating vcprojects/makefiles. So it was very difficult to make cflags
consistent. Since cmake 3.11, it has a new command: FetchContent, which
fetches dependencies when it generates vcprojects/makefiles just before
add_subdirectories, so the parent project's variables/settings can be
easily passed to the child projects.
And when the project went on, we had some new concerns:
1. As we started to have more and more EPs and build configs, the number
of submodules grew quickly. For more developers, most ORT submodules are
not relevant to them. They shouldn't need to download all of them.
2. It is impossible to let two different build configs use two different
versions of the same dependency. For example, right now we have protobuf
3.18.3 in the submodules. Then every EP must use the same version.
Whenever we have a need to upgrade protobuf, we need to coordinate
across the whole team and many external developers. I can't manage it
anymore.
3. Some projects want to manage the dependencies in a different way,
either because of their preference or because of compliance
requirements. For example, some Microsoft teams want to use vcpkg, but
we don't want to force every user of onnxruntime using vcpkg.
7. Someone wants to dynamically link to protobuf, but our build script
only does static link.
8. Hard to handle security vulnerabilities. For example, whenever
protobuf has a security patch, we have a lot of things to do. But if we
allowed people to build ORT with a different version of protobuf without
changing ORT"s source code, the customer who build ORT from source will
be able to act on such things in a quicker way. They will not need to
wait ORT having a patch release.
9. Every time we do a release, github will also publish a source file
zip file and a source file tarball for us. But they are not usable,
because they miss submodules.
### New features
After this change, users will be able to:
1. Build the dependencies in the way they want, then install them to
somewhere(for example, /usr or a temp folder).
2. Or download the dependencies by using cmake commands from these
dependencies official website
3. Similar to the above, but use your private mirrors to migrate supply
chain risks.
4. Use different versions of the dependencies, as long as our source
code is compatible with them. For example, you may use you can't use
protobuf 3.20.x as they need code changes in ONNX Runtime.
6. Only download the things the current build needs.
10. Avoid building external dependencies again and again in every build.
### Breaking change
The onnxruntime_PREFER_SYSTEM_LIB build option is removed you could think from now
it is default ON. If you don't like the new behavior, you can set FETCHCONTENT_TRY_FIND_PACKAGE_MODE to NEVER.
Besides, for who relied on the onnxruntime_PREFER_SYSTEM_LIB build
option, please be aware that this PR will change find_package calls from
Module mode to Config mode. For example, in the past if you have
installed protobuf from apt-get from ubuntu 20.04's official repo,
find_package can find it and use it. But after this PR, it won't. This
is because that protobuf version provided by Ubuntu 20.04 is too old to
support the "config mode". It can be resolved by getting a newer version
of protobuf from somewhere.
fix for https://github.com/microsoft/onnxruntime/issues/13383,
https://github.com/microsoft/onnxruntime/issues/13408
Currently ort-web doesn't catch exceptions because turning on exception
catching increases the binary size by 3MB (~30%).
But ort can throw (ie onnx errors or ORT_ENFORCE) and there is no
useable error message.
Turning on exception catching just for top level api released file will
fix the error messages at minimal increase of binary size.
1. Remove the cmake option onnxruntime_DEV_MODE and replace it with
"--compile-no-warning-as-error"
2. Suppress some GSL warnings because now we treat nvcc diag warnings as
errors
### Description
<!-- Describe your changes. -->
Fix document generation CI. It's not currently updating the docs as
we're skipping the tests, which is the invocation of build.py that would
have generated the documentation.
Setup specific task to generate documentation for greater clarity.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Operator kernel documentation is not getting updated and is now out of
date.
### Description
Currently, hipify happens before cmake is configured and then cmake glob
the directories. This get rids of thoes customized python threading
logic and opt for build system itself to generate the files.
This also supersede the half baked branch
[sukha/hipify-with-cmake](https://github.com/microsoft/onnxruntime/tree/sukha/hipify-with-cmake)
### Description
<!-- Describe your changes. -->
ROCm developers always need to build onnxruntime *whl with
`--enable_rocm_profiling`.
Add a ROCm dev python package pipeline which product *.whl with build
args `--enable_rocm_profiling`.
The dev *whl need to upload to azure storage and can get from
https://download.onnxruntime.ai/onnxruntime_nightly_rocm53.profiling.html
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
**Description**: This PR adds Ascend CANN execution provider support.
**Motivation and Context**
- Why is this change required? What problem does it solve?
As the info shown in the issue. CANN is the API layer for Ascend
processor. Add CANN EP can allow user run onnx model on Ascend hardware
via onnxruntime
The detail change:
1. Added CANN EP framework.
2. Added the basic operators to support ResNet and VGG model.
3. Added C/C++、Python API support
- If it fixes an open issue, please link to the issue here.
https://github.com/microsoft/onnxruntime/issues/11477
Author:
lijiawei <lijiawei19@huawei.com>
wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: FFrog <ljw1101.vip@gmail.com>
This changes are to align OV 2022.2 Release with ORT . Changes
CPU FP16 Support, dGPU Support, RHEL Dockerfile, Ubuntu 20 Dockerfile
**Motivation and Context**
- This change is required to ensure ORT-OpenVINO Execution Provider is
aligned with latest changes.
- If it fixes an open issue, please link to the issue here.
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: shamaksx <shamax.kshirsagar@intel.com>
Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com>
Co-authored-by: pratiksha <mohsinx.mohammad@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com>
Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>
1. add node test data to current model tests
2. support opset version to filter tests.
3. remove old filter based on onnx version. To avoid confusion, ONLY
support opset version filter in onnxruntime_test_all
4. support read onnx test data from absolute path on Windows.
* Fix bug in pybind get_all_operator_schema due to premature reference dropping
* Add updated operator kernels markdown table
* Update build.py to include documentation generation for DML operators too
* Update GPU pipeline to include DML in the build to so operators can be generated.
* Use a separate pipeline stage, feedback from Changming and Scott
* Appease annoying Python linter
* Add onnxruntime_BUILD_UNIT_TESTS=OFF and remove stale --use_dml in cuda stage
* drop nuphar code and configs
* refactor test case
* format python
* remove nuphar from training test
* remove commented nuphar logics
* restore llvm setting
* drop nuphar ci
* fix compile err
* fix compile err
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* upgrade emsdk to 3.1.19
* fix build break
* ignore '-Wunused-but-set-variable' in eigen
* add malloc and free in exported functions
* EXPORTED_FUNCTIONS
* Add first pass of rocm kernel profiler
* Clean up rocm_profiler. Format args. Demangle kernel names.
Add Api EventRecords
* Remove debug output
* Temporarily disable profiling unit test 'api record check' for cupti
* Fix compile error for non-gpu builds
* Use common file for demangle and pid/tid. Namespace ThreadUtil. Fix gpu buffer clearing.
* Merge demangle into profiler_common
* Merge demangle into profiler_common part 2
* Style cleanup
* Resolve linking issues via ProviderHost interface
* Demangle cuda kernel names
* Clean up comments
* Fix formatting
* Fix anal retentive formatting
* Make ORT as Pytorch JIT backend
LORT likely doesn't work with aten fallback so we only test LORT in its own CI.
* Revert changes to enable external CUDA allocator. Will add it later.
Revert "Revert changes to enable external CUDA allocator. Will add it later."
This reverts commit d5487f2e193014c805505afae8fb577c53667658.
Fix external allocator
* Relax tolerance and remove commented code
* Print more information in CI
* Fix pointer
* Address comments.
1. Reuse ORT-eager mode's environment.
2. Remove unused ctor.
* Use Pytorch master branch as all PRs are merged
Fix
* Refine based on cpplint feedbacks
* Revert changes to allow custom CUDA allocator in public APIs
* Use torch.testing.assert_close
* Use unittest framework
* Switch docker repo
* Rename *.cpp to *.cc
* Address comments
* Add comment
* Use same pipeline file for eager and lort pipelines
* Address comments
* Add yaml comment
* Fix cmake files
* Address comments
* Rename flags, remove printing code, remove dead comment
* Add build option to link prebuilt TensorRT parser
* Test without the build option to link prebuilt TRTParser
* Minor: update name of build option
* Minor: update name of build option
Losen the following test timeout:
1. "Test Web Multi-Browsers" stage in "ONNX Runtime Web CI Pipeline": 30min -> 60min
2. Node.js binding default per-case timeout: 30 sec -> 90 sec
* refactor test for model with undefined shapes
* add test for TVMso EP
* update build script for TVM EP tests
* fix pylint
* disable test for Windows
* fix black
* fix python format
* fix pylint
* fix python format
* replace Path.resolve with os.path.join
* fix python path issue
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Fix bug where onnxruntime_USE_NCCL flag would default to ON, causing ORT to not build properly. New functionality: flag is ON when training is enabled and NCCL is not disabled. Flag is OFF otherwise
* add description of build ORT+TVM EP on Windows
* fix cmake error related to symlink creation on Windows
* add llvm config path to build flags for correct build on Windows
* update TVM_EP.md for llvm_config build arg
* fix warnings skipping during build on Windows
* fix using string or wstring for model path to correct build on Windows (MSVC error)
* fix error in custom logger for correct build on Windows
* implement glob algorithm for Windows
* additional build fixes
* update TVM with export of VM symbols for dll
* description of nasm issue and workaround
* update TVM with export of Executable from VM symbols for dll
* description of installation of ipp-crypto dependencies on Windows
* cmake key for ipp-crypto build
* fix wstring for TVMso EP
* fix ipp-crypto build
* cmake key onnxruntime_TVM_USE_HASH switch off not specific methods, but full hash functionality
* fix absolute path to compiled lib
* update TVM_EP.md, fix lint warnings
* update TVM_EP.md
* small fixes after review
* switch on handshake functionality for Linux workflow
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
* update trt 8.4ga
* trt 8.4 linux ci pipeline
* fix cmake
* placeholder_builder
* trt 8.4 windows pipeline
* gpu package pipeline
* trt 8.4.1.5 , packaging pipeline updates
* python packaging
* ctest timeout
* python packaging test
* bump timeout
* python format
* format
* revert
* newline
* enable trt python tests
* typo
* python format
* disable on windows
* Initiate Ort SNPE EP
* fix snpe ep windows build which is caused by the utility method (ToUTF8String) name change on master
* correct the source path for libonnxruntime.so while building for andorid package
* add AdditionalDependencies for amr64
* On MS-Windows, the patchfile must be a text file, i.e. CR-LF must be used as line endings. A file with LF may give the error: "Assertion failed, hunk, file patch.c, line 343," unless the option '--binary' is given.
* fix build failure if snpe is not enabled
* update doc for contrib op
* separate out snpe ep settings to onnxruntime_snpe_provider.cmake
* renaming according review comments
* update according review comments
* Implement XNNPACK support via an EP.
* Layout transform uses the GraphPartitioner infrastructure.
* Node fusion is supported.
* Conv and MaxPool implementations were ported from Changming's PR.
* Added optional mutex in InferenceSession::Run as we only want to allow sequential calls if xnnpack is enabled
Description: Format all python files under onnxruntime with black and isort.
After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame.
#11315, #11316