onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-14 01:13:40 +00:00

Author	SHA1	Message	Date
Changming Sun	df11c85955	Download protoc.exe from nuget when cross-compiling (#15395 ) ### Description 1. The protoc package on nuget.org contains binaries for Windows_x86/Windows_x64/Linux_x86/Linux_x64/MacOS_x64, which can cover most use cases. Though it doesn't have binaries for ARM64, they are only needed when we cross-compile for Intel CPUs on ARM CPUs, which is rare. When you have such a need, you can always build protoc from source yourself and pass it to build.py as "--path_to_protoc_exe". You can do the same if you have security concerns about using prebuilt binaries from outside. 2. Remove GoogleTestAdapter-related code, which is no longer maintained. ### Motivation and Context As a follow-up of PR #15190.	2023-04-06 17:06:59 -07:00
George Wu	4db10c93d1	[TensorRT EP] make --use_tensorrt_builtin_parser the default behavior in build.py (#15320 ) Change the default behavior to link against the nvonnxparser library (onnx-tensorrt parser) that is included with the TensorRT package. Previously, the default behavior was to build and statically link against the OSS onnx-tensorrt parser. Historically, we wanted to incorporate the latest commits/fixes from OSS parser. These days the OSS parser is not significantly different from the included parser library so there is less reason to build against it by default. By linking with parser shared library from TensorRT library, the major benefit is it's much easier to build/link against a minor version update of TensorRT. And OnnxRuntime can be updated with a new TensorRT minor version by simply replacing TensorRT libraries with the newer version. (because the parser is no longer statically linked into onnxruntime) Added --use_tensorrt_oss_parser to build.py to support the previous default behavior. (build + static link OSS parser)	2023-04-05 07:53:29 -07:00
Changming Sun	4a0b86eba6	Update the post-merge pipeline (#14965 ) ### Description 1. Remove Linux jobs for ORT-Extension combined build 2. Add a macOS build job for ORT-Extension combined build 3. Adjust the yaml file so that it can support two different ADO instances. ### Motivation and Context To test our code better. And it will enable us to run such tests for every commit in the main branch. It would be easier for us to figure out which change caused a build break. See [AB#13435](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/13435)	2023-03-29 13:12:07 -07:00
Edward Chen	ea40dc3ad6	Update build.py to disallow running as root user by default. (#15164 ) Try to address intermittent permissions issues that show up in non-transient CI environments.	2023-03-27 14:46:04 -07:00
Justin Chu	d834ec895a	Adopt lintrunner as the linting tool - take 2 (#15085 ) ### Description `lintrunner` is a linter runner successfully used by pytorch, onnx and onnx-script. It provides a uniform experience running linters locally and in CI. It supports all major dev systems: Windows, Linux and macOS. The checks are enforced by the `Python format` workflow. This PR adopts `lintrunner` in onnxruntime and fixes ~2000 flake8 errors in Python code. `lintrunner` now runs all required python lints including `ruff` (replacing `flake8`), `black` and `isort`. Future lints like `clang-format` can be added. Most errors are auto-fixed by `ruff` and the fixes should be considered robust. Lints that are more complicated to fix are marked `# noqa` for now and should be fixed in follow-up PRs. ### Notable changes 1. This PR removed some suboptimal patterns: - `not xxx in` -> `xxx not in` membership checks - bare excepts (`except:` -> `except Exception`) - unused imports The follow-up PR will remove: - `import *` - mutable values as default in function definitions (`def func(a=[])`) - more unused imports - unused local variables 2. Use `ruff` to replace `flake8`. `ruff` is much (40x) faster than flake8 and is more robust. We are using it successfully in onnx and onnx-script. It also supports auto-fixing many flake8 errors. 3. Removed the legacy flake8 ci flow and updated docs. 4. The added workflow supports SARIF code scanning reports on github, example snapshot: ![image](https://user-images.githubusercontent.com/11205048/212598953-d60ce8a9-f242-4fa8-8674-8696b704604a.png) 5. Removed `onnxruntime-python-checks-ci-pipeline` as redundant ### Motivation and Context Unified linting experience in CI and local. Replacing https://github.com/microsoft/onnxruntime/pull/14306 --------- Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-03-24 15:29:03 -07:00
Chi Lo	11dac121b2	Fix test_custom_op_get_const_input inference test on Android CI (#15132 ) Fix test_custom_op_get_const_input for inference test on Android CI and copy custom op libraries to the emulator.	2023-03-22 23:02:42 -07:00
Edward Chen	c46c7ccba5	Update Gradle version (#14862 ) - Update Gradle version used in most places from 6.8.3 to 8.0.1. Update Android Gradle Plugin version where applicable. Not updated in this change: React Native Android projects (under `js/react_native/`). That can be done later along with updating the React Native projects. - Add Gradle wrapper in `java/` to make it easier to consistently use a specific Gradle version.	2023-03-08 12:22:06 -08:00
George Wu	289f7dbcdd	enable pybind for qnn ep (#14897 ) enable python bindings for QNN EP. tested on Windows Dev Kit 2023 (ARM64) with python 3.11 (ARM64) from https://www.python.org/ftp/python/3.11.1/python-3.11.1-arm64.exe	2023-03-03 07:26:53 -08:00
Hector Li	c6074f3a4b	OnnxRuntime QNN EP (#14791 ) ### Description Integrate Qualcomm QNN SDK to enable inference on QC hexagon NPU devices ### Motivation and Context Enable Ort inference on QC hexagon NPU devices. --------- Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com> Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com> Co-authored-by: Adrian Lizarraga <adrianlm2@gmail.com>	2023-03-01 13:48:20 -08:00
Yulong Wang	69c5edb11b	[wasm] upgrade emsdk from 3.1.19 to 3.1.32 (#14818 ) ### Description upgrade emsdk from 3.1.19 to 3.1.32 also add explicit config for stack size (1MB).	2023-02-28 11:06:09 -08:00
Yulong Wang	6b83ad9659	[js/web] allow unittest (onnxruntime_test_all) to run in browser (#14820 ) ### Description allow onnxruntime_test_all to run in browser for WebAssembly build (use flag `--wasm_run_tests_in_browser`). To output the logs from stdout correctly, this test needs to be build with `--enable_wasm_threads`.	2023-02-24 16:45:33 -08:00
Jian Chen	62ee0c8110	Migrating ORT Extensions from Git submodule to cmake FetchContent (#14298 ) ### Description Merging extensions from Git submodule to cmake FetchContent --------- Co-authored-by: Changming Sun <chasun@microsoft.com> Co-authored-by: Jian Chen <jchen351@MacBook-Pro.local>	2023-02-22 19:42:36 -08:00
Tang, Cheng	8f34c8c8ed	Introduce collective ops to ort inference build (#14399 ) ### Description Introduce collective ops into onnxruntime inference build, including 1) AllReduce and AllGather schema in contrib op, controlled by USE_MPI flag 2) AllReduce and AllGather kernel in cuda EP, controlled by ORT_USE_NCCL flag ### Motivation and Context Enable the collective ops in onnxruntime inference build so we have the ability to run distributed inference with multiple GPUs. The original ncclAllReduce ops in training build require quite complex configurations, which is not suitable for inference case, and it already broken. so we introduce a new implementation. --------- Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-02-07 13:47:48 -08:00
pengwa	7eca42484c	link mpi when either use_mpi or use_nccl enabled (#14467 ) ### Only link mpi when either use_mpi or use_nccl enabled To fix the issue https://github.com/microsoft/onnxruntime/issues/14278. Talked with @askhade; we think that if users want to enable NCCL/MPI but MPI is not found, it should be a failure instead of a warning. So this PR made the change. As a result, to make CIs pass, we need to disable NCCL/MPI explicitly in the build command. This PR takes an alternative approach: since NCCL and MPI are not used by customers, disable NCCL by default if "--disable_nccl" not specified, disable MPI by default if "--use_mpi" not specified.	2023-02-03 20:11:50 +08:00
Baiju Meswani	d06ad9462b	[Bug Fix] Include python training apis when enable_training is enabled (#14485 )	2023-01-31 17:17:26 -08:00
cloudhan	3b6d551c35	Enable ccache for HIP objects (#14465 ) This enables HIP compiler to be launched with `ccache` when build with `--use_cache`	2023-01-28 22:34:24 +08:00
Chi Lo	80d61989e9	Unit test modification for TensorRT EP (#14339 ) Two modifications: - After [TRT 8.5](https://github.com/microsoft/onnxruntime/pull/13867) being merged, we can manually set timeout and make TRT EP only run small portion of unit tests (`onnxruntime_SKIP_AND_PERFORM_FILTERED_TENSORRT_TESTS=ON`) due to additional TRT kernel overhead introduced by TRT 8.5 which increases test time a lot. This PR modifies the checking condition and make TensorRT CIs (can enable builder placeholder) still run most of the unit tests. - Exclude TRT EP from [Resize Opset 18](https://github.com/microsoft/onnxruntime/pull/13890) unit tests since TensorRT 8.5 supports operators up to Opset 17.	2023-01-18 21:30:19 -08:00
Scott McKay	b9ecd428c1	Add ability to register custom ops by specifying a function name (#14177 ) ### Description Use dlsym/GetProcAddress to lookup a custom ops registration function by name and call it. This will be better on mobile platforms where the custom ops library is linked against, and there isn't necessarily a filesystem that a library path can be loaded from. Alternative is to wire up passing in the address of the function, but that has multiple complications which differ by platform. ### Motivation and Context Enable using ort and ort-ext packages on mobile platforms. Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-01-12 15:11:34 +10:00
RandySheriffH	83ad562826	Rename CloudEP to AzureEP (#14175 ) Rename CloudEP to AzureEP. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-01-11 12:25:04 -08:00
Xavier Dupré	79dc39600f	Replace distutils by setuptools to import build_ext (#14108 ) ### Description Uses setuptools instead of distutils. ### Motivation and Context Fixes #14107.	2023-01-09 11:48:01 +01:00
Yi Zhang	2ce7b1c1dc	Enable cache for msbuild (#14085 ) ### Description Enable ccache in windows CPU compilation. The windows compilation in CI could be reduced to 1 more minute at most. ![image](https://user-images.githubusercontent.com/16190118/210294061-86742cf4-65c7-4cc2-9725-e102c3c64abd.png)	2023-01-06 11:19:57 +08:00
Baiju Meswani	0ff61f7b97	Update torch to 1.13.1 in CI and packaging pipelines for ort training (#14055 )	2023-01-03 20:03:33 -08:00
Ashwini Khade	68b5b2d7d3	Refactor training build options (#13964 ) ### Description 1. Renames all references of on device training to training apis. This is to keep the naming general. Nothing really prevents us from using the same apis on servers\non-edge devices. 2. Update ENABLE_TRAINING option: With this PR when this option is enabled, training apis and torch interop are also enabled. 3. Refactoring for onnxruntime_ENABLE_TRAINING_TORCH_INTEROP option: - Removed user facing option - Setting onnxruntime_ENABLE_TRAINING_TORCH_INTEROP to ON when onnxruntime_ENABLE_TRAINING is ON as we always build with torch interop. Once this PR is merged when --enable_training is selected we will do a "FULL Build" for training (with all the training entry points and features). Training entry points include: 1. ORTModule 2. Training APIs Features include: 1. ATen Fallback 2. All Training OPs includes communication and collectives 3. Strided Tensor Support 4. Python Op (torch interop) 5. ONNXBlock (Front end tools for training artifacts prep when using training apis) ### Motivation and Context Intention is to simplify the options for building training enabled builds. This is part of the larger work item to create dedicated build for learning on the edge scenarios with just training apis enabled.	2023-01-03 13:28:16 -08:00
RandySheriffH	587e891cae	CloudEP (#13855 ) Implement CloudEP for hybrid inferencing. The PR introduces zero new API, customers could configure session and run options to do inferencing with Azure [triton endpoint.](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-with-triton?tabs=azure-cli%2Cendpoint) Sample configuration in python be like: ``` sess_opt.add_session_config_entry('cloud.endpoint_type', 'triton'); sess_opt.add_session_config_entry('cloud.uri', 'https://cloud.com'); sess_opt.add_session_config_entry('cloud.model_name', 'detection2'); sess_opt.add_session_config_entry('cloud.model_version', '7'); // optional, default 1 sess_opt.add_session_config_entry('cloud.verbose', '1'); // optional, default '0', meaning no verbose ... run_opt.add_run_config_entry('use_cloud', '1') # 0 for local inferencing, 1 for cloud endpoint. run_opt.add_run_config_entry('cloud.auth_key', '...') ... sess.run(None, {'input':input_}, run_opt) ``` Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-01-03 10:03:15 -08:00
Edward Chen	df8ff34f25	Update CUDA ArgMin/ArgMax op kernels to have end version 11 since opset 12+ is not supported yet. (#13983 ) ### Description Update CUDA ArgMin/ArgMax op kernels to have end version 11 since opset 12+ is not supported yet. With the way these kernels are currently registered, the documentation shows support for opset 11+. This is not accurate. ### Motivation and Context Fix #13781	2022-12-21 19:01:00 -05:00
FFFrog	6705915af8	[CANN] Add the ability to run graph (#13728 ) ### Description Add the ability to run graph ### Motivation and Context A brief description is as follows: 1) If the whole graph is supported, then will be processed by the graph engine, directly. 2) If the whole graph is not supported, the whole graph will be divided into subgraphs and single operators; The sub-graphs will be run on graph engine, and the single operators will fallback to the traditional mode.	2022-12-16 06:57:40 -08:00
Yi Zhang	aa9fbed3d4	Add compilation cache for Linux GPU (#13995 )	2022-12-16 16:38:12 +08:00
Yi Zhang	7d20d889d1	Use cache for compilation in container (#13960 ) ### Description For compilation in container, ADO Cache task doesn't work directly. The workaround is to mount the cache directory to the container, and let CCache in container to read/write cache data. In short, we just leverage ADO API to download/upload cache data. The Post-jobs works in stack-mode, So the PostBuildCleanUp Tasks should be defined first. Thus, The PostBuildCleanUp would be executed lastly. Else, Cache Task would fail to upload cache because the Agent Directory is cleaned.	2022-12-16 07:19:07 +08:00
Chi Lo	5b492cbae3	[TensorRT EP] support TensorRT 8.5 (#13867 ) Integrate TensorRT 8.5 - Update TensorRT EP to support TensorRT 8.5 - Update relevant CI pipelines - Disable known non-supported ops for TensorRT - Make timeout configurable. We observe more than [20 hours](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=256729&view=logs&j=71ce39d8-054f-502a-dcd0-e89fa9931f40) of running unit tests with TensorRT 8.5 in package pipelines. Because we can't use placeholder to significantly reduce testing time (c-api application test will deadlock) in package pipelines, we only run subsets of model tests and unit tests that are related to TRT (add new build flag--test_all_timeout and set it to 72000 seconds by package pipelines). Just to remember, we still run all the tests in TensorRT CI pipelines to have full test coverage. - include https://github.com/microsoft/onnxruntime/pull/13918 to fix onnx-tensorrt compile error. Co-authored-by: George Wu <jywu@microsoft.com>	2022-12-14 13:06:03 -08:00
Ashwini Khade	65201e47bf	Enable nuget packages for on device training (#13637 ) ### Description This PR enables building nuget packages locally for on device training using --build_nuget arg. This PR also enables the C# bindings by default in the managed package. If a user triggers any training apis when the native binary is not built for training, an exception with message "Training is disabled in the current build. Please build ONNXRuntime from source with the build flags enable_training and enable_training_on_device. " is thrown. Build command for creating nuget packages for on device training: build.bat --enable_training --enable_training_on_device --build_nuget 2 Nuget packages are built 1. Microsoft.ML.OnnxRuntime.Managed 2. Microsoft.ML.OnnxRuntime.Training OR Microsoft.ML.OnnxRuntime.Training.Gpu	2022-12-05 14:54:09 -08:00
Changming Sun	04900f96c1	Improve dependency management (#13523 ) ## Description 1. Convert some git submodules to cmake external projects 2. Update nsync from [1.23.0](https://github.com/google/nsync/releases/tag/1.23.0) to [1.25.0](https://github.com/google/nsync/releases/tag/1.25.0) 3. Update re2 from 2021-06-01 to 2022-06-01 4. Update wil from an old commit to 1.0.220914.1 tag 5. Update gtest to a newer commit so that it can optionally leverage absl/re2 for parsing command line flags. The following git submodules are deleted: 1. FP16 2. safeint 3. XNNPACK 4. cxxopts 5. dlpack 6. flatbuffers 7. googlebenchmark 8. json 9. mimalloc 10. mp11 11. pthreadpool More will come. ## Motivation and Context There are 3 ways of integrating 3rd party C/C++ libraries into ONNX Runtime: 1. Install them to a system location, then use cmake's find_package module to locate them. 2. Use git submodules 3. Use cmake's external projects(externalproject_add). At first when this project was just started, we considered both option 2 and option 3. We preferred option 2 because: 1. It's easier to handle authentication. At first this project was not open source, and it had some other non-public dependencies. If we use git submodule, ADO will handle authentication smoothly. Otherwise we need to manually pass tokens around and be very careful on not exposing them in build logs. 2. At that time, cmake fetched dependencies after "cmake" finished generating vcprojects/makefiles. So it was very difficult to make cflags consistent. Since cmake 3.11, it has a new command: FetchContent, which fetches dependencies when it generates vcprojects/makefiles just before add_subdirectories, so the parent project's variables/settings can be easily passed to the child projects. And when the project went on, we had some new concerns: 1. As we started to have more and more EPs and build configs, the number of submodules grew quickly. For most developers, most ORT submodules are not relevant to them. 
They shouldn't need to download all of them. 2. It is impossible to let two different build configs use two different versions of the same dependency. For example, right now we have protobuf 3.18.3 in the submodules. Then every EP must use the same version. Whenever we have a need to upgrade protobuf, we need to coordinate across the whole team and many external developers. I can't manage it anymore. 3. Some projects want to manage the dependencies in a different way, either because of their preference or because of compliance requirements. For example, some Microsoft teams want to use vcpkg, but we don't want to force every user of onnxruntime to use vcpkg. 4. Someone wants to dynamically link to protobuf, but our build script only does static link. 5. Hard to handle security vulnerabilities. For example, whenever protobuf has a security patch, we have a lot of things to do. But if we allowed people to build ORT with a different version of protobuf without changing ORT's source code, the customers who build ORT from source will be able to act on such things in a quicker way. They will not need to wait for ORT to have a patch release. 6. Every time we do a release, github will also publish a source file zip file and a source file tarball for us. But they are not usable, because they miss submodules. ### New features After this change, users will be able to: 1. Build the dependencies in the way they want, then install them to somewhere (for example, /usr or a temp folder). 2. Or download the dependencies by using cmake commands from these dependencies' official websites 3. Similar to the above, but use your private mirrors to mitigate supply chain risks. 4. Use different versions of the dependencies, as long as our source code is compatible with them. For example, you can't use protobuf 3.20.x, as it needs code changes in ONNX Runtime. 5. Only download the things the current build needs. 6. Avoid building external dependencies again and again in every build. 
### Breaking change The onnxruntime_PREFER_SYSTEM_LIB build option is removed; you can think of it as now being ON by default. If you don't like the new behavior, you can set FETCHCONTENT_TRY_FIND_PACKAGE_MODE to NEVER. Besides, for those who relied on the onnxruntime_PREFER_SYSTEM_LIB build option, please be aware that this PR changes find_package calls from Module mode to Config mode. For example, in the past, if you had installed protobuf via apt-get from Ubuntu 20.04's official repo, find_package could find it and use it. But after this PR, it won't. This is because the protobuf version provided by Ubuntu 20.04 is too old to support "config mode". It can be resolved by getting a newer version of protobuf from somewhere.	2022-12-01 09:51:59 -08:00
Guenther Schmuelling	2d523c507e	for wasm catch exceptions at top level api (#13644 ) fix for https://github.com/microsoft/onnxruntime/issues/13383, https://github.com/microsoft/onnxruntime/issues/13408 Currently ort-web doesn't catch exceptions because turning on exception catching increases the binary size by 3MB (~30%). But ort can throw (ie onnx errors or ORT_ENFORCE) and there is no useable error message. Turning on exception catching just for top level api released file will fix the error messages at minimal increase of binary size.	2022-11-28 10:24:34 -08:00
Changming Sun	efcbdac58e	Remove the cmake option: onnxruntime_DEV_MODE (#13573 ) 1. Remove the cmake option onnxruntime_DEV_MODE and replace it with "--compile-no-warning-as-error" 2. Suppress some GSL warnings because now we treat nvcc diag warnings as errors	2022-11-07 09:06:28 -08:00
Hector Li	1b494daffa	Add yml file for Snpe EP build (#13494 ) Add yml file for Snpe EP build	2022-10-28 19:47:50 -07:00
Scott McKay	ab71c4bbc0	Document generation CI is broken (#13308 ) ### Description Fix document generation CI. It's not currently updating the docs as we're skipping the tests, which is the invocation of build.py that would have generated the documentation. Setup specific task to generate documentation for greater clarity. ### Motivation and Context Operator kernel documentation is not getting updated and is now out of date.	2022-10-28 07:20:48 +10:00
cloudhan	928c9fc348	Hipify during build instead of before cmake config (#13333 ) ### Description Currently, hipify happens before cmake is configured and then cmake globs the directories. This gets rid of the customized python threading logic and opts for the build system itself to generate the files. This also supersedes the half-baked branch [sukha/hipify-with-cmake](https://github.com/microsoft/onnxruntime/tree/sukha/hipify-with-cmake)	2022-10-20 22:46:22 -07:00
PeixuanZuo	b4853a978a	[ROCm] add rocm python package pipeline with --use_rocm_profiling (#13068 ) ### Description ROCm developers always need to build the onnxruntime whl with `--enable_rocm_profiling`. Add a ROCm dev python package pipeline which produces a .whl with build args `--enable_rocm_profiling`. The dev whl needs to be uploaded to azure storage and can be downloaded from https://download.onnxruntime.ai/onnxruntime_nightly_rocm53.profiling.html	2022-10-17 10:11:20 +08:00
wangxiyuan	952c99304a	Add CANN EP (#12416 ) Description: This PR adds Ascend CANN execution provider support. Motivation and Context - Why is this change required? What problem does it solve? As the info shown in the issue. CANN is the API layer for Ascend processor. Add CANN EP can allow user run onnx model on Ascend hardware via onnxruntime The detail change: 1. Added CANN EP framework. 2. Added the basic operators to support ResNet and VGG model. 3. Added C/C++、Python API support - If it fixes an open issue, please link to the issue here. https://github.com/microsoft/onnxruntime/issues/11477 Author: lijiawei <lijiawei19@huawei.com> wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: FFrog <ljw1101.vip@gmail.com>	2022-09-22 14:53:40 -07:00
sfatimar	cccbe90764	Openvino ep 2022.2 v4.2 (#13023 ) This changes are to align OV 2022.2 Release with ORT . Changes CPU FP16 Support, dGPU Support, RHEL Dockerfile, Ubuntu 20 Dockerfile Motivation and Context - This change is required to ensure ORT-OpenVINO Execution Provider is aligned with latest changes. - If it fixes an open issue, please link to the issue here. Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: shamaksx <shamax.kshirsagar@intel.com> Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com> Co-authored-by: pratiksha <mohsinx.mohammad@intel.com> Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: nmaajidk <n.maajid.khan@intel.com> Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com> Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>	2022-09-22 12:31:40 -07:00
Yi Zhang	8356e3b9b0	Add onnx single node test data to tests (#12822 ) 1. add node test data to current model tests 2. support opset version to filter tests. 3. remove old filter based on onnx version. To avoid confusion, ONLY support opset version filter in onnxruntime_test_all 4. support read onnx test data from absolute path on Windows.	2022-09-21 10:02:57 -07:00
Alexey Gladyshev	2b5b11d373	[C#][TVM EP] Fix issues related to using TVM EP in C# front-end (#12958 ) Changes in this PR: * Update building of Nuget package for TVM EP * Update of documentation for using TVM EP in C#	2022-09-16 16:04:59 +02:00
Dwayne Robinson	8e4eb24648	Update operator kernel table to include DML operators (#12887 ) * Fix bug in pybind get_all_operator_schema due to premature reference dropping * Add updated operator kernels markdown table * Update build.py to include documentation generation for DML operators too * Update GPU pipeline to include DML in the build to so operators can be generated. * Use a separate pipeline stage, feedback from Changming and Scott * Appease annoying Python linter * Add onnxruntime_BUILD_UNIT_TESTS=OFF and remove stale --use_dml in cuda stage	2022-09-09 10:21:25 -07:00
RandySheriffH	d3b684cd9e	Drop nuphar (#11555 ) * drop nuphar code and configs * refactor test case * format python * remove nuphar from training test * remove commented nuphar logics * restore llvm setting * drop nuphar ci * fix compile err * fix compile err Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-09-07 15:11:18 -07:00
Yi Zhang	c571b99336	Refactor setup_test_data (#12818 ) * refactory setup_test_data * mv setup test data to test stage * model link for C# test * add comment	2022-09-07 08:33:27 +08:00
Yulong Wang	82a28cc2c3	upgrade emsdk to 3.1.19 (#12690 ) * upgrade emsdk to 3.1.19 * fix build break * ignore '-Wunused-but-set-variable' in eigen * add malloc and free in exported functions * EXPORTED_FUNCTIONS	2022-08-30 13:42:45 -07:00
mwootton	817dc94345	Add first pass of rocm kernel profiler (#10911 ) * Add first pass of rocm kernel profiler * Clean up rocm_profiler. Format args. Demangle kernel names. Add Api EventRecords * Remove debug output * Temporarily disable profiling unit test 'api record check' for cupti * Fix compile error for non-gpu builds * Use common file for demangle and pid/tid. Namespace ThreadUtil. Fix gpu buffer clearing. * Merge demangle into profiler_common * Merge demangle into profiler_common part 2 * Style cleanup * Resolve linking issues via ProviderHost interface * Demangle cuda kernel names * Clean up comments * Fix formatting * Fix anal retentive formatting	2022-08-26 19:38:03 -07:00
Changming Sun	7927d525a7	Remove CUDNN path from CI build scripts (#12671 )	2022-08-24 18:21:50 -07:00
Wei-Sheng Chin	dc486d146b	Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460 ) * Make ORT as Pytorch JIT backend LORT likely doesn't work with aten fallback so we only test LORT in its own CI. * Revert changes to enable external CUDA allocator. Will add it later. Revert "Revert changes to enable external CUDA allocator. Will add it later." This reverts commit d5487f2e193014c805505afae8fb577c53667658. Fix external allocator * Relax tolerance and remove commented code * Print more information in CI * Fix pointer * Address comments. 1. Reuse ORT-eager mode's environment. 2. Remove unused ctor. * Use Pytorch master branch as all PRs are merged Fix * Refine based on cpplint feedbacks * Revert changes to allow custom CUDA allocator in public APIs * Use torch.testing.assert_close * Use unittest framework * Switch docker repo * Rename .cpp to .cc * Address comments * Add comment * Use same pipeline file for eager and lort pipelines * Address comments * Add yaml comment * Fix cmake files * Address comments * Rename flags, remove printing code, remove dead comment	2022-08-22 09:40:40 -07:00
Thiago Crepaldi	d1ba801570	Add BuildError for --gen_doc and --enable_training (#12630 )	2022-08-17 14:18:37 -04:00
yf711	9d10badc55	Add build option to link TensorRT prebuilt parser (#12602 ) * Add build option to link prebuilt TensorRT parser * Test without the build option to link prebuilt TRTParser * Minor: update name of build option * Minor: update name of build option	2022-08-16 14:09:58 -07:00
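
The lint cleanups listed in commit d834ec895a (membership checks, bare excepts, mutable default arguments) can be illustrated with a small sketch. This is not code from the repository; the function names `is_missing`, `parse_int`, `append_shared` and `append_fresh` are hypothetical and exist only to show each pattern in its cleaned-up form.

```python
def is_missing(key, table):
    # Preferred membership check: `key not in table`
    # rather than `not key in table`.
    return key not in table


def parse_int(text):
    # `except Exception` instead of a bare `except:`, so that
    # KeyboardInterrupt and SystemExit are not swallowed.
    try:
        return int(text)
    except Exception:
        return None


def append_shared(item, bucket=[]):  # noqa: B006 (kept to show the pitfall)
    # The default list is created once, at definition time,
    # and is shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Using None as the sentinel gives each call its own list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared` twice without a `bucket` argument accumulates both items in the same list, while `append_fresh` returns a new single-item list each time; that difference is why mutable defaults are flagged.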

481 commits