Commit graph

1194 commits

Author SHA1 Message Date
Changming Sun
04900f96c1
Improve dependency management (#13523)
## Description
1. Convert some git submodules to cmake external projects
2. Update nsync from
[1.23.0](https://github.com/google/nsync/releases/tag/1.23.0) to
[1.25.0](https://github.com/google/nsync/releases/tag/1.25.0)
3. Update re2 from 2021-06-01 to 2022-06-01
4. Update wil from an old commit to 1.0.220914.1 tag
5. Update gtest to a newer commit so that it can optionally leverage
absl/re2 for parsing command line flags.

The following git submodules are deleted:

1. FP16
2. safeint
3. XNNPACK
4. cxxopts
5. dlpack
7. flatbuffers
8. googlebenchmark
9. json
10. mimalloc
11. mp11
12. pthreadpool

More will come.

## Motivation and Context
There are 3 ways of integrating 3rd party C/C++ libraries into ONNX
Runtime:
1. Install them to a system location, then use cmake's find_package
module to locate them.
2.  Use git submodules 
6.  Use cmake's external projects(externalproject_add). 

At first when this project was just started, we considered both option 2
and option 3. We preferred option 2 because:

1. It's easier to handle authentication. At first this project was not
open source, and it had some other non-public dependencies. If we use
git submodule, ADO will handle authentication smoothly. Otherwise we
need to manually pass tokens around and be very careful on not exposing
them in build logs.
2. At that time, cmake fetched dependencies after "cmake" finished
generating vcprojects/makefiles. So it was very difficult to make cflags
consistent. Since cmake 3.11, it has a new command: FetchContent, which
fetches dependencies when it generates vcprojects/makefiles just before
add_subdirectories, so the parent project's variables/settings can be
easily passed to the child projects.

And when the project went on,  we had some new concerns:
1. As we started to have more and more EPs and build configs, the number
of submodules grew quickly. For more developers, most ORT submodules are
not relevant to them. They shouldn't need to download all of them.
2. It is impossible to let two different build configs use two different
versions of the same dependency. For example, right now we have protobuf
3.18.3 in the submodules. Then every EP must use the same version.
Whenever we have a need to upgrade protobuf, we need to coordinate
across the whole team and many external developers. I can't manage it
anymore.
3. Some projects want to manage the dependencies in a different way,
either because of their preference or because of compliance
requirements. For example, some Microsoft teams want to use vcpkg, but
we don't want to force every user of onnxruntime using vcpkg.
7. Someone wants to dynamically link to protobuf, but our build script
only does static link.
8. Hard to handle security vulnerabilities. For example, whenever
protobuf has a security patch, we have a lot of things to do. But if we
allowed people to build ORT with a different version of protobuf without
changing ORT"s source code, the customer who build ORT from source will
be able to act on such things in a quicker way. They will not need to
wait ORT having a patch release.
9. Every time we do a release, github will also publish a source file
zip file and a source file tarball for us. But they are not usable,
because they miss submodules.
 
### New features

After this change, users will be able to:
1. Build the dependencies in the way they want, then install them to
somewhere(for example, /usr or a temp folder).
2. Or download the dependencies by using cmake commands from these
dependencies official website
3. Similar to the above, but use your private mirrors to migrate supply
chain risks.
4. Use different versions of the dependencies, as long as our source
code is compatible with them. For example, you may use you can't use
protobuf 3.20.x as they need code changes in ONNX Runtime.
6.  Only download the things the current build needs.
10. Avoid building external dependencies again and again in every build.

### Breaking change
The onnxruntime_PREFER_SYSTEM_LIB build option is removed you could think from now 
it is default ON. If you don't like the new behavior, you can set FETCHCONTENT_TRY_FIND_PACKAGE_MODE to NEVER.
Besides, for who relied on the onnxruntime_PREFER_SYSTEM_LIB build
option, please be aware that this PR will change find_package calls from
Module mode to Config mode. For example, in the past if you have
installed protobuf from apt-get from ubuntu 20.04's official repo,
find_package can find it and use it. But after this PR, it won't. This
is because that protobuf version provided by Ubuntu 20.04 is too old to
support the "config mode". It can be resolved by getting a newer version
of protobuf from somewhere.
2022-12-01 09:51:59 -08:00
Patrice Vignola
4128e44b4f
[DML EP] Upgrade DML to 1.10.0 (#13796)
### Description
Upgrade DML to 1.10.0
2022-11-30 21:32:14 -08:00
Changming Sun
29ed8811e5
Move C/C++ deps' URLs to deps.txt (#13769)
### Description
1. Move C/C++ deps' URLs to deps.txt, and download the dependencies from
Azure Devops Artifacts instead of github.
2. Add "EXCLUDE_FROM_ALL" keyword to the cmake external projects, so
that we only build the parts we need and avoid installing the 3rd-party
dependencies when people run `make install` in ORT's build directory.
However, at this moment cmake itself doesn't have the feature. So I
copied their code to cmake/external/helper_functions.cmake and modified
it.

This PR is split from #13523, to make that one smaller. 

### Motivation and Context
1. Secure the supply chain
2. Make it be possible to automatically detect if ORT has an old
dependency that hasn't been updated from a long time.
2022-11-29 18:06:35 -08:00
Guenther Schmuelling
2d523c507e
for wasm catch exceptions at top level api (#13644)
fix for https://github.com/microsoft/onnxruntime/issues/13383,
https://github.com/microsoft/onnxruntime/issues/13408

Currently ort-web doesn't catch exceptions because turning on exception
catching increases the binary size by 3MB (~30%).
But ort can throw (ie onnx errors or ORT_ENFORCE) and there is no
useable error message.

Turning on exception catching just for top level api released file will
fix the error messages at minimal increase of binary size.
2022-11-28 10:24:34 -08:00
Edward Chen
4901987d1d
Remove SafeInt dependency from Objective-C API. (#13698) 2022-11-18 17:06:12 -08:00
Changming Sun
3e9e5e9d6d
Patch Protobuf and ONNX's cmake files and enforce BinSkim check (#13694)
Patch Protobuf and ONNX's cmake files and enforce BinSkim check.

This PR has overlap with #13523 . I would prefer to get this one merged
first so that we can finished the BinSkim work, and I try to make this
PR as small as possible.
2022-11-18 10:09:47 -08:00
Changming Sun
7a57976d1a
Make natvis files work better (#13665)
### Description
After this change, you will see GSL.natvis and wil.nativs files will be
added to every onnxruntime_xxx project.

Like this:

![image](https://user-images.githubusercontent.com/856316/202081013-314145a8-7a0f-4f45-bf85-f9ed0e247c63.png)

This is because in onnxruntime_common.cmake we have:

```cmake
    if (MSVC)
    set(ABSEIL_NATVIS_FILE "abseil-cpp.natvis")
    target_sources(
        onnxruntime_common
        INTERFACE $<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/external/${ABSEIL_NATVIS_FILE}>)
  endif()
```
It sets a property, INTERFACE_SOURCES, on the target
"onnxruntime_common".

Then if anyone else uses:
```
target_link_libraries(mytarget PRIVATE onnxruntime_common)
```
The nativis file will be added to `mytarget`.

However, in this project we don't use such things for the targets that
are static libraries. For example, onnxruntime_graph is a static
library.

Instead, we use the `onnxruntime_add_include_to_target ` function to
explicitly control what we want to propagate . The function was written
before we started to have nativis files. So it doesn't pass a source
file from one static library to another. Now we have the need. Probably
only for Windows.

### Motivation and Context

Add natvis  files to every project.
2022-11-17 19:13:40 -08:00
Jian Chen
8442d9df2c
Cjian/c4244 round 6 (#13663)
### Description
Fix round 6 



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-11-16 16:26:11 -05:00
cloudhan
369a822409
Share TunableOp between CUDA and ROCM EP (#13560)
Make TunableOp to support CUDA kernel authoring and add the corresponding supports for kernel explorer
2022-11-11 13:56:44 +08:00
Patrice Vignola
3482180ec2
DML EP add a registration for Shape and Size (#13442)
### Description
Add a DML registration for Shape to avoid copying back to the CPU just
to get the shape of a GPU tensor.



### Motivation and Context
When using free dimensions, many Transformers models extensively use the
`Shape` operator. This causes hundreds of GPU->CPU copy that should be
completely avoidable. Note that this change also uses the same
heuristics as other providers (e.g. CUDA) to force some tensors on the
CPU in certain situations.

Co-authored-by: Patrice Vignola <pavignol@microsoft.com>
2022-11-08 19:29:37 -08:00
Peter Salas
b383312f4c
[tvm] Add support for int8 models, update TVM revision (#13519)
### Description
In the TVM EP, this adds more entries to the conversion from
`ONNXTensorElementDataType` to `DLDataType`. Additionally, it removes an
unused function and updates the TVM revision to allow running models
from recent revisions of TVM.

### Motivation and Context
In the TVM EP, the mapping from `ONNXTensorElementDataType` to
`DLDataType` was incomplete and neglected several integer types (in
particular `ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8` and
`ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8`) which prevented some models from
running.

Co-authored-by: Peter Salas <psalas@octoml.ai>
2022-11-08 11:28:32 -08:00
Changming Sun
efcbdac58e
Remove the cmake option: onnxruntime_DEV_MODE (#13573)
1. Remove the cmake option onnxruntime_DEV_MODE and replace it with
"--compile-no-warning-as-error"
2. Suppress some GSL warnings because now we treat nvcc diag warnings as
errors
2022-11-07 09:06:28 -08:00
Changming Sun
23da468154
Upgrade cmake version to 3.24 (#13569)
### Description
Upgrade cmake version to 3.24 because I need to use a new feature that
is only provided in that version and later. Starting from cmake 3.24,
the
[FetchContent](https://cmake.org/cmake/help/latest/module/FetchContent.html#module:FetchContent)
module and the
[find_package()](https://cmake.org/cmake/help/latest/command/find_package.html#command:find_package)
command now support integration capabilities, which means calls to
"FetchContent" can be implicitly redirected to "find_package", and vice
versa. Users can use a cmake variable to control the behavior. So, we
don't need to provide such a build option. We can delete our
"onnxruntime_PREFER_SYSTEM_LIB" build option and let cmake handle it.
And it would be easier for who wants to use vcpkg.


### Motivation and Context

Provide a unified package management method, and get aligned with the
community. This change is split from #13523 for easier review.
2022-11-04 22:58:51 -07:00
George Nash
0296bc74c1
oneDNN ep bf16 enabling (#13484)
### Description
 This adds bfloat16 support to the oneDNN ep.

When using the oneDNN ep this enables bfloat16 support for the following
ops:

Exp, Sigmoid, Tanh, Relu, MatMul, Gelu, BiasGelu, Add, Sub,
Mul, Div, Div, Sqrt, Pow, ReduceMean,  Abs, Cast, Equal, Exp,
FastGelu, FusedMatMul, Gemm, Greter, GreaterOrEqual, LeakyRelu,
Less, LessOrEqual, LRN, ReduceOps, Reshape, Squeeze, Transpose,
 and Unsqueeze.

LayerNorm with some internal casting. 
BatchNorm only enabled BFloat16 for input and output, scale and bias
still need fp32 input.

Added bfloat16 unit tests for all of the operators in question. When
possible we reused the already existing unit tests that were added by
CUDA and ROCM eps.

In many of the unit tests an unusual pattern will be seen 

    #if defined(USE_DNNL)
    TEST(Test, bfloat16_test) {
      #if defined(USE_DNNL)
        // oneDNN ep specific code
      #endif
       //test code
    }
    #endif

Although it looks unusual this was purposely done if another ep
implements bfloat16 support for that operator they will be able to
enable the unit test by adding there execution provider to the first
line without needing to edit inside the test.

Example: `#if defined(USE_CUDA) || defined(USE_DNNL)` see the
MatMul_float16 test in matmul_test.cc for and example of how this is
useful.

Additionally two new ISA checks (AVX512_BF16 and AMX-BF16) were added to
the cpuid_info code in. This was important to detecting is bfloat16
operations are supported by the CPU.

### Motivation and Context
This expands the capabilities of the oneDNN execution provider to
support models containing bfloat16 operations.

Signed-off-by: George Nash <george.nash@intel.com>
Signed-off-by: Ruihan-Yin <ruihan.yin@intel.com>
2022-11-04 18:25:09 -07:00
Edward Chen
4401f50c5e
Change GSL download to use HTTPS URL. (#13563) 2022-11-04 18:01:18 -07:00
cloudhan
2de883c592
Update CK and fix performance issue on dev machine (#13531)
1. Update CK to its latest develop branch
2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's
performance, ensure it is properly configured.
- The flags are propagated from target `hip-lang::device`'s
`INTERFACE_COMPILE_OPTIONS`, we must not manually add the flags.
- Instead, we must ensure this target is properly configured by checking
_CMAKE_HIP_DEVICE_RUNTIME_TARGET is set.

TL,DR

`hip-lang::device` sometime will be not be properly configured if our
`CMAKE_PREFIX_PATH` is not configured carefully. In the CI docker, the
configuration is in good state, but on dev machine it is not, which then
silently result poor performance for kernels. We fixed it in this PR and
add a guard to avoid unsuccessful future editing and to prevent
convoluted debugging process.

`_CMAKE_HIP_DEVICE_RUNTIME_TARGET ` is shared in
`/opt/rocm/lib/cmake/hip-lang/hip-lang-config.cmake` and it is internal
to
[CMake](https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6121/diffs),
the variable name will not be changed in the foreseeable future.
2022-11-03 19:32:30 +08:00
George Nash
77be22f379
[oneDNN ep] Update from oneDNN v2.7.0 to oneDNN v2.7.1 (#13536)
The oneDNN 2.7.1 release includes multiple functional and performance
improvements.

Signed-off-by: George Nash <george.nash@intel.com>

### Description
Update the oneDNN library from 2.7.0 to 2.7.1. This contains multiple
functional and performance improvements.



### Motivation and Context
This is a minor point release from the oneDNN library that gives
performance and functional fixes that were found in the oneDNN 2.7
library shortly after release.

Signed-off-by: George Nash <george.nash@intel.com>
2022-11-02 15:57:49 -07:00
Changming Sun
b1e1b25e04
Delete CUB (#13534)
### Description
Delete CUB

### Motivation and Context
Because it is already in CUDA SDK.
2022-11-02 13:06:22 -07:00
Wei-Sheng Chin
b5904c40dd
Enable ORT in TorchDynamo (#13259)
This PR enables ORT to execute graphs captured by TorchDynamo. Major compilation code is in `OrtBackend.compile` in ort_backend.py. `register_backend.py` is for plugging `OrtBackend` into TorchDynamo as a compiler.
2022-11-01 11:19:29 -07:00
PeixuanZuo
c8886c5b4c Revert "Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493)"
This reverts commit 4dd053cc15.
2022-11-01 13:05:55 +08:00
Baiju Meswani
c557a55816
Fix on-device training ExportModelForInferencing api (#13510) 2022-10-31 21:29:06 -07:00
Edward Chen
2ecd1d6622
Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
cloudhan
4dd053cc15
Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493)
1. Update CK to its latest develop branch
2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's
performance, add it.
2022-10-28 09:36:00 -07:00
JiCheng
20c3c35c33
[XNNPACK] support building xnnpack EP for IOS (#13461)
### Description
support building xnnpack for IOS


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-10-28 15:03:04 +08:00
Changming Sun
4a20c0d98b
Delete zlib.cmake (#13467)
Delete the file because it is not included by any other file.
2022-10-27 15:36:04 -07:00
Rui Ren
136e15bfaf
revert cmake external file (#13459) 2022-10-26 11:38:15 -07:00
cloudhan
2748f38362
Drop hip_add_library (#13406)
Switching to use CMake's builtin hip language support.
2022-10-25 12:57:48 +08:00
cloudhan
928c9fc348
Hipify during build instead of before cmake config (#13333)
### Description

Currently, hipify happens before cmake is configured and then cmake glob
the directories. This get rids of thoes customized python threading
logic and opt for build system itself to generate the files.

This also supersede the half baked branch
[sukha/hipify-with-cmake](https://github.com/microsoft/onnxruntime/tree/sukha/hipify-with-cmake)
2022-10-20 22:46:22 -07:00
Ted Themistokleous
a561fde126
MIGraphX Execution Provider: Stream Synchronization (#12899)
**Description**: Changes to the MIGraphx execution provider code to
allow for stream synchronization on the gpu side

**Motivation and Context**
Performance boost by removing redundant host to device synchronizations 

The current implementation of the execution provider continuously calls
hipDeviceSynchronize() between computations which adds overhead and an
idle wait between the GPU's computations. This is noticeable during
device

This change leverages new functionality that's been added to MIGraphX to
allow for GPU side synchronization which avoids the need for
host->device waits.

To maintain backwards compatibility with older MIGraphX versions, the
compile time define MIGRAPHX_STREAM_SYNC has been added to the API to
allow for older version operate with newer builds of onnxruntime without
loss of functionality to the current feature set as of (08/09/22)

Co-authored-by: Ted Themistokleous <tthemist@amd.com>
2022-10-14 10:23:51 -07:00
cloudhan
790e363909
Reland: Change ROCm to use tunable GEMM (#13231)
Reland: Change ROCm to use tunable GEMM (#12853)
2022-10-13 21:49:42 -07:00
Wei-Sheng Chin
dc324b1d90
[LazyTensor] Make LORT Build Again with Latest PyTorch (#13303)
`python setup.py develop` doesn't install PyTorch as a normal package in
site-packages anymore, and the user must stay at PyTorch's root
directory to call `import torch`. This will break LORT tests because
LORT tests contains `import torch` and are called outside PyTorch root
directory. To make PyTorch a normal package again, this PR build PyTorch
with `python setup.py install`.
2022-10-13 13:56:17 -07:00
Dmitri Smirnov
25c0a66934
Natvis adjustments to make debugging bearable (#13237)
### Description

- Fix Abseil::InlinedVector inlined storage visualization
- Fix typo in protobuf natvis.
- Add basic gsl.natvis


### Motivation and Context
Debugging is hard.
2022-10-10 10:06:55 -07:00
cloudhan
51ac6617f5
Fix warnings and enable dev mode for ROCm CI (#13223)
Fix warnings and enable dev mode for ROCm CI:

* Fix ROCm headers complaining "This file is deprecated. Use the header file from ..."
* Disable warning signed and unsigned compare for kernel explorer
* Fix unused and nondiscard warnings
* Enable dev mode for ROCm CI
* Walkaround error "unknown warning option '-Wno-nonnull-compare'" in kernel explorer by using '-Wno-unknown-warning-option' to ignore the unknown option
* Fix error "unused parameter 'mask'"
* Fix warning "instantiation of variable 'onnxruntime::rocm::Consts<float>::One' required here, but no definition is available", etc. Fixed by using C++17's inline (implied by constexpr) static initialization.
* Remove unused variable
* Add the missing `override` specifier
2022-10-07 09:45:01 +08:00
Edward Chen
4e37464cc5
Add build configuration to binary size checks pipeline. (#13208)
Add another build configuration to binary size checks pipeline. Enable additional configurations to be added more easily.
2022-10-05 12:39:19 -07:00
cloudhan
72076b1eb2
Update ROCm CI to use HIP LANGUAGE (#13214)
Update for ROCm CI before reland tunable GEMM #12853. This PR also update
composable kernel to use CMakes's HIP language support so that we can
mix C/C++ compiler with HIP compiler instead of locking to hip-clang
2022-10-05 16:15:16 +08:00
Yulong Wang
054464dce2
fix XNNPACK on WebAssembly SIMD (#13161)
### Description

fix XNNPACK on WebAssembly SIMD.

Flag "-msimd128" need to be applied to every source file when compiling
WASM SIMD. Currently only a part of the source files are compiled with
this flag so we get inconsistent result for
`sizeof(xnn_f32_minmax_params)` because the type definition include a
`#ifdef` for `__wasm_simd128__`. The inconsistency causes writing
garbage data to a stack variable and eventually cause the crash.

XNNPACK libraries are C libraries so need to apply the build flags not
only to `CMAKE_CXX_FLAGS` but also to `CMAKE_C_FLAGS`.
2022-09-30 16:34:15 -07:00
George Nash
b76a65c784
Upgrade the oneDNN ep to use oneDNNv2.7 (#13175)
### Description
This updates the oneDNN library used by oneDNN ep from version 2.6 to
version 2.7



### Motivation and Context
This brings in the many improvements incorporated into the oneDNN
library to the oneDNN execution provider.

Signed-off-by: George Nash <george.nash@intel.com>
2022-09-30 12:29:17 -07:00
cloudhan
c93cb8f949
Revert "Enable ROCm to use tunable GEMM" (#13160)
Reverts microsoft/onnxruntime#12853 due to CI pipeline problem.
2022-09-30 14:01:16 +08:00
cloudhan
32c2c4b480
Change ROCm to use tunable GEMM (#12853)
Change ROCm to use tunable GEMM. It is not enabled in this PR. This will drastically improve GEMM performance in some shapes and dtypes configuration. This will benefit the overall performance for BERT inference and hopefully, training, when enabled.
2022-09-28 16:21:54 +08:00
Rachel Guo
9a44a69653
Refactor NNAPI EP OpBuilder/OpSupportChecker structure (#13065)
### Description
<!-- Describe your changes. -->

As title

-Split long OpBuilder and OpSupportChecker files into individual
operator files.

-Add OpBuilder/SupportChecker registry factories.

-Combine the functionality of op_builder and op_support_checker into one
op_builder.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

The NNAPI OPBuilder was splitted into OPBuilder (For EP::Compile) and
OPSupportChecker (for EP::GetCapability)
At the time it was reasonable choice, but OPBuilder/OPSupportChecker
share some logic and has to use addition helper.

Clean up now to make NNAPI OPBuilder/OPSupportChecker into single
OPBuilder (similar to what CoreML EP has)
2022-09-27 17:12:09 -07:00
Changming Sun
b25437ec41
Upgrade protobuf version (#13100)
Upgrade protobuf version from 3.18.1 to 3.18.3 to address CVE-2022-1941
2022-09-26 21:30:28 -07:00
RandySheriffH
77a066c700
Drop nuphar from java API (#13107)
Drop nuphar from:

- java API
- tvm.cmake
- run_build.sh
2022-09-26 17:06:08 -07:00
RandySheriffH
a83a9ed6b0
Remove miscellaneous nuphar configs (#13070)
Remove a handful of nuphar related configurations after deprecation.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 13:41:28 -07:00
Dale Phurrough
2ae33b3613
fix CuDNN lib path for Windows (#12974)
Fixes microsoft/onnxruntime#12969

### Motivation and Context

Build is broken, can't find cudnn.lib with nvidia official install of
cuDNN

Alternative method is to use `IF(EXISTS
${onnxruntime_CUDNN_HOME}/lib/x64/cudnn.lib)` to test for legacy
location and only add the legacy dir to the path, else add the current
official `lib/` dir.
2022-09-26 13:23:38 -07:00
Changming Sun
eafd67b8fd
Update CUDA version to 11.6 and refactor python packaging pipeline (#13002)
1. Update CUDA version from 11.4 to 11.6.
2. Update Manylinux version
3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet.
4. Refactor python packaging pipeline: 
    a. Split Linux GPU build job to two parts, build and test, so that the
build part doesn't need to use a GPU machine
    b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file.
5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipeline to fail. I have created an ADO task for this.
2022-09-23 00:29:27 -07:00
cloudhan
a24b41d92e
Move all TunableOp related falicilities to EP level directory (#12857)
Some Ops in EP directory instead of contrib_ops directory will
require TunableOp. We will also need to add EP level session tuning
options for it. So move those code all at once.

Also remove duplicated utility functions.
2022-09-23 11:10:19 +08:00
wangxiyuan
952c99304a
Add CANN EP (#12416)
**Description**: This PR adds Ascend CANN execution provider support.

**Motivation and Context**
- Why is this change required? What problem does it solve?
As the info shown in the issue. CANN is the API layer for Ascend
processor. Add CANN EP can allow user run onnx model on Ascend hardware
via onnxruntime
  The detail change:
  1. Added CANN EP framework.
  2. Added the basic operators to support ResNet and VGG model.
  3. Added C/C++、Python API support
- If it fixes an open issue, please link to the issue here.
   https://github.com/microsoft/onnxruntime/issues/11477

Author: 
lijiawei <lijiawei19@huawei.com>
wangxiyuan <wangxiyuan1007@gmail.com>

Co-authored-by: FFrog <ljw1101.vip@gmail.com>
2022-09-22 14:53:40 -07:00
sfatimar
cccbe90764
Openvino ep 2022.2 v4.2 (#13023)
This changes are to align OV 2022.2 Release with ORT . Changes
CPU FP16 Support, dGPU Support, RHEL Dockerfile, Ubuntu 20 Dockerfile 

**Motivation and Context**
- This change is required to ensure ORT-OpenVINO Execution Provider is
aligned with latest changes.
- If it fixes an open issue, please link to the issue here.

Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: shamaksx <shamax.kshirsagar@intel.com>
Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com>
Co-authored-by: pratiksha <mohsinx.mohammad@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com>
Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>
2022-09-22 12:31:40 -07:00
Adam Louly
268bfe2a5d
python training api bindings (#12610)
**Description**: **Python API Bindings for on device training. **
**Motivation and Context**
- This PR contains api bindings so python users can perform a whole
training loop.

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2022-09-16 09:38:24 -07:00
sumitsays
363c695dad
Update DML 1.9.0 to 1.9.1 (#12966)
Update DML to 1.9.1

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
2022-09-15 10:54:22 -07:00