Commit graph

54 commits

Author SHA1 Message Date
Changming Sun
bf33919afb
Update absl and gtest to fix an ARM64EC build error (#18735)
### Description
Update absl and gtest to fix an ARM64EC build error


### Motivation and Context
We need to get an important fix into ORT.
The fix is:

8028a87c96
2023-12-07 15:55:17 -08:00
Edward Chen
14a343441d
Fix Objective-C static analysis build (#18606)
- Patch abseil to fix a compile error about not finding `cxxabi.h`.
- Fix some static analysis warnings.
2023-11-28 17:14:20 -08:00
cloudhan
6f3c1f9dc9
[ROCm] Update ck for GemmFloat8 (#18487) 2023-11-23 12:06:19 +08:00
Ye Wang
f9af94009b
onboard MoE (#18279)
### Description
<!-- Describe your changes. -->
1. Introduce MoE CUDA op to ORT based on FT implementation.
2. Upgrade cutlass to 3.1.0 to avoid some build failures on Windows.
Remove patch file for cutlass 3.0.0.
3. Sharded MoE implementation will come with another PR

limitation: __CUDA_ARCH__ >= 700


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-11-14 16:48:51 -08:00
PeixuanZuo
a62a500ae1
[ROCm] Update CK version (#17628)
update ck version
2023-11-13 15:43:38 -08:00
Scott McKay
8d298f6f78
Fix xnnpack compile error on arm32 (#18291)
### Description
<!-- Describe your changes. -->
Use different march flag to workaround what appears to be a clang issue.

See https://github.com/tensorflow/tensorflow/issues/59970 for links to
various relevant pieces of info/discussions.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-11-12 08:59:20 +10:00
Scott McKay
64c91d790b
Fix ability to use patch on Windows CI machines (#18356)
### Description
<!-- Describe your changes. -->
Add 32-bit patch binary and infra to fallback to it. The Azure devops
Windows CIs are missing patch.exe from their git install for some reason
so the default `find_package(Patch)` fails as that is where it expects
to find it.

Remove Eigen patch. Underlying issue was fixed in source 3 years ago by
c6c84ed961
and the patch command is invalid (args are for git apply not patch).

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Make usage of patch consistent across all CIs
Fix https://github.com/microsoft/onnxruntime/issues/15248
2023-11-11 07:32:14 +10:00
Scott McKay
4f2096be38
Update XNNPACK to latest version (#18038)
### Description
<!-- Describe your changes. -->
Update XNNPACK to latest version
- adds fp16 kernels and various other improvements
- requires pthreadpool update as well

Most code updates in the XNNPACK EP are to adjust to the new XNNPACK API
- 'setup' is split into 'reshape' and 'setup'
-  some ops use a workspace buffer
   -  copied workspace allocation from XNNPACK unit test code
- some suffixes changed 

Added wrapper for XNNPACK caches to base XNNPACK EP kernel
- simplifies usage
- XNNPACK split out the code and weights caches, but the code cache
isn't currently usable via the public API
- we could use the internal types if we think it's required for
performance reasons. non-trivial though as we'd need to propagate ifdef
values from the XNNPACK build up to the ORT build.
- using XNNPACK internals would also mean we would not be able to
support using a pre-build XNNPACK package
    - not an issue currently
  
Fixed opset registration for internal NHWC domain
- was not being tied to the ONNX version, so nodes inserted by layout
transformation had the incorrect opset
- a number of other places needed updating once this issue was fixed

Remove support for NCHW Resize from XNNPACK EP so it's NHWC only
- we only supported NCHW for fp32,
- doing so adds complexity in multiple places (XNNPACK EP kernel
implementation, layout transformation and transpose optimization)
- unclear if that complexity provides any benefit. can add back if
required by production scenario

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
We're looking at enabling fp16 support for CoreML and NNAPI. If we do
that we need a good fallback story if the CPU EP will be used. The
XNNPACK fp16 kernels will hopefully provide that.

NOTE: This PR doesn't add fp16 support to the XNNPACK EP kernels. That
can be done as required in separate EPs and should be relatively simple
to do.
2023-11-03 09:04:28 -07:00
liqun Fu
2be4dc6d04
ONNX 1.15 integration (#17125)
### Description
this is for ORT 1.17.0 - make ORT to use ONNX release 1.15.0 branch. Eventually will update to the release tag once ONNX 1.15.0 is released


### Motivation and Context
Prepare for ORT 1.17.0 release. People can start work on new and updated ONNX ops in ORT.
---------

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
2023-09-26 14:44:48 -07:00
Changming Sun
5af6279440
Fix Android build (#17540)
### Description
The new cpuinfo library doesn't use clog on Android. Newer XNNPack
versions have removed the dependency on clog, but the one we use still
has it. So I cherry-pick the XNNPack to our patch file.
2023-09-14 07:36:01 -07:00
cloudhan
4e6cec4d09
Update ck and enable test (#16383)
Apply the fix in https://github.com/ROCmSoftwarePlatform/composable_kernel/issues/728
Introduce more kernel instances and allow the introduction of streamk and splitk.
2023-08-22 11:08:55 +08:00
Yulong Wang
5704e71b89
update onnx.patch to apply wasm build break fix (#17104)
### Description
This PR fixes build break for WebAssembly introduced in
6986981482
(435ad2b1d8).

This change updates onnx.patch in onnxruntime repo. the corresponding PR
in onnx repo is: https://github.com/onnx/onnx/pull/5495.

It may takes a while for the next onnx version bump.
2023-08-11 15:00:39 -07:00
Edward Chen
1261d0b8ba
Fix some build issues on MacOS with Xcode 14.3. (#15878)
- Fix flatbuffers flatc warning, unused-but-set-variable.
- Address `-Wshorten-64-to-32` warnings (fix in our code, allow in dependencies' code).
- Update CI builds to use Xcode 14.3.
- Update minimum iOS version to 12.0.
- Update Mac hosted agents to MacOS 13 where possible.
2023-06-07 12:07:11 -07:00
sdegrande
cf062dbdb1
FlatBuffers fails to compile with gcc13. (#15787)
When building the FlatBuffers dependencies, gcc13 emits a
stringop-overflow warning. All warnings being turned into errors, that
fails the compilation of FlatBuffers, and as a consequence also fails
the build of onnxruntime.

This commit adds the application of a patch to FlatBuffers's
CMakeList.txt, to add -Wno-error=stringop-overflow to the
CMAKE_CXX_FLAGS.
2023-05-11 11:20:19 -07:00
Changming Sun
15f7dca9fb
Update protobuf to 3.21.x (#15245)
### Description

Fixed
[AB#10092](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/10092),
[AB#11753](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/11753),
[AB#11759](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/11759)

### Motivation and Context
The one we use has a security issue in Java, though we don't use that
version's protobuf java package.
2023-03-29 14:08:18 -07:00
Ye Wang
2ee822d483
Extend memory efficient attention coverage in Attention/MHA cuda op (#15064)
### Description
<!-- Describe your changes. -->

1. upgrade cutlass to 3.0 that containing attn_bias support.
2. extend Attention/MHA to use memory efficient attention when
rel_pos_bias with [1, num_head, s, s*] and 1d mask with [2 * batch_size
+ 1] are present.

new mask format introduction:
MASK_1D_KEY_SEQ_LEN_START,  
[3 * batch_size + 2] with [key_len[0], ..., key_len[batch_size - 1],
query_start[0], ..., query_start[batch_size - 1], query_end[batch_size -
1], key_start[0], ..., key_start[batch_size - 1], key_end[batch_size -
1]]

e.g
2D mask with [[1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]] converts to this
1D mask is [3, 5, 0, 6, 12, 0, 6, 12]


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

It potentially benefits tnlrv6 and t5(encoder)

---------

Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-03-23 11:05:17 -07:00
cloudhan
a5ab88247b
ROCm Flash Attention (#14838)
Adds flash attention via composable kernel for ROCm EP
2023-03-16 10:39:58 +08:00
Yi Zhang
80f807c03d
upgrade protobuf to 3.20.2 and onnx to 1.13 (#14279)
### Description
upgrade protobuf to 3.20.2, same as onnx 1.13.0

### Motivation and Context
Per component governance requirement and Fixes #14060

unused-parameter error occurs in 2 conditions.
1. compile protolbuf

`onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66:
error: unused parameter ‘prototype’ [-Werror=unused-parameter]`
2. include onnx_pb.h
```
2023-01-28T10:20:15.0410853Z FAILED: CMakeFiles/onnxruntime_pybind11_state.dir/onnxruntime_src/onnxruntime/python/onnxruntime_pybind_iobinding.cc.o 
......
2023-01-28T10:20:15.0466024Z                  from /build/Debug/_deps/onnx-src/onnx/onnx_pb.h:51,
2023-01-28T10:20:15.0466958Z                  from /onnxruntime_src/include/onnxruntime/core/framework/to_tensor_proto_element_type.h:10,
....
2023-01-28T10:20:15.0609678Z /build/Debug/_deps/onnx-build/onnx/onnx-operators-ml.pb.h:1178:25:   required from here
2023-01-28T10:20:15.0610895Z /onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66: error: unused parameter ‘prototype’ [-Werror=unused-parameter]
2023-01-28T10:20:15.0611707Z cc1plus: all warnings being treated as errors

```

https://dev.azure.com/onnxruntime/2a773b67-e88b-4c7f-9fc0-87d31fea8ef2/_apis/build/builds/874605/logs/22
2023-01-31 12:55:09 -08:00
PeixuanZuo
4eac0db3af
[ROCm] Add GemmFastGelu CK implementation (#13759)
### Description
<!-- Describe your changes. -->

Add GemmFastGelu CK implementation.

TODO 
1. The performance of CK GemmFastGelu in ORT is not good as using CK
directly, still need to investigate the reason and improve the CK in
ORT.
`GemmFastGeluUnfused float16 NN m=49152 n=3072 k=768 2298.8064 us 100.89
tflops`
`withbias DeviceGemmMultipleD_Xdl_CShuffle<256, 256, 128, 32, 8, 8,
Default> LoopScheduler: Default, PipelineVersion: v1 float16 NN m=49152
n=3072 k=768 2401.9799 us 96.56 tflops`

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2023-01-05 17:53:30 +08:00
Changming Sun
fc2a6db573
Update absl to the latest release (#13990)
### Description
Update absl to a new version

### Motivation and Context
The new version contains fixes that are needed for Nvidia GPU build.
Once we update it to that version, we don't need to maintain our private
patches for Nvidia GPU build.
2022-12-19 14:25:13 -08:00
Changming Sun
3e9e5e9d6d
Patch Protobuf and ONNX's cmake files and enforce BinSkim check (#13694)
Patch Protobuf and ONNX's cmake files and enforce BinSkim check.

This PR has overlap with #13523 . I would prefer to get this one merged
first so that we can finished the BinSkim work, and I try to make this
PR as small as possible.
2022-11-18 10:09:47 -08:00
Changming Sun
efcbdac58e
Remove the cmake option: onnxruntime_DEV_MODE (#13573)
1. Remove the cmake option onnxruntime_DEV_MODE and replace it with
"--compile-no-warning-as-error"
2. Suppress some GSL warnings because now we treat nvcc diag warnings as
errors
2022-11-07 09:06:28 -08:00
cloudhan
2de883c592
Update CK and fix performance issue on dev machine (#13531)
1. Update CK to its latest develop branch
2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's
performance, ensure it is properly configured.
- The flags are propagated from target `hip-lang::device`'s
`INTERFACE_COMPILE_OPTIONS`, we must not manually add the flags.
- Instead, we must ensure this target is properly configured by checking
_CMAKE_HIP_DEVICE_RUNTIME_TARGET is set.

TL,DR

`hip-lang::device` sometime will be not be properly configured if our
`CMAKE_PREFIX_PATH` is not configured carefully. In the CI docker, the
configuration is in good state, but on dev machine it is not, which then
silently result poor performance for kernels. We fixed it in this PR and
add a guard to avoid unsuccessful future editing and to prevent
convoluted debugging process.

`_CMAKE_HIP_DEVICE_RUNTIME_TARGET ` is shared in
`/opt/rocm/lib/cmake/hip-lang/hip-lang-config.cmake` and it is internal
to
[CMake](https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6121/diffs),
the variable name will not be changed in the foreseeable future.
2022-11-03 19:32:30 +08:00
PeixuanZuo
c8886c5b4c Revert "Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493)"
This reverts commit 4dd053cc15.
2022-11-01 13:05:55 +08:00
cloudhan
4dd053cc15
Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493)
1. Update CK to its latest develop branch
2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's
performance, add it.
2022-10-28 09:36:00 -07:00
JiCheng
20c3c35c33
[XNNPACK] support building xnnpack EP for IOS (#13461)
### Description
support building xnnpack for IOS


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-10-28 15:03:04 +08:00
cloudhan
72076b1eb2
Update ROCm CI to use HIP LANGUAGE (#13214)
Update for ROCm CI before reland tunable GEMM #12853. This PR also update
composable kernel to use CMakes's HIP language support so that we can
mix C/C++ compiler with HIP compiler instead of locking to hip-clang
2022-10-05 16:15:16 +08:00
cloudhan
46c074a6c8
Update composable kernel and enable experimental inter wave scheduling (#12626)
Update ck to latest master and enable interwave scheduling
2022-08-25 22:19:41 -07:00
cloudhan
f39354d7cb
Add composable kernel GEMM baseline for kernel explorer (#12364)
* Split GemmBase RocBlasGemm

* Add composable kernel GEMM baseline

* Make linter happy

* Address review comment

* Update bert cases with batchsize

* Adjust includes to fix IWYU lint

* Only builds and links used ck kernels to improve building time

* Remove warmup run on SelectImpl

* Add comment to utility function

* Mute cpplint

* Make RocBlasGemm<T>::SelectImpl semantically correct

* Add reduced basic test cases for ck gemm

* More robust gemm testing

* Fix warnings

* Fix grammar
2022-08-04 17:32:20 -07:00
Scott McKay
6bf6bac1fd
Add patching of xnnpack CMakeLists.txt to allow building with Emscripten. (#11829) 2022-06-14 09:31:17 +10:00
Ben Niu
20fbf603d3
Fix ARM64EC build breaks (#11111)
Apply this 4c015dbb49 to fix ARM64EC build breaks.
2022-04-05 10:00:42 -07:00
Changming Sun
f87a06cd96
Patch absl so that it doesn't disable important VC++ warnings (#10836)
This PR is just for making onnxruntime passing Binskim rules.

Below is how I made it:

git clone absl repo, checkout the version we are using
Then apply our patch file
Make modifications
Regenerate the patch file by "git diff > C:\src\onnxruntime\cmake\patch\xxx.patch"
Then submit the change to our repo
You will need to repeat the steps when you need to advance the absl commit or add more changes to it.
2022-03-10 15:35:39 -08:00
Changming Sun
283d0c47b4
Update our absl cmake files (#10762) 2022-03-04 09:28:04 -08:00
Dmitri Smirnov
7e092a7e3f
Reduce number of memory allocations based on a customer profiling case (#10193)
Add abseil and inlined containers typedefs
Introduce TensorShapeVector for shape building.
Use gsl::span<const T> to make interfaces accept different types of vector like args.
Introduce InineShapeVectorT for shape capacity typed instantiations
Refactor cuda slice along with provider shared interfaces
Refactor Concat, Conv, Pad
Build with Conv Einsum and ConvTranspose refactored.
Remove TesnorShape::GetDimsAsVector()
Refactor SliceIterator and SliceIteratorBase
Refactor broadcast
Refactor Pads for twice as long
Remove memory planner intermediate shapes vector
Refactor orttraining
Fix passing TenshroShapeVector to tests
Remove abseil copy and submodule, use FetchContent_Declare/Fetch
Path with separate command
Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.
2022-01-24 10:40:46 -08:00
suryasidd
1a5b75a554
[OpenVINO-EP] Remove support for OpenVINO 2020.2 (#6493)
* Removed OpenVINO 2020.2 support

* Updated documentation and build.py

* Removed unnecessary libraries from setup.py
2021-01-28 23:00:41 -08:00
S. Manohar Karlapalem
ff58f621fa
Remove nGraph Execution Provider (#5858)
* Remove nGraph Execution Provider

Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice

**Deprecation Notice**

| | |
| --- | --- |
| Deprecation Begins	| June 1, 2020 |
| Removal Date |	December 1, 2020 |

Starting with the OpenVINO™ toolkit 2020.2 release, all of the features
previously available through nGraph have been merged into the OpenVINO™
toolkit. As a result, all the features previously available through
ONNX RT Execution Provider for nGraph have been merged with ONNX RT
Execution Provider for OpenVINO™ toolkit.

Therefore, ONNX RT Execution Provider for **nGraph** will be deprecated
starting June 1, 2020 and will be completely removed on December 1,
2020. Users are recommended to migrate to the ONNX RT Execution Provider
for OpenVINO™ toolkit as the unified solution for all AI inferencing on
Intel® hardware.

* Remove nGraph Licence info from ThirdPartyNotices.txt

* Use simple Test.Run() for tests without EP exclusions

To be consistent with rest of test code.

* Remove nGraph EP functions from Java code
2020-11-19 16:47:55 -08:00
S. Manohar Karlapalem
6d4f2f5bf9
OpenVINO EP v2.0 (#3585)
* Added FP16 transformations

* Revert "Added CMAKE_BUILD_TYPE to make building dynamic"

This reverts commit d3e17af1af655cfdc4d2fec33f52055caa525e85.

* Added FP16 transformations for FP16 builds

* Backend logic cleanup

Cleans the backend(intel_graph.*) code in the following ways:-

1. Minimize global usage: Since all the IR graphs need to be
re-generated on every Infer, it is bad practice to rely on globals
for their saving and usage as there would be multiple readers and
writers to the same global variable leading to incorrect usages or
contentions. This change replaces globals with locals where possible.
 This change also fixes an existing bug with due to
incorrect global usage.

2. Remove all unused functions.

3. Remove all unused headers and prepocessor directives.

* removed commented out code

* Disabled default optimization for Intel EP

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Fix missed plugins.xml for python bindings

* Fixed the build after latest master changes

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled unsupported ops for accelerators

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added some more disabled ops

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added environment variable to enable debugging

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added more debug statements

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Fixed unsupported ops list for GPU and VPU

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Fixed unsqueeze unit tests

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added error message to the status

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Overwrite Model proto with shape info from data

Overwrites the shape info of Model proto with the shape from
actual input data. Needed for inferring models with Dynamic
shapes.

* Removed print statement and disabled where op

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled Reshape with Empty initializer

* Added more debug statements for 1P

* Don't allow 1D inputs with symbol for dimension

* Disabled some 3rd phase ops

* Disabled split and added zero dimension check for OutputDefs

* Cleanup zero dimensionality check

* Added different data type check for inputs and initializers

* Added conditions for Mod, Cast and Pad

* Removed unused variable

* Disabled scan and added conditions for squeeze

* Added changes for fixing all C++ unit tests

* Implements Backend Manager class for caching

Backend Manager provides a layer of indirection between EP interface
and OV backend that provides caching services for models with
symbolic dims in input shapes.

* clean up commented blocks

* clang-formatting

* Read I/O type info from ModleProto

Read the tensor element type information from ModelProto object,
as FusedNode is no longer available.

* code cleanup

* clang-formatting

* Added print statement for jenkins

* Disabled some python tests

* Changed the path of convert fp32 to fp16 hpp

* Added conditions for BatchNorm in GetCapability

* Fixed failed tests

* Revert "Added conditions for BatchNorm in GetCapability"

This reverts commit c3c28c3b00d27892c42546b35dacdd807a48ee90.

* Added Intel to onnxruntime backends

* pick up vars set by OV package setupvars.sh

* Added conditions for Identity

* remove a few cout prints

* Added conditions for GPU_FP32 unit tests

* Revert "pick up vars set by OV package setupvars.sh"

This reverts commit 8199e029c03eae21a1a7ef6bfdc93d00e5d0198b.

* Commented out fatal message for protobuf

* Might need to be removed

* Add interface class for current backend

* moved common logic to base class

* simplified cpu backend

* Removed unused headers

* use vectors to save i/o tensors for windows compatibility

* move utils fxns to backend_utils namespace

* rename ov_backend to ibackend

* Factory pattern for backend creation

* rename CPU backend to Basic backend

* renamed to vad-M and added to factory list

* Added conditions for VPU

* Added print statements

* Changed the logic for checking for symbolic shapes

* Modified logic for zero dimension check

* Removed VPU single dimension condition

* Removed comments

* Modified logic in DimensionCheck method

* Remove legacy OpenVINO EP

Remove all the legacy code for OpenVINO EP. UEP code will take its
place going forward.

This change does NOT remove OVEP files in the following areas asa
they will be reused by UEP:-
1. Documentation: All .md files
2. Docker releated files
3. Python bindings
4. Java bindings
5. C# bindings
6. ORT Server
7. CI pipeline setup files

* Rename Intel EP to OpenVINO EP

* Added unique names to the subgraphs

* Removed subgraphs with only constant inputs

* Modified subgraph partitioning algorithm to remove const input subgraphs

* Apply suggestion to onnxruntime/core/providers/openvino/openvino_execution_provider.cc

* Tracking output names to fix the output order bug

* Changed output names to a unordered map

* Modified logic to check for symbolic input shapes

* Fixed a bug in Reshape check

* Added empty model path to Model constructor

* Made necessary changes to cmake to build from the binary package

* Changed INTEL_CVSDK_DIR to INTEL_OPENVINO_DIR

* Enable dyn device selection with C++ API

* Added Round operator to unsupported list

* Modified subgraph partition logic for MYRIAD

* Removed supported ops from the list

* Enable dyn dev selection in Py API's

* Add documentation for dynamic device selection

* Use MYRIAD || HDDL instead of VPU

* Removed temporary cast of Int64 to FP32

* Disabled unit Tests for CPU_FP32 and GPU_FP32

* Removed default "CPU" from unit tests to allow overriding

* Removed ops Concat, Squeeze, Unsqueeze from unsupported list

* Get the device id from info

* Removed overwriting device_id and precision

* Enabled ConvTranspose and EyeLike

* Reordered unsupported ops in alphabetical order

* Fixed syntax error

* Fixed syntax error

* Code clean-up: Handle exceptions, logs and formatting

Code formatted according to ORT coding guidelines.

* remove debug print from pybind code

* updated docs with ops and models

* formatting prints

* Added default values for c and j for openvino

* Overriding the values set for c and j to be 1
* BACKEND_OPENVINO should be empty if openvino is not in build

* Overriding c value with default for perftest

* fix VAD-M device string bug

* Add IE error details to exceptions

* Use IE specific device names in EP

* Add VAD-F (FPGA) device support

* Removed unecessary libraries from whl package

* Code changes for Windows compatibility

* Add VAD-F option to python API

* [revert before merge] cmake changes for RC

* Enable Windows build in CMake

* Unset macro OPTIONAL for windows builds

inference_engine.hpp's include chain defines a macro 'OPTIONAL'
which conflicts with onnx project's headers when using MSVC. So
would need to explictly unset it for MSVC.

* Use a single copy of plugin/IE::Core

Defined as a static member in Backend manager

* Remove restriction of single subgraphs for  myriad

* Passed subgraph name to Backend to enhance log statements

* Disabled zero dimension conditions

* Disabled concat to remove zero dims

* Enabled building ngraph as part of ORT

* Removed serializing and added versioning

* Fix CPU_FP32 unit tests

* Removed unecessary condition

* add ngraph.so.0.0 to .whl

* Check for zero dimensions only for inputs and outputs

* Restrict loading only 10 subgraphs on myriad

* Build ngraph.dll within UEP. Doesn't link yet

* Rename Linux included libngraph.so to libovep_ngraph.so

Renames locally built libngraph.so containing ONNX importer to
libovep_ngraph.so in order to avoid linkage conflicts with
libngraph.so supplied by OpenVINO binary installer.
Applies only for Linux builds.

* use output_name cmake properties for lib name

* fix .so name format in lib_name.patch

* CMake code cleanup

* Rename WIN32 included ngraph.dll to ovep_ngraph.dll

To avoid conflict with ngraph.dll distributed by openvino.

* Added myriad config for networks without 4 dimensions

* Loading the 10 max clusters for inference on myriad

* Refactor code and add Batching support

Encapsulate subgraph settings into context structs.

Add batching support for completely supported models.

* Disabled some broken tests

* use input_indexes to avoid batch-checking initializers

* Avoid static initialization order error on WOS

* Added candy to broken tests

* InternalCI changes for 2020.2

* Updated DLDT instructions

* Unsaved changed in install_openvino.sh

* Changes after manual check

* Remove custom ngraph onnx_import build for WOS

ONNX Importer on WOS does not have protobuf issue.

* Remove FP32ToFP16 ngraph pass

This conversion is performed implicitly within IE.

* Surround debug logic by #ifndef NDEBUG

* remove invalid TODO comments

* removed references to ngrpah-ep

* clang-formatting

* remove commented code

* comment edits

* updating copyright year to that of first OpenVINO-EP release

* remove redundant log msg

* Modified operator and topology support

* Update build instructions

* doc formatting

* Fixed clip unit tests

* Revert "Remove FP32ToFP16 ngraph pass"

This reverts commit ec962ca5f315a5658ad980e740196f19de2639c1.

* Applying FP16 transformation only for GPU FP16

* Fixed GPU FP32 python tests

* automatically use full protobuf

* disable onnxrt server for now

* Disabled upsample

* update dockerfile instructions

* Removed MO paths and added ngraph path

* Remove OVEP from ORT Server docs

Will put it back in after validation

* Updated path to Ngraph lib

* Disabled Resize and some other python tests

* Removed unnecesary header files

* Use commit SHA to fetch ngraph repo

* Avoid un-needed file changes due to version update

* Fixed clip tests

* Fixed Pow, max and min onnx tests

* build.md doc typo

* Update cmake patch command for ngraph src

* remove dead cmake code for onnxruntime_USE_OPENVINO_BINARY

* use spaces instead of tab

* remove commented code

* Add info about protobuf version

* edit debug env var and enable for WIN32

* specify only version tag of 2020.2 for dockerbuilds

* remove unnecessary file changes

* Pass empty string as default argument to C# tests

* Use ${OPENVINO_VERSION} to name openvino install directory in CI builds

* Enabled unnecessarily disabled tests

* Fixed ngraph protobuf patch

* Fixed error in protobuf patch

* Revert "Use ${OPENVINO_VERSION} to name openvino install directory in CI builds"

This reverts commit 89e72adb8bf3b9712f5c81c5e13fe68c6c0df002.

* Remove unsetting OPTIONAL macro

This is no longer used in recent ONNX update onnx/onnx@da13be2,
so this unset workaround is no longer necessary.

* Use a null string  default argument for C# API

* Set OpenVINO version yml files and pass to CI Docker builds

Git Tag info for DLDT as well as install directory are set
using this value.

This reverts commit 9fa9c20348ed72ae360a95c98e9b074d2f9fafc5.

* Documentation: recommendation and instructions for disabling ORT graph optimizations

* more doc updates

* Reduced the number of models according to CI time constraints

Co-authored-by: ynimmaga <yamini.nimmagadda@intel.com>
Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: Mikhail Treskin <mikhail.treskin@intel.com>
Co-authored-by: mbencer <mateusz.bencer@intel.com>
Co-authored-by: Aravind <aravindx.gunda@intel.com>
Co-authored-by: suryasidd <48925384+suryasidd@users.noreply.github.com>
2020-04-24 04:06:02 -07:00
Changming Sun
69bc8ce3c2 Upgrade protobuf to 3.11.3 2020-02-12 14:47:00 -08:00
Changming Sun
90b708f8a9
Update protobuf to 3.11.2 (#1928)
Update protobuf to 3.11.2 (#1928)
2019-12-27 18:28:18 -08:00
Sreekanth Yalachigere
4c996a8699 DNNL CMAKE update (#2548) 2019-12-05 13:48:57 -08:00
mikecaraman
358b517d49 [v2] Add ACL (Arm Compute Library) execution provider (#2258)
* Guard unused parameter

Guard unused parameter for Linux Arm and other cases.

* Add ACL (Arm Compute Library) execution provider

Add a new execution provider targeting Arm architecture based on Arm Compute Library.
Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models.
All unit tests are passing.

Comparative performance improvements for ResNet50v1 model obtained with
onnxruntime_perf_test:
		A72	2xA72	A53	4xA53
ACL vs CPU  	16%	9%	21%	13%

Usage documentation available in ACL-ExecutionProvider.

* Fix eigen unused parameter

Fix eigen unused parameter error for Arm cross-compilation.
2019-10-31 12:25:36 -07:00
Tomasz Dołbniak
72110d3508 Patch for the MKLDNN v1 segfaults (#2145) 2019-10-17 12:10:00 -07:00
Sreekanth Yalachigere
485c24b62d MKL-DNN 1.0 (#2134)
* MKL-DNN 1.0

* changed libmkldnn version to 1
2019-10-15 12:06:34 -07:00
Tomasz Socha
f93be8af90 Update nGraph to version 0.26 (#1965)
* Adjust ngraph cmake files to onnx 1.5.0

* Enable LSTM reverse direction mode in nGraph EP

* Enable full support for the Split op in nGraph EP

* Revert "Disable the unsigned input Shrink op tests for nGraph until the next update"

This reverts commit 257b42a55bdd98f804d4846868542b8e3aeb4b4e.

* Enable Gather and remove unused subgraph attribute

* Remove the unused param from AppendClusterToSubGraph

* Fix for the incorrect onnx opset version

* Use the r0.26 release branch before the tag is created

* Enable the quantizelinear and dequantizelinear for NGEP

* Use the v0.26.0-rc.2 tag in ngraph.cmake

* Add skip for modes others than default in Pad operator

* Reenable negative axis tests for ngraph

* Use temporary ngraph version

* Use branch name instead of SHA for temporary ngraph branch

* Use ngraph v0.26.0-rc.4

* Remove patch for missing symbol in MKLDNN

* Use MKLDNN 1.0 in ngraph

* Exclude the Pad op for opsets greater than 10

* Disable quantizelinear and dequantizelinear tests for ONNX 1.5.0

* Fix the onnx-headers related compilation errors

* ONNX libs linking fix

* Use a tag for ngraph and support more Pad modes

* Use the v0.26.0 release tag for nGraph

* Update ngraph to RC8 - bigobj flag for Windows builds

* Fix the MKLDNN constexpr error on Windows
2019-10-14 10:37:48 -07:00
Tomasz Dołbniak
69baf9e800 Update nGraph to v0.22.1 (#1582)
* Update nGraph to 0.21 and adjust the EP

* Share the graph initializers between custom ops

* Update nGraph to 0.22 and exclude Gather entirely

* Enable building on Windows with nGraph v0.21.1-rc.0

* Disable the unsigned input Shrink op tests for nGraph until the next update

* Line-shortening code refactor

* Fix for the master branch merge artifact

* MKLDNN patches adjustment for Windows

* Exclude MatMulInteger for non-const zero points

* Exclude ConvInteger for non-const zero points

* Enable full Cast op support

* Use the v0.22.1 tag

* Skip ConvTranspose_InvalidKernelShape test for ngraph provider

* Create sub-graph ModelProto from fused_node
2019-08-10 17:41:08 -07:00
xkszltl
be16b274fc Upgrade mklml and set march with official option. (#1469)
1. There's formal way for setting march.
2. Upgrade to new MKLML.

Besides, the mem patch can be drop for v1.0.0 since it's fixed in upstream.
2019-07-25 19:37:59 -07:00
Sreekanth Yalachigere
f3c74ec3e9 Reduce memory footprint of MKL-DNN EP (#1429)
* MKL-DNN EP memory fix patch

* Call default provider for Opset10

* opset 10 fix

* removed email header from patch

* UseSubgraph method refactored
2019-07-18 22:57:00 -07:00
R. G. Esteves
93528d9b3c Reduce memory footprint of nGraph (#1296)
* Fix unnecessary memory allocation in MKLDNN 1x1 convolution.

* remove the patch header.
2019-07-07 20:23:19 -07:00
R. G. Esteves
f4a9ccae99 Enable nGraph Debug ci test. (#1000)
* Enable nGraph Debug ci test.

* nGraph doesn't work with stack trace.

* Fix corrupt patch.
2019-05-23 19:58:35 -07:00
R. G. Esteves
17690355ed Support for building ngraph EP on Windows (#978)
* Enable windows support for ngraph ep

* Slight rewording of building for windows

* Streamline build of ngraph ep, disable C# bindings build
2019-05-07 14:14:50 -07:00