Commit graph

634 commits

Author SHA1 Message Date
Changming Sun
1b23b28706
Remove MKLML/openblas/jemalloc build config (#6212) 2020-12-30 17:18:19 -08:00
sfatimar
7347996942
Openvino ep 2021.2 (#6196)
* Enabling fasterrcnn variant and vehicle detector

* changes for 2021_2 branch

* yolov3_pytorch commit

* fixed braces in basic_backend.cc

* ci information added

* faster rcnn variant and vehicle detector changes were made in 2021.1 and not in 2021.2

* some changes to support unit tests

* disable some tests which are failing

* fix myriad tests for vehicle detector

* Did some cleanup
*cleaned up comments
*Disabled Add_Broadcast_0x1 and Add_Broadcast_1x0
tests on MYRIAD_FP16 backend due to a bug
*cleaned up capability_2021_2.cc file
*Removed extra conditions which were added
for some validation in backend_utils

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* yolov3 pytorch workaround to ensure that the output names are matched

* gemmoptest fixed on myriad

* Fixed MYRIADX CPP Test Failures

*Expand,GatherND,Range,Round op's
are only supported in model

*where op with float input data
types are not supported and fixed

*Scatter and ScatterElements op's with
negative axis are fixed

*Reshape op with 0 dim value are not
supported and fixed

*Disabled InstanceNorm_2 test on MYRIADX

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* make changes to yolov3 pytorch

* Fixed python unit tests
*Fixed failing python tests on vpu,
GPU and CPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes POW op failures on GPU_FP16

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Clean up capability_2021_2.cc

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated docx for MultiThreading option
*Added extra info on setting the num_of_threads
option using the API and its actual usage

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* fixed slice and removed extra prints

* Disabled failing python tests

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor changes added in capability_2021_2

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* made changes to slice to avoid failures

* Disabling FP16 support for GPU_FP32
->Inferencing an FP16 model on GPU_FP32
leads to accuracy mismatches, so we would
rather use GPU_FP16 to infer an FP16 model
on GPU Device

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated docx for Inferencing a FP16 Model

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* fix for mask rcnn

* Script for installing openvino from source

* Updated with openvino 2021.2 online installation

* code comment fixes
fixed accuracy mismatch for div

* Update OpenvinoEP-ExecutionProvider.md

updated for 2021.2 branch

* Update README.md

updated dockerfile documentation

* Update BUILD.md

build.md update documentation

* permission change of install_openvino.sh

* made changes to align with microsoft onnxruntime changes

* Updated with ov 2021.2.200

Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel/com>
Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: mohdansx <mohdx.ansari@intel.com>
2020-12-23 08:47:22 -08:00
Guoyu Wang
c339bb2da9
Remove ignored build warnings for pybind on Mac (#6165) 2020-12-17 19:54:28 -08:00
Tixxx
32c67c2944
Deprecating Horovod and refactored Adasum computations (#5468)
deprecated horovod submodule
refactored adasum logic to be ort-native
added tests for native kernel and e2e tests
2020-12-17 16:21:33 -08:00
George Wu
297c824807
remove dnnl_dll_path from post build copy (#6142) 2020-12-15 13:47:39 -08:00
Edward Chen
9810b9e02b
Reduce amount of compiled CUDA device code (#6118)
Move CudaKernel from cuda_common.h to a new separate header, cuda_kernel.h. Update include sites to use cuda_kernel.h instead if they need CudaKernel. Inclusions of cuda_common.h are now more lightweight.

Make corresponding changes for ROCM execution provider code.

Other minor cleanup.
2020-12-14 15:27:40 -08:00
Edward Chen
c8ac34d6a5
Fix DEBUG_NODE_INPUTS_OUTPUTS test by putting it in a separate process, clean up unused test_main.cc files. (#5949)
Move the DEBUG_NODE_INPUTS_OUTPUTS test into its own process. The implementation uses static variables which do not interact well with other tests.
Clean up old test_main.cc files which are no longer used.
2020-12-11 11:36:58 -08:00
Sherlock
a53f4dd379
Introduce VariadicAlias, remove hardcoded alias limits (#6106)
* Introduce VariadicAlias, remove hardcoded alias limits

* Include optional-lite in winml build

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-12-11 10:47:08 -08:00
Ryan Lai
753af576c4
If building inbox, hook up winrt_activation_handler for WinML Tests (#6074)
* If building inbox, hook up winrt_activation_handler with what is already defined in dllload.cpp

* Add base.h header

* Missed custom ops test
2020-12-10 14:41:01 -08:00
RandySheriffH
404982ded5
Enable varied input type for custom op (#6066)
* allow custom op taking varied types

* refactor test case

* add test model

* refactor test case

* enable copy elision

* update test case

* fix issue in ToString function
2020-12-09 15:10:42 -08:00
Edward Chen
abdbb5fc84
Reduction kernel optimization (#6088)
Optimize reduction kernel code by moving loads from global memory before computation.
Add CMake option to build CUDA code with --generate-line-info option.
2020-12-09 10:20:23 -08:00
ashbhandare
b1a75d0e98
Enable passing initial optimizer state while creating training session (#5869)
* Support to pass initial optimizer states to optimizer graph builder

* Changes for passing init optim state to training session config

* Pass optimizer state through cpp and python frontend

* Cleanup

* Review comments

* Fix windows and mac CI

* Review comments

* review comments

* Review comments

* Frontend review changes

* Fix CI
2020-12-08 21:20:51 -05:00
satyajandhyala
f68a256140
Android code coverage (#6061)
* Added Onnxruntime_GCOV_COVERAGE flag for Android.

* Set CMAKE_SYSTEM_NAME explicitly for Android.

* Added GCOV_PREFIX option to collect code coverage data.
Added a new python script to generate code coverage info.
Modified build pipeline to generate Android code coverage info.

* Added build command line option --android_coverage

* Added a comment describing the GCOV environment variables

* Fixed PEP8 issues.

* Added --android_coverage option to the build command.

* Increased Android emulator memory from 3K to 8K.

* Increased Android partition-size from 2GB to 4GB to overcome no-space-left-on-device error

* Removed source_dir from command line args.

* Use cwd absolute path to run tests.

* Added commands to output the contents of /data/local/tmp on the emulator.

* Added run_adb_shell function.

* Format changes.

* Removed keywd argument cwd.

* Removed Android in the --build_dir path.

* Removed commands added for debugging.

* Removed extra new-lines.

* Fix MacOs build pipeline failures by uninstalling openssl before running build script.

* Revert "Fix MacOs build pipeline failures by uninstalling openssl before running build script."

This reverts commit 90d0568fe533e9456c20d061a2d435c8fea48266.

* Change dir to the build directory where the tar file is copied.

* Changed the option from --android_coverage to --code_coverage

* Moved steps to generate Android code coverage to run_nnapi_code_coverage.sh

* Require --android option if --code_coverage is specified.

* No code coverage needed for onnx_test_runner.

* Expect that the emulator is running when the script is executed.

* Fixed the title in the buildpipeline step.

* Fixed the formatting issue.

* Added a command line argument, ORT_ROOT, to run_nnapi_code_coverage.sh script

Co-authored-by: Satya Jandhyala <satyajandhyala@Satyas-Mac-mini.local>
2020-12-08 10:55:02 -08:00
Ryan Lai
2878e8eb2e
Fix nuget build error (#6009) 2020-12-03 09:28:39 -08:00
Ryan Lai
897310f6fb
Add suspend handler with new telemetry event for UWP scenarios (#5907)
* Add suspend handler with new telemetry event

* Fix build warning

* Use cppwinrt from nuget

* Restore nuget packages

* add dependencies

* Add nuget_helpers

* Cleaned up

* Clean up

* Comment

* Add dependencies for the rest

* Remove unused line

* Update activation string

* PR comment to remove ALL
2020-12-01 20:26:18 -08:00
Changming Sun
2d9dcc4576
Add python 3.9 support (#5874)
1. Add python 3.9 support(except Linux ARM)
2. Add Windows GPU python 3.8 to our packaging pipeline.
2020-11-30 12:02:48 -08:00
Wenbing Li
1852ade75d
Enable the xcode build for Apple Silicon (arm64 MacOS) (#5924)
* fix the build script for macos/xcode

* add the version check

* correct the osx-arch configuration

* typo
2020-11-30 11:22:08 -08:00
Changming Sun
c5b4d9091c
Fix a tiny issue in onnxruntime_unittests.cmake (#5901) 2020-11-25 14:21:13 -08:00
baijumeswani
69b9368c93
Add unit tests to identify configuration migration scenarios for checkpointing (#5678) 2020-11-25 09:40:26 -08:00
Adam Pocock
8b83c51a35
[Java] Initial Apple Silicon support (#5891)
* Rearranging checks in onnxruntime_mlas.cmake to pickup Apple Silicon.

On an M1 Macbook Pro clang reports:

$ clang -dumpmachine
arm64-apple-darwin20.1.0

So the regex check needs to look for "arm64" first, as otherwise it
matches 32-bit ARM and you get NEON compilation failures.

* Adding Java side library loading support for Apple Silicon (and other aarch64 architectures).

* Adding Qgemm fix from @tracysh

* Fixes the java packaging on Windows.

* Missed a check in the java platform detector.
2020-11-24 15:51:40 -08:00
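The ordering issue described in commit 8b83c51a35 can be illustrated outside CMake. The sketch below is in Python rather than the CMake used in onnxruntime_mlas.cmake, and the patterns and function names are hypothetical. It only shows why a pattern for "arm64" has to be tested before a pattern that matches any string starting with "arm".

```python
import re


def classify_target(triple: str) -> str:
    """Classify a compiler target triple, testing the 64-bit pattern first."""
    # "arm64-apple-darwin20.1.0" also starts with "arm", so the more
    # specific 64-bit check has to run before the generic 32-bit one.
    if re.match(r"^(arm64|aarch64)", triple):
        return "arm64"
    if re.match(r"^arm", triple):
        return "arm"
    return "other"


def classify_target_wrong_order(triple: str) -> str:
    """Same checks in the problematic order, kept only for comparison."""
    if re.match(r"^arm", triple):
        return "arm"  # swallows arm64 triples as well
    if re.match(r"^(arm64|aarch64)", triple):
        return "arm64"
    return "other"


if __name__ == "__main__":
    triple = "arm64-apple-darwin20.1.0"  # value quoted in the commit message
    print(classify_target(triple))
    print(classify_target_wrong_order(triple))
```

With the checks in the wrong order, an Apple Silicon triple is treated as 32-bit ARM, which is the situation the commit message associates with NEON compilation failures.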
Ashwini Khade
705d093167
Update onnx (#5720)
* update onnx

* update docker image for testing
2020-11-24 11:20:15 -08:00
Zhang Lei
9992f0f812
Implement QLinear GlobalAveragePool with sse2/neon. (#5838)
Add QLinear Global Average Pool for quantization for ARM and SSE2.

Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>
2020-11-23 19:23:58 -08:00
Ryan Hill
ba739a8000
Convert OpenVINO into a shared provider (#5778)
Same as Dnnl and TensorRT before it, now with more methods and more cleanup.
2020-11-20 17:39:57 -08:00
Scott McKay
f0142da59c
Add NNAPI to providers that can be used via the python bindings. (#5867)
Update ORT model conversion script
  - add args for specifying optimization level and whether to use NNAPI
  - add logic to create a list of required ops and ORT format model that can be used with NNAPI
2020-11-21 09:18:35 +10:00
Hector Li
fd6e7d9c5c
Fix the arm64 build issue on some special OS for OpenVino (#5870)
CMAKE_LIBRARY_ARCHITECTURE is empty on some OSes
2020-11-19 21:13:02 -08:00
S. Manohar Karlapalem
ff58f621fa
Remove nGraph Execution Provider (#5858)
* Remove nGraph Execution Provider

Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice

**Deprecation Notice**

| | |
| --- | --- |
| Deprecation Begins	| June 1, 2020 |
| Removal Date |	December 1, 2020 |

Starting with the OpenVINO™ toolkit 2020.2 release, all of the features
previously available through nGraph have been merged into the OpenVINO™
toolkit. As a result, all the features previously available through
ONNX RT Execution Provider for nGraph have been merged with ONNX RT
Execution Provider for OpenVINO™ toolkit.

Therefore, ONNX RT Execution Provider for **nGraph** will be deprecated
starting June 1, 2020 and will be completely removed on December 1,
2020. Users are recommended to migrate to the ONNX RT Execution Provider
for OpenVINO™ toolkit as the unified solution for all AI inferencing on
Intel® hardware.

* Remove nGraph Licence info from ThirdPartyNotices.txt

* Use simple Test.Run() for tests without EP exclusions

To be consistent with rest of test code.

* Remove nGraph EP functions from Java code
2020-11-19 16:47:55 -08:00
Hariharan Seshadri
62508ef0e4
Revert "Remove MKLML build config (#5559)" (#5855) 2020-11-19 10:53:08 -08:00
Sheil Kumar
84c1340f9b
Refactor implementation of Tensor<T> and underlying buffer stores to improve binary size and maintainability (#5836)
* refactor tensor buffers to make cleaner

* refactor to make tensor backing buffer implementation smaller and cleaner

* missed virtual on destructor

* remove unnecessary static_pointer_cast

* add string vector accessor

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-11-18 14:56:47 -08:00
Justin Stoecker
bd236ecc26
Switch to unified DirectML 1.4.0 redistributable (#5794)
Transitions from the ORT-only DML NuGet (hosted on the onnxruntime_public feed) to the new unified DirectML NuGet (Microsoft.AI.DirectML) on nuget.org. In addition, the Microsoft.AI.MachineLearning (WinML) and Microsoft.ML.OnnxRuntime.DirectML packages now take a dependency on the Microsoft.AI.DirectML package. This means we can remove the extra copy of DML binaries in these packages since they will be installed by the DML package.
2020-11-17 13:42:23 -08:00
Scott McKay
7b76b57fc8
Support EPs that compile nodes in a minimal build. (#5776)
* Support EPs that compile nodes in a minimal build. This enables NNAPI being used.
2020-11-17 13:52:22 +10:00
Tiago Koji Castro Shibata
794e8479eb
Revert #5805 (#5823)
* Fix race condition in msbuild

* Revert "Named Dimension Override internals test and experimental API (#5805)"

This reverts commit 157d1844fb.
2020-11-16 17:05:28 -08:00
Sheil Kumar
671fa60327
Enable direct tensorization and detensorization to many buffers in WinML (#5791)
* switch to work PC

* back with iterable of buffers

* add raw api tests

* tensorization

* last test

* all tests pass!

* small cleanup

* whitespace

* newline

* whitespace

* refactor common code into DisjointBufferHelpers

* remove unused file

* warning

* skip gpu tests when hardware not available

* Add error condition when createreference is invoked

* add null check to createreference

* uncomment out check

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-11-16 10:06:22 -08:00
RandySheriffH
20ae1ea21f
Remerge custom gpu op (#5818)
* add case for cpu custom op on gpu

* format doc

* restrict GPU custom op on Linux GPU CI only

* separate cu file to an independent project

* fix typo

* include cuda_add lib

* move lib def

* add file header

Co-authored-by: RandySheriffH <rashuai@microsoft.com>
2020-11-16 09:27:46 -08:00
Guoyu Wang
c4818d36ed
[NNAPI EP] Make NNAPI EP build on non-Android Platform (#5779)
* Make NNAPI EP build on non-Android Platform

* minor updates

* Address CR comments

* Fix build issue using Windows, address CR comments

* Fix linux build warnings

* Fix for test failure

* Fix for test failure

* Fix model_tests failure
2020-11-15 17:04:45 -08:00
Ori Levari
157d1844fb
Named Dimension Override internals test and experimental API (#5805) 2020-11-13 21:21:11 -08:00
sfatimar
dfbf6d78be
OpenVino: fix allocation failure on Windows for RelWithDebInfo build (#5713)
* ng_supported_ops

* Remove ng_supported_ops

* Revert "Remove ng_supported_ops"

This reverts commit 3c27385b2d88c6e8cf7ac4e8c290a367ad5d0bd8.

* Revert "ng_supported_ops"

This reverts commit 650721ae2913b79739521d58838298e031abdac1.

* cmake changes to ensure that debug builds on Windows link to the debug builds of openvino
and do not result in a bad allocation error

Co-authored-by: sfatimar <sahar.fatima@intel/com>
2020-11-13 07:59:52 -08:00
jeyblu
435b904f0e
add dnnl gpu engine (#5788) 2020-11-12 20:17:54 -08:00
stevenlix
54de618c2e
Improve TensorRT engine caching (#5737)
* add profile caching to improve engine caching feature

* Add comments

* fix typo

* add decryption for engine caching

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* update onnx-tensorrt submodule

* set opt profile to max value of the range

* add hash to engine/profile name

* Add calibration based INT8 quantization

* add an option to enable both FP16 and INT8

* Update tensorrt_execution_provider.cc

* add env variable to specify calibration file name

* clean up code

* Add comments and update TRT document

* enable tensorrt basic test and add EngineCachingTest

* clean up

* update environment variable in the test

* clean up
2020-11-12 08:56:45 -08:00
Hariharan Seshadri
b92fc66ea1
Support opset-13 specs of controlflow ops (Loop, If) (#5665) 2020-11-11 23:44:14 -08:00
Maajid khan
a84a058f9e
[OpenVINO-EP] Enabling Multi Device support (#5740)
* Enabling Multi Device support for UEP

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor fix added
*Added a simple fix to determine OpenVINO
version for Arm build as well

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
2020-11-11 15:16:30 -08:00
Xueyun Zhu
d8ace07ad7
Add CPU send/recv for pipeline (#5315)
* cpu send/recv

* clean up send/recv

* remove unused code

* assert and nccl option for mnist

* add a build option to enable a CPU-only build. Without this, nccl is always enabled, which breaks the build on machines that only have a CPU

* Add USE_MPI distinct from USE_NCCL/USE_HOROVOD

* fix

* fix

* exclude cpu send/recv for machines without mpi

Co-authored-by: Tim Harris <tiharr@microsoft.com>
2020-11-11 12:41:39 -08:00
Yufeng Li
2ba637c558
Implement Scale function for quant gemm (#5632)
* Implement a Scale function for quantization

Quantized GEMM is always followed by scaling (PerTensor or PerColumn), and the result often needs to be accumulated into an existing matrix. This PR implements a post-processor that scales the quantized GEMM result and accumulates it into another matrix.
2020-11-10 23:34:38 -08:00
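The post-processing step described in commit 2ba637c558 can be sketched in a few lines. This is a minimal pure-Python illustration, not the MLAS implementation: the function name, argument layout, and the `accumulate` flag are assumptions made for the example. It applies either one scale for the whole matrix (per-tensor) or one scale per column to an integer GEMM result, and optionally adds the scaled values to an existing float matrix.

```python
from typing import List, Sequence, Union


def scale_and_accumulate(
    acc: List[List[int]],
    scale: Union[float, Sequence[float]],
    out: List[List[float]],
    accumulate: bool = False,
) -> List[List[float]]:
    """Scale an integer GEMM result into `out` (hypothetical helper).

    acc        -- M x N integer accumulator produced by a quantized GEMM
    scale      -- a single float (per-tensor) or N floats (per-column)
    out        -- M x N float matrix that receives the result
    accumulate -- if True, add to the existing contents of `out`
    """
    n_cols = len(acc[0]) if acc else 0
    per_column = not isinstance(scale, (int, float))
    if per_column and len(scale) != n_cols:
        raise ValueError("per-column scale needs one entry per column")

    for i, row in enumerate(acc):
        for j, value in enumerate(row):
            s = scale[j] if per_column else scale
            scaled = value * s
            out[i][j] = out[i][j] + scaled if accumulate else scaled
    return out


if __name__ == "__main__":
    acc = [[2, 4], [6, 8]]
    # Per-tensor scaling, overwriting the output.
    print(scale_and_accumulate(acc, 0.5, [[0.0, 0.0], [0.0, 0.0]]))
    # Per-column scaling, accumulating into an existing matrix.
    print(scale_and_accumulate(acc, [0.5, 0.25], [[1.0, 1.0], [1.0, 1.0]], accumulate=True))
```

Folding the scale and the accumulation into one pass avoids a separate traversal of the result matrix, which is presumably the motivation for a dedicated post-processor.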
Alberto Magni
c75b7c5c47
[CMake] Enable NCCL only when enabling CUDA or ROCm support (#5516)
Conditionally enable NCCL depending on CUDA and ROCM

Before this change NCCL support was enabled unconditionally, even
when building without CUDA or ROCM support.
This caused the command:
$ ./build.sh --enable_training

To trigger the following cmake warning
-- Could NOT find NCCL (missing: NCCL_INCLUDE_DIR NCCL_LIBRARY)
CMake Warning at CMakeLists.txt:1282 (message):
NCCL is not found. Please use --nccl_home to specify the path of NCCL.
Otherwise, NCCL is disabled.

This is a spurious warning because the user did not ask to search for NCCL.
2020-11-10 12:39:23 -08:00
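The behaviour change in commit c75b7c5c47 amounts to a small piece of decision logic. The sketch below is a simplified Python model, not the CMake or build.py code; the function names and boolean parameters are hypothetical stand-ins for the real build options.

```python
def nccl_enabled_before(use_cuda: bool, use_rocm: bool) -> bool:
    """Old behaviour as described in the commit: NCCL is always looked for."""
    return True


def nccl_enabled_after(use_cuda: bool, use_rocm: bool) -> bool:
    """New behaviour: only look for NCCL when a GPU backend is enabled."""
    return use_cuda or use_rocm


if __name__ == "__main__":
    # A CPU-only training build: previously this searched for NCCL and
    # produced the "Could NOT find NCCL" warning; now the search is skipped.
    print(nccl_enabled_before(use_cuda=False, use_rocm=False))
    print(nccl_enabled_after(use_cuda=False, use_rocm=False))
```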
Weixing Zhang
fff85a6a35
Add GPU kernels for ROCm EP (#5655)
* Add kernels for AMD GPU.

This PR is mostly about GPU kernels for the ROCm EP. Because CUDA and HIP are similar GPU programming languages with similar math library calls, one principle of the ROCm EP design is to share CUDA kernels with ROCm as much as possible. Thus, the script amd_hipify.py has been created to convert CUDA kernels to ROCm HIP kernels automatically during the compilation phase. However, for reasons such as perf issues and syntax differences, some converted kernels need manual intervention. These kernels will be checked into the repo for now. To avoid manual intervention, the plan is to refactor CUDA kernels to make them as portable as possible between the CUDA EP and the ROCm EP.

Please refer to "HIP Porting Guide" for details.

* Like lamb, multi-tensor-apply needs to be disabled for IsAllFiniteOp and ReduceAllL2: the current AMD GPU compiler has a perf issue for a kernel parameter that is a structure passed by value.

* Use hipMemsetAsync and add checks on HIP calls.

* move the generated files to build folder.

Co-authored-by: Jesse Benson <jesseb@microsoft.com>
2020-11-06 16:11:06 -08:00
Maajid khan
d6f9cc181d
Modify logic to determine OV Version (#5701)
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
2020-11-05 15:12:02 -08:00
edgchen1
07bd4ef470
Upgrade optional implementation to https://github.com/martinmoene/optional-lite. (#5563) 2020-11-03 15:27:47 -08:00
Hector Li
b6eeadf420
Enable OpenVino build on Arm64 platform (#5682) 2020-11-03 13:55:34 -08:00
Ashwini Khade
1cca903680
update onnx commit id (#5594)
* update onnx commit id

* update onnx commit for docker images

* update docker images
2020-11-02 09:46:36 -08:00
Maajid khan
d98062da0c
[OpenVINO-EP] Hetero support (#5627)
* Implement Hetero in UEP
* Added security checks to take valid Hetero combinations
  as device type
* Integrating Hetero features
* Get the statistics Report in Debug Mode

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Passing right device type for vadm_backend

Added simple fix to pick the right device type
when using vadm_backend with Hetero as well.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed batching logic for 2020.4 and above

* Fixed flake8 PEP8 errors

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor Fixes Added
*Added security checks for device_type passed
in for Hetero build during run time
*code cleanup

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor changes Added
*Fixed batch_size bug in vadm_backend
*code cleanup
*Documentation updated for Hetero

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2020-10-30 22:35:08 -07:00
Changming Sun
d9293f38e6
Revert "Custom Op on GPU (#5620)"
This reverts commit 2c63196600.
2020-10-30 21:23:51 -07:00