1. refactor the pipeline, remove some duplicated code
2. Move Windows_py_GPU_Wheels job to Win-GPU-CUDA10. We'll deprecated the "Win-GPU" pool
3. Delete cpu-nocontribops-esrp-pipeline.yml and cpu-nocontribops-pipeline.yml
4. In Linux nuget jobs, run "make install" before creating the package. So that extra RPAH info will be removed
This change adds a new execution provider powered by [DirectML](https://aka.ms/DirectML).
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers.
The DirectML execution provider is capable of greatly improving evaluation time of models using commodity GPU hardware, without sacrificing broad hardware support or requiring vendor-specific extensions to be installed.
**Note** that the DML EP code was moved verbatim from the existing WindowsAI project, which is why it doesn't yet conform to the onnxruntime coding style. This is something that can be fixed later; we would like to keep formatting/whitespace changes to a minimum for the time being to make it easier to port fixes from WindowsAI to ORT during this transition.
Summary of changes:
* Initial commit of DML EP files under onnxruntime/core/providers/dml
* Add cmake entries for building the DML EP and for pulling down the DirectML redist using nuget
* Add a submodule dependency on the Windows Implementation Library (WIL)
* Add docs under docs/execution_providers/DirectML-ExecutionProvider.md
* Add support for DML EP to provider tests and perf tests
* Add support for DML EP to fns_candy_style_transfer sample
* Add entries to the C ABI for instantiating the DML EP
Enable multi-device test for GPU
* Add build pipeline for TensorRT multi-GPU test
* Add code to disable fp16 test if hardware architecture not supported
* Add option to set the device id in onnx_test_runner for model tests
Remove gsl subodule and replace with a local copy of gsl-lite
Refactor for onnxruntime::make_unique
gsl::span size and index are now size_t
Remove lambda auto argument type detection.
Remove constexpr from fail_fast in gsl due to Linux not being happy.
Comment out std::stream support due to MacOS std lib broken.
Move make_unique into include/core/common so it is accessible for server builds.
Relax requirements for onnxruntime/test/providers/cpu/ml/write_scores_test.cc
due to x86 build.
Add ONNXRUNTIME_ROOT to Server Lib includes so gsl is recognized
1. remove sudo from the cleanup step for Linux so that we don't need the sudo access for vstsagent build user
2. a minor fix in the install_ubuntu.sh to make the image smaller for openvino
* Bump onnx to latest
Update onnx.in.proto with changes for SparseTensor.
* add temp skip tests
* remove passed tests from skip list
* skip more tests for new ops in opset 11
* skip crashing tests
* update handling of new attribute types sparse tensor and sparse tensors
* advance onnx commit and remove skip cpu_flaky_tests
* temporarily skip yolo3 model test due to resize opset10 shape inference regression
* update proto for onnxruntime server
* advance onnx commit further
1. Add openvino GPU nightly build pipeline, this test is running on Intel Up square Edge device. The device are host locally not from Azure VM. We persist a smaller model test data on Edge device.
2. Update the build condition for openvino GPU so it works for GPU_FP32, GPU_FP16
3. add option to install_ubuntu.sh to exclude the package used for nuphar, so that we can save some disk space as the Edge device usually have limited disk space.
Enable Nuphar EP docker build
Revert back to LLVM 6.0.1
Reinstate disabled Softmax tests caused by LLVM 8.0.1
Reinstate Nuphar Python test due to stale sympy version
Increase build timeout of Linux CI