* add Python APIs to get/set execution providers and to query available and all providers.
* add back deleted code.
* minimize peak memory consumption for the sess.set_providers() API; this requires removing references to the underlying _sess object.
* fix typo.
* add validation for set_providers(), address other review feedback.
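
The validation and peak-memory points above can be illustrated with a minimal sketch. It does not reproduce the actual onnxruntime implementation: `SessionWrapper`, `validate_providers`, and the `make_backend` factory are hypothetical names introduced here, and only the `_sess` attribute name and the `get_providers()`/`set_providers()` method names come from the commit notes. In real use, the `available` list would presumably come from something like `onnxruntime.get_available_providers()`.

```python
from typing import Callable, List, Sequence


def validate_providers(requested: Sequence[str], available: Sequence[str]) -> List[str]:
    """Return the requested providers as a list, or raise ValueError if any is unavailable."""
    if not requested:
        raise ValueError("at least one execution provider must be requested")
    unknown = [p for p in requested if p not in available]
    if unknown:
        raise ValueError(f"unavailable providers: {unknown}; available: {list(available)}")
    return list(requested)


class SessionWrapper:
    """Hypothetical wrapper that owns a heavyweight backend session in `_sess`."""

    def __init__(self, make_backend: Callable[[List[str]], object],
                 available: Sequence[str], providers: Sequence[str]):
        self._make_backend = make_backend
        self._available = list(available)
        self._providers = validate_providers(providers, self._available)
        self._sess = make_backend(self._providers)

    def get_providers(self) -> List[str]:
        return list(self._providers)

    def set_providers(self, providers: Sequence[str]) -> None:
        # Validate first so that a bad request leaves the current session untouched.
        new_providers = validate_providers(providers, self._available)
        # Drop the reference to the old backend before building the new one,
        # so the wrapper never holds both sessions at the same time.
        self._sess = None
        self._sess = self._make_backend(new_providers)
        self._providers = new_providers
```

Releasing `_sess` before re-creating it only lowers peak memory if nothing else still references the old backend object, which is presumably why the commit mentions removing other references to it.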
* Fix broken link and minor wording updates
* Update links to use relative paths
* Update sample section organization
* Fix a few more links
* Update links to relative paths
* Fix link urls
* Update links to relative paths
* Update link to perf test doc page
* Update links to relative paths
* Update to relative paths for links
* Update link
* use unaligned buffer for Nuphar in onnxruntime_perf_test to avoid crash
symbolic shape inference fixes to support more sophisticated models
remove useless code from model_quantizer
* Run symbolic shape inference for subgraphs in Loop/Scan
* Allow a symbolic dimension to merge with an int dimension
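
As a rough illustration of merging a symbolic dimension with an int dimension, the sketch below represents a dimension as either an `int` (known size) or a `str` (symbolic name). The helpers `merge_dims` and `merge_shapes` are hypothetical and are not taken from the actual shape inference script; the tie-breaking rule for two symbolic names is an assumption made for this example.

```python
from typing import List, Union

Dim = Union[int, str]  # int = concrete size, str = symbolic name (assumed representation)


def merge_dims(a: Dim, b: Dim) -> Dim:
    """Merge two views of the same dimension, preferring concrete sizes."""
    if isinstance(a, int) and isinstance(b, int):
        if a != b:
            raise ValueError(f"conflicting concrete dimensions: {a} vs {b}")
        return a
    if isinstance(a, int):
        return a  # concrete size wins over a symbolic name
    if isinstance(b, int):
        return b
    return a  # both symbolic: keep the first name (assumption)


def merge_shapes(s1: List[Dim], s2: List[Dim]) -> List[Dim]:
    """Merge two shapes of equal rank dimension by dimension."""
    if len(s1) != len(s2):
        raise ValueError(f"rank mismatch: {len(s1)} vs {len(s2)}")
    return [merge_dims(a, b) for a, b in zip(s1, s2)]
```

For example, `merge_shapes(["batch", 128], [4, 128])` resolves the symbolic `"batch"` to the concrete size 4.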
* Bump onnx to latest
Update onnx.in.proto with changes for SparseTensor.
* add temp skip tests
* remove passed tests from skip list
* skip more tests for new ops in opset 11
* skip crashing tests
* update handling of new attribute types sparse tensor and sparse tensors
* advance onnx commit and remove skip cpu_flaky_tests
* temporarily skip yolo3 model test due to resize opset10 shape inference regression
* update proto for onnxruntime server
* advance onnx commit further
1. Add an OpenVINO GPU nightly build pipeline. The tests run on an Intel UP Squared Edge device, which is hosted locally rather than on an Azure VM. We persist a smaller set of model test data on the Edge device.
2. Update the build condition for OpenVINO GPU so that it works for both GPU_FP32 and GPU_FP16.
3. Add an option to install_ubuntu.sh to exclude the packages used for Nuphar, which saves disk space, since Edge devices usually have limited disk space.
C/C++ Opaque APIs
Add new virtual interfaces for NonTensorType
Implement entry points.
Add shared header for the data container.
Add export symbols.
Add serialization/deserialization.
Implement model with Opaque types.
Rework opaque_api_test as a standalone executable.
It should always have outputs, but in case it doesn't (nothing currently fails if it doesn't, even though that makes it meaningless), make sure it also has a node.
* Mention OrtCreateSessionFromArray in C API doc
* Add GetDataTransfer() interface in the EP.
* Check return status of RegisterDataTransfer
* Address PR comments
* Mention OrtCreateSessionFromArray in C API doc
* Add make_unique implementation for use with C++11
* Add cgmanifest and TPN files as well
* Add annotation to cgmanifest to identify the component that uses the dependency
* Rework the feed/fetch copy setup so that it can be calculated upfront by the control flow nodes. Also simplifies how it all works.
Update the control flow nodes to do the calculation prior to graph execution.
Enhance proto3 compatibility.
Replace has_*() methods with the corresponding enum handling so we can deal with
proto3-generated streams from proto2 code.
Add utility wrappers for remaining has_*() methods so we can
easily deal with them if/when we switch to proto3.
Refactor the SGEMM kernels to resynchronize the code between Windows/Linux and remove unneeded binary bloat from a different zero/add mode kernel. Another goal is to reach a cleaner state before adding a DGEMM kernel.
Enable Nuphar EP docker build
Revert back to LLVM 6.0.1
Reinstate disabled Softmax tests caused by LLVM 8.0.1
Reinstate Nuphar Python test due to stale sympy version
Increase build timeout of Linux CI
When NUPHAR_USE_MKL or NUPHAR_USE_AVX2 is not defined, we got
"unreachable code" warnings on Windows, which were turned into
errors and broke the build.
* Mention OrtCreateSessionFromArray in C API doc
* Fix perf test executable due to removal of certain C APIs
* fix linux build
* Avoid duplication
* Update coding guidelines to prefer using make_unique for heap allocations (except where not possible).
* Implement Nuphar execution provider
Nuphar execution provider is a TVM-based compilation provider. It has shown great speedups for RNN models using Scan.
This PR is mainly for a preview of the shared codegen library for other TVM-based providers.
* Fix submodules
* Fix TVM submodule
* Update Nuphar to latest and resolve conflicts
* Remove stale files caused by merge -X theirs
* Revert heap buffer change to not introduce onnxruntime_framework into onnxruntime_perf_test
* Fix bad merge
* Merge from Nuphar
* Fix warning treated as error, revert some unnecessary changes
* Revert some more test changes
* Some more test revert or comments to make review easier
New tests could be added later
* One more revert of unnecessary changes
* More change revert. Test could be added back later.
Fix an issue where the CUDA EP falls back too many nodes to the CPU in some cases, which causes huge data copies.
https://github.com/microsoft/onnxruntime/issues/1675
Currently, if all of a node's inputs are initializers, the CUDA EP falls the node back to the CPU, and it also falls back some of the nodes below it. This can cause huge data copies. In the case reported by a user, the model has several Slice nodes whose inputs come from initializers, plus a Concat op that concatenates the Slice outputs. The concatenated data is large (16MB), which makes the copy from CPU to GPU quite costly because it is a synchronous copy.
Fix
If the node's inputs are all initializers, we shouldn't fall the node back to the CPU.
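
The effect of this change can be shown with a toy placement model. This is only a sketch and not the CUDA EP's actual partitioning logic: the node tuples, `place_nodes`, and `count_cpu_to_gpu_copies` are hypothetical, and the rule that a node also falls back when all of its inputs are produced on the CPU is an assumption used to imitate "some of the nodes below it".

```python
from typing import Dict, List, Set, Tuple

# Each node: (name, input tensor names, output tensor name), in topological order.
Node = Tuple[str, List[str], str]


def place_nodes(nodes: List[Node], initializers: Set[str],
                fallback_on_initializer_inputs: bool) -> Dict[str, str]:
    """Assign each node to "CPU" or "GPU" under a toy fallback rule."""
    placement: Dict[str, str] = {}
    cpu_tensors: Set[str] = set()  # outputs produced by CPU-placed nodes
    for name, inputs, output in nodes:
        # Toy rule (assumption): a node goes to the CPU when every input is an
        # initializer or the output of a node that was already placed on the CPU.
        on_cpu = (fallback_on_initializer_inputs and bool(inputs)
                  and all(i in initializers or i in cpu_tensors for i in inputs))
        placement[name] = "CPU" if on_cpu else "GPU"
        if on_cpu:
            cpu_tensors.add(output)
    return placement


def count_cpu_to_gpu_copies(nodes: List[Node], placement: Dict[str, str]) -> int:
    """Count tensors produced on the CPU and consumed by a GPU node."""
    producer = {output: name for name, _, output in nodes}
    copies = 0
    for name, inputs, _ in nodes:
        if placement[name] != "GPU":
            continue
        for tensor in inputs:
            if tensor in producer and placement[producer[tensor]] == "CPU":
                copies += 1
    return copies


# Two Slices reading an initializer "w", a Concat of their outputs, and a
# MatMul that combines the Concat result with GPU-side data.
graph: List[Node] = [
    ("slice_a", ["w"], "a"),
    ("slice_b", ["w"], "b"),
    ("concat", ["a", "b"], "c"),
    ("relu", ["x"], "r"),
    ("matmul", ["r", "c"], "y"),
]
old = place_nodes(graph, {"w"}, fallback_on_initializer_inputs=True)
new = place_nodes(graph, {"w"}, fallback_on_initializer_inputs=False)
```

Under the old rule, both Slices and the Concat land on the CPU in this toy graph, so the Concat output must be copied to the GPU for the MatMul. With the fallback disabled, every node stays on the GPU and no such copy is needed.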
* Support bilinear mode with actual 2D inputs in Resize and Upsample
* Fix build break
* Fix build break
* Add test
* CUDA changes
* Resolve PR comments
* Resolve comments