* Implement Nuphar execution provider
The Nuphar execution provider is a TVM-based compilation provider. It has shown significant speedups for RNN models that use Scan.
This PR mainly serves as a preview of the shared codegen library for other TVM-based providers.
* Fix submodules
* Fix TVM submodule
* Update Nuphar to latest and resolve conflicts
* Remove stale files caused by merge -X theirs
* Revert heap buffer change to not introduce onnxruntime_framework into onnxruntime_perf_test
* Fix bad merge
* Merge from Nuphar
* Fix warning treated as error, revert some unnecessary changes
* Revert some more test changes
* Some more test reverts or comments to make review easier
New tests could be added later
* One more revert of unnecessary changes
* More change reverts. Tests could be added back later.
- Fix the Windows end-to-end test in NuGet CI
- Skip TestModelSerialization because it is failing on Linux. This must be fixed before the API is released for use. The owner has been notified.
Added Sample Featurizer and Infrastructure
Make featurizers and unit tests compile and run with GTest.
Create definitions for the first featurizer kernel.
Add new operator domain.
Create datetime_transformer kernel and build.
Move OPAQUE type definitions for featurizer kernels out to a separate .cc file.
Register them with the type system.
Provide unit tests for new AutoML DateTimeTransformer kernel.
Make necessary adjustments to the test infrastructure to make it run
with new types.
- Added a Python script for generating a markdown doc from the registered opkernels.
- Made some conditional changes in the pybind to expose the necessary Python API
- Added some missing type-constraints in the op kernel registrations
* remove memory copy between CUDA and TRT
* add info to RegisterExecutionProvider input
* use new IDeviceAllocator for trt allocator
* remove SetDefaultInputsMemoryType from TRT EP
* remove onnx-tensorrt 5.0
* add submodule onnx-tensorrt branch 5.1
* remove redundancy
* Update transformer_memcpy.cc
* Update tensorrt_execution_provider.cc
* switch to TensorRT 5.1.5.0
* update python binding
* disable failing test case on TensorRT
* Update activation_op_test.cc
* upgrade to TensorRT container 19.06
* update according to feedback
* add comments
* remove tensorrt allocator and use cuda(gpu) allocator
* update onnx-tensorrt submodule
* change ci build cuda directory name
* Add macOS leg of Python packaging job
* Update copy files source directory for macOS leg
* Add a task to display the contents of the binaries directory after the wheel is built
* Revert some changes
* Add task to log
* Update
* Remove unnecessary logs
Add a Python script and the necessary changes to the azure-pipelines YAML file to post binary size data from the NuGet package build. Data is currently posted only from the CPU pipeline. GPU and other pipelines may be added as necessary.
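For illustration only, a minimal sketch of how binary size data could be gathered from an unpacked package directory before posting. This is not the script from this change: the function names, the record fields (`name`, `size_bytes`), and the list of file extensions are all assumptions, and the posting step is left out because the destination is not described here.

```python
import json
import os


def collect_binary_sizes(package_dir, extensions=(".dll", ".so", ".dylib")):
    """Return one record per binary under package_dir (hypothetical schema)."""
    records = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            # Only count files that look like native binaries (assumed extension list).
            if name.endswith(extensions):
                path = os.path.join(root, name)
                rel = os.path.relpath(path, package_dir).replace(os.sep, "/")
                records.append({"name": rel, "size_bytes": os.path.getsize(path)})
    # Sort so the output does not depend on directory traversal order.
    return sorted(records, key=lambda r: r["name"])


def to_json(records):
    """Serialize the records as the JSON body a pipeline step might post."""
    return json.dumps(records, indent=2)
```

A pipeline step could call `to_json(collect_binary_sizes(path))` on the extracted package and send the result to whatever store tracks the sizes.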
* Update DNNLibrary
* Allow fp16 by default
* Add NNAPI build in CI
* Fix NNAPI EP after #1268
* Remove unused variables
* Support NNAPI in onnx_test_runner
* Update DNNLibrary to fix tests
* Update build.py for Android build support, resolve conflict in tools/ci_build/build.py
* Support non-ARM Android build, resolve conflict in tools/ci_build/build.py
* Enable Android tests on an x86_64 Android emulator
* Add DNNLibrary/NNAPI support in build.py
* Suppress the verbose adb output
* Remove debug logs
* Install cmake by pip
* Fix undefined host_protoc_path
* cmake==3.13.2 on PyPI is actually 3.12.2, so install 3.13.2.post1 instead
* Fix Android ARM64 build
* Use Android NDK r20 instead of r19c, fix conflicts in install_deps_android.sh
* Update version number to 0.5.0 in preparation for release
* Update README.md to point to the Versioning doc
* Resolve PR comment
* Remove incorrect line generation
* Minor updates to the version update script
* Minor comment update
* sync onnx to get Equal op with float support
* doc update
* fix test failure caused by updated shape inference logic for RoiAlign.
* filter out CumSum test cases since the op is not implemented yet.