1.Let mlas use session thread pool
2.Remove onnxruntime_USE_MLAS cmake option
3. Remove the win32 thread pool code inside mlas
mlas will:
1.use ort thread pool if it get passed in
2.use openmp if the threadpool parameter is nullptr
3.run single threaded if the threadpool parameter is nullptr and openmp is disabled.
Added Sample Featurizer and Infrastructure
Make featurizers and unit tests compile and run with GTest.
Create definitions for the first featurizer kernel.
Add new operator domain.
Create datetime_transformer kernel and build.
Move OPAQUE types definitions for featurizers kerneles out to a separate cc.
Register them with the type system.
Provide unit tests for new AutoML DateTimeTransformer kernel.
Make necessary adjustments to the test infrastructure to make it run
with new types.
* Mention OrtCreateSessionFromArray in C API doc
* Fix perf test executable due to removal of certain C APIs
* fix linux build
* Avoid duplication
* Fix mem leak
* Update nGraph to 0.21 and adjust the EP
* Share the graph initializers between custom ops
* Update nGraph to 0.22 and exclude Gather entirely
* Enable building on Windows with nGraph v0.21.1-rc.0
* Disable the unsigned input Shrink op tests for nGraph until the next update
* Line-shortening code refactor
* Fix for the master branch merge artifact
* MKLDNN patches adjustment for Windows
* Exclude MatMulInteger for non-const zero points
* Exclude ConvInteger for non-const zero points
* Enable full Cast op support
* Use the v0.22.1 tag
* Skip ConvTranspose_InvalidKernelShape test for ngraph provider
* Create sub-graph ModelProto from fused_node
* remove memory copy between CUDA and TRT
* add info to RegisterExecutionProvider input
* use new IDeviceAllocator for trt allocator
* remove SetDefaultInputsMemoryType from TRT EP
* remove onnx-tensorrt 5.0
* add submodule onnx-tensorrt branch 5.1
* remove redundancy
* Update transformer_memcpy.cc
* Update tensorrt_execution_provider.cc
* switch to TensorRT 5.1.5.0
* update python binding
* disable failed test case on TensorRT
* Update activation_op_test.cc
* upgrade to TensorRT container 19.06
* update according to feedback
* add comments
* remove tensorrt allocator and use cuda(gpu) allocator
* update onnx-tensorrt submodule
* change ci build cuda directory name
* Update DNNLibrary
* Allow fp16 by default
* Add nnapi build in ci
* Fix nnapi ep after #1268
* Remove unused variables
* Support nnapi in onnx_test_runner
* Update DNNLibrary to fix tests
* Update build.py for android build support, solve conflict of
tools/ci_build/build.py
* Support non-ARM Android build, solve conflict of tools/ci_build/build.py
* Enable android test by x86_64 android emulator
* Add dnnlibrary/NNAPI support in build.py
* suppress the verbose adb output
* Remove debug logs
* Install cmake by pip
* Fix undefined host_protoc_path
* cmake==3.13.2 in pypi is actually 3.12.2, so install 3.13.2.post1 instead
* Fix Android ARM64 build
* Use android ndk r20 instead of r19c, fix conflicts in install_deps_android.sh
* sync onnx to get equal op with float support
* doc update
* fix test failure because of updated shape inference logic for roialign.
* filter consum test cases since it's not implemented yet.
Description: Describe your changes.
Change the logic to find cublas dll
Motivation and Context
Why is this change required? What problem does it solve?
The name pattern of cublas changed since 10.1. It doesn't include minor version in its name anymore.
If it fixes an open issue, please link to the issue here.
Add MlasGetPreferredBufferAlignment() for use by CPUAllocator::Alloc to get the byte alignment for CPU tensors. Using MLAS allows the value to be based on the platform the binary is running on instead of a constant value fixed at compile time.
* Add arm64 nocontribops pipeline
* minor fix
* Added new template for arm build -- disable all tests
* fix build command
* add arm64 flag for msbuild
* add arm leg as upstream dependency
* update platform to arm64 for msbuild
* remove test task from arm build
* remove ESRP signing of C# dlls in arm build
* Updated to work for both --arm and --arm64
* Make the cross compiling cmake flags symmetric
* Add dynamic check for /Wno-error flag, instead of extra build option
* remove extra full-stop
* replace log sinks
* limit headers to include dir
* first changes to do dynamic linking
* wip for using cxx api
* remove weird dangling dependency
* building with tests failing
* finish updating converters
* fix const
* intital introduction of typedef
* change logging to use spdlog
* get tests passing
* clang format
* map logging levels better
* clean up unused imports
* trent cr comments
* clang-format
* code review comments
* changing buffer use to reserve
* Dynamically link
* revert tvm
* update binary uploading
* catch exceptions by const-ref
* Revert "revert tvm"
This reverts commit 387676dd1018134d15eb71fa126f7caf94380800.
* fix typo
* update versioning of lib
Description:
This change adds the common part of TVM based codegen library. It includes following parts:
* Microsoft TVM Inventory (MTI): a set of TVM ops for neural networks, similar to TOPI
* Compiler pass for traversing ONNX graph and generate TVM ops
* Compiler pass for traversing generated graph and specify TVM schedule
* Compiler pass for handling weight layout
* Utils for debugging
Motivation and Context:
TVM is an open deep learning compiler stack for cpu, gpu and specialized accelerators. To leverage it in ONNX, we built an execution provider named Nuphar. Currently, Nuphar gets good performance on CPUs with AVX2 on quantized LSTM models.
This codegen library was part of Nuphar execution provider. It is split out for sharing with other execution providers, as we'd like to reuse TVM in more devices.
* init
* Update DNNLibrary
* Update DNNLibrary, set compiler flags, it compiles now
* Add more missing flags, add test
* Update DNNLibrary
* Update Compile method, fix allocator and some other bugs
* Update DNNLibrary
* Implement CopyTensor
* Not delete state explicitly since it is managed by unique_ptr
* Add the missing files when SingleUnitTestProjct is ON
* misc changes
* Fix wrong name in provider factory
* Add my own test
* Update the code of add node into graph, and add the missing initializer into graph
* Fix the bug that re-build the graph produces extra output
* Update DNNLibrary
* Transpose nchw (ONNX) -> nhwc (NNAPI)
* Add license
* Add GetSupportedNodes method (implement it later)
* Rename onnxruntime_nnapi_test->onnxruntime_nnapi_squeezenet_test
* Update squeezenet_test.cpp after rebase master
* Remove squeezenet_test.cpp since it is almost same with the c++ sample
* Update DNNLibrary for GetSupportedNodes
* Update GetSupportedNodes
* Revert "Remove squeezenet_test.cpp since it is almost same with the c++ sample"
This reverts commit a97575fd9ff49e50ba1dc8d8154790d8cd86c48d.
* Update DNNLibrary
* Fix multiple outputs bug
* Remove GetKernelRegistry
* Revert "Revert "Remove squeezenet_test.cpp since it is almost same with the c++ sample""
This reverts commit 2a0670e9cbf10ea654111ce39e198a4be0ddd838.
* Set default memory type of NNAPI EP
* Add CPUOutput allocator
* Update DNNLibrary for multiple outputs
* Fix bug of nhwc->nchw
* Remove GetExecutionHandle()
* Update cuda for python wheels
* Update cuda for python wheels
* Update cuda for python wheels
* Update azure-pipelines-py-packaging.yml
* Update to cuda 10
* Only test win gpu
* Update cuda for python wheels
* Use manylinux2010 image to build linux python wheels
Allow wheels built to truly be compliant with a manylinux policy
Implementation of the MLAS changes for NCHWc convolution/pooling support. These changes adopt the blocking format used by MKL-DNN and other convolution libraries for better performance.
* move all contrib ops to one place
* namespace changes
* bug fix - remove redundant file after merge master
* plus more minor bug fixes
* bug fix
* fix extra space in include header + namespace fix
* fix linux build failure:
* fix test group names
* remove redundant test
* Initial commit for OpenVINO Execution Provider
OpenVINO Execution Provider provides the interface for ONNX Runtime
applications to access Intel's hardware accelerators using Intel's
OpenVINO Toolkit.
* Fixed bug in GetCapability to disable custom ops
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added OPENVINO ci pipeline
Added new pipeline for openvino provider,
made changes to support the docker build and
onnxruntime build with openvino.
Signed-off-by: Luis Daniel Castellanos <luis.daniel.castellanos@intel.com>
* Enabled all unit tests for OpenVINO EP
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Fixed syntax issue in run_docker_build.sh file
* Added missing default OPENVINO_VERSION
Default value for OPENVINO_VERSION env was
missing causing the build to fail
* Added install Model Optimizer deps step
* Fixed python unit tests and some tests from onnx_backend_test_series
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Fixed indentation bug
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled some of the python backend tests for OpenVINO
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled some model tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Remove Duplicate checks for openvino in build.py
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Modified GetCapability for FP16
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled GPU FP32 tests that are not supported
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Convert modelProto to string and use it in compile
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Pass byte-array input args to MO
* Serialized ModelProto passed in-memory to MO
ModelOptimizer python module receives the serialized ModelProto
in-memory.
Uses appropriate ONNX function to load the serialized bytes.
* Make Py_Finalize compatible with older python versions
Also, remove pFunc unassigned variable possibility.
* Fallback if input dims of Matmul is greater than 2
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* fixup: Device #define syntax
* Updated the documentation
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Enable dynamic dim value
* removed commented out code
* Added Dockerfile for openvino EP
Updated instructions on dockerfiles/README.md file
Signed-off-by: Luis Daniel Castellanos <luis.daniel.castellanos@intel.com>
* Disabled fp16_inception_v1 test
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Code formatting with clang-format
Uses style from the .clang-format file in root directory.
* fixup: docker tag and build error fixes
* Heuristics to automatically detect batching
Distributes slices from batch into parallel infer-request objects.
* Handle disabled tests in GetCapability
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled average pool and max pool if ceil_mode is 1
Also dilations are not supported if they are greater than 1
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled Unsqueeze int32 test
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* changes to fix output results bug
* Disabled a few C++ unit tests for MYRIAD FP16
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Manually revert '9fe162bb Enable dynamic dim value'
Reverts compile time setting of dynamic shape
Reverting manually due to significantly huge auto-revert conflicts.
* Fixed unused variable warning
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled Mul test for GPU_FP16 due to accuracy issue
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* VPU documentation update
* Disabled inception_v1 for MYRIAD and HDDL
*Also disabled few C++ accuracy tests for HDDL
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* updates from upstream
* use the new CustomOpApis for I/O interfacing
* Pass initializers as subgraph meta-def inputs in GetCapability()
Requirement due to API changes introduced with PR# 1019.
* Remove obsolete functions
* Save indexes of graph inputs from fused_node info
Both inputs and initializers are passed as data inputs to the
infer function. To identify only inputs among them, save thier
index info from fused_node in Compile function.
* Documentation changes to enable VPU
* Fix VPU related changes in documentation
* Fix minor changes in documentation
* Fix VPU related changes in documentation
* Use Node.In/OutputDefs() to track graph inputs and outputs.
Don't use graph_viewer's GetInputs() or
GetInputsIncludingInitializers().
* Permit "SAME_UPPER" auto_pad attribute from MaxPool
* Disabled fp16_tiny_yolov2 in onnx model tests
* Updated documentation to include configuration guides for myriad and hddl
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Use 8 Infer requests only for VAD-R
* disable debug prints
* Clang-format source files
* Updated BUILD.md with OpenVINO R5 links
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled same upper python tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Update test exclusion syntax
* Change path of install_onnx.sh
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disable tiny_yolov2 in broken tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Revert "Change path of install_onnx.sh"
This reverts commit ba9db165f3be430f2aff1ef413299ed04637196a.
This change is only required for Intel internal CI pipeline until
the settings are matched with the upstream's CI pipeline.
* Added debug statements for debugging CI error
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Add --build_wheel to linux openvino pipeline
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added -v option to onnx_test_runner for debugging
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Removed path change patch
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added -c 1 to onnx_test_runner
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Refactor MO python invocation in separate function
Cleans up Model Optimizer python invocation check and conversion
logic. Invokes MO only once in GetCapability() and passes the
IR strings (xml and bin) to the Compiler as meta-def attributes.
* Add comments
* code cleanup and comments
* Code cleanup for GetCapability
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Removed unnecessary files
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Revert "Added -v option to onnx_test_runner for debugging"
This reverts commit d1dd70938a94d648df1a1dbbc2e48d0b97e49ec8.
* Revert "Added debug statements for debugging CI error"
This reverts commit b86d41afed2aa29c3508155d6f9c8d3a7263cc60.
* incorporate Status Code changes
* ComputeFunc returns Status::OK() on success
* Use test names to disable tests for MYRIAD and VAD-R
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Rename local identifiers from CNNNetwork to OpenVINO network
CNNNetwork is an OpenVINO's API class that represents more than
just convolutional neural networks (CNNs). Renaming helps to avoid
confusion that the API's only support CNN type models.
* Added error message if building on windows
* Removed duplicate option in Cmake
* Removed unnecessary parameters in activation_opt_test
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Refactor Map search and access logic for efficiently and cleanliness.
* use C++ style casts
* Use os.path.join for python directory path operations
* use C++ style casts
* EP classes should use onnxruntime namespace
* Clean up fixes from PR comments
* Don't explicitly shutdown Py interpreter
* Remove debug print statements
Prints will be re-enabled later with a logging mechanism with
debug/verbose printing options.
* Decrement ref counts for used pyObjects
* Restore build instructions for other compilers
Content under the "Using other compilers" section has been
accidentally deleted by a previous commit. Restoring back that
content from the latest upstream repo.
* CMake code cleanup
Code clean up, commenting and formatting of CMake code.
* Don't pass the unused device_info parameter to OpenVINOGraph ctor.
* Add support for multiple I/O data types
Adds support for the following tensor data types for graph inputs
and outputs:
1) float
2) float16
3) int32
4) int16
5) int8
6) uint16
7) uint8
* cleanup setup.py module list definition
* Deduce index of input using tracked input index map
Ignores initializers in case they are ordered before inputs.
* Removed debug statement in MO code
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* PR feedback
* Removed per_sample_tolerance for openvino
* Removed unnecessary disabled tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Removed debug function
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled tiny_yolo_v2 due to accuracy issues
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Changed the disabled reason for broken tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled Reshape with no input
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Python formatting with Autopep8
* Minor fix for MYRIAD devices
* Added zero dimension check
*Removed setting batch size for the network
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Set the threshold to larger value for MNIST
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Removed setting higher threshold in provider_test_utils
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Check for --use_openvino in python wheel setup.py
Add openvino modules to the setup script for building the wheel
package only for --use_openvino a build option.
* Removed nullptr checks for GetNode()
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Address #1155
Add debug helper methods to be able to dump input name and shape information for node inputs, and the data from node outputs.
As the input data comes from graph inputs, initializers or node outputs we don't dump it.
Must be manually enabled by building with '--cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=ON'
Advance ONNX submodule to 5c51f0dbbe88ee1536f17ee7bd462b2ab3772c52
This commit in ONNX contains a fix to ConstantOfShape test data.
Uncomment ConstantOfShape.
Update test script, make sure exclusions are uniform.
Not needed any more. Because we don't build the date library.
And Sheil says: "It’s a little bit intrusive for callers to be forced onto cpp14 just because they are consuming onnxruntime."
* Add version and latest commit id to ORT Server
* Update cmake
* Change build id to build number
* Use target_compile_definitions instead of add_definitions