Prior to this every test shared the same tolerances. This meant
that if an ONNX test failed due to a small but acceptable difference in
output, the only alternative was to disable the test entirely.
In op set 17, the DFT operator is being added. Without this change, the
tests for that operator fail because the output is off by about 5e-5.
It's better to keep test coverage for this new op rather than disable
the test entirely.
Also prior to this change, the global tolerances were not shared between
C++, JavaScript, and Python tests. Now they are.
Also fix various minor issues raised by linters.
Unblocks https://github.com/microsoft/onnxruntime/issues/11640.
* update TVM
* get alignment constant from TVM
* update TVM_VM_SetInputs to upstream with TVM API
* fix CI issue: update TVM EP dependencies
* add sudo
* revert changes needed to install missing package
* add package for TVM EP CI
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
* Initiate Ort SNPE EP
* fix snpe ep windows build which is caused by the utility method (ToUTF8String) name change on master
* correct the source path for libonnxruntime.so while building for andorid package
* add AdditionalDependencies for amr64
* On MS-Windows, the patchfile must be a text file, i.e. CR-LF must be used as line endings. A file with LF may give the error: "Assertion failed, hunk, file patch.c, line 343," unless the option '--binary' is given.
* fix build failure if snpe is not enabled
* update doc for contrib op
* separate out snpe ep settings to onnxruntime_snpe_provider.cmake
* renaming according review comments
* update according review comments
* Implement XNNPACK support via an EP.
* Layout transform uses the GraphPartitioner infrastructure.
* Node fusion is supported.
* Conv and MaxPool implementations were ported from Changming's PR.
* Added optional mutex in InferenceSession::Run as we only want to allow sequential calls if xnnpack is enabled
* Add disentangled attention TRT plugin as contrib op
* update plugin name & remove null character
* update onnx-tensorrt submodule with my beta version
* use suggested plugin name & simpler shape propagation
* update onnx-tensorrt gitsubmodule to temporary fork
* update onnx-tensorrt to temporary commit
* redirect submodule back to latest 8.2-GA release of onnx-tensorrt repo
Co-authored-by: HHH-ComputeLab <haohangh@nvidia.com>
* update TVM
* small fixes
* update TVM with new set_input and NDArray API
* use set_input instead of set_one_input
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
* Disable training code in DNNL LayerNorm code
The capability code already does not claim the LayerNorm and
SkipLayerNorm that require more than one output. However,
building with training enabled was causing issues.
The training specific code has been removed even when building with
training enabled.
Signed-off-by: George Nash <george.nash@intel.com>
* Fix for DNNL FusedMatMul op.
The bug was in the transpose code.
Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>
* Use agreed upon memory format type when runnig Pooling Gradient in dnnl ep
The dnnl ep does not currently have a way to pass memory_format information
between the forward pooling primitive to the backward pooling primitive.
This change explicitly sets the memory_format to use match that of Onnxruntime.
For both the forward and backward pooling code. This will prevent using un-matched
memory format that could result in an `unimplemented` error from dnnl ep.
Signed-off-by: George Nash <george.nash@intel.com>
* Update dnnl ep to use OneDNN v2.6
Do not run ReduceInfLogSum on the kDnnlExecutionProvider due to a
calculation bug when doing Log or infinity valuse. The fix for this
issue will be part of the next OneDNN release.
Signed-off-by: George Nash <george.nash@intel.com>
* Update PrintMemory function in dnnl ep
This modification can be used to enable/disable memory printing
for dnnl ep develpers. This is considered a developer only feature
and is disabled by default. It must be enabled and code recompiled
to use.
Even if it is enabled it will not actually print any memory because
the developer needs to take the extra step of spefifying the memory
that will be printed to the screen.
Signed-off-by: George Nash <george.nash@intel.com>
* Update binary ops to run on intel GPU when using dnnl ep
Binary ops (i.e. Add, Div, Mul, and Sub ) was updated to no longer
call GetMemoryAndReshape in the past this would move the memory from
CPU to the GPU. This extra call is no longer needed since it is taken
care of by the GetMemoryInOrtFormat call. Removing the GetMemoryAndReshape
prevented copying the memory to GPU twice.
Signed-off-by: George Nash <george.nash@intel.com>
Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>
* rename info to options for TVM EP
* transfer options processing from TVMExecutionProvider to TVMEPOptions
* transfer TVMRunner to separated files
* implement TVMCompiler class
* replace CompileFunc by TVMCompiler object. update TVMRunner. now it does not depend on TvmExecutionProvider
* correct logging of TVM EP options
* RunnerImpl, GERunnerImpl and VMRunnerImpl were implemented
* add prepareComputeInfo method
* remove update_output_shapes flag
* embed all TVM EP dependences to tvm namespace. transfer model compilation from TVMRunner. connect TVMRunnerImpl to TVMRunner
* refactor compileModel method
* small cleaning
* separate TVM EP options data store and processing
* replace TvmTensorShape by InlinedVector with max_size 5
* correct indentation
* update TVM hash
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
* add executor option (vm or graph) and support virtual machine methods
* nullptr check for compile and run methods (see also PR#10211 from microsoft:onnxruntime)
* get output shapes for VM
* remove run_with_benchmark. remove run methods from python api, get it from native side
* get outputs method for VM was implemented
* support multiple input for VM
* update python logging and exception
* small fix
* update tvm with patch for VM API
* update nhwc transformations for TVM EP
* add data alignment check and support set_input_zero_copy for GE in TVM EP
* fix logger name
* return back to apache/tvm with VM fixes instead of local dev branch
* hide customized tvm logger while issue is not resolved. fix tvm warning related to target_host
* flake8 fix
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Work on minimizing memory management calls by
reducing number of allocations and copies.
Replace std::unordered_set to InlinedHashSet
and add usage of InlinedVector.
Employ std::move() to minimize copying and memory allocations.
Remove copying of the const shared data into each of the
PropagateCast transformer instances.
Move inlined_containers.h header to include/common
Adjust AsSpan imlementation for C++ < 17
* add support for bool type
* add TVM EP support for tests
* include TVM EP in python test pool
* fix pylint
* moved technical imports to a separate file
* clean up post build actions & move _ld_preload.py extension to CMake level
* add files for include TVM EP into CI
* implement custom logger for TVM
* replace TVM logging with ONNX RT logging
* update link for TVM EP tutorial
* clean up TVM EP cmake
* add pybind auto enabling for TVM EP
* fix blank spaces
* code review fixes
* replace print with comment
* add list of EP without TVM EP
* enable onnx tests
* disable contrib ops and ml ops
* reuse Dockerfile.ubuntu
* Move install_tvm_test_dependencies.sh out of Docker context dir, update build definition.
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Add abseil and inlined containers typedefs
Introduce TensorShapeVector for shape building.
Use gsl::span<const T> to make interfaces accept different types of vector like args.
Introduce InineShapeVectorT for shape capacity typed instantiations
Refactor cuda slice along with provider shared interfaces
Refactor Concat, Conv, Pad
Build with Conv Einsum and ConvTranspose refactored.
Remove TesnorShape::GetDimsAsVector()
Refactor SliceIterator and SliceIteratorBase
Refactor broadcast
Refactor Pads for twice as long
Remove memory planner intermediate shapes vector
Refactor orttraining
Fix passing TenshroShapeVector to tests
Remove abseil copy and submodule, use FetchContent_Declare/Fetch
Path with separate command
Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.
Although github works with both, this is more precise.
Having an extension also makes it easy to match with regex, when we want to inject code to reroute traffic to our own git mirror.
* squashed commit for standalone tvm execution provider
* critical fix for correct python build with stvm ep
* get tuning log file from ep options. It has priority over AUTOTVM_TUNING_LOG
* updates and fixes
* update parsing of stvm provider options
* add support of external data for onnx model
* add conditional dump of subgraphs
* remove unused code
* get input tensor shapes through provider options. get output shapes for fixed input ones by TVM API
* support AUTO_TVM tuning log file inside ORT. Selector for Ansor and Auto_TVM is provider option (tuning_type)
* add fp16
* add functionality of conversion of model layout to NHWC if need. Necessary parameter was added to STVM provider options
* fix license text in header. fix log format
* small fixes
* fix issues from flake8
* remove model proto construction from GetCapability
* reserve memory for vector of DLTensors
* add simple tutorial for STVM EP
* STVM docs
* jroesch/tvm -> apache/tvm
* remove dead code, unneccessary logs and comments
* fix in readme
* improve tutorial notebook
* tvm update
* update STVM_EP.md
* fix default value
* update STVM_EP.md
* some TODOs for the future development
* shorten long lines
* add hyperlink to STVM_EP.md
* fix Linux CI error
* fix error in csharp test
Co-authored-by: Jared Roesch <jroesch@octoml.ai>
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
* update base image from 11.4.0 to 11.4.2
* update Linux TRT GPU pipeline to TRT 8.2
* update onnx-tensorrt to 8.2-GA
* disable failing TensorRT 8.2 tests.
* update pad test.
* fix
* update win trt ci pipeline to trt 8.2
* test run with cuda 11.4 and cudnn 8.2
* increase timeout
* revert
* revert
* update packaging pipelines to use trt 8.2
* fix typo
* update trt gpu perf pipeline to trt 8.2
* increase timeout
* delete deprecated ci-perf-pipeline.yml
* bump timeout
* adjust timeout packaging
* Add QAttention to DNNL EP
Add QAttention to DNNL EP (limited support and disable for gpu)
update ONEDNN version to 2.4.4
bug fix in getcapability
add memory debug print
Signed-off-by: Wang <zhaoyang.wang@intel.com>
* Address Code Review + MatMulInteger Fix
clean up code and add comments
fix matmulinteger and add fusion rule to enable initialized vector weight zero
points of 0s
update DNNL_TAG to v2.5
Signed-off-by: Wang <zhaoyang.wang@intel.com>
* Linux Compile Fix + rollback ONEDNN to 2.4.4
Signed-off-by: Zhaoyang Wang <zhaoyang.wang@intel.com>
* Fix QAttention Debug build
Signed-off-by: Wang <zhaoyang.wang@intel.com>
* Fix QAttention build if USE_DNNL not specified
Signed-off-by: George Nash <george.nash@intel.com>
Co-authored-by: Wang <zhaoyang.wang@intel.com>
Co-authored-by: MTC <63478620+jeyblu@users.noreply.github.com>
* Enable selecting custom ops in onnxruntime-extensions.
* Move cmake_helper.py.
* Remove over-indented spaces.
* Add doc.
* Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build.
* Modify doc.
* Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path.
* Fix build error.
* Fix build error.
* Use onnxruntime_extensions_path.
* support both submodule and external source folders
* refinement
* Update cgmanifest.json
* Support building onnxruntime-extensions from either git submodule or pre-pulled path.
* Update doc.
* more standard name
* update docs
* add the copyright header
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>