Commit graph

887 commits

Author SHA1 Message Date
Vincent Wang
39dc6ea8a3
Fix to_dlpack Failure on PyTorch-1.10 (#9151)
* workaround to_dlpack fail in new pt version

* add torch code link
2021-09-24 09:48:07 +08:00
ke1337
6e83392ff1
Bump up TVM version to avoid conflict with existing one (#9159)
* Bump up tvm version

* Bump up onnxruntime-tvm version

There are some c++17 related fixes in TVM

Co-authored-by: KeDengMS <kedeng@microsoft.com>
2021-09-22 17:39:19 -07:00
Suffian Khan
47888392ab
Fix nightly CI pipeline to generate ROCm 4.2 wheels and add ROCm 4.3.1 wheels (#9101)
* make work for both rocm 4.2 and rocm 4.3.1

* fix rocm 4.3.1 docker image reference

* fix CUDA_VERSION to ROCM_VERSION

* fix ReduceConsts conflict def

* add ifdef to miopen_common.h as well

* trailing ws
2021-09-19 23:36:03 -07:00
Tiago Koji Castro Shibata
12515552d1
Remove cpuinfo from WCOS builds (#9076) 2021-09-16 12:05:47 -07:00
Tracy Sharpe
4828d2ebb1
MLAS: port aarch64 sgemv kernel to Windows ARM64 (#9071) 2021-09-15 18:40:40 -07:00
Changming Sun
0270ab17c5
Set onnxruntime_DISABLE_RTTI to default OFF (#9049) 2021-09-14 13:53:02 -07:00
Rajalakshmi Srinivasaraghavan
e83cc534d4 Fix cmake POWER10 detection
Recent commit 60c98a8 changed variable mlas_common_srcs which affects
POWER10 detection.
2021-09-12 11:56:55 -07:00
Ryan Hill
c3321b1778
Fix NVTX profiling so it can run in the shared CUDA provider (#9035)
* Move NVTX profiling so it can run in the shared provider properly
2021-09-11 00:35:54 -07:00
Zuwei Zhao
ff66cfdfa6
Enable linking in exception throwing support library when build onnxruntime wasm. (#8973)
* Enable linking in exception throwing support library when build onnxruntime webassembly containing onnxruntime-extensions.

* Add flag in build.py to enable linking exceptions throwing library.

* Update onnxruntime-extensions document and bind custom_ops build flag with use_extensions.

* Update doc.

* Update cgmanifest.json.

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
2021-09-10 22:09:16 +08:00
George Wu
00eca42413
cmake_policy(SET CMP0104 OLD) (#8793) 2021-09-07 13:12:50 -07:00
Changming Sun
60c98a86b7
CMake file changes for macOS universal2 support (#8953) 2021-09-04 13:30:33 -07:00
stevenlix
a9776d1c70
Add QDQ model support in TensorRT EP (#8969)
* disable setting dynamic range for QDQ model

* update cgmanifest

* Update cgmanifest.json
2021-09-03 19:33:34 -07:00
Gary Miguel
47435311f4
Include pytorch_export_contrib_ops in inference builds (#8878)
* Include pytorch_export_contrib_ops in inference builds

Rename / move it from tools/python/register_custom_ops_pytorch_exporter
to onnxruntime/python/tools/pytorch_export_contrib_ops.

Rationale for inclusion in inference builds:
This code is potentially useful for anyone using ORT, not just training.

Rationale for new name:
"Contrib op" is the nomenclature used within ORT to refer to the set of
ops that are not in the standard op set but are included by default with
ORT. This is more specific than "custom op", which is what the PyTorch
exporter uses to refer to any non-standard op.

Step 1 of addressing #8818. After this is merged I will update the docs.

* Enable test_pytorch_export_contrib_ops.py in CI

Fixes AB#1342330
2021-09-02 14:26:58 -07:00
Changming Sun
1a34775fe9
Fix the benchmark code (#8926) 2021-09-02 10:36:24 -07:00
satyajandhyala
4570d85f20
Move setdlopenflags calls into _pybind_state.py (#8916)
* Use PROTOBUF_LIB instead of protobuf::libprotobuf

* Moved setdlopenflags to _pybind_state.py

* Copy the generated _pybind_state.py to required location for Windows.
2021-09-02 09:54:32 -07:00
pengwa
3eb08d4dc7
custom autograd func memory (#8901)
* remove PythonOpGrad control dependency && avoid segmentation fault

* comment alignment

* fix bugs
2021-09-01 09:29:26 +08:00
Changming Sun
a9a0d3f6fa Update min supported macOS version to 10.14 2021-08-31 16:09:48 -07:00
Yulong Wang
206537936f
[js/web] enable proxy worker for wasm backend (#8862) 2021-08-31 10:23:42 -07:00
Maajid khan
b7129305be
[OpenVINO-EP] UEP v3.1 Release with OpenVINO 2021.4 (#8892)
* Add command to skip tests

* Remove support for OV_2021.3_LTS and ov_2021.1

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Removed request_id parameter from all references

The request_id parameter was used with the ov_2020.3
release. Starting from the 2020.4 OV release, the input_name
parameter is used instead to get the
KernelContext_GetInput.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enabling CI Logs in the branch

* CI Commits to enable logs

* Enable CI Print

* Added Imagescaler op to the supported ops list

Fixes the test_tiny_yolo_V2 opset 8 model so that it is
fully supported on OV-EP. This model is the older variation
of the tiny_yolo_v2 model, which has the Imagescaler op.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added ops to fully support yolov3 model

-Added changes to support yolov3 opset 10 model
fully on CPU_FP32.

-This also increases the operator coverage for GPU
hardware, thereby enabling the yolov3 model on GPU
with fewer subgraphs.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enabling tiny_yolov3 model fully on CPU

->Enabled tiny_yolov3 model fully on CPU.

-> Also reduces the number of subgraphs
to infer this model on GPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Adding GatherND op support for CPU and GPU

->This enables yolov3_pytorch model to work
with fewer subgraphs on CPU and GPU Devices.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes Albert model for ISV customer

ConvTranspose op was getting rejected
due to a condition. Fixed it.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disabling these 4 cpp tests for openvino-ep

These unit tests are failing with special conditions
for the conv_transpose op with the output_shape attribute,
so disabling them for now.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Docker file changes for 2021.4-v3.1

* Removing duplicate code

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* ReduceMax No dimension supported

* Fixes failing protobuf issue for docker

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Excluding openvinoep type for convtranspose test

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disabled 2 Failing convtranspose tests with TensorRT EP

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: Aravind Gunda <aravindx.gunda@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
2021-08-31 09:23:13 -07:00
satyajandhyala
31926176ac
Support external custom operator schemas on Ubuntu (#8807)
* Expose symbols in onnx and protobuf namespaces in python when building with --enable_external_custom_op_schemas

* Add external onnx and protobuf files to wheel

* Added an example to demonstrate external custom ops use-case

* Added a Linux build pipeline to test external custom ops
2021-08-28 11:05:21 -07:00
Zuwei Zhao
89e8bff121
Enable selecting custom ops in onnxruntime-extensions. (#8826)
* Enable selecting custom ops in onnxruntime-extensions.

* Move cmake_helper.py.

* Remove over-indented spaces.

* Add doc.

* Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build.

* Modify doc.

* Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path.

* Fix build error.

* Fix build error.

* Use onnxruntime_extensions_path.

* support both submodule and external source folders

* refinement

* Update cgmanifest.json

* Support building onnxruntime-extensions from either git submodule or pre-pulled path.

* Update doc.

* more standard name

* update docs

* add the copyright header

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2021-08-27 21:45:52 -07:00
Tang, Cheng
ae7f2d824d
Share the execution provider instance for training (#8719)
* separate the training python module; share the execution provider instance

* fix build break

* fix cuda test crash; reorg the python module code base

* use correct env

* use provider customized hash func

* fix build break

* fix rocm break

* use const ref in argument

* rename the file

* move hash func to training module
2021-08-27 16:23:35 -07:00
Guoyu Wang
6a1939252f
Fix Android java API failure (#8865)
* Fix Android Package break

* Without java fix -- pipeline should fail

* With java fix, should pass now

* address CR comments
2021-08-27 15:58:56 -07:00
Scott McKay
0034ad72e6
Minimize changes to fix missing symbols used from C# (#8867)
* Revert "Cleanup C# bindings to add EP (#8810)"

This reverts commit b21ea00020.

* Add back in a minimal set of changes.
Provide stubs for a limited set of things
  - things called from C# using a static lib of ORT built for mac/ios
  - things in OrtApis that are not included in the build by default
  - things in OrtApis that are excluded in a minimal build

* Cleanup order of EPs in test

* Fix unused function in ROCM build
2021-08-28 07:10:14 +10:00
Edward Chen
7e53a1df6f
Enable selector action transformer infrastructure in minimal build. (#8804) 2021-08-27 17:16:05 +10:00
Rachel Guo
1886f1a737
Make SparseTensor infrastructure optional (#8802)
Add cmake parameter and #ifdefs to allow for disabling sparse tensor support. This comes with a significant binary size cost so we want to be able to exclude it in a minimal build.
2021-08-27 17:12:26 +10:00
Yulong Wang
e8564d6597
[js/web] update emsdk to v2.0.26 (#8653)
* update emsdk to v2.0.26

* fix pooling build warning

* fix build break

* use pragma diagnostic semantic only when __GNUC__ is defined

* fix build break

* disable AttentionPastState_dynamic
2021-08-26 15:31:34 -07:00
Scott McKay
b21ea00020
Cleanup C# bindings to add EP (#8810)
Fix C# add EP bindings.
Add stubs to ORT so that if EP is not included in the build we return a graceful error message.
Move declaration of stubs into C API and out of EP so they're in one place and are easier to use (no extra header required in the C/C++ world and consistent with the CUDA EP setup).
Fix inconsistency in ROCM EP.
Cleanup a few other things.
2021-08-26 13:59:40 +10:00
Jorn Tuyls
9053e1522d
Check for Python_EXECUTABLE in pyxir.cmake to fix Vitis AI EP build (#8631)
Co-authored-by: Jorn Tuyls <jornt.tuyls@gmail.com>
2021-08-24 08:39:50 -07:00
Changming Sun
4bfff45859
Downgrade Eigen (#8817) 2021-08-23 18:06:23 -07:00
Tiago Koji Castro Shibata
62c0d24340
Fix Windows Store build (#8753)
* Remove APIs unavailable in Store in #8349, #8178, #8065

* Add UWP stubs of C runtime functions

* Remove UWP incompatible tests from UWP build

* Remove incompatible tests from Store

* Use UWP stubs in store only

* Skip partition check outside of Windows

* Remove unused WRL include

* Workaround Windows header not including what it uses

* Fix precompiled header name clash

* Workaround SDK bugs

* DXCore workaround in Win7

* Fix warning

* Fix more warnings

* Bump WinML to target Windows 8

* Fix more warnings

* Remove unnecessary workarounds

* Remove Desktop only APIs from DML adapter
2021-08-23 11:19:03 -07:00
Suffian Khan
9fa0d8392a
Extend node debugging utilities to push tensors and node placement to SQL database (#8672)
* adding support for tracing to sqldb instead of files

* use compiled statements

* script to pull tensors from db

* link sqlite3

* remove node info redundant with onnx graph

* addressing PR comments

* address PR comments and include program counter

* third party notice

* use find_package

* add to cgmanifests.json

* address thread safety and add pid suffix

* build fi

* python script to select on devicetype

* remove unpopulated and redundant Shape and Type fields

* comment

* comment

* PR comments

* add graph execution counter to session state

* move increment to inference session

* std::endl to \n

* ifdef on graph execution counter

* add ifdef to inference session

* move DEBUG_NODE_INPUTS_OUTPUTS to CMakeLists.txt
2021-08-21 00:40:12 -07:00
pengwa
39059f2539
enable torch interop build (#8493)
* fix build - python.h not found

* disable --build_shared_lib for ortmodule tests

* fix

* fix the build flag

* disable --build_shared_lib for training path (not only for ortmodule)

* fix missing test model files

* disable test CApiTest.test_custom_op_library when ENABLE_TRAINING_TORCH_INTEROP is ON

* enable custom_op_library build

* fix build

* fix

* merge master and fix build failure

* build onnx_test_runner when onnxruntime_ENABLE_TRAINING_TORCH_INTEROP is ON

* resolve comments

* use --enable_training_torch_interop to replace "onnxruntime_ENABLE_TRAINING_TORCH_INTEROP=ON"
2021-08-19 09:16:32 +08:00
Chen Fu
00b345eb7b
ARM Neon S8S8 kernel for QGemm (#8695)
Using signed int, the qgemm kernel avoids extending uint8 to int16 while computing matrix multiplication, achieving higher performance. We also found that using only the lower 64 bits of the vector registers to load the A and B matrices gives further performance improvements. We also experimented with using ldp to load two 64-bit values in one shot versus using two ldr instructions to load one 64-bit value at a time; on both big and little cores there is no noticeable difference.

Submitting the LDP version. At this point we don't need to choose kernel based on micro-architecture.

Inference time of resnet50, thread count 2

Big Core on Pixel 3a
Current master: 292.947 ms
First iteration S8S8: 188.239 ms
LDP load two 64b reg: 178.715 ms
LDR load one 64b reg: 179.536 ms

Little Core
Master: 546.317 ms
S8S8: 513.332 ms
LDP: 489.19 ms
LDR: 497.865 ms

Raspberry Pi 3B+
Master: 660.08 ms
S8S8: 608.577 ms
LDP: 603.675 ms
LDR: 602.075 ms
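
The timings above can be turned into speedup factors relative to master. The following is a minimal sketch that only restates the numbers reported in this commit message; the dictionary layout and the `speedup` helper are illustrative names, not part of the repository.

```python
# Inference times (ms) for resnet50 with 2 threads, copied from the commit message.
# The structure and names below are hypothetical, chosen only for this illustration.
timings_ms = {
    "Pixel 3a big core": {"master": 292.947, "S8S8": 188.239, "LDP": 178.715, "LDR": 179.536},
    "Pixel 3a little core": {"master": 546.317, "S8S8": 513.332, "LDP": 489.19, "LDR": 497.865},
    "Raspberry Pi 3B+": {"master": 660.08, "S8S8": 608.577, "LDP": 603.675, "LDR": 602.075},
}


def speedup(baseline_ms, candidate_ms):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Speedup of each kernel variant over master, per device.
speedups = {
    device: {name: speedup(t["master"], ms) for name, ms in t.items() if name != "master"}
    for device, t in timings_ms.items()
}

for device, per_variant in speedups.items():
    for variant, factor in per_variant.items():
        print(f"{device}: {variant} is {factor:.2f}x faster than master")
```

On these figures the gain is largest on the big core, and the LDP and LDR variants stay within about 2% of each other on every device, which is consistent with the statement that the kernel does not need to be chosen per micro-architecture.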
2021-08-18 09:58:47 -07:00
Rachel Guo
78759059f1
[CoreML EP]Make coreml ep build on non-macOS platform (#8677)
* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* clean

* remove unused defs

* correct typo

* remove onnxruntime_coreml_proto

* cr comments

* enable nnapi/coreml in minimal build

* enable nnapi/coreml in one build

* refine dependencies

* fix nnapi build failure and remove onnxruntime_coreml_proto dependencies in unit tests cmake files

* small fix

* fix

* fix build

* revert

* fix build

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2021-08-18 09:35:32 -07:00
Wei-Sheng Chin
47b3ecb53b
Packaging pipeline now builds with PythonOp (aka running autograd.Function) (#8652)
This PR disables UTs in the training package pipelines
for building packages with PythonOp (torch.autograd.Function).
2021-08-17 10:55:13 -07:00
KeDengMS
d0ff2621ee
[Nuphar] Fix Windows build in VS 2019 (#8728)
Update TVM to fix c++17 build break in VS 2019
Remove tvm::nnvm from build
2021-08-13 16:13:34 -07:00
Chen Fu
8f7422be69
Limiting platforms where cpuinfo is included (#8716)
* Limiting platforms where cpuinfo is included

* Suppress strncpy warning during msvc build

Co-authored-by: Chen Fu <fuchen@microsoft.com>
2021-08-13 14:46:21 -07:00
George Nash
e695cd304a
Dnnl refactor (#8627)
* dnnl ep rework

    rework DnnlTensor,DnnlNode,DnnlSubgraph to support arbitrary graph topology and tensor data types

    rework GetCapability to claim nodes in graph greedily from node topological ordering and delay creation of DnnlSubgraph until Compile

    rework compile to have DnnlSubgraphPrimitive as the object to handle primitive creation and execution
        instead of thread local primitive pool which duplicates intermediate memory allocated by the EP across threads

    DnnlSubgraphPrimitive provides helpers to handle many common functions for each dnnl primitive builder and becomes the centralized place to store input, output, intermediate memories, initializer memories, etc.
        it provides functions to obtain input memories with automatic reordering/reshaping and moving between engines
        it provides interfaces to add primitive, set output memory for single node, etc.

    add CONCURRENT_EXEC compile flag for dnnl library as without it, convolution primitive cannot be created and executed on different threads

    enable unit tests to run on dnnl ep as well if built with dnnl ep

    add dnnl ep support for Matmulinteger

* Add Relu to the DNNL refactor

Signed-off-by: George Nash <george.nash@intel.com>

* Add Convolution op to the DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Add Pooling ops to the DNNL rework

This adds the following ops:
    - AveragePool
    - GlobalAveragePool
    - GlobalMaxPool
    - MaxPool

Note: Pooling with dilation is not yet supported.
Note: GlobalLpPool, LpPool, MaxRoiPool, and MaxUnpool are not supported yet.

Signed-off-by: George Nash <george.nash@intel.com>

* Add Sum op to the DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Add ConvGrad op to the DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Add MaxPoolGrad and AveragePoolGrad ops to DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Added lrn operator to the refactored code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added ReduceMean DNNL op to the refactor code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added Softmax DNNL op for the refactored code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added BatchNorm DNNL op inference-only for refactored code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added Binary Ops to DNNL rework

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Added ReluGrad to DNNL Rework

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Update OneDNN tag to v2.3

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Added support for memory up to dim size 12

this is to fix the CI test cases that contain binary ops of input dim
size > 5

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Prevent claiming support for float16 and bfloat16 when only float is supported

The string.find call that was used caused the code to claim support
for float16 and bfloat16 when we only supported float. We now explicitly
check for the data type itself or for the data type with a 7-letter prefix,
i.e. prefixed with "tensor("

Signed-off-by: George Nash <george.nash@intel.com>
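
The substring pitfall described in this item can be sketched as follows. This is an illustration under stated assumptions, not the EP's actual C++ code: the function names and the `SUPPORTED` set are hypothetical, and the exact-match check is one possible way to implement "the data type or the data type prefixed with `tensor(`".

```python
TENSOR_PREFIX = "tensor("  # the 7-letter prefix mentioned in the commit message
SUPPORTED = {"float"}      # hypothetical: only float is supported


def claims_support_substring(type_str):
    """Buggy check: a substring search also matches float16 and bfloat16."""
    return any(type_str.find(t) != -1 for t in SUPPORTED)


def element_type(type_str):
    """Strip a surrounding 'tensor(...)' wrapper if present."""
    if type_str.startswith(TENSOR_PREFIX) and type_str.endswith(")"):
        return type_str[len(TENSOR_PREFIX):-1]
    return type_str


def claims_support_exact(type_str):
    """Fixed check: compare the whole element type, not a substring."""
    return element_type(type_str) in SUPPORTED


for candidate in ["float", "tensor(float)", "tensor(float16)", "tensor(bfloat16)"]:
    print(candidate, claims_support_substring(candidate), claims_support_exact(candidate))
```

The substring version accepts all four strings because "float" occurs inside "float16" and "bfloat16"; the exact version accepts only the first two.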

* Disable uint8 mul and div, improve type conversion

Disable mul_uint8 and div_uint8 test cases as they use modulo for
overflow handling while onednn uses saturation

improve type conversion using enum instead of string comparison as well
as adding more types

Signed-off-by: Wang <zhaoyang.wang@intel.com>
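
The mismatch between modulo and saturating overflow handling mentioned in this item can be shown with a small sketch. The helper names are hypothetical, and plain Python integers stand in for uint8 values; this is not code from either library.

```python
UINT8_MAX = 255


def mul_uint8_wraparound(a, b):
    """Multiply with modulo (wraparound) overflow handling, as the test cases expect."""
    return (a * b) % (UINT8_MAX + 1)


def mul_uint8_saturate(a, b):
    """Multiply with saturation: results above 255 are clamped to 255."""
    return min(a * b, UINT8_MAX)


a, b = 16, 20  # 16 * 20 = 320, which does not fit in uint8
print(mul_uint8_wraparound(a, b))  # 320 mod 256 = 64
print(mul_uint8_saturate(a, b))    # clamped to 255
```

When the product fits in 8 bits both approaches agree, so only test inputs that overflow expose the difference.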

Co-authored-by: Wang <zhaoyang.wang@intel.com>
Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>
2021-08-13 14:15:43 -07:00
Guoyu Wang
59e59b9e0e
Fix unknown warning "-Wformat-truncation" build failure for arm (#8721)
* fix clang arm build failure

* address CR comments

* Change to cmake check flag option

* Add missing __aarch64__
2021-08-12 23:47:03 -07:00
stevenlix
f00933c41a
Update TensorRT parser to the latest (#8712)
* update trt parser to the latest

* update cgmanifest

* update cgmanifest

* update setup_env_trt to cuda11.4

* Update setup_env_trt.bat
2021-08-12 18:10:51 -07:00
Nathaniel McVicar
ce6675a74e Avoid setting compile options on system libs for protobuf on Windows
Signed-off-by: Nathaniel McVicar <namcvica@microsoft.com>
2021-08-12 15:56:39 -07:00
ytaous
0725f80d2d
Revert "Fix Windows Store build (#8481)" (#8679)
This reverts commit 53e7831b53.
2021-08-11 00:37:36 -07:00
Tiago Koji Castro Shibata
53e7831b53
Fix Windows Store build (#8481)
* Remove APIs unavailable in Store in #8349, #8178, #8065

* Add UWP stubs of C runtime functions

* Remove UWP incompatible tests from UWP build

* Remove incompatible tests from Store

* Use UWP stubs in store only

* Skip partition check outside of Windows

* Remove unused WRL include

* Workaround Windows header not including what it uses

* Fix precompiled header name clash

* Workaround SDK bugs

* DXCore workaround in Win7

* Fix warning

* Fix more warnings

* Bump WinML to target Windows 8

* Fix more warnings

* Remove unnecessary workarounds
2021-08-10 15:19:30 -07:00
Edward Chen
20f006c580
Remove flake8 check from CMake build. (#8662) 2021-08-09 14:10:36 -07:00
Tang, Cheng
6d3c2c85ef
Integrate eager mode source code into onnxruntime repo (#8584)
* integrate eager mode source code; build with cmake and integrate the python test

* Adding the python path for importing libraries in the Eager mode

* fix clang break; check if training and python enabled

* handling the linking of torch libraries across multiple platforms

* merge and fix the naming

* add build instruction

Co-authored-by: Abhishek Jindal <abjindal@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: ajindal1 <abjindal@microsoft.com>
2021-08-06 08:30:27 -07:00
Ashwini Khade
96eb9810ba
Update onnx (#8458)
* updates for picking onnx commit

* add tests filter to c# tests

* plus test fixes

* fix versioning for contrib ops

* fix tests

* test filter for optional ops

* more versioning related updates

* fix test

* fix layernorm spec

* more updates

* update docs

* add more test filters

* more filters

* update binary size threshold

* update docs

* plus more fixes

* updates per review

* update to release commit

* add filters for optional type tests

* plus updates
2021-08-05 09:21:44 -07:00
Changming Sun
375e86f0a0
Make DNNL EP not depending on onnx (#8588) 2021-08-03 14:11:36 -07:00
Weixing Zhang
deab284e4c
fix build failure with --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1 (#8587)
* fix build failure with --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1

* another compile error and add onnxruntime_USE_ROCM

* braces alignment

Co-authored-by: suffian khan <sukha@microsoft.com>
2021-08-03 09:02:49 -07:00
stevenlix
d14b08d09c
Update onnx-tensorrt parser and cgmanifest (#8585)
* update onnx-tensorrt parser and cgmanifest.json

* update cgmanifest
2021-08-02 18:55:33 -07:00