Commit graph

872 commits

Author SHA1 Message Date
pengwa
3eb08d4dc7
custom autograd func memory (#8901)
* remove PythonOpGrad control dependency && avoid segement fault

* comment alignment

* fix bugs
2021-09-01 09:29:26 +08:00
Changming Sun
a9a0d3f6fa Update min supported macOS version to 10.14 2021-08-31 16:09:48 -07:00
Yulong Wang
206537936f
[js/web] enable proxy worker for wasm backend (#8862) 2021-08-31 10:23:42 -07:00
Maajid khan
b7129305be
[OpenVINO-EP] UEP v3.1 Release with OpenVINO 2021.4 (#8892)
* Add command to skip tests

* Remove support for OV_2021.3_LTS and ov_2021.1

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Removed request_id parameter from all references

request_id parameter was being used with ov_2020.3
release. Starting from 2020.4 OV release, input_name
paramater is being used instead to get the
KernelContext_GetInput.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enabling CI Logs in the branch

* CI Commits to enable logs

* Enable CI Print

* Added Imagescaler op to the supported op's list

Fixes test_tiny_yolo_V2 opset 8 model to support
fully on OV-EP. This model is the older variation
of tiny_yolo_v2 model which has Imagescaler op.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added ops to fully support yolov3 model

-Added changes to support yolov3 opset 10 model
fully on CPU_FP32.

-This also increases the operator coverage for GPU
hardware. There by enabling yolov3 model on GPU
with fewer subgraphs.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enabling tiny_yolov3 model fully on CPU

->Enabled tiny_yolov3 model fully on CPU.

-> Also reduces the number of subgraphs
to infer this model on GPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Adding GatherND op support for CPU and GPU

->This enables yolov3_pytorch model to work
with fewer subgraphs on CPU and GPU Devices.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes Albert model for ISV customer

ConvTranspose op was getting rejected
due to a condition. Fixed it.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disabling this 4 cpp tests for openvino-ep

These unit tests are failing with special conditions
for conv_transpose op with output_shape attribute.
so disabling them for now.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Docker file changes for 2021.4-v3.1

* Remvoing duplicate code

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* ReduceMax No dimension supported

* Fixes failing protobuf issue for docker

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Excluding openvinoep type for convtranpose test

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disabled 2 Failing convtranspose tests with TensorRT EP

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: Aravind Gunda <aravindx.gunda@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel/com>
2021-08-31 09:23:13 -07:00
satyajandhyala
31926176ac
Support external custom operator schemas on Ubuntu (#8807)
* Expose symbols in onnx and protobuf namespaces in python when building with --enable_external_custom_op_schemas

* Add external onnx and protobuf files to wheel

* Added an example to demonstrate external custom ops use-case

* Added a Linux build pipeline to test external custom ops
2021-08-28 11:05:21 -07:00
Zuwei Zhao
89e8bff121
Enable selecting custom ops in onnxruntime-extensions. (#8826)
* Enable selecting custom ops in onnxruntime-extensions.

* Move cmake_helper.py.

* Remove over-indented spaces.

* Add doc.

* Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build.

* Modify doc.

* Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path.

* Fix build error.

* Fix build error.

* Use onnxruntime_extensions_path.

* support both submodule and external source folders

* refinement

* Update cgmanifest.json

* Support building onnxruntime-extensions from either git submodule or pre-pulled path.

* Update doc.

* more standard name

* update docs

* add the copyright header

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2021-08-27 21:45:52 -07:00
Tang, Cheng
ae7f2d824d
Share the execution provider instance for training (#8719)
* seperate the training python module; share the execution proivder instance

* fix build break

* fix cuda test crash; reorg the python module code base

* se correct env

* use provider customized hash func

* fixbuild break

* fix rocm break

* use const ref in argument

* rename the file

* move hash func to trainiing module
2021-08-27 16:23:35 -07:00
Guoyu Wang
6a1939252f
Fix Android java API failure (#8865)
* Fix Android Package break

* Without java fix -- pipeline should fail

* With java fix, should pass now

* address CR comments
2021-08-27 15:58:56 -07:00
Scott McKay
0034ad72e6
Minimize changes to fix missing symbols used from C# (#8867)
* Revert "Cleanup C# bindings to add EP (#8810)"

This reverts commit b21ea00020.

* Add back in a minimal set of changes.
Provide stubs in for a limited set of things
  - things called from C# using a static lib of ORT built for mac/ios
  - things in OrtApis that are not included in the build by default
  - things in OrtApis that are excluded in a minimal build

* Cleanup order or EPs in test

* Fix unused function in ROCM build
2021-08-28 07:10:14 +10:00
Edward Chen
7e53a1df6f
Enable selector action transformer infrastructure in minimal build. (#8804) 2021-08-27 17:16:05 +10:00
Rachel Guo
1886f1a737
Make SparseTensor infrastructure optional (#8802)
Add cmake parameter and #ifdefs to allow for disabling sparse tensor support. This comes with a significant binary size cost so we want to be able to exclude it in a minimal build.
2021-08-27 17:12:26 +10:00
Yulong Wang
e8564d6597
[js/web] update emsdk to v2.0.26 (#8653)
* update emsdk to v2.0.26

* fix pooling build warning

* fix build break

* use pragma diagnostic semantic only when __GNUC__ is defined

* fix build break

* disable AttentionPastState_dynamic
2021-08-26 15:31:34 -07:00
Scott McKay
b21ea00020
Cleanup C# bindings to add EP (#8810)
Fix C# add EP bindings.
Add stubs to ORT so that if EP is not included in the build we return a graceful error message.
Move declaration of stubs into C API and out for EP so they're in one place and are easier to use (no extra header required in the C/C++ world and consistent with the CUDA EP setup).
Fix inconsistency in ROCM EP.
Cleanup a few other things.
2021-08-26 13:59:40 +10:00
Jorn Tuyls
9053e1522d
Check for Python_EXECUTABLE in pyxir.cmake to fix Vitis AI EP build (#8631)
Co-authored-by: Jorn Tuyls <jornt.tuyls@gmail.com>
2021-08-24 08:39:50 -07:00
Changming Sun
4bfff45859
Downgrade Eigen (#8817) 2021-08-23 18:06:23 -07:00
Tiago Koji Castro Shibata
62c0d24340
Fix Windows Store build (#8753)
* Remove APIs unavailable in Store in #8349, #8178, #8065

* Add UWP stubs of C runtime functions

* Remove UWP incompatible tests from UWP build

* Remove incompatible tests from Store

* Use UWP stubs in store only

* Skip partition check outside of Windows

* Remove unused WRL include

* Workaround Windows header not including what it uses

* Fix precompiled header name clash

* Workaround SDK bugs

* DXCore workaround in Win7

* Fix warning

* Fix more warnings

* Bump WinML to target Windows 8

* Fix more warnings

* Remove unnecessary workarounds

* Remove Desktop only APIs from DML adapter
2021-08-23 11:19:03 -07:00
Suffian Khan
9fa0d8392a
Extend node debugging utilities to push tensors and node placement to SQL database (#8672)
* adding support for tracing to sqldb instead of files

* use compiled statements

* script to pull tensors from db

* link sqlite3

* remove node info redundant with onnx graph

* addressing PR comments

* address PR comments and include program counter

* third party notice

* use find_pacakge

* add to cgmanifests.json

* address thread safety and add pid suffix

* build fi

* python script to select on devicetype

* remove unpopulated and redundant Shape and Type fields

* comment

* comment

* PR comments

* add graph execution counter to session state

* move increment to inference session

* std::endl to \n

* ifdef on graph execution counter

* add ifdef to inference session

* move DEBUG_NODE_INPUTS_OUTPUTS to CMakeLists.txt
2021-08-21 00:40:12 -07:00
pengwa
39059f2539
enable torch interop build (#8493)
* fix build - python.h not found

* disable --build_shared_lib for ortmodule tests

* fix

* fix the build flag

* disable --build_shared_lib for training path (not only for ortmodule)

* fix missing test model files

* disable test CApiTest.test_custom_op_library when ENABLE_TRAINING_TORCH_INTEROP is ON

* enable custom_op_library build

* fix build

* fix

* merge master and fix build failure

* build onnx_test_runner when onnxruntime_ENABLE_TRAINING_TORCH_INTEROP is ON

* resolve comments

* use --enable_training_torch_interop to replace "onnxruntime_ENABLE_TRAINING_TORCH_INTEROP=ON"
2021-08-19 09:16:32 +08:00
Chen Fu
00b345eb7b
ARM Neon S8S8 kernel for QGemm (#8695)
Using signed int, qgemm kernel avoids extending uint8 to int16 while computing matrix multiplication, achieving higher performance. We also find that by using only lower 64b of vector registers to load A and B matrix, we can get further performance improvements. We also experimented with using ldp to load two 64b in one shot, vs using two ldr to load one 64b at a time, in both Big and little cores, there is no noticeable differences.

Submitting the LDP version. At this point we don't need to choose kernel based on micro-architecture.

Inference time of resnet50, thread count 2

Big Core on Pixel 3a
Current master: 292.947 ms
First iteration S8S8: 188.239 ms
LDP load two 64b reg: 178.715 ms
LDR load one 64b reg: 179.536 ms

Little Core
Master: 546.317 ms
S8S8: 513.332 ms
LDP: 489.19 ms
LDR: 497.865 ms

Raspberry Pi 3B+
Master: 660.08 ms
S8S8: 608.577 ms
LDP: 603.675 ms
LDR 602.075 ms
2021-08-18 09:58:47 -07:00
Rachel Guo
78759059f1
[CoreML EP]Make coreml ep build on non-macOS platform (#8677)
* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* clean

* remove unused defs

* correct typo

* remove onnxruntime_coreml_proto

* cr comments

* enablie nnapi/coreml in minimal build

* enable nnapi/coreml in one build

* refine dependencies

* fix nnapi build failure and remove onnxruntime_coreml_proto dependencies in unit tests cmake files

* small fix

* fix

* fix build

* revert

* fix build

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2021-08-18 09:35:32 -07:00
Wei-Sheng Chin
47b3ecb53b
Packaging pipeline now builds with PythonOp (aka running autograd.Function) (#8652)
This PR disable UTs in training's package pipelines 
for building packages with PythonOp (torch.autograd.Function).
2021-08-17 10:55:13 -07:00
KeDengMS
d0ff2621ee
[Nuphar] Fix Windows build in VS 2019 (#8728)
Update TVM to fix c++17 build break in VS 2019
Remove tvm::nnvm from build
2021-08-13 16:13:34 -07:00
Chen Fu
8f7422be69
Limiting platforms where cpuinfo is included (#8716)
* Limiting platforms where cpuinfo is included

* Suppress strncpy warning during msvc build

Co-authored-by: Chen Fu <fuchen@microsoft.com>
2021-08-13 14:46:21 -07:00
George Nash
e695cd304a
Dnnl refactor (#8627)
* dnnl ep rework

    rework DnnlTensor,DnnlNode,DnnlSubgraph to support arbitrary graph topology and tensor data types

    rework GetCapability to claim nodes in graph greedily from node topological ordering and delay creation of DnnlSubgraph until Compile

    rework compile to have DnnlSubgraphPrimitive as the object to handle primitive creation and execution
        instead of thread local primitive pool which duplicates intermediate memory allocated by the EP across threads

    DnnlSubgraphPrimitive provides helpers to handle many common functions for each dnnl primitive builder and become the centralized place to store input, output, intermediate memories, initializer memories and etc
        it provides functions to obtain input memories with automatic reordering/reshaping and moving between engines
        it provides interfaces to add primitive, set output memory for single node and etc

    add CONCURRENT_EXEC compile flag for dnnl library as without it, convolution primitive cannot be created and executed on different threads

    enable unit tests to run on dnnl ep as well if built with dnnl ep

    add dnnl ep support for Matmulinteger

* Add Relu to the DNNL refactor

Signed-off-by: George Nash <george.nash@intel.com>

* Add Convolution op to the DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Add Pooling ops to the DNNL rework

This adds the following ops:
    - AveragePool
    - GlobalAveragePool
    - GlobalMaxPool
    - MaxPool

Note: Pooling with dilation is not yet supported.
Note: GlobalLpPool, LpPool, MaxRoiPool, and MaxUnpool are not supported yet.

Signed-off-by: George Nash <george.nash@intel.com>

* Add Sum op to the DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Add ConvGrad op to the DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Add MaxPoolGrad and AveragePoolGrad ops to DNNL rework

Signed-off-by: George Nash <george.nash@intel.com>

* Added lrn operator to the refactored code

Signed-off by chethan.palangoutu.keshava@intel.com

* Added ReduceMean DNNL op to the refactor code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added Softmax DNNL op for the refactored code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added BatchNorm DNNL op inference-only for refactored code

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added Binary Ops to DNNL rework

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Added ReluGrad to DNNL Rework

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Update OneDNN tag to v2.3

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Added support for memory upto dim size 12

this is to fix the CI test cases that contain binary ops of input dim
size > 5

Signed-off-by: Wang <zhaoyang.wang@intel.com>

* Prevent claiming support for float16 and bfloat16 when only float is suppoted

By using The string.find used was causing the code to claiming support
for float16 and bfloat16 when we only supported float. We now explicitly
check the code for the data type or the data type with a 7 letter prefix
basically prefixed with "tensor("

Signed-off-by: George Nash <george.nash@intel.com>

* Disable uint8 mul and div, improve type conversion

Disable mul_uint8 and div_uint8 test cases as they use modulo for
overflow handling while onednn uses saturation

improve ype conversion using enum instead of string comparsion as well
as adding more types

Signed-off-by: Wang <zhaoyang.wang@intel.com>

Co-authored-by: Wang <zhaoyang.wang@intel.com>
Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>
2021-08-13 14:15:43 -07:00
Guoyu Wang
59e59b9e0e
Fix unknown warning "-Wformat-truncation" build failure for arm (#8721)
* fix clang arm build failure

* address CR comments

* Change to cmake check flag option

* Add missing __aarch64__
2021-08-12 23:47:03 -07:00
stevenlix
f00933c41a
Update TensorRT parser to the latest (#8712)
* update trt parser to the latest

* update cgmanifest

* update cgmanifest

* update setup_env_trt to cuda11.4

* Update setup_env_trt.bat
2021-08-12 18:10:51 -07:00
Nathaniel McVicar
ce6675a74e Avoid setting compile options on system libs for protobuf on Windows
Signed-off-by: Nathaniel McVicar <namcvica@microsoft.com>
2021-08-12 15:56:39 -07:00
ytaous
0725f80d2d
Revert "Fix Windows Store build (#8481)" (#8679)
This reverts commit 53e7831b53.
2021-08-11 00:37:36 -07:00
Tiago Koji Castro Shibata
53e7831b53
Fix Windows Store build (#8481)
* Remove APIs unavailable in Store in #8349, #8178, #8065

* Add UWP stubs of C runtime functions

* Remove UWP incompatible tests from UWP build

* Remove incompatible tests from Store

* Use UWP stubs in store only

* Skip partition check outside of Windows

* Remove unused WRL include

* Workaround Windows header not including what it uses

* Fix precompiled header name clash

* Workaround SDK bugs

* DXCore workaround in Win7

* Fix warning

* Fix more warnings

* Bump WinML to target Windows 8

* Fix more warnings

* Remove unnecessary workarounds
2021-08-10 15:19:30 -07:00
Edward Chen
20f006c580
Remove flake8 check from CMake build. (#8662) 2021-08-09 14:10:36 -07:00
Tang, Cheng
6d3c2c85ef
Integrate eager mode source code into onnxruntime repo (#8584)
* integrate eager mode source codde; build with cmake and integrate the python test

* Adding the python path for importing libraries in the Eager mode

* fix clang break;check if training and python enabled

* handling the linking of torch libraries across multiple platforms

* merge and fix the naming

* add build instruction

Co-authored-by: Abhishek Jindal <abjindal@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: ajindal1 <abjindal@microsoft.com>
2021-08-06 08:30:27 -07:00
Ashwini Khade
96eb9810ba
Update onnx (#8458)
* updates for picking pnnx commit

* add tests filter to c# tests

* plus test fixes

* fix versioning for contrib ops

* fix tests

* test filter for optional ops

* more versioning related updates

* fix test

* fix layernorm spec

* more updates

* update docs

* add more test filters

* more filters

* update binary size threshold

* update docs

* plus more fixes

* updates per review

* update to release commit

* add filters for optional type tests

* plus updates
2021-08-05 09:21:44 -07:00
Changming Sun
375e86f0a0
Make DNNL EP not depending on onnx (#8588) 2021-08-03 14:11:36 -07:00
Weixing Zhang
deab284e4c
fix build failure with --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1 (#8587)
* fix build failure with --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1

* another compile error and add onnxruntime_USE_ROCM

* braces alignment

Co-authored-by: suffian khan <sukha@microsoft.com>
2021-08-03 09:02:49 -07:00
stevenlix
d14b08d09c
Update onnx-tensorrt parser and cgmanifest (#8585)
* update onnx-tensorrt parser and cgmanifest.json

* update cgmanifest
2021-08-02 18:55:33 -07:00
stevenlix
ee99fb400c
Upgrade TensorRT to v8.0.1 (#8512)
* update onnx-tensorrt parser to master

* disable unsupported tests

* add cuda sm 75 for T4

* update tensorrt pipeline

* update trt pipelines

* update trt pipelines

* Update linux-gpu-tensorrt-ci-pipeline.yml

* update trt cid pipeline

* Update linux-gpu-tensorrt-ci-pipeline.yml

* Update Tensorrt Windows build pool and TensorRT/CUDA/CuDNN version

* update to cuda11.4 in trt ci pipeline

* update base image to cuda11.4

* update packaging pipeline to cuda11.4

* clean up

* remove cuda11.1 and cuda11.3 docker file

* disable unsupported tensorrt tests at runtime

* Update linux-multi-gpu-tensorrt-ci-pipeline.yml
2021-08-02 11:20:31 -07:00
Changming Sun
0510688411
Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471)
1. Update SDLNativeRules from v2 to v3. The new one allows us setting excluded paths.
2. Update TSAUpload from v1 to v2. And add a config file ".gdn/.gdntsa" for it.
3. Fix some parentheses warnings
4. Update cmake to the latest.
5. Remove "--x86" build option from pipeline yaml files. Now we can auto-detect cpu architecture from python. So we don't need to ask user to specify it.
2021-07-30 17:16:37 -07:00
baijumeswani
816ad86d14
Configuring ORTModule - Internal Options (#8537) 2021-07-30 13:05:32 -07:00
Rachel Guo
0cf2ed029b
Add python binding for CoreML EP (#8472)
* add pybind binding for coreml ep

* update merged files

* address comments

* format

* remove lines for non-macOS platform

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2021-07-29 10:06:47 -07:00
Tang, Cheng
00d8f8ce95
enable shared lib based execution provider test on linux (#8480)
* enable shared lib test on linux

* fix build break

* add onnx dependency

* add rpath

* skip the test for linux training

* set ONNX_ML definition

* install training python dependency

* update

* fix format; add eigen include folder

* fix format

* skip amd build

* enable shared provider on training

* fix comments in pr

Co-authored-by: Ubuntu <chenta@chenta-orttraining-cpu.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
Co-authored-by: Changming Sun <chasun@microsoft.com>
2021-07-28 16:58:13 -07:00
Tracy Sharpe
539d1d44c1
Optimize ARM64EC build (#8515)
Add sgemm and qgemm optimized kernels for ARM64EC configuration.
2021-07-27 23:46:39 -07:00
Oliver Rausch
1685ab8138
Implement Concat with Strided copy (#8336)
Adds a StridedCopy function that implements a copy from strided tensor to another.

This parallelizes the Concat operator, and can also be used in the future to parallelize many other data movement operators (e.g. Transpose, Split, etc.).
This operation is also required for the proposed data layout extensions to ORT.
2021-07-27 18:27:56 +02:00
Edward Chen
b4baac888c
[NNAPI EP] Make partitioning stop ops configurable from Python API. (#8484) 2021-07-27 08:16:47 -07:00
Vincent Wang
c8d210de29
Decouple Forward and Backward of ATenOp (#8301)
* atenop for inference

* assert if dtype mismatch

* atenop config in frontend

* fix orttrainer test

* gradient def not only for ATenOp

* bugfix

* fix gradient input shape and type issue

* fix after merge master
2021-07-23 16:53:26 +08:00
Tracy Sharpe
b2b9de939f
cleanup onnxruntime_mlas.cmake of old gcc workarounds (#8469) 2021-07-22 22:01:05 -07:00
Dmitri Smirnov
950fe5e28b
Implement SparseTensor and infrastructure suppport and advance ONNX commit (#8038)
SparseTensor support
  Implement Builder pattern
  Fix support for 1-D and 2-D COO indices
  Implement and test CSR support.
  Handle shape inference for SparseTensors
  Implement conversion for COO, CSR and tests.
  Address the case where constant sparse initializer is the output.
  Implement test infra for SparseTensors
  Implement SparseDenseMatMul for Csr and COO and tested it.
  Add hash for SparseToDenseMatMul
  Finish shared provider refactor
  Refactor GetOrCreate to Create
  Working on py interface
  Expose OrtDevice and use it in allocate_numpy
	Adjust Sparse interfaces, add support for string SparseTensor. Add tests.
	Add and test to_cuda()
	Add accessors to format specific indices
	Test values and indices views, read-only flag, after GC access
	Add sparse related methods to OrtValue
	Re-work SparseTensor wrapper, add OrtValue methods
	Rework numpy_array_to_cuda/to_cpu
	Add run_with_ort_values
	Add models and test sparse_mat_mul with run_with_ort_values
	Refactor sparse tensor to use a single buffer
        Ifdef x86 Eigen CSR sparse matmul implementation
        Exclude broken test, check for string type when copying cross device
       Split pybind schema, regenerate docs, add exclusion
       Conditionally exclude schema module
       Update docs fix cuda build
       Add test to a filter and renerate JS docs
      Add conversion and test string support for sparse tensors
      Exclude conversion utils from minimal build
      Add CUDA Memcpy and adjust provider interfaces
2021-07-22 15:24:36 -07:00
pengwa
892ac9f55a
code structure update (rename only) (#8410) 2021-07-22 23:50:19 +08:00
Ryan Hill
7e2ecb2eeb
Remove unnecessary line as no headers exist now (#8446) 2021-07-21 01:03:05 -07:00
Adam Pocock
55b26b6951
[Java] Adds support for DNNL, OpenVINO, TensorRT shared providers and refactors the CUDA shared provider loader (#8013) 2021-07-20 22:33:15 -07:00
Rajalakshmi Srinivasaraghavan
894fc82858 POWER10: Additional check in cmake
When compiling with newer gcc and older glibc, there is a chance
for new POWER10 macros to be not available in hwcap.h. This patch
checks whether hwcap macros are available before using that in
platform.cpp.
2021-07-20 13:04:18 -07:00