Commit graph

4211 commits

Author SHA1 Message Date
Faith Xu
fba46a76bc
Update readme to reference docs webpage content (#6621)
* Fix broken links to EP docs

* Fix another link

* Simplify content to link to docs site

* Update README.md

* Add build pipeline status

* Fix openvino pipeline widget
2021-02-11 16:50:32 -08:00
Changming Sun
8378a45ae7
Add python 3.8/3.9 support for Windows GPU and Linux ARM64 (#6615)
Add python 3.8/3.9 support for Windows GPU and Linux ARM64

Delete jemalloc from cgmanifest.json.

Add onnx node test to Nuphar pipeline.

Change $ANDROID_HOME/ndk-bundle to $ANDROID_NDK_HOME. The latter is more accurate.

Delete Java GPU packaging pipeline

Remove the test data download step in the NuGet macOS pipeline. Because these machines are outside our control and outside our network, it's hard to make the download reliable and keep the data secure.

Fix a doc problem in c-api-artifacts-package-and-publish-steps-windows.yml. It shouldn't copy C_API.md, because the file has been moved into a different branch.

Delete the CI build docker file for Ubuntu cuda 9.x and Ubuntu x86 32 bits

And, due to some internal restrictions, I need to rename some of the agent pools
2021-02-11 16:43:35 -08:00
Yufeng Li
1c3168c0f6
Skip constant folding dequantizelinear for quant qdq format (#6643)
* skip constant folding dequantizelinear for quant qdq format
2021-02-11 14:06:13 -08:00
Ye Wang
b4b829dfcf
Update transformers tool based on latest transformers (#6641)
* bert_base_cased: embedlayer fusion

* xlm_mlm_en_2048: attention fusion
2021-02-11 10:11:47 -08:00
Ye Wang
a7b6fc08f2
Support skiplayernorm fusion without beta in layernorm (#6617)
* support skiplayernorm fusion without beta in layernorm

* use place holder

* review comments
2021-02-10 17:50:10 -08:00
Guoyu Wang
fd83e38dcf
[CoreML EP] Add support of BatchNormalization/Reshape/Global[Average/Max]Pool (#6625)
* [CoreML EP] Add batch norm support

* Add reshape support

* Add global pooling support

* Addressed CR comments
2021-02-10 17:16:35 -08:00
Guoyu Wang
64edcad2d8
[NNAPI EP] Add EP option to disable CPU (#6593)
* Add NNAPI EP option to disable CPU

* update comments

* Address CR comment

* Address CR comments, update code comments

* Address CR comments
2021-02-10 17:16:07 -08:00
Matthew Emmett
d2ce8a2c80
Add hipFFT include directory (transitional step) before ROCm. (#5992)
hipFFT is transitioning to a separate repository (away from being
included in rocFFT).  During this transition, using the hipFFT version
of hipfft.h won't produce a deprecation warning.
2021-02-10 16:46:03 -08:00
Changming Sun
042964f633 Change how ONNX gets installed 2021-02-10 14:41:26 -08:00
Edward Chen
e59cb9455e
Add CI build with type reduction enabled (#6622) 2021-02-10 13:31:51 -08:00
Edward Chen
352e8cb8a8
Move ORT_ENFORCE()'s within MLTypeCallDispatcher to helper class functions to reduce the size used by function names in ORT_ENFORCE(). (#6624)
Move ORT_ENFORCE()'s within MLTypeCallDispatcher to helper class functions to reduce the size of function names in ORT_ENFORCE().
ORT_ENFORCE() captures the containing function's name in the error message. For some usages of MLTypeCallDispatcher (i.e., with numerous types or long type names), the function name is quite long and can contribute significantly to the binary size. Usage in the Cast CPU kernel is a notable example.
This change moves the ORT_ENFORCE() checks from a class template member function template with variable length name to a helper function with a fixed length name.
2021-02-10 11:34:38 -08:00
Dwayne Robinson
eef9a7a8a9
Update DirectML 1.4.0 to 1.4.1 for ORT 1.7 (#6636) 2021-02-10 10:34:40 -08:00
Xiang Zhang
8502573125
fix CheckLearningModelPixelRange (#6632) 2021-02-10 10:23:54 -08:00
Derek Murray
88d48063fa
Log warning when GetGradientForOp() silently fails. (#6586)
* Add warning when GetGradientForOp() silently fails.

In some cases, `GetGradientForOp()` can return without creating any nodes, which may lead to an invalid graph being created.
2021-02-10 10:01:16 -08:00
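The failure mode described in this commit can be sketched in Python as follows. This is a minimal illustration and not ONNX Runtime code: `build_gradient_nodes`, the `builders` registry, and the logger name are all hypothetical.

```python
import logging

logger = logging.getLogger("gradient_builder_sketch")  # hypothetical logger name


def build_gradient_nodes(op_type, builders):
    """Return gradient nodes for op_type, warning if none were created.

    `builders` maps an op type to a callable that returns a list of nodes.
    Both names are assumptions made for this sketch.
    """
    builder = builders.get(op_type)
    nodes = builder() if builder is not None else []
    if not nodes:
        # Without this warning the empty result would pass silently and
        # could leave the caller with an invalid graph.
        logger.warning("No gradient nodes were created for op '%s'", op_type)
    return nodes
```

The caller still receives the empty list; the sketch only makes the condition visible in the log.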
Hariharan Seshadri
b09bfc8611 Revert "Remove abs in LpPool (#6303)"
This reverts commit 3b3e698674.
2021-02-10 00:48:14 -08:00
Wei-Sheng Chin
8972621138
Generate shape-independent graph if any input dimension < 2 (#6581)
* Throw for non-supported case

* Not to go to shape-dependent branch when seeing unsupported shapes
2021-02-10 15:44:25 +08:00
Hariharan Seshadri
8f0b877a1d
Enable running some ops on CUDA (#6572) 2021-02-09 22:10:43 -08:00
Yufeng Li
505c1f30b5 use == instead of is for python 3.8 2021-02-09 19:59:28 -08:00
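For context, Python 3.8 began emitting a SyntaxWarning when `is` is used to compare against certain literals, because `is` tests object identity rather than value. A minimal sketch of the difference, using made-up values and helper names rather than the code touched by this commit:

```python
def same_value(a, b):
    """Compare by value, which is what '==' does."""
    return a == b


def same_object(a, b):
    """Compare by identity, which is what 'is' does."""
    return a is b


first = [1, 2, 3]
second = [1, 2, 3]  # equal contents, but a separate list object

print(same_value(first, second))   # True: the contents are equal
print(same_object(first, second))  # False: two distinct objects
print(same_object(first, first))   # True: the very same object
```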
Changming Sun
e70344e648
Fix training python packaging pipeline (#6613)
In a previous PR, I set the docker file name to a wrong value.
2021-02-09 11:04:39 -08:00
Justin Stoecker
1c72774232
Update a few WinML model test filters for DML 2021-02-09 10:23:57 -08:00
Cian Hayes
8f14b8bd9d
Support disabling training kernels as part of a reduced build (#6557) 2021-02-09 09:51:31 -08:00
stevenlix
e9d03983fc
Add engine decryption in TensorRT EP (#6612)
* add trt engine decryption

* update document

* add windows support to decryption

* fix issues

* remove redundant get() from engine/context check

* fix issue
2021-02-09 00:46:14 -08:00
Changming Sun
0b89f931d0
Update CUDA build configs (#6598)
1. Fix Nuget package build break caused by #6225
2. Delete Dockerfile.centos_gpu. It is not used anywhere.
3. Fix Linux CUDA 10.2 build error caused by glibc upgrade
2021-02-08 22:55:42 -08:00
Xavier Dupré
d3a2c8c1c7
Support double for operators ReduceMax, ReduceMin (#6265)
* Support double for operators ReduceMax, ReduceMin

* add unit test to pai-excluded-tests.txt

Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
2021-02-08 19:14:26 -08:00
Randy Shuai
ff063309b0 enable omp for debug build 2021-02-08 19:10:13 -08:00
Randy Shuai
6c5f50d00e deprecate omp in ci 2021-02-08 19:10:13 -08:00
Yufeng Li
56e4e47f66
Quantize model with QDQ format (#6541)
* implement qdq format in quant tool
* refactor code
2021-02-08 18:46:07 -08:00
Scott McKay
c02ae61cab
Make kernel hash stable in type reduced build (#6603)
* Add infrastructure so that a kernel definition has the full list of supported types and a list of types enabled in this build. We need to use the full list when calculating the kernel hash so that the hash value in an ORT format model is stable across builds with and without type reduction enabled.
2021-02-09 12:00:08 +10:00
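The idea in this commit can be sketched in Python as below. This is an illustration under stated assumptions, not the ONNX Runtime implementation: the `KernelDef` class, the `kernel_hash` function, and the use of SHA-256 are all hypothetical. The only point carried over from the commit message is that the hash is computed from the full list of supported types, never from the types enabled in a given build.

```python
import hashlib


def kernel_hash(op_name, supported_types):
    """Hash a kernel definition from its full list of supported types.

    The types enabled in a particular build are deliberately not an input,
    so a type-reduced build yields the same value as a full build.
    """
    payload = op_name + ":" + ",".join(sorted(supported_types))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class KernelDef:
    """Hypothetical kernel definition holding both type lists."""

    def __init__(self, op_name, supported_types, enabled_types):
        self.op_name = op_name
        self.supported_types = list(supported_types)
        self.enabled_types = list(enabled_types)  # used for dispatch only

    def hash(self):
        return kernel_hash(self.op_name, self.supported_types)


all_types = ["float", "double", "int64"]
full_build = KernelDef("Cast", all_types, all_types)
reduced_build = KernelDef("Cast", all_types, ["float"])
```

Here `full_build.hash()` and `reduced_build.hash()` are equal, so a model serialized against one build would still resolve its kernels in the other.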
Cian Hayes
16eed68a1e
Fix layer_norm.cc on x86 (#6556)
* Fix LayerNormGrad on x86

* PR feedback
2021-02-08 17:36:14 -08:00
Scott McKay
13d7db9a98
Don't update the excluded ops/types unless args.update is true. Updating the exclusion info triggers rebuilding of all kernels using type reduction. (#6604) 2021-02-09 07:15:31 +10:00
Scott McKay
0b1e21c638
Fix bug with ORT format serialization of tensor attributes. (#6602)
The model path needs to come from the Node not the potentially nullptr subgraph.
2021-02-09 07:15:21 +10:00
Pranav Sharma
67ef6b1aa6
[Multi-GPU inferencing] Add new API to get/set device id. Set correct device id in cuda allocator. (#6592) 2021-02-08 08:59:18 -08:00
nietras
1dd920fa7c
Fix TensorRT unnecessary file cache operations (#6601)
* Fix TensorRT unnecessary file cache operations

* fix compile
2021-02-07 20:09:30 -08:00
Edward Chen
19c130f561
Reduce CastMLFloat16ThroughFloat size (Scott's suggested changes), fix unused function warning. (#6597) 2021-02-08 07:20:53 +10:00
Scott McKay
190b90a682
Fix some coding conventions issues (#6583)
Fix some coding conventions issues
Use #define for types that Cast supports
2021-02-08 07:11:26 +10:00
Weixing Zhang
c86c21e002
Generate error when an explicit stream argument is not provided in the <<<...>>> kernel launch syntax (#6599)
* Generate error when an explicit stream argument is not provided in the <<<...>>> kernel launch syntax

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2021-02-06 15:54:29 -08:00
Jesse Benson
d18aa45b46 Enable more ROCM ops that are sharing CUDA code. Some are needed for Turing NLG models. 2021-02-06 14:40:34 -08:00
Adam Pocock
dbe31361bc Fix build.gradle so it always targets Java 8 class files. 2021-02-05 22:26:17 -08:00
Ryan Lai
b57a7f4de3 Delay load dxcore in winml model tests 2021-02-05 21:08:11 -08:00
George Nash
b50b0a89aa Fix build failure when building with --build_wheel on Windows
This resolves issue #6536

Signed-off-by: George Nash <george.nash@intel.com>
2021-02-05 18:59:01 -08:00
Nat Kershaw (MSFT)
af9dfa7a4d
Remove docs that have been migrated to https://onnxruntime.ai/docs (#6225) 2021-02-05 18:09:27 -08:00
Dmitri Smirnov
dda5a62072
Fix updated Doxygen errors. (#6588) 2021-02-05 18:07:03 -08:00
Chun-Wei Chen
115e16b37b
ort_test_utils: skip creating input if it is an initializer (#6544) 2021-02-05 17:34:08 -08:00
Scott McKay
ccfd90291b
Remove condition from ORT_RETURN_IF[_NOT] macro output. (#6563)
Remove condition from ORT_RETURN_IF[_NOT] macro output as repeating the condition doesn't add much value compared to the explicit error message, and the error message includes the file and line anyway so it's easy enough to find the condition if needed.
Update the few places where the macros were used without an explicit error message to provide an explicit error message.

Saves 12.5KB in a minimal MinSizeRel build with all DNN ops, 16KB in full release build.
2021-02-05 17:33:29 -08:00
Changming Sun
b5bd14fc9f
Update GPU packaging pipelines to cuda11 and fix the other build break issues (#6585)
Update gpu packaging pipelines to CUDA11

In the next release we will use CUDA 11. Our CUDA 11 build recently broke because CentOS 7 published a glibc update that changed the version from 2.17-317.el7 to 2.17-322.el7_9. The newer version isn't compatible with CUDA 11, so we have to downgrade it.
2021-02-05 16:58:37 -08:00
Ye Wang
82229c8e61
Support no bias in layernorm and skiplayernorm op (#6554)
* add noBias attribute in layernorm

* skip bias in skiplayernorm

* fix

* fix cuda tests

* add tests

* fix windows build

* fix win build issue

* review comments
2021-02-05 16:48:22 -08:00
Weixing Zhang
299ace0759
Support to allow user to specify compute stream per session (#3723)
* Support to allow user to specify compute stream per session

Create computation cuda stream explicitly rather than use default legacy stream or per-thread default stream.

remove some redundant cudaStreamSynchronize

fix gpt2 model test failures

don't use default stream in nccl either.

add stream synchronization in OnRunEnd()

using cub::DeviceScan::InclusiveSum which can be called with stream specified.

fix topK failure due to latest rebase

fix tensorrt

support user specified stream

add user_stream support in tensorrt EP

use same stream for both TensorRT and CUDA EP.

fix ScatterND

specify stream for adasum and p2p kernels.

fix loop

fix CApiTest.custom_op_handler

fix CApiTest.varied_input_custom_op_handler

change for cudaMemcpyFromSymbol

improve provider options for user specified compute stream

* add changes for ROCM EP

* fix GatherGrad UT for ROCM EP

* clean code and fix NonMaxSuppression

* use default stream for ROCM now

* fix CApiTest.custom_op_handler:OrtFormatCustomOpTests.ConvertOnnxModelToOrt

* fix tensorrt ut: CApiTest.io_binding_cuda

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2021-02-05 15:48:18 -08:00
sfatimar
973c3917a6
OpenVino add build_shared_lib flag in the build command (#6560)
* Dockerfile changes to add build_shared_lib
2021_1 indentation changes

* csharp shared library

Co-authored-by: sfatimar <sahar.fatima@intel/com>
2021-02-05 12:18:02 -08:00
Guoyu Wang
68193e28de
Let execution fall back to CPU EP if Compile of a partition on current EP fails (#6580)
* Let execution fall back to CPU EP if compile of a partition fails

* Removed debugging logs

* Addressed CR comments
2021-02-05 12:14:55 -08:00
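The fallback behaviour named in this commit's title can be sketched in Python as follows. This is a rough illustration only: `CompileError`, `assign_partition`, and the `compile_fn` callable are hypothetical and do not correspond to ONNX Runtime APIs.

```python
class CompileError(Exception):
    """Raised by a hypothetical EP when it cannot compile a partition."""


def assign_partition(partition, preferred_ep, compile_fn):
    """Return the name of the EP that will run `partition`.

    `compile_fn(ep_name, partition)` is an assumed callable that raises
    CompileError when the EP cannot compile the partition.
    """
    try:
        compile_fn(preferred_ep, partition)
        return preferred_ep
    except CompileError:
        # Compilation failed on the preferred EP, so leave the partition
        # to the CPU EP instead of failing the whole session.
        return "CPU"
```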
Chun-Wei Chen
f2ce3aae13
add set_model_dir and update ONNX (#6119) 2021-02-05 09:30:49 -08:00