Commit graph

3753 commits

Author SHA1 Message Date
zhijxu
89e5b3a24f resolve review comments 2020-11-16 11:23:01 +08:00
zhijxu
89902c2519 fix frontend bug.
An old ORT session may already exist when a new ORT session is created, which may cause an OOM error.
2020-11-16 11:23:01 +08:00
Guoyu Wang
c4818d36ed
[NNAPI EP] Make NNAPI EP build on non-Android Platform (#5779)
* Make NNAPI EP build on non-Android Platform

* minor updates

* Address CR comments

* Fix build issue using Windows, address CR comments

* Fix linux build warnings

* Fix for test failure

* Fix for test failure

* Fix model_tests failure
2020-11-15 17:04:45 -08:00
Weixing Zhang
5b7dc5aeee
fix build failure for ROCm EP (#5816)
The kernel declaration of Identity needs to be updated in ROCm EP since
ROCm EP shares the implementation of Identity with CUDA EP in which it
has been changed due to opset 13 support.
2020-11-15 10:36:15 -08:00
Jesse Benson
ced5b66306 Re-enable multi-tensor-apply for LAMB optimizer 2020-11-15 09:35:00 -08:00
Weixing Zhang
fc614ad050 revert the code change which was based on b4869926
The change b4869926 which was to remove per-thread allocator would cause seg fault for
distributed training.

In addition, add dockerfile for ROCm3.9
2020-11-15 00:24:32 -08:00
RandySheriffH
c23fbba463
Fix reduce pipeline by replacing model (#5813)
* update model and better comment

* fix parameter

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2020-11-14 20:17:23 -08:00
Scott McKay
3269e59b2c
Add opset 13 registration for Identity. (#5800)
* Add opset 13 registration for Identity.
2020-11-14 21:40:24 +10:00
Ori Levari
157d1844fb
Named Dimension Override internals test and experimental API (#5805) 2020-11-13 21:21:11 -08:00
Ye Wang
262e9ef21d
Support input dimension swap in Attention op (#5774)
* checkin cpu

* checkin cpu

* add test

* cuda

* update comments

* review comments

* update

* modify var name

* remove unnecessary error msg

* fix comments

Co-authored-by: wangye <wangye@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-11-13 18:29:08 -08:00
sfatimar
dfbf6d78be
OpenVino: fix allocation failure on Windows for RelWithDebInfo build (#5713)
* ng_supported_ops

* Remove ng_supported_ops

* Revert "Remove ng_supported_ops"

This reverts commit 3c27385b2d88c6e8cf7ac4e8c290a367ad5d0bd8.

* Revert "ng_supported_ops"

This reverts commit 650721ae2913b79739521d58838298e031abdac1.

* cmake changes to ensure that the debug build on Windows links to debug builds of OpenVINO
and does not result in a bad allocation error

Co-authored-by: sfatimar <sahar.fatima@intel/com>
2020-11-13 07:59:52 -08:00
Vincent Wang
0c8902cbbe
Update Gradient Builder of Some Ops for OpSet13 (#5748)
* gradient builder for opset13

* code clean.

* resolve comments

* stop grad for axes input

* add split to stop grad list.

Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-11-13 16:20:34 +08:00
Yufeng Li
1f722863b2
Scale Bias post processor for ARM (#5795) 2020-11-12 21:12:23 -08:00
jeyblu
435b904f0e
add dnnl gpu engine (#5788) 2020-11-12 20:17:54 -08:00
Ryan Lai
0ea998134a
Skip new x86 tests in ort model tests (#5789) 2020-11-12 18:08:11 -08:00
Dmitri Smirnov
2f35e65135
Add Float16 and BFloat16 support to C# API (#5775)
Add Float16 and BFloat16 support.
2020-11-12 17:57:08 -08:00
edgchen1
4d517c68a3
Fix reference to old download_e2e_test_data.py script. It was renamed to download_azure_blob.py. (#5790) 2020-11-12 15:48:06 -08:00
Alberto Magni
88c3704257
Add shape inference for additional ops
This commit adds shape inference support for the following ops:

SoftmaxCrossEntropy
SoftmaxCrossEntropyLossGrad
SoftmaxCrossEntropyGrad
LayerNormalizationGrad
Motivation and Context
2020-11-12 20:18:54 +00:00
Ryan Lai
4e29f48010
skip gpt2 test on x86 (#5787) 2020-11-12 11:49:47 -08:00
pengwa
49288de17c
Fix memory planning issues (#5752)
* Fix memory planning issues

* fix build

* fix the wrong line...
2020-11-13 03:07:59 +08:00
alexzakv
44d3c31200
Winml_principles_change (#5727)
* Contributing page change

* Update WinML_principles.md

* Update WinML_principles.md

* Update WinML_principles.md

* Updated

* Update WinML_principles.md

* Update WinML_principles.md

* Update WinML_principles.md

* Update WinML_principles.md
2020-11-12 10:39:24 -08:00
Guoyu Wang
dc0f7b8f82
Remove onnxruntime_session_options_config_keys.h from c_api (#5772)
* Remove session config keys header from c_api

* remove copy session config header in release package

* Keep the session option config header in the package
2020-11-12 09:12:13 -08:00
stevenlix
54de618c2e
Improve TensorRT engine caching (#5737)
* add profile caching to improve engine caching feature

* Add comments

* fix typo

* add decryption for engine caching

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* update onnx-tensorrt submodule

* set opt profile to max value of the range

* add hash to engine/profile name

* Add calibration based INT8 quantization

* add an option to enable both FP16 and INT8

* Update tensorrt_execution_provider.cc

* add env variable to specify calibration file name

* clean up code

* Add comments and update TRT document

* enable tensorrt basic test and add EngineCachingTest

* clean up

* update environment variable in the test

* clean up
2020-11-12 08:56:45 -08:00
Vincent Wang
2a87108431
SoftmaxCrossEntropyLoss OpSet13. (#5777)
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2020-11-12 15:50:34 +08:00
Hariharan Seshadri
b92fc66ea1
Support opset-13 specs of controlflow ops (Loop, If) (#5665) 2020-11-11 23:44:14 -08:00
Sherlock
07dc25e939
Compute global gradient norm according to 'enable_grad_norm_clip' (#5728)
* Introduce PassThrough op to wait for all gradient ready before weight update

* Compute gradient norm for fp32 runs

* Update FE UT expected value

* Respect enable_grad_norm_clip
2020-11-11 21:10:34 -08:00
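As a rough illustration of the idea behind #5728 (one global norm over all gradients, applied only when `enable_grad_norm_clip` is set), here is a minimal pure-Python sketch. It is not the ORT implementation: the function names, the `max_norm` threshold, and the list-of-lists gradient layout are assumptions made for this example only.

```python
import math


def global_grad_norm(grads):
    """L2 norm taken over every element of every gradient tensor.

    `grads` is a list of flat lists of floats (hypothetical layout).
    """
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def clip_by_global_norm(grads, max_norm, enable_grad_norm_clip=True):
    """Scale all gradients by max_norm / norm when the global norm is too large.

    `max_norm` is an assumed parameter; the commit message does not name one.
    Returns (possibly scaled gradients, global norm before clipping).
    """
    norm = global_grad_norm(grads)
    # Leave gradients untouched when clipping is disabled or not needed.
    if not enable_grad_norm_clip or norm == 0.0 or norm <= max_norm:
        return grads, norm
    scale = max_norm / norm
    return [[g * scale for g in tensor] for tensor in grads], norm
```

Because the norm is global, every gradient has to be available before any weight update, which is consistent with the PassThrough op mentioned in the first bullet of that commit.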
Pranav Sharma
1ae58c960c
Allow turning off printing of shape when compiled with onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS. (#5768)
* Allow turning off printing of shape when compiled with onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS.
2020-11-11 18:59:04 -08:00
ashbhandare
5aec34500d
Add megatron transforms for BART (#5521)
* Large model export and run ORT Python support

* Megatron change

refine a bit

workaround self attention issue

use partitioned name for weights when megatron model parallel is enabled

Fix Megatron Transformer Issue (caused by the renaming)

Add UTs for T5 model parallel

Fix megatron seed issue

fix log a bit

checkpointing changes + rebase

Unintended reshape transform change

t5 layer norm changes

add t5 layer norm kernel

use template for t5 layer norm

template definition changes

no build error

add CPU cuda kernel

first unit test

other forward unit tests

add T5LayerNormGrad

Add c++ transform and test for T5 LN

minor fix

BART MLP Megatron transform

Add concat slice transform + test

Cosmetic improvements in concat slice transform

Constant folding bug fix + megatron attention transform for BART

Undo unnecessary changes

* Cleanup

* Remove unnecessary changes

* Cleanup megatron

* Windows build

* Add self attention test graph

* Correcting transforms + cleanup

* review comments

* review comments

* fix build and test failures

* Fix CI

* fix windows CI

Co-authored-by: Peng Wang <pengwa@microsoft.com>
Co-authored-by: Aishwarya <aibhanda@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-11-11 16:21:36 -08:00
Hariharan Seshadri
a14cd6267b
Support opset-13 specs of softmax family ops (Softmax, LogSoftmax, Hardmax) (#5707) 2020-11-11 15:45:03 -08:00
Xavier Dupré
e5c8040c52
Improves performance of operator Transpose (#5550)
* Improves implementation of transpose operator
* Simplifies transposition when it is not really needed.
2020-11-12 00:25:25 +01:00
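The message for #5550 does not say which transpositions are "not really needed". One well-known case, shown here purely as an assumption, is a permutation that only moves axes of size 1: the element order in memory is unchanged, so the transpose reduces to a reshape. The helper name below is hypothetical.

```python
def transpose_is_reshape(shape, perm):
    """Return True if permuting `shape` by `perm` leaves the memory order intact.

    That holds when the axes whose size is not 1 keep their relative order,
    because moving a size-1 axis never reorders any element.
    """
    kept = [axis for axis in perm if shape[axis] != 1]
    return kept == sorted(kept)
```

For example, `transpose_is_reshape((1, 3, 4), (1, 0, 2))` is true because only the size-1 axis moves, while `transpose_is_reshape((2, 3), (1, 0))` is false and needs a real transpose.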
Maajid khan
a84a058f9e
[OpenVINO-EP] Enabling Multi Device support (#5740)
* Enabling Multi Device support for UEP

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor fix added
* Added a simple fix to determine the OpenVINO version for Arm builds as well

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
2020-11-11 15:16:30 -08:00
Guoyu Wang
4207e99be3
[NNAPI EP] Move GetCapability independent of ModelBuilder (#5767)
* Move GetCapability independent of ModelBuilder

* minor code style fix

* Move ort_enforce for same number of op_builders and op_support_checkers

* minor code fix
2020-11-11 13:33:38 -08:00
Xueyun Zhu
d8ace07ad7
Add CPU send/recv for pipeline (#5315)
* cpu send/recv

* clean up send/recv

* remove unused code

* assert and nccl option for mnist

* add a build option to enable CPU-only builds. Without this, NCCL is always enabled, which breaks the build on machines that only have a CPU

* Add USE_MPI distinct from USE_NCCL/USE_HOROVOD

* fix

* fix

* exclude cpu send/recv for machines without mpi

Co-authored-by: Tim Harris <tiharr@microsoft.com>
2020-11-11 12:41:39 -08:00
Ashwini Khade
496fa18c96
fix graph partitioning for nested functions (#5755)
* fix graph partitioning for nested functions

* enable broken test for SCE
2020-11-11 11:38:27 -08:00
Derek Murray
bc1768c7f1
Stop gradient flowing to the k input of TopK (#5762) 2020-11-11 10:24:44 -08:00
Dmitri Smirnov
871af477d7
Fix outputs of Sequences and Maps exposure. (#5743)
Fix outputs of Sequences and Maps exposure.
  Add more test conditions.
  Make sure RunWithBinding calls the right function.
2020-11-11 10:21:22 -08:00
liqunfu
1416d12f0b
Liqun/merge e2e pipelines (#5702)
* Create an Azure Pipeline to merge the cpp and python e2e pipelines into one. Keep the cpp e2e pipeline until this new pipeline is stable.

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-11-11 09:42:08 -08:00
Yufeng Li
2ba637c558
Implement Scale function for quant gemm (#5632)
* Implement a Scale function for quantization

Quantized GEMM is always followed by scaling (PerTensor or PerColumn), and the result often needs to be accumulated into an existing matrix. This PR implements a post-processor that scales the quantized GEMM result and accumulates it into another matrix.
2020-11-10 23:34:38 -08:00
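The post-processing step described in #5632 can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the kernel added by the PR: the function name, the nested-list matrices, and the `accumulate` flag are assumptions made for this example.

```python
def scale_and_accumulate(acc, scales, out, accumulate=True):
    """Scale an integer GEMM result and write or add it into `out`.

    acc:    M x N matrix of integer accumulators (list of rows).
    scales: one scale (per-tensor) or N scales (per-column).
    out:    M x N float matrix, updated in place and returned.
    """
    n_cols = len(acc[0])
    if len(scales) not in (1, n_cols):
        raise ValueError("scales must have length 1 or one entry per column")
    per_tensor = len(scales) == 1
    for i, row in enumerate(acc):
        for j, value in enumerate(row):
            scale = scales[0] if per_tensor else scales[j]
            scaled = value * scale
            # Either add to the existing output or overwrite it.
            out[i][j] = out[i][j] + scaled if accumulate else scaled
    return out
```

Doing the scaling and the accumulation in one pass over the GEMM output avoids a separate pass over an intermediate matrix, which is presumably the motivation for folding both into a single post-processor.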
George Wu
cca8cd849a
update native build instructions for ACL on Jetson. (#5764) 2020-11-10 23:10:59 -08:00
Changming Sun
79350a642a
Update install_deps.sh: remove the unnecessary data generating step (#5758)
We install onnx python package from this script, so python tests can run the tests for the latest commit which we are importing.
2020-11-10 22:19:03 -08:00
Guoyu Wang
0767c4fdfb
Fix x86 build break (#5759) 2020-11-10 20:33:27 -08:00
Guoyu Wang
042365029f
[NNAPI] Split OPBuilder IsOpSupported into a separate class (#5746)
* init change

* Split opbuilder into opbuilder and opsupportchecker

* Update code comments

* Address CR comments, some minor code updates
2020-11-10 15:00:38 -08:00
Scott McKay
6803e4ab44
Fix BatchNormalization registrations. (#5750)
Add diatribe on how to correctly update registrations.
2020-11-11 07:32:26 +10:00
Alberto Magni
c75b7c5c47
[CMake] Enable NCCL only when enabling CUDA or ROCm support (#5516)
Conditionally enable NCCL depending on CUDA and ROCM

Before this change NCCL support was enabled unconditionally, even
when building without CUDA or ROCM support.
This caused the command:
$ ./build.sh --enable_training

To trigger the following cmake warning
-- Could NOT find NCCL (missing: NCCL_INCLUDE_DIR NCCL_LIBRARY)
CMake Warning at CMakeLists.txt:1282 (message):
NCCL is not found. Please use --nccl_home to specify the path of NCCL.
Otherwise, NCCL is disabled.

This is a spurious warning because the user did not ask to search for NCCL.
2020-11-10 12:39:23 -08:00
Tim Harris
48b14b52b8
Remove Env::Task wrapper around std::function (#5753)
This is a small perf / clean-up change. It removes the Env::Task abstraction which wraps a single std::function field, and adds at least one virtual method call overhead when creating a Task and when executing it. The POSIX and Windows implementations are now identical.
2020-11-10 20:22:07 +00:00
leqiao-1
2b1ebbc286
update MCR images table (#5509)
Add tag 1.5.2 for images. 
Remove tensorRT image from table.
2020-11-10 11:47:59 -08:00
edgchen1
4c6118eb49
Update get_applicable_matrix_reduction() to combine dimensions of 1 with the given reduction axes. (#5734) 2020-11-10 10:32:50 -08:00
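To show why folding dimensions of size 1 into the reduction axes helps (#5734), here is a small Python sketch. It is not the ORT function: the name `classify_reduction`, the return labels, and the tuple layout are assumptions. The idea is that a size-1 dimension gives the same result whether it is reduced or not, so it can be ignored when deciding whether a reduction collapses to a 2-D matrix reduction.

```python
from math import prod


def classify_reduction(shape, axes):
    """Classify a reduction as a 2-D matrix reduction, if possible.

    Returns (kind, rows, cols) where kind is:
      "leading"  - the reduced axes form a prefix; reduce over `rows`
      "trailing" - the reduced axes form a suffix; reduce over `cols`
      "none"     - the reduction does not map onto a 2-D matrix
    An empty `axes` is treated as "nothing reduced" in this sketch.
    """
    rank = len(shape)
    reduced = {a % rank for a in axes}  # normalize negative axes
    # Drop size-1 dimensions: reducing them or not gives the same values.
    dims = [d for d in shape if d != 1]
    flags = [i in reduced for i, d in enumerate(shape) if d != 1]
    k, n = sum(flags), len(flags)
    if k > 0 and flags == [True] * k + [False] * (n - k):
        return ("leading", prod(dims[:k]), prod(dims[k:]))
    if k > 0 and flags == [False] * (n - k) + [True] * k:
        return ("trailing", prod(dims[: n - k]), prod(dims[n - k:]))
    return ("none", 0, 0)
```

With shape `(1, 4, 5)` and axes `[1]`, a check that does not ignore the leading size-1 dimension would see a reduced axis in the middle and give up, whereas this sketch treats it as a leading reduction of a 4 x 5 matrix.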
Hariharan Seshadri
63b85fc696
Fix VS 2017 build break (#5745) 2020-11-10 10:25:43 -08:00
Xavier Dupré
d59f057db3
enable string for operator Shape (#5742) 2020-11-10 18:38:36 +01:00
Xavier Dupré
8c74df2068
Add support for string with operator Expand (#5751) 2020-11-10 18:38:20 +01:00