Commit graph

4553 commits

Author SHA1 Message Date
Weixing Zhang
17f91ff410
remove un-needed header file. (#7193)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2021-04-01 21:05:58 -07:00
Ryan Hill
5a6d477625
Make IDataTransfer be directly shared with shared providers (#7215)
2021-04-01 20:39:16 -07:00
Edward Chen
0ebeaf529d
Check kernel def hashes (#7120)
Add unit test for verifying kernel def hashes.
Add way to add new types to kernel definition without changing hash.
2021-04-01 17:42:58 -07:00
ashbhandare
15c67ddbf0
Make output 1 of ConcatTraining Optional and place on CPU (#7199)
* Optional input 1 on CPU ConcatTraining

* Rename output_1
2021-04-01 16:05:17 -07:00
Jesse Benson
4543459984
MIOpen supports MIOPEN_REDUCE_TENSOR_AVG now.
2021-04-01 16:00:34 -07:00
Yufeng Li
34a8b22186
disable prepacking in training (#7201)
* disable prepacking in training
2021-04-01 14:03:47 -07:00
sfatimar
52bcef4d4f
Openvino ep 2021.3 (#7180)
* Integrate openvino-ep-2021.3

* operators type

* changed the myriad as it is case sensitive

* logging information for openvino-ep-2021.3

* Unit test fix

* Resize operator added for myriad

* Fixed python tests for CPU and GPU

* data commit for loop tile and gatherelements failure

* adding checks for Where

* fixing gatherelements and loop tests

* disabling instance normalization test for now as there seems to be a
myriad bug, putting loop in ops supported only because all the tests
fail

* gather elements op test taking care of warning message

* condition needs to be an initializer

* Disabled python test for Myriad

* Disable compilation warning for MSVC windows compiler

* softmax_test, threedimaxis0 and 1 test give accuracy mismatch
tensoroptest disables test gives accuracy mismatch
gather test gives accuracy mismatch

* Updated with ov version 2021.3

* Updated with ov version 2021.3

* Updated README

* Disabling python tests for cpu

* Disabling python tests with accuracy mismatch on cpu

* Added fix for Linux CI Pipeline failure

-> Disabled tests that were throwing segfault

Co-authored-by: sfatimar <sahar.fatima@intel.com>
Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: Aravind <aravindx.gunda@intel.com>
2021-04-01 11:28:54 -07:00
baijumeswani
249a2c14ef
Pin version of pytorch to 1.8.1 for ORTModule CI pipeline (#7167)
* Pin version of pytorch to 1.8.1 for ORTModule CI pipeline
* Use pytorch-lightning stable version 1.2.5
* Revert to cuda 10.1
2021-04-01 09:37:47 -07:00
George Wu
fc6ac5bfac
dnnl fixes (#7202)
2021-04-01 07:34:18 -07:00
Scott McKay
329fd03bb4
Add int32_t as required type to some operators (#7192)
* Updates to some operators to always support int32 and int64 based on testing of Android package build config with a minimal build.

If an operator can be used for shape manipulation (int64) it is frequently used for indices manipulation (int32), so we enable both types for that set of ops.
  - e.g. BERT models take indices as input
  - Scatter/Gather ops utilize indices

Misc. fix to python bindings to exclude call that fails in a minimal build.
2021-04-01 19:32:34 +10:00
Edward Chen
04679e31ab
Specify CUDA compute capability 7.5 in Linux GPU build (#7203)
Recently a build agent pool was changed to use T4 GPUs (CUDA compute capability 7.5). Updating some CUDA build options to accommodate that.
2021-03-31 18:51:44 -07:00
Hariharan Seshadri
0e0dd50e39
Support int32 type for TopK CPU op (#7089)
2021-03-31 18:08:21 -07:00
Xavier Dupré
b370ddbf5e
Removes unnecessary transpose in operator Einsum (#7141)
* remove one unnecessary transpose
* add more unit test
2021-03-31 09:59:08 +02:00
Guoyu Wang
d500c5952b
Add Android AAR packaging script for ORT-Mobile (#7138)
* Add Android aar packaging script for ORT-Mobile

* Address CR comments
2021-03-30 18:42:18 -07:00
Yulong Wang
0fdef1bf47
[Node.js binding] upgrade y18n to v4.0.1 (#7185)
2021-03-30 16:09:04 -07:00
Negin Raoof
45cb0cae8c
Adding TorchEmbedding contrib op (#7136)
* Adding TorchEmbedding contrib op

* Update contrib_defs.cc

* Shape fix

* Update shape_inference_test_helper.h

* Fix typo

* Fix test

* Fix for test code

* Merge

* Fix CI

* Fix for CI

* Fix CI no-contrib
2021-03-30 14:33:25 -07:00
liqunfu
e545604499
. (#7165)
2021-03-30 13:58:30 -07:00
RandySheriffH
d880578537
Exclude cpuid.h from Mac non x86 arch (#7166)
* add ifdef to exclude inclusion from non x86 arch

* exclude calling of __cpuid_count

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2021-03-30 11:50:42 -07:00
Edward Chen
0ccfe6c86a
Enable type reduction for Scatter/ScatterElements CPU kernels (#7171)
Enable type reduction for Scatter/ScatterElements CPU kernels. Some refactoring to reduce binary size.
Add MLTypeCallDispatcher methods.
Minor cleanup for Pad CPU kernel.
2021-03-30 11:02:24 -07:00
Tang, Cheng
07201bac7a
expose session option and provider options (#7112)
* expose session option and provider options

* merge provider_names and provider_options

* integrate into orttrainer options

* fix doc string

* fix a typo

* Update orttraining/orttraining/python/training/orttrainer.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Update orttraining/orttraining/python/training/orttrainer.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Update orttraining/orttraining/python/training/orttrainer_options.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* fix the usage of provider_options

* Update orttraining/orttraining/python/training/orttrainer.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Update orttraining/orttraining/python/training/orttrainer.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* update expected result in tests

* fix default provider options

* minor update to trigger rebuild

* minor update to trigger rebuild

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2021-03-30 09:49:45 -07:00
Yufeng Li
c4ebc60870
sort quantized nodes in topological order (#7172)
2021-03-30 09:01:15 -07:00
Yufeng Li
4f30341253
Check the count of DequantizeLinear for matmul (#7174)
2021-03-30 09:00:48 -07:00
Tracy Sharpe
a01334ba56
MLAS: activate udot kernel on Windows ARM64 (#7169)
2021-03-29 17:56:48 -07:00
Changming Sun
bbcf419ac6
Move the Windows GPU machine pool of Onnxruntime packaging pipelines to a new one (#7161)
2021-03-29 17:32:03 -07:00
Ben Niu
d1acdd4f4b
Support building ARM64EC onnxruntime.dll (#6999)
2021-03-29 15:35:30 -07:00
Yufeng Li
77c19436c0
add a notebook for mobilenetv2 quantization (#7164)
* add a notebook for quant mobilenetv2
2021-03-29 13:24:14 -07:00
RandySheriffH
aeca7c2940
Cuda Profiler (#7110)
* implement cuda profiler

* add counters

* downgrade cupti kernel version

* move mutex

* add cupti to path

* fix win gpu build err

* add path for cuda10

* fix linux com err

* extend include path

* add init flag

* fix test case

* fix tensorrt pipeline

* add UT

Co-authored-by: Ubuntu <randysheriff@rashuai-linux-gpu-3.3cfnmjowvu4e5bidlsmcxsmzwg.xx.internal.cloudapp.net>
2021-03-29 12:04:36 -07:00
Ashwini Khade
b22e60bd44
pull onnx latest commit (#7102)
* update onnx commit

* fix test scripts to remove deprecated call

* update filters

* add registration for relu and cumsum ver 14

* add promote trilu to onnx domain

* update onnx-tensorrt submodule

* update flag

* update flag

* update dependencies

* fix android ci failure
2021-03-29 11:00:38 -07:00
Scott McKay
9297527b7a
Enable NHWC transformer when generating ORT format model (#7126)
* Allow specific optimizers to be disabled.
  - replace unused ability to specify just the optimizers to run
    - never used so not needed
Allow the disabled list to be specified via the python bindings
  - expected usage is internal, so using kwargs for that so as not to pollute the documentation with stuff no user is likely to need
Update the ORT format model conversion script to disable NCHWc transformer when level is 'all'
  - currently there aren't any known use cases where we'd want the NCHWc transformations to run as they create a device specific model and aren't used on ARM
    - the ORT format model is not expected to be generated on the target device (e.g. generate on Windows/Linux/macOS to deploy to Android/iOS), so there's a good chance we'd generate a useless/invalid model
  - default to 'all' as ARM and MLAS prefer NHWC and the NHWC transformer runs at that level
* Add matching changes to optimizer generation in training code
2021-03-29 18:39:48 +10:00
satyajandhyala
90294b9c43
Fix Transpose and MatMul fusion code to check the input datatypes as … (#7147)
* Fix Transpose and MatMul fusion code to check the input datatypes as FusedMatMul only supports floating point datatypes.

* Added test cases to make sure that the int32/int64 datatypes prevent Transpose-MatMul fusion.
2021-03-28 09:24:12 -07:00
Jeff Daily
65ce5f07b3
add Dockerfile.rocm4.1.pytorch (#7152)
2021-03-26 21:40:10 -07:00
Suffian Khan
f27835c4de
Disable batch size test for AMD CI pipeline after agent upgrade to Rocm 4.1 (#7153)
* disable batch size test for rocm 4.1 until resolved

* Update orttraining-pai-ci-pipeline.yml

Forgot to modify both pipelines
2021-03-26 22:32:39 -05:00
Changming Sun
f365f1d967
Resize_impl.cu: Change _Round to roundf (#7140)
This is to keep the change minimal, make it work exactly like what it worked before.
2021-03-26 18:29:21 -07:00
Edward Chen
63d9d5afd3
Fix Pad and Gather incorrect usage of HasType helpers. (#7146)
2021-03-26 17:36:31 -07:00
Sherlock
ab86634c36
Address comments from ORTModule master merge (#7101)
* Address ortmodule merge master comments

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-03-26 16:26:42 -07:00
Thiago Crepaldi
a01f15198c
Add support for large models (#7113)
* Add support for large models

* Handle models with registered buffers
2021-03-26 14:08:46 -07:00
Suffian Khan
2b31b80b1f
increase timeout (#7145)
2021-03-26 11:26:18 -07:00
Yufeng Li
3771e0bf10
update bert quantization notebook (#7137)
2021-03-25 18:12:53 -07:00
KeDengMS
c9b29fbd06
Disable MatmulTransposeFusion for CPU EP (#7135)
It causes convergence issue in BERT on CPU
2021-03-25 17:16:58 -07:00
Dmitri Smirnov
2bf54bcaa2
Fix bugs in sparsify script (#7134)
Fix type and check.
2021-03-25 14:53:52 -07:00
G. Ramalingam
cc0e7bee76
Add function-body to SoftmaxGrad (#6988)
* Add function body to SoftmaxGrad schema

* Add type context and cleanup

* Add test case with symbolic dimensions

* Add opset specification to function

* handle opset dependence

* Exclude from minimal build
2021-03-25 11:34:06 -07:00
Tianlei Wu
53c123dcee
Add session option configuration to enable GeluApproximation (#7131)
2021-03-25 11:32:36 -07:00
Yufeng Li
8e54b76e2d
QDQ implementation (#7033)
* Add QDQ basic implementation
2021-03-25 09:17:23 -07:00
RandySheriffH
865c67611c
Exclude profiler from minimal build (#7115)
* Exclude TP profiler from minimum build

* fix typo

* remove Clock

* fix comments

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2021-03-25 09:06:14 -07:00
Vincent Wang
fda0470683
Add New AllocKind for YieldOp Outputs, Run YieldOp with InferenceSession in UT (#7125)
* new allockind, add ut

* change macro

* fix win build

* rename alloc kind

* fix mem leak
2021-03-25 15:18:51 +08:00
Sherlock
1c8d874412
Promote BiasDropout from orttraining to onnxruntime (#7116)
* Promote BiasDropout from orttraining to onnxruntime

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-03-24 20:42:42 -07:00
jingyanwangms
cd67f12add
Move IOBinding and RunOptions to ctx (#7028)
* Liqun/ort module perf1 (#6806)

add mysql script to log perf data
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

* Resolve HTTP Error 503: Service Unavailable for MNIST dataset (#6989)

* Reduce logging for ORTModule for the end user (#6982)

* Support none types in forward output (#7001)

* Missed test case for none type output (#7014)

* save iobinding to ctx

* save run_options to ctx

* remove debug tests

* PR comments and clean up

* add RunStateInfo

* remove whitespace edits

* PR comments

* remove test changes

* fix test failure

* Fix unit test test_nesting_forward_backward_calls

Co-authored-by: liqunfu <liqfu@microsoft.com>
Co-authored-by: baijumeswani <bmeswani@microsoft.com>
Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-03-24 17:51:00 -07:00
Changming Sun
2e3bbad19f
Move TensorRT Windows CI build to the machine pool (#7127)
2021-03-24 14:28:25 -07:00
Guoyu Wang
1c04eec2b1
[NNAPI EP] Fix error for QLinearAdd with an initializer as input (#7093)
* Fix the issue where input to qlinearadd is an initializer

* Add UT

* Address CR comments
2021-03-24 11:56:53 -07:00
harshithapv
540eac253e
Deepspeed pipeline parallel and fairscale sharded optimizer test samples with ORTModule (#7078)
* adding samples for Deepspeed pipeline parallel and fairscale sharded optimizer with ortmodule

* fixed typo in args

* addressed Thiago's comments

* Update orttraining/orttraining/test/python/orttraining_test_ortmodule_deepspeed_pipeline_parallel.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2021-03-24 09:43:05 -07:00