Commit graph

4287 commits

Author SHA1 Message Date
M. Zeeshan Siddiqui
5b7e7aaa45 Move event_pool and message_queue to core. 2021-02-17 11:50:56 -08:00
M. Zeeshan Siddiqui
eecce31a8b Fix build, cleanup. 2021-02-17 11:50:41 -08:00
Thiago Crepaldi
3184c47ad1 Merge branch 'master' into thiagofc/merge-from-master 2021-02-17 11:49:52 -08:00
baijumeswani
01dfa8e125
Support non tuple return values from torch.nn.module (#6660)
* Support dictionary, namedtuples and Hugging Face ModelOutput type for model return values
2021-02-16 20:48:32 -08:00
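As a rough illustration of what supporting non-tuple return values can involve, the sketch below flattens a nested return value (dicts, namedtuples, lists, tuples) into a flat list of leaves. The name `flatten_outputs` is hypothetical and is not taken from this commit, and plain Python values stand in for tensors.

```python
from collections import namedtuple
from collections.abc import Mapping, Sequence


def flatten_outputs(value):
    """Flatten nested dict/namedtuple/list/tuple outputs into a flat list of leaves.

    Hypothetical helper for illustration only; not the code from the commit.
    """
    if isinstance(value, Mapping):
        # Dict-like outputs: walk the values in insertion order.
        leaves = []
        for item in value.values():
            leaves.extend(flatten_outputs(item))
        return leaves
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        # Tuples, lists and namedtuples (which are tuples) are walked in order.
        leaves = []
        for item in value:
            leaves.extend(flatten_outputs(item))
        return leaves
    # Anything else is treated as a leaf (a tensor in the real setting).
    return [value]


Output = namedtuple("Output", ["logits", "hidden"])
flat = flatten_outputs({"loss": 0.5, "out": Output(logits=1, hidden=(2, 3))})
```

A flat list like this is one common way to map structured model outputs onto the positional outputs of an exported graph. Restoring the original structure would need the nesting to be recorded separately.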
Thiago Crepaldi
7f33671ade
Handle multiple devices scenarios (#6672)
* Handle multiple devices scenarios
2021-02-16 18:22:30 -08:00
Thiago Crepaldi
7ee5baa60d
Remove monkey patch for PyTorch Nightly + ORTTrainer (#6659) 2021-02-16 17:24:50 -08:00
Maajid khan
f649f917fe
[OpenVINO-EP] Enabling OpenVINO Runtime options for Perftest application (#6654)
* Adding changes to enable ov_config_options

Enabling a flag to pass OpenVINO Runtime options
as a string argument on the command line.

* Enabling OpenVINO Runtime options for perftest

Enables OpenVINO EP runtime options into onnxruntime_perf_test.
Now these options can be passed as an argument to the perf test CPP
application using key-value pairs separated by a space via a
command line.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* minor changes added

* Corrected Indentation

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* corrected Indentation issues

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Making config options generic to all EPs

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
2021-02-13 16:09:31 -08:00
RandySheriffH
df3d6bad5f
Deprecate OMP from Python package (#6610)
1. For previous openmp build, remove --use_openmp, so thread pool will become default;
2. For previous non-openmp build, add --use_openmp and rename the package to indicate the inclusion.
3. Add a mac build with openmp enabled.
2021-02-12 21:50:41 -08:00
Faith Xu
72eb5de0e2 Add Python 3.9 to pypi metadata 2021-02-12 20:00:17 -08:00
Scott McKay
25f7c93504
Require explicit inclusion of custom op support in a minimal build (#6663)
* Remove support for custom ops from the base minimal build as they contribute too much binary growth to an Android build.
Add ability to explicitly enable custom op support in a minimal build.
Change one minimal build CI to test adding custom op support (unit tests are run in that build to validate)
2021-02-13 12:42:33 +10:00
Changming Sun
dd50c39ac6
Change Linux python packaging pipeline compile flags (#6668) 2021-02-12 15:28:56 -08:00
Sheil Kumar
87cb6fd495
Add LearningModelBuilder to WinML Experimental Namespace along with various Audio operators (#6623)
* model building

* fix build

* winml adapter model building api

* model building

* make build

* make build again

* add model building with audio op

* inplace and inorder fft

* add ifft

* works!

* cleanup

* add comments

* switch to iterative rather than recursive and use parallelization

* batched parallelization

* fft->dft

* cleanup

* window functions

* add melweightmatrix op

* updates to make spectrogram test work

* push latest

* add onesided

* cleanup

* Clean up building apis and fix mel

* cleanup

* cleanup

* naive stft

* fix test output

* middle c complete

* 3 tones

* cleanup

* signal def new line

* Add save functionality

* Perf improvements, 10x improvement

* cleanup

* use bitreverse lookup table for performance

* implement constant initializers for tensors

* small changes

* add matmul tests

* merge issues

* support add attribute

* add tests for double data type windowfunctions and minor cleanup

* stft onesided/and not tests

* cleanup

* cleanup

* clean up

* cleanup

* remove threading attribute

* forward declare orttypeinfo

* warnings

* fwd declare

* fix warnings

* 1 more warning

* remove saving to e drive...

* cleanup and fix stft test

* add opset picker

* small additions

* add onnxruntime tests

* add signed/unsigned

* fix warning

* fix warning

* finish onnxruntime tests

* make windows namespace build succeed

* add experimental flag

* add experimental api into nuget package

* add experimental api build flag and add to windows ai nuget package

* turn experimental for tests

* add minimum opset version to new experimental domain

* api cleanup

* disable ms experimental ops test when --ms_experimental is not enabled

* add macro behind flag

* remove unused x

* pr feedback

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2021-02-12 14:17:10 -08:00
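One of the steps listed in this commit is "use bitreverse lookup table for performance". As a minimal sketch of that general technique (not the WinML implementation, and with hypothetical function names), the table below maps each index to the index obtained by reversing its low log2(n) bits, which is the reordering an iterative in-place radix-2 FFT applies to its input.

```python
def bit_reverse_table(n):
    """Return table where table[i] is i with its log2(n) low bits reversed.

    n must be a positive power of two. Illustrative helper, not from the commit.
    """
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    bits = n.bit_length() - 1
    table = [0] * n
    for i in range(1, n):
        # Reuse the entry for i >> 1: shift it right by one bit and
        # move the lowest bit of i into the highest position.
        table[i] = (table[i >> 1] >> 1) | ((i & 1) << (bits - 1))
    return table


def bit_reverse_permute(values):
    """Reorder values into bit-reversed index order using the lookup table."""
    table = bit_reverse_table(len(values))
    return [values[j] for j in table]


table8 = bit_reverse_table(8)
permuted = bit_reverse_permute(list("abcdefgh"))
```

Building the table once per transform size avoids recomputing the bit reversal for every element on every call.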
ashbhandare
ff465483b1
Add TNLRv3 fp16 pattern to Layer Norm fusion (#6661)
* Add tnlrv3 pattern

* Add test
2021-02-12 14:05:36 -08:00
RandySheriffH
a07a14dce5
exclude non support types (#6653) 2021-02-12 13:30:48 -08:00
Edward Chen
b2cddc5337
Consolidate MLTypeCallDispatcher classes (#6651) 2021-02-12 13:26:56 -08:00
Suffian Khan
e6de0eb813
Add nightly pipeline for MI100 to run convergence and batch size test similar to V100. (#6611)
* Partial updating of ROCM reduction code.

* Update reduction_all.cu

* Add reduce template parameters.

* miopen common

* Reuse CUDA's reduction_functions.cc

* Reduction ops.

* Update remaining reduction ops to use MIOpen.  double datatype is not supported, so disable those typed kernels.

* Disable a couple more unsupported tests.

* Code formatting.

* Delete ROCM-specific reduction code that is identical to CUDA reduction code.

* Fix scratch buffer early free.

* Fix merge conflict.

* first attempt nightly amd ci pipeline

* try fix bad yaml file

* try again with corrected model directory

* add convergence test as well

* update reference loss for amd mi100

* include mi100 test results csv

* update the mi100  convergence test reference values

* update batch sizes for mi100 32g

* fix gpu sku for run_convergence_test.py

* undo unrelated changes to master

* pr comments

* pr comment

Co-authored-by: Jesse Benson <jesseb@microsoft.com>
2021-02-12 13:22:06 -08:00
Guoyu Wang
f11b5d3072
[CoreML EP] Enable coreml for onnx_test_runner and onnxruntime_perf_test (macOS only) (#6642) 2021-02-12 10:41:36 -08:00
Edward Chen
78e408dbe9
Enable type reduction for ConstantOfShape CPU kernel. (#6594)
* Enable type reduction for ConstantOfShape.
2021-02-12 18:27:25 +10:00
Faith Xu
950c941f11
Remove year from license (#6658) 2021-02-12 00:25:56 -08:00
Scott McKay
ce01c3760f
Cleanup macros used to register activations. (#6628)
Registrations need to either be between a start and end version, or be the current version. Having a macro that uses 3 versions will break or lead to misuse when a 4th version is released.
2021-02-12 17:49:03 +10:00
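The rule stated in this commit message (a registration either covers a closed start-to-end version range or is the current, open-ended version) can be sketched as a small lookup. This is a Python illustration of the idea only; the real registrations are C++ macros, and the `Registration` type, the `resolve` function, the op name and the version numbers below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Registration:
    op: str
    start: int
    end: Optional[int]  # None means "current version", i.e. open-ended


def resolve(registrations, op, opset):
    """Return the registration whose version range contains opset, or None."""
    for reg in registrations:
        in_range = reg.start <= opset and (reg.end is None or opset <= reg.end)
        if reg.op == op and in_range:
            return reg
    return None


# Hypothetical op with three versions. When a fourth version is released,
# the open-ended entry is closed with an end version and a new open-ended
# entry is added; no entry ever has to name more than two versions.
registrations = [
    Registration("MyActivation", 1, 5),
    Registration("MyActivation", 6, 12),
    Registration("MyActivation", 13, None),
]
```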
Scott McKay
1916e35bea
Reduce tensorprotoutils binary size (#6634)
* Move type agnostic code out of UnpackInitializerData
Refactor the unpack tensor logic to switch on data size
Add test cases

* Remove templatization of more parts
2021-02-12 16:48:13 +10:00
Faith Xu
fba46a76bc
Update readme to reference docs webpage content (#6621)
* Fix broken links to EP docs

* Fix another link

* Simplify content to link to docs site

* Update README.md

* Add build pipeline status

* Fix openvino pipeline widget
2021-02-11 16:50:32 -08:00
Changming Sun
8378a45ae7
Add python 3.8/3.9 support for Windows GPU and Linux ARM64 (#6615)
Add python 3.8/3.9 support for Windows GPU and Linux ARM64

Delete jemalloc from cgmanifest.json.

Add onnx node test to Nuphar pipeline.

Change $ANDROID_HOME/ndk-bundle to $ANDROID_NDK_HOME. The latter one is more accurate.

Delete Java GPU packaging pipeline

Remove test data download step in Nuget Mac OS pipeline. Because these machines are out of control and out of our network, it's hard to make it reliable and the data secure.

Fix a doc problem in c-api-artifacts-package-and-publish-steps-windows.yml. It shouldn't copy C_API.md, because the file has been moved into a different branch.

Delete the CI build docker file for Ubuntu cuda 9.x and Ubuntu x86 32 bits

And, due to some internal restrictions, I need to rename some of the agent pools
2021-02-11 16:43:35 -08:00
Yufeng Li
1c3168c0f6
Skip constant folding dequantizelinear for quant qdq format (#6643)
* skip constant folding dequantizelinear for quant qdq format
2021-02-11 14:06:13 -08:00
Thiago Crepaldi
0732d72706
Add support for dynamic axes for outputs + check model output type before export (#6648) 2021-02-11 10:18:02 -08:00
Ye Wang
b4b829dfcf
Update transformers tool based on latest transformers (#6641)
* bert_base_cased: embedlayer fusion

* xlm_mlm_en_2048: attention fusion
2021-02-11 10:11:47 -08:00
Sherlock
9294dde143
Rename ONNX graphs variables in ORTModule (#6645)
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-02-10 20:10:23 -08:00
Ye Wang
a7b6fc08f2
Support skiplayernorm fusion without beta in layernorm (#6617)
* support skiplayernorm fusion without beta in layernorm

* use place holder

* review comments
2021-02-10 17:50:10 -08:00
Guoyu Wang
fd83e38dcf
[CoreML EP] Add support of BatchNormalization/Reshape/Global[Average/Max]Pool (#6625)
* [CoreML EP] Add batch norm support

* Add reshape support

* Add global pooling support

* Addressed CR comments
2021-02-10 17:16:35 -08:00
Guoyu Wang
64edcad2d8
[NNAPI EP] Add EP option to disable CPU (#6593)
* Add NNAPI EP option to disable CPU

* update comments

* Address CR comment

* Address CR comments, update code comments

* Address CR comments
2021-02-10 17:16:07 -08:00
Matthew Emmett
d2ce8a2c80
Add hipFFT include directory (transitional step) before ROCm. (#5992)
hipFFT is transitioning to a separate repository (away from being
included in rocFFT).  During this transition, using the hipFFT version
of hipfft.h won't produce a deprecation warning.
2021-02-10 16:46:03 -08:00
Changming Sun
042964f633 Change how ONNX get installed 2021-02-10 14:41:26 -08:00
Edward Chen
e59cb9455e
Add CI build with type reduction enabled (#6622) 2021-02-10 13:31:51 -08:00
Vincent Wang
eec602e48a
OrtModule v0.21 (#6395)
* ortmodule v0.2

* use pt module for eval

* get user outputs in yield op

* pass output grads to yield output without copy

* Disable mem_pattern for ORTModule

* Avoid allocating output buffer for Yield op

* Change to WaitAndReset to avoid overriding signal

* remove unnecessary signal/wait at the end of bg thread

* Return Session.Run result as a std::future

* export model with torch.no_grad()

* Handle bg thread's early return in Forward call

* Removed duplicated Yield kernel

* Silence "CUDA kernel missing log"

* Add missing transforms, clear iobinding (#6532)

* revert ortmodule.py to a working state first

* Apply ortmodule.py change from dev branch

* Rename to YieldOp

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: ashbhandare <ash.bhandare@gmail.com>
Co-authored-by: Sherlock <baihan.huang@gmail.com>
2021-02-10 13:27:15 -08:00
Edward Chen
352e8cb8a8
Move ORT_ENFORCE()'s within MLTypeCallDispatcher to helper class functions to reduce the size used by function names in ORT_ENFORCE(). (#6624)
Move ORT_ENFORCE()'s within MLTypeCallDispatcher to helper class functions to reduce the size of function names in ORT_ENFORCE().
ORT_ENFORCE() captures the containing function's name in the error message. For some usages of MLTypeCallDispatcher (i.e., with numerous types or long type names), the function name is quite long and can contribute significantly to the binary size. Usage in the Cast CPU kernel is a notable example.
This change moves the ORT_ENFORCE() checks from a class template member function template with variable length name to a helper function with a fixed length name.
2021-02-10 11:34:38 -08:00
Dwayne Robinson
eef9a7a8a9
Update DirectML 1.4.0 to 1.4.1 for ORT 1.7 (#6636) 2021-02-10 10:34:40 -08:00
Xiang Zhang
8502573125
fix CheckLearningModelPixelRange (#6632) 2021-02-10 10:23:54 -08:00
Derek Murray
88d48063fa
Log warning when GetGradientForOp() silently fails. (#6586)
* Add warning when GetGradientForOp() silently fails.

In some cases, `GetGradientForOp()` can return without creating any nodes, which may lead to an invalid graph being created.
2021-02-10 10:01:16 -08:00
Hariharan Seshadri
b09bfc8611 Revert "Remove abs in LpPool (#6303)"
This reverts commit 3b3e698674.
2021-02-10 00:48:14 -08:00
Wei-Sheng Chin
8972621138
Generate shape-independent graph if any input dimension < 2 (#6581)
* Throw for non-supported case

* Not to go to shape-dependent branch when seeing unsupported shapes
2021-02-10 15:44:25 +08:00
Hariharan Seshadri
8f0b877a1d
Enable running some ops on CUDA (#6572) 2021-02-09 22:10:43 -08:00
Yufeng Li
505c1f30b5 use == instead of is for python 3.8 2021-02-09 19:59:28 -08:00
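For context on this kind of change: `is` tests object identity while `==` tests value equality, and Python 3.8 started emitting a `SyntaxWarning` when `is` is used with a literal. The snippet below is a generic illustration with hypothetical names, not the code touched by the commit.

```python
def same_value(a, b):
    """Value comparison: the right check for numbers, strings and containers."""
    return a == b


def same_object(a, b):
    """Identity comparison: true only when both names refer to one object."""
    return a is b


first = [1, 2, 3]
second = [1, 2, 3]  # equal contents, but a distinct list object
alias = first       # a second name for the same object

equal_by_value = same_value(first, second)
equal_by_identity = same_object(first, second)
alias_identity = same_object(first, alias)
```

Identity checks remain appropriate for singletons such as `None`; for comparing against literals, `==` is the safe choice.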
Changming Sun
e70344e648
Fix training python packaging pipeline (#6613)
In a previous PR, I set the docker file name to a wrong value.
2021-02-09 11:04:39 -08:00
Justin Stoecker
1c72774232
Update a few WinML model test filters for DML 2021-02-09 10:23:57 -08:00
Cian Hayes
8f14b8bd9d
Support disabling training kernels as part of a reduced build (#6557) 2021-02-09 09:51:31 -08:00
stevenlix
e9d03983fc
Add engine decryption in TensorRT EP (#6612)
* add trt engine decryption

* update document

* add windows support to decryption

* fix issues

* remove redundant get() from engine/context check

* fix issue
2021-02-09 00:46:14 -08:00
Changming Sun
0b89f931d0
Update CUDA build configs (#6598)
1. Fix Nuget package build break caused by #6225
2. Delete Dockerfile.centos_gpu. It is not used anywhere.
3. Fix Linux CUDA 10.2 build error caused by glibc upgrade
2021-02-08 22:55:42 -08:00
Xavier Dupré
d3a2c8c1c7
Support double for operators ReduceMax, ReduceMin (#6265)
* Support double for operators ReduceMax, ReduceMin

* add unit test to pai-excluded-tests.txt

Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
2021-02-08 19:14:26 -08:00
Randy Shuai
ff063309b0 enable omp for debug build 2021-02-08 19:10:13 -08:00
Randy Shuai
6c5f50d00e deprecate omp in ci 2021-02-08 19:10:13 -08:00