Commit graph

4319 commits

Author SHA1 Message Date
Thiago Crepaldi
fb3f1f5cc1
Enable custom ops on ORTModule (#6740) 2021-02-18 09:08:10 -08:00
Sherlock
b7b5612159
Merge pull request #6742 from microsoft/mzs/sync-from-master
Sync from master
2021-02-18 00:12:38 -08:00
M. Zeeshan Siddiqui
40dda452cf Merge branch 'master' of https://github.com/microsoft/onnxruntime into mzs/sync-from-master 2021-02-18 03:03:01 +00:00
M. Zeeshan Siddiqui
e44ac6524f
Plug n Allocate with external CUDA allocator via PyBind. (#6679) 2021-02-17 18:59:38 -08:00
liqunfu
dd8ef4409a
Liqun/migrate perf test (#6733)
move ort training perf tests to azure devops
2021-02-17 17:48:47 -08:00
liqunfu
2c5e603bad
Liqun/nuphar nuget (#6656)
create nuphar nuget with correct name
2021-02-17 16:13:07 -08:00
Thiago Crepaldi
21f9e32c60
Merge pull request #6714 from microsoft/thiagofc/merge-from-master
Merge master into thiagofc/ortmodule-api
2021-02-17 16:08:53 -08:00
Ramakrishnan Sivakumar
a5bef6886b
Threading support for Hybrid core architecture (#6728) 2021-02-17 15:35:07 -08:00
Guoyu Wang
6810d98ea3
Update links to gh-pages for ORT minimal documents (#6721)
* Fix broken link in ort minimal docs

* Update link of build.md to gh-pages
2021-02-17 14:34:50 -08:00
Justin Stoecker
af4e5c0c6e
Minor WinML model test skip name change 2021-02-17 14:27:58 -08:00
Maajid khan
b41e9b5d4c
[OpenVINO-EP] Fixes OpenVINO-EP build on windows (#6726)
* Fixes OpenVINO-EP windows build

The OpenVINO EP build is broken on Windows. The issue
is that wchar_t is UTF-16 on Windows, while on other platforms
such as Linux and macOS, wchar_t is UTF-32.

So the wide Unicode string has to be explicitly converted to a
UTF-8 string on Windows.

This commit fixes this issue.
2021-02-17 13:49:03 -08:00
Thiago Crepaldi
9d4b730e46 Fix merge leftover 2021-02-17 11:58:06 -08:00
M. Zeeshan Siddiqui
9853ef84f8 Reduce binary size, limit asynchronous/background thread stuff to training only. 2021-02-17 11:51:09 -08:00
M. Zeeshan Siddiqui
5b7e7aaa45 Move event_pool and message_queue to core. 2021-02-17 11:50:56 -08:00
M. Zeeshan Siddiqui
eecce31a8b Fix build, cleanup. 2021-02-17 11:50:41 -08:00
Thiago Crepaldi
3184c47ad1 Merge branch 'master' into thiagofc/merge-from-master 2021-02-17 11:49:52 -08:00
Yulong Wang
9a9202a218
[Node.js binding] update dependency typedoc (#6720) 2021-02-17 10:22:05 -08:00
Changming Sun
0be5475de6
Update packaging pipelines (#6664) 2021-02-17 09:53:36 -08:00
Changming Sun
46c06f6ac7
Change Windows GPU CI pipeline to CUDA11 (#6616) 2021-02-17 09:44:44 -08:00
Changming Sun
eefeacd828
Skip running gpt2 model in C# x86 (#6722) 2021-02-17 09:37:16 -08:00
Derek Murray
b8d5fa812c
Fix typo in README.md (#6713)
Fixes #6710.
2021-02-17 09:29:30 -08:00
Wei-Sheng Chin
9e67b88c83
Use local rank as GPU ID (#6719) 2021-02-17 22:42:54 +08:00
RandySheriffH
9043df8b66
Deprecate OMP from nuget pipeline (release:1.7) (#6671)
* deprecate OMP from nuget

* remove omp build

* remove

* add openmp build

* add variants

* rename package

* move GPU to no omp pipeline

* reset path

* switch to abs path

* reset path

* add cpu package

* remove obsolete name

* set package name

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2021-02-17 00:03:44 -08:00
Suffian Khan
105883f4b8
remove longformer_global_impl.cu from hipify (#6716) 2021-02-16 22:26:18 -08:00
Hariharan Seshadri
aa2622efb2
Support multiple dynamic inputs in custom ops (#6666) 2021-02-16 20:54:30 -08:00
baijumeswani
01dfa8e125
Support non tuple return values from torch.nn.module (#6660)
* Support dictionary, namedtuple and Hugging Face ModelOutput types for model return values
2021-02-16 20:48:32 -08:00
Scott McKay
02c7873b0e
Update ORT model conversion script to support custom ops (#6701)
* Add support for custom ops library to the ORT model conversion script
Simplify model conversion now that we read ops from the ORT format model.
Enable custom ops in the python bindings if custom ops are turned on in a minimal build.
* Add test of model conversion involving custom ops.
2021-02-17 12:52:39 +10:00
Thiago Crepaldi
7f33671ade
Handle multiple devices scenarios (#6672)
* Handle multiple devices scenarios
2021-02-16 18:22:30 -08:00
Thiago Crepaldi
7ee5baa60d
Remove monkey patch for PyTorch Nightly + ORTTrainer (#6659) 2021-02-16 17:24:50 -08:00
Tianlei Wu
9b446d5f7e
Longformer Attention CUDA kernel memory Improvements (#6646)
* Integrate memory improvements from NVIDIA
* compute max_global_num before buffer allocation
* update conversion script to support transformers 4.0
* update benchmark script for creating dummy inputs for different batch_size

* Use a wrapper of cuda event to avoid memory leak
2021-02-16 14:54:48 -08:00
Edward Chen
b09a370218
Address warning in data_types_internal.h (#6704)
Address "unreachable code" warning in data_types_internal.h.
2021-02-16 12:41:48 -08:00
RandySheriffH
c36ee4bd40
Rename Python packaging pipelines (#6682)
* rename pipelines

* resync and rename

* resync master

* rename package id

* remove OrtPackageId which is for nuget

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2021-02-16 11:43:03 -08:00
RandySheriffH
497eef8d3d
remove omp (#6675)
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2021-02-16 11:42:32 -08:00
Changming Sun
d48a4c0a54
Add CG step to nuget GPU pipeline (#6678) 2021-02-16 09:46:20 -08:00
Changming Sun
9a01174037
Disable some unit tests for training (#6699) 2021-02-16 09:45:59 -08:00
Scott McKay
33279250b5
Update a couple of usages of args.minimal_build to check for not specified vs empty list correctly. (#6688) 2021-02-16 14:46:51 +10:00
Tracy Sharpe
37b83acd76
MLAS: add uint8_t NHWC max pooling (#6684)
Add support to transform graphs containing uint8 MaxPool to a custom operator that supports NHWC format that can be more easily vectorized.
2021-02-15 10:05:29 -08:00
Edward Chen
a35b30e237
Change BuildKernelDefConstraintsFunctorFromTypeList struct to BuildKernelDefConstraintsFromTypeList function. (#6674) 2021-02-15 09:16:07 +10:00
Maajid khan
f649f917fe
[OpenVINO-EP] Enabling OpenVINO Runtime options for Perftest application (#6654)
* Adding changes to enable ov_config_options

Enabling a flag to pass OpenVINO Runtime options
as a string argument on the command line.

* Enabling OpenVINO Runtime options for perftest

Enables OpenVINO EP runtime options in onnxruntime_perf_test.
These options can now be passed as an argument to the perf test C++
application on the command line, as key-value pairs separated by
a space.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* minor changes added

* Corrected Indentation

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* corrected Indentation issues

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Making config options generic to all EPs

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
2021-02-13 16:09:31 -08:00
RandySheriffH
df3d6bad5f
Deprecate OMP from Python package (#6610)
1. For previous openmp build, remove --use_openmp, so thread pool will become default;
2. For previous non-openmp build, add --use_openmp and rename the package to indicate the inclusion.
3. Add a mac build with openmp enabled.
2021-02-12 21:50:41 -08:00
Faith Xu
72eb5de0e2 Add Python 3.9 to pypi metadata 2021-02-12 20:00:17 -08:00
Scott McKay
25f7c93504
Require explicit inclusion of custom op support in a minimal build (#6663)
* Remove support for custom ops from the base minimal build as they contribute too much binary growth to an Android build.
Add ability to explicitly enable custom op support in a minimal build.
Change one minimal build CI to test adding custom op support (unit tests are run in that build to validate)
2021-02-13 12:42:33 +10:00
Changming Sun
dd50c39ac6
Change Linux python packaging pipeline compile flags (#6668) 2021-02-12 15:28:56 -08:00
Sheil Kumar
87cb6fd495
Add LearningModelBuilder to WinML Experimental Namespace along with various Audio operators (#6623)
* model building

* fix build

* winml adapter model building api

* model building

* make build

* make build again

* add model building with audio op

* inplace and inorder fft

* add ifft

* works!

* cleanup

* add comments

* switch to iterative rather than recursive and use parallelization

* batched parallelization

* fft->dft

* cleanup

* window functions

* add melweightmatrix op

* updates to make spectrogram test work

* push latest

* add onesided

* cleanup

* Clean up building apis and fix mel

* cleanup

* cleanup

* naive stft

* fix test output

* middle c complete

* 3 tones

* cleanup

* signal def new line

* Add save functionality

* Perf improvements, 10x improvement

* cleanup

* use bitreverse lookup table for performance

* implement constant initializers for tensors

* small changes

* add matmul tests

* merge issues

* support add attribute

* add tests for double data type windowfunctions and minor cleanup

* stft onesided/and not tests

* cleanup

* cleanup

* clean up

* cleanup

* remove threading attribute

* forward declare orttypeinfo

* warnings

* fwd declare

* fix warnings

* 1 more warning

* remove saving to e drive...

* cleanup and fix stft test

* add opset picker

* small additions

* add onnxruntime tests

* add signed/unsigned

* fix warning

* fix warning

* finish onnxruntime tests

* make windows namespace build succeed

* add experimental flag

* add experimental api into nuget package

* add experimental api build flag and add to windows ai nuget package

* turn experimental for tests

* add minimum opset version to new experimental domain

* api cleanup

* disable ms experimental ops test when --ms_experimental is not enabled

* add macro behind flag

* remove unused x

* pr feedback

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2021-02-12 14:17:10 -08:00
ashbhandare
ff465483b1
Add TNLRv3 fp16 pattern to Layer Norm fusion (#6661)
* Add tnlrv3 pattern

* Add test
2021-02-12 14:05:36 -08:00
RandySheriffH
a07a14dce5
exclude non support types (#6653) 2021-02-12 13:30:48 -08:00
Edward Chen
b2cddc5337
Consolidate MLTypeCallDispatcher classes (#6651) 2021-02-12 13:26:56 -08:00
Suffian Khan
e6de0eb813
Add nightly pipeline for MI100 to run convergence and batch size test similar to V100. (#6611)
* Partial updating of ROCM reduction code.

* Update reduction_all.cu

* Add reduce template parameters.

* miopen common

* Reuse CUDA's reduction_functions.cc

* Reduction ops.

* Update remaining reduction ops to use MIOpen. The double datatype is not supported, so disable those typed kernels.

* Disable a couple more unsupported tests.

* Code formatting.

* Delete ROCM-specific reduction code that is identical to CUDA reduction code.

* Fix scratch buffer early free.

* Fix merge conflict.

* first attempt nightly amd ci pipeline

* try fix bad yaml file

* try again with corrected model directory

* add convergence test as well

* update reference loss for amd mi100

* include mi100 test results csv

* update the mi100 convergence test reference values

* update batch sizes for mi100 32g

* fix gpu sku for run_convergence_test.py

* undo unrelated changes to master

* pr comments

* pr comment

Co-authored-by: Jesse Benson <jesseb@microsoft.com>
2021-02-12 13:22:06 -08:00
Guoyu Wang
f11b5d3072
[CoreML EP] Enable coreml for onnx_test_runner and onnxruntime_perf_test (macOS only) (#6642) 2021-02-12 10:41:36 -08:00
Edward Chen
78e408dbe9
Enable type reduction for ConstantOfShape CPU kernel. (#6594)
* Enable type reduction for ConstantOfShape.
2021-02-12 18:27:25 +10:00