Commit graph

11997 commits

Author SHA1 Message Date
Tianlei Wu
d69bc31309
Refine BERT optimization script options (#3618)
* Remove parameters like --gpu_only and --sequence_length. Update bert GPU notebook accordingly.
* Remove input_int32 and float16 parameters from constructors of BertOnnxModel class and other classes derived from it. 
* Update gpt2 benchmark. Add comments in gpt2 notebook to indicate work in progress. Clear notebook output before official 1.3.0 release is ready.
2020-04-21 21:28:06 -07:00
Scott McKay
b4508dbdc6
Improve TopK performance. (#3612)
* Update TopK implementation.
  - add faster heap
  - special case k=1
  - update selector for when to use heap and when to use nth_element based on performance testing
  - parallelize if enough work to do
  - reduce templatized code
  - add some extra unit tests.

Perf tested vs. master. Average speedup is 3.75x using this combination of input sizes:

```
    batches = [10, 25, 50]
    batch_size = [8, 16, 32, 64, 128, 256, 512, 1024, 2048]
    k = [1, 2, 4, 6, 8, 16, 24, 32, 48, 64, 128]
```

For larger batches (e.g. 50x2048) the speedup is over 20x.
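
The strategy selection described above can be sketched roughly as follows. This is an illustration in Python, not the C++ kernel from this commit: the function names, the `HEAP_MAX_K_FRACTION` cutoff, and the quickselect stand-in for `nth_element` are all assumptions made for the example, and the sketch returns values only (no indices) and omits the parallelization.

```python
import heapq
import random

# Hypothetical cutoff. The commit picks its selector from performance testing
# and does not state the actual thresholds in the message.
HEAP_MAX_K_FRACTION = 0.25


def _select_partition(values, k):
    """Quickselect stand-in for std::nth_element; returns the k largest, descending."""
    items = list(values)
    lo, hi = 0, len(items) - 1
    target = k - 1  # position of the k-th largest in descending order
    while lo < hi:
        pivot = items[random.randint(lo, hi)]
        i, j = lo, hi
        while i <= j:
            while items[i] > pivot:
                i += 1
            while items[j] < pivot:
                j -= 1
            if i <= j:
                items[i], items[j] = items[j], items[i]
                i += 1
                j -= 1
        if target <= j:
            hi = j
        elif target >= i:
            lo = i
        else:
            break  # target sits among elements equal to the pivot
    # Only the k selected elements need a final sort.
    return sorted(items[:k], reverse=True)


def top_k(values, k):
    """Return the k largest values in descending order."""
    k = min(k, len(values))
    if k <= 0:
        return []
    if k == 1:
        # Special case: a single linear scan, no heap or partition needed.
        return [max(values)]
    if k <= len(values) * HEAP_MAX_K_FRACTION:
        # Small k relative to the input: a bounded heap is cheap.
        return heapq.nlargest(k, values)
    # Larger k: partition first, then sort only the selected part.
    return _select_partition(values, k)
```

For example, `top_k([5, 1, 9, 3, 7, 2, 8, 4], 2)` takes the heap path, while `k=5` on the same input takes the partition path.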
2020-04-22 10:05:13 +10:00
edgchen1
5492d02c4e
Remove Windows CUDA 9 build definition and helper scripts. (#3615) 2020-04-21 15:22:27 -07:00
Sherlock
d66d5bb86a
Update Optimizer Domain and Opset (#3602)
* Update Domain and Opset for SGD

* Update Adam Domain and Opset

* Update Lamb Domain and Opset
2020-04-21 15:06:02 -07:00
Edward Chen
47f1758fdc Add --skip_onnx_tests to orttraining Windows builds. 2020-04-21 21:50:35 +00:00
Edward Chen
297ab43b0c Add --enable_onnx_tests to Windows builds to allow set up of test data directory. 2020-04-21 20:34:55 +00:00
Edward Chen
2e4b9b1d0e Disable CudaKernelTest.SoftmaxCrossEntropyLoss_LargeSizeTensor because it's flaky. 2020-04-21 20:30:45 +00:00
Edward Chen
28a0c863b1 Revert "Convert Gelu to use TryParallelFor (#3599)"
This reverts commit 2579a72a88.
2020-04-21 18:45:20 +00:00
Edward Chen
d50c3e7a71 Fix GraphTransformationTests tests. 2020-04-21 18:43:49 +00:00
Pranav Sharma
9636da3951
Threadpool related changes. (#3564)
Don't create ORT threadpool if openmp is enabled (except for inter op threadpool).
Created a new static function ThreadPool::NumThreads to account for openmp settings and null threadpool ptr.
Log a warning when using SetIntraOpNumThreads when openmp is enabled.
Added a document for ORT devs.
Fix LSTM to use the new threadpool abstractions.
Rename GetNumCpuCores to GetThreadAffinityMasks and move it to the Env class.

Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>
2020-04-21 09:57:39 -07:00
Adam Pocock
3dd3f84116
[Java] Adding model metadata support (#3573)
* java - adding deployment information to build.gradle.

* java - adding support for model metadata.
2020-04-21 02:28:15 -07:00
Jeff Bloomfield
c2a01b9431 Disable erroneous compiler warning in space_depth_ops.cc 2020-04-21 01:40:12 -07:00
George Wu
1c37d5e6ec
debug option for dumping tensorrt subgraphs. (#3604) 2020-04-21 11:55:30 +08:00
Edward Chen
87fad09c7b Fix merge issue. 2020-04-21 03:44:44 +00:00
Edward Chen
daa14b64e3 Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master 2020-04-21 03:31:32 +00:00
edgchen1
ead00f97f3
Sync onnx_backend_test_series.py disabled tests (#3603)
Make the set of disabled tests consistent between ort_training and master. Fix some regex patterns.
2020-04-20 18:00:53 -07:00
pengwa
e233e6ba45
Refactor - ScatterElements (#3559)
Refactor ScatterElements using MLTypeCallDispatcherRet.
2020-04-21 08:58:42 +08:00
Changming Sun
2579a72a88
Convert Gelu to use TryParallelFor (#3599) 2020-04-20 17:32:39 -07:00
Jeff Bloomfield
971b98f9a5 Fix ARM build error 2020-04-20 17:15:55 -07:00
Changming Sun
911d125323 Remove openmp from gpu build 2020-04-20 17:13:54 -07:00
Jeff Bloomfield
850ab19e62 Fix Winml test build error 2020-04-20 15:31:16 -07:00
Jeff Bloomfield
19cdd6f1e1 Fix chk build error 2020-04-20 11:34:07 -07:00
liqunfu
781e1c36be
Add front-end MNIST test (#3231)
* add frontend MNIST test

* to use torch nightly with torchvision

* remove incorrect comment per reviewer's comment

* experiment torchvision import failure

* experiment install_deps.sh

* more experiment install_deps.sh

* experiment install_deps.sh with --upgrade

* Experiment with install_deps.sh.

* Experiment with install_ubuntu.sh.

* Use Ubuntu 18.04 and Python 3.6 for CI.

* Update cmake version for CI.

* Install MPI on Ubuntu 18.04 for CI.

* Increase tolerance for MNIST test.

* Go back to Ubuntu 16.04 for CI, fix installing from deadsnakes ppa.

* Clean-up.

* Update ort_trainer.py from ort_training.

* Get default Ubuntu Python ver back to 3.5.

* Add underscore to opset_version parameter name in ORTTrainer constructor.

* Move loss/model wrap before the call for sample output.

* Update expected values for MNIST test.

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Sergii Dymchenko <sedymche@microsoft.com>
2020-04-20 11:19:31 -07:00
edgchen1
f180b71f27
Support ONNX test version parsing from path on Windows in onnx_test_runner. (#3588) 2020-04-20 10:02:51 -07:00
Sheil Kumar
31b6629e99
Fork WinML IDL Guids (#3591)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-04-20 09:17:07 -07:00
Prabhat
381fee47ab
Added support to build onnxruntime with ACL (#3586)
* Added support to build onnxruntime with ACL

* Added ACL build instructions
2020-04-20 13:35:28 +05:30
Changming Sun
75426a3091 Fix build break 2020-04-19 18:32:46 -07:00
Jeff Bloomfield
5d2874298e Merge remote-tracking branch 'upstream/user/jeffbloo/FreeDimOverrideByName' into user/jeffbloo/MergeGithubMasterToDmlDevPlusPending 2020-04-19 13:50:21 -07:00
Jeff Bloomfield
88732cd092 upstream/jeffbloo/TrimOnSessionInitializationEnd 2020-04-19 13:49:23 -07:00
Jeff Bloomfield
eceb18869a Merge remote-tracking branch 'origin/user/jeffbloo/BatchTensorCopy' into user/jeffbloo/MergeGithubMasterToDmlDevPlusPending 2020-04-19 13:45:31 -07:00
Jeff Bloomfield
acbfa42647 Merge remote-tracking branch 'origin/DmlDev' into user/jeffbloo/MergeGithubMasterToDmlDevPlusPending 2020-04-19 13:44:25 -07:00
Jeff
7d523d2580 Merge remote-tracking branch 'upstream/master' into jeffbloo/TrimOnSessionInitializationEnd 2020-04-19 11:58:44 -07:00
Jeff
414c4174a4 Merge remote-tracking branch 'upstream/master' into user/jeffbloo/FreeDimOverrideByName 2020-04-19 11:57:42 -07:00
Jeff Bloomfield
8ee5953153 Merge remote-tracking branch 'upstream/master' into user/jeffbloo/MergeGithubMasterToDmlDev1 2020-04-19 11:52:44 -07:00
Jeff Bloomfield
a4e312da43 Fix build error in D3DDeviceCache.cpp 2020-04-19 11:52:41 -07:00
Zhang Lei
422266c445
Support conv transpose 1D in cuda provider. (#3300)
* Support conv transpose 1D in cuda provider.

* Clear some old comments. Enable conv_transpose_1d onnx test for cuda.
2020-04-19 22:07:34 +08:00
Scott McKay
7d5348f87e
Add ability to batch device copy for graph inputs and outputs. (#3580)
* Add ability to batch device copy for graph inputs and outputs.
2020-04-19 17:51:07 +10:00
Prabhat
ea62b3435a
Clean up build.py code (#3466) 2020-04-18 20:48:30 -07:00
Maxim Kalinin
fcf0f6ee9f
Generalize reshape fusion (#3554)
* Generalize reshape fusion

* Allow arbitrary number of Concat arguments
* Apply fusion even when an output of an internal node is used elsewhere
* Fix a bug when an internal node's output is the subgraph output
* Simplify code
2020-04-18 20:47:23 -07:00
Tiago Koji Castro Shibata
14e387aa1a
Fix WinML namespace build break (#3583)
* Add missing winrt namespace

* Conditional compilation of dxcore code

* Fix TAEF macros
2020-04-18 20:46:01 -07:00
Sherlock
56b223bc60
Implement OneHot CUDA Kernels (#3390)
* Implement OneHot CUDA Kernels

* Support fp16

* Use HandleNegativeAxis

* Make MLFloat16 test GPU only
2020-04-18 17:41:39 -07:00
Hariharan Seshadri
1599562016 Fix BatchNorm CUDA kernel definition 2020-04-18 17:21:29 -07:00
Zhang Lei
c365822808
Refactor calibrate.py. Add QLinearAdd and QLinearMul support. Fix bugs loading JPEGs that are not strictly RGB, and typos in the load_batch call. (#3542) 2020-04-18 17:10:55 -07:00
Dmitri Smirnov
db9566f70d
Implement Inverse(12) for CPU and CUDA (#3485) 2020-04-18 17:10:21 -07:00
Dmitri Smirnov
38a18023c7
Fix some overly common warnings. (#3578)
Fixed or disabled some pointless and noisy warnings.
2020-04-18 17:05:05 -07:00
Changming Sun
d68245853e
Disable downloading test data on Linux (#3581) 2020-04-18 15:54:58 -07:00
Sergii Dymchenko
3e884b4b6b
Fix some typos. (#3582)
* Fix some typos.

* Fix a typo.
2020-04-18 14:18:05 -07:00
suryasidd
6fe688c732
Disabled failed maxpool test on GPU (#3549) 2020-04-18 13:49:42 -07:00
edgchen1
52cfc98ec4
Merge pull request #3557 from microsoft/havenka/master-merge
Merge from master
2020-04-18 09:40:32 -07:00
edgchen1
811bd67872
Clean up docs. (#3579)
* Fix orttraining/README.md formatting.

* Delete ORT_TRAINING_BUILDS.md.

* Fix typo.
2020-04-17 22:13:11 -07:00