Commit graph

2250 commits

Author SHA1 Message Date
edgchen1
b4e82913d1
Merge pull request #3670 from microsoft/edgchen1/merge_from_master
Merge from master to ort_training_for_merge_to_master
2020-04-23 15:17:42 -07:00
Edward Chen
deac467683 Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master 2020-04-23 20:50:33 +00:00
David Brownell
3ce31933bb
Wheel file updates for FeaturizerLibrary data (#3640) 2020-04-23 13:27:22 -07:00
ytaous
ae7da23460
disable broken test in DML (#3666)
* temporarily disable LSTM_Seq_lens_unpacked for dml test

* temporarily disable LSTM_Seq_lens_unpacked for dml test

* temporarily disable LSTM_Seq_lens_unpacked

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-23 13:23:50 -07:00
edgchen1
49a1c5e546
Change CentOS build to use agent pool because builds on hosted agents run out of disk space. (#3662) 2020-04-23 12:19:19 -07:00
Xavier Dupré
5777fc18c3
Removes omp for ThreadPool in TreeEnsemble* (#3596)
* Removes omp to use ThreadPool

* removes unnecessary old OMP code

* rename compute_agg, use ThreadPool::NumThreads

Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
2020-04-22 23:48:31 -07:00
gwang-msft
02bae6bd06
Not use OpenMP for android build (#3636) 2020-04-22 21:17:05 -07:00
edgchen1
2dd4f7e96b
Add check for nullptr in PlannerImpl::FindReusableTensor(). (#3619) 2020-04-22 20:18:29 -07:00
suffiank
0e12d05cd2
fixes for ort_trainer.py to resume from checkpoint (#3510)
* fixes for ort_trainer.py to resume from checkpoint

* define self.state_dict_ during init

* add comment of explanation

* add unit test for restore from checkpoint

* fix file not found

Co-authored-by: suffian khan <sukha@microsoft.com>
2020-04-22 16:33:58 -07:00
Changming Sun
00917917d6
Downgrade numpy requirement to 1.16.6 (#3635) 2020-04-22 16:11:33 -07:00
Weixing Zhang
e4fc83252d
Refactoring code related to WARP_SIZE. (#3623)
1. Centralize its definition in common.cuh.
2. Rename it to GPU_WARP_SIZE which can be extended to AMD GPU later.
3. Centralize warp shuffle functions.

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-04-22 15:19:06 -07:00
Mikhail Kuznetsov
3cf3595579
Replaced spaces on tabs (#3555) 2020-04-22 15:16:19 -07:00
Ye Wang
7837c7efc3
Add Features to ShortGrainDropper for ONNX export (#3628)
* add features to short_grain_dropper for ONNX export

* update FeaturizersLibrary

* fix warnings
2020-04-22 14:09:39 -07:00
edgchen1
bb9b0ba5b3
Merge pull request #3607 from microsoft/edgchen1/merge_from_master
Merge from master to ort_training
2020-04-22 13:22:32 -07:00
Ye Wang
70b554cc85
Add Features to ForecastingPivot Transformer for ONNX Export (#3608)
* checkin

* fix MSVC build error

* test changes

* split pivot output into multiple tensors

* add horizon tensor

* Support multiple types for non-pivot tensor

* limit horizon tensor type to int32_t as max_horizon type

* work around some conversion warnings for local machine

* support variadic shape for non-pivot input

* dropping all rows is an exception

* fix a bug

* fix the way that generates horizon tensor

* more tests added

* add TypeConstraint() in ONNX_OPERATOR_KERNEL_EX

* update Featurizerslibrary
2020-04-22 13:09:31 -07:00
Wei-Sheng Chin
ab70625b29
Add Lamb shape inference (#3634) 2020-04-22 11:32:28 -07:00
Paul McDaniel
2c74766ad1
Add new docs around how to bind to the onnxruntime.dll (#3539) 2020-04-22 11:24:36 -07:00
Edward Chen
8df5076d96 Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master 2020-04-22 17:16:00 +00:00
Edward Chen
8d09cefafc Merge remote-tracking branch 'origin/ort_training' into edgchen1/merge_from_master 2020-04-22 16:56:15 +00:00
edgchen1
b518cb2a7a
Clean up OPTIONAL name conflict workarounds in ort_training. (#3622)
* Clean up OPTIONAL name conflict workarounds.

* Cleanup unnecessary header files onnx_protobuf.h

Co-authored-by: Sherlock Huang
2020-04-22 09:07:55 -07:00
Vincent Wang
d3a2ac5c5c
Eliminate Useless Cast during Transformer. (#3606)
* Remove Useless Cast during Transformer.

* Resolve comments.

* Check if graph can remove the node.

Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-04-22 16:36:46 +08:00
Tianlei Wu
d69bc31309
Refine BERT optimization script options (#3618)
* Remove parameters like --gpu_only --sequence_length. Update bert GPU notebook accordingly.
* Remove input_int32 and float16 parameters from constructors of BertOnnxModel class and other classes derived from it.
* Update gpt2 benchmark. Add comments in gpt2 notebook to indicate work in progress. Clear notebook output before official 1.3.0 release is ready.
2020-04-21 21:28:06 -07:00
Scott McKay
b4508dbdc6
Improve TopK performance. (#3612)
* Update TopK implementation.
  - add faster heap
  - special case k=1
  - update selector for when to use heap and when to use nth_element based on performance testing
  - parallelize if enough work to do
  - reduce templatized code
  - add some extra unit tests.

Perf tested vs. master. Average speedup is 3.75x using this combination of input sizes:

```
    batches = [10, 25, 50]
    batch_size = [8, 16, 32, 64, 128, 256, 512, 1024, 2048]
    k = [1, 2, 4, 6, 8, 16, 24, 32, 48, 64, 128]
```

For larger batches (e.g. 50x2048) the speedup is over 20x.
2020-04-22 10:05:13 +10:00
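The selection strategy listed in the TopK commit above (special-case k=1, use a heap for small k, fall back to a partition-style method otherwise) can be illustrated with a minimal Python sketch. This is not the ONNX Runtime implementation: the function name `top_k`, the `heap_ratio` threshold, and the use of a full sort in place of `nth_element` are all assumptions made for illustration, and the real selector was tuned from performance testing that is not reproduced here.

```python
import heapq


def top_k(values, k, heap_ratio=0.25):
    """Return the k largest values in descending order.

    heap_ratio is a hypothetical cutoff, not a value taken from the commit.
    """
    n = len(values)
    if k <= 0 or n == 0:
        return []
    k = min(k, n)
    if k == 1:
        # Special case: a single linear scan needs no heap or partition.
        return [max(values)]
    if k <= n * heap_ratio:
        # Small k relative to n: a bounded heap does O(n log k) work.
        return heapq.nlargest(k, values)
    # Larger k: a full sort stands in for nth_element, which the
    # Python standard library does not provide.
    return sorted(values, reverse=True)[:k]
```

All three branches return the same result for a given input; only the amount of work differs, which is why the choice between them can be made purely on measured performance.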
edgchen1
5492d02c4e
Remove Windows CUDA 9 build definition and helper scripts. (#3615) 2020-04-21 15:22:27 -07:00
Sherlock
d66d5bb86a
Update Optimizer Domain and Opset (#3602)
* Update Domain and Opset for SGD

* Update Adam Domain and Opset

* Update Lamb Domain and Opset
2020-04-21 15:06:02 -07:00
Edward Chen
47f1758fdc Add --skip_onnx_tests to orttraining Windows builds. 2020-04-21 21:50:35 +00:00
Edward Chen
297ab43b0c Add --enable_onnx_tests to Windows builds to allow set up of test data directory. 2020-04-21 20:34:55 +00:00
Edward Chen
2e4b9b1d0e Disable CudaKernelTest.SoftmaxCrossEntropyLoss_LargeSizeTensor because it's flaky. 2020-04-21 20:30:45 +00:00
Edward Chen
28a0c863b1 Revert "Convert Gelu to use TryParallelFor (#3599)"
This reverts commit 2579a72a88.
2020-04-21 18:45:20 +00:00
Edward Chen
d50c3e7a71 Fix GraphTransformationTests tests. 2020-04-21 18:43:49 +00:00
Pranav Sharma
9636da3951
Threadpool related changes. (#3564)
Threadpool related changes.

Don't create ORT threadpool if openmp is enabled (except for inter op threadpool).
Created a new static function ThreadPool::NumThreads to account for openmp settings and null threadpool ptr.
Log a warning when using SetIntraOpNumThreads when openmp is enabled.
Added a document for ORT devs.
Fix LSTM to use the new threadpool abstractions.
Rename GetNumCpuCores to GetThreadAffinityMasks and move it to the Env class.

Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>
2020-04-21 09:57:39 -07:00
Adam Pocock
3dd3f84116
[Java] Adding model metadata support (#3573)
* java - adding deployment information to build.gradle.

* java - adding support for model metadata.
2020-04-21 02:28:15 -07:00
George Wu
1c37d5e6ec
debug option for dumping tensorrt subgraphs. (#3604) 2020-04-21 11:55:30 +08:00
Edward Chen
87fad09c7b Fix merge issue. 2020-04-21 03:44:44 +00:00
Edward Chen
daa14b64e3 Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master 2020-04-21 03:31:32 +00:00
edgchen1
ead00f97f3
Sync onnx_backend_test_series.py disabled tests (#3603)
Make the set of disabled tests consistent between ort_training and master. Fix some regex patterns.
2020-04-20 18:00:53 -07:00
pengwa
e233e6ba45
Refactor - ScatterElements (#3559)
Refactor ScatterElements using MLTypeCallDispatcherRet
2020-04-21 08:58:42 +08:00
Changming Sun
2579a72a88
Convert Gelu to use TryParallelFor (#3599) 2020-04-20 17:32:39 -07:00
Changming Sun
911d125323 Remove openmp from gpu build 2020-04-20 17:13:54 -07:00
liqunfu
781e1c36be
Add front-end MNIST test (#3231)
* add frontend MNIST test

* to use torch nightly with torchvision

* remove incorrect comment per reviewer's comment

* experiment torchvision import failure

* experiment install_deps.sh

* more experiment install_deps.sh

* experiment install_deps.sh with --upgrade

* Experiment with install_deps.sh.

* Experiment with install_ubuntu.sh.

* Use Ubuntu 18.04 and Python 3.6 for CI.

* Update cmake version for CI.

* Install MPI on Ubuntu 18.04 for CI.

* Increase tolerance for MNIST test.

* Go back to Ubuntu 16.04 for CI, fix installing from deadsnakes ppa.

* Clean-up.

* Update ort_trainer.py from ort_training.

* Get default Ubuntu Python ver back to 3.5.

* Add underscore to opset_version parameter name in ORTTrainer constructor.

* Move loss/model wrap before the call for sample output.

* Update expected values for MNIST test.

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Sergii Dymchenko <sedymche@microsoft.com>
2020-04-20 11:19:31 -07:00
edgchen1
f180b71f27
Support ONNX test version parsing from path on Windows in onnx_test_runner. (#3588) 2020-04-20 10:02:51 -07:00
Sheil Kumar
31b6629e99
Fork WinML IDL Guids (#3591)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-04-20 09:17:07 -07:00
Prabhat
381fee47ab
Added support to build onnxruntime with ACL (#3586)
* Added support to build onnxruntime with ACL

* Added ACL build instructions
2020-04-20 13:35:28 +05:30
Changming Sun
75426a3091 Fix build break 2020-04-19 18:32:46 -07:00
Zhang Lei
422266c445
Support conv transpose 1D in cuda provider. (#3300)
* Support conv transpose 1D in cuda provider.

* Clear some old comment. Enable conv_transpose_1d onnx test for cuda.
2020-04-19 22:07:34 +08:00
Scott McKay
7d5348f87e
Add ability to batch device copy for graph inputs and outputs. (#3580)
* Add ability to batch device copy for graph inputs and outputs.
2020-04-19 17:51:07 +10:00
Prabhat
ea62b3435a
Clean up build.py code (#3466) 2020-04-18 20:48:30 -07:00
Maxim Kalinin
fcf0f6ee9f
Generalize reshape fusion (#3554)
* Generalize reshape fusion

* Allow arbitrary number of Concat arguments
* Apply fusion even when an output of an internal node is used elsewhere
* Fix a bug when an internal node's output is the subgraph output
* Simplify code
2020-04-18 20:47:23 -07:00
Tiago Koji Castro Shibata
14e387aa1a
Fix WinML namespace build break (#3583)
* Add missing winrt namespace

* Conditional compilation of dxcore code

* Fix TAEF macros
2020-04-18 20:46:01 -07:00
Sherlock
56b223bc60
Implement OneHot CUDA Kernels (#3390)
* Implement OneHot CUDA Kernels

* Support fp16

* Use HandleNegativeAxis

* Make MLFloat16 test GPU only
2020-04-18 17:41:39 -07:00