Commit graph

2269 commits

Author SHA1 Message Date
edgchen1
4aa033b99e
Addressing review comments (#3690)
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414359326
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414359463
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414360023
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414361667
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414368707
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414371480
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414379362
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414374516
- https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414801087
2020-04-24 14:57:18 -07:00
edgchen1
7347c73139
Revert "resolving conflicts from master (#3691)" (#3696)
This reverts commit c38a60a450.
2020-04-24 14:49:00 -07:00
ytaous
c38a60a450
resolving conflicts from master (#3691)
* resolving conflicts

* resolving conflicts

* resolving conflicts

* resolve conflicts

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-24 14:38:30 -07:00
Edward Chen
3863bd6f74 Revert "Try not to modify base name (#3638)"
This reverts commit d9641f292d.

Reverting to fix onnx_test_runner test failures.
2020-04-24 04:26:59 +00:00
Edward Chen
5a790a4b42 Merge remote-tracking branch 'origin/master' into ort_training_for_merge_to_master 2020-04-24 02:27:27 +00:00
Pranav Sharma
939d036660
Add omp impl for tryparallelfor and modify gelu to use fastgelu impl. (#3667)
* Add omp impl for tryparallelfor and modify gelu to use fastgelu impl.

* Address PR comments.
2020-04-23 18:24:46 -07:00
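The commit above switches Gelu to a "fastgelu" implementation without spelling out the formula. As a minimal sketch, assuming "fastgelu" refers to the widely used tanh approximation of GELU (the commit message itself does not confirm this), the two variants can be compared as follows. The function names `gelu_exact` and `gelu_tanh` are illustrative and are not taken from the ONNX Runtime source.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (assumed here to be what 'fastgelu' means)."""
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))


if __name__ == "__main__":
    # Print both variants side by side for a few sample inputs.
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
        print(f"{x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
```

The approximation trades a small numerical difference for a cheaper evaluation, which is presumably the motivation for the change.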
edgchen1
6ca44e216a
Merge pull request #3675 from microsoft/edgchen1/merge_from_ort_training
Merge from ort_training to ort_training_for_merge_to_master
2020-04-23 17:30:26 -07:00
Du Li
2659f205cc
Complex multiplication and conjugate contrib ops (#3384)
* adding ComplexMulConj

* Adding fp16 support.

* adding a util func
2020-04-23 17:21:48 -07:00
Edward Chen
4416d41874 Merge remote-tracking branch 'origin/ort_training' into edgchen1/merge_from_ort_training 2020-04-24 00:19:05 +00:00
Ori Levari
bae1dd7f04
add test for LearningModel creation from missing model path (#3661) 2020-04-23 15:37:32 -07:00
edgchen1
b4e82913d1
Merge pull request #3670 from microsoft/edgchen1/merge_from_master
Merge from master to ort_training_for_merge_to_master
2020-04-23 15:17:42 -07:00
Sheil Kumar
2d2375aa23
swap float16/float (#3663)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-04-23 14:27:18 -07:00
Yufeng Li
c0e817ff16
Fix a bug in skiplayernorm fusion pattern 2 (#3660)
For skiplayernorm fusion pattern 2, its input[0] should be equal to the input[0] of Add_1, but is overridden by the input[0] of Add_2.
2020-04-23 14:18:59 -07:00
Edward Chen
deac467683 Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master 2020-04-23 20:50:33 +00:00
David Brownell
3ce31933bb
Wheel file updates for FeaturizerLibrary data (#3640) 2020-04-23 13:27:22 -07:00
ytaous
ae7da23460
disable broken test in DML (#3666)
* temporarily disable LSTM_Seq_lens_unpacked for dml test

* temporarily disable LSTM_Seq_lens_unpacked for dml test

* temporarily disable LSTM_Seq_lens_unpacked

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-23 13:23:50 -07:00
edgchen1
49a1c5e546
Change CentOS build to use agent pool because builds on hosted agents run out of disk space. (#3662) 2020-04-23 12:19:19 -07:00
Weixing Zhang
336624806e
Simplify and clean code (#3655)
1. It is not necessary to include cudnn_common.h for kernels which are not implemented with CUDNN.
2. Minor change in layer norm kernel to simplify the code and resolve a build warning.

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-04-23 10:12:55 -07:00
XiaocenDong
125f68f305
fixed mnist bug (#3569)
* fixed mnist bug

* fixed train_step param
2020-04-23 23:22:38 +08:00
Xavier Dupré
5777fc18c3
Removes omp for ThreadPool in TreeEnsemble* (#3596)
* Removes omp to use ThreadPool

* removes unnecessary old OMP code

* rename compute_agg, use ThreadPool::NumThreads

Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
2020-04-22 23:48:31 -07:00
Xueyun Zhu
f1ba9aaf34
Add pipeline transformer for wait/record node (#3513)
* pipeline transformer

* clean up

* address feedback

* add record/wait for first stage and updated split script

* address feedback

* make recv/send signal as initializer

* merge

* address feedback

* unify input and initializer

* address feedback and bug fix

* minor fix

* windows build

* fix
2020-04-22 23:28:01 -07:00
pengwa
6136fd0789
GatherElementsGrad Kernels (#3627)
* GatherElementsGrad cuda kernel & tests

* Fix comments

* Fix include path
2020-04-23 14:02:34 +08:00
Wei-Sheng Chin
d9641f292d
Try not to modify base name (#3638) 2020-04-22 22:24:43 -07:00
Vincent Wang
ffe19ae49b
Expand elimination and Expand gradient. (#3610)
* Expand elimination and Expand gradient.

* Resolve comments.

* Fix test break.

* Check if graph can remove the node.

* Resolve comment.

Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2020-04-23 13:17:15 +08:00
Tang, Cheng
37f4f74308
expose training session so the training app could register custom kernel and transformers (#3642)
Co-authored-by: Cheng Tang <chenta@microsoft.com>
2020-04-22 21:35:41 -07:00
gwang-msft
02bae6bd06
Do not use OpenMP for Android build (#3636) 2020-04-22 21:17:05 -07:00
edgchen1
2dd4f7e96b
Add check for nullptr in PlannerImpl::FindReusableTensor(). (#3619) 2020-04-22 20:18:29 -07:00
suffiank
0e12d05cd2
fixes for ort_trainer.py to resume from checkpoint (#3510)
* fixes for ort_trainer.py to resume from checkpoint

* define self.state_dict_ during init

* add comment of explanation

* add unit test for restore from checkpoint

* fix file not found

Co-authored-by: suffian khan <sukha@microsoft.com>
2020-04-22 16:33:58 -07:00
Changming Sun
00917917d6
Downgrade numpy requirement to 1.16.6 (#3635) 2020-04-22 16:11:33 -07:00
Weixing Zhang
e4fc83252d
Refactoring code related to WARP_SIZE. (#3623)
1. Centralize its definition in common.cuh.
2. Rename it to GPU_WARP_SIZE which can be extended to AMD GPU later.
3. Centralize warp shuffle functions.

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-04-22 15:19:06 -07:00
Mikhail Kuznetsov
3cf3595579
Replaced spaces on tabs (#3555) 2020-04-22 15:16:19 -07:00
Ye Wang
7837c7efc3
Add Features to ShortGrainDropper for ONNX export (#3628)
* add features to short_grain_dropper for ONNX export

* update FeaturizersLibrary

* fix warnings
2020-04-22 14:09:39 -07:00
edgchen1
bb9b0ba5b3
Merge pull request #3607 from microsoft/edgchen1/merge_from_master
Merge from master to ort_training
2020-04-22 13:22:32 -07:00
Ye Wang
70b554cc85
Add Features to ForecastingPivot Transformer for ONNX Export (#3608)
* checkin

* fix MSVC build error

* test changes

* split pivot output into multiple tensors

* add horizon tensor

* Support multiple types for non-pivot tensor

* limit horizon tensor type to int32_t as max_horizon type

* work around some conversion warnings for local machine

* support variadic shape for non-pivot input

* dropping all rows is an exception

* fix a bug

* fix the way that generates horizon tensor

* more tests added

* add TypeConstraint() in ONNX_OPERATOR_KERNEL_EX

* update FeaturizersLibrary
2020-04-22 13:09:31 -07:00
Wei-Sheng Chin
ab70625b29
Add Lamb shape inference (#3634) 2020-04-22 11:32:28 -07:00
Paul McDaniel
2c74766ad1
Add new docs around how to bind to the onnxruntime.dll (#3539) 2020-04-22 11:24:36 -07:00
Edward Chen
8df5076d96 Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master 2020-04-22 17:16:00 +00:00
Edward Chen
8d09cefafc Merge remote-tracking branch 'origin/ort_training' into edgchen1/merge_from_master 2020-04-22 16:56:15 +00:00
edgchen1
b518cb2a7a
Clean up OPTIONAL name conflict workarounds in ort_training. (#3622)
* Clean up OPTIONAL name conflict workarounds.

* Clean up unnecessary header file onnx_protobuf.h

Co-authored-by: Sherlock Huang
2020-04-22 09:07:55 -07:00
Vincent Wang
d3a2ac5c5c
Eliminate Useless Cast during Transformer. (#3606)
* Remove Useless Cast during Transformer.

* Resolve comments.

* Check if graph can remove the node.

Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-04-22 16:36:46 +08:00
Tianlei Wu
d69bc31309
Refine BERT optimization script options (#3618)
* Remove parameters like --gpu_only and --sequence_length. Update bert GPU notebook accordingly.
* Remove input_int32 and float16 parameters from constructors of BertOnnxModel class and other classes derived from it.
* Update gpt2 benchmark. Add comments in gpt2 notebook to indicate work in progress. Clear notebook output before official 1.3.0 release is ready.
2020-04-21 21:28:06 -07:00
Scott McKay
b4508dbdc6
Improve TopK performance. (#3612)
* Update TopK implementation.
  - add faster heap
  - special case k=1
  - update selector for when to use heap and when to use nth_element based on performance testing
  - parallelize if enough work to do
  - reduce templatized code
  - add some extra unit tests.

Perf tested vs. master. Average speedup is 3.75x using this combination of input sizes:

```
    batches = [10, 25, 50]
    batch_size = [8, 16, 32, 64, 128, 256, 512, 1024, 2048]
    k = [1, 2, 4, 6, 8, 16, 24, 32, 48, 64, 128]
```

For larger batches (e.g. 50x2048) the speedup is over 20x.
2020-04-22 10:05:13 +10:00
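The TopK commit above mentions a heap, a special case for k=1, and a selector that chooses between a heap and `nth_element`. A minimal Python sketch of that idea follows. It is not the ONNX Runtime implementation: the function names and the `heap_ratio` threshold are hypothetical, and the real selector was tuned from performance testing that the commit message does not detail.

```python
import heapq
import random


def topk_heap(values, k):
    """Top-k largest values using a bounded heap, roughly O(n log k)."""
    return heapq.nlargest(k, values)


def topk_select(values, k):
    """Top-k largest values via quickselect partitioning (in the spirit of
    C++ nth_element), followed by a sort of the selected k items."""
    items = list(values)
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(values)")
    lo, hi, target = 0, len(items) - 1, k - 1
    while lo < hi:
        pivot = items[random.randint(lo, hi)]
        i, j = lo, hi
        # Hoare-style partition in descending order.
        while i <= j:
            while items[i] > pivot:
                i += 1
            while items[j] < pivot:
                j -= 1
            if i <= j:
                items[i], items[j] = items[j], items[i]
                i += 1
                j -= 1
        if target <= j:
            hi = j
        elif target >= i:
            lo = i
        else:
            break  # the target position already holds the pivot value
    return sorted(items[:k], reverse=True)


def topk(values, k, heap_ratio=8):
    """Pick a strategy. heap_ratio is an illustrative threshold, not a tuned one."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")
    if k == 1:
        return [max(values)]  # special case: a single linear scan
    if k * heap_ratio <= n:
        return topk_heap(values, k)  # small k relative to n: heap
    return topk_select(values, k)  # larger k: partition-based selection


if __name__ == "__main__":
    data = [random.random() for _ in range(2048)]
    print(topk(data, 4))
```

Both strategies return the same result; the choice only affects running time, which is why a selector based on k and the input size makes sense.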
edgchen1
5492d02c4e
Remove Windows CUDA 9 build definition and helper scripts. (#3615) 2020-04-21 15:22:27 -07:00
Sherlock
d66d5bb86a
Update Optimizer Domain and Opset (#3602)
* Update Domain and Opset for SGD

* Update Adam Domain and Opset

* Update Lamb Domain and Opset
2020-04-21 15:06:02 -07:00
Edward Chen
47f1758fdc Add --skip_onnx_tests to orttraining Windows builds. 2020-04-21 21:50:35 +00:00
Edward Chen
297ab43b0c Add --enable_onnx_tests to Windows builds to allow set up of test data directory. 2020-04-21 20:34:55 +00:00
Edward Chen
2e4b9b1d0e Disable CudaKernelTest.SoftmaxCrossEntropyLoss_LargeSizeTensor because it's flaky. 2020-04-21 20:30:45 +00:00
Edward Chen
28a0c863b1 Revert "Convert Gelu to use TryParallelFor (#3599)"
This reverts commit 2579a72a88.
2020-04-21 18:45:20 +00:00
Edward Chen
d50c3e7a71 Fix GraphTransformationTests tests. 2020-04-21 18:43:49 +00:00
Pranav Sharma
9636da3951
Threadpool related changes. (#3564)
Threadpool related changes.

Don't create ORT threadpool if openmp is enabled (except for inter op threadpool).
Created a new static function ThreadPool::NumThreads to account for openmp settings and null threadpool ptr.
Log a warning when using SetIntraOpNumThreads when openmp is enabled.
Added a document for ORT devs.
Fix LSTM to use the new threadpool abstractions.
Rename GetNumCpuCores to GetThreadAffinityMasks and move it to the Env class.

Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>
2020-04-21 09:57:39 -07:00