Commit graph

3426 commits

Author SHA1 Message Date
Tianlei Wu
3bbce69185
bump version to 1.5.1 (#5258) 2020-09-22 20:57:34 -07:00
Jeff Bloomfield
59e69bf35b
Handle missing initializers in allocation planner to fix crashes with DML provider (#5244)
* Fix memory planning bug with DML EP

* Address PR comments

* Fix typo
2020-09-22 19:37:07 -07:00
Ye Wang
898531f502
Fix reshape fusion crash (#5252)
* fix reshape fusion crash

* handling start_node statelessly

* fix
2020-09-22 15:04:13 -07:00
Guoyu Wang
e30530d9ea
Add java API for AddSessionConfigEntry (#5241)
* Add session option config entry API for java

* Java format

* Add extra test verification

* Address PR comments

* Update comments

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-09-22 14:51:39 -07:00
KeDengMS
8dceebda0e
[Training/Python] Add option to enable symbolic shape inference (#5107)
This change adds symbolic shape inference to ORT training, which helps static memory planning for models like BART.
2020-09-22 10:49:07 -07:00
edgchen1
14f250a4d0
Update BUILD.md training dependency info. (#5240)
Update training dependency versions based on Dockerfile.training.
2020-09-22 10:36:04 -07:00
Guoyu Wang
d957dbebea
Fix possible ios build break after update to Xcode 12 (#5246)
* Fix possible ios build break after update to Xcode 12

* Address comments
2020-09-22 07:42:54 -07:00
suffian khan
417929b049
jobs timeout .. 2020-09-21 21:51:59 -07:00
suffian khan
a6eb90472c
try fix error on code coverage ci build 2020-09-21 21:51:59 -07:00
Sherlock
1478643215
Place Shape's output in CPU memory (#5245)
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 20:21:59 -07:00
Sherlock
038192bdb2
Place shape related compute nodes in CPU (#4940)
* Place shape related nodes in CPU
* visit candidates by topological order
* Make CPU node placement a utility function
* skip placing on CPU if the data type is float16 or bfloat16
2020-09-21 17:10:39 -07:00
Changming Sun
0cb09374c6
Update BUILD.md for CUDA versions (#5239) 2020-09-21 15:28:53 -07:00
George Wu
3147bc00c3
update TensorRT docs (#5238)
* doc updates TensorRT

* update

* update

* fix warning

* newline

* format
2020-09-21 15:24:20 -07:00
Xueyun Zhu
55e4b5d302
add pipeline distributed training test (#5222)
* add pipeline distributed training test

* fix max line length error in windows build

* function header indent

* fix

* fix flake8 error
2020-09-21 14:35:01 -07:00
liqunfu
84c222126c
Deprecate testMNISTTrainingAndTestingOpset10 (#4927)
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 14:17:08 -07:00
Pranav Sharma
974b9bfc09
Allow sharing of initializers between sessions. (#5092)
* Allow sharing of initializers between sessions.

* Allow sharing of initializers between sessions (2).

* Add test for C#

* Add test for C#; address PR comments

* Address PR comments
Moved AddInitializer logic to internal session options
Added tests for owned buffer
Clarified documentation
Fix bug where memory info and not device was getting compared

* Fix test

* Fix training build

* Add ver 5 end marker and ver 6 starter, add scenario and usage examples.
2020-09-21 14:09:37 -07:00
Scott McKay
e0719a1073
Revert to using release SafeInt repo now that it supports a build with exceptions disabled. (#5233) 2020-09-22 06:29:28 +10:00
edgchen1
e9671e93f0
Fix TransposeScaleMatMul and MatMulScaleFusion issues (#5230)
- Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility
- Fix MatMulScaleFusion issues:
  - Add check for supported execution providers
  - Add check for supported MatMul input types
2020-09-21 12:34:01 -07:00
Ye Wang
65740deb10
Fix a bug in EmbedLayerNorm fusion (#5150)
* fix embedlayernorm bug

* review comments

* interim checkin

* review comments

* Fix core dump in MacOS

* remove unnecessary lines

* update document

* Update graph_utils.cc

* Update onnx_exporter.py

* resolve comments
2020-09-21 12:26:14 -07:00
stevenlix
aefb2cc49b
Create profile for all dynamic shape input tensors (#5229) 2020-09-20 05:55:21 -07:00
Tiago Koji Castro Shibata
cd663d58f5
Fix WinML warnings (#5228) 2020-09-19 12:41:42 -07:00
Guoyu Wang
78a29aebbc
[ORT Mobile] ORT Minimal E2E CI (#5200)
* Modify the ort minimal CI to ort minimal e2e ci
2020-09-19 18:43:22 +10:00
Dmitri Smirnov
8ee4e8226e
Preserve relative order of the results and the tests. (#5225) 2020-09-19 00:45:44 -07:00
Weixing Zhang
b49f6a5e2c
using GPU_WARP_SIZE to make kernel portable between AMD and Nvidia GPU (#5173) 2020-09-18 14:56:16 -07:00
Suffian Khan
84589c7e05
Fuse softmax(a + b) in case of simple broadcast (#4937)
* bias softmax kernel

* bias softmax kernel

* remove debug comments

* remove debug comment

* windows build doesn't handle unary minus on unsigned type

* int64 => int treated as error

* only support cuda

* add bias softmax fusion tests

* PR comments

* more PR comments

* use MLTypeCallDispatcher

* break function into pieces

* add loop unroll and add to list for inference as well

* use std::min and move operator==

* revert std::min (doesn't work in ci pipeline) and fix int to size_t error

* pr comments

* fixes for windows ci

* fix for windows ci

* pr comments on consistency

* p_model_

* fix formatting and add anonymous namespace

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-18 14:15:55 -07:00
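
For readers unfamiliar with the pattern named in #4937, the following is a minimal sketch of what "softmax(a + b) with a simple broadcast" computes. It is plain Python for illustration only and is not the CUDA kernel from the commit; the function name `bias_softmax` and the assumption that `b` is a single row broadcast across every row of `a` are hypothetical choices made for this example.

```python
import math

def bias_softmax(a, b):
    """Illustrative only: softmax(a + b) per row, with b broadcast over the rows of a.

    `a` is a list of rows and `b` is one row of the same width (an assumed,
    simplified broadcast pattern; the real fusion may cover other layouts).
    """
    out = []
    for row in a:
        # Form the sum one row at a time instead of materialising a full (a + b) tensor.
        biased = [x + y for x, y in zip(row, b)]
        m = max(biased)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - m) for v in biased]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out
```

The point of fusing the two operations is that the add and the softmax share one pass over the data, so the intermediate sum never needs to be written out as a separate tensor.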
Tang, Cheng
e0b49844e9
Provide option to let layernorm stash mean/var as fp32 or bfloat16 (#5215)
* add option to set layernorm stash type

* bug fix

* fix merge error

* fix win build error
2020-09-18 13:42:01 -07:00
Dmitri Smirnov
a90ab12589
Refactor onnx_test_runner (#5169)
Refactor onnx_test_runner for better object ownership, code readability and maintainability.
2020-09-18 13:19:35 -07:00
Ryan Hill
13318ab0d4
Remove invalid install line (#5219) 2020-09-18 11:58:40 -07:00
Shucai Xiao
a632dd2d3b
Amdmigraphx improvements (#5158)
* code backup

* remove unnecessary log info

* code backup

* code backup

* merge changes from master branch

* code backup

* code backup

* merge changes from master branch

* code backup

* code backup for constant folding enhancement

* code backup

* include more scenarios for constant folding

* code backup

* remove unnecessary code

* remove unnecessary log information

* fix an error in comments

* update algorithm to do graph partition

* code backup

* remove unnecessary log information

* remove an unused function

* remove unnecessary changes
2020-09-18 11:56:50 -07:00
Weixing Zhang
f91248e0cc
remove curand_generator_ related code since it is not used. (#5220) 2020-09-18 11:50:35 -07:00
KeDengMS
ce3b67e0cd
[Python] Move symbolic_shape_infer from nuphar to tools (#5162)
* [Python] Move symbolic shape inference from nuphar to tools

* Fix PEP8 ERROR
2020-09-18 09:31:06 -07:00
RRRachelllll555
f7c1e51810
Remove shape inference and fix save large model(>2g) issue (#5210)
* remove shape inference and fix save large model problem

* remove unnecessary import

* refine code and add external format for quantize_qat

* remove initializers in tensors_to_calibrate

* small refine

Co-authored-by: t-yguo <t-yguo@microsoft.com>
2020-09-18 08:46:31 -07:00
Scott McKay
c46a480306
Update conversion script and process to simplify creating ORT format models and a minimal build (#5217)
* Update conversion script and process to simplify creating ORT format models and a minimal build.
2020-09-18 18:49:54 +10:00
George Wu
1b61dfaf69
fix _WIN32 (#5218) 2020-09-18 00:23:17 -07:00
Pranav Prakash
f5df96256c
Fix order of returned values in quantize_weight_per_channel (#5205)
Must match returned order of `quantize_inputs`
2020-09-17 17:57:46 -07:00
liqunfu
f37e1292a1
--shm-size=1024m to fix nccl shared memory issue (#5214)
* --shm-size=256m to fix nccl shared memory issue

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-17 17:21:47 -07:00
Guoyu Wang
8156e0dd10
[ORT Mobile] Some updates to iOS/Android build settings (#5184)
* Update android CI and build settings

* add build_java to arm64 also

* Add ios signing param

* fix a small build warning

* address pr comments
2020-09-17 15:53:14 -07:00
Tracy Sharpe
8698157112
NCHWc optimizer fixes for quantized models (#5203)
This updates the NCHWc transformer to not interfere with quantized convolution models, based on observations from internal models. The tensor type for MaxPool must be float. The input to GlobalAveragePool/GlobalMaxPool must be in NCHWc format.
2020-09-17 09:52:21 -07:00
Pranav Sharma
d535894297
Add API to allow configuration of the global thread pools. (#5199) 2020-09-17 09:19:18 -07:00
Suffian Khan
e01e0b2e40
Fix softmax_warp_backward math when is_log_softmax = True and register LogSoftmax CUDA kernel (#5160)
* register logsoftmax cuda kernel; fix logsoftmaxgrad cuda kernel; fix tests to invoke dispatch_softmax_*

* forgot to remove axis check

* add tests all axis

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-17 07:15:25 -07:00
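
The distinction behind #5160 is that the backward pass differs depending on whether the forward output is softmax or log-softmax. The sketch below states the two standard gradient formulas in plain Python. It is not the code of `softmax_warp_backward`; the function names are hypothetical and only the textbook math is assumed.

```python
import math

def log_softmax(x):
    # Stable log-softmax: x_i - logsumexp(x)
    m = max(x)
    lse = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - lse for v in x]

def softmax_backward(dy, y):
    # y = softmax(x):      dx_i = y_i * (dy_i - sum_j dy_j * y_j)
    dot = sum(g * p for g, p in zip(dy, y))
    return [p * (g - dot) for g, p in zip(dy, y)]

def log_softmax_backward(dy, y):
    # y = log_softmax(x):  dx_i = dy_i - exp(y_i) * sum_j dy_j
    total = sum(dy)
    return [g - math.exp(v) * total for g, v in zip(dy, y)]
```

Note that the log-softmax gradient sums to zero across the reduced axis, which is a convenient sanity check when testing a kernel along different axes.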
S. Manohar Karlapalem
584638e5d3
Corrects doc typos and formatting (#5201) 2020-09-17 01:25:19 -07:00
Zhang Lei
cd0386b649
MaxPool versioning in quantization tools. (#5194)
MaxPool versioning in quantization tools.
2020-09-16 22:52:24 -07:00
Ryan Hill
b11c106346
Remove almost all of the reinterpret_casts from the provider shared API (#5190) 2020-09-16 17:00:15 -07:00
Vincent Wang
c37472a1aa
Mixed Precision Transformer and Gradient Builder Refactor (#4892)
* transform mixed precision before build gradient

* resolve comments

Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2020-09-17 02:44:50 +08:00
Tiago Koji Castro Shibata
f3f119a945
Use onecore umbrella lib in onecore builds (#5182)
* delayload hack

* Skip tests

* Onecore uses onecore umbrella

* Uncomment tests

* cleanup

* Disable dev mode for WinML
2020-09-16 10:46:27 -07:00
Tiago Koji Castro Shibata
1a2e289d2d
Fix nuget build (#5163)
* Fix nuget content

* Revert "Fix nuget content"

This reverts commit e2cdcec4e39964c50eac2fb306c7a4bb84352443.

* Nuget packaging

* skip tests

* msbuild path

* Force msbuild version

* Workaround https://github.com/NuGet/Home/issues/7621

* cleanup
2020-09-16 10:37:09 -07:00
Dmitri Smirnov
e6f85f338e
Refactor TensorAt, prepare for release (#5180)
* Refactor TensorAt
  locations* must be const and int64_t since our dims are int64_t
  Remove unnecessary copy of locations.
  Remove unnecessary casting and C-casting. Simplify implementation.
  Add a check for string type.
  Make CXX api return T& to fully expose C API in C++, const std::vector& by value as it
  covers more ground and eliminate redundant copy.
  Eliminate inner loop, compute strides first.
2020-09-16 10:20:45 -07:00
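
The "compute strides first" note in #5180 refers to a common way of turning a multi-dimensional location into a flat element offset. A minimal sketch, assuming a dense row-major layout, is shown below. It is not the ONNX Runtime implementation of TensorAt; `compute_strides` and `element_offset` are hypothetical names used only for this illustration.

```python
def compute_strides(dims):
    """Row-major strides (in elements) for a dense tensor with shape `dims`."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides

def element_offset(dims, location):
    """Flat offset of `location`, using one pass over precomputed strides."""
    if len(location) != len(dims):
        raise ValueError("location must have one index per dimension")
    strides = compute_strides(dims)
    offset = 0
    for idx, dim, stride in zip(location, dims, strides):
        if not 0 <= idx < dim:
            raise IndexError("index out of range for dimension")
        offset += idx * stride
    return offset
```

With the strides precomputed, each index contributes a single multiply-add, so no nested loop over the trailing dimensions is needed per index.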
edgchen1
a20f8037f6
Install ssh in builder image, fix segfault in TrainingRunnerTest.Basic. (#5186) 2020-09-16 09:53:30 -07:00
Bowen Bao
400ac85565
Improve error message for FE model export checking (#5156) 2020-09-16 09:22:37 -07:00
Changming Sun
965e2b095d
Update MCR CUDA docker image to 10.2 (#5181) 2020-09-16 09:01:31 -07:00