Tianlei Wu
3bbce69185
bump version to 1.5.1 ( #5258 )
2020-09-22 20:57:34 -07:00
Jeff Bloomfield
59e69bf35b
Handle missing initializers in allocation planner to fix crashes with DML provider ( #5244 )
...
* Fix memory planning bug with DML EP
* Address PR comments
* Fix typo
2020-09-22 19:37:07 -07:00
Ye Wang
898531f502
Fix reshape fusion crash ( #5252 )
...
* fix reshape fusion crash
* handling start_node statelessly
* fix
2020-09-22 15:04:13 -07:00
Guoyu Wang
e30530d9ea
Add java API for AddSessionConfigEntry ( #5241 )
...
* Add session option config entry API for java
* Java format
* Add extra test verification
* Address PR comments
* Update comments
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-09-22 14:51:39 -07:00
KeDengMS
8dceebda0e
[Training/Python] Add option to enable symbolic shape inference ( #5107 )
...
This change adds symbolic shape inference to ORT training, which helps static memory planning for models like BART.
2020-09-22 10:49:07 -07:00
edgchen1
14f250a4d0
Update BUILD.md training dependency info. ( #5240 )
...
Update training dependency versions based on Dockerfile.training.
2020-09-22 10:36:04 -07:00
Guoyu Wang
d957dbebea
Fix possible ios build break after update to Xcode 12 ( #5246 )
...
* Fix possible ios build break after update to Xcode 12
* Address comments
2020-09-22 07:42:54 -07:00
suffian khan
417929b049
jobs timeout ..
2020-09-21 21:51:59 -07:00
suffian khan
a6eb90472c
try fix error on code coverage ci build
2020-09-21 21:51:59 -07:00
Sherlock
1478643215
Place Shape's output in CPU memory ( #5245 )
...
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 20:21:59 -07:00
Sherlock
038192bdb2
Place shape related compute nodes in CPU ( #4940 )
...
* Place shape related nodes in CPU
* visit candidates by topological order
* Make CPU node placement a utility function
* skip placing on CPU if the data type is float16 or bfloat16
2020-09-21 17:10:39 -07:00
Changming Sun
0cb09374c6
Update BUILD.md for CUDA versions ( #5239 )
2020-09-21 15:28:53 -07:00
George Wu
3147bc00c3
update TensorRT docs ( #5238 )
...
* doc updates TensorRT
* update
* update
* fix warning
* newline
* format
2020-09-21 15:24:20 -07:00
Xueyun Zhu
55e4b5d302
add pipeline distributed training test ( #5222 )
...
* add pipeline distributed training test
* fix max line length error in windows build
* function header indent
* fix
* fix flake8 error
2020-09-21 14:35:01 -07:00
liqunfu
84c222126c
Deprecate testMNISTTrainingAndTestingOpset10 ( #4927 )
...
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 14:17:08 -07:00
Pranav Sharma
974b9bfc09
Allow sharing of initializers between sessions. ( #5092 )
...
* Allow sharing of initializers between sessions.
* Allow sharing of initializers between sessions (2).
* Add test for C#
* Add test for C#; address PR comments
* Address PR comments
Moved AddInitializer logic to internal session options
Added tests for owned buffer
Clarified documentation
Fix bug where memory info and not device was getting compared
* Fix test
* Fix training build
* Add ver 5 end marker and ver 6 starter, add scenario and usage examples.
2020-09-21 14:09:37 -07:00
Scott McKay
e0719a1073
Revert to using release SafeInt repo now that it supports a build with exceptions disabled. ( #5233 )
2020-09-22 06:29:28 +10:00
edgchen1
e9671e93f0
Fix TransposeScaleMatMul and MatMulScaleFusion issues ( #5230 )
...
- Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility
- Fix MatMulScaleFusion issues:
- Add check for supported execution providers
- Add check for supported MatMul input types
2020-09-21 12:34:01 -07:00
Ye Wang
65740deb10
Fix a bug in EmbedLayerNorm fusion ( #5150 )
...
* fix embedlayernorm bug
* review comments
* interim checkin
* review comments
* Fix core dump in MacOS
* remove unnecessary lines
* update document
* Update graph_utils.cc
* Update onnx_exporter.py
* resolve comments
2020-09-21 12:26:14 -07:00
stevenlix
aefb2cc49b
Create profile for all dynamic shape input tensors ( #5229 )
2020-09-20 05:55:21 -07:00
Tiago Koji Castro Shibata
cd663d58f5
Fix WinML warnings ( #5228 )
2020-09-19 12:41:42 -07:00
Guoyu Wang
78a29aebbc
[ORT Mobile] ORT Minimal E2E CI ( #5200 )
...
* Modify the ort minimal CI to ort minimal e2e ci
2020-09-19 18:43:22 +10:00
Dmitri Smirnov
8ee4e8226e
Preserve relative order of the results and the tests. ( #5225 )
2020-09-19 00:45:44 -07:00
Weixing Zhang
b49f6a5e2c
using GPU_WARP_SIZE to make kernel portable between AMD and Nvidia GPU ( #5173 )
2020-09-18 14:56:16 -07:00
Suffian Khan
84589c7e05
Fuse softmax(a + b) in case of simple broadcast ( #4937 )
...
* bias softmax kernel
* bias softmax kernel
* remove debug comments
* remove debug comment
* windows build doesn't handle unary minus on unsigned type
* int64 => int treated as error
* only support cuda
* add bias softmax fusion tests
* PR comments
* more PR comments
* use MLTypeCallDispatcher
* break function into pieces
* add loop unroll and add to list for inference as well
* use std::min and move operator==
* revert std::min (doesn't work in ci pipeline) and fix int to size_t error
* pr comments
* fixes for windows ci
* fix for windows ci
* pr comments on consistency
* p_model_
* fix formatting and add anonymous namespace
Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-18 14:15:55 -07:00
Tang, Cheng
e0b49844e9
Provide option to let layernorm stash mean/var as fp32 or bfloat16 ( #5215 )
...
* add option to set layernorm stash type
* bug fix
* fix merge error
* fix win build error
2020-09-18 13:42:01 -07:00
Dmitri Smirnov
a90ab12589
Refactor onnx_test_runner ( #5169 )
...
Refactor onnx_test_runner for better object ownership, code readability and maintainability.
2020-09-18 13:19:35 -07:00
Ryan Hill
13318ab0d4
Remove invalid install line ( #5219 )
2020-09-18 11:58:40 -07:00
Shucai Xiao
a632dd2d3b
Amdmigraphx improvements ( #5158 )
...
* code backup
* remove unnecessary log info
* code backup
* code backup
* merge changes from master branch
* code backup
* code backup
* merge changes from master branch
* code backup
* code backup for constant folding enhancement
* code backup
* include more scenarios for constant folding
* code backup
* remove unnecessary code
* remove unnecessary log information
* fix an error in comments
* update algorithm to do graph partition
* code backup
* remove unnecessary log information
* remove an unused function
* remove unnecessary changes
2020-09-18 11:56:50 -07:00
Weixing Zhang
f91248e0cc
remove curand_generator_ related code since it is not used. ( #5220 )
2020-09-18 11:50:35 -07:00
KeDengMS
ce3b67e0cd
[Python] Move symbolic_shape_infer from nuphar to tools ( #5162 )
...
* [Python] Move symbolic shape inference from nuphar to tools
* Fix PEP8 ERROR
2020-09-18 09:31:06 -07:00
RRRachelllll555
f7c1e51810
Remove shape inference and fix save large model(>2g) issue ( #5210 )
...
* remove shape inference and fix save large model problem
* remove unnecessary import
* refine code and add external format for quantize_qat
* remove initializers in tensors_to_calibrate
* small refine
Co-authored-by: t-yguo <t-yguo@microsoft.com>
2020-09-18 08:46:31 -07:00
Scott McKay
c46a480306
Update conversion script and process to simplify creating ORT format models and a minimal build ( #5217 )
...
* Update conversion script and process to simplify creating ORT format models and a minimal build.
2020-09-18 18:49:54 +10:00
George Wu
1b61dfaf69
fix _WIN32 ( #5218 )
2020-09-18 00:23:17 -07:00
Pranav Prakash
f5df96256c
Fix order of returned values in quantize_weight_per_channel ( #5205 )
...
Must match returned order of `quantize_inputs`
2020-09-17 17:57:46 -07:00
liqunfu
f37e1292a1
--shm-size=1024m to fix nccl shared memory issue ( #5214 )
...
* --shm-size=256m to fix nccl shared memory issue
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-17 17:21:47 -07:00
Guoyu Wang
8156e0dd10
[ORT Mobile] Some updates to iOS/Android build settings ( #5184 )
...
* Update android CI and build settings
* add build_java to arm64 also
* Add ios signing param
* fix a small build warning
* address pr comments
2020-09-17 15:53:14 -07:00
Tracy Sharpe
8698157112
NCHWc optimizer fixes for quantized models ( #5203 )
...
This updates the NCHWc transformer to not interfere with quantized convolution models, based on observations from internal models. The tensor type for MaxPool must be float. The input to GlobalAveragePool/GlobalMaxPool must be in NCHWc format.
2020-09-17 09:52:21 -07:00
Pranav Sharma
d535894297
Add API to allow configuration of the global thread pools. ( #5199 )
2020-09-17 09:19:18 -07:00
Suffian Khan
e01e0b2e40
Fix softmax_warp_backward math when is_log_softmax = True and register LogSoftmax CUDA kernel ( #5160 )
...
* register logsoftmax cuda kernel; fix logsoftmaxgrad cuda kernel; fix tests to invoke dispatch_softmax_*
* forgot to remove axis check
* add tests all axis
Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-17 07:15:25 -07:00
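The softmax_warp_backward entry above does not spell out the corrected math. As a rough illustration only, the textbook gradients for the two cases can be sketched as below. This is not the CUDA kernel from the commit; `log_softmax` and `softmax_backward` are hypothetical helper names, and the sketch assumes a single 1-D row with `y` being the forward output (softmax or log-softmax of the input).

```python
import math


def log_softmax(x):
    """Numerically stable log-softmax of a 1-D list."""
    m = max(x)
    log_sum = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - log_sum for v in x]


def softmax_backward(dy, y, is_log_softmax=False):
    """Input gradient given upstream gradient dy and forward output y."""
    if is_log_softmax:
        # y = log_softmax(x): dx_i = dy_i - exp(y_i) * sum_j dy_j
        total = sum(dy)
        return [g - math.exp(o) * total for g, o in zip(dy, y)]
    # y = softmax(x): dx_i = y_i * (dy_i - sum_j dy_j * y_j)
    total = sum(g * o for g, o in zip(dy, y))
    return [o * (g - total) for g, o in zip(dy, y)]
```

In both branches the entries of the returned gradient sum to zero (up to rounding), which is a cheap sanity check when testing either path.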
S. Manohar Karlapalem
584638e5d3
Corrects doc typos and formatting ( #5201 )
2020-09-17 01:25:19 -07:00
Zhang Lei
cd0386b649
MaxPool versioning in quantization tools. ( #5194 )
...
MaxPool versioning in quantization tools.
2020-09-16 22:52:24 -07:00
Ryan Hill
b11c106346
Remove almost all of the reinterpret_casts from the provider shared API ( #5190 )
2020-09-16 17:00:15 -07:00
Vincent Wang
c37472a1aa
Mixed Precision Transformer and Gradient Builder Refactor ( #4892 )
...
* transform mixed precision before build gradient
* resolve comments
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2020-09-17 02:44:50 +08:00
Tiago Koji Castro Shibata
f3f119a945
Use onecore umbrella lib in onecore builds ( #5182 )
...
* delayload hack
* Skip tests
* Onecore uses onecore umbrella
* Uncomment tests
* cleanup
* Disable dev mode for WinML
2020-09-16 10:46:27 -07:00
Tiago Koji Castro Shibata
1a2e289d2d
Fix nuget build ( #5163 )
...
* Fix nuget content
* Revert "Fix nuget content"
This reverts commit e2cdcec4e39964c50eac2fb306c7a4bb84352443.
* Nuget packaging
* skip tests
* msbuild path
* Force msbuild version
* Workaround https://github.com/NuGet/Home/issues/7621
* cleanup
2020-09-16 10:37:09 -07:00
Dmitri Smirnov
e6f85f338e
Refactor TensorAt, prepare for release ( #5180 )
...
* Refactor TensorAt
locations* must be const and int64_t since our dims are int64_t
Remove unnecessary copy of locations.
Remove unnecessary casting and C-casting. Simplify implementation.
Add a check for string type.
Make CXX api return T& to fully expose C API in C++, const std::vector& by value as it
covers more ground and eliminate redundant copy.
Eliminate inner loop, compute strides first.
2020-09-16 10:20:45 -07:00
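The "compute strides first" note in the TensorAt entry above can be illustrated with a short sketch. This is not the onnxruntime implementation; `row_major_strides` and `flat_offset` are hypothetical helper names, and the sketch assumes a dense row-major tensor whose dims and location indices are plain integers.

```python
def row_major_strides(dims):
    """Return element strides for a dense row-major tensor with the given dims."""
    strides = [1] * len(dims)
    # Walk from the second-to-last axis backwards, accumulating the product
    # of all trailing dimension sizes.
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides


def flat_offset(dims, location):
    """Map a multi-dimensional location to a flat element offset."""
    if len(location) != len(dims):
        raise ValueError("location rank must match tensor rank")
    for idx, dim in zip(location, dims):
        if not 0 <= idx < dim:
            raise IndexError("location out of range")
    strides = row_major_strides(dims)
    # With strides precomputed, the offset is a single dot product
    # rather than a nested loop over trailing dimensions.
    return sum(idx * stride for idx, stride in zip(location, strides))
```

For example, `flat_offset([2, 3, 4], [1, 2, 3])` combines the strides `[12, 4, 1]` with the location to address the last element of the tensor.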
edgchen1
a20f8037f6
Install ssh in builder image, fix segfault in TrainingRunnerTest.Basic. ( #5186 )
2020-09-16 09:53:30 -07:00
Bowen Bao
400ac85565
Improve error message for FE model export checking ( #5156 )
2020-09-16 09:22:37 -07:00
Changming Sun
965e2b095d
Update MCR CUDA docker image to 10.2 ( #5181 )
2020-09-16 09:01:31 -07:00