Commit graph

3442 commits

Author SHA1 Message Date
Scott McKay
b49ff6151e
Workaround issue with VS2017 compiler. (#5279)
The definitions for some Eigen classes don't get pulled in, leading to errors. Split out the broadcast function creation logic from the functions using std::enable_if to work around that.
2020-09-25 06:50:14 +10:00
KeDengMS
5a71819be6
Symbolic shape inference: fix a case for concat (#5277)
* Symbolic shape inference: fix a case when concat requires merging multiple dims

* Fix a bug triggered in newer versions of sympy
Fix a bug in output data type guessing
2020-09-24 08:16:47 -07:00
Josh Bradley
4ed31ca214
Combine custom logger global threadpools (#4857)
* add custom logger and global threadpools to C and C++ API

* code cleanup and formatting

* reformat code

* tidy up some more code formatting

* remove comment

* fix API break from merging from master

* renamed API function to CreateEnvWithCustomLoggerAndGlobalThreadPools

* rename log variable and apply clang-format
2020-09-24 00:50:26 -07:00
Dwayne Robinson
6ad39819c2
Update DirectML Nuget to 1.3.0 (#5274)
Update to 1.3.0
2020-09-23 22:53:02 -07:00
Dwayne Robinson
a4cb00b91e
Merge pull request #5273 from microsoft/user/dwayner/CmakeLinkerOptFlags
Linker opt flags - fix conflicting CMake linker flags which contradict those needed by the Windows inbox universal CRT
2020-09-23 20:08:21 -07:00
edgchen1
6d5b93b805
Synchronize training dependency versions between Docker image and Python wheel. (#5261)
Synchronize training dependency versions between Docker image and wheel, update docs, refactor build scripts.
2020-09-23 19:03:42 -07:00
Justin Stoecker
56862f4022
Add way to disable additional linker opt flags
2020-09-23 12:56:40 -07:00
Ashwini Khade
16220f3848
Add FusedMatMul contrib op (#5213)
* bug fix transformer

* fuse cpu kernel for transposescalematmul and matmul

* fuse transpose_scale_matmul cpu kernel with matmul

* fix test

* Add FusedMatMul Contrib Op

* fix test

* fix typo

* plus more updates per review
2020-09-23 12:17:50 -07:00
Guoyu Wang
fe7e7bfe60
Update build.md for building using Xcode 12 on Mac (#5256)
* update build.md

* update build.md

* Address pr comments
2020-09-23 09:23:35 -07:00
Yufeng Li
61ba5b501a
Fix bug in the back to back quantization of matmul and conv (#5264)
* fix bug in the back to back quantization of matmul and conv

* fix bug in back to back gather
2020-09-23 08:47:20 -07:00
George Wu
b5a6a8e847
remove implicit linking of tensorrt and dnnl ep shared libs (#5262)
* remove trt and dnnl from link command

* add comment
2020-09-23 05:47:18 -07:00
Dwayne Robinson
6ea66b43db
ORT DirectML EP for Iron release, ONNX 1.5 (part 2) (#5263)
* Merged PR 5195856: Fix broken cases of zero size tensors in Cast/Reduce

MaskRCNN failed when `Cast` tried to execute `Xor` on an empty tensor (a zero in its dimensions). This is perfectly legal and should be treated as a nop.

Ultimately DML itself should treat this case as a nop, just as C's `memcpy` treats a count of 0 as a nop. For now this is addressed only in ORT, because enabling it in DML would affect more operators that would need to stay consistent (a flag should probably be added to tensor validation so operators can be opted in gradually).

Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5195850

Related work items: #27469839, #28761382

* Merged PR 5201369: Remove copy of initializers added in DMLXP refactor

When used in ORT, a common method shouldn't copy and return initializer data

Related work items: #29514403

Co-authored-by: Justin Stoecker <justoeck@microsoft.com>
Co-authored-by: Jeff Bloomfield <jeffbloo@microsoft.com>
2020-09-23 01:56:19 -07:00
Hariharan Seshadri
75d994f194
Handle zero norm values in LpNormalization CPU kernel (#5251)
2020-09-22 22:01:09 -07:00
Adam Pocock
d26c71f55c
[java] Fixing the buffer semantics. (#5223)
* [java] Fixing the buffer semantics.
* Renaming bufferCapacity to bufferRemaining.
* Adding a cast to char* so the pointer arithmetic works on Windows.
2020-09-22 21:29:01 -07:00
Scott McKay
c52561d044
Rework broadcasting setup to decrease binary size. (#5227)
* Rework broadcasting setup to decrease binary size. Push all the type-specific code down and separate out the broadcasting/parallelization.

Reductions:
element_wise_ops: 521.0 KB -> 268.8 KB
where: 25.8 KB -> 17.3 KB
qlinear_binary_op: 28.1 KB -> 12.8 KB
2020-09-23 14:15:40 +10:00
Changming Sun
43faf9e388
Disable a few tests that run too long (1 hour) in debug mode (#5257)
2020-09-22 21:06:24 -07:00
Tianlei Wu
3bbce69185
bump version to 1.5.1 (#5258)
2020-09-22 20:57:34 -07:00
Jeff Bloomfield
59e69bf35b
Handle missing initializers in allocation planner to fix crashes with DML provider (#5244)
* Fix memory planning bug with DML EP

* Address PR comments

* Fix typo
2020-09-22 19:37:07 -07:00
Ye Wang
898531f502
Fix reshape fusion crash (#5252)
* fix reshape fusion crash

* handling start_node statelessly

* fix
2020-09-22 15:04:13 -07:00
Guoyu Wang
e30530d9ea
Add java API for AddSessionConfigEntry (#5241)
* Add session option config entry API for java

* Java format

* Add extra test verification

* Address PR comments

* Update comments

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-09-22 14:51:39 -07:00
KeDengMS
8dceebda0e
[Training/Python] Add option to enable symbolic shape inference (#5107)
This change adds symbolic shape inference to ORT training, which helps static memory planning for models like BART.
2020-09-22 10:49:07 -07:00
edgchen1
14f250a4d0
Update BUILD.md training dependency info. (#5240)
Update training dependency versions based on Dockerfile.training.
2020-09-22 10:36:04 -07:00
Guoyu Wang
d957dbebea
Fix possible ios build break after update to Xcode 12 (#5246)
* Fix possible ios build break after update to Xcode 12

* Address comments
2020-09-22 07:42:54 -07:00
suffian khan
417929b049
jobs timeout ..
2020-09-21 21:51:59 -07:00
suffian khan
a6eb90472c
try fix error on code coverage ci build
2020-09-21 21:51:59 -07:00
Sherlock
1478643215
Place Shape's output in CPU memory (#5245)
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 20:21:59 -07:00
Sherlock
038192bdb2
Place shape related compute nodes in CPU (#4940)
* Place shape related nodes in CPU
* visit candidates by topological order
* Make CPU node placement a utility function
* skip placing on CPU if the data type is float16 or bfloat16
2020-09-21 17:10:39 -07:00
Changming Sun
0cb09374c6
Update BUILD.md for CUDA versions (#5239)
2020-09-21 15:28:53 -07:00
George Wu
3147bc00c3
update TensorRT docs (#5238)
* doc updates TensorRT

* update

* update

* fix warning

* newline

* format
2020-09-21 15:24:20 -07:00
Xueyun Zhu
55e4b5d302
add pipeline distributed training test (#5222)
* add pipeline distributed training test

* fix max line length error in windows build

* function header indent

* fix

* fix flake8 error
2020-09-21 14:35:01 -07:00
liqunfu
84c222126c
Deprecate testMNISTTrainingAndTestingOpset10 (#4927)
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 14:17:08 -07:00
Pranav Sharma
974b9bfc09
Allow sharing of initializers between sessions. (#5092)
* Allow sharing of initializers between sessions.

* Allow sharing of initializers between sessions (2).

* Add test for C#

* Add test for C#; address PR comments

* Address PR comments
Moved AddInitializer logic to internal session options
Added tests for owned buffer
Clarified documentation
Fix bug where memory info and not device was getting compared

* Fix test

* Fix training build

* Add ver 5 end marker and ver 6 starter, add scenario and usage examples.
2020-09-21 14:09:37 -07:00
Scott McKay
e0719a1073
Revert to using release SafeInt repo now that it supports a build with exceptions disabled. (#5233)
2020-09-22 06:29:28 +10:00
edgchen1
e9671e93f0
Fix TransposeScaleMatMul and MatMulScaleFusion issues (#5230)
- Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility
- Fix MatMulScaleFusion issues:
  - Add check for supported execution providers
  - Add check for supported MatMul input types
2020-09-21 12:34:01 -07:00
Ye Wang
65740deb10
Fix a bug in EmbedLayerNorm fusion (#5150)
* fix embedlayernorm bug

* review comments

* interim checkin

* review comments

* Fix core dump on macOS

* remove unnecessary lines

* update document

* Update graph_utils.cc

* Update onnx_exporter.py

* resolve comments
2020-09-21 12:26:14 -07:00
stevenlix
aefb2cc49b
Create profile for all dynamic shape input tensors (#5229)
2020-09-20 05:55:21 -07:00
Tiago Koji Castro Shibata
cd663d58f5
Fix WinML warnings (#5228)
2020-09-19 12:41:42 -07:00
Guoyu Wang
78a29aebbc
[ORT Mobile] ORT Minimal E2E CI (#5200)
* Modify the ort minimal CI to ort minimal e2e ci
2020-09-19 18:43:22 +10:00
Dmitri Smirnov
8ee4e8226e
Preserve relative order of the results and the tests. (#5225)
2020-09-19 00:45:44 -07:00
Weixing Zhang
b49f6a5e2c
using GPU_WARP_SIZE to make kernel portable between AMD and Nvidia GPU (#5173)
2020-09-18 14:56:16 -07:00
Suffian Khan
84589c7e05
Fuse softmax(a + b) in case of simple broadcast (#4937)
* bias softmax kernel

* bias softmax kernel

* remove debug comments

* remove debug comment

* windows build doesn't handle unary minus on unsigned type

* int64 => int treated as error

* only support cuda

* add bias softmax fusion tests

* PR comments

* more PR comments

* use MLTypeCallDispatcher

* break function into pieces

* add loop unroll and add to list for inference as well

* use std::min and move operator==

* revert std::min (doesn't work in ci pipeline) and fix int to size_t error

* pr comments

* fixes for windows ci

* fix for windows ci

* pr comments on consistency

* p_model_

* fix formatting and add anonymous namespace

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-18 14:15:55 -07:00
Tang, Cheng
e0b49844e9
Provide option to let layernorm stash mean/var as fp32 or bfloat16 (#5215)
* add option to set layernorm stash type

* bug fix

* fix merge error

* fix win build error
2020-09-18 13:42:01 -07:00
Dmitri Smirnov
a90ab12589
Refactor onnx_test_runner (#5169)
Refactor onnx_test_runner for better object ownership, code readability and maintainability.
2020-09-18 13:19:35 -07:00
Ryan Hill
13318ab0d4
Remove invalid install line (#5219)
2020-09-18 11:58:40 -07:00
Shucai Xiao
a632dd2d3b
Amdmigraphx improvements (#5158)
* code backup

* remove unnecessary log info

* code backup

* code backup

* merge changes from master branch

* code backup

* code backup

* merge changes from master branch

* code backup

* code backup for constant folding enhancement

* code backup

* include more scenarios for constant folding

* code backup

* remove unnecessary code

* remove unnecessary log information

* fix an error in comments

* update algorithm to do graph partition

* code backup

* remove unnecessary log information

* remove an unused function

* remove unnecessary changes
2020-09-18 11:56:50 -07:00
Weixing Zhang
f91248e0cc
remove curand_generator_ related code since it is not used. (#5220)
2020-09-18 11:50:35 -07:00
KeDengMS
ce3b67e0cd
[Python] Move symbolic_shape_infer from nuphar to tools (#5162)
* [Python] Move symbolic shape inference from nuphar to tools

* Fix PEP8 ERROR
2020-09-18 09:31:06 -07:00
RRRachelllll555
f7c1e51810
Remove shape inference and fix save large model(>2g) issue (#5210)
* remove shape inference and fix save large model problem

* remove unnecessary import

* refine code and add external format for quantize_qat

* remove initializers in tensors_to_calibrate

* small refine

Co-authored-by: t-yguo <t-yguo@microsoft.com>
2020-09-18 08:46:31 -07:00
Scott McKay
c46a480306
Update conversion script and process to simplify creating ORT format models and a minimal build (#5217)
* Update conversion script and process to simplify creating ORT format models and a minimal build.
2020-09-18 18:49:54 +10:00
George Wu
1b61dfaf69
fix _WIN32 (#5218)
2020-09-18 00:23:17 -07:00