Commit graph

3451 commits

Author SHA1 Message Date
Tang, Cheng
d9ecc0cebf
add bert loss legacy back (#5224) 2020-09-27 13:41:16 -07:00
George Wu
16d35266ab
add install targets for ep shared libs (#5286) 2020-09-25 07:10:43 -07:00
Guoyu Wang
3a3f26f38e
Move ort flatbuffers helper functions and value info r/w functions into separated lib (#5276)
* Move fbs include from header to cc

* add initial cmake for flatbuffers

* Move most flatbuffers util to ort_flatbuffers

* move code around

* fix

* move test/perf runner to use flatbuffer directly instead of model

* minor update

* Fix build break

* Clean up includes and forward decl

* Fix training CI build breaks

* Addressed PR comment, replaced some include with forward decls

* Remove ORT_MUST_USE_RESULT temporarily
2020-09-25 05:36:29 -07:00
Changming Sun
17f1178c2e
Downgrade GCC (#5269)
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2020-09-24 21:14:54 -07:00
Sherlock
b03fb82ab7
Transformer layer-wise Recompute (#4526)
* Build Recomputation Graph

* Make topological sort to run FW nodes first

* Pattern match start and end of transformer layer

* Topological sort with Priority

* Add logger to Gradient Graph Builder

* Use Logger

* Introduce Execution Order
2020-09-24 19:56:32 -07:00
Faith Xu
b6e71200eb
Add additional tutorial links (#5272) 2020-09-24 17:27:58 -07:00
Dmitri Smirnov
89742411ec
Insert telemetry template into GPU build, add telemetry build switches. (#5278) 2020-09-24 17:13:09 -07:00
Thiago Crepaldi
ebeeff22dd
Update PyTorch TransformerModel sample (#5275) 2020-09-24 16:28:07 -07:00
Ryan Lai
71b52ad5de
Fix inbox telemetry (#5265)
* ifdef to check if redist or not

* Fix redist telemetry

Co-authored-by: Ryan Lai <ryalai96@gamil.com>
2020-09-24 14:58:07 -07:00
Scott McKay
b49ff6151e
Workaround issue with VS2017 compiler. (#5279)
The definitions for some Eigen classes don't get pulled in, leading to errors. Split out the broadcast function creation logic from the functions using std::enable_if to work around that.
2020-09-25 06:50:14 +10:00
KeDengMS
5a71819be6
Symbolic shape inference: fix a case for concat (#5277)
* Symbolic shape inference: fix a case when concat requires merging multiple dims

* Fix a bug triggered in newer version of sympy
Fix a bug in output data type guessing
2020-09-24 08:16:47 -07:00
Josh Bradley
4ed31ca214
Combine custom logger global threadpools (#4857)
* add custom logger and global threadpools to C and C++ API

* code cleanup and formatting

* reformat code

* tidy up some more code formatting

* remove comment

* fix API break from merging from master

* renamed API function to CreateEnvWithCustomLoggerAndGlobalThreadPools

* rename log variable and apply clang-format
2020-09-24 00:50:26 -07:00
Dwayne Robinson
6ad39819c2
Update DirectML Nuget to 1.3.0 (#5274)
Update to 1.3.0
2020-09-23 22:53:02 -07:00
Dwayne Robinson
a4cb00b91e
Merge pull request #5273 from microsoft/user/dwayner/CmakeLinkerOptFlags
Linker opt flags - fix conflicting CMake linker flags which contradict those needed by the Windows inbox universal CRT
2020-09-23 20:08:21 -07:00
edgchen1
6d5b93b805
Synchronize training dependency versions between Docker image and Python wheel. (#5261)
Synchronize training dependency versions between Docker image and wheel, update docs, refactor build scripts.
2020-09-23 19:03:42 -07:00
Justin Stoecker
56862f4022 Add way to disable additional linker opt flags 2020-09-23 12:56:40 -07:00
Ashwini Khade
16220f3848
Add FusedMatMul contrib op (#5213)
* bug fix transformer

* fuse cpu kernel for transposescalematmul and matmul

* fuse transpose_scale_matmul cpu kernel with matmul

* fix test

* Add FusedMatMul Contrib Op

* fix test

* fix typo

* plus more updates per review
2020-09-23 12:17:50 -07:00
Guoyu Wang
fe7e7bfe60
Update build.md for building using Xcode 12 on Mac (#5256)
* update build.md

* update build.md

* Address pr comments
2020-09-23 09:23:35 -07:00
Yufeng Li
61ba5b501a
Fix bug in the back to back quantization of matmul and conv (#5264)
* fix bug in the back to back quantization of matmul and conv

* fix bug in back to back gather
2020-09-23 08:47:20 -07:00
George Wu
b5a6a8e847
remove implicit linking of tensorrt and dnnl ep shared libs (#5262)
* remove trt and dnnl from link command

* add comment
2020-09-23 05:47:18 -07:00
Dwayne Robinson
6ea66b43db
ORT DirectML EP for Iron release, ONNX 1.5 (part 2) (#5263)
* Merged PR 5195856: Fix broken cases of zero size tensors in Cast/Reduce

MaskRCNN failed when `Cast` tried to execute `Xor` on an empty tensor (a zero in its dimensions). This is perfectly legal and should be treated as a nop.

Ultimately DML itself should treat this case as a nop, just like how C's `memcpy` treats 0 count as a nop, but I'm just addressing it in ORT now, as enabling it in DML would impact more operators to be consistent (probably should incrementally add a flag to tensor validation so operators can be opted in gradually).

Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5195850

Related work items: #27469839, #28761382

* Merged PR 5201369: Remove copy of initializers added in DMLXP refactor

When used in ORT, a common method shouldn't copy and return initializer data

Related work items: #29514403

Co-authored-by: Justin Stoecker <justoeck@microsoft.com>
Co-authored-by: Jeff Bloomfield <jeffbloo@microsoft.com>
2020-09-23 01:56:19 -07:00
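
The zero-size handling described in PR 5195856 above can be pictured with a minimal Python sketch. This is not the DirectML EP code: `element_count` and `elementwise_xor` are hypothetical names, and tensors are modelled as flat lists plus a shape. It only illustrates the idea that an operator checks for a zero anywhere in the shape and returns an empty result without doing any work, much as `memcpy` does nothing for a count of 0.

```python
from math import prod
from typing import List, Sequence


def element_count(shape: Sequence[int]) -> int:
    """Number of elements in a tensor with the given shape."""
    return prod(shape)


def elementwise_xor(a: List[bool], b: List[bool], shape: Sequence[int]) -> List[bool]:
    """Hypothetical elementwise Xor over two same-shaped flat tensors."""
    count = element_count(shape)
    # Any zero dimension means there is nothing to compute: treat it as a nop.
    if count == 0:
        return []
    if len(a) != count or len(b) != count:
        raise ValueError("input length does not match shape")
    return [x != y for x, y in zip(a, b)]


# A shape with a zero dimension is legal and produces an empty result.
empty = elementwise_xor([], [], (3, 0, 2))
# A non-empty tensor goes through the normal path.
full = elementwise_xor([True, False], [True, True], (2,))
```

Putting the check at the top of the operator keeps the empty case out of the main compute path, which matches the commit's choice to handle it in ORT rather than changing DML for every operator.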
Hariharan Seshadri
75d994f194
Handle zero norm values in LpNormalization CPU kernel (#5251) 2020-09-22 22:01:09 -07:00
Adam Pocock
d26c71f55c
[java] Fixing the buffer semantics. (#5223)
* [java] Fixing the buffer semantics.
* Renaming bufferCapacity to bufferRemaining.
* Adding a cast to char* so the pointer arithmetic works on Windows.
2020-09-22 21:29:01 -07:00
Scott McKay
c52561d044
Rework broadcasting setup to decrease binary size. (#5227)
* Rework broadcasting setup to decrease binary size. Push all the type specific down and separate out the broadcasting/parallelization.

Reductions:
element_wise_ops: 521.0 KB -> 268.8 KB
where: 25.8 KB -> 17.3 KB
qlinear_binary_op: 28.1 -> 12.8
2020-09-23 14:15:40 +10:00
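
As a worked reading of the reductions listed in the commit above, the short Python sketch below turns each before/after pair into an absolute and relative saving. The sizes are copied from the commit message. The last entry carries no unit there, so treating it as KB is an assumption, and `percent_saved` is a helper name introduced here for illustration.

```python
# Sizes copied from the commit message as (before, after).
# Assumption: qlinear_binary_op is also in KB; the message gives no unit.
reductions = {
    "element_wise_ops": (521.0, 268.8),
    "where": (25.8, 17.3),
    "qlinear_binary_op": (28.1, 12.8),
}


def percent_saved(before: float, after: float) -> float:
    """Relative size reduction as a percentage of the original size."""
    return (before - after) / before * 100.0


for name, (before, after) in reductions.items():
    saved = before - after
    print(f"{name}: {saved:.1f} KB saved ({percent_saved(before, after):.1f}%)")
```

Under that assumption the savings come to roughly 48%, 33%, and 54% of the original sizes respectively.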
Changming Sun
43faf9e388
Disable a few tests that run too long (1 hour) in debug mode (#5257) 2020-09-22 21:06:24 -07:00
Tianlei Wu
3bbce69185
bump version to 1.5.1 (#5258) 2020-09-22 20:57:34 -07:00
Jeff Bloomfield
59e69bf35b
Handle missing initializers in allocation planner to fix crashes with DML provider (#5244)
* Fix memory planning bug with DML EP

* Address PR comments

* Fix typo
2020-09-22 19:37:07 -07:00
Ye Wang
898531f502
Fix reshape fusion crash (#5252)
* fix reshape fusion crash

* handling start_node statelessly

* fix
2020-09-22 15:04:13 -07:00
Guoyu Wang
e30530d9ea
Add java API for AddSessionConfigEntry (#5241)
* Add session option config entry API for java

* Java format

* Add extra test verification

* Address PR comments

* Update comments

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-09-22 14:51:39 -07:00
KeDengMS
8dceebda0e
[Training/Python] Add option to enable symbolic shape inference (#5107)
This change adds symbolic shape inference to ORT training, which helps static memory planning for models like BART.
2020-09-22 10:49:07 -07:00
edgchen1
14f250a4d0
Update BUILD.md training dependency info. (#5240)
Update training dependency versions based on Dockerfile.training.
2020-09-22 10:36:04 -07:00
Guoyu Wang
d957dbebea
Fix possible ios build break after update to Xcode 12 (#5246)
* Fix possible ios build break after update to Xcode 12

* Address comments
2020-09-22 07:42:54 -07:00
suffian khan
417929b049 jobs timeout .. 2020-09-21 21:51:59 -07:00
suffian khan
a6eb90472c try fix error on code coverage ci build 2020-09-21 21:51:59 -07:00
Sherlock
1478643215
Place Shape's output in CPU memory (#5245)
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 20:21:59 -07:00
Sherlock
038192bdb2
Place shape related compute nodes in CPU (#4940)
* Place shape related nodes in CPU
* visit candidates by topological order
* Make CPU node placement a utility function
* skip placing on CPU if the data type is float16 or bfloat16
2020-09-21 17:10:39 -07:00
Changming Sun
0cb09374c6
Update BUILD.md for CUDA versions (#5239) 2020-09-21 15:28:53 -07:00
George Wu
3147bc00c3
update TensorRT docs (#5238)
* doc updates TensorRT

* update

* update

* fix warning

* newline

* format
2020-09-21 15:24:20 -07:00
Xueyun Zhu
55e4b5d302
add pipeline distributed training test (#5222)
* add pipeline distributed training test

* fix max line length error in windows build

* function header indent

* fix

* fix flake8 error
2020-09-21 14:35:01 -07:00
liqunfu
84c222126c
Deprecate testMNISTTrainingAndTestingOpset10 (#4927)
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-21 14:17:08 -07:00
Pranav Sharma
974b9bfc09
Allow sharing of initializers between sessions. (#5092)
* Allow sharing of initializers between sessions.

* Allow sharing of initializers between sessions (2).

* Add test for C#

* Add test for C#; address PR comments

* Address PR comments
Moved AddInitializer logic to internal session options
Added tests for owned buffer
Clarified documentation
Fix bug where memory info and not device was getting compared

* Fix test

* Fix training build

* Add ver 5 end marker and ver 6 starter, add scenario and usage examples.
2020-09-21 14:09:37 -07:00
Scott McKay
e0719a1073
Revert to using release SafeInt repo now that it supports a build with exceptions disabled. (#5233) 2020-09-22 06:29:28 +10:00
edgchen1
e9671e93f0
Fix TransposeScaleMatMul and MatMulScaleFusion issues (#5230)
- Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility
- Fix MatMulScaleFusion issues:
  - Add check for supported execution providers
  - Add check for supported MatMul input types
2020-09-21 12:34:01 -07:00
Ye Wang
65740deb10
Fix a bug in EmbedLayerNorm fusion (#5150)
* fix embedlayernorm bug

* review comments

* interim checkin

* review comments

* Fix core dump in MacOS

* remove unnecessary lines

* update document

* Update graph_utils.cc

* Update onnx_exporter.py

* resolve comments
2020-09-21 12:26:14 -07:00
stevenlix
aefb2cc49b
Create profile for all dynamic shape input tensors (#5229) 2020-09-20 05:55:21 -07:00
Tiago Koji Castro Shibata
cd663d58f5
Fix WinML warnings (#5228) 2020-09-19 12:41:42 -07:00
Guoyu Wang
78a29aebbc
[ORT Mobile] ORT Minimal E2E CI (#5200)
* Modify the ort minimal CI to ort minimal e2e ci
2020-09-19 18:43:22 +10:00
Dmitri Smirnov
8ee4e8226e
Preserve relative order of the results and the tests. (#5225) 2020-09-19 00:45:44 -07:00
Weixing Zhang
b49f6a5e2c
using GPU_WARP_SIZE to make kernel portable between AMD and Nvidia GPU (#5173) 2020-09-18 14:56:16 -07:00
Suffian Khan
84589c7e05
Fuse softmax(a + b) in case of simple broadcast (#4937)
* bias softmax kernel

* bias softmax kernel

* remove debug comments

* remove debug comment

* windows build doesn't handle unary minus on unsigned type

* int64 => int treated as error

* only support cuda

* add bias softmax fusion tests

* PR comments

* more PR comments

* use MLTypeCallDispatcher

* break function into pieces

* add loop unroll and add to list for inference as well

* use std::min and move operator==

* revert std::min (doesn't work in ci pipeline) and fix int to size_t error

* pr comments

* fixes for windows ci

* fix for windows ci

* pr comments on consistency

* p_model_

* fix formatting and add anonymous namespace

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-18 14:15:55 -07:00
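
The idea behind fusing softmax(a + b) in the commit above can be sketched in plain Python. This is not the CUDA kernel from the PR. It assumes the simplest broadcast case, where a 1-D bias `b` is added to every row of a 2-D input `a`, and the names `add_then_softmax` and `bias_softmax` are hypothetical. The unfused version builds the whole `a + b` tensor first. The fused version adds the bias while it reads each row, so the full intermediate tensor never exists.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def add_then_softmax(a: Matrix, b: List[float]) -> Matrix:
    """Unfused reference: materialise a + b, then apply softmax per row."""
    summed = [[x + y for x, y in zip(row, b)] for row in a]
    return [softmax(row) for row in summed]


def bias_softmax(a: Matrix, b: List[float]) -> Matrix:
    """Fused sketch: add the broadcast bias on the fly, row by row,
    without building the full a + b tensor."""
    out = []
    for row in a:
        peak = max(x + y for x, y in zip(row, b))
        exps = [math.exp(x + y - peak) for x, y in zip(row, b)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


a = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]
b = [0.1, 0.2, -0.3]
reference = add_then_softmax(a, b)
fused = bias_softmax(a, b)
```

Both functions should agree to within floating-point tolerance. The benefit of the fused form on a GPU is one kernel launch and no intermediate buffer, which this CPU sketch can only hint at.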