3235 commits

Author SHA1 Message Date
George Wu
bca9ccb1b3
add install sec updates (#4957) 2020-08-31 18:13:02 -07:00
Xueyun Zhu
1e1f5a9c79
support data parallel + pipeline parallel (#4648)
* enable data + pipeline parallel

* distributed group calculation

* fix typo

* fix test and minor changes
2020-08-31 17:32:03 -07:00
Thiago Crepaldi
9817b8c8a7
Fix state_dict/checkpoint issue introduced by #4639 (#4984)
https://github.com/microsoft/onnxruntime/pull/4639 changed the default
behavior by removing optimizer state from state_dict/checkpoint APIs.
The reason for the previous change was to allow models trained on ORT to
be used for inference on PyTorch, which is an important feature.

Because of that change, when resuming training from a checkpoint,
the optimizer would start with random weights, leading to poor performance.
This behavior would also cause reproducibility issues, as the optimizer
wouldn't be able to resume from its previous state.

This PR adds a boolean flag to the state_dict/save_checkpoint APIs that,
when True (default), saves both model and optimizer state.
When False, only the model state is kept.
2020-08-31 17:00:14 -07:00
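The flag described in #4984 can be pictured with a minimal sketch. `ToyTrainer`, its state keys, and the parameter name `include_optimizer_state` are illustrative assumptions here, not the actual ORT API; only the default-True behavior comes from the commit message.

```python
class ToyTrainer:
    """Hypothetical stand-in for a trainer object; not the ORT API."""

    def __init__(self):
        # Made-up state for illustration only.
        self.model_state = {"linear.weight": [0.1, 0.2]}
        self.optimizer_state = {"Moment_1_linear.weight": [0.0, 0.0], "Step": 10}

    def state_dict(self, include_optimizer_state=True):
        # The model weights are always returned.
        state = dict(self.model_state)
        if include_optimizer_state:
            # Keep optimizer tensors so training can resume from where it stopped.
            state.update(self.optimizer_state)
        return state


trainer = ToyTrainer()
full = trainer.state_dict()                                        # resume training
weights_only = trainer.state_dict(include_optimizer_state=False)   # export for inference
```

With the default, a checkpoint carries what the optimizer needs to resume; passing False yields a weights-only dictionary of the kind that could be loaded elsewhere for inference.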
Ashwini Khade
8679a7244e
Enable rejecting models based on onnx opset (#4912)
* enable rejecting models based on onnx opset

* enable unreleased opsets in linux and mac CI

* test fixes and more updates

* enable unreleased opsets in CI builds

* enable released opsets in linux cis

* try fix windows ci yml

* yml fixes

* update yml

* yml updates post master merge

* review comments

* bug fix
2020-08-31 13:35:36 -07:00
Sherlock
50c610e70a
Stop Gradient at Shape op (#4983) 2020-08-31 13:13:17 -07:00
Faith Xu
7af052fd62
Add CI status badges for Training builds (#4951)
* Add CI status badges for Training builds

* Fix links
2020-08-31 12:10:38 -07:00
M. Zeeshan Siddiqui
6d9d252bc3
Disable NegativeLogLikelihoodLoss_LargeSizeTensor test (#4979)
Disabling this test until its intermittent failure is root-caused. This is a function and does not have a dedicated op by itself. However, to the best of my knowledge this op is not used in any known model, so disabling this test for the sanity of CI until the investigation is over is probably reasonable.
2020-08-31 11:02:07 -07:00
edgchen1
b41e5e88fb
Add more node debug dump functionality. (#4921)
Add ability to dump node inputs/outputs to files, filter nodes, configure behavior with environment variables.
2020-08-31 10:17:23 -07:00
Sherlock
98f7fdd7da
Handle MatmulGradient with 2D Weight at B (#4977) 2020-08-30 22:56:33 -07:00
Changming Sun
bac41969be
update (#4948) 2020-08-29 19:05:07 -07:00
Hariharan Seshadri
64d52ae47d
Support creating sessions using DML EP via C# (#4955) 2020-08-29 15:18:50 -07:00
Hariharan Seshadri
7080e485a3
Handle upper-cased subscript labels in Einsum (#4964) 2020-08-29 15:18:21 -07:00
Dwayne Robinson
f4b057b098
Fix DML License in nuget package (#4969) 2020-08-29 00:02:01 -07:00
gwang-msft
ea5732319e
Add option ORT_NO_EXCEPTIONS to disable most exception/throw in /onnxruntime/ (#4894)
* init no exception changes

* initial test

* disable exceptions

* more throw handling

* minor update

* fix linux build break

* fix windows/nuphar build break

* address cr comments, move #ifdef to ORT_CATCH

* address cr comments, move #ifdef to ORT_CATCH

* handle return statement in ORT_CATCH

* linux build break fix

* addressed cr comments, remove ort_catch_end

* addressed cr comments, remove ort_catch_end

* move mlas to a separated ifdef flag

* merge master, move some new code in master to no_exc

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-08-28 23:03:51 -07:00
Brian Martin
655ffd5d5b
make (de)tensorization events measure level events (#4958)
* make tensorizer events measures

* throttle the events and add a new one SoftwareBitmapToGPUTensorTelemetryEvent

* factor out timing code into a class

* typo

* typo

* move eventimer class into its own header file

* add throttling to detensorization and remove variable timing

* make detensorization events measures as well

* add ConvertGPUTensorToSoftwareBitmapTelemetryEvent event

* de-duplicate event names

* fix comment

* PR feedback
2020-08-28 16:49:32 -07:00
Thiago Crepaldi
cd0f2fb48c
Add code owners for pytorch frontend (#4963) 2020-08-28 15:57:52 -07:00
Hariharan Seshadri
7045910d10
Support RegisterCustomOpsLibrary via the Python API (#4764) 2020-08-28 13:24:29 -07:00
Dwayne Robinson
040c5fa3e0
Merge pull request #4925 from microsoft/user/dwayner/Iron
ORT DirectML EP for Iron release, ONNX 1.5
2020-08-28 12:28:30 -07:00
Wei-Sheng Chin
1281ff6462
Put operators in-between Wait and Record (#4916) 2020-08-28 11:44:54 -07:00
Hariharan Seshadri
b945225de3
Include DirectML pdb in x86 bin folder (#4953) 2020-08-28 11:29:26 -07:00
Changming Sun
c37fa7c278
Delete Dockerfile.centos6_gpu (#4851) 2020-08-28 09:56:52 -07:00
Brian Martin
39382dc6c3
Update winrt_api.md to address the 1.4 release (#4946) 2020-08-28 08:05:22 -07:00
Dwayne Robinson
79429c934b Update 2020-08-27 21:01:19 -07:00
Ori Levari
a7ce5b2be1
fix comment and casing of telemetry fields for named dimension overrides (#4943)
Co-authored-by: Ori Levari <orlevari@microsoft.com>
2020-08-27 17:30:56 -07:00
Ye Wang
dfb9d97ddf
Support DistilBert's Attention fusion in Optimizer (#4748)
* checkin

* attention fusion

* attention work under layernorm, still need refine

* embedlayernorm(have problems with graph.Resolve())

* some fix

* update: attention works but onnx results in protobuf parsing failed

* tested by optimizer

* add embedlayer fusion test

* add attention fusion test

* clean code, need refactor later

* clean code

* added reshape fusion for distilbert, modified attention, added tests

* refactor

* small fix

* remove unnecessary lines

* fix reshape and modify attention

* resolving conflicts

* restore

* refactor and review partial comments

* refactor attention

* small fix

* fix inf compare

* match new pattern for attention fusion

* formatting

* attention does not depend on transposescalematmul

* fix

* review comments

* revert changes

* review comments

* small fix
2020-08-27 17:00:30 -07:00
George Wu
e6b6736e48
update cuda capabilities (#4936) 2020-08-27 16:38:18 -07:00
Tang, Cheng
efdd96595f
bfloat16 and opset13 related fix (#4913)
* register part of opset13 cpu kernels; fix a bug in func impl; adjust reshapefusion order

* remove useless function

Co-authored-by: Cheng Tang <chenta@microsoft.com>
2020-08-27 16:10:53 -07:00
Dwayne Robinson
f68d5263b7 Merged PR 5100436: EinSum ONNX 1.7 (opset 12) ORT DML EP kernel
Adds EinSum operator (purely an EP kernel, not a dedicated DML operator), which takes an equation string and depending on the specifics is capable of representing: identity, diag, trace, transpose, reduce sum, dot product, matmul, elementwise multiplication, inner product, outer product.

The DML EP recognizes many of them (identity, transpose, reduce sum, 1D dot product, matmul, elementwise multiplication), but defers to CPU when not supported (extended inner product, outer product, diag, trace, arbitrary batch ellipsis).

https://github.com/onnx/onnx/blob/master/docs/Operators.md#Einsum

WindowsAI PR: https://microsoft.visualstudio.com/DefaultCollection/WindowsAI/_git/WindowsAI/pullrequest/5100608

Related work items: #27469790
2020-08-27 22:10:14 +00:00
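To make the equation-string forms listed in the EinSum commit concrete, here is a small pure-Python reference sketch. It is an illustration of Einsum semantics only, not the DML EP kernel; the helper names are hypothetical, and it assumes an explicit `->` output and no ellipsis.

```python
from itertools import product


def shape_of(x):
    """Return the shape of a rectangular nested list."""
    shape = []
    while isinstance(x, list):
        shape.append(len(x))
        x = x[0]
    return shape


def get_at(x, idx):
    """Index a nested list with a sequence of integer indices."""
    for i in idx:
        x = x[i]
    return x


def einsum(equation, *operands):
    """Tiny reference einsum: explicit '->' output, no ellipsis."""
    lhs, out = equation.replace(" ", "").split("->")
    terms = lhs.split(",")
    sizes = {}
    for term, op in zip(terms, operands):
        for label, dim in zip(term, shape_of(op)):
            if sizes.setdefault(label, dim) != dim:
                raise ValueError(f"size mismatch for label {label!r}")
    # Labels that do not appear in the output are summed over.
    summed = [label for label in sizes if label not in out]

    def value(out_idx):
        env = dict(zip(out, out_idx))
        total = 0
        for combo in product(*(range(sizes[l]) for l in summed)):
            env.update(zip(summed, combo))
            term_product = 1
            for term, op in zip(terms, operands):
                term_product *= get_at(op, [env[l] for l in term])
            total += term_product
        return total

    def build(prefix, labels):
        if not labels:
            return value(prefix)
        return [build(prefix + (i,), labels[1:]) for i in range(sizes[labels[0]])]

    return build((), out)


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
matmul = einsum("ij,jk->ik", a, b)          # matrix multiplication
transpose = einsum("ij->ji", a)             # transpose
trace = einsum("ii->", a)                   # trace
diag = einsum("ii->i", a)                   # diagonal
outer = einsum("i,j->ij", [1, 2], [3, 4])   # outer product
dot = einsum("i,i->", [1, 2], [3, 4])       # 1D dot product
```

Because labels are treated as plain characters, the same equation string can express every operation the commit message names; a real kernel would pattern-match common equations instead of looping over all index combinations.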
Nick Feeney
b5c765c76b Merged PR 5103319: 8d Update
Required changes for 8D scatter and gather

Related work items: #27678554
2020-08-27 21:28:02 +00:00
Brian Martin
970ddd56a7
Fix typo in contributing.md (#4939)
committments -> commitments
2020-08-27 14:01:36 -07:00
Sherlock
9f5d4918dc
MatMul Gradient optimization for dB when B is a 2D tensor (#4899)
* Optimized MatMulGrad for dB when B's shape is 2D

* Refactor for ConstantScalarNode

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-27 11:33:20 -07:00
Sheil Kumar
6dc85b5f14
wstring_convert std::codecvt_utf8 adds ~200KB to inbox windows.ai.machinelearning.dll binary size (#4932)
* switch to UTF8FromHString

* remove extra c_str

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-08-27 10:07:10 -07:00
Dmitri Smirnov
2b460eaeca
Revise IDisposable implementation in C# interfaces (#4915)
Revise IDisposable implementation in C# interfaces
2020-08-27 09:17:42 -07:00
Scott McKay
08eb15068c
Exclude the Map types from the build if ML ops are disabled. (#4908)
* Exclude the Map types from the build if ML ops are disabled. They're the only ops that use Map.
2020-08-27 17:48:12 +10:00
Ye Wang
792ed44537
Support EmbedLayerNorm fusion for DistilBert (#4928)
* checkin embedlayernorm fusion for distilbert

* move function from optimizer_utils

* review comments
2020-08-26 21:46:31 -07:00
harshithapv
00fe718264
Fix divide-by-zero for SSCE kernel when normalize factor is zero. (#4911)
* Changes in SSCE for all tokens ignored case.
2020-08-26 17:12:17 -07:00
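The divide-by-zero case in #4911 arises when every token is ignored, so the count used to average the loss is zero. A minimal sketch of that guard follows, assuming a mean-reduced sparse softmax cross-entropy; the function name, the `ignore_index` default, and the choice to fall back to a factor of 1 are assumptions for illustration, not the actual kernel code.

```python
import math


def sparse_softmax_cross_entropy(logits, labels, ignore_index=-1):
    """Mean cross-entropy over tokens whose label is not ignore_index."""
    total = 0.0
    count = 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # ignored tokens contribute neither loss nor count
        m = max(row)
        # Numerically stable log-sum-exp of the logits.
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]
        count += 1
    # Guard: when all tokens are ignored the normalize factor is zero.
    # Assumption: fall back to 1 so the result is 0.0 instead of an error or NaN.
    normalize_factor = count if count > 0 else 1
    return total / normalize_factor


all_ignored = sparse_softmax_cross_entropy([[1.0, 2.0], [0.5, 0.5]], [-1, -1])
one_token = sparse_softmax_cross_entropy([[0.0, 0.0]], [0])
```

Without the guard, the all-ignored batch would divide 0.0 by 0; with it, the loss is simply zero for that batch.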
Thiago Crepaldi
cac25751bd
Fix mnist example (#4926) 2020-08-26 15:28:39 -07:00
Scott McKay
438babd966
Fix some Android build issues when ORT_MINIMAL_BUILD is defined. (#4924) 2020-08-27 07:37:51 +10:00
liqunfu
b3783a9f85
matching multiple choice between new and old apis (#4918)
* matching multiple choice between new and old apis

* update according to reviewer's comments

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-26 12:36:10 -07:00
Ashwini Khade
0d3bbfdd0f
enable nuget packaging in local builds (#4884)
* enable building nuget packages

* add nuget creation from build.py

* add documentation

* fix flake8 errors

* fix nuget package version

* enable csharp tests

* update csharp tests

* copy nuget packages to nuget-artifacts

* add libmklml_gnu

* plus review updates

* fix references for release builds
2020-08-26 12:33:48 -07:00
Thiago Crepaldi
0a2848d3a0
Remove cerberus from wheel package (#4919) 2020-08-26 09:00:03 -07:00
Dwayne Robinson
cb5e199a79 Merged PR 5093868: GatherND1 ORT DML EP
Add batchDimensionCount.
https://github.com/onnx/onnx/pull/2585  - add batch_dim parameter.

DML PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5089850
2020-08-26 02:22:21 +00:00
KeDengMS
5d3638e935
Fix symbolic shape inference bug when subgraph contains Constant node (#4858)
A Constant node is converted to an initializer, and thus needs to be added to the subgraph initializers after such conversion
2020-08-25 16:51:18 -07:00
Xiang Zhang
170fee0987
User/xianz/fixbuild (#4906)
* support Normalized_0_1 and Normalized_1_1

* add tests for Normalized_1_1

* fix build error

* fix imagetests failure

* support denterization and add more tests

* fix build

* remove added models

* disable gpu tests for CPU pipeline

* refactor based on comments and moved two added models

* merge normalizer and Denormalizer into NominalRangeConverter

* add comments

* little change

* fix build failure for amd64
2020-08-25 15:08:55 -07:00
Scott McKay
1161c4d75f
Exclude MLAS AVX512 in minimal build (#4905) 2020-08-26 08:03:37 +10:00
ytaous
cb2dfee31c
Size Op - CUDA kernel support (#4868)
* cuda kernel support

* on comments

* test UT

* test UT

* revert settings

* attempt to fix broken UT

* corrected UT fix

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-08-25 14:26:41 -07:00
Hariharan Seshadri
294eaca9ef
Support double for ArgMax operator (#4907) 2020-08-25 13:23:52 -07:00
Dudeldu
3d63d8d4f1
Extend C++ API for Map/Sequence Type Info (#3517) (#4781)
* Extend C++ API for Map/Sequence Type Info (#3517)

Expose functionality to view type information about sequences/maps
to C++ API.

- Add functions
    - `TypeInfo::GetSequenceTypeInfo`
    - `SequenceTypeInfo::GetSequenceElementType`
    - `TypeInfo::GetMapTypeInfo`
    - `MapTypeInfo::GetMapValueType`
    - `MapTypeInfo::GetMapKeyType`
- Add structs
    - `SequenceTypeInfo`
    - `MapTypeInfo`

Co-authored-by: Dudeldu <mustermann.informatik@gmail.com>
Co-authored-by: Jonas-Heinrich <Jonas@JonasHeinrich.com>

* Extend tests to cover new type info functionality for sequences and maps

 - two new test cases in test_nontensor_types for maps and sequences

Co-authored-by: Jonas-Heinrich <Jonas@JonasHeinrich.com>
2020-08-25 12:03:23 -07:00
Hariharan Seshadri
6c26e52134
Support accessing a model's metadata in C# (#4867)
Implement access to model's metadata in C#
2020-08-25 11:13:49 -07:00
Hariharan Seshadri
26bd8c2085
Support scalar tensors in c# (#4849) 2020-08-25 11:00:33 -07:00