3116 commits

Author SHA1 Message Date
Ori Levari
a7ce5b2be1
fix comment and casing of telemetry fields for named dimension overrides (#4943)
Co-authored-by: Ori Levari <orlevari@microsoft.com>
2020-08-27 17:30:56 -07:00
Ye Wang
dfb9d97ddf
Support DistilBert's Attention fusion in Optimizer (#4748)
* checkin

* attention fusion

* attention work under layernorm, still need refine

* embedlayernorm(have problems with graph.Resolve())

* some fix

* update: attention works but onnx results in protobuf parsing failed

* tested by optimizer

* add embedlayer fusion test

* add attention fusion test

* clean code, need refactor later

* clean code

* added reshape fusion for distilbert, modified attention, added tests

* refactor

* small fix

* remove unnecessary lines

* fix reshape and modify attention

* resolving conflicts

* restore

* refactor and review partial comments

* refactor attention

* small fix

* fix inf compare

* match new pattern for attention fusion

* formatting

* attention does not depend on transposescalematmul

* fix

* review comments

* revert changes

* review comments

* small fix
2020-08-27 17:00:30 -07:00
George Wu
e6b6736e48
update cuda capabilities (#4936)
2020-08-27 16:38:18 -07:00
Tang, Cheng
efdd96595f
bfloat16 and opset13 related fix (#4913)
* register part of opset13 cpu kernels; fix a bug in func impl; adjust reshapefusion order

* remove useless function

Co-authored-by: Cheng Tang <chenta@microsoft.com>
2020-08-27 16:10:53 -07:00
Brian Martin
970ddd56a7
Fix typo in contributing.md (#4939)
committments -> commitments
2020-08-27 14:01:36 -07:00
Sherlock
9f5d4918dc
MatMul Gradient optimization for dB when B is a 2D tensor (#4899)
* Optimized MatMulGrad for dB when B's shape is 2D

* Refactor for ConstantScalarNode

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-27 11:33:20 -07:00
Sheil Kumar
6dc85b5f14
wstring_convert std::codecvt_utf8 add ~200KB to inbox windows.ai.machinelearning.dll binary size (#4932)
* switch to UTF8FromHString

* remove extra c_str

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-08-27 10:07:10 -07:00
Dmitri Smirnov
2b460eaeca
Revise IDisposable implementation in C# interfaces (#4915)
Revise IDisposable implementation in C# interfaces
2020-08-27 09:17:42 -07:00
Scott McKay
08eb15068c
Exclude the Map types from the build if ML ops are disabled. (#4908)
* Exclude the Map types from the build if ML ops are disabled. They're the only ops that use Map.
2020-08-27 17:48:12 +10:00
Ye Wang
792ed44537
Support EmbedLayerNorm fusion for DistilBert (#4928)
* checkin embedlayernorm fusion for distilbert

* move function from optimizer_utils

* review comments
2020-08-26 21:46:31 -07:00
harshithapv
00fe718264
Fix divide-by-zero for SSCE kernel when normalize factor is zero. (#4911)
* Changes in SSCE for all tokens ignored case.
2020-08-26 17:12:17 -07:00
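The divide-by-zero case described in the commit above can be illustrated with a minimal Python sketch. It is not the ORT kernel: the function name, the `ignore_index` parameter, and the choice to return 0.0 when every token is ignored are assumptions made for illustration, and the commit does not state what the real kernel returns in that case.

```python
import math


def sparse_softmax_cross_entropy(logits, labels, ignore_index=-1):
    """Mean cross-entropy over non-ignored tokens (hypothetical helper).

    logits: list of rows of floats, one row per token.
    labels: list of int class ids; ignore_index marks tokens to skip.
    """
    total_loss = 0.0
    normalize_factor = 0  # number of tokens that contribute to the loss
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue
        # Numerically stable log-sum-exp of the row.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total_loss += log_sum_exp - row[label]
        normalize_factor += 1
    # Guard (assumed behaviour): if all tokens were ignored, the factor is
    # zero, so return 0.0 instead of dividing by it.
    if normalize_factor == 0:
        return 0.0
    return total_loss / normalize_factor
```

Without the guard, a batch in which every label equals `ignore_index` would raise `ZeroDivisionError` here, or produce NaN/Inf in a floating-point kernel.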
Thiago Crepaldi
cac25751bd
Fix mnist example (#4926)
2020-08-26 15:28:39 -07:00
Scott McKay
438babd966
Fix some Android build issues when ORT_MINIMAL_BUILD is defined. (#4924)
2020-08-27 07:37:51 +10:00
liqunfu
b3783a9f85
matching multiple choice between new and old apis (#4918)
* matching multiple choice between new and old apis

* update according to reviewer's comments

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-26 12:36:10 -07:00
Ashwini Khade
0d3bbfdd0f
enable nuget packaging in local builds (#4884)
* enable building nuget packages

* add nuget creation from build.py

* add documentation

* fix flake8 errors

* fix nuget package version

* enable csharp tests

* update csharp tests

* copy nuget packages to nuget-artifacts

* add libmklml_gnu

* plus review updates

* fix references for release builds
2020-08-26 12:33:48 -07:00
Thiago Crepaldi
0a2848d3a0
Remove cerberus from wheel package (#4919)
2020-08-26 09:00:03 -07:00
KeDengMS
5d3638e935
Fix symbolic shape inference bug when subgraph contains Constant node (#4858)
Constant node will be converted to initializer, and thus need to be added to subgraph initializer after such conversion
2020-08-25 16:51:18 -07:00
Xiang Zhang
170fee0987
User/xianz/fixbuild (#4906)
* support Normalized_0_1 and Normalized_1_1

* add tests for Normalized_1_1

* fix build error

* fix imagetests failure

* support denormalization and add more tests

* fix build

* remove added models

* disable gpu tests for CPU pipeline

* refactor based on comments and moved two added models

* merge normalizer and Denormalizer into NominalRangeConverter

* add comments

* little change

* fix build failure for amd64
2020-08-25 15:08:55 -07:00
Scott McKay
1161c4d75f
Exclude MLAS AVX512 in minimal build (#4905)
2020-08-26 08:03:37 +10:00
ytaous
cb2dfee31c
Size Op - CUDA kernel support (#4868)
* cuda kernel support

* on comments

* test UT

* test UT

* revert settings

* attempt to fix broken UT

* corrected UT fix

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-08-25 14:26:41 -07:00
Hariharan Seshadri
294eaca9ef
Support double for ArgMax operator (#4907)
2020-08-25 13:23:52 -07:00
Dudeldu
3d63d8d4f1
Extend C++ API for Map/Sequence Type Info (#3517) (#4781)
* Extend C++ API for Map/Sequence Type Info (#3517)

Expose functionality to view type information about sequences/maps
to C++ API.

- Add functions
    - `TypeInfo::GetSequenceTypeInfo`
    - `SequenceTypeInfo::GetSequenceElementType`
    - `TypeInfo::GetMapTypeInfo`
    - `MapTypeInfo::GetMapValueType`
    - `MapTypeInfo::GetMapKeyType`
- Add structs
    - `SequenceTypeInfo`
    - `MapTypeInfo`

Co-authored-by: Dudeldu <mustermann.informatik@gmail.com>
Co-authored-by: Jonas-Heinrich <Jonas@JonasHeinrich.com>

* Extend tests to cover new type info functionality for sequences and maps

 - two new test cases in test_nontensor_types for maps and sequences

Co-authored-by: Jonas-Heinrich <Jonas@JonasHeinrich.com>
2020-08-25 12:03:23 -07:00
Hariharan Seshadri
6c26e52134
Support accessing a model's metadata in C# (#4867)
Implement access to model's metadata in C#
2020-08-25 11:13:49 -07:00
Hariharan Seshadri
26bd8c2085
Support scalar tensors in c# (#4849)
2020-08-25 11:00:33 -07:00
Ryan Lai
d3cddba8f1
Add this line to allow collection of AppSessionGuids (#4901)
Co-authored-by: Ryan Lai <ryalai96@gamil.com>
2020-08-25 10:43:09 -07:00
Scott McKay
14c691030f
Fix build break from removing custom ORT onnx protobuf (#4904)
Exclude parsing of json config in model (also excludes json parsing library)
2020-08-25 18:10:42 +10:00
edgchen1
71d8846635
Fix telemetry-steps.yml (#4903)
Fix bug in telemetry-steps.yml that causes telemetry setup to be disabled even if TELEMETRYGUID is set.
2020-08-24 22:14:40 -07:00
Changming Sun
f34ed3a576
Hot fix for the python packaging pipeline Linux ARM build (#4902)
2020-08-24 20:14:33 -07:00
Bowen Bao
db6a821869
Enable example transformer test with dynamic size inputs (#4888)
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2020-08-24 14:31:08 -07:00
Xiang Zhang
824fcbfd9d
support Normalized_0_1 and Normalized_1_1 (#4800)
* support Normalized_0_1 and Normalized_1_1

* add tests for Normalized_1_1

* fix build error

* fix imagetests failure

* support denormalization and add more tests

* fix build

* remove added models

* disable gpu tests for CPU pipeline

* refactor based on comments and moved two added models

* merge normalizer and Denormalizer into NominalRangeConverter

* add comments

* little change
2020-08-24 13:13:50 -07:00
Tianlei Wu
268d2283c0
Export GPT-2 ONNX model without position_ids and attention_mask inputs (#4852)
* Export GPT-2 ONNX model without position_ids and attention_mask inputs
* allow benchmark_gpt2 on user's model
* refactor:  get_dummy_inputs returns a data class.
2020-08-24 13:05:25 -07:00
Changming Sun
26546f81fe
Remove the private ONNX protobuf definition file (#4878)
2020-08-24 12:40:33 -07:00
Ye Wang
c5cb9d7b41
match reshape fusion for distilbert (#4844)
* reshape fusion for distilbert

* Update reshape_fusion.cc

* Update reshape_fusion.cc

* fix reshape

* resolve comments

* Update reshape_fusion.cc

* review comments

* review comments

* rename

* Update reshape_fusion.cc
2020-08-24 10:45:31 -07:00
Chun-Wei Chen
744809ceae
Detect whether the node has been inserted cast nodes twice (#4811)
* check whether the node has been casted before

* check casted node logically

* better naming convention

* nit: extra space

* change to skip for Cast Node

* remove hasNodeBeenCast

* Add a Unit test

* Add test onnx file

* nit: naming convention and comments

* check CI: try to remove test

* move test to existing test file
2020-08-24 07:25:41 -07:00
Scott McKay
47c4144bd1
Add gcc/clang flags to make binary smaller (https://interrupt.memfault.com/blog/best-and-worst-gcc-clang-compiler-flags#-ffunction-sections--fdata-sections----gc-sections) (#4895)
Add gcc/clang flags to make binary smaller. ~10% reduction for Android baseline build (minimal build with no ops, no exceptions, no rtti).
2020-08-24 19:24:13 +10:00
Rayan-Krishnan
eb05db5a2a
Fix OptimizerConfig params groups (#4877)
* Copy samples to build folder and load models from there. Fix CI
* This PR also includes a fix to path validation for save_as_onnx API
* Add torchtext to CI for GPU training
* Remove new frontend tests from CI

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2020-08-22 22:04:17 -07:00
Scott McKay
728e886bba
Add kernel def hash logic for minimal build (#4891)
* Add hash based lookup of kernels
2020-08-23 14:39:07 +10:00
Scott McKay
db7669b225
Reduce ONNX dependency in minimal build (#4890)
* Next round of changes.

Remove inclusion of ONNX schema header
Exclude custom registry related things
Move IsConstantInitializer from graph_utils to Graph as it's needed in a minimal build and graph_utils is excluded.
2020-08-23 07:02:13 +10:00
Pranav Sharma
29dcfb24ab
Allow multiple sessions to share an allocator, optimize constant folding memory usage, expose arena configs. (#4813)
* Add support for sharing allocators

* Incremental update

* Address some PR comments, add unit tests, add documentation.

* Address PR comments, add tests and some documentation.

* Fix build and test issues

* Remove RegisterAllocator API restoring the OrtAllocator interface changes. Changed docs to reflect this.
Also fixed the orttraining segfault. The segfault was because in the case of training session,
the CPU exec prov is not available at the time the transformers are applied. Changed it to create
a new one.
2020-08-22 10:03:17 -07:00
jingyanwangms
fa68bbc82e
Relu grad kernel (#4864)
* create branch for debug

* move unit test

* more changes

* move relu to activations_grad*

* Fix ReluGrad Domain and opset version

* added unit test, CudaKernelTest.Relu_basic doesn't work yet

* remove CudaKernelTest.Relu_basic

* PR comment

* add unit test ReluGradTest_Basic

Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-22 01:03:44 -07:00
Thiago Crepaldi
dce2ce7a4f
Fix checkpoint API and copy samples into build dir (#4887)
* Fix state_dict APIs
* Copy samples to build folder and fix CI
2020-08-22 00:09:48 -07:00
liqunfu
6260d073b3
Glue parallel training (#4550)
add mpi size, rank python API

add single node parallel training example
2020-08-21 21:24:27 -07:00
RRRachelllll555
9a6db9b9f4
Fix next node access bug in calibration tool (#4863)
* fix bug in calibration tool

* fix next node access bugs

* rm file in wrong folder

* refine

* optimize

* refine

* refine format

* refine

Co-authored-by: t-yguo <t-yguo@microsoft.com>
2020-08-21 20:48:54 -07:00
RandySheriffH
3fa73a5b6a
ReduceBinarySize (#4747)
* cancel night build on pyop

* add rewriter to rewrite cpu provider

* skip BuildKernelCreateInfo<void>

* refactor variable name and comment

* include ops from csv file

* process multiple eps

* add default function to cuda provider

* rename function and add license header

* fix import

* add doc

* fix typo

* deal with empty kernel entry in cuda

* rename the rewriter file

* add comment into provider file

* add comment and rename function

* log warnings

* refactor extracting logic

* add entry for script to run solo

* add better example

* avoid onnx importing

* fix flake8 alerts

* minor fixes to better comments and doc

* add entries for all domains

* add void entry into contrib providers

* format cuda_contrib_kernels.cc

* format cpu_contrib_kernels.cc

* add all providers

* add default entry to all providers

* include op_kernel header

* cancelling change in providers beyond cpu/cuda

* rename file and switch file format to domain;opset;op1,op2...

* update doc

* restore non-regular ending grammar in cuda_contrib_kernels.cc

* add ort_root as input argument of script

* enable test in ci

* update doc

* update doc

* revert change on linux gnu ci

* switch to set to host ops

* simplify trimming logic

* add domain map to track current model

* allow ort_root to take relative path
2020-08-21 19:50:13 -07:00
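The commit above mentions an ops file in the form `domain;opset;op1,op2...`. A minimal sketch of parsing that shape follows. The function name, the use of `#` for comment lines, and the returned dictionary layout are assumptions for illustration, not the actual ORT script.

```python
def parse_required_ops(lines):
    """Parse 'domain;opset;op1,op2,...' lines (hypothetical helper).

    Returns a dict mapping (domain, opset) to the set of operator names.
    """
    required = {}
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' comments are skipped.
        if not line or line.startswith("#"):
            continue
        domain, opset, ops = line.split(";", 2)
        key = (domain.strip(), int(opset))
        names = {op.strip() for op in ops.split(",") if op.strip()}
        # Merge repeated (domain, opset) entries into one set.
        required.setdefault(key, set()).update(names)
    return required


if __name__ == "__main__":
    sample = [
        "# ops needed by the model (illustrative values)",
        "ai.onnx;12;Add,MatMul,Relu",
        "com.microsoft;1;Attention",
    ]
    for key, ops in sorted(parse_required_ops(sample).items()):
        print(key, sorted(ops))
```

Using a set per `(domain, opset)` key means duplicate operator names collapse automatically, which matches the "switch to set to host ops" step in the commit body only in spirit.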
gwang-msft
82bc21e35e
Namespace change on ort flatbuffers schema (#4886)
* correct some errors in the flatbuffers schema, move flatbuffers submodule to cmake/external

* update the ort flatbuffers schema to use less namespace

* minor update

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-08-21 17:43:11 -07:00
Vincent Wang
fdd0926d00
int64_t support for GatherND cuda (#4881)
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2020-08-22 08:00:31 +08:00
Thiago Crepaldi
acbf6d15c6
Improve LRScheduler tests (#4885)
* LRScheduler tests added to the Transformer model
	* Refactored LRScheduler tests for the BERT Toy onnx example
	* Removed dead code
2020-08-21 16:18:30 -07:00
Scott McKay
e00ad83f2b
Initial changes to disable code in a minimal build (#4872)
* Initial set of changes to start disabling code in the minimal build. Breaking changes into multiple PRs so they're more easily reviewed. Focus on InferenceSession, Model and Graph here. SessionState will be next.
Needs to be integrated with de/serialization code before being testable so changes are all off by default.

Changes are limited to
  - #ifdef'ing out code
  - moving some things around so there are fewer #ifdef statements
  - moving definition of some one-line methods into the header so we don't need to #ifdef out in a .cc as well
  - exclude some things in the cmake setup

* Update session state and a few other places.

The core code builds if ORT_MINIMAL_BUILD is specified.
2020-08-22 07:14:53 +10:00
Yufeng Li
fb43aa0de0
implement per-channel for quantizelinear and dequantizelinear (#4759)
* update onnx to latest master

* implement per-channel for quantizelinear and dequantizelinear

* refine the unit test

* exclude sequence_insert tests

* refine onnx cmake

* add failure tests to broken_tests

* move qdq common code to a separate function

* refine code
2020-08-21 12:08:50 -07:00
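As a rough illustration of what "per-channel" means for the commit above, here is a pure-Python sketch in which each channel (row) has its own scale and zero point. It assumes the ONNX-style formula `saturate(round(x / scale) + zero_point)` with a uint8 range, and quantization along axis 0. The function names and list-of-rows layout are hypothetical and not taken from the ORT implementation.

```python
def quantize_per_channel(x, scales, zero_points, qmin=0, qmax=255):
    """Quantize each row of x with its own scale and zero point.

    x: list of rows (one row per channel), scales/zero_points: one per row.
    Python's round() rounds halves to even.
    """
    out = []
    for row, scale, zp in zip(x, scales, zero_points):
        # Saturate to [qmin, qmax] after shifting by the channel's zero point.
        out.append([min(qmax, max(qmin, round(v / scale) + zp)) for v in row])
    return out


def dequantize_per_channel(q, scales, zero_points):
    """Inverse mapping: (q - zero_point) * scale, per channel."""
    return [
        [(v - zp) * scale for v in row]
        for row, scale, zp in zip(q, scales, zero_points)
    ]


if __name__ == "__main__":
    x = [[0.0, 1.0, 2.0], [0.0, 10.0, 2000.0]]
    scales = [0.5, 10.0]
    zero_points = [0, 128]
    q = quantize_per_channel(x, scales, zero_points)
    print(q)
    print(dequantize_per_channel(q, scales, zero_points))
```

A per-tensor variant would use a single scale and zero point for all rows; the per-channel form lets channels with very different value ranges keep more precision.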
Thiago Crepaldi
5427a7e9af
Update LRScheduler to use scheduling similar to HuggingFace (#4880)
2020-08-21 10:24:04 -07:00