Commit graph

6800 commits

Author SHA1 Message Date
pengwa
17a8ecee6f
fix win build errors (on device training) (#11844)
* fix win build errors

* fix linux build

* fix typo

* minor fix

* fix win in c api

* fix linux build complaining bool

* fix ORT_RETURN_ON_ERROR
2022-06-23 18:16:50 +08:00
Baiju Meswani
fac8dae9df
Add support for gradient clipping, AdamWOptimizer and tensorseq as inputs (#11697) 2022-06-22 10:27:58 -07:00
ashbhandare
f14f0e19ec
Fix trainer (#11933)
fix trainer:
2022-06-22 08:48:19 -07:00
Baiju Meswani
a36e92d86e
Offline tooling readme (#11920) 2022-06-21 17:57:44 -07:00
Ashwini Khade
a3ec2d6d15
changes to c api + bug fix (#11858)
Co-authored-by: Ashwini Khade <askhade>
2022-06-17 16:56:42 -07:00
Ashwini Khade
f63e28c92f
C API version 0.001 (#11758)
* C API version 0.001

* fix linker issues

* fixes for save checkpoint api

* plus fixes based on tests

* plus test_runner and other changes

* Plus cosmetic updates

* remove unnecessary headers

* plus some updates

* plus more changes

Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev10.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-06-15 11:13:35 -07:00
pengwa
fb88efbe18
End to end run pass (on device training) (#11694)
* lr_scheduler implementation

(cherry picked from commit d9c2552b3a3b2ff38ee0a14770257aa1169f6fa9)

* refactor Module/Optimizer constructor.

* add intermediate API layer bridging public interfaces with internal ones.

* synthetic data loader

* make end to end run pass

* avoid many session input copy (CPU to GPU)
some clean up

* NVTX for runner

* minor fix after sync

* revert to let Module/Optimizer handle session creation.

* fix tests & test file folder consolidation

* refine based on comments & fix cpplint

* typos
2022-06-10 15:25:44 -07:00
Baiju Meswani
a61c38e4f4
Add ability to author float initializers (#11752) 2022-06-10 11:21:14 -07:00
pengwa
540935aace
lr scheduler implementation (on device training) (#11714)
* lr_scheduler implementations

* rename test_runner to test_trainer.

* add unit tests

* address comments
2022-06-09 08:04:30 +08:00
ashbhandare
1c316d0e39
Parameter,Module and Optimizer changes (#11494)
* Module step

* On device training offline composition

* Working grad accumulation with test for TrainStep

* Temp changes

* Revert "On device training offline composition"

This reverts commit ec3da68247.

* cleanup

* Implement eval step

* Use new graphs and checkpoints

* Optimizer test, changes

* review comments

* review comments

* review comments

Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2022-05-31 09:20:47 -07:00
pengwa@microsoft.com
e1c63cb06a
Merge branch 'master' of https://github.com/microsoft/onnxruntime into training_dev/on_device_poc 2022-05-28 01:54:17 +00:00
Baiju Meswani
c318b19307
Add support for BCEWithLogitsLoss (#11630) 2022-05-27 15:50:16 -07:00
Scott McKay
4fabc400de
Fix CUDA 11.6 build error on Windows (#11578)
* Avoid windows header that defines 'small'
2022-05-28 08:04:46 +10:00
Scott McKay
7e6d052275
Add better error message for subgraph output coming directly from outer scope value. (#11638)
* Add better message for subgraph output coming directly from outer scope value.

* Use regex to match value name as the test model is processed in a different order on different platforms.
2022-05-28 08:04:27 +10:00
Gary Miguel
b67c0f639c
Remove filter_mode input from pyflakes GitHub action (#11644)
Previously it triggered:

`Warning: Unexpected input(s) 'filter_mode', valid inputs are ['entryPoint', 'args', 'github_token', 'level', 'reporter']`
2022-05-27 07:59:17 -07:00
pengwa
44f7b1bf2c
MTA AdamWOptimizer (#11506)
* skeleton change

* adam compute kernels

* add rtol/atol for tests

* some clean up

* optional outputs

* more clean up

* add tests

* adamw mode=1 test pass

* clean up tests

* add HF AdamW test cases

* refactor adam test file

* make test pass

* all test pass, fix comments

* rename to adamw

* make test pass again

* fix cpplint

* minor fixes

* fix python lint

* Fix build and tests

* fix builds

* fix windows build

* fix win build

* minor fix

* Refine based on comments

* resolve comments

* formatting

* resolve comments

* add ut
2022-05-27 19:52:04 +08:00
Vincent Wang
02724c54ff
[CUDA] Implement BitmaskDropout, BitmaskBiasDropout and BitmaskDropoutGrad (#11534)
* Implement BitmaskDropout and associated unit tests.

* Implement BitmaskDropoutGrad and associated unit tests.

* Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests.

* Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule.

This commit does not yet include unit tests for this rewrite rule.

This commit also introduces improved documentation for all changes which will be grouped
into this PR.

* bitmask dropout

* fix win build

* bugfix for rocm

* bugfix

* fix code format

* fix ut

* fix build break

* fix ut in win

* resolve comments

* fix ut in trt

* resolve comments

* fix rocm build error

* fix typo

Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>
2022-05-27 17:24:47 +08:00
Vincent Wang
eadb1a3128
Speed Up GradientChecker Running (#11579)
* fix gradient tester

* test size adjust

* fix win build
2022-05-27 15:14:53 +08:00
Changming Sun
6a45f9f059
Pin protobuf version to 3.18.1 (#11645) 2022-05-26 21:14:56 -07:00
microsoft-github-policy-service[bot]
006597b9b8
Microsoft mandatory file (#11619)
Co-authored-by: microsoft-github-policy-service[bot] <77245923+microsoft-github-policy-service[bot]@users.noreply.github.com>
2022-05-25 13:56:10 -07:00
Yulong Wang
f0dff6bb74
[js/rn] add expo config plugin support (#11556)
* [js/rn] add expo config plugin support

* resolve comments
2022-05-25 11:55:35 -07:00
Ryan Hill
d03d7afef8
Fix build errors when building with enable_memory_profile (#11617) 2022-05-25 10:08:33 -07:00
Hariharan Seshadri
6e65bac5c2
Memory usage optimization in LongFormer Attention (#11611) 2022-05-25 10:07:41 -07:00
Adrian Lizarraga
883e4bc341
Update the 'Linux-GPU-EP-Perf' pipeline to build ORT from source by default. (#11610) 2022-05-25 09:29:49 -07:00
Thiago Crepaldi
427230431a
Fix torch cpp ext build when CPU wheel is installed but GPU card is present (#11608)
* Fix torch cpp ext build when CPU wheel is installed but GPU card is present

Also includes a minor improvement to the ATen operator that allows both
"::op" and "aten::op" names for operators

* Fix flake8 false positive
2022-05-25 09:44:26 -04:00
George Nash
147a1737f9
MatMul postop fusion for dnnl ep (#11565)
This includes a series of unit tests that exercise
the MatMul fusion. This is not an exhaustive list
of tests. The tests focus on patterns seen in
models, with additional tests to cover at least
one instance of each operator type that can be part
of the fusion.

Signed-off-by: George Nash <george.nash@intel.com>
2022-05-24 22:19:38 -07:00
Yulong Wang
4e9ad7b6ae
Update .flake8 to exclude .git directory (#11615) 2022-05-24 19:43:02 -07:00
Baiju Meswani
3a22a866a1
On device training offline tooling (#11520) 2022-05-24 18:21:39 -07:00
Gary Miguel
e3a2d5cca8
Add additional python requirements (#11522)
These are used by some of the python code in the package, e.g., 0292356bd7/onnxruntime/python/tools/transformers/optimizer.py (L25)

c8270c2940/onnxruntime/python/tools/symbolic_shape_infer.py (L10)

0292356bd7/onnxruntime/python/tools/transformers/torch_onnx_export_helper.py (L9)
2022-05-20 16:16:18 -07:00
Yulong Wang
69aaf03345
allow catch all exceptions (#11498) 2022-05-20 03:35:47 -07:00
PeixuanZuo
a67994316a
Update rocm ci to ROCm5.1.1 + torch1.10.0
* [UPDATE] update amd ci pipeline to rocm5.1.1

* [FIX] json format error

* [ERROR] disable unit tests

* [FIX] ucx error

* [FIX] cmake version

* [FIX] unit tests
2022-05-20 11:07:21 +08:00
Tang, Cheng
abecb56832
fix build break (#11492)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-05-19 16:10:45 -07:00
Vincent Wang
436c4f9b79
Add BFloat16 (bf16) support for ATen (#11546)
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2022-05-19 10:04:08 -04:00
Adrian Lizarraga
e45197fa8c
[trt-ep-perf] Fix upload time of EP perf data (#11531)
Fix the post.py script to use the actual "upload time" in ISO format instead of the day/month/year of the commit date.
2022-05-18 15:36:21 -07:00
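The distinction drawn in the commit above (#11531), a full ISO-format upload time versus the day/month/year of the commit date, can be sketched as follows. This is a minimal illustration only: the function names and sample timestamps are hypothetical and are not taken from post.py.

```python
from datetime import datetime, timezone


def commit_day_label(commit_time):
    """Day/month/year of the commit date: loses the time of day and
    says nothing about when the data was actually uploaded."""
    return commit_time.strftime("%d/%m/%Y")


def upload_timestamp(now=None):
    """Full ISO 8601 timestamp of the upload itself (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return now.isoformat()


# Illustrative values only: a commit made one day, uploaded the next.
commit_time = datetime(2022, 5, 17, 14, 0, 52, tzinfo=timezone.utc)
upload_time = datetime(2022, 5, 18, 22, 36, 21, tzinfo=timezone.utc)

print(commit_day_label(commit_time))   # 17/05/2022
print(upload_timestamp(upload_time))   # 2022-05-18T22:36:21+00:00
```

An ISO timestamp sorts lexicographically and distinguishes several uploads made on the same day, which a day/month/year label cannot do.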
Valery Chernov
8092d9f9a2
[TVM EP] Support inference by shared library created by TVM (#11389)
* add so_folder option to TVM EP options. add TvmSoEP class and update TVM EP factory

* compilation from so_folder was implemented

* update TVMCompiler for default pipeline and compilation from shared lib

* filter excess so-file in so_folder

* clean Compile method and vm conditions

* implementation of TVMSoCompile on native side instead of python API

* cpplint fixes

* some fixes after review

* more cpplint fixes

* more fixes after review

* align TVMso EP with new API for compilation from #10632

* small fixes for cpplint

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-05-18 14:50:54 +02:00
Adrian Lizarraga
48efeca66c
[trt-ep-perf] Fix bug that suppresses latency gain reporting (#11321)
Fix bug that prevents EP perf script from reporting latency gain for TensorRT/CUDA
2022-05-17 14:00:52 -07:00
Edward Chen
782f9e394d
[CoreML EP] Fix condition in PRelu op supported check. (#11543) 2022-05-17 09:03:24 -07:00
Ryan Hill
deef214772
Update gather to use multiple threads (#11524) 2022-05-16 19:31:14 -07:00
Edward Chen
5eaa893936
[CoreML EP] Add support for PRelu (#11474) 2022-05-16 16:30:09 -07:00
Justin Chu
d9c9adb78b
Add python static type checking in CI checks (#11518)
- Enable pyright and pylint (https://github.com/microsoft/pyright) in CI
- Enable pyright, pylint and bandit by default in VS code

Pylint has some good style checks. pyright is Microsoft's static type checker.
2022-05-16 13:26:56 -07:00
PeixuanZuo
c556f5f22f
Add AMD python package ROCm5.1.1+torch1.11 (#11516)
* [FIX] fix name error

* [ADD] add rocm5.1.1 python package

* [ADD] torch1.10.0 rocm requirements

* [UPDATE] update docker Repository name
2022-05-16 08:14:11 +08:00
Sheil Kumar
6255194659
All LearningModelSessions created from a common LearningModelDevice should share the same thread pool (#11457)
* Share thread pools between devices

* make tests reuse device

* Change cpu thread pool options for dml sessions to use 1 thread with no spinning

* fix test failure

* Update missing type constraints for dft

* Add comment and rename inference session parameter

* default missing causing inconsistent test behavior

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2022-05-13 11:12:43 -07:00
Yi Zhang
5709ed2e16
Fix shellcheck warning (#11489)
* fix shellcheck warning

* Update java_linux_final_test.sh
2022-05-13 15:36:59 +08:00
RajalakshmiSR
b14c1fd479
POWER: Optimize MlasQLinearAddKernelHelper() (#11454)
This patch uses vector intrinsics to optimize the MlasQLinearAddKernelHelper
function for POWER processors.

Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2022-05-12 18:38:45 -07:00
George Wu
09590f013a
fix windows ci debug build break (#11495)
* update msc version check

* update comment

* typo

* whitespace
2022-05-12 16:54:00 -07:00
Rachel Guo
4aef7e3aab
[CoreML EP] Add DepthToSpace op support (#11468)
* initial impl of depthtospace coreml support

* fix build

* address pr comments

* minor update

* minor pr comments

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
2022-05-12 13:48:51 -07:00
Yi Zhang
a3f05da338
Revert "[TVM EP] update set input to remove excess copying inside TVM (#11247)" (#11504)
This reverts commit 5ae461ec0a.
2022-05-13 02:27:36 +08:00
Tianlei Wu
ece1274ffa
revert safeint version (#11500) 2022-05-12 11:24:43 -07:00
Justin Chu
f94b25933a
ci(cpplint): Ignore runtime/references warnings (#11499)
Allow non-const references 6f85d3e5c8/docs/Coding_Conventions_and_Standards.md (L11-L12)
2022-05-12 07:51:45 -07:00
Justin Chu
6f85d3e5c8
fix(onnx_export): Extract arg value from torch Value (#11471)
**Description**: Extract arg value from torch Value

**Motivation and Context**

The inputs to gelu are `torch._C.Value` objects. This caused the `if approximate == "none"` check to always fail, preventing the optimized `com.microsoft::Gelu` op from being used.
2022-05-11 11:36:43 -07:00
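
The failure mode described in the commit above (#11471) can be reproduced without PyTorch. The sketch below is an illustration under stated assumptions: `FakeValue` and its `unwrap` method are hypothetical stand-ins for a graph value that wraps a constant, not the `torch._C.Value` API, and the two functions are not the actual exporter code.

```python
class FakeValue:
    """Hypothetical stand-in for a graph value wrapping a constant.

    It defines no __eq__, so comparing it with a string falls back to
    identity comparison and is always False.
    """

    def __init__(self, constant):
        self._constant = constant

    def unwrap(self):
        """Return the wrapped Python constant (hypothetical accessor)."""
        return self._constant


def uses_optimized_op_buggy(approximate):
    # Compares the wrapper object itself against a string: never equal.
    return approximate == "none"


def uses_optimized_op_fixed(approximate):
    # Extract the underlying constant first, then compare.
    if isinstance(approximate, FakeValue):
        approximate = approximate.unwrap()
    return approximate == "none"


arg = FakeValue("none")
print(uses_optimized_op_buggy(arg))  # False
print(uses_optimized_op_fixed(arg))  # True
```

The fixed variant also accepts a plain string, so callers that already pass the extracted value keep working.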