Commit graph

3911 commits

Author SHA1 Message Date
Ryan Lai
8bcb5fd119
Add skip test reason for onnx model zoo models and tier 2 models (#6081) 2020-12-10 14:41:17 -08:00
Ryan Lai
753af576c4
If building inbox, hook up winrt_activation_handler for WinML Tests (#6074)
* If building inbox, hook up winrt_activation_handler with what is already defined in dllload.cpp

* Add base.h header

* Missed custom ops test
2020-12-10 14:41:01 -08:00
Du Li
e945b5fcf6
adding fp16 support for topk cuda kernel (#6082)
* adding fp16 support for topk.

* disable fp16 tests for cpu ep

Co-authored-by: Du Li <duli@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-12-10 11:04:19 -08:00
Vincent Wang
7ddeafdfcc
Add ReduceL2Grad and ClipGrad (#5970)
* ReduceL2Grad and ClipGrad.

* fix win build and amd ci pipeline

* resolve comments.

Co-authored-by: Vincent Wang <weicwang@AiFramework2080ti2.corp.microsoft.com>
2020-12-10 11:03:26 +08:00
RandySheriffH
404982ded5
Enable varied input type for custom op (#6066)
* allow custom op taking varied types

* refactor test case

* add test model

* refactor test case

* enable copy elision

* update test case

* fix issue in ToString function
2020-12-09 15:10:42 -08:00
Jesse Benson
cc47cfcb31 Update AMD transpose to match CUDA transpose. 2020-12-09 11:00:18 -08:00
Edward Chen
abdbb5fc84
Reduction kernel optimization (#6088)
Optimize reduction kernel code by moving loads from global memory before computation.
Add CMake option to build CUDA code with --generate-line-info option.
2020-12-09 10:20:23 -08:00
Sergii Dymchenko
9e26e59a37
Deprecate opsets <12 for training. (#6027) 2020-12-09 00:15:27 -08:00
Weixing Zhang
d95fc5e849
Clean up unused code. (#6059)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-12-08 23:15:30 -08:00
Weixing Zhang
2705115732
add dockerfile for ROCm3.10 and update BUILD.md for ROCm EP (#5821)
* add HSA_NO_SCRATCH_RECLAIM=1 to dockerfile

This works around an issue in the AMD compiler, which generates poor GPU ISA when a kernel parameter is a structure and “pass-by-value” is used

* update BUILD.md

* add dockerfile for rocm3.10
2020-12-08 23:14:56 -08:00
ashbhandare
b1a75d0e98
Enable passing initial optimizer state while creating training session (#5869)
* Support to pass initial optimizer states to optimizer graph builder

* Changes for passing init optim state to training session config

* Pass optimizer state through cpp and python frontend

* Cleanup

* Review comments

* Fix windows and mac CI

* Review comments

* review comments

* Review comments

* Frontend review changes

* Fix CI
2020-12-08 21:20:51 -05:00
Sherlock
7a43fa0028
Fix AllReduce kernel for contiguous buffer (#6064) 2020-12-08 15:55:13 -08:00
Edward Chen
e357486707
Fix build definition template typo, add logging (#6065)
Fix a typo in tools/ci_build/github/azure-pipelines/templates/get-docker-image-steps.yml.
Add logging to tools/ci_build/get_docker_image.py for easier debugging.
2020-12-08 15:16:50 -08:00
baijumeswani
523d187193
save data to and load data from an hdf5 file for checkpointing (#5975)
* save python dictionary to hdf5 representation and load an hdf5 file into a python dictionary

* unit tests for saving data to and loading data from hdf5 file
2020-12-08 11:40:57 -08:00
Du Li
3e81711a13
Update version to 1.6.0 (#6041)
* Update version to 1.6.0

* Add v 1.5.3 info

* Updating WindowsAI and ONNX version

Co-authored-by: Du Li <duli@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-12-08 11:09:51 -08:00
satyajandhyala
f68a256140
Android code coverage (#6061)
* Added Onnxruntime_GCOV_COVERAGE flag for Android.

* Set CMAKE_SYSTEM_NAME explicitly for Android.

* Added GCOV_PREFIX option to collect code coverage data.
Added a new python script to generate code coverage info.
Modified build pipeline to generate Android code coverage info.

* Added build command line option --android_coverage

* Added a comment describing the GCOV environment variables

* Fixed PEP8 issues.

* Added --android_coverage option to the build command.

* Increased Android emulator memory from 3K to 8K.

* Increased Android partition-size from 2GB to 4GB to overcome no-space-left-on-device error

* Removed source_dir from command line args.

* Use cwd absolute path to run tests.

* Added commands to output the contents of /data/local/tmp on the emulator.

* Added run_adb_shell function.

* Format changes.

* Removed keywd argument cwd.

* Removed Android in the --build_dir path.

* Removed commands added for debugging.

* Removed extra new-lines.

* Fix MacOs build pipeline failures by uninstalling openssl before running build script.

* Revert "Fix MacOs build pipeline failures by uninstalling openssl before running build script."

This reverts commit 90d0568fe533e9456c20d061a2d435c8fea48266.

* Change dir to the build directory where the tar file is copied.

* Changed the option from --android_coverage to --code_coverage

* Moved steps to generate Android code coverage to run_nnapi_code_coverage.sh

* Require --android option if --code_coverage is specified.

* No code coverage needed for onnx_test_runner.

* Expect that the emulator is running when the script is executed.

* Fixed the title in the buildpipeline step.

* Fixed the formatting issue.

* Added a command line argument, ORT_ROOT, to run_nnapi_code_coverage.sh script

Co-authored-by: Satya Jandhyala <satyajandhyala@Satyas-Mac-mini.local>
2020-12-08 10:55:02 -08:00
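The option dependency described in the Android code coverage commit above (--code_coverage requires --android) can be sketched with argparse. This is a minimal illustration, not the actual build.py code: the function name parse_build_args and the help strings are hypothetical, and only the two flag names come from the commit message.

```python
import argparse


def parse_build_args(argv):
    """Parse a tiny subset of build options (hypothetical stand-in for build.py)."""
    parser = argparse.ArgumentParser(description="Illustrative build option parser")
    parser.add_argument("--android", action="store_true", help="Build for Android")
    parser.add_argument("--code_coverage", action="store_true",
                        help="Collect code coverage data")
    args = parser.parse_args(argv)
    # Assumed rule from the commit message: coverage is only accepted
    # together with an Android build, so reject it otherwise.
    if args.code_coverage and not args.android:
        parser.error("--code_coverage requires --android")
    return args
```

Calling parser.error prints the usage message and exits with status 2, so the misconfiguration is reported before any build step starts.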
Suffian Khan
e35211c0ff
Fix AMD GPU pipeline by adjusting reference /opt/rocm-3.9.0 => /opt/rocm (#6063)
* use /opt/rocm instead

* fix indent
2020-12-08 08:53:20 -08:00
Pranav Sharma
2c5ba9ab00
Bump up API version for 1.6 release (#6076) 2020-12-08 01:24:29 -08:00
Yufeng Li
3cae28699b
work around of the build break in mac (#6069)
* Fix the build break in macos release

* revert android change
2020-12-07 20:39:36 -08:00
Ye Wang
fa06be2133
Support export >2G model when using optimizer.py only (#6014)
* checkin

* add warning if user specifies same input and output path
2020-12-07 17:18:49 -08:00
Edward Chen
b348538c8a
Update build docker image cache cleanup (#6048)
The current image cache cleanup is not removing many images. Upon examining the cache container registry logs, it appears there are some infrequent pulls of old images which may be made by something other than CI builds (perhaps some automated scan of the registry).
This change adds a minimum access count for images in the cache so that infrequently but periodically accessed images can be removed. The idea is that images used by CI builds that are worth caching will have a higher volume of accesses.
2020-12-07 13:07:19 -08:00
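The minimum-access-count rule from the image cache cleanup commit above (#6048) can be sketched as a simple filter. This is an illustration under assumptions, not the actual cleanup script: CachedImage, select_images_to_remove, and min_access_count are hypothetical names, and the access counts are assumed to have been tallied from the registry logs already.

```python
from dataclasses import dataclass


@dataclass
class CachedImage:
    name: str
    access_count: int  # assumed: number of pulls seen in the registry logs


def select_images_to_remove(images, min_access_count):
    """Return the images accessed fewer than min_access_count times.

    Images pulled rarely (for example by a periodic registry scan) fall
    below the threshold, while images used by CI builds stay above it.
    """
    return [img for img in images if img.access_count < min_access_count]
```

With a threshold of 5, an image pulled once by a scan would be selected for removal while an image pulled fifty times by CI builds would be kept.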
Tianlei Wu
51fbe87b9b
Update profiler tool to support gpt2 and longformer models (#6011)
* support gpt2 and longformer in profiler tool
* rename bert_profiler to profiler
* Add --basic_optimization to allow user to use basic level of graph optimization
* Add --kernel_time_only to filter kernel time and exclude fence time
* Add --threshold to filter nodes with a low run time percentage.
2020-12-07 10:33:41 -08:00
Changming Sun
925879a8b0
Remove python 3.8 Windows GPU build from python packaging pipeline (#6054)
Revert the last few changes to get the pipeline back to a normal state.
2020-12-07 10:23:07 -08:00
George Wu
020efc9002
fix windows cuda support for python 3.8 + (#6046)
* fix

* noqa

* fix.

* remove unused import
2020-12-07 10:09:22 -08:00
ashbhandare
7cebf76a46
Improve checkpointing for Zero stage 1 (#5478)
* Initial running changes

* Checkpointing aggregation changes

* compare with older version

* initial cleanup

* Add zero test, minor fix

* Fix zero test, transform, formatting

* Review comments

* add more unit tests

* review comments

* Try fix CI

* Add additional check on just aggregation code

* Try fix ckpt gen

* Add pregenerated ckpt for CI, enable zero test in e2e

* Moving test to nightly, removing ckpt files

* Add tests to dist GPU CI

* Fix dist test

* Review comments

* Fix test
2020-12-07 09:16:01 -08:00
Hariharan Seshadri
a046ef133a
Update api_summary.rst (#6038) 2020-12-04 17:59:56 -08:00
dependabot[bot]
d5e8c48e54 Bump highlight.js from 10.2.1 to 10.4.1 in /nodejs
Bumps [highlight.js](https://github.com/highlightjs/highlight.js) from 10.2.1 to 10.4.1.
- [Release notes](https://github.com/highlightjs/highlight.js/releases)
- [Changelog](https://github.com/highlightjs/highlight.js/blob/master/CHANGES.md)
- [Commits](https://github.com/highlightjs/highlight.js/compare/10.2.1...10.4.1)

Signed-off-by: dependabot[bot] <support@github.com>
2020-12-04 16:45:07 -08:00
Edward Chen
d8139814fd
Clean up builds (#6015)
Update training Python packaging build to use get_docker_image.py.
Remove BUILD_EXTR_PAR docker build argument.
Update get_docker_image.py to check again for the image in the cache after building and before pushing to reduce the chance of a redundant push.
2020-12-04 15:13:17 -08:00
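The "check again after building and before pushing" step from the build cleanup commit above (#6015) can be sketched as follows. This is a hedged outline, not the real get_docker_image.py: ensure_image and its four callable parameters are hypothetical, and the pull-when-cached branch is an assumption about the surrounding flow.

```python
def ensure_image(exists_in_cache, pull, build, push):
    """Obtain an image, pushing to the cache only when still necessary.

    All four arguments are caller-supplied callables (hypothetical hooks
    standing in for registry and docker operations).
    """
    if exists_in_cache():
        pull()
        return "pulled"
    build()
    # A concurrent build may have pushed the same image while this one was
    # building, so check the cache again to avoid a redundant push.
    if exists_in_cache():
        return "built"
    push()
    return "pushed"
```

The second check narrows the window in which two builds push identical images, though it does not close it entirely.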
Sheil Kumar
00f43a3a68
add missing iclosable interface (#6036)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-12-04 13:21:03 -08:00
Jesse Benson
14f6eb14b1 Use __launch_bounds__ workaround, rather than limiting threads to 256 on AMD. 2020-12-03 13:06:34 -08:00
Jesse Benson
98ea7372d3 Re-enable Lamb unit tests for AMD 2020-12-03 13:06:34 -08:00
Jesse Benson
245d43615d Fix AMD multi-tensor implementation. 2020-12-03 13:06:34 -08:00
Edward Chen
6572a4d306
Disable Python 3.9 for training Python packaging build. (#6012)
Disable Python 3.9 for training Python packaging build. Python 3.9 is not supported by the PyTorch dependency.
2020-12-03 11:42:28 -08:00
Tianlei Wu
cdb91208a3
longformer onnx conversion and benchmark tools (#6007)
* initial implementation of longformer tools for onnx conversion and benchmark

* Support ONNX conversion for transformers 4.0
Add an option to optimize onnx model, and export fp16 model
2020-12-03 11:37:30 -08:00
Cecilia Liu
3b198c9614
Support Fusion for 1 and 2 Inputs Bert Models Converted From tf (#5993)
Support fusion for 1 and 2 inputs Bert models converted from tf
2020-12-03 10:52:33 -08:00
Sherlock
c86a1e5c13
Fix Flaky orttraining tests (#5977)
* Fix Flaky orttraining tests
2020-12-03 10:24:25 -08:00
Ryan Lai
2878e8eb2e
Fix nuget build error (#6009) 2020-12-03 09:28:39 -08:00
baijumeswani
2b35f7d4f6
Fix build.py bug which prevents running some unit tests (#5990)
Also ignore an exception that occurs for execution providers which generate compiled nodes
2020-12-03 08:57:55 -08:00
Xavier Dupré
0acc3837ee
Make operator TreeEnsemble 5x faster for batches of size 100.000 (#5965)
* improves processing time by 10
* extend unit test coverage
* better implementation for the multi regression case
* better comment, keep parallelization by trees when not enough trees
2020-12-03 14:36:42 +01:00
Xavier Dupré
524b9fa899
Initialize a structure in operator ReduceSum (#6005)
* fix initialisation issue
2020-12-03 12:41:26 +01:00
Zhang Lei
648c9c7789
Fix bugs for 1: Calibrator should check model inputs; 2: (#6017)
quantize_inputs forgot to use parameter initializer_use_weight_qtyp.
2020-12-03 00:00:16 -08:00
Xavier Dupré
bdd06f6310
Fix PR #5550 reverted in #5911 (performance improvement for operator Transpose) (#5916)
* Improves implementation of transpose operator
* Fix issue mentioned in #5911
* adding unit test for function DoTransposeImpl
2020-12-03 00:38:18 +01:00
Yufeng Li
f2dcba7afe
Fuse MatMulIntegerToFloat only when scales are scalar (#6008)
MatMulIntegerToFloat fusion fuses per-row and per-column MatMulInteger, which the MatMulIntegerToFloat kernel does not support yet. Limit the fusion to per-matrix only until per-channel is fully supported.
2020-12-02 14:40:17 -08:00
Yufeng Li
4fdfbfd4b4
Add int32_t support for DeQuantizeLinear (#5994)
* Add int32_t support for DeQuantizeLinear

* DequantizeLinear with int32 should not have a zero point
2020-12-02 12:35:41 -08:00
Olivia Jain
c727a28735
include gemm_helper.h (#5988) 2020-12-02 11:28:28 -08:00
Xiang Zhang
b4e6cc59c7
skip the check for A channel (#5989) 2020-12-02 11:23:54 -08:00
Guoyu Wang
cdacee6696
[NNAPI] Support non-1d tensor for C of Gemm op (#5982)
* Add support for non-1d tensor for C of Gemm

* check android api level before add squeeze

* Minor update

* Fix to accept c only in format of {1,1,...,1,n}
2020-12-02 00:22:38 -08:00
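The shape constraint named in the NNAPI Gemm commit above (accept C only in the form {1,1,...,1,n}) can be sketched as a predicate. This is an illustration only: the function name is hypothetical, and treating a plain 1-D shape (n,) as acceptable is an assumption, since the commit message does not spell out that case.

```python
def is_squeezable_to_1d(shape):
    """Return True if every dimension except the last is 1, e.g. (1, 1, n).

    Such a tensor can be squeezed to a 1-D tensor of length n without
    reordering any data. An empty shape (a scalar) is rejected here.
    """
    return len(shape) >= 1 and all(dim == 1 for dim in shape[:-1])
```

Under this rule, shapes like (1, 1, 8) and (8,) pass, while (2, 8) and (8, 1) do not.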
Guoyu Wang
6846c665ff
Use loose version in build.py (#5998) 2020-12-01 20:57:44 -08:00
Ryan Lai
897310f6fb
Add suspend handler with new telemetry event for UWP scenarios (#5907)
* Add suspend handler with new telemetry event

* Fix build warning

* Use cppwinrt from nuget

* Restore nuget packages

* add dependencies

* Add nuget_helpers

* Cleaned up

* Clean up

* Comment

* Add dependencies for the rest

* Remove unused line

* Update activation string

* PR comment to remove ALL
2020-12-01 20:26:18 -08:00
Edward Chen
6d642a3dba
Replace direct pulls from image cache container registry with get_docker_image.py, build definition clean up. (#5906) 2020-12-01 19:10:23 -08:00