Commit graph

6239 commits

Author SHA1 Message Date
Edward Chen
df16c605e8
Add "available since" message for C API additions since v1.10.0. (#10348) 2022-01-25 10:15:34 -08:00
Alexey Gladyshev
a0fe4a7c1c
[TVM EP] Improved usability of TVM EP (#10241)
* improved usability of TVM EP
* moved technical import under a condition related to TVM EP only
* Revert "moved technical import under a condition related to TVM EP only"
* add conditional _ld_preload.py file extension for TVM EP
* improve readability of inserted code
2022-01-25 18:48:08 +01:00
Xavier Dupré
6e95c0316d
Builds onnxruntime + eager mode with the same value for _GLIBCXX_USE_CXX11_ABI as pytorch (#10114)
* add _GLIBCXX_USE_CXX11_ABI
* restrict to eager mode
2022-01-25 11:25:31 +01:00
pallavides
790c3be7e9
Fix Reshape issue when shape size is -1 (#10356)
* Fix Reshape issue (in_place) when shape size is -1
2022-01-24 19:30:52 -08:00
Edward Chen
4b87d2c172
Fix dockerfiles/Dockerfile.arm32v7 build. (#10360)
Install CMake, ignore some Eigen warnings.
2022-01-24 19:06:09 -08:00
Chen Fu
df0c819850
fix compilation error due to semantic conflict with another PR (#10370)
Resolve PR conflicts between: #10289 and #10334
Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-01-24 16:32:05 -08:00
Chen Fu
2afce4830c
Symmetric QGEMM (#10289)
Adding code for symmetric quantized matrix multiplication. Used in quantized convolution, achieving significant perf gain.

TODO: use Symmetric Quantized GEMM in other operators.

TODO: address activation buffer overread in custom allocators and tensors supplied by users.

DOT kernel perf test:

Pixel 5a:

Cartoongan	513.539 ms	471.786 ms
Efficient	57.5169 ms	56.4174 ms
Edgetpu	14.6673 ms	13.5959 ms

NEON kernel perf test:

Pixel 3a:

Cartoongan	1423.53 ms	1069.92 ms
Efficient	114.086 ms	107.968 ms
Edgetpu	39.2632 ms	36.9839 ms


Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-01-24 10:49:04 -08:00
Dmitri Smirnov
7e092a7e3f
Reduce number of memory allocations based on a customer profiling case (#10193)
Add abseil and inlined containers typedefs
Introduce TensorShapeVector for shape building.
Use gsl::span<const T> so interfaces accept different vector-like argument types.
Introduce InineShapeVectorT for shape capacity typed instantiations
Refactor cuda slice along with provider shared interfaces
Refactor Concat, Conv, Pad
Build with Conv Einsum and ConvTranspose refactored.
Remove TensorShape::GetDimsAsVector()
Refactor SliceIterator and SliceIteratorBase
Refactor broadcast
Refactor Pads for twice as long
Remove memory planner intermediate shapes vector
Refactor orttraining
Fix passing TensorShapeVector to tests
Remove abseil copy and submodule, use FetchContent_Declare/Fetch
Path with separate command
Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.
2022-01-24 10:40:46 -08:00
wejoncy
5df15c5644
additional options of NNAPI for ORT_PERF_TOOL (#10351)
* additional options of NNAPI for ORT_PERF_TOOL

* reuse current key '-i'

* fix

* fix

* _MSC_VER won't be defined when building with NDK

* fix

* fix
2022-01-24 10:17:56 -08:00
PeixuanZuo
3dfadf9031
[FIX] Add condition in AMD CI pipeline yaml to stop tests promptly when the onnxruntime build fails (#10335)
* [FIX] Add condition in AMD CI pipeline yaml to stop tests promptly when the onnxruntime build fails.
2022-01-24 15:34:48 +08:00
Jeff Daily
42db893607
Add ThresholdedRelu to ROCm EP. (#9480)
Sources were already hipified and compiled, but ROCm EP registration was missing.
2022-01-22 13:29:07 -08:00
Edward Chen
6876641c1e
Pin version of post to dashboard scripts' dependencies and update them to work with recent version. (#10353) 2022-01-21 19:35:58 -08:00
Edward Chen
bfabef081d
Remove unused pipeline orttraining-linux-gpu-perf-test-ci-pipeline.yml and unused send_perf_metrics tool. (#10326) 2022-01-21 14:31:34 -08:00
Baiju Meswani
141606534c
Add support for FusedAdam to be mathematically equivalent to pytorch/AdamW (#10106) 2022-01-21 13:37:59 -08:00
Cheng Tang
13e277525c fix whitelist 2022-01-21 13:30:53 -08:00
Olivia Jain
eee627fde9
Track Session Creation Time (#10281)
* add back previous changes lost in merge

* post session to dashboard

* post session creation time to dashboard

* fix trt 8 functionality:

* add component governance

* Remove hardcoded values

* Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines

* cleanup errors

* post results only once

* checkout 8.0 GA

* try build 8.0 without building shared lib

* add back build_shared_lib, not the problem

* add upload_time to table

* use identifier to post

* Shorten to TRT x.x

* shorten commit hash using rev_parse

* use shortened commit hash

* use nvidia's default TRT_VERSION
2022-01-21 13:20:53 -08:00
Yufeng Li
d2b1424968
fix bugs in cpuid_info (#10334)
* fix several bugs in cpuid_info
2022-01-20 16:30:18 -08:00
Tang, Cheng
2dcb69685e
support type promotion in binary operators in eager mode (#10285)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-01-20 10:06:09 -08:00
Baiju Meswani
c67594694c
Add ability to set onnx opset version from json config (#10223) 2022-01-20 09:10:19 -08:00
Vincent Wang
001cc53968
Fix CUDA10.2 Build Break for BFloat16 Change (#10331)
* fix build break on cuda 10.2

* fix linux build
2022-01-20 18:17:28 +08:00
Abhishek Jindal
4aa7cee0d8
Abjindal/clean eager backend (#10055)
* clearing map for eager mode backends

* clearing map for eager mode backends manager

* making OrtBackendsManager an extern variable and trying to delete it

* cleaning backends manager when the Python interpreter exits

* adding ifdef for eager mode code

* disabling warning for pybind state file

* disabling warning for python module file

* running clang auto format and reducing redundancy

* remove new line

* moving declaration to a new header file

* adding the header file for eager mode for python module

* removing source files for eager mode

* add source file for python module in eager mode

* Update orttraining/orttraining/python/orttraining_python_module_eager.h

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2022-01-19 14:20:09 -08:00
Scott McKay
90e2a4b936
Fix GH Issue 10305 by adding implicit inputs to consumer nodes map. (#10319) 2022-01-20 07:46:35 +10:00
jingyanwangms
a656c55a75
Add _force_exportable_set and pass debug_options (#10282)
* Add _force_exportable_set and pass debug_options

* Update orttraining/orttraining/python/training/ortmodule/experimental/hierarchical_ortmodule/_hierarchical_ortmodule.py

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>

* nit fix

* Update orttraining/orttraining/python/training/ortmodule/experimental/hierarchical_ortmodule/_hierarchical_ortmodule.py

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
2022-01-19 10:26:27 -08:00
David Fan
7b14c70cfe
[ortmodule] Ensure contiguous tensor into forward pass (#10315) 2022-01-18 22:06:37 -08:00
Sunghoon
b038f4e56f
Add a build option to create a WebAssembly static library (#10184)
* add p50 in test

* Add a build option to create a WebAssembly static library

Co-authored-by: Yulong Wang <yulongw@microsoft.com>
2022-01-18 18:05:04 -08:00
Yi-Hong Lyu
62eab67f79
Fuse DQ -> ArgMax into ArgMax (#10274) 2022-01-18 14:47:33 -08:00
Yi-Hong Lyu
e27f2dc932
int8/uint8 support for Argmax for opset 1, 11, 12 (#10296) 2022-01-18 14:37:34 -08:00
Yulong Wang
712f4e403d
[js/common] upgrade marked@4.0.10 (Dependabot warning) (#10313) 2022-01-18 14:00:10 -08:00
Scott McKay
c1c9fa18bf
C#: Avoid inefficient DenseTensor ctor in ToTensor extensions (#10240)
* Update extension helpers to avoid inefficient construction of DenseTensor.
Add tests for extension helpers.
2022-01-19 07:43:44 +10:00
Guoyu Wang
6ae22d562b
[QDQ] Move NNAPI EP to use NodeUnitIODef for non-QDQ ops (#10237) 2022-01-18 12:54:58 -08:00
Chen Fu
33dd2f8f5e
fix mac compilation error (#10268)
Fix Mac compilation error in new cpuinfo changes
2022-01-18 08:09:27 -08:00
Vincent Wang
c12cafa524
Optimize Transpose CUDA Kernel (#10230)
* optimize transpose cuda

* fix comment typo
2022-01-15 15:39:06 +08:00
RandySheriffH
a757bd7186
Render summarized ort perf with tree map in browser (#10189)
* render summarized ort perf with tree map

* add readme

* add comment

* Update readme.md

* Update readme.md
2022-01-14 15:45:32 -08:00
RandySheriffH
ab5fd42ed4
reset MIN for float/double (#10284) 2022-01-14 13:57:29 -08:00
pengwa
e365ad7f3a
fix deadlock in model.train mode forward run only (#9960)
* fix deadlock in model.train mode forward run only

* fix tests

* clear the grad_fns before every forward run

* add clean up on exit

* fix

* refine code comments
2022-01-14 13:53:29 -08:00
Thiago Crepaldi
6a7d3deb22
Update pytorch-lightning (#10276) 2022-01-14 16:49:10 -05:00
Vincent Wang
44e2db9397
CUDA BFloat16 Refactor (#10085) 2022-01-14 19:38:56 +08:00
Xavier Dupré
e38e51ea8e
Improve iobinding, faster name search (#10005)
* Improve iobinding, faster name search
2022-01-14 12:18:18 +01:00
Vincent Wang
3ea7fb0f9f
fix mem leak (#10272) 2022-01-14 14:54:19 +08:00
dependabot[bot]
2a55bc2c21 Bump engine.io from 4.1.1 to 4.1.2 in /js/web
Bumps [engine.io](https://github.com/socketio/engine.io) from 4.1.1 to 4.1.2.
- [Release notes](https://github.com/socketio/engine.io/releases)
- [Changelog](https://github.com/socketio/engine.io/blob/4.1.2/CHANGELOG.md)
- [Commits](https://github.com/socketio/engine.io/compare/4.1.1...4.1.2)

---
updated-dependencies:
- dependency-name: engine.io
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-13 18:26:02 -08:00
Baiju Meswani
2affd6e71e
orttraining packaging and ci pipelines to use cuda 11.3 (#10252) 2022-01-13 13:36:33 -08:00
dependabot[bot]
4b205eb2b3
Bump follow-redirects from 1.13.3 to 1.14.7 in /js/web (#10266)
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.13.3 to 1.14.7.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.13.3...v1.14.7)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-13 09:05:22 -08:00
dependabot[bot]
943a1aa2d6
Bump follow-redirects from 1.14.5 to 1.14.7 in /js/node (#10265)
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.14.5 to 1.14.7.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.14.5...v1.14.7)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-13 09:03:41 -08:00
Edward Chen
d43ef67d2b
Move binary size check to separate pipeline (#10254)
Move binary size check(s) to a separate pipeline. In the future, other binary size-related builds can go here.
Add publishing of build artifacts for easier analysis.
Add optional build with debug info.
2022-01-12 19:21:20 -08:00
dependabot[bot]
3d9d8e20cc Bump numpy from 1.19.2 to 1.21.0 in /tools/ci_build
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt)
- [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-01-12 17:45:35 -08:00
Yi-Hong Lyu
499f1d5fd7
Quantization of Argmax (#10213)
This patch includes:
* int8/uint8 support for Argmax
* Quantization tool support for Argmax
2022-01-12 14:12:56 -08:00
Tiago Koji Castro Shibata
98f85ae05b
Bump winrt version (#10243) 2022-01-12 10:52:27 -08:00
ashari4
aff96ce081
remove hardcoded type (#10251) 2022-01-12 10:00:34 -08:00
CarlPoirier
4af232df0c
Fix props file overwriting AdditionalIncludeDirectories (#10124)
Co-authored-by: Carl Poirier <carl.poirier@vab-solutions.com>
2022-01-11 23:30:40 -08:00
Rachel Guo
a099bd454b
[QDQ] Add shared qdq selectors (#10178)
* wip

* wip

* wip

* wip

* wip

* save

* minor changes

* update test graph name

* address pr comments

* update

* address pr comments

* address pr comments

* fix warning

* minor include fix

* update to nodegroupselectors

* delete unnecessary includes

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-01-11 19:41:45 -08:00