Commit graph

6594 commits

Author SHA1 Message Date
Baiju Meswani
249c4dec7f
Update orttraining release pipelines to use torch 1.11.0 (#11018)
* Update orttraining release pipelines to use torch 1.11.0

* Change requirements_torch...txt to requirements.txt

* Update cuda cmake architectures and clean up old files
2022-03-31 21:51:06 -07:00
Changming Sun
8e6dbad287
FIX: Nuget pipeline doesn't report binary size for Linux ARM64
In #10652 #10637 #10624, we changed the RID. But I forgot to update this part.
2022-03-31 18:32:05 -07:00
wejoncy
11a4ca741d
fuse Conv+Add+activation for CPU from different op-branch (#10987)
* Fuse op conv Add and activation from two branch
* simplify code

Co-authored-by: Jicheng Wen <jicwen@microsoft.com>
2022-04-01 09:25:17 +08:00
dependabot[bot]
79e4ed8064
Bump pytorch-lightning
Bumps [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning) from 1.5.10 to 1.6.0.
- [Release notes](https://github.com/PyTorchLightning/pytorch-lightning/releases)
- [Changelog](https://github.com/PyTorchLightning/pytorch-lightning/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PyTorchLightning/pytorch-lightning/compare/1.5.10...1.6.0)

---
updated-dependencies:
- dependency-name: pytorch-lightning
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-31 16:51:24 -07:00
Boris Fomitchev
eab7c0d5bf
Fixing optimizer failure due to missing provider list (#10497)
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
2022-03-31 11:05:49 -07:00
Linnea May
bfcd5bd4a2
remove hardcoded library name (#11058)
Co-authored-by: Linnea May <linneamay@microsoft.com>
2022-03-31 10:41:31 -07:00
Yulong Wang
8dcadba670
[js] aggregation of recent dependabot security warnings fix (#11060)
* update package-lock.json

* Bump minimist from 1.2.5 to 1.2.6 in /js/react_native

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump minimist from 1.2.5 to 1.2.6 in /js/react_native/e2e

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump plist from 3.0.4 to 3.0.5 in /js/react_native

Bumps [plist](https://github.com/TooTallNate/node-plist) from 3.0.4 to 3.0.5.
- [Release notes](https://github.com/TooTallNate/node-plist/releases)
- [Changelog](https://github.com/TooTallNate/plist.js/blob/master/History.md)
- [Commits](https://github.com/TooTallNate/node-plist/commits)

---
updated-dependencies:
- dependency-name: plist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump ansi-regex from 4.1.0 to 4.1.1 in /js/react_native

Bumps [ansi-regex](https://github.com/chalk/ansi-regex) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/chalk/ansi-regex/releases)
- [Commits](https://github.com/chalk/ansi-regex/compare/v4.1.0...v4.1.1)

---
updated-dependencies:
- dependency-name: ansi-regex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump plist from 3.0.4 to 3.0.5 in /js/react_native/e2e

Bumps [plist](https://github.com/TooTallNate/node-plist) from 3.0.4 to 3.0.5.
- [Release notes](https://github.com/TooTallNate/node-plist/releases)
- [Changelog](https://github.com/TooTallNate/plist.js/blob/master/History.md)
- [Commits](https://github.com/TooTallNate/node-plist/commits)

---
updated-dependencies:
- dependency-name: plist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump ansi-regex from 4.1.0 to 4.1.1 in /js/react_native/e2e

Bumps [ansi-regex](https://github.com/chalk/ansi-regex) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/chalk/ansi-regex/releases)
- [Commits](https://github.com/chalk/ansi-regex/compare/v4.1.0...v4.1.1)

---
updated-dependencies:
- dependency-name: ansi-regex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-31 02:06:04 -07:00
dependabot[bot]
e9c68d57ca
Bump minimist from 1.2.5 to 1.2.6 in /js/web (#11033)
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-30 16:26:34 -07:00
Yulong Wang
6c7090a829
[js/web] fix output type mapping (#11049) 2022-03-30 16:26:04 -07:00
RandySheriffH
9505e8c6c1
fix json format (#11046)
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-03-30 16:15:33 -07:00
Adam Pocock
9616ad483f
[Java] Support configuring CUDA and TensorRT execution providers (#10697)
Java side parts for configuring CUDA and TensorRT.
Adding tests for CUDA and TensorRT. Refactoring library loading logic as provider options need to have their shared library loaded before they can be constructed.
2022-03-30 14:26:51 -07:00
Yulong Wang
179406bd25
[JS] upgrade package-lock.json from v1 to v2 (#11039)
* upgrade package-lock.json from v1 to v2

* upgrade requirement of nodejs version to 16.x
2022-03-30 13:30:28 -07:00
Nat Kershaw (MSFT)
998bf0fdb6
Remove advice to use IO Binding for this scenario (#11006) 2022-03-30 10:23:50 -07:00
Xavier Dupré
c37d2728bf
Implement TreeEnsemble for opset(ai.onnx.ml)==3 (#10821)
* Implement TreeEnsemble for opset(ai.onnx.ml)==3
* use of InlineVector
* refactoring
* improve attributes retrieval
* avoid creating a temporary buffer
* modifies onnx.ml.cpu.json
* use unordered_map
* update docs/OperatorKernels.md
* address PR comments (TH -> ThresholdType, ORT_RETURN...)
* add a python unit test to load a TreeEnsembleRegressor following ai.onnx.ml==3 specifications
2022-03-30 12:53:12 +02:00
Yulong Wang
1424b796ff
[js/web] disable test_tan temporarily (#11048) 2022-03-29 21:47:52 -07:00
Yi Zhang
d1bdd2cd94
allow trailing slash in directory (#11001)
* allow trailing slash in directory

* fix lint
2022-03-30 09:42:57 +08:00
ytaous
5868413caf
fix seg fault (#11038)
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-03-29 14:12:45 -07:00
Edward Chen
8f456735d1
Remove unused variable. (#11043) 2022-03-29 14:11:07 -07:00
Erick Alejandro Muñoz Alvarado
6c005bfdbc
Enabled Cast operator on OneDNN EP (#11023) 2022-03-29 08:16:01 -07:00
Vincent Wang
6a6840d5c6
Fuse LayerNormalization for Apex O2 (#10233) 2022-03-29 21:22:04 +08:00
Vincent Wang
3b6cee8059
[CUDA] Optimize Conv and ConvGrad for Training (#10999)
* Optimize Conv and ConvGrad for Training

* add provider option to control

* fix typo
2022-03-29 07:31:36 +08:00
Chi Lo
8ba52b0a05
Bump master version to 1.12 (#10797)
* bump master version to 1.11

* bump master version to 1.12
2022-03-28 12:30:11 -07:00
Edward Chen
9371401746
Move node EP assignment for ORT format into SessionState::FinalizeSessionState() (#10944)
Follow up to #10904.
- Move node EP assignment for ORT format into SessionState::FinalizeSessionState().
- Add unit test for #10904.
- Make convert_onnx_models_to_ort.py optimization level configurable via environment variable.
2022-03-28 10:37:22 -07:00
Baiju Meswani
9c6cc018a9
Add utility to get the gradient graph from GradientGraphBuilder (#10995)
* Add pybind method to get the gradient graph

* Fix segmentation fault because of logging for gradient building
2022-03-25 17:13:56 -07:00
Chen Fu
dc72159105
Symmetric Quant indirect Conv kernel for ARMv8 A55 chip (#10862)
The ARM A55 micro-architecture (with dot product instructions), like the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware: a 128b load instruction blocks the pipeline for 2 whole cycles, during which no other instructions can be executed. A 64b load instruction, on the other hand, can be dual-issued with many other instructions.

This change adds a Symmetric Quant indirect Conv kernel for a55 micro-architecture, where we replace

ldr q4,[x1],

with

ldr d4,[x1],
ldr x11,[x1],
ins v4.d[1],x11

so that we can try to hide the memory load cycles behind computing cycles in the kernel.

With this new kernel, the cartoongan model shows roughly 7% lower latency on Pixel5a little cores (2 threads running on two little cores):

new kernel: 2188.59 ms
old kernel: 2360.61 ms
2022-03-25 17:10:47 -07:00
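
The two timings quoted in the commit above can be turned into a relative improvement with a few lines of arithmetic. This is only a worked calculation on the numbers from the commit message; the function names are illustrative and not part of onnxruntime.

```python
# Timings quoted in the commit message, in milliseconds
# (Pixel5a little cores, 2 threads).
OLD_KERNEL_MS = 2360.61
NEW_KERNEL_MS = 2188.59


def relative_improvement(old_ms: float, new_ms: float) -> float:
    """Fraction of the old latency that the new kernel saves."""
    return (old_ms - new_ms) / old_ms


def speedup(old_ms: float, new_ms: float) -> float:
    """How many times faster the new kernel is than the old one."""
    return old_ms / new_ms


saved_ms = OLD_KERNEL_MS - NEW_KERNEL_MS
reduction = relative_improvement(OLD_KERNEL_MS, NEW_KERNEL_MS)
factor = speedup(OLD_KERNEL_MS, NEW_KERNEL_MS)

print(f"saved per run:     {saved_ms:.2f} ms")
print(f"latency reduction: {reduction:.1%}")
print(f"speedup:           {factor:.3f}x")
```

The saving is about 172 ms per run, which is a latency reduction of a little over 7%.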
leqiao-1
8ddc45f52d
Add linux and macos arm64 java artifacts (#10981) 2022-03-25 16:23:17 -07:00
Jack·Boos·Yu
d1be71eaa3
[cmake] Add keyword STATIC to add_library in function onnxruntime_add_static_library (#10998) 2022-03-25 16:19:36 -07:00
Chandru Ramakrishnan
cb31b7eab1
Fixed creation of ORT_Value to pass offset of 0 (#11004) 2022-03-25 15:52:10 -04:00
Scott McKay
47c09e6701
Clarify usage of kOnnxDomainAlias. (#10962)
* Clarify usage of kOnnxDomainAlias.
2022-03-25 09:52:59 +10:00
pengwa
89ef987ab1
Improve NonZero on CUDA/ROCM (#10307)
* improve NonZero

* fix megatron_fp16 optimizer, fix the doc

* multi_tensor_applier

* resolve comment

* fix building warning

* fix build error when enabling training and use tensorrt
2022-03-25 07:35:45 +08:00
mpapdiwala
1e917c879e
Adding support for saving and loading train step info properties in the state dict and checkpoint file. (#10569)
* Adding optimization step and step parameter to the ORTTrainer constructor

* Added ORTTrainerOptions for optimization step

* Adding Train Step Info Settings to State Dictionary

* Adding train step info key

* Updating comments

* Reverting changes

* Updating test case for new state dict entry train_step_info
2022-03-24 11:50:45 -07:00
Christoph Hausner
989e640009
Update docstrings in quantize.py (#10952) 2022-03-24 10:49:33 -07:00
mindest
3c5853dcbc
register custom_op_symbolic for squeeze (#10970)
* register custom_op_symbolic for squeeze

* remove misleading warning msg from symbolic_opset9
2022-03-24 10:28:21 +08:00
Shucai Xiao
7ee52fb8a0
amdmigraphx_ep-add ops to be supported by migraphx and fixed a bug in check ops to be supported (#10496)
* backup debugging information related to debugging a jira ticket

* fixed a bug in checking whether an input can be constant folded

* added more operators that are supported by migraphx

* revert unnecessary changes

* remove unused logger parameter

* rename function to make name style consistent

* backup code changes

* fix review comments

* refactor graph utility functions to add unit tests

* backup additional changes

* fixed a link error in build migraphx_basic_test

* add unit test for some migraphx utility functions

* add more supported ops in migraphx
2022-03-23 19:17:19 -07:00
Adrian Tsai
ae08f9666d
Fix type constraints in registration of DequantizeLinear (#10986) 2022-03-23 17:05:12 -07:00
Sheil Kumar
938f3857a5
Set the default for the STFT onesided attribute to 1, which tests expect (#10984)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2022-03-23 14:20:54 -07:00
Chandru Ramakrishnan
07201726ed
Fixed macros for graph transformer registration. (#10983) 2022-03-23 14:55:17 -04:00
Olivia Jain
de384805cd
Custom parameters (#10964)
* get inputs independently for trtexec

* track one process only

* remove engine and profile files

* change time to commit time

* add runtime option for io binding

* move to commit date

* fixes

* add option for graph optimization

* cleanup docker script

* note second time creation

* allow for parameters to be configured from pipeline at runtime

* uncomment

* include optional arguments at runtime

* post second session creation

* update cmake version

* Revert "update cmake version"

This reverts commit 09a1364eae68610724c8e90eeea777b7ee03f74b.

* Move data format import
2022-03-23 09:47:24 -07:00
Jeff Daily
9a3be9b46a
use #include <hiprand/hiprand.h>, not deprecated #include <hiprand.h> (#10966) 2022-03-23 08:56:45 -07:00
Yi Zhang
0efbe92296
fix coverage report error in master build (#10969)
* fix error in master

* check NNAPI_EP_MASTER

* Revert "check NNAPI_EP_MASTER"

This reverts commit 59c9043b7c9bbcb4b495d2dd121ef6d4271be408.

* rm coverage in PR build
2022-03-23 16:00:57 +08:00
raviskolli
480c793125
Update training packages to Pytorch 1.11.0 (#10851)
* Update ortmodule training packages to Pytorch 1.11.0

Co-authored-by: Harshitha Venkata <havenka@microsoft.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2022-03-22 16:45:51 -07:00
Baiju Meswani
565318ce86
Support ORT WASM compilation with the training flag (#10973)
* Add training support for ORT web assembly compilation

* Use wrapper for eigen includes in training
2022-03-22 16:13:35 -07:00
Scott McKay
b28e5064f3
Ignore DequantizeLinear nodes in CommonSubexpressionElimination optimizer (#10934)
* Ignore DequantizeLinear nodes in CommonSubexpressionElimination.

Coalescing DQ nodes results in QDQ node groups having overlaps, which the QDQ processing does not support.
2022-03-23 08:46:01 +10:00
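
To illustrate the idea in the commit above, here is a toy common-subexpression-elimination pass that merges duplicate nodes but leaves DequantizeLinear nodes alone. It is a simplified sketch, not the onnxruntime optimizer: the `Node` type, the single-output assumption (a node's name doubles as its output name), and the omission of node attributes are all assumptions made for brevity.

```python
from typing import Dict, List, NamedTuple, Tuple


class Node(NamedTuple):
    """Hypothetical single-output graph node; `name` is also its output name."""
    name: str
    op_type: str
    inputs: Tuple[str, ...]


# Op types the pass must never coalesce, so each QDQ node group keeps its own DQ node.
SKIPPED_OP_TYPES = {"DequantizeLinear"}


def eliminate_common_subexpressions(
    nodes: List[Node],
) -> Tuple[List[Node], Dict[str, str]]:
    """Merge nodes with identical (op_type, inputs).

    `nodes` must be in topological order. Returns the surviving nodes and a
    map from each removed node's name to the node that replaces it.
    """
    seen: Dict[Tuple[str, Tuple[str, ...]], str] = {}
    replaced: Dict[str, str] = {}
    kept: List[Node] = []
    for node in nodes:
        # Rewire inputs that pointed at an already-removed duplicate.
        inputs = tuple(replaced.get(name, name) for name in node.inputs)
        node = node._replace(inputs=inputs)
        if node.op_type in SKIPPED_OP_TYPES:
            kept.append(node)  # always keep, even if an identical node exists
            continue
        key = (node.op_type, inputs)
        if key in seen:
            replaced[node.name] = seen[key]
        else:
            seen[key] = node.name
            kept.append(node)
    return kept, replaced


graph = [
    Node("dq_a", "DequantizeLinear", ("q",)),
    Node("dq_b", "DequantizeLinear", ("q",)),
    Node("relu_a", "Relu", ("x",)),
    Node("relu_b", "Relu", ("x",)),
]
kept, replaced = eliminate_common_subexpressions(graph)
print([n.name for n in kept])
print(replaced)
```

In this example the two identical Relu nodes are merged, while both DequantizeLinear nodes survive even though they read the same input.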
Xavier Dupré
b88fb68fac
Adds missing numpy type when looking for the ort correspondence (#10943) 2022-03-22 14:44:48 -07:00
Yulong Wang
dce5d719c5
add build flag for emscripten settings (#10963)
* allows multiple '--cmake_extra_defines' flags

* fix flake8 error

* Add build flag for emscripten settings

* remove "emscripten_settings" in generate_build_tree()

* format code
2022-03-22 11:55:45 -07:00
Sheil Kumar
027565b3b2
Add multi-dim dft test, and fix complex idft (#10947)
* fix complex multi-dim dft

* Add multi-dim dft test, and fix complex idft

* remove incorrect inplace specification

* Add DFT tests

* update epsilon to 1000ths place

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2022-03-22 10:08:12 -07:00
Yulong Wang
2da82fd0b9
allows multiple '--cmake_extra_defines' flags (#10953)
* allows multiple '--cmake_extra_defines' flags

* fix flake8 error
2022-03-21 19:10:47 -07:00
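
One common way to let a command-line flag be given more than once is `argparse` with `action="append"` combined with `nargs="+"`. The sketch below shows that pattern for a `--cmake_extra_defines` flag; it is an assumption about how such a flag could be parsed, not the code from this commit, and `parse_defines` is a hypothetical helper.

```python
import argparse
from typing import Dict, List


def parse_defines(argv: List[str]) -> Dict[str, str]:
    """Collect KEY=VALUE pairs from every occurrence of --cmake_extra_defines."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--cmake_extra_defines",
        nargs="+",          # each occurrence takes one or more values
        action="append",    # repeated occurrences accumulate instead of overwriting
        default=[],
        metavar="KEY=VALUE",
    )
    args = parser.parse_args(argv)

    defines: Dict[str, str] = {}
    # args.cmake_extra_defines is a list of lists, one inner list per occurrence.
    for group in args.cmake_extra_defines:
        for item in group:
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"expected KEY=VALUE, got {item!r}")
            defines[key] = value
    return defines


result = parse_defines(
    ["--cmake_extra_defines", "A=1", "B=2", "--cmake_extra_defines", "C=3"]
)
print(result)
```

Without `action="append"`, a second occurrence of the flag would replace the values from the first one.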
Sunghoon
6d19c295d0
use lf as eol for node package (#10965) 2022-03-21 15:50:03 -07:00
Sunghoon
b34d9f6867
[js/wasm] Add WebAssembly static library build into web CI pipeline (#10959)
* add webassembly static library build into ci

* add webassembly static library build into ci

* skip publishing on static lib

* fix type
2022-03-21 15:49:49 -07:00
Chandru Ramakrishnan
4a5b5328a4
Added support to Eager CodeGen for multiple in-place parameters. (#10945)
* Added support to CodeGen for multiple inplace output parameters.

* Updated output Tensor to references.
2022-03-21 13:10:22 -07:00