Commit graph

6611 commits

Author SHA1 Message Date
Xavier Dupré
3f42665a40
Improve transferred time from ort to torch (#9610)
* Improve transferred time from ort to torch
* Use static_cast
* fix call to Python API for python <= 3.8
* investigation
* fix ref counts
* disable import if no training
* one function to convert multiple ortvalues
* add proto_type
* enforce dlpack->deleter to be not null
* fix _ortvalues_to_torch_tensor for eager mode
* rename proto_type into element_type in the Python API
* conversion from ort to torch 2x faster
* fix conversion of list of OrtValue
* replace has_bool_tensor by bool_tensor_indices
* introduce _ortvalues_to_torch_tensor_list
* use _ortvalues_to_torch_tensor_list for cache
* fix ambiguity between c and python classes

Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2022-04-06 09:12:58 +02:00
Scott McKay
58d97691ac
Set dims for constant with multiple values (#11116)
* Also fix issue with data transfer not handling Tensor<std::string> correctly.
2022-04-06 07:39:07 +10:00
Abhishek Jindal
91c940b619
adding fill scalar for torch ones direct initialization on ort device (#10898)
* adding fill scalar for torch ones direct initialization on device and adding test case for it

* using ConstantOfShape for implementing fill Scalar in atenops

* adding case for handling at::Tensor attribute

* handling the at::Tensor type for ConstantOfShape

* handling the at::Tensor type for ConstantOfShape with attr type

* handling the at::Tensor type case

* converting the data to tensor in case of aten tensor mapping is needed

* handling aten tensor case

* handling aten tensor case and reversing the string case

* changing type of scalar
2022-04-05 11:17:25 -07:00
G. Ramalingam
2c2408814f
Add function body for SoftmaxCrossEntropyLossGrad (#10779)
* Add function definition for SoftmaxCrossEntropyLossGrad

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Cleanup

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Eliminate unused variable

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Fix index of weight tensor

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* A few fixes to handle typing and weight

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Fix for zero D dimensions

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Add function body to internal op also

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* A few fixes

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Fix type variable name

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Fix type constraint var

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Fix ignore_index handling in testcase

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

* Add fun def for SoftmaxCrossEntropyLossInternal

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

* Specify opset

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

* Handle opset in NLL function

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Address PR feedback

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Modify onehot

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Eliminate duplicate statement

Co-authored-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-04-05 10:52:40 -07:00
Ben Niu
20fbf603d3
Fix ARM64EC build breaks (#11111)
Apply this 4c015dbb49 to fix ARM64EC build breaks.
2022-04-05 10:00:42 -07:00
Erick Muñoz
25fdf8b167
Add Dequantize Linear operator on OneDNN EP (#11036)
2022-04-05 08:32:26 -07:00
Baiju Meswani
8db180c245
orttraining cuda 10.2 to not build for compute_80 (#11103)
2022-04-04 17:22:05 -07:00
Jack·Boos·Yu
01631893cd
[cmake] Re-factor pre-compile header usage (#11093)
2022-04-04 16:28:34 -07:00
Changming Sun
fc7fe0012f
Fix: nodejs installer file name is wrong (#11097)
2022-04-04 16:24:08 -07:00
Olivia Jain
872ed91d8a
Perf FasterRCNN + MaskRCNN (#11102)
* add faster mask

* fix paths
2022-04-04 13:23:25 -07:00
chethanpk
112dec6565
Added code for FusedMatMul inside matmul op primitive (#11077)
2022-04-04 10:00:02 -07:00
Jack·Boos·Yu
ea004e953f
[cmake] Export multi targets in static build (#11063)
* [cmake] Export multi targets in static build

* Install more components in static build, format some code

* Fix code pos
2022-04-03 22:37:18 -07:00
Jack·Boos·Yu
2dfd81b9bb
[cmake] Add option onnxruntime_ENABLE_CPUINFO (#11084)
2022-04-01 22:29:27 -07:00
Changming Sun
25398cc5fe
Add cleanup instruction to run_dockerbuild.sh (#11079)
2022-04-01 22:18:56 -07:00
Baiju Meswani
f9940f17b1
Remove extra-index-url to avoid nuget security analysis vulnerability (#11082)
2022-04-01 18:30:55 -07:00
Chun-Wei Chen
b9279f637d
update How_To_Update_ONNX_Dev_Notes with right paths (#11074)
2022-04-01 08:05:31 -07:00
Changming Sun
588a66e221
Add cleanup steps to the build jobs which run in Linux CPU machine pool (#11078)
2022-03-31 22:34:12 -07:00
Baiju Meswani
249c4dec7f
Update orttraining release pipelines to use torch 1.11.0 (#11018)
* Update orttraining release pipelines to use torch 1.11.0

* Change requirements_torch...txt to requirements.txt

* Update cuda cmake architectures and clean up old files
2022-03-31 21:51:06 -07:00
Changming Sun
8e6dbad287
FIX: Nuget pipeline doesn't report binary size for Linux ARM64
In #10652 #10637 #10624, we changed the RID. But I forgot to update this part.
2022-03-31 18:32:05 -07:00
wejoncy
11a4ca741d
fuse Conv+Add+activation for CPU from different op-branch (#10987)
* Fuse op conv Add and activation from two branch
* simplify code

Co-authored-by: Jicheng Wen <jicwen@microsoft.com>
2022-04-01 09:25:17 +08:00
dependabot[bot]
79e4ed8064
Bump pytorch-lightning
Bumps [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning) from 1.5.10 to 1.6.0.
- [Release notes](https://github.com/PyTorchLightning/pytorch-lightning/releases)
- [Changelog](https://github.com/PyTorchLightning/pytorch-lightning/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PyTorchLightning/pytorch-lightning/compare/1.5.10...1.6.0)

---
updated-dependencies:
- dependency-name: pytorch-lightning
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-31 16:51:24 -07:00
Boris Fomitchev
eab7c0d5bf
Fixing optimizer failure due to missing provider list (#10497)
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
2022-03-31 11:05:49 -07:00
Linnea May
bfcd5bd4a2
remove hardcoded library name (#11058)
Co-authored-by: Linnea May <linneamay@microsoft.com>
2022-03-31 10:41:31 -07:00
Yulong Wang
8dcadba670
[js] aggregation of recent dependabot security warnings fix (#11060)
* update package-lock.json

* Bump minimist from 1.2.5 to 1.2.6 in /js/react_native

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump minimist from 1.2.5 to 1.2.6 in /js/react_native/e2e

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump plist from 3.0.4 to 3.0.5 in /js/react_native

Bumps [plist](https://github.com/TooTallNate/node-plist) from 3.0.4 to 3.0.5.
- [Release notes](https://github.com/TooTallNate/node-plist/releases)
- [Changelog](https://github.com/TooTallNate/plist.js/blob/master/History.md)
- [Commits](https://github.com/TooTallNate/node-plist/commits)

---
updated-dependencies:
- dependency-name: plist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump ansi-regex from 4.1.0 to 4.1.1 in /js/react_native

Bumps [ansi-regex](https://github.com/chalk/ansi-regex) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/chalk/ansi-regex/releases)
- [Commits](https://github.com/chalk/ansi-regex/compare/v4.1.0...v4.1.1)

---
updated-dependencies:
- dependency-name: ansi-regex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump plist from 3.0.4 to 3.0.5 in /js/react_native/e2e

Bumps [plist](https://github.com/TooTallNate/node-plist) from 3.0.4 to 3.0.5.
- [Release notes](https://github.com/TooTallNate/node-plist/releases)
- [Changelog](https://github.com/TooTallNate/plist.js/blob/master/History.md)
- [Commits](https://github.com/TooTallNate/node-plist/commits)

---
updated-dependencies:
- dependency-name: plist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump ansi-regex from 4.1.0 to 4.1.1 in /js/react_native/e2e

Bumps [ansi-regex](https://github.com/chalk/ansi-regex) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/chalk/ansi-regex/releases)
- [Commits](https://github.com/chalk/ansi-regex/compare/v4.1.0...v4.1.1)

---
updated-dependencies:
- dependency-name: ansi-regex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-31 02:06:04 -07:00
dependabot[bot]
e9c68d57ca
Bump minimist from 1.2.5 to 1.2.6 in /js/web (#11033)
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-30 16:26:34 -07:00
Yulong Wang
6c7090a829
[js/web] fix output type mapping (#11049)
2022-03-30 16:26:04 -07:00
RandySheriffH
9505e8c6c1
fix json format (#11046)
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-03-30 16:15:33 -07:00
Adam Pocock
9616ad483f
[Java] Support configuring CUDA and TensorRT execution providers (#10697)
Java side parts for configuring CUDA and TensorRT.
Adding tests for CUDA and TensorRT. Refactoring library loading logic as provider options need to have their shared library loaded before they can be constructed.
2022-03-30 14:26:51 -07:00
Yulong Wang
179406bd25
[JS] upgrade package-lock.json from v1 to v2 (#11039)
* upgrade package-lock.json from v1 to v2

* upgrade requirement of nodejs version to 16.x
2022-03-30 13:30:28 -07:00
Nat Kershaw (MSFT)
998bf0fdb6
Remove advice to use IO Binding for this scenario (#11006)
2022-03-30 10:23:50 -07:00
Xavier Dupré
c37d2728bf
Implement TreeEnsemble for opset(ai.onnx.ml)==3 (#10821)
* Implement TreeEnsemble for opset(ai.onnx.ml)==3
* use of InlineVector
* refactoring
* improve attributes retrieval
* avoid creating a temporary buffer
* modifies onnx.ml.cpu.json
* use unordered_map
* update docs/OperatorKernels.md
* address PR comments (TH -> ThresholdType, ORT_RETURN...)
* add a python unit test to load a TreeEnsembleRegressor following ai.onnx.ml==3 specifications
2022-03-30 12:53:12 +02:00
Yulong Wang
1424b796ff
[js/web] disable test_tan temporarily (#11048)
2022-03-29 21:47:52 -07:00
Yi Zhang
d1bdd2cd94
allow trailing slash in directory (#11001)
* allow trailing slash in directory

* fix lint
2022-03-30 09:42:57 +08:00
ytaous
5868413caf
fix seg fault (#11038)
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-03-29 14:12:45 -07:00
Edward Chen
8f456735d1
Remove unused variable. (#11043)
2022-03-29 14:11:07 -07:00
Erick Alejandro Muñoz Alvarado
6c005bfdbc
Enabled Cast operator on OneDNN EP (#11023)
2022-03-29 08:16:01 -07:00
Vincent Wang
6a6840d5c6
Fuse LayerNormalization for Apex O2 (#10233)
2022-03-29 21:22:04 +08:00
Vincent Wang
3b6cee8059
[CUDA] Optimize Conv and ConvGrad for Training (#10999)
* Optimize Conv and ConvGrad for Training

* add provider option to control

* fix typo
2022-03-29 07:31:36 +08:00
Chi Lo
8ba52b0a05
Bump master version to 1.12 (#10797)
* bump master version to 1.11

* bump master version to 1.12
2022-03-28 12:30:11 -07:00
Edward Chen
9371401746
Move node EP assignment for ORT format into SessionState::FinalizeSessionState() (#10944)
Follow up to #10904.
- Move node EP assignment for ORT format into SessionState::FinalizeSessionState().
- Add unit test for #10904.
- Make convert_onnx_models_to_ort.py optimization level configurable via environment variable.
2022-03-28 10:37:22 -07:00
Baiju Meswani
9c6cc018a9
Add utility to get the gradient graph from GradientGraphBuilder (#10995)
* Add pybind method to get the gradient graph

* Fix segmentation fault because of logging for gradient building
2022-03-25 17:13:56 -07:00
Chen Fu
dc72159105
Symmetric Quant indirect Conv kernel for ARMv8 A55 chip (#10862)
The ARM A55 micro-architecture (with dot product instructions), similar to the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware: a 128b load instruction blocks the pipeline for 2 whole cycles, during which no other instructions can be executed. A 64b load instruction, on the other hand, can be dual-issued with many other instructions.

This change adds a Symmetric Quant indirect Conv kernel for a55 micro-architecture, where we replace

ldr q4,[x1],

with

ldr d4,[x1],
ldr x11,[x1],
ins v4.d[1],x11

so that we can try to hide the memory load cycles behind computing cycles in the kernel.

With this new kernel, cartoongan model shows significant perf improvement on Pixel5a little cores (2 threads running on two little cores):

new kernel: 2188.59 ms
old kernel: 2360.61 ms
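
To put the two timings above in relative terms, here is a minimal arithmetic sketch. It uses only the two latencies quoted in this message; the variable names are illustrative and not part of the change itself.

```python
# Latencies quoted in the commit message above, in milliseconds.
new_kernel_ms = 2188.59
old_kernel_ms = 2360.61

# Absolute time saved per run.
saved_ms = old_kernel_ms - new_kernel_ms

# Relative latency reduction, measured against the old kernel.
reduction_pct = 100.0 * saved_ms / old_kernel_ms

# Speedup factor (old time divided by new time).
speedup = old_kernel_ms / new_kernel_ms

print(f"saved {saved_ms:.2f} ms, "
      f"{reduction_pct:.1f}% lower latency, "
      f"{speedup:.3f}x speedup")
```

On these figures the new kernel saves about 172 ms per run, which is roughly a 7% reduction in latency.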
2022-03-25 17:10:47 -07:00
leqiao-1
8ddc45f52d
Add linux and macos arm64 java artifacts (#10981)
2022-03-25 16:23:17 -07:00
Jack·Boos·Yu
d1be71eaa3
[cmake] Add keyword STATIC to add_library in function onnxruntime_add_static_library (#10998)
2022-03-25 16:19:36 -07:00
Chandru Ramakrishnan
cb31b7eab1
Fixed creation of ORT_Value to pass offset of 0 (#11004)
2022-03-25 15:52:10 -04:00
Scott McKay
47c09e6701
Clarify usage of kOnnxDomainAlias. (#10962)
* Clarify usage of kOnnxDomainAlias.
2022-03-25 09:52:59 +10:00
pengwa
89ef987ab1
Improve NonZero on CUDA/ROCM (#10307)
* improve NonZero

* fix megatron_fp16 optimizer, fix the doc

* multi_tensor_applier

* resolve comment

* fix building warning

* fix build error when enabling training and use tensorrt
2022-03-25 07:35:45 +08:00
mpapdiwala
1e917c879e
Adding support for saving and loading train step info properties in the state dict and checkpoint file. (#10569)
* Adding optimization step and step parameter to the ORTTrainer constructor

* Added ORTTrainerOptions for optimization step

* Adding Train Step Info Settings to State Dictionary

* Adding train step info key

* Updating comments

* Reverting changes

* Updating test case for new state dict entry train_step_info
2022-03-24 11:50:45 -07:00
Christoph Hausner
989e640009
Update docstrings in quantize.py (#10952)
2022-03-24 10:49:33 -07:00
mindest
3c5853dcbc
register custom_op_symbolic for squeeze (#10970)
* register custom_op_symbolic for squeeze

* remove misleading warning msg from symbolic_opset9
2022-03-24 10:28:21 +08:00