Commit graph

6582 commits

Author SHA1 Message Date
Nat Kershaw (MSFT)
998bf0fdb6
Remove advice to use IO Binding for this scenario (#11006) 2022-03-30 10:23:50 -07:00
Xavier Dupré
c37d2728bf
Implement TreeEnsemble for opset(ai.onnx.ml)==3 (#10821)
* Implement TreeEnsemble for opset(ai.onnx.ml)==3
* use of InlineVector
* refactoring
* improve attributes retrieval
* avoid creating a temporary buffer
* modifies onnx.ml.cpu.json
* use unordered_map
* update docs/OperatorKernels.md
* address PR comments (TH -> ThresholdType, ORT_RETURN...)
* add a python unit test to load a TreeEnsembleRegressor following ai.onnx.ml==3 specifications
2022-03-30 12:53:12 +02:00
Yulong Wang
1424b796ff
[js/web] disable test_tan temporarily (#11048) 2022-03-29 21:47:52 -07:00
Yi Zhang
d1bdd2cd94
allow trailing slash in directory (#11001)
* allow trailing slash in directory

* fix lint
2022-03-30 09:42:57 +08:00
ytaous
5868413caf
fix seg fault (#11038)
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-03-29 14:12:45 -07:00
Edward Chen
8f456735d1
Remove unused variable. (#11043) 2022-03-29 14:11:07 -07:00
Erick Alejandro Muñoz Alvarado
6c005bfdbc
Enabled Cast operator on OneDNN EP (#11023) 2022-03-29 08:16:01 -07:00
Vincent Wang
6a6840d5c6
Fuse LayerNormalization for Apex O2 (#10233) 2022-03-29 21:22:04 +08:00
Vincent Wang
3b6cee8059
[CUDA] Optimize Conv and ConvGrad for Training (#10999)
* Optimize Conv and ConvGrad for Training

* add provider option to control

* fix typo
2022-03-29 07:31:36 +08:00
Chi Lo
8ba52b0a05
Bump master version to 1.12 (#10797)
* bump master version to 1.11

* bump master version to 1.12
2022-03-28 12:30:11 -07:00
Edward Chen
9371401746
Move node EP assignment for ORT format into SessionState::FinalizeSessionState() (#10944)
Follow up to #10904.
- Move node EP assignment for ORT format into SessionState::FinalizeSessionState().
- Add unit test for #10904.
- Make convert_onnx_models_to_ort.py optimization level configurable via environment variable.
2022-03-28 10:37:22 -07:00
Baiju Meswani
9c6cc018a9
Add utility to get the gradient graph from GradientGraphBuilder (#10995)
* Add pybind method to get the gradient graph

* Fix segmentation fault because of logging for gradient building
2022-03-25 17:13:56 -07:00
Chen Fu
dc72159105
Symmetric Quant indirect Conv kernel for ARMv8 A55 chip (#10862)
The ARM A55 micro-architecture (with dot product instructions), like the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware: a 128b load instruction blocks the pipeline for 2 whole cycles, during which no other instructions can be executed. A 64b load instruction, on the other hand, can be dual-issued with many other instructions.

This change adds a Symmetric Quant indirect Conv kernel for a55 micro-architecture, where we replace

ldr q4,[x1],

with

ldr d4,[x1],
ldr x11,[x1],
ins v4.d[1],x11

so that we can try to hide the memory load cycles behind computing cycles in the kernel.

With this new kernel, cartoongan model shows significant perf improvement on Pixel5a little cores (2 threads running on two little cores):

new kernel: 2188.59 ms
old kernel: 2360.61 ms
2022-03-25 17:10:47 -07:00
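To put the two timings quoted in the commit above in relative terms, here is a small arithmetic sketch. It uses only the two numbers from the message; the variable names are illustrative and it says nothing about how the measurements were taken.

```python
# Timings quoted in the commit message
# (cartoongan, Pixel5a, 2 threads on two little cores).
old_kernel_ms = 2360.61
new_kernel_ms = 2188.59

# Absolute time saved per run.
saved_ms = old_kernel_ms - new_kernel_ms

# Relative latency reduction, as a percentage of the old kernel's time.
reduction_pct = 100.0 * saved_ms / old_kernel_ms

# Speedup factor (old time divided by new time).
speedup = old_kernel_ms / new_kernel_ms

print(f"saved {saved_ms:.2f} ms, "
      f"{reduction_pct:.1f}% lower latency, "
      f"{speedup:.3f}x speedup")
```

On these figures the new kernel is roughly 7% faster than the old one.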
leqiao-1
8ddc45f52d
Add linux and macos arm64 java artifacts (#10981) 2022-03-25 16:23:17 -07:00
Jack·Boos·Yu
d1be71eaa3
[cmake] Add keyword STATIC to add_library in function onnxruntime_add_static_library (#10998) 2022-03-25 16:19:36 -07:00
Chandru Ramakrishnan
cb31b7eab1
Fixed creation of ORT_Value to pass offset of 0 (#11004) 2022-03-25 15:52:10 -04:00
Scott McKay
47c09e6701
Clarify usage of kOnnxDomainAlias. (#10962)
* Clarify usage of kOnnxDomainAlias.
2022-03-25 09:52:59 +10:00
pengwa
89ef987ab1
Improve NonZero on CUDA/ROCM (#10307)
* improve NonZero

* fix megatron_fp16 optimizer, fix the doc

* multi_tensor_applier

* resolve comment

* fix building warning

* fix build error when enabling training and use tensorrt
2022-03-25 07:35:45 +08:00
mpapdiwala
1e917c879e
Adding support for saving and loading train step info properties in the state dict and checkpoint file. (#10569)
* Adding optimization step and step parameter to the ORTTrainer constructor

* Added ORTTrainerOptions for optimization step

* Adding Train Step Info Settings to State Dictionary

* Adding train step info key

* Updating comments

* Reverting changes

* Updating test case for new state dict entry train_step_info
2022-03-24 11:50:45 -07:00
Christoph Hausner
989e640009
Update docstrings in quantize.py (#10952) 2022-03-24 10:49:33 -07:00
mindest
3c5853dcbc
register custom_op_symbolic for squeeze (#10970)
* register custom_op_symbolic for squeeze

* remove misleading warning msg from symbolic_opset9
2022-03-24 10:28:21 +08:00
Shucai Xiao
7ee52fb8a0
amdmigraphx_ep-add ops to be supported by migraphx and fixed a bug in check ops to be supported (#10496)
* backup debugging information related to debugging a jira ticket

* fixed a bug in checking whether an input can be constant folded

* added more operators that are supported by migraphx

* revert unnecessary changes

* remove unused logger parameter

* rename function to make name style consistent

* backup code changes

* fix review comments

* refactor graph utility functions to add unit tests

* backup additional changes

* fixed a link error in build migraphx_basic_test

* add unit test for some migraphx utility functions

* add more supported ops in migraphx
2022-03-23 19:17:19 -07:00
Adrian Tsai
ae08f9666d
Fix type constraints in registration of DequantizeLinear (#10986) 2022-03-23 17:05:12 -07:00
Sheil Kumar
938f3857a5
Set the default for the STFT onesided attribute to 1, which tests expect (#10984)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2022-03-23 14:20:54 -07:00
Chandru Ramakrishnan
07201726ed
Fixed macros for graph transformer registration. (#10983) 2022-03-23 14:55:17 -04:00
Olivia Jain
de384805cd
Custom parameters (#10964)
* get inputs independently for trtexec

* track one process only

* remove engine and profile files

* change time to commit time

* add runtime option for io binding

* move to commit date

* fixes

* add option for graph optimization

* cleanup docker script

* note second time creation

* allow for parameters to be configured from pipeline at runtime

* uncomment

* include optional arguments at runtime

* post second session creation

* update cmake version

* Revert "update cmake version"

This reverts commit 09a1364eae68610724c8e90eeea777b7ee03f74b.

* Move data format import
2022-03-23 09:47:24 -07:00
Jeff Daily
9a3be9b46a
use #include <hiprand/hiprand.h>, not deprecated #include <hiprand.h> (#10966) 2022-03-23 08:56:45 -07:00
Yi Zhang
0efbe92296
fix coverage report error in master build (#10969)
* fix error in master

* check NNAPI_EP_MASTER

* Revert "check NNAPI_EP_MASTER"

This reverts commit 59c9043b7c9bbcb4b495d2dd121ef6d4271be408.

* rm coverage in PR build
2022-03-23 16:00:57 +08:00
raviskolli
480c793125
Update training packages to Pytorch 1.11.0 (#10851)
* Update ortmodule training packages to Pytorch 1.11.0

Co-authored-by: Harshitha Venkata <havenka@microsoft.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2022-03-22 16:45:51 -07:00
Baiju Meswani
565318ce86
Support ORT WASM compilation with the training flag (#10973)
* Add training support for ORT web assembly compilation

* Use wrapper for eigen includes in training
2022-03-22 16:13:35 -07:00
Scott McKay
b28e5064f3
Ignore DequantizeLinear nodes in CommonSubexpressionElimination optimizer (#10934)
* Ignore DequantizeLinear nodes in CommonSubexpressionElimination.

Coalescing DQ nodes results in QDQ node groups having overlaps, which the QDQ processing does not support.
2022-03-23 08:46:01 +10:00
Xavier Dupré
b88fb68fac
Adds missing numpy type when looking for the ort correspondence (#10943) 2022-03-22 14:44:48 -07:00
Yulong Wang
dce5d719c5
add build flag for emscripten settings (#10963)
* allows multiple '--cmake_extra_defines' flags

* fix flake8 error

* Add build flag for emscripten settings

* remove "emscripten_settings" in generate_build_tree()

* format code
2022-03-22 11:55:45 -07:00
Sheil Kumar
027565b3b2
Add multi-dim dft test, and fix complex idft (#10947)
* fix complex multi-dim dft

* Add multi-dim dft test, and fix complex idft

* remove incorrect inplace specification

* Add DFT tests

* update epsilon to 1000ths place

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2022-03-22 10:08:12 -07:00
Yulong Wang
2da82fd0b9
allows multiple '--cmake_extra_defines' flags (#10953)
* allows multiple '--cmake_extra_defines' flags

* fix flake8 error
2022-03-21 19:10:47 -07:00
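One common way to let a command-line flag be repeated is sketched below with Python's `argparse`. This is only an illustration under assumptions: the helper name `parse_defines` is hypothetical, and the commit message does not say whether the build script is implemented this way.

```python
import argparse


def parse_defines(argv):
    """Collect every value given to any occurrence of --cmake_extra_defines."""
    parser = argparse.ArgumentParser()
    # nargs="+" accepts one or more values after a single flag;
    # action="append" stores each occurrence of the flag as its own list.
    parser.add_argument(
        "--cmake_extra_defines",
        nargs="+",
        action="append",
        default=[],
    )
    args = parser.parse_args(argv)
    # Flatten the list of lists into one ordered list of definitions.
    return [d for group in args.cmake_extra_defines for d in group]


if __name__ == "__main__":
    print(parse_defines([
        "--cmake_extra_defines", "A=1", "B=2",
        "--cmake_extra_defines", "C=3",
    ]))
```

With a plain `nargs="+"` and no `append`, a second occurrence of the flag would replace the values from the first, which is the kind of behavior a change like this typically addresses.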
Sunghoon
6d19c295d0
use lf as eol for node package (#10965) 2022-03-21 15:50:03 -07:00
Sunghoon
b34d9f6867
[js/wasm] Add WebAssembly static library build into web CI pipeline (#10959)
* add webassembly static library build into ci

* skip publishing on static lib

* fix type
2022-03-21 15:49:49 -07:00
Chandru Ramakrishnan
4a5b5328a4
Added support to Eager CodeGen for multiple in-place parameters. (#10945)
* Added support to CodeGen for multiple inplace output parameters.

* Updated output Tensor to references.
2022-03-21 13:10:22 -07:00
Leandro Gracia Gil
1cc2cfb7b8
Move #ifndef ORT_CXX_API_THROW to the no exceptions case. (#10937)
This is related to https://github.com/microsoft/onnxruntime/issues/10564
which introduced a fix in the wrong case where exceptions are enabled.
2022-03-21 11:12:56 -07:00
leqiao-1
a6ea278502
add python3.10 support (#10848)
* add python3.10 support

* upgrade numpy version in build pipeline

* add python 3.10 path

* upgrade torch version in build pipeline

* update docker run arguments

* change torch version

* fix typo

* fix permission issue

* change python version

* remove python3.10 for openvino build

* remove python 3.10 for openvino build
2022-03-21 09:46:02 +08:00
G. Ramalingam
8703d37517
Extend DropoutGrad function to support bfloat16 (#10662)
* Update DropoutGrad function to support bfloat16

* Eliminate dead comments

* Set opset version for testcase

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>

* Update to new builder

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>
2022-03-20 15:11:08 -07:00
Scott McKay
91722e2bc4
Fix typos (#10935) 2022-03-20 08:27:35 +10:00
Yi Zhang
c1e37e4ebf
Android CI Pipeline: Fix post coverage bug (#10949) 2022-03-19 11:17:08 -07:00
Ella Charlaix
fe6ab719f3
Fix a typo in quantization tools (#10940) 2022-03-18 21:03:16 -07:00
soundarthiaga
eabb14788a
[perf_metric] added inferences per second metric (#10921) 2022-03-18 21:01:11 -07:00
Yi Zhang
3897b93606
optimize Android CI (#10938) 2022-03-19 11:00:21 +08:00
Kotaro Yamamoto
2dea7dc27f
Skip python arena shrinkage test on ppc (#10901) 2022-03-18 19:31:21 -07:00
soundarthiaga
de06d95096
[parallel_inference] added support for parallel inference with timed duration perf test (#10922) 2022-03-18 19:05:28 -07:00
Scott McKay
5cbacec854
Maintain aspect ratio by doing resize + crop in image_to_pb tool (#10887) 2022-03-19 07:08:45 +10:00
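The resize + crop idea named in the commit above can be sketched as pure arithmetic. This is a minimal illustration, assuming the image is scaled until it covers the target size and then center-cropped. The function name `resize_then_center_crop` is hypothetical, and the image_to_pb tool may choose the scale or crop position differently.

```python
def resize_then_center_crop(width, height, target_w, target_h):
    """Return the intermediate size and the crop box (left, top, right, bottom).

    The image is scaled uniformly so that it covers the target in both
    dimensions, then the overflow is trimmed equally from both sides.
    """
    # Uniform scale factor: the larger ratio guarantees full coverage.
    scale = max(target_w / width, target_h / height)

    # Guard against rounding leaving a side one pixel short of the target.
    new_w = max(target_w, round(width * scale))
    new_h = max(target_h, round(height * scale))

    # Center the crop window inside the resized image.
    left = (new_w - target_w) // 2
    top = (new_h - target_h) // 2
    return (new_w, new_h), (left, top, left + target_w, top + target_h)


if __name__ == "__main__":
    size, box = resize_then_center_crop(640, 480, 224, 224)
    print(size, box)
```

Resizing directly to the target size would stretch a 640x480 image into a square. Scaling first and cropping afterwards keeps the proportions at the cost of discarding some pixels at the edges.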
ytaous
f058c59407
Performance: add io_binding support for bert benchmark util (#10907)
* io_binding support

* cover all test cases

* per comments

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-03-18 10:33:30 -07:00