Commit graph

700 commits

Author SHA1 Message Date
Yufeng Li
2e6c2177af
remove deprecated quantize api (#11263) 2022-04-19 19:41:55 -07:00
Tianlei Wu
bab9b80f1f
auto mixed precision for t5 (#11252) 2022-04-19 12:42:11 -07:00
Dmitri Smirnov
98faaa7e2f
Scoped GIL release in run_with_iobinding (#11248) 2022-04-18 13:07:45 -07:00
Yufeng Li
dec99657a1
Improve onnx shape inference in quant tool (#11106)
onnx.shape_inference.infer_shapes only works for model size < 2GB, while onnx.shape_inference.infer_shapes_path works for all models. This PR replaces infer_shapes with infer_shapes_path.
2022-04-18 08:07:31 -07:00
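A minimal sketch of the distinction this commit describes, assuming the `onnx` package is installed. The helper names `needs_path_based_inference` and `infer_shapes_to_file` are hypothetical and are not taken from the PR; only `onnx.shape_inference.infer_shapes_path` is the real API named in the message.

```python
# The in-memory API serializes the whole model as one protobuf message,
# which is why it is limited to models smaller than 2GB.
TWO_GB = 2 * 1024**3


def needs_path_based_inference(model_size_bytes):
    """Hypothetical helper: True when the model is too large for infer_shapes."""
    return model_size_bytes >= TWO_GB


def infer_shapes_to_file(model_path, output_path):
    """Run shape inference from disk to disk; works for any model size."""
    # Imported lazily so the size helper above can be used without onnx installed.
    from onnx import shape_inference

    shape_inference.infer_shapes_path(model_path, output_path)
```

Because the path-based call works for every model size, using it unconditionally (as the commit does) avoids the size check altogether.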
chausner
c2b4054c74
Fix typos 2022-04-14 13:53:50 -07:00
Olivia Jain
ae243c2bb5
Pull Nightly Wheel File and Cleanup Perf (#11164)
* delete unused files

* only use one dockerfile, otherwise install

* Update pipeline file

* get other changes

* minimal packages

* update pull nightly variable

* try logical boolean

* test boolean

* have build ort as boolean

* case sensitive

* use the current head not the previous commit

* add helpful note
2022-04-11 11:41:11 -07:00
Tianlei Wu
00b595e389
move longformer and t5 to models subdirectory (#11161)
* move longformer scripts to models subdirectory
* Copy transformers\models\t5 to python package as well
2022-04-09 22:35:14 -07:00
Changming Sun
4983d6e5d6
Call pluggable EP's shutdown function in Environment::~Environment() (#11120)
I disabled some tests temporarily. I will move them to a separate executable in another PR.

In the future, I want to combine the onnxruntime::Environment and OrtEnv classes. We currently have three env classes, which is too confusing:

1. onnxruntime::Env
2. onnxruntime::Environment
3. OrtEnv
Our python binding uses onnxruntime::Environment, while all other language bindings use OrtEnv. So python doesn't unload EPs but the others do. It's better to make them consistent.

Please note that even though I added the call, the unload function is currently still a no-op on Linux. So, for now, on Windows we must unload the EPs while on Linux we must not.
2022-04-07 14:11:29 -07:00
Dmitri Smirnov
2700261f7c
Provide an API to supply external initializers data from user buffers (#11109)
Implement AddExternalInitializers
2022-04-07 12:21:53 -07:00
Maajid khan
81fa28bc56
OpenVINO-EP v4.0 Release PR with OpenVINO 2022.1 (#11025)
* Enabling ov-ep for 2022.1 Release

->Added ov-ep 2022.1 flow
->Validated CPU Unit tests with OV
Master using onnxruntime_test_all unit
tests.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fix for output mismatch b/w OpenVINO and ONNX

Refer:
https://jira.devtools.intel.com/browse/CVS-60310

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enabling Adobe ops

->Enable Resize op for iGPU
->Enable Add op for iGPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Removing irrelevant conditions

->Removing some conditions from
GetCapability() which are now not
required. (Removed conditions for
OV version support less than 2021.2)

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enable upsample op

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enable Adobe proxy-e model

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Removing any extra conditions for Opset13 ops

* Opset13 changes

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Exception handling for devices

* Added comments

* Implement GPU Throttling feature

*Added GPU Throttling feature for iGPUs.
When the user enables it as a runtime option,
it helps reduce the overall CPU usage
of the application

*Added changes to exercise this option
using onnxruntime_perf_test application.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Renaming the runtime config option

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added the user to video and users group

* Handling_GPU.0_GPU.1

* Handling special conditions

->Handling corner cases for
device_type checks

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Modification to include new api 2.0 changes in the code

* Added opset13 changes

->Enabled Few ops
->Added Debug info for case 3b in getcapability()

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Log comments updated

* Changes to enable 2.0 api

* Fix build issue

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes issues

*Fixes compiler warnings c4458 on windows.
*Fixes the bug in device_type check logic
*Adds print info for enable_opencl_throttling
option in onnxruntime_perf_test

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* commit to make openvino_2021.4 compatible

* Fixed IO Buffer Optimization

* Fix output names issue

* Fix 2021.3 branch

* Bug Fix for Multiple inputs/outputs

- Assigns the right output_name and
input_name for the graph when
returned by CompiledModel::inputs()
OV function.

- Also takes care of output mismatch
issue b/w openvino output and onnx
output

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Add comments for the changes made

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* IO Buffer Changes

* Commit for Disabling GPU Throttling for 2021.4

* Updated branch

* Fix windows build

->Fixed windows build in debug mode
->Disabled scatternd3_tensor_int64

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed CPP Unit tests for CPU

-Fixed shrink, MVN, ReduceL2, Maxpool,
upsample, scatter, slice, reshape,
unsqueeze.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed first set of GPU Tests

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed additional failing tests on GPU

->Added conditions to disable certain ops
under certain conditions

->Disabled certain tests

->Added some op supports for no_dimension
supported

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added Expand op support for CPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added condition for squeeze op

->Shape can't have empty axes attribute

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Add support for LessOrEqual op function

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* OV Interface wait for replaced by indefinite wait call

* use names from ONNX model to access OV tensors

This change is to use the input/output names
retrieved from original onnx model to access
OV tensors and to check if there's any input
or output names mismatch b/w ONNX naming
and OV naming.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes Myriad unit tests and other issues

->Fixes Myriad CPP unit tests
->Fixes output mismatch issue with models with
sub graph partitioning

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fix segfault issue

->Fixed case 3b condition in get_capability()
which was causing the segfault issue

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed build issue with ov 2021.4 with I/O buffer

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disables performance counters for I/O Buffer

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed inputs/outputs mismatch for HDDL with 2022.1

Signed-off-by: Mohammad Amir Aqeel <mohammadx.amir.aqeel@intel.com>

* Fix to enable GPU FP16

* Enabled mlperf_ssd_mobilenet_300 model fully on CPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added ov version specific dll packaging for nuget

* Fixed conditions for few ops

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Dockerfile updates

* Updated License Info

-Updated the copyrights License Info
-modified FP16 transformations with OV 2022.1

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disabling mlperf_ssd_mobilenet_300 model

->Disabled this model for openvino. The
test is failing in Internal_CI pipelines.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Disabling failing python CPU Tests

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed flake8 python errors

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

Co-authored-by: hdgx <harinix.d.g@intel.com>
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
Co-authored-by: mohsinmx <mohsinx.mohammad@intel.com>
Co-authored-by: Mohammad Amir Aqeel <mohammadx.amir.aqeel@intel.com>
2022-04-06 13:30:33 -07:00
Xavier Dupré
3f42665a40
Improve transferred time from ort to torch (#9610)
* Improve transferred time from ort to torch
* Use static_cast
* fix call to Python API for python <= 3.8
* investigation
* fix ref counts
* disable import if no training
* one function to convert multiple ortvalues
* add proto_type
* enforce dlpack->deleter to be not null
* fix _ortvalues_to_torch_tensor for eager mode
* rename proto_type into element_type in the Python API
* conversion from ort to torch 2x faster
* fix conversion of list of OrtValue
* replace has_bool_tensor by bool_tensor_indices
* introduce _ortvalues_to_torch_tensor_list
* use _ortvalues_to_torch_tensor_list for cache
* fix ambiguity between c and python classes

Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2022-04-06 09:12:58 +02:00
Olivia Jain
872ed91d8a
Perf FasterRCNN + MaskRCNN (#11102)
* add faster mask

* fix paths
2022-04-04 13:23:25 -07:00
Boris Fomitchev
eab7c0d5bf
Fixing optimizer failure due to missing provider list (#10497)
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
2022-03-31 11:05:49 -07:00
Nat Kershaw (MSFT)
998bf0fdb6
Remove advice to use IO Binding for this scenario (#11006) 2022-03-30 10:23:50 -07:00
Christoph Hausner
989e640009
Update docstrings in quantize.py (#10952) 2022-03-24 10:49:33 -07:00
Olivia Jain
de384805cd
Custom parameters (#10964)
* get inputs independently for trtexec

* track one process only

* remove engine and profile files

* change time to commit time

* add runtime option for io binding

* move to commit date

* fixes

* add option for graph optimization

* cleanup docker script

* note second time creation

* allow for parameters to be configured from pipeline at runtime

* uncomment

* include optional arguments at runtime

* post second session creation

* update cmake version

* Revert "update cmake version"

This reverts commit 09a1364eae68610724c8e90eeea777b7ee03f74b.

* Move data format import
2022-03-23 09:47:24 -07:00
raviskolli
480c793125
Update training packages to Pytorch 1.11.0 (#10851)
* Update ortmodule training packages to Pytorch 1.11.0

Co-authored-by: Harshitha Venkata <havenka@microsoft.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2022-03-22 16:45:51 -07:00
Xavier Dupré
b88fb68fac
Adds missing numpy type when looking for the ort correspondence (#10943) 2022-03-22 14:44:48 -07:00
Ella Charlaix
fe6ab719f3
Fix a typo in quantization tools (#10940) 2022-03-18 21:03:16 -07:00
ytaous
f058c59407
Performance: add io_binding support for bert benchmark util (#10907)
* io_binding support

* cover all test cases

* per comments

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-03-18 10:33:30 -07:00
Ye Wang
ee05c591e5
Fix benchmark bugs and add Pytorch version control (#10928) 2022-03-18 09:24:19 -07:00
Guoyu Wang
6f844522c8
Follow up update for python API checking if vcruntime140_1.dll is available (#10927) (#10933) 2022-03-18 08:09:16 -07:00
Ye Wang
78133434b5
Fix fp16 converter bugs[1/n] (#10882)
handle sequence type
2022-03-17 22:38:43 -07:00
zhangyaobit
5d4ff67c36
Support fusion options for benchmark.py (#10900)
* Support fusion options for benchmark.py

* Add fusion options for tf model export as well.

* Add command example and warning related to fusion options.
2022-03-17 20:57:43 -07:00
Guoyu Wang
b86d105153
[python API] Change raise import error when C:\Windows\System32\vcruntime140_1.dll is not found to warning (#10927)
* remove throw if C:\\Windows\\System32\\vcruntime140_1.dll cannot be found

* Add comments and update warning message

* adding back accidentally removed line

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2022-03-17 18:56:43 -07:00
PeixuanZuo
463fac67a3
[FIX] symbolic shape infer error with onnx-1.11.0 (#10674)
* [FIX] symbolic shape infer error with onnx-1.11.0

* [FIX] consider inputs name contains 'unk__'

* [TEST] enable gpt2 test

* [FIX] gpt2_megatron_opt.onnx graph
2022-03-17 13:47:02 +08:00
Chi Lo
ce204d0744
Update to flatbuffers v2.0.0 (#10866) 2022-03-16 09:18:49 -07:00
PeixuanZuo
5763657715
[UPDATE] Add prefix in front of the file (#10884) 2022-03-16 21:05:18 +08:00
Valery Chernov
625a1f7673
[TVM EP] code refactor (#10655)
* rename info to options for TVM EP

* transfer options processing from TVMExecutionProvider to TVMEPOptions

* transfer TVMRunner to separated files

* implement TVMCompiler class

* replace CompileFunc by TVMCompiler object. update TVMRunner. now it does not depend on TvmExecutionProvider

* correct logging of TVM EP options

* RunnerImpl, GERunnerImpl and VMRunnerImpl were implemented

* add prepareComputeInfo method

* remove update_output_shapes flag

* embed all TVM EP dependences to tvm namespace. transfer model compilation from TVMRunner. connect TVMRunnerImpl to TVMRunner

* refactor compileModel method

* small cleaning

* separate TVM EP options data store and processing

* replace TvmTensorShape by InlinedVector with max_size 5

* correct indentation

* update TVM hash

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-03-16 13:55:04 +01:00
PeixuanZuo
040c0645e2
[ADD] Add micro-benchmark for Cast (#10870)
* [ADD] Add micro-benchmark for Cast

* [UPDATE] related to bert model and fix the format
2022-03-16 10:48:26 +08:00
Funtowicz Morgan
c4f73af234
Fix wrong percentile values returned during calibration (#10847)
* Use numpy.percentile to get the lookup value.

* Use 1.0 as float value rather than integer.

* Add missing cdf parameter for `np.percentile`.

* Use 100. instead of 1.0

* Remove print.

* Update from @yufenglee
2022-03-11 14:52:09 -08:00
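A rough sketch of a percentile lookup over a histogram, to illustrate why the scale of the percentile argument matters. This is not the code from the PR: it uses only the standard library instead of numpy, and the function name `percentile_threshold` and its parameters are hypothetical. The one assumption carried over from the message is that the percentile is expressed on a 0 to 100 scale (as `np.percentile` expects), so it is divided by 100 before being compared with the cumulative distribution.

```python
from bisect import bisect_left
from itertools import accumulate


def percentile_threshold(hist, bin_edges, percentile):
    """Return the upper edge of the bin where the CDF first reaches `percentile`.

    hist: counts per bin; bin_edges: len(hist) + 1 edges; percentile: 0..100.
    """
    if not 0.0 <= percentile <= 100.0:
        raise ValueError("percentile must be in [0, 100]")
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    # Cumulative distribution function over the bins, ending at 1.0.
    cdf = [count / total for count in accumulate(hist)]
    # First bin whose cumulative share reaches the requested fraction.
    idx = min(bisect_left(cdf, percentile / 100.0), len(hist) - 1)
    return bin_edges[idx + 1]
```

Passing a fraction such as 0.5 where 50 is intended would silently select a threshold near the bottom of the range, which is the kind of mistake a 0 to 1 versus 0 to 100 mix-up produces.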
Kotaro Yamamoto
64556888a1
add python binding for RunOptions config entry (#10694) 2022-03-11 08:49:22 -08:00
Chun-Wei Chen
5202efd11e
remove unused six in code and CIs (#10832) 2022-03-10 15:38:44 -08:00
zhangyaobit
9cbcc93e03
Add micro-benchmarks for Attention and SkipLayerNormalization ops. (#10798)
* Add micro-benchmarks for Attention and SkipLayerNormalization ops.

* Add choices for argument provider and precision.

* Automatically select CUDA or ROCM execution provider.
2022-03-09 18:18:51 -08:00
Edward Chen
c147c9dda6
Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD. (#10778)
Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD as it is now implied by ORT_EXTENDED_MINIMAL_BUILD.
Remove related CMake option.
2022-03-08 16:18:49 -08:00
liqun Fu
da885a72e8
update with onnx 1.11 release (#10441) 2022-03-07 21:10:55 -08:00
zhangyaobit
b7f00b9682
Refactor the common code per operator into an abstract base class. (#10785) 2022-03-07 13:15:49 -08:00
Daigo HIROOKA
a08036da09
correct symbolic name of GridSample operation (#10782)
Function name needs to match PyTorch ATen op name, which is `aten::grid_sampler`.
2022-03-07 12:49:12 -08:00
Hariharan Seshadri
9d30262422
Fix AMD training pipeline (#10788) 2022-03-07 08:53:08 -08:00
Fei Hu
60acfd3dd8
Support CUDA Graph in the CUDA EP (#9978) 2022-03-06 20:47:31 -08:00
Tianlei Wu
0e335aba37
Update BeamSearch operator spec to support t5 (#10777)
* change BeamSearch op to support encoder decoder model

* check model_type and decoder attribute

* fix

* update comments

* warn shape inference issue with onnx v1.11 or T5

* skip parity test when temperature != 1.0

* fix build
2022-03-04 21:52:45 -08:00
Ye Wang
259ade2557
Add ability to modify num_hidden_layers from benchmark script (#10760)
* add ability to modify num_hidden_layers from benchmark script

* comment

* Revert "comment"

This reverts commit 28794b0e4f86506dcc937738894fcef97fc84e48.

* Revert "add ability to modify num_hidden_layers from benchmark script"

This reverts commit 96f36ed7f751721bcf4e3ab8748a715f19a4e044.

* review comments

Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2022-03-04 18:28:51 -08:00
Ella Charlaix
fde847473b
Add min max moving average calibration method (#10753)
* Add min max moving average calibration method

* Modify the calibration extra options dictionary creation
2022-03-04 14:55:31 -08:00
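A minimal sketch of one common formulation of min-max calibration with a moving average, written with the standard library only. It is not necessarily what the PR implements: the function name `moving_average_minmax` and the `averaging_constant` parameter with its default are assumptions made for illustration.

```python
def moving_average_minmax(batches, averaging_constant=0.01):
    """Track a smoothed (min, max) range across calibration batches.

    The first batch initializes the range; each later batch moves the running
    min and max toward that batch's min and max by `averaging_constant`.
    """
    running_min = running_max = None
    for batch in batches:
        batch_min, batch_max = min(batch), max(batch)
        if running_min is None:
            running_min, running_max = float(batch_min), float(batch_max)
        else:
            running_min += averaging_constant * (batch_min - running_min)
            running_max += averaging_constant * (batch_max - running_max)
    return running_min, running_max
```

Compared with taking the global min and max over all batches, the smoothed range is less sensitive to a single batch containing outliers.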
Tianlei Wu
379b3cdef6
T5 to ONNX conversion script (#10766)
* T5 onnx conversion script
2022-03-04 14:42:04 -08:00
Olivia Jain
12eb660415
Compare TRT vs ORT-TRT Accurately (#10565)
* get inputs independently for trtexec

* track one process only

* remove engine and profile files

* change time to commit time

* add runtime option for io binding

* move to commit date

* fixes

* add option for graph optimization

* cleanup docker script

* include remaining changes

* choose graph optimization option

* add space in option
2022-03-04 10:14:18 -08:00
zhangyaobit
4c88fa5971
Add micro-benchmark for FastGelu (#10744)
* Add micro-benchmark for FastGelu

* Delete the bert-base case, as it is very similar to the bert-large one.

* Add argument parsing and more user-friendly provider type assertion.
2022-03-04 08:51:15 -08:00
Scott McKay
e337f5faf3
Enable QDQ cleanup and NHWC optimizers in an extended minimal build. (#10729)
* Enable QDQ cleanup and NHWC optimizers in an extended minimal build.
2022-03-04 15:45:42 +10:00
Tianlei Wu
47ab0c2006
Auto mixed precision conversion of GPT-2 onnx model (#10711)
* add auto mixed precision
* Add float_to_float16_max_diff, update fp16 constants
* remove cascaded Cast nodes
2022-03-02 21:08:51 -08:00
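A small sketch of the idea behind measuring the error introduced by a float-to-float16 conversion, which is the kind of check that mixed precision conversion relies on. It uses the standard library `struct` module's half-precision format rather than the tooling from the PR, and the names `FP16_MAX`, `to_fp16_and_back` and `max_roundtrip_diff` are hypothetical. Python floats are double precision here, so this only approximates a float32 source tensor.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16_and_back(value):
    """Clamp to the finite fp16 range, round to fp16, and convert back."""
    clamped = max(-FP16_MAX, min(FP16_MAX, value))
    return struct.unpack("<e", struct.pack("<e", clamped))[0]


def max_roundtrip_diff(values):
    """Largest absolute error after converting each value to fp16 and back."""
    return max((abs(v - to_fp16_and_back(v)) for v in values), default=0.0)
```

A large maximum difference for a tensor suggests it should stay in float32, while a small one suggests the tensor can be converted safely.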
Yufeng Li
7ab0c607b4
add qdq support of (un)squeeze and GlobalAveragePool (#10721) 2022-03-02 10:58:35 -08:00
Funtowicz Morgan
e5c6dc1fc8
Add ability to save calibration augmented models through external data format when model size exceeds 2GB. (#10695) 2022-03-02 08:35:30 -08:00