* Update ONNX to 1.12 (#11924)
Follow-ups that need to happen after this and before the next ORT release:
* Support SequenceMap with https://github.com/microsoft/onnxruntime/pull/11731
* Support signal ops with https://github.com/microsoft/onnxruntime/pull/11778
Follow-ups that need to happen after this but don't necessarily need to happen before the release:
* Implement LayerNormalization kernel for opset version 17: https://github.com/microsoft/onnxruntime/issues/11916Fixes#11640
* Dll version fix ovep4.1 (#11953)
* Setting default version values for ovep dlls as well
* Update backend_manager.cc
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: mohsin <mohsinx.mohammad@intel.com>
* Optimize t5 encoder in beam search (#11926)
* ooptimize t5 encoder
* update
* update
* update
* refactor expand impl
* cuda tests passed
* update
* alignment
* more alignments
* review comments
* Allow saving on CPU usage for infrequent inference requests by reducing thread spinning (#11841)
Introduce Start/Stop threadpool spinning switch
Add a session config option to force spinning stop at the end of the Run()
* Restructure function inliner (#11731)
* Add nested function call tests
* Add overload for Specialize
* Pass symboltable to onnx shape inference
* Avoid renaming empty names
* Enable sequence_map tests which failed before this change
* Deprecate APIs returning raw ptrs and provide replacements (#11922)
Provider better documentation
* register signal ops for opset 17 (#11778)
* Register signal ops for op set 17
Note code is mostly being moved, not added. These ops were previously
only registered as Microsoft contrib ops and only built if
`BUILD_MS_EXPERIMENTAL_OPS=1`. They've been added to the ai.onnx
standard op set in version 17.
Main components of this change:
* Move the kernels from the conrib_ops directory to the
core directory.
* Add function bodies for ms experimental ops. This will allow
old models that use the contrib ops to continue to function.
All the function bodies consist of a single op (the
new standard op), so performance overhead should be minimal.
Minor clean-up also in this change:
* De-duplicate get_scalar_value_from_tensor: put it in a new utils.h.
* Fix some bugs that caused compilation errors with the experimental
ops. Tested with `build.sh --ms_experimental`
* Fix some spelling errors and lint violations.
* Replace a couple of switch statements with `MLTypeCallDispatcher`.
* Use `InlineVector` instead of `std::vector`.
Unblocks https://github.com/microsoft/onnxruntime/issues/11640
* Include opset 15 in Conv+BatchNormalization fusion (#11960)
* Fix WinML Tests are still targetting deprecated (deleted) experimental signal op definitions (#12006)
* fix winml tests
* remove legacy test
* switch idft -> dft+inverse attr
* upgrade opset 13->17 for signal ops tests
* [C# Tests] Add support for double tensor output in TestPreTrainedModels. (#12008)
Add support for double tensor output in TestPreTrainedModels.
* DML EP ResNet50 opset 15 fails in ONNX checker for FusedBatchNormalization lacking training_mode attribute (#12010)
FusedBatchNormalization include training_mode attribute
* Generalize native op creation (#11539)
* create op from ep
* read input count from context
* create holder to host nodes
* fix typo
* cast type before comparison
* throw error on API fail
* silence warning from minimal build
* switch to unique_ptr with deleter to host nodes
* fix typo
* fix build err for minimal
* fix build err for minimal
* add UT for conv
* enable test on CUDA
* add comment
* fix typo
* use gsl::span and string view for Node constructor
* Added two APIs - CopyKernelInfo and ReleaseKernelInfo
* pass gsl::span by value
* switch to span<NodeArg* const> to allow for reference to const containers
* fix typo
* fix reduced build err
* fix reduced build err
* refactoring node construction logic
* rename exceptions
* add input and output count as arguments for op creation
* refactor static member
* use ORT_CATCH instead of catch
* cancel try catch
* add static value name map
* format input definition and set err code
* fix comments
* fix typo
* [DML EP] Pad operator: Handle negative pad counts (#11974)
* Pad fallback to CPU
* Added queryPad in operatorRegistration.cpp
* Acknowledged PR comments
* Used any_of
* used none_of instead of any_of
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
* Add warning about future computation change for ConvTranspose with auto_pad (#11984)
* Add warning about future computation change for Convtranspose with auto_pad
* improve msg
* update TODO to make lint happy
* update more contents for warning and add if
* valid was not infected
* move it into kernel registration
* parse auto_pad myself
* try to use conv_transpose_attrs_.auto_pad directly
* update roialign cuda impl to onnx opset16 (#12036)
* roialign opset16
* fix
* fix
* Fix windows eager build break by pinning to torch version 1.11.0 (#12033)
Fix windows and linux eager build to torch 1.11.0.
* Skip Constant Folding for ops producing an optional type output (#11839)
* Disable sequence-type tests since C# infra doesn't support well (#12037)
* Extend lifetime of KernelDef when creating a standalone op (#12057)
place tmp kernel def as local variable to cover the lifetime of kernel creation
* Add targets files for new .net6 frameworks (#12016)
* Add net6 targets.
Remove maccatalyst as we don't have a native build targetting that.
* Set platform in macos targets
* Add targetFramework entries
* Move NativeLib.DllName definition and set using preprocessor values for simplicity. Couldn't get it to build with the preprocessor based setup when it was in a separate file.
Update the nuspec generation to set platform version for .net6 targets. TODO: Validate versions. I copied them from the managed nuget package the packaging pipeline generated prior to adding targets. Possibly w could/should lower some of the versions.
Hopefully the need to specify a version goes away when the release version of VS2022 supports .net6.
* Try android 31.1 as https://github.com/actions/virtual-environments/blob/main/images/win/Windows2022-Readme.md suggests that should be available on the CI machines
* Fix patch version mismatch
Add some extra debug info in case it helps
* Debug nuget location in CI
* Add workspace entry back in
* Add steps
* One more attempt with hardcoded nuget.exe path and original android31.0 version
* Better fix - found explicit nuget download and updated version there.
* flake8 fixes
* Fix black complaints.
* Exit Microsoft_ML_OnnxRuntime_CheckPrerequisites for net6 iOS.
* Removed outdated comment
* Fix DML custom operators which set descriptor heap to command list (#12059)
* Make C# runtest.sh automatically set latest opset (#12039)
* Update C# runtest.sh for opset 17
Should have been part of https://github.com/microsoft/onnxruntime/pull/11924
* get appropriate opset version from onnx doc
* use absolute rather than relative path
* fix typo in var name
* Disable DML command list reuse for Xbox (#12063)
disable cl reuse for xbox
* Add data type check in ConvAddRelu fusion (#12058)
* Add undocumented attribute to disable generation of Java bindings from the Android AAR. (#12075)
The generated bindings causes C# build errors that require workaround code. Disabling generation should avoid the need for any workarounds.
As the user has the C# ORT package with the C# to C bindings there's no need for binding generation that calls the ORT Java API (which is C# -> Java ->C).
* enable the extensions custom build for java and android (#11823)
* generate quantization parameter for outputs (#12089)
* DML EP Update to DML 1.9 (#12090)
* Update to DML 1.9
* Appease obnoxious Python formatting tool
* Fix orttraining-linux-ci-pipeline - Symbolic shape infer (#11965)
fix symbolic shape error due to upgraded numpy + legacy sympy
* check consumers of dq node before swap dq and transpose (#12099)
* check consumers of dq node before swap dq and transpose
* add unit test
Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: mohsin <mohsinx.mohammad@intel.com>
Co-authored-by: Ye Wang <52801275+wangyems@users.noreply.github.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: G. Ramalingam <grama@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Sheil Kumar <smk2007@gmail.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: sumitsays <sumitagarwal330@gmail.com>
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
Co-authored-by: Chun-Wei Chen <jacky82226@gmail.com>
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: Wil Brady <25513670+WilBrady@users.noreply.github.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Jeff Bloomfield <38966965+jeffbloo@users.noreply.github.com>
Co-authored-by: Justin Stoecker <justoeck@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
Co-authored-by: Yufeng Li <liyufeng1987@gmail.com>
Co-authored-by: pengwa <pengwa@microsoft.com>
* Try manually installing trt8.4 in multi-gpu pipeline
* Remove stmts that clean up cmake, ctest. Update tensorrt repository name passed to get_docker_image.py
* Update trt and cudnn home
* Don't install trtexec cli tool.
* Increase job timeout
* Revert timeout change and use trt placeholder builder build option
* update trt 8.4ga
* trt 8.4 linux ci pipeline
* fix cmake
* placeholder_builder
* trt 8.4 windows pipeline
* gpu package pipeline
* trt 8.4.1.5 , packaging pipeline updates
* python packaging
* ctest timeout
* python packaging test
* bump timeout
* python format
* format
* revert
* newline
* enable trt python tests
* typo
* python format
* disable on windows
* aten op for inference
* fix build error
* more some code to training only
* remove domain from operator name
* move aten_op_executor ext out from ortmodule
* add pipeline
* add exec mode
* fix script
* fix ut script
* fix test pipeline
* failure test
* rollback
* bugfix
* resolve comments
* enable aten for python build only
* fix win build
* use target_compile_definitions
* support io binding
* turn off aten by default
* fix ut
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
Co-authored-by: zhijxu <zhijxu@microsoft.com>
* update TVM
* get alignment constant from TVM
* update TVM_VM_SetInputs to upstream with TVM API
* fix CI issue: update TVM EP dependencies
* add sudo
* revert changes needed to install missing package
* add package for TVM EP CI
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
Description:
Add the extra param to match gelu in PyTorch in the contrib symbolic function
Motivation and Context
Why is this change required? What problem does it solve?
The symbolic function in /onnxruntime/python/tools/pytorch_export_contrib_ops.py is missing a recently added parameter approximate. We add this parameter and use the exporter defined gelu if approximate is "tanh".
* move all logic for ubuntu dockerfiles
* pass in trt version
* update trt 8.0 file
* downgrade protobuf
* uncomment
* and
* change to 8.0
* update dockerfiles
* checkout protobuf based on version
* adding last dockerfile:
:
* checkout 3.10 protobuf
* fix checkout version
* update to 8.2
* keep only one submodule sync
* cleanup
* Delete Dockerfile.custom-trt-perf
* create checkout submodules script
* properly compare decimals in bin/sh
* combine build ort paths
* deprecate TRT 7.2
* only checkout protobuf if we checkout older onnx-tensorrt
* only pull nvidia container if true, update image
* downgrade protobuf only if we checkout onnx-trt
* Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines
* Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines
* Add quotes to avoid path splitting
* address shellcheck
* use shellcheck suggestions
Description: Format all python files under onnxruntime with black and isort.
After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame.
#11315, #11316
* remove rocm42 CI
* update torch to v1.11.0
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
* Enabling ov-ep for 2022.1 Release
->Added ov-ep 2022.1 flow
->Validated CPU Unit tests with OV
Master using onnxruntime_test_all unit
tests.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix for output mismatch b/w OpenVINO and ONNX
Refer:
https://jira.devtools.intel.com/browse/CVS-60310
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabling Adobe ops
->Enable Resize op for iGPU
->Enable Add op for iGPU
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removing irrelevant conditions
->Removing some conditions from
GetCapability() which are now not
required. (Removed conditions for
OV version support less than 2021.2)
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enable upsample op
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enable Adobe proxy-e model
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removing any extra conditions for Opset13 ops
* Opset13 changes
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Exception handling for devices
* Added comments
* Implement GPU Throttling feature
*Added GPU Throttling feature for iGPU's.
when user enables it as a runtime option,
it helps in reducing overall CPU usage
of the application
*Added changes to exercise this option
using onnxruntime_perf_test application.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Renaming the runtime config option
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added the user to video and users group
* Handling_GPU.0_GPU.1
* Handling special conditions
->Handling corner cases for
device_type checks
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Modification to include new api 2.0 changes in the code
* Added opset13 changes
->Enabled Few ops
->Added Debug info for case 3b in getcapability()
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabling ov-ep for 2022.1 Release
->Added ov-ep 2022.1 flow
->Validated CPU Unit tests with OV
Master using onnxruntime_test_all unit
tests.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix for output mismatch b/w OpenVINO and ONNX
Refer:
https://jira.devtools.intel.com/browse/CVS-60310
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabling Adobe ops
->Enable Resize op for iGPU
->Enable Add op for iGPU
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removing irrelevant conditions
->Removing some conditions from
GetCapability() which are now not
required. (Removed conditions for
OV version support less than 2021.2)
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enable upsample op
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enable Adobe proxy-e model
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removing any extra conditions for Opset13 ops
* Opset13 changes
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Exception handling for devices
* Added comments
* Implement GPU Throttling feature
*Added GPU Throttling feature for iGPU's.
when user enables it as a runtime option,
it helps in reducing overall CPU usage
of the application
*Added changes to exercise this option
using onnxruntime_perf_test application.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Renaming the runtime config option
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added the user to video and users group
* Handling_GPU.0_GPU.1
* Handling special conditions
->Handling corner cases for
device_type checks
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added opset13 changes
->Enabled Few ops
->Added Debug info for case 3b in getcapability()
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Log comments updated
* Changes to enable 2.0 api
* Enabling ov-ep for 2022.1 Release
->Added ov-ep 2022.1 flow
->Validated CPU Unit tests with OV
Master using onnxruntime_test_all unit
tests.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix for output mismatch b/w OpenVINO and ONNX
Refer:
https://jira.devtools.intel.com/browse/CVS-60310
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabling Adobe ops
->Enable Resize op for iGPU
->Enable Add op for iGPU
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removing irrelevant conditions
->Removing some conditions from
GetCapability() which are now not
required. (Removed conditions for
OV version support less than 2021.2)
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enable upsample op
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enable Adobe proxy-e model
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removing any extra conditions for Opset13 ops
* Opset13 changes
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Exception handling for devices
* Added comments
* Implement GPU Throttling feature
*Added GPU Throttling feature for iGPU's.
when user enables it as a runtime option,
it helps in reducing overall CPU usage
of the application
*Added changes to exercise this option
using onnxruntime_perf_test application.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Renaming the runtime config option
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added the user to video and users group
* Handling_GPU.0_GPU.1
* Handling special conditions
->Handling corner cases for
device_type checks
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added opset13 changes
->Enabled Few ops
->Added Debug info for case 3b in getcapability()
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix build issue
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixes issues
*Fixes compiler warnings c4458 on windows.
*Fixes the bug in device_type check logic
*Adds print info for enable_opencl_throttling
option in onnxruntime_perf_test
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* commit to make openvino_2021.4 compatible
* Fixed IO Buffer Optimization
* Fix output names issue
* Fix 2021.3 branch
* Bug Fix for Multiple inputs/outputs
- Assigns the right output_name and
input_name for the graph when
returned by CompiledModel::inputs()
OV function.
- Also takex care of output mismatch
issue b/w openvino output and onnx
output
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Add comments for the changes made
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* IO Buffer Changes
* Commit for Disabling GPU Throttling for 2021.4
* Updated branch
* Fix windows build
->Fixed windows build in debug mode
->Disabled scatternd3_tensor_int64
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed CPP Unit tests for CPU
-Fixed shrink, MVN, ReduceL2, Maxpool,
upsample, scatter, slice, reshape,
unsqueeze.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed first set of GPU Tests
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed additional failing tests on GPU
->Added conditions to disable certain ops
under certain conditions
->Disabled certain tests
->Added some op supports for no_dimension
supported
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added Expand op support for CPU
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added condition for squeeze op
->Shape can't have empty axes attribute
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Add support for LessOrEqual op function
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* OV Interface wait for replaced by indefinite wait call
* use names from ONNX model to access OV tensors
This chnage is to use the input/output names
retrieved from original onnx model to access
OV tensors and to check if there's any input
or output names mismatch b/w ONNX naming
and OV naming.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixes Myriad unit tests and other issues
->Fixes Myriad CPP unit tests
->Fixes output mismatch issue with models with
sub graph partitioning
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix segfault issue
->Fixed case 3b condition in get_capability()
which was causing the segfault issue
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed build isuse with ov 2021.4 with I/O buffer
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Disables performance counters for I/O Buffer
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed inputs/outputs mismatch for HDDL with 2022.1
Signed-off-by: Mohammad Amir Aqeel <mohammadx.amir.aqeel@intel.com>
* Fix to enable GPU FP16
* Enabled mlperf_ssd_mobilenet_300 model fully on CPU
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added ov version specific dll packaging for nuget
* Fixed conditions for few ops
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Dockerfile updates
* Updated License Info
-Updated the copyrights License Info
-modified FP16 transformations with OV 2022.1
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Disabling mlperf_ssd_mobilenet_300 model
->Disabled this model for openvino. The
test is failing in Internal_CI pipelines.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Disabling failing python CPU Tests
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed flake8 python errors
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: hdgx <harinix.d.g@intel.com>
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
Co-authored-by: mohsinmx <mohsinx.mohammad@intel.com>
Co-authored-by: Mohammad Amir Aqeel <mohammadx.amir.aqeel@intel.com>
* Update orttraining release pipelines to use torch 1.11.0
* Change requirements_torch...txt to requirements.txt
* Update cuda cmake architectures and clean up old files
* migrate to 1ES Hosted Pool
* migrate to Kusto database
* refactor and organize ep names with ORT prefix
* standardize TRT benchmarking with save/load engine, input binding, and workspace
* Add TRT 8.2 to ep perf pipeline
* update model_list.json with full onnx zoo
* add anubis credentials
* add anubis credentials
* clarify trt variables
* get system info from docker image
* remove unwanted commenting