Commit graph

7452 commits

Author SHA1 Message Date
Changming Sun
b25437ec41
Upgrade protobuf version (#13100)
Upgrade protobuf version from 3.18.1 to 3.18.3 to address CVE-2022-1941
2022-09-26 21:30:28 -07:00
Hector Li
073dbba784
skip the placeholder inputs while adding node inputs as sub-graph inputs (#13106)
Fix an issue where all node inputs are added as sub-graph inputs even if the input does not exist.

Solution:
Skip the placeholder inputs while adding node inputs as sub-graph inputs. E.g., in the ONNX node test test_resize_upsample_scales_linear, the 2nd input, roi, is empty.
2022-09-26 21:06:29 -07:00
Yufeng Li
c746083344
use parameter names to specify argument mapping (#13108)
use parameter names to specify argument mapping to avoid mismatches.
2022-09-26 20:56:59 -07:00
RandySheriffH
e3bdba37a8
Mitigate prefast static analysis warnings (#13032)
Address static analysis warnings:

https://msdata.visualstudio.com/DefaultCollection/Vienna/_workitems/edit/1944984/

https://msdata.visualstudio.com/DefaultCollection/Vienna/_workitems/edit/1943846/

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 17:06:33 -07:00
RandySheriffH
77a066c700
Drop nuphar from java API (#13107)
Drop nuphar from:

- java API
- tvm.cmake
- run_build.sh
2022-09-26 17:06:08 -07:00
Vincent Wang
0e98fb4e9b
Fix Build Error for CUDA113 Introduced by 6efa9d9 (#13089)
Fix build error for CUDA version < 11.4. The error was introduced by
commit 6efa9d9e10.
2022-09-27 07:57:14 +08:00
Edward Chen
b62ba0b5a7
Remove old enable_linux_gpu_tests parameter from template invocation. (#13102)
Remove old enable_linux_gpu_tests parameter from template invocation in build-perf-test-binaries-pipeline.yml.
2022-09-26 16:27:40 -07:00
Chen Fu
e9b1bbc6a5
fix Numpy array None judgement bug (#13103)
fix https://github.com/microsoft/onnxruntime/issues/13054
2022-09-26 15:15:32 -07:00
RandySheriffH
a83a9ed6b0
Remove miscellaneous nuphar configs (#13070)
Remove a handful of nuphar related configurations after deprecation.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 13:41:28 -07:00
Jian Chen
44c14e8cbb
Adding test case for conv per channel with QDQ format (#13041)
**Description**: Adding test case for conv per channel with QDQ format
2022-09-26 16:25:28 -04:00
Dale Phurrough
2ae33b3613
fix CuDNN lib path for Windows (#12974)
Fixes microsoft/onnxruntime#12969

### Motivation and Context

Build is broken, can't find cudnn.lib with nvidia official install of
cuDNN

Alternative method is to use `IF(EXISTS
${onnxruntime_CUDNN_HOME}/lib/x64/cudnn.lib)` to test for legacy
location and only add the legacy dir to the path, else add the current
official `lib/` dir.
2022-09-26 13:23:38 -07:00
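The alternative described in the cuDNN path fix above (#12974) is a simple probe: use the legacy `lib/x64` directory only if `cudnn.lib` is found there, otherwise fall back to the current official `lib/` directory. The actual change lives in CMake; the Python sketch below only mirrors that decision, and the helper name `pick_cudnn_lib_dir` is hypothetical.

```python
from pathlib import Path


def pick_cudnn_lib_dir(cudnn_home: str) -> Path:
    """Return the directory expected to hold cudnn.lib (hypothetical helper)."""
    home = Path(cudnn_home)
    legacy_dir = home / "lib" / "x64"
    # Legacy layout: cudnn.lib lives under lib/x64.
    if (legacy_dir / "cudnn.lib").exists():
        return legacy_dir
    # Current official layout: cudnn.lib lives directly under lib/.
    return home / "lib"
```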
Nat Kershaw (MSFT)
ce2ea44a56
Try to fix GitHub labeling action (#12999) 2022-09-26 11:46:28 -07:00
Changming Sun
7116825aef
Add CMAKE_CUDA_ARCHITECTURES list to python packaging pipeline (#13081) 2022-09-26 10:22:43 -07:00
mayavijx
ade0d29174
Updated Dockerfile.ubuntu_openvino with OV 2022.2 official release (#13069)
Updated Dockerfile.ubuntu_openvino to use the OV 2022.2 official release;
it previously used a pre-release only.
2022-09-26 00:15:52 -07:00
dependabot[bot]
365a01397d Bump protobuf from 3.17.0 to 3.18.3 in /tools/ci_build
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.17.0 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.17.0...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-25 20:00:36 -07:00
Scott McKay
b820256f34
Add check that bias and scale sizes match norm_size in LayerNormalization (#13060)
### Description
Add check that bias and scale sizes match norm_size in
LayerNormalization.

### Motivation and Context
#12917
2022-09-26 08:22:49 +10:00
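The size check added in the LayerNormalization commit above (#13060) can be illustrated roughly as follows. This is a sketch, not the ORT kernel code: it assumes `norm_size` is the product of the input dimensions from `axis` onward, and the function name `check_layer_norm_sizes` is made up for the example.

```python
from math import prod


def check_layer_norm_sizes(x_shape, axis, scale_size, bias_size=None):
    """Raise ValueError if scale/bias sizes differ from norm_size (illustrative only)."""
    rank = len(x_shape)
    if axis < 0:
        axis += rank  # normalize a negative axis
    # Assumption: norm_size is the number of elements normalized together,
    # i.e. the product of the dimensions from `axis` to the end.
    norm_size = prod(x_shape[axis:])
    if scale_size != norm_size:
        raise ValueError(f"scale size {scale_size} != norm_size {norm_size}")
    if bias_size is not None and bias_size != norm_size:
        raise ValueError(f"bias size {bias_size} != norm_size {norm_size}")
    return norm_size
```

For an input of shape `(2, 3, 4)` with `axis=-1`, scale and bias would each need 4 elements under this assumption.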
Hariharan Seshadri
19c51376c4
Introduce QDQ transformer fusion tools for ordered quantized ops (#12661) 2022-09-24 23:22:44 -07:00
dependabot[bot]
6587a85f8f Bump protobuf from 3.18.1 to 3.18.3 in /tools/ci_build/github/linux/tvm
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.18.1 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.18.1...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-24 21:12:16 -07:00
dependabot[bot]
c1ff4b468d Bump protobuf in /tools/ci_build/github/linux/docker/scripts/manylinux
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.18.1 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.18.1...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-24 15:21:50 -07:00
Chih-Hsuan Yen
9abd6e3a30
setup.py: use packaging instead of wheel.vendored.packaging (#13083) 2022-09-24 08:32:44 -07:00
ytaous
2cc4e7e5c2
[Build] Fix broken AMD CI (#13082)
Introduced by https://github.com/microsoft/onnxruntime/pull/12949
- add missing lines in excluded list

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2022-09-24 00:21:25 -07:00
dependabot[bot]
63c3b21902
Bump protobuf from 3.18.1 to 3.18.3 in /tools/ci_build/github/linux/docker/inference/x64/python/cpu/scripts (#13080) 2022-09-23 22:15:36 -07:00
Scott McKay
8e2528bad2
More LayerNormalization opset 17 changes (#13066)
### Description
Add CUDA kernel.
Support double in CPU kernel and only write Mean and InvStdDev values if the optional outputs exist.

### Motivation and Context
Complete opset 17 support for LayerNormalization
2022-09-24 13:22:44 +10:00
Changming Sun
9e21ffb649
Add license header to some files. (#13074) 2022-09-23 18:46:02 -07:00
Baiju Meswani
bcc93ab17c
Deprecate ORTTrainer (#13022) 2022-09-23 18:10:09 -07:00
Tianlei Wu
6f27659ceb
Fix prefast warnings (#13017)
Fix prefast warnings:
[C26451](https://learn.microsoft.com/en-us/cpp/code-quality/C26451?view=msvc-170)
[C26436](https://learn.microsoft.com/en-us/cpp/code-quality/c26436?view=msvc-170)
[C26814](https://learn.microsoft.com/en-us/cpp/code-quality/C26814?view=msvc-170)
2022-09-23 12:50:23 -07:00
Baiju Meswani
8bb16ab900
Propagate environment variable to docker image (#13031) 2022-09-23 11:23:49 -07:00
Zhang Lei
6efa9d9e10
Add more qordered int8 operators for CUDA provider (#12949)
Attention, Quantize/Dequantize etc.
Update QOrderedMatmul's schema, updated unittest.
Verified test data for QOrdered Attention.

Co-authored-by: Zhang Lei <phill.zhang@gmail.com>
Co-authored-by: Lei Zhang <zhalei@microsoft.com>
2022-09-23 10:49:33 -07:00
Edward Chen
5f611b63a1
Make classes IKernelTypeStrResolver and IKernelLookup have protected destructors. (#13059) 2022-09-23 09:16:45 -07:00
PeixuanZuo
2ef1f8b93e
[ROCm] add tunable SkipLayerNorm for ROCm EP (#12817)
**Description**:
Related PRs: https://github.com/microsoft/onnxruntime/pull/12803
https://github.com/microsoft/onnxruntime/pull/12816
https://github.com/microsoft/onnxruntime/pull/12821

1. Add tunable SkipLayerNorm for the ROCm EP.
2. Keep the original implementation when tuning is disabled.

2022-09-23 16:39:44 +08:00
Changming Sun
eafd67b8fd
Update CUDA version to 11.6 and refactor python packaging pipeline (#13002)
1. Update CUDA version from 11.4 to 11.6.
2. Update Manylinux version
3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet.
4. Refactor python packaging pipeline: 
    a. Split Linux GPU build job to two parts, build and test, so that the
build part doesn't need to use a GPU machine
    b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file.
5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipelines to fail. I have created an ADO task for this.
2022-09-23 00:29:27 -07:00
Yi Zhang
92237567d3
add opset17 node test data (#13062)
### Description ###
Add opset17 node test data

2022-09-23 14:33:37 +08:00
cloudhan
a24b41d92e
Move all TunableOp related facilities to EP level directory (#12857)
Some Ops in the EP directory, rather than the contrib_ops directory, will
require TunableOp. We will also need to add EP level session tuning
options for it. So move all of that code at once.

Also remove duplicated utility functions.
2022-09-23 11:10:19 +08:00
Faith Xu
8fb3f05cd6
Add cgmanifest file in codeowner list (#13042)
Marks @onnxruntime-admin as owner for cgmanifest file to help review
changes in dependencies and version updates.
2022-09-22 18:58:01 -07:00
Scott McKay
394c249c7c
Add ONNX LayerNormalization(17) (#12978)
**Description**: LayerNormalization is now part of the ONNX spec as of
opset 17.
We had a LayerNormalization contrib op, which (incorrectly) was
registered in the ONNX domain. Use that implementation for the ONNX
operator.

Update skip_layer_norm_fusion.cc. There are other optimizers that use
LayerNormalization that need updates as well.

**Motivation and Context**
#12916
2022-09-23 09:49:27 +10:00
wangxiyuan
952c99304a
Add CANN EP (#12416)
**Description**: This PR adds Ascend CANN execution provider support.

**Motivation and Context**
- Why is this change required? What problem does it solve?
As described in the linked issue, CANN is the API layer for Ascend
processors. Adding the CANN EP allows users to run ONNX models on Ascend
hardware via onnxruntime.
  The detailed changes:
  1. Added CANN EP framework.
  2. Added the basic operators to support ResNet and VGG models.
  3. Added C/C++ and Python API support.
- If it fixes an open issue, please link to the issue here.
   https://github.com/microsoft/onnxruntime/issues/11477

Author: 
lijiawei <lijiawei19@huawei.com>
wangxiyuan <wangxiyuan1007@gmail.com>

Co-authored-by: FFrog <ljw1101.vip@gmail.com>
2022-09-22 14:53:40 -07:00
Scott McKay
078ceab1db
Use full ORT package for onnxruntime-react-native. (#13037)
**Description**: 
Use full ORT package for onnxruntime-react-native.

Left the params required for the mobile build in comments so they're
easily discovered if we need to create onnxruntime-react-native-mobile
in the future.

**Motivation and Context**
Remove barrier to using ORT with react native as the mobile package that
was being used supports a limited range of opsets/operators/types, and
requires ORT format models. The full package will run any model.
2022-09-23 07:20:03 +10:00
ashari4
c4a7e88fc8
QuantizeBFP and DequantizeBFP (#12833)
* `QuantizeBFP` and `DequantizeBFP` schemas - similar to
`QuantizeLinear` and `DequantizeLinear`.
* BFP datatype is represented as a `uint8` tensor with shape and stride
metadata. This is preferable to adding a new datatype for BFP, which is
more disruptive and [discouraged by
PyTorch](https://discuss.pytorch.org/t/training-with-custom-quantized-datatype/152132/2).

Context: 

The Microsoft Floating Point (BFP) datatype shares an exponent for every
n numbers called a “bounding box.” Each number still has its own
mantissa and sign bits. BFP has been shown to incur 3-4x less cost
(energy and area) than BFloat16 and INT8 counterparts without reductions
in accuracy for the ImageNet benchmark as described in [Rouhani
2020](https://proceedings.neurips.cc/paper/2020/file/747e32ab0fea7fbd2ad9ec03daa3f840-Paper.pdf).

Requirements:

* There are many variants of BFP (number of mantissa bits, number of
shared exponent bits, size of bounding box, custom bit fields, etc.)
* The size and layout of a BFP variant varies across hardware
* The bounding box can be over arbitrary dimensions; for example, over the
channel "C" dimension in an N x C x H x W tensor for convolution

Goals of this PR:

* Add initial versions of QuantizeBFP and DequantizeBFP operators to
enable QDQ-style quantization with BFP. Once the schemas stabilize, we
can consider upstreaming to ONNX.
* Add some basic type and shape inferencing tests; tests that run on an
EP will be a follow-up.
2022-09-22 14:02:55 -07:00
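The shared-exponent idea described in the QuantizeBFP/DequantizeBFP commit above (#12833) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the operator's bit layout: the bounding box is a flat Python list, `mantissa_bits` is an arbitrary parameter, and both function names are hypothetical.

```python
import math


def bfp_quantize_block(values, mantissa_bits=4):
    """Quantize one bounding box to (shared_exponent, signed integer mantissas)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # Shared exponent: the largest magnitude in the box is below 2**shared_exp.
    shared_exp = math.frexp(max_abs)[1]
    # Each value keeps its own sign and mantissa, scaled by the shared exponent.
    scale = 2.0 ** (mantissa_bits - shared_exp)
    limit = 2 ** mantissa_bits - 1
    mantissas = [max(-limit, min(limit, round(v * scale))) for v in values]
    return shared_exp, mantissas


def bfp_dequantize_block(shared_exp, mantissas, mantissa_bits=4):
    """Reconstruct approximate values from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - mantissa_bits)
    return [m * scale for m in mantissas]
```

Values much smaller than the largest one in the box lose precision, which is why the choice of bounding box size and dimension matters.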
Hariharan Seshadri
057567f39f
Fix bug in Attention Fusion (#13050) 2022-09-22 13:46:59 -07:00
sfatimar
cccbe90764
Openvino ep 2022.2 v4.2 (#13023)
These changes align the OV 2022.2 release with ORT. Changes:
CPU FP16 support, dGPU support, RHEL Dockerfile, Ubuntu 20 Dockerfile

**Motivation and Context**
- This change is required to ensure ORT-OpenVINO Execution Provider is
aligned with latest changes.

Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: shamaksx <shamax.kshirsagar@intel.com>
Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com>
Co-authored-by: pratiksha <mohsinx.mohammad@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com>
Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>
2022-09-22 12:31:40 -07:00
Edward Chen
6ea8780886
Replace std::exclusive_scan() with for loop because std::exclusive_scan() is not implemented in GCC 7. (#13045) 2022-09-22 09:30:22 -07:00
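For readers unfamiliar with the operation replaced in the commit above (#13045), an exclusive scan writes, at each position, the running total of all earlier elements. The original change is in C++; the Python loop below is only a sketch of the same idea, and the function name is chosen for the example.

```python
def exclusive_scan(values, init=0):
    """Exclusive prefix sum: out[i] = init + sum(values[:i])."""
    out = []
    running = init
    for v in values:
        out.append(running)  # record the total *before* adding the current element
        running += v
    return out
```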
Pranav Sharma
c7a4093db8
Fix prefast static analysis warning by not calling delete explicitly. (#13048)
### Description
Fix prefast static analysis warning by not calling delete explicitly.

### Motivation and Context
Prefast runs.
2022-09-21 20:55:38 -07:00
yf711
2f9b358d16
Replace the source of TRT version and fix the build (#13046)
Replace the source of TRT version and fix the build issue that occurred in the
Linux environment

### Description
Replace the source of TRT version from NvInfer.h to NvInferVersion.h


### Motivation and Context

On Linux, including NvInfer.h in
tensorrt_execution_provider_utils.h causes errors when building the ORT
unit tests, because the ORT unit tests treat deprecation warnings as errors.
(This error did not show up in the Linux CI pipeline, however.)

### Verification
The new change has been tested under both Linux & Windows environments.
2022-09-21 19:13:37 -07:00
Justin Chu
2f9b559391
Declutter pull_request_template (#13026)
Turn the instructions in pull_request_template into comments so
templated language no longer clutters the PR description section.
2022-09-21 14:56:30 -07:00
shalvamist
851b0ce936
[js/web][Fix] - updating the C API to catch non-tensor data (#12811)
Added a tensor validation check on the input. This change fixes the
silent abort that WASM performs when processing non-tensor data in
"OrtGetTensorData".

**Motivation and Context**
Currently, when we try to process non-tensor data through
OrtGetTensorData, an exception is thrown, which results in a silent abort
from WASM (assuming WASM was built without exception handling).

I added a check in the C API to catch this case and output a meaningful
message to the user.

[example_error_github_12622.zip](https://github.com/microsoft/onnxruntime/files/9464328/example_error_github_12622.zip)
2022-09-21 13:59:17 -07:00
Dwayne Robinson
8de5535e9c
Reduce test warning spew due to CPU fallback (#13035)
**Description**: I added a warning in
https://github.com/microsoft/onnxruntime/pull/10831 a week ago, but it's
noisy for onnxruntime_test_all.exe because very few tests explicitly
specify the providers they use, relying on implicit CPU, which makes it
harder to see actual errors in the output. So reduce this noise (that
is, if no EPs were explicitly provided, display no warning).

Sample output spew:
```
2022-09-20 20:08:50.6299388 [W:onnxruntime:NchwcOptimizerTests, session_state.cc:1030 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
```

**Motivation and Context**
- *Why is this change required? What problem does it solve?* Test output
noise makes it harder to debug real failures.
- *If it fixes an open issue, please link to the issue here.* NA
2022-09-21 13:58:18 -07:00
Jian Chen
051a0a67a5
Cjian/per channels not working (#13038)
**Description**: This fixes the bug where per_channel quantization isn't
working when axis == 0
2022-09-21 16:24:23 -04:00
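As background for the per-channel fix above (#13038): per-channel quantization computes one scale per slice along a chosen axis, and axis 0 is a perfectly valid choice. The sketch below is not the quantization tool's code. It assumes a 2-D weight stored as nested lists and symmetric int8 scaling, and `per_channel_scales` is a hypothetical name.

```python
def per_channel_scales(weight, axis, qmax=127):
    """Compute one symmetric scale per channel of a 2-D weight (nested lists)."""
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a 2-D weight")
    # Treat each row (axis 0) or each column (axis 1) as one channel.
    channels = weight if axis == 0 else [list(col) for col in zip(*weight)]
    scales = []
    for channel in channels:
        max_abs = max(abs(v) for v in channel)
        # Guard against all-zero channels to avoid a zero scale.
        scales.append(max_abs / qmax if max_abs > 0 else 1.0)
    return scales
```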
Jian Chen
6248b69795
Fixes bug which makes quantized_input_names = [] (#13029)
**Description**: Fixes bug in `tools/quantization/operators/split.py`
which would make `quantized_input_names == []`
2022-09-21 14:25:38 -04:00
yf711
240aeadf1a
Update engine hash id generator with model name/model content/metadata (#13015)
**Update engine hash id generator with model name/model
content/metadata**

**Description**: 

* Updated the engine id generator, which now uses model name/model inputs &
outputs/env metadata (instead of model path) to generate the hash
* New bridged APIs were introduced in order to enable the id generator in the
TRTEP utility
**Motivation and Context**
- Why is this change required? What problem does it solve? To fix this
[issue](https://github.com/triton-inference-server/server/issues/4587)
caused by id generator using model path

How to use:
* Call [TRTGenerateMetaDefId(const GraphViewer& graph_viewer, HashValue&
model_hash)](0fcce74a56/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc (L715))
to generate hash id for TRT engine cache

How to test:
* On Windows, run:
* .\onnxruntime_test_all.exe
--gtest_filter=TensorrtExecutionProviderTest.TRTMetadefIdGeneratorUsingModelHashing
* .\onnxruntime_test_all.exe
--gtest_filter=TensorrtExecutionProviderTest.TRTSubgraphIdGeneratorUsingModelHashing

**Appendix**
* [Existing engine id generator that uses model
path](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/framework/execution_provider.cc#L112-L182)
2022-09-21 11:10:05 -07:00
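The motivation in the engine hash id commit above (#13015) is that an id derived from the model path changes whenever the file moves, while one derived from the model's name, inputs, outputs and environment metadata does not. A minimal sketch of that idea follows. It is not the TRT EP implementation: SHA-256 from Python's `hashlib` is swapped in for whatever hash ORT uses, and the function name and parameters are hypothetical.

```python
import hashlib


def model_hash_id(model_name, input_names, output_names, env_metadata):
    """Hash model identity fields instead of the model path (illustrative only)."""
    h = hashlib.sha256()
    parts = [model_name, *input_names, *output_names]
    # Sort metadata keys so the id does not depend on dict insertion order.
    parts += [f"{k}={v}" for k, v in sorted(env_metadata.items())]
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(4, "little"))
        h.update(data)
    return h.hexdigest()[:16]
```

Because the path is never an input, copying the same model to a different directory yields the same id in this sketch.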
Adrian Lizarraga
39e20686a0
[EP Perf Dashboard] Fix incorrect calls to trtexec with fp16 inputs (#13018) 2022-09-21 10:31:45 -07:00