7481 commits

Author SHA1 Message Date
Justin Chu
402e1995f0
Create clang-tidy CI (#12653)
Update clang-tidy config to prepare for creating a CI workflow to run
clang-tidy.
Added clang-tidy check in CI

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2022-09-30 08:05:38 -07:00
Changming Sun
5f1bc8ff56
Add "--parallel" to the build flags of WASM pipeline (#13179) 2022-09-30 06:54:39 -07:00
Yi Zhang
a862b0cad1
increase ios_CI_coreml stage timeout limit (#13157)
### Description
As title

### Motivation and Context
Recently, the workflow has been canceled due to timeout more
frequently.
2022-09-30 14:45:14 +08:00
Changming Sun
dd2aec170d
Update Coding_Conventions_and_Standards.md (#7705) 2022-09-29 23:32:37 -07:00
sumitsays
f3180f3ac8
[DML EP] Enable graph inside DML Graph (#13073)
### Description
Kernels like Attention, BatchNormalization15, etc, can be implemented by
using multiple DML APIs. This PR paves the path for graph-based kernel
implementation.
As part of this PR, every kernel in DML EP will now wrap its
DML_OPERATOR_DESC into a graph and send it to FusedGraphKernel.
FusedGraphKernel will stitch this smaller graph into its main DML_GRAPH.

All ONNX conformance tests and WinML model tests passed.

Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
2022-09-29 23:32:20 -07:00
cloudhan
c93cb8f949
Revert "Enable ROCm to use tunable GEMM" (#13160)
Reverts microsoft/onnxruntime#12853 due to CI pipeline problem.
2022-09-30 14:01:16 +08:00
Ye Wang
c8781b77f6
Decouple use_sequence_as_input_ids from has_hidden_states (#13130)
### Description

A fix for parity issue in huggingface bart model with beam search
https://github.com/microsoft/onnxruntime/pull/12779

2022-09-29 22:45:52 -07:00
Scott McKay
32395e2e16
Add handling for variadic inputs/outputs in a function. (#13140)
### Description
Add handling for variadic inputs/outputs in a function.

### Motivation and Context
#13121
2022-09-30 14:51:17 +10:00
Scott McKay
4d8510611b
Update find_optimizer_opset_version_updates_required.py to use the ONNX headers to determine the latest opset. (#12484)
**Description**: 
Use the onnx headers to find the latest opset for each operator. This
allows the script to detect optimizers with
`graph_utils::IsSupportedOptypeVersionAndDomain` calls that need
updating when run during the update of the onnx commit id. Without this
change issues are not detected until a new kernel is registered.

**Motivation and Context**
Detect optimizers that need updates as part of the ONNX update process.
2022-09-29 16:55:22 +10:00
Vincent Wang
6c63c1c9ee
Multiple Gather to Split Fusion (#13095)
For the code below, found in some transformer models:
```
fused_qkv = fused_qkv.view(batch_size, seq_length, self.num_heads, 3, self.head_dim)
return fused_qkv[..., 0, :], fused_qkv[..., 1, :], fused_qkv[..., 2, :]
```

The exported graph will contain 3 Gather nodes, and ORT's GatherGrad
CUDA implementation is currently slow. This pattern can be fused into
one Split, so that fewer kernels are launched for the computation;
Split/Concat (for the gradient) also performs better than Gather/GatherGrad.

In a real example, one GatherGrad takes 15ms and there are 3 in
each layer of the graph; after the fusion, one Concat takes only 35us.
The total time of a step improves from 1.5s to 0.4s.
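
Why the fusion is safe can be sketched in plain Python: indexing the axis of size 3 three times yields the same three tensors as splitting that axis once. The nested-list layout, sizes, and helper names below are illustrative assumptions, not code from the ORT optimizer.

```python
# Toy stand-in for fused_qkv with shape [tokens, 3, head_dim].
# All names and sizes are assumptions made for illustration.

def gather(rows, index):
    """Pick slot `index` from every [3, head_dim] row (one Gather node)."""
    return [row[index] for row in rows]


def split3(rows):
    """Produce all three slots in one pass (one Split, then a squeeze)."""
    q, k, v = [], [], []
    for row in rows:
        q.append(row[0])
        k.append(row[1])
        v.append(row[2])
    return q, k, v


head_dim = 4
tokens = 2
fused_qkv = [
    [[t * 100 + s * 10 + d for d in range(head_dim)] for s in range(3)]
    for t in range(tokens)
]

# Three separate Gather calls versus a single Split.
three_gathers = (gather(fused_qkv, 0), gather(fused_qkv, 1), gather(fused_qkv, 2))
one_split = split3(fused_qkv)
assert three_gathers == one_split
```

The sketch only shows the forward equivalence; the speedup reported above comes from the gradient side, where one Concat replaces three GatherGrad kernels.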
2022-09-29 11:09:57 +08:00
PeixuanZuo
3157cdb19a
[ROCm] Fix MIGraphX ciagent user Permissions issues (#13137)
### Description

Fix the MIGraphX CI pipeline failure.

The MIGraphX pipeline is currently disabled. It will be enabled when this PR is merged.

2022-09-29 10:25:02 +08:00
Baiju Meswani
5182d6610d
Upgrade pytorch to 1.12.1 for training pipelines (#13128) 2022-09-28 17:59:49 -07:00
sfatimar
c9a86fa27f
Openvino GPU Unit/Python Tests fix failure (#13122)
### Description
This PR fixes the iGPU unit and Python tests.
It also adds the packaging pip package to the manylinux build Dockerfile.


### Motivation and Context
This change is required to make sure iGPU Unit Test/Python Tests with OV
are fixed

Co-authored-by: shamaksx <shamax.kshirsagar@intel.com>
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com>
Co-authored-by: pratiksha <mohsinx.mohammad@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com>
2022-09-28 16:00:06 -07:00
Adam Pocock
388d3cf847
[Java] Fix OnnxSequence semantics (#13012)
Previously OnnxSequence would flatten out a list of tensors into a
single output array assuming they were all scalar values. This doesn't
accurately represent the semantics of an ONNX sequence, but was what the
semantics appeared to be years ago when I first wrote that class. This
PR changes it so that the `getValue` method on `OnnxSequence` unwraps
the sequence and returns `List<? extends OnnxValue>` allowing the user
to process the individual ONNX values separately. It's done this way
rather than returning a multidimensional array for a tensor and a Java
map for a map as multidimensional arrays are very inefficient in Java
and best practice when operating with an OnnxTensor in Java is to use a
`java.nio.ByteBuffer`. So allowing users to access each `OnnxTensor`
individually allows them to control how the data is materialised on the
Java heap.
2022-09-28 15:53:30 -07:00
Edward Chen
55ae71c160
Reduce Objective-C static analysis build time. (#13149) 2022-09-28 15:49:48 -07:00
PeixuanZuo
c26bb1bb19
Allow fastgelu/skiplayernorm profile by pass args from commandline (#13025)
**Description**:
This allows us to quickly launch a microbenchmark session, for example:
`python skip_layer_norm_test.py 8 128 128 float32`
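
A minimal sketch of how four positional command-line values like these could be parsed. The commit message does not say what each value means, so the names `batch_size`, `seq_len`, `hidden_size` and `dtype` are assumptions, and this is not the actual test script.

```python
import argparse


def parse_args(argv):
    # Hypothetical parser: argument names are guesses, only the count and
    # the example values come from the commit message.
    parser = argparse.ArgumentParser(description="hypothetical microbenchmark launcher")
    parser.add_argument("batch_size", type=int)
    parser.add_argument("seq_len", type=int)
    parser.add_argument("hidden_size", type=int)
    parser.add_argument("dtype", choices=["float16", "float32"])
    return parser.parse_args(argv)


# Same values as the example invocation above.
args = parse_args(["8", "128", "128", "float32"])
```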
2022-09-28 15:48:59 -07:00
cloudhan
32c2c4b480
Change ROCm to use tunable GEMM (#12853)
Change ROCm to use tunable GEMM. It is not enabled in this PR. This will drastically improve GEMM performance for some shape and dtype configurations. When enabled, it will benefit the overall performance of BERT inference and, hopefully, training.
2022-09-28 16:21:54 +08:00
PeixuanZuo
5e4ebbd9d9
[ROCm] add MIGraphX ci pipeline (#11569)
**Description**:
Add MIGraphX CI pipeline, test build and unit tests.
This PR is based on #11492

Pipeline :
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=765711&view=results
2022-09-28 10:59:30 +08:00
Yi Zhang
19774f9230
print test case's skip reason (#13118)
### Description
as title


### Motivation and Context
easy to debug
2022-09-28 09:33:31 +08:00
Baiju Meswani
f99d00fa38
Add rel* branches to upload training packages to final storage (#13124) 2022-09-27 17:20:17 -07:00
Rachel Guo
9a44a69653
Refactor NNAPI EP OpBuilder/OpSupportChecker structure (#13065)
### Description

As title

-Split long OpBuilder and OpSupportChecker files into individual
operator files.

-Add OpBuilder/SupportChecker registry factories.

-Combine the functionality of op_builder and op_support_checker into one
op_builder.
### Motivation and Context
The NNAPI OpBuilder was split into OpBuilder (for EP::Compile) and
OpSupportChecker (for EP::GetCapability).
At the time it was a reasonable choice, but OpBuilder and OpSupportChecker
share some logic and have to use an additional helper.

Clean up now to merge the NNAPI OpBuilder and OpSupportChecker into a single
OpBuilder (similar to what the CoreML EP has).
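
The merged design described above can be sketched roughly as follows. This is a Python illustration of the idea only; the real change is C++, and every class, method, and dictionary name here is hypothetical.

```python
class OpBuilder:
    """Hypothetical merged interface: one class answers both questions."""

    def is_op_supported(self, node):
        # Used when deciding what the EP can take (the GetCapability side).
        raise NotImplementedError

    def add_to_model(self, model, node):
        # Used when building the compiled model (the Compile side).
        raise NotImplementedError


class ReluOpBuilder(OpBuilder):
    def is_op_supported(self, node):
        return len(node["inputs"]) == 1

    def add_to_model(self, model, node):
        # The support check is shared, so no separate helper is needed.
        if not self.is_op_supported(node):
            raise ValueError("unsupported node")
        model.append(("relu", node["inputs"][0]))


# Registry keyed by operator type; each operator would live in its own file.
OP_BUILDERS = {"Relu": ReluOpBuilder()}


def is_node_supported(node):
    builder = OP_BUILDERS.get(node["op_type"])
    return builder is not None and builder.is_op_supported(node)


relu = {"op_type": "Relu", "inputs": ["x"]}
unknown = {"op_type": "NotRegistered", "inputs": ["x"]}
model = []
OP_BUILDERS["Relu"].add_to_model(model, relu)
```

Keeping the check and the build step on one object means the shared logic sits in one place per operator.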
2022-09-27 17:12:09 -07:00
Edward Chen
457a53c92f
Fix static analysis warning by making derived classes final. (#13123)
Follow up to #13059, which only updated the base classes. This change ensures that the derived classes will not be base classes.
2022-09-27 15:45:45 -07:00
Scott McKay
e19163167e
Update React Native documentation to reflect change to use full ORT (#13091)
### Description
Update React Native documentation to reflect change to use full ORT. Fix
broken links.


### Motivation and Context
ORT v1.13 uses the full ORT package. Instructions for performing a
custom build did not cover this.

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2022-09-28 08:11:58 +10:00
Edward Chen
5c89c37f7f
Consolidate enabled/default kernel def type constraints (#13034)
Consolidate enabled/default kernel def type constraint types into enabled.
2022-09-27 14:04:15 -07:00
Faith Xu
440f31668f
Labeler: Test /i regex for case sensitivity (#13115)
### Description
Test if regex change will make auto labeling case insensitive
2022-09-27 13:58:09 -07:00
PeixuanZuo
13d1a3c007
[ROCm] add SkipLayerNorm vectorize Regular case (#12821)
**Description**:
Add the SkipLayerNorm vectorized regular case:
1. When hidden size <= 1024, the SkipLayerNormTunable op can use both the
small case and the regular case.
2. When hidden size > 1024, the SkipLayerNormTunable op can only use the
regular case.
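
The selection rule above can be restated as a small sketch. The threshold of 1024 comes from the commit message; the function and constant names are hypothetical and the real tunable op is implemented for ROCm, not in Python.

```python
SMALL_CASE_MAX_HIDDEN_SIZE = 1024  # threshold stated in the commit message


def candidate_implementations(hidden_size):
    """Return which (hypothetically named) cases a tunable op may try."""
    if hidden_size <= SMALL_CASE_MAX_HIDDEN_SIZE:
        return ["small", "regular"]
    return ["regular"]


small_and_regular = candidate_implementations(768)
regular_only = candidate_implementations(2048)
```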

2022-09-27 12:52:10 -07:00
leqiao-1
43766ee36d
Fix OLive build pipeline (#13114) 2022-09-27 10:19:58 -07:00
Vincent Wang
94e34ace15
Bugfix for SimplifiedLayerNormalization (#12975)
This PR is to fix https://github.com/microsoft/onnxruntime/issues/12930
and https://github.com/microsoft/onnxruntime/issues/12579.

In detail:
- For the CPU EP, the current implementation of SimplifiedLayerNormalization
doesn't support input and scale having different data types, so if the
sub-graph contains a Cast op, the sub-graph will not be fused. This guarantees
that both the inputs and the output have the same data type.
- For the CUDA EP, add (fp16, float) support to the (T, V) type constraints so
that all combinations of fp16 and float are supported in the implementation.
With the fix, the original model can be run with
SimplifiedLayerNormalization, which also helps to improve the perf.
2022-09-27 14:24:16 +08:00
RandySheriffH
237ccc01c7
Remove one last nuphar reference (#13111)
Remove one last nuphar reference.
2022-09-26 23:02:36 -07:00
Changming Sun
b25437ec41
Upgrade protobuf version (#13100)
Upgrade protobuf version from 3.18.1 to 3.18.3 to address CVE-2022-1941
2022-09-26 21:30:28 -07:00
Hector Li
073dbba784
skip the placeholder inputs while adding node inputs as sub-graph inputs (#13106)
Fix an issue where all node inputs are added as sub-graph inputs even when the input does not exist.

Solution:
Skip the placeholder inputs while adding node inputs as sub-graph inputs. E.g., in the ONNX node test test_resize_upsample_scales_linear, the 2nd input (roi) is empty.
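
A minimal sketch of the idea, assuming nodes are plain dictionaries and that an omitted optional input is represented by an empty name, as ONNX does. The function name and data layout are hypothetical, not the EP's actual code.

```python
def collect_subgraph_inputs(nodes):
    """Collect names consumed by the nodes but not produced inside them."""
    produced = set()
    inputs = []
    for node in nodes:
        for name in node["inputs"]:
            if name == "":
                # Placeholder for an omitted optional input: nothing to feed.
                continue
            if name not in produced and name not in inputs:
                inputs.append(name)
        produced.update(node["outputs"])
    return inputs


# Resize with the optional 2nd input (roi) left empty.
resize = {"op_type": "Resize", "inputs": ["X", "", "scales"], "outputs": ["Y"]}
subgraph_inputs = collect_subgraph_inputs([resize])
```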
2022-09-26 21:06:29 -07:00
Yufeng Li
c746083344
use parameter names to specify argument mapping (#13108)
use parameter names to specify argument mapping to avoid mismatches.
2022-09-26 20:56:59 -07:00
RandySheriffH
e3bdba37a8
Mitigate prefast static analysis warnings (#13032)
Address static analysis warnings:

https://msdata.visualstudio.com/DefaultCollection/Vienna/_workitems/edit/1944984/

https://msdata.visualstudio.com/DefaultCollection/Vienna/_workitems/edit/1943846/

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 17:06:33 -07:00
RandySheriffH
77a066c700
Drop nuphar from java API (#13107)
Drop nuphar from:

- java API
- tvm.cmake
- run_build.sh
2022-09-26 17:06:08 -07:00
Vincent Wang
0e98fb4e9b
Fix Build Error for CUDA113 Introduced by 6efa9d9 (#13089)
Fix build error for CUDA version < 11.4. The error was introduced by
commit 6efa9d9e10.
2022-09-27 07:57:14 +08:00
Edward Chen
b62ba0b5a7
Remove old enable_linux_gpu_tests parameter from template invocation. (#13102)
Remove old enable_linux_gpu_tests parameter from template invocation in build-perf-test-binaries-pipeline.yml.
2022-09-26 16:27:40 -07:00
Chen Fu
e9b1bbc6a5
fix Numpy array None judgement bug (#13103)
fix https://github.com/microsoft/onnxruntime/issues/13054
2022-09-26 15:15:32 -07:00
RandySheriffH
a83a9ed6b0
Remove miscellaneous nuphar configs (#13070)
Remove a handful of nuphar related configurations after deprecation.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 13:41:28 -07:00
Jian Chen
44c14e8cbb
Adding test case for conv per channel with QDQ format (#13041)
**Description**: Adding test case for conv per channel with QDQ format
2022-09-26 16:25:28 -04:00
Dale Phurrough
2ae33b3613
fix CuDNN lib path for Windows (#12974)
Fixes microsoft/onnxruntime#12969

### Motivation and Context

The build is broken: it can't find cudnn.lib with NVIDIA's official install of
cuDNN.

An alternative method is to use `IF(EXISTS
${onnxruntime_CUDNN_HOME}/lib/x64/cudnn.lib)` to test for the legacy
location and only then add the legacy dir to the path, otherwise adding the
current official `lib/` dir.
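
The alternative described above is a CMake check; the same decision can be restated in Python for illustration. The function name is hypothetical and this is not part of the build scripts.

```python
import os


def cudnn_lib_dir(cudnn_home):
    """Prefer the legacy lib/x64 directory only if cudnn.lib is really there."""
    legacy_dir = os.path.join(cudnn_home, "lib", "x64")
    if os.path.exists(os.path.join(legacy_dir, "cudnn.lib")):
        return legacy_dir
    # Otherwise fall back to the current official location.
    return os.path.join(cudnn_home, "lib")
```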
2022-09-26 13:23:38 -07:00
Nat Kershaw (MSFT)
ce2ea44a56
Try to fix GitHub labeling action (#12999) 2022-09-26 11:46:28 -07:00
Changming Sun
7116825aef
Add CMAKE_CUDA_ARCHITECTURES list to python packaging pipeline (#13081) 2022-09-26 10:22:43 -07:00
mayavijx
ade0d29174
Updated Dockerfile.ubuntu_openvino with OV 2022.2 official release (#13069)
Updated Dockerfile.ubuntu_openvino, which previously used only a pre-release,
to use the OV 2022.2 official release.
2022-09-26 00:15:52 -07:00
dependabot[bot]
365a01397d
Bump protobuf from 3.17.0 to 3.18.3 in /tools/ci_build
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.17.0 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.17.0...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-25 20:00:36 -07:00
Scott McKay
b820256f34
Add check that bias and scale sizes match norm_size in LayerNormalization (#13060)
### Description
Add check that bias and scale sizes match norm_size in
LayerNormalization.

### Motivation and Context
#12917
2022-09-26 08:22:49 +10:00
Hariharan Seshadri
19c51376c4
Introduce QDQ transformer fusion tools for ordered quantized ops (#12661) 2022-09-24 23:22:44 -07:00
dependabot[bot]
6587a85f8f
Bump protobuf from 3.18.1 to 3.18.3 in /tools/ci_build/github/linux/tvm
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.18.1 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.18.1...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-24 21:12:16 -07:00
dependabot[bot]
c1ff4b468d
Bump protobuf in /tools/ci_build/github/linux/docker/scripts/manylinux
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.18.1 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.18.1...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-24 15:21:50 -07:00
Chih-Hsuan Yen
9abd6e3a30
setup.py: use packaging instead of wheel.vendored.packaging (#13083) 2022-09-24 08:32:44 -07:00
ytaous
2cc4e7e5c2
[Build] Fix broken AMD CI (#13082)
Introduced by https://github.com/microsoft/onnxruntime/pull/12949
- add missing lines in excluded list

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2022-09-24 00:21:25 -07:00