### Description
Fix SNPE build issue caused by cmake dependency refactor
### Motivation and Context
Fix issue: https://github.com/microsoft/onnxruntime/pull/14547
### Description
Introduce collective ops into the onnxruntime inference build, including:
1) AllReduce and AllGather schemas in contrib ops, controlled by the
USE_MPI flag
2) AllReduce and AllGather kernels in the CUDA EP, controlled by the
ORT_USE_NCCL flag
### Motivation and Context
Enable the collective ops in the onnxruntime inference build so that we
can run distributed inference with multiple GPUs.
The original ncclAllReduce ops in the training build require quite
complex configuration, which is not suitable for the inference case, and
they are already broken, so we introduce a new implementation.
---------
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
### Description
1. fix a bug in relative position bias kernel where seq_len > 32
2. rename extra_add_qk to relative_position_bias
3. support relative_position_bias in multihead attention (B, N, S, S*)
4. gru_gate support by Lei
### Motivation and Context
---------
Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>
Co-authored-by: Lei Zhang <zhang.huanning@hotmail.com>
### Description
### Motivation and Context
Co-authored-by: Scott McKay <skottmckay@gmail.com>
### Description
Reduce the cuda library size by:
1. refactoring beam_search_top_k to reduce template instantiation. It
saves ~56MB
2. opt out TopK for type uint*, int8_t and int16_t. It saves ~50MB.
### Motivation and Context
### Description
This is a follow-up of
https://github.com/microsoft/onnxruntime/pull/14428 for Stable Diffusion
CUDA optimizations:
(1) use NhwcConv to replace Conv in onnx graph and add Transpose nodes
accordingly
(2) reduce sequential Transpose nodes to at most one.
(3) symbolic shape infer of NhwcConv
(4) fix add bias transpose, which causes a CUDA error (launching more
than 1024 threads per block) when running inference on an fp32 model.
(5) add models (bert, bart, stable_diffusion subdirectories) to package;
(6) remove option --disable_channels_last
Note that
(1) We can add a few graph transformations to reduce Transpose nodes
further. It is not done in this PR due to time limit.
(2) The Stable Diffusion 2.1 model outputs black images. It seems that
forcing Attention to float32 could avoid the issue. However, float32
Attention is much slower.
### Motivation and Context
### Description
update VS2019 to VS 2022 in
onnxruntime-Nuget-WindowsAI-Pipeline-Official
### Motivation and Context
### Description
To unblock the pipeline failures more quickly, disable these real-model
tests from the onnx repo for now. Meanwhile, we are trying to move these
models to Azure.
### Motivation and Context
These models in the onnx repo are broken
(https://github.com/onnx/onnx/issues/4857). They were set up 4 years ago,
and the owner of these AWS instances cannot be found.
Make a basic porting effort to run the Sampling UT on the ROCm EP, based
on the commits:
https://github.com/microsoft/onnxruntime/pull/13426
https://github.com/microsoft/onnxruntime/pull/14218
1. enabling EmbedLayerNorm op
2. enabling Sampling op
3. enabling helpers to copy data from CPU->GPU for subgraph
This task is the first checkpoint. There could be other missing ops when
testing a real model.
We will migrate more code onto ROCm as needed.
Co-authored-by: Ubuntu <ettao@ettao-amd-dev1.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
…("####") should append for each input_def, not only on the last one
else branch of this if should return ignore_identity
3d7518762a/onnxruntime/core/optimizer/identical_children_consolidation.cc (L66)
identity.append("####") should append for each input_def, not only on
the last one
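The effect of where the separator is appended can be illustrated with a small sketch. This is not the code in `identical_children_consolidation.cc`; `build_identity` and its parameters are hypothetical names chosen for illustration, and only the `####` separator comes from the description above.

```python
def build_identity(op_type, input_names, per_input_separator=True):
    """Build a string key for a node from its op type and input names.

    Hypothetical helper for illustration only.
    """
    parts = [op_type, "####"]
    for name in input_names:
        parts.append(name)
        if per_input_separator:
            # A separator after every input keeps input boundaries unambiguous.
            parts.append("####")
    if not per_input_separator:
        # Buggy variant: a single separator after the last input only.
        parts.append("####")
    return "".join(parts)


# Two different input lists whose names concatenate to the same text.
inputs_a = ["ab", "c"]
inputs_b = ["a", "bc"]

# With one trailing separator, the two keys collide.
collide = build_identity("Add", inputs_a, False) == build_identity("Add", inputs_b, False)
# With a separator per input, the keys stay distinct.
distinct = build_identity("Add", inputs_a) != build_identity("Add", inputs_b)
print(collide, distinct)
```

Appending the separator inside the loop is what prevents two nodes with different inputs from being treated as identical.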
### Description
### Motivation and Context
### Description
Change the return type of the Softmax functions
(`dispatch_warpwise_softmax_forward` and
`dispatch_blockwise_softmax_forward`) from `void` to `Status`.
### Motivation and Context
The Softmax functions will call TunableOp, which returns a `Status`. It
is necessary to pass the `Status` from the inner function to the outer
function.
### Description
### Motivation and Context
---------
Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>
### Only link mpi when either use_mpi or use_nccl enabled
To fix the issue https://github.com/microsoft/onnxruntime/issues/14278.
After talking with @askhade, we think that if users want to enable
NCCL/MPI but MPI is not found, it should be a failure instead of a
warning.
So this PR made that change. As a result, to make the CIs pass, we would
need to disable NCCL/MPI explicitly in the build command. This PR takes
an alternative approach: since NCCL and MPI are not used by customers,
disable NCCL by default if "--disable_nccl" is not specified, and
disable MPI by default if "--use_mpi" is not specified.
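The "fail instead of warn" behavior described above can be sketched as follows. This is a minimal illustration, not the actual build script or CMake logic; `resolve_mpi` and its parameters are hypothetical names.

```python
def resolve_mpi(use_mpi_requested, mpi_found):
    """Decide whether MPI should be linked.

    Hypothetical helper: MPI stays off unless explicitly requested, and a
    request that cannot be satisfied is an error rather than a warning.
    """
    if not use_mpi_requested:
        # Disabled by default, regardless of whether MPI is installed.
        return False
    if not mpi_found:
        # The user asked for MPI, so a missing MPI must fail the build.
        raise RuntimeError("MPI was requested but was not found")
    return True


print(resolve_mpi(False, True))  # MPI installed but not requested
print(resolve_mpi(True, True))   # MPI requested and available
```

Defaulting to "off" means CI build commands that never asked for MPI do not need an extra flag to keep passing.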
### Motivation and Context
### Description
Add stable diffusion CUDA kernel optimizations.
The following are included:
(1) GroupNorm operator. This kernel is from TensorRT 8.5.
(2) BiasSplitGelu operator. This kernel is modified from SplitGelu of
TensorRT 8.5. We added bias to the SplitGelu.
(3) NhwcConv operator. This adds support of NHWC format (ONNX Conv
operator uses NCHW format).
(4) Update MultiHeadAttention (packed kv and no bias) for cross
attention. This could avoid transpose of kv for TRT fused cross
attention kernel.
(5) Optimization and benchmark script
Not included:
(1) Script to convert Conv to NhwcConv in onnx graph.
(2) Update symbolic shape inference for NhwcConv.
(3) Add SeqLen2Spatial operator
(4) Documents
Limitations: GroupNorm, BiasSplitGelu and NhwcConv kernels are
implemented based on stable diffusion usage. They might not be
applicable to any input size or dimensions. For example, BiasSplitGelu
requires hidden size to be 2560 | 5120 | 10240, and NhwcConv assumes 4D
input/weight.
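For readers unfamiliar with the two layouts, the difference between NCHW and NHWC for a 4D tensor can be sketched in plain Python. This only illustrates the index mapping; it is not how the CUDA kernels are implemented, and `nchw_to_nhwc` is a hypothetical helper name.

```python
def nchw_to_nhwc(x):
    """Convert a 4D nested list from [N][C][H][W] to [N][H][W][C]."""
    n, c = len(x), len(x[0])
    h, w = len(x[0][0]), len(x[0][0][0])
    return [
        [[[x[i][k][r][s] for k in range(c)] for s in range(w)] for r in range(h)]
        for i in range(n)
    ]


# One image, two channels, height 1, width 2.
nchw = [[[[1, 2]], [[3, 4]]]]
nhwc = nchw_to_nhwc(nchw)
print(nhwc)
```

In NHWC the channel values of one pixel sit next to each other, whereas in NCHW each channel is stored as a separate plane.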
There is a minor increase in binary size. For SM=75 only, the python
package wheel size grows by (33757K - 33640K) = 117 KB. It is possible
to move NHWC from a template parameter to a constructor argument to
reduce binary size (with a slight cost in performance).
Note: for RTX 4090/4080/4070 Ti, you need to build with CUDA 11.8 and
the latest cuDNN to get the best performance.
### Description
Fix REMOVE_ITEM not working.
`onnxruntime/contrib_ops/rocm/aten_ops/aten_op.cc` is hipified from
`onnxruntime/contrib_ops/cuda/aten_ops/aten_op.cc`.
The file's correct path is
`${CMAKE_CURRENT_BINARY_DIR}/amdgpu/onnxruntime/contrib_ops/rocm/aten_ops/aten_op.cc`
and it exists in the hipified source file list
`onnxruntime_rocm_generated_contrib_ops_cc_srcs`.
A better way to fix it: if we don't want to build a file, add it to the
hipify excluded files so that it is not hipified.
### Description
If we set the flag 'disable_exceptions' to build ORT,
`onnxruntime/contrib_ops/cpu/quantization/qlinear_global_average_pool.cc.o`
wouldn't generate the following symbols, which are used by qlinear_pool.c:
```
0000000000000000 W _ZN11onnxruntime7contrib27ComputeQLinearGlobalAvgPoolIaEENS_6common6StatusEPKT_fS4_PS4_fS4_lllbPNS_11concurrency10ThreadPoolE
0000000000000000 W _ZN11onnxruntime7contrib27ComputeQLinearGlobalAvgPoolIhEENS_6common6StatusEPKT_fS4_PS4_fS4_lllbPNS_11concurrency10ThreadPoolE
```
so we get an undefined symbol error for
ComputeQLinearGlobalAvgPool<uint8_t> and
ComputeQLinearGlobalAvgPool<int8_t>.
### Motivation and Context
Fix compilation issue when DISABLE_SPARSE_TENSORS is defined
### Description
There is a missing semicolon when DISABLE_SPARSE_TENSORS is defined
### Motivation and Context
Avoid a compilation failure when cmake option
`onnxruntime_DISABLE_SPARSE_TENSORS` is turned on
### Description
Fix issue with schema lookup where there are custom ops using the ONNX
domain.
Update testing infrastructure to use an explicit domain for custom ops.
Using an empty string clashes with the ONNX domain and can cause
unexpected issues. It's also a bad example for external users as our
docs point to the unit tests.
Fix a couple of places using exact matching of the node since version to
be slightly more flexible and use a range (which aligns with how the
kernel lookup works).
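The difference between exact matching and range matching of a node's since version can be sketched as below. This is only an illustration of the idea; `exact_match` and `range_match` are hypothetical names and do not mirror the actual schema or kernel lookup code.

```python
def exact_match(node_since_version, supported_version):
    """Strict check: only one specific since version is accepted."""
    return node_since_version == supported_version


def range_match(node_since_version, start_version, end_version=None):
    """Flexible check: accept any since version in [start, end].

    An end_version of None means the range is open-ended.
    """
    if node_since_version < start_version:
        return False
    return end_version is None or node_since_version <= end_version


# A node whose since version is 13 fails an exact check against 11,
# but is accepted by a handler that covers versions 11 through 13.
print(exact_match(13, 11), range_match(13, 11, 13))
```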
### Motivation and Context
Fixes a problem that came up when adding support for standalone custom
ops in an ORT format model. Separating these changes out to simplify
review.
### Description
Applies ORT node names to corresponding compiled operators or DML graph
nodes.
### Motivation and Context
This makes it easier to correlate ONNX nodes to events in PIX GPU
captures when using the DML EP. Names set in the DML graph nodes require
additional modifications to the DML runtime library (available in a
future NuGet package).
### Description
The Gather to Split optimizer fails if opset == 18. This PR fixes one
bug and extends the unit tests.
### Motivation and Context
The model produced by the optimizer does not follow onnx specifications
with opset 18.
(cherry picked from commit 414b73a02123b672e496326664cd2dc3bd6c6d24)
### Rework for PR https://github.com/microsoft/onnxruntime/pull/14068:
Enable multiple step run for adamw tests (on device training)
### Removed duplicated MACRO checks for training.
### Motivation and Context
### Description
Re-work `OrtApi::GetAvailableProviders` in a way that the data is
returned in a single allocation.
Fix exception safety issues and fix `Release` function.
Remove warning suppressions.
Fix exception safety issue in C++ API.
Fix exception safety issue in C# API.
Move EP name length enforcement to the implementation.
### Motivation and Context
The original motivation comes from
https://github.com/microsoft/onnxruntime/issues/14378.
However, the API is already implemented.
Cc: @prabhat00155
### Description
change deepspeed version in warning from 0.7.3 to 0.8.0
### Motivation and Context
The DeepSpeed version supported in ORT was updated from 0.7.3 to 0.8.0,
but the warning message wasn't updated. This PR fixes that.
### Description
https://dev.azure.com/aiinfra/ONNX%20Runtime/_workitems/edit/11263/
### Motivation and Context
This PR registers ScatterElements-16 to the DML EP
- CPU fallback is added if the reduction attribute is in use, as this is
not yet supported by DML.
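As a reference for what the `reduction` attribute changes, here is a minimal 1-D sketch of ScatterElements semantics in plain Python, following the ONNX opset 16 definition (`none`, `add`, `mul`). It is not the DML or CPU kernel, and `needs_cpu_fallback` is a hypothetical helper that only mirrors the rule stated above, assuming "in use" means any value other than `none`.

```python
def scatter_elements_1d(data, indices, updates, reduction="none"):
    """Reference 1-D ScatterElements: write updates into a copy of data."""
    if len(indices) != len(updates):
        raise ValueError("indices and updates must have the same length")
    out = list(data)
    for idx, upd in zip(indices, updates):
        if idx < 0:
            idx += len(out)  # negative indices count from the end
        if reduction == "none":
            out[idx] = upd
        elif reduction == "add":
            out[idx] += upd
        elif reduction == "mul":
            out[idx] *= upd
        else:
            raise ValueError(f"unsupported reduction: {reduction}")
    return out


def needs_cpu_fallback(reduction):
    """Hypothetical check: fall back to CPU when a reduction is requested."""
    return reduction != "none"


print(scatter_elements_1d([1, 2, 3, 4], [1, 3], [10, 20]))
print(scatter_elements_1d([1, 2, 3], [0, 0], [5, 6], reduction="add"))
```

With `add` or `mul`, repeated indices accumulate into the same element, which is the behavior that is not yet handled on the DML side.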
---------
Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
### Fix failures due to black && pyright package updates
#### Problem
In the past 3 hours, the Python lint CI has failed for all PRs. Many
files are reported as not well formatted. I believe this is due to
changes in the updated black package. There is also a pyright check
failure; after investigation, it is due to the pyright package upgrade.
##### Failure 1: "Lint Python" failure related to pyright:
```
Run jordemort/action-pyright@v1
Run $GITHUB_ACTION_PATH/script.sh
🐶 Installing reviewdog ... https://github.com/reviewdog/reviewdog
🔎 Running pyright with reviewdog 🐶 ...
+ npm exec --yes -- pyright@latest --outputjson --lib
No configuration file found.
pyproject.toml file found at /home/runner/work/onnxruntime/onnxruntime.
Loading pyproject.toml file at /home/runner/work/onnxruntime/onnxruntime/pyproject.toml
Assuming Python version 3.10
Assuming Python platform Linux
No include entries specified; assuming /home/runner/work/onnxruntime/onnxruntime
stubPath /home/runner/work/onnxruntime/onnxruntime/typings is not a valid directory.
Searching for source files
Found 628 source files
An internal error occurred while type checking file "/home/runner/work/onnxruntime/onnxruntime/tools/android_custom_build/build_custom_android_package.py": TypeError: Cannot read properties of undefined (reading 'paramType')
at map (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7982:91)
at Array.map (<anonymous>)
at filterOverloadMatchesForAnyArgs (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7982:44)
at validateOverloadsWithExpandedTypes (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7879:40)
at validateOverloadedFunctionArguments (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:8138:32)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:8904:48)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:3699:39)
at doForEachSubtype (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeUtils.ts:673:9)
at expandSubtype (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:3692:13)
at mapSubtypesExpandTypeVars (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:3723:13)
at validateCallArguments (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:8768:28)
at getTypeOfCall (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7374:36)
at getTypeOfExpression (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:1022:30)
at evaluateTypesForExpressionInContext (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:18807:21)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:866:13)
at evaluateTypeForSubnode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:19042:9)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:865:16)
at s.getTypeResult (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/timing.ts:40:20)
at O.visitReturn (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:900:48)
at O.visit (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:526:29)
at O.visitNode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:933:21)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:915:37)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at forEach (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:924:22)
at Array.forEach (<anonymous>)
at O.walkMultiple (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:922:15)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:917:18)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at O._walkStatementsAndReportUnreachable (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:2450:18)
at O.visitSuite (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:312:14)
at O.visit (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:544:29)
at O.visitNode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:933:21)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:915:37)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at O.visitFunction (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:638:18)
at O.visit (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:442:29)
at O.visitNode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:933:21)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:915:37)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at O._walkStatementsAndReportUnreachable (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:2450:18)
at O.check (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:282:14)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/sourceFile.ts:1353:29)
at s.timeOperation (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/timing.ts:44:28)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/sourceFile.ts:1350:45)
at t.LogTracker.log (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/logTracker.ts:36:20)
at t.SourceFile.check (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/sourceFile.ts:1348:33)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:1133:40)
at t.LogTracker.log (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/logTracker.ts:36:20)
at L._checkTypes (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:1103:33)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:577:30)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:646:20)
at s.runWithCancellationToken [as timeOperation] (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/timing.ts:44:28)
at L._runEvaluatorWithCancellationToken (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:2467:41)
at L.analyze (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:541:21)
at analyzeProgram (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/analysis.ts:52:33)
at t.BackgroundAnalysisProgram.startAnalysis (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/backgroundAnalysisProgram.ts:151:16)
at Timeout._onTimeout (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/service.ts:1771:67)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Error performing analysis: TypeError: Cannot read properties of undefined (reading 'paramType')
at map (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7982:91)
at Array.map (<anonymous>)
at filterOverloadMatchesForAnyArgs (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7982:44)
at validateOverloadsWithExpandedTypes (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7879:40)
at validateOverloadedFunctionArguments (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:8138:32)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:8904:48)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:3699:39)
at doForEachSubtype (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeUtils.ts:673:9)
at expandSubtype (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:3692:13)
at mapSubtypesExpandTypeVars (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:3723:13)
at validateCallArguments (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:8768:28)
at getTypeOfCall (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:7374:36)
at getTypeOfExpression (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:1022:30)
at evaluateTypesForExpressionInContext (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:18807:21)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:866:13)
at evaluateTypeForSubnode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:19042:9)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:865:16)
at s.getTypeResult (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/timing.ts:40:20)
at O.visitReturn (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:900:48)
at O.visit (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:526:29)
at O.visitNode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:933:21)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:915:37)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at forEach (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:924:22)
at Array.forEach (<anonymous>)
at O.walkMultiple (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:922:15)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:917:18)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at O._walkStatementsAndReportUnreachable (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:2450:18)
at O.visitSuite (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:312:14)
at O.visit (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:544:29)
at O.visitNode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:933:21)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:915:37)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at O.visitFunction (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:638:18)
at O.visit (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:442:29)
at O.visitNode (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:933:21)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/parseTreeWalker.ts:915:37)
at O.walk (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:303:19)
at O._walkStatementsAndReportUnreachable (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:2450:18)
at O.check (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/checker.ts:282:14)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/sourceFile.ts:1353:29)
at s.timeOperation (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/timing.ts:44:28)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/sourceFile.ts:1350:45)
at t.LogTracker.log (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/logTracker.ts:36:20)
at t.SourceFile.check (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/sourceFile.ts:1348:33)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:1133:40)
at t.LogTracker.log (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/logTracker.ts:36:20)
at L._checkTypes (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:1103:33)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:577:30)
at callback (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/typeEvaluator.ts:646:20)
at s.runWithCancellationToken [as timeOperation] (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/common/timing.ts:44:28)
at L._runEvaluatorWithCancellationToken (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:2467:41)
at L.analyze (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/program.ts:541:21)
at analyzeProgram (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/analysis.ts:52:33)
at t.BackgroundAnalysisProgram.startAnalysis (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/backgroundAnalysisProgram.ts:151:16)
at Timeout._onTimeout (/home/runner/.npm/_npx/fbb43b1786f81b3f/node_modules/pyright/dist/pyright-internal/src/analyzer/service.ts:1771:67)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
+ true
+ python3 /home/runner/work/_actions/jordemort/action-pyright/v1/pyright_to_rdjson/pyright_to_rdjson.py
Traceback (most recent call last):
File "/home/runner/work/_actions/jordemort/action-pyright/v1/pyright_to_rdjson/pyright_to_rdjson.py", line 53, in <module>
print(pyright_to_rdjson(sys.stdin))
File "/home/runner/work/_actions/jordemort/action-pyright/v1/pyright_to_rdjson/pyright_to_rdjson.py", line 8, in pyright_to_rdjson
pyright: Dict = json.load(jsonin)
File "/usr/lib/python3.10/json/__init__.py", line 293, in load
return loads(fp.read(),
File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
+ cleanup
+ '[' -n /tmp/tmp.o6rGAdR1LC ']'
+ '[' -d /tmp/tmp.o6rGAdR1LC ']'
+ rm -rf /tmp/tmp.o6rGAdR1LC
Error: Process completed with exit code 1.
```
##### Failure 2: "Python format" failure related to "psf/black@stable":
Many files are reported not well formatted, an example:
```
--- /home/runner/work/onnxruntime/onnxruntime/onnxruntime/python/onnxruntime_inference_collection.py 2023-02-01 03:25:08.361480 +0000
+++ /home/runner/work/onnxruntime/onnxruntime/onnxruntime/python/onnxruntime_inference_collection.py 2023-02-01 03:25:23.628466 +0000
@@ -103,11 +103,10 @@
"""
This is the main class used to run a model.
"""
def __init__(self):
-
# self._sess is managed by the derived class and relies on bindings from C.InferenceSession
self._sess = None
self._enable_fallback = True
would reformat /home/runner/work/onnxruntime/onnxruntime/onnxruntime/python/onnxruntime_inference_collection.py
def get_session_options(self):
```
#### Root causes
Failure 1. `pyright` published a new release, 1.1.292, about 4 hours ago:
https://www.npmjs.com/package/pyright?activeTab=versions. If we revert
to the previous release, 1.1.291, this test passes.
Failure 2. `black` published a new release a few hours ago:
https://pypi.org/project/black/#history

#### Fixes
Failure 1. Pinned `pyright` to the previous release, 1.1.291.
Failure 2. This PR first attempted to reformat all impacted files offline
with the new version of the black package, but we hit a request-size
limit when calling the format services:
```
{"severity":"ERROR","time":"2023-02-01T08:00:08.090158864Z","logging.googleapis.com/sourceLocation":{"file":"/home/runner/work/reviewdog/reviewdog/doghouse/server/github_checker.go","line":"45","function":"github.com/reviewdog/reviewdog/doghouse/server.(*checkerGitHubClient).UpdateCheckRun"},"message":"UpdateCheckRun failed: {\"message\":\"Invalid request.\\n\\nOnly 65535 characters are allowed; 89431 were supplied.\",\"documentation_url\":\"https://docs.github.com/rest/reference/checks#update-a-check-run\"}"}
```
So an alternative fix is applied here: pin the black package to the
previous release, 22.12.0.
**Would like to get some feedback from @justinchuby, feel free to make
changes based on this to unblock other PRs.**
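The rejected request in the log above comes down to a length check against the limit quoted in the error message. A rough sketch, using only the two numbers from that message (the constant and function names are hypothetical, and how the service actually counts characters is an assumption):

```python
# Limit quoted in the error message: "Only 65535 characters are allowed".
CHECK_RUN_TEXT_LIMIT = 65535


def exceeds_limit(text, limit=CHECK_RUN_TEXT_LIMIT):
    """Return True when `text` is longer than the allowed number of characters."""
    return len(text) > limit


# 89431 is the size reported as supplied in the failing request.
oversized_report = "x" * 89431
print(exceeds_limit(oversized_report))
```

Reformatting many files at once produces a report well past the limit, which is why pinning the formatter version is the smaller change.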
### Motivation and Context
* Added the OrtDnnlProviderOptions structure to expose configuration
options to the user
* The number of threads can be defined by the user with the -i flag on
the perftest
* Number of threads can also be configured via the OMP_NUM_THREADS
environment variable
* The number of threads defined in the OrtDnnlProviderOptions is
prioritized over the environment variable
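The precedence described in the bullets above can be sketched as follows. This is an illustration only, not the EP's implementation: `resolve_num_threads` is a hypothetical helper, and returning `None` when neither source is set is an assumption about the fallback.

```python
import os


def resolve_num_threads(option_threads=None, environ=None):
    """Pick the thread count: provider option first, then OMP_NUM_THREADS.

    Returns None when neither is set (assumed to mean "let the runtime decide").
    """
    environ = os.environ if environ is None else environ
    if option_threads is not None:
        # The value from the provider options wins over the environment.
        return option_threads
    env_value = environ.get("OMP_NUM_THREADS")
    if env_value:
        return int(env_value)
    return None


# Option set and environment set: the option is used.
print(resolve_num_threads(4, {"OMP_NUM_THREADS": "8"}))
# Only the environment variable is set: it is used.
print(resolve_num_threads(None, {"OMP_NUM_THREADS": "8"}))
```

Passing the environment as a parameter keeps the sketch testable without modifying the process environment.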
### Description
Avoids thread oversubscription caused by OpenMP allocating the maximum
number of threads possible for the oneDNN EP. Adds support for
OrtDnnlProviderOptions, which allows more EP customization and a
user-defined number of threads.
### Motivation and Context
- Improves performance and allows the user to fine-tune the number of
threads
Add two args for spinning control for onnxruntime_perf_test:
1. Stop spinning entirely for threads in intra-op thread pool.
2. Stop spinning only between ort runs.
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
### Description
Upgrade protobuf to 3.20.2, the same version used by onnx 1.13.0
### Motivation and Context
Per component governance requirement. Fixes #14060
An unused-parameter error occurs in two situations:
1. Compiling protobuf
`onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66:
error: unused parameter ‘prototype’ [-Werror=unused-parameter]`
2. Including onnx_pb.h
```
2023-01-28T10:20:15.0410853Z FAILED: CMakeFiles/onnxruntime_pybind11_state.dir/onnxruntime_src/onnxruntime/python/onnxruntime_pybind_iobinding.cc.o
......
2023-01-28T10:20:15.0466024Z from /build/Debug/_deps/onnx-src/onnx/onnx_pb.h:51,
2023-01-28T10:20:15.0466958Z from /onnxruntime_src/include/onnxruntime/core/framework/to_tensor_proto_element_type.h:10,
....
2023-01-28T10:20:15.0609678Z /build/Debug/_deps/onnx-build/onnx/onnx-operators-ml.pb.h:1178:25: required from here
2023-01-28T10:20:15.0610895Z /onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66: error: unused parameter ‘prototype’ [-Werror=unused-parameter]
2023-01-28T10:20:15.0611707Z cc1plus: all warnings being treated as errors
```
https://dev.azure.com/onnxruntime/2a773b67-e88b-4c7f-9fc0-87d31fea8ef2/_apis/build/builds/874605/logs/22
### Fix build error on Windows when building with "--enable_language_interop_ops --cmake_extra_defines onnxruntime_DISABLE_ABSEIL=ON"
This is a follow-up to
https://github.com/microsoft/onnxruntime/pull/14309, which fixed the
build with onnxruntime_DISABLE_ABSEIL=ON.
Going further, if we also pass --enable_language_interop_ops, the
following two errors occur:
```
test_symm_qgemm.cpp
test_transpose.cpp
onnxruntime_session.lib(inference_session.obj) : error LNK2019: unresolved external symbol "void __cdecl onnxruntime::L
oadInterOp(class std::basic_string<wchar_t,struct std::char_traits<wchar_t>,class std::allocator<wchar_t> > const &,cla
ss std::vector<struct Ort::CustomOpDomain,class std::allocator<struct Ort::CustomOpDomain> > &,class std::function<void
__cdecl(char const *)> const &)" (?LoadInterOp@onnxruntime@@YAXAEBV?$basic_string@_WU?$char_traits@_W@std@@V?$allocato
r@_W@2@@std@@AEAV?$vector@UCustomOpDomain@Ort@@V?$allocator@UCustomOpDomain@Ort@@@std@@@3@AEBV?$function@$$A6AXPEBD@Z@3
@@Z) referenced in function "public: __cdecl <lambda_f3a907e0b0a0e11d80d305605215cce8>::operator()(class std::shared_pt
r<class onnxruntime::Model> &)const " (??R<lambda_f3a907e0b0a0e11d80d305605215cce8>@@QEBA@AEAV?$shared_ptr@VModel@onnxr
untime@@@std@@@Z) [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_trainer.vcxproj]
onnxruntime_session.lib(inference_session.obj) : error LNK2019: unresolved external symbol "void __cdecl onnxruntime::L
oadInterOp(class onnx::ModelProto const &,class std::vector<struct Ort::CustomOpDomain,class std::allocator<struct Ort:
:CustomOpDomain> > &,class std::function<void __cdecl(char const *)> const &)" (?LoadInterOp@onnxruntime@@YAXAEBVModelP
roto@onnx@@AEAV?$vector@UCustomOpDomain@Ort@@V?$allocator@UCustomOpDomain@Ort@@@std@@@std@@AEBV?$function@$$A6AXPEBD@Z@
5@@Z) referenced in function "public: __cdecl <lambda_340b7b787b9c0f81848d348e60fe6c91>::operator()(class std::shared_p
tr<class onnxruntime::Model> &)const " (??R<lambda_340b7b787b9c0f81848d348e60fe6c91>@@QEBA@AEAV?$shared_ptr@VModel@onnx
runtime@@@std@@@Z) [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_trainer.vcxproj]
C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxruntime_test_trainer.exe : fatal error
LNK1120: 2 unresolved externals [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_trainer.
vcxproj]
onnxruntime.vcxproj -> C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxruntime.dll
onnxruntime_test_utils.vcxproj -> C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxrun
time_test_utils.lib
CUDACOMPILE : nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may
be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\pengwa\dev\onnxruntime
\build\Windows\RelWithDebInfo\custom_op_library.vcxproj]
cuda_ops.cu
CUDACOMPILE : nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may
be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\pengwa\dev\onnxruntime
\build\Windows\RelWithDebInfo\onnxruntime_test_cuda_ops_lib.vcxproj]
```
```
kernel_type_str_resolver_utils_test.cc
local_kernel_registry_test.cc
C:\Users\pengwa\dev\onnxruntime\onnxruntime\test\framework\allocation_planner_test.cc(1388,9): error C2220: the followin
g warning is treated as an error [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_all.vcxp
roj]
C:\Users\pengwa\dev\onnxruntime\onnxruntime\test\framework\allocation_planner_test.cc(1388,9): warning C4067: unexpected
tokens following preprocessor directive - expected a newline [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebI
nfo\onnxruntime_test_all.vcxproj]
```
### Motivation and Context