Commit graph

9455 commits

Author SHA1 Message Date
cloudhan
87bef1f3f2
Move composable_kernel to deps.txt (#17245) 2023-08-23 17:39:16 -07:00
Dmitri Smirnov
33c87f6283
ORT_ENFORCE on the iterator must come before iterator is dereferenced. (#17265)
### Description
Move `ORT_ENFORCE` on the iterator before iterator is used for the first
time.
2023-08-23 17:20:01 -07:00
Baiju Meswani
6c95d959f3
Make batchnorm training mode available in inference only package (#17270) 2023-08-23 15:19:11 -07:00
Dmitri Smirnov
fdc3bcae20
Disable local symbol table for function shape inferencing. (#17267)
### Description
Temporarily disable symbol tables.

### Motivation and Context
Local symbol tables mark unrelated shapes re-use and cause inference to
error out.

https://github.com/microsoft/onnxruntime/issues/17061
2023-08-23 14:46:21 -07:00
Yulong Wang
8b18d48c7c
[js/webgpu] make IndicesHelper implementation implicit (#17193)
### Description
This change makes it no longer required to call indicesHelper.impl() in
shader code.
2023-08-23 14:41:35 -07:00
Rachel Guo
aed7c6ffc7
Exclude fp16 support flag definition from minimal build (#17259)
### Description
As title.

### Motivation and Context
Reduce minimal build binary size for mobile to meet office team
requirement.

cc @chenfucn

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2023-08-23 10:13:19 -07:00
Scott McKay
b3cb775cf9
Two fixes involving minimal builds (#17000)
### Description
- allocation planner was breaking if graph had no nodes
- in this particular model a branch of an If node returned an outer
scope value directly.

- if model used non-tensor types and sparse tensors are disabled, the
call to IsSparseTensor causes an exception which prematurely terminates
the code.
- it's perfectly fine to check if a value is a sparse tensor when
support for them is disabled. We just can't do anything with that
OrtValue, which is what the current ifdefs after the call to
IsSparseTensor handle.




### Motivation and Context
Fix model execution failure for partner with model that uses sequences
in a minimal build with sparse tensors disabled.
2023-08-23 16:01:22 +10:00
BoarQing
d21a2f064b
[VITISAI] fix compile error for onnxruntime (#17252)
### Description
Updated the code to pass in the missing parameter


### Motivation and Context
Compile error. See https://github.com/microsoft/onnxruntime/issues/17139

Co-authored-by: Yueqing Zhang <yueqingz@amd.com>
2023-08-22 22:40:39 -07:00
Ashwini Khade
56102ecbdd
On-Device Training - Enable loading from buffer (#16417) 2023-08-22 19:59:32 -07:00
Edward Chen
ae62d752d6
Prevent GSL_SUPPRESS arguments from being modified by clang-format (#17242)
Prevent `GSL_SUPPRESS` arguments from being modified by clang-format and update existing usages.

clang-format was changing something like `GSL_SUPPRESS(r.11)` to `GSL_SUPPRESS(r .11)`.

For some compilers (e.g., clang), the `gsl::suppress` attribute takes a quoted string argument. We don't want to insert spaces there.
2023-08-22 18:26:53 -07:00
kunal-vaishnavi
4b3477f171
Add Whisper scripts (#17043)
### Description
This PR adds benchmark scripts for Whisper. It is a follow-up to [this
PR](https://github.com/microsoft/onnxruntime/pull/17020) that adds the
LLaMA scripts.



### Motivation and Context
This PR enables benchmarking Whisper across various configurations.
2023-08-22 18:14:44 -07:00
Arthur Islamov
5842144d98
[js/web] JSEP Gemm for opset 13 (#16936)
### Description
Added JSEP Gemm registration for opset 13. Without it, Gemm was falling
back to the CPU provider, which has it registered for opset 13.

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
2023-08-22 18:13:20 -07:00
kunal-vaishnavi
edac3ef150
Add LLaMA scripts (#17020)
### Description
This PR adds the following scripts for LLaMA:
- LLaMA conversion (support for TorchScript and Dynamo exporters)
- LLaMA parity
- LLaMA benchmark
- LLaMA quantization
- LLaMA integration with [Hugging Face
Optimum](https://github.com/huggingface/optimum)



### Motivation and Context
This PR adds scripts for using LLaMA. There is a [follow-up
PR](https://github.com/microsoft/onnxruntime/pull/17043) for adding
scripts for Whisper.
2023-08-22 18:05:11 -07:00
Guenther Schmuelling
d3d3dde844
fix webgpu split (#17258)
fix webgpu split for the case of split_sizes coming from input[1]
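To illustrate the case this fix concerns, here is a minimal sketch of Split semantics when the sizes arrive as a second input rather than as an attribute. It is plain Python over a 1-D list, not the WebGPU implementation; `split_by_sizes` and its parameter names are hypothetical.

```python
from typing import List, Sequence


def split_by_sizes(data: Sequence[float], split_sizes: Sequence[int]) -> List[List[float]]:
    """Split a 1-D sequence into consecutive chunks of the given sizes."""
    if any(size < 0 for size in split_sizes):
        raise ValueError("split sizes must be non-negative")
    if sum(split_sizes) != len(data):
        raise ValueError("split sizes must sum to the length of the split axis")

    outputs = []
    offset = 0
    for size in split_sizes:
        # Each output takes the next `size` elements along the split axis.
        outputs.append(list(data[offset:offset + size]))
        offset += size
    return outputs


# The data plays the role of input[0]; the sizes play the role of input[1].
chunks = split_by_sizes([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [1, 2, 3])
```

The sizes are validated against the axis length up front, so a mismatch fails loudly instead of silently truncating the last output.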
2023-08-22 16:49:22 -07:00
shaahji
d76dbc4fc3
Issue#16990: Cast -> AllToAll -> Cast fails with random output (#17075) 2023-08-22 12:47:23 -07:00
Edward Chen
bd8a488f4b
Enable verbose logging in unit test program with environment variable. (#17133)
Enable verbose logging in unit test program with environment variable.
E.g., `ORT_UNIT_TEST_MAIN_LOG_LEVEL=0 ./onnxruntime_test_all --gtest_filter="<test that I want to see more logs for>"`.
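The general pattern, reading a log level from an environment variable and falling back to a default, can be sketched as follows. This is an illustration in Python, not the test program's actual code: the default level, the accepted range, and the function name `log_level_from_env` are assumptions, and only the variable name comes from the commit message.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "ORT_UNIT_TEST_MAIN_LOG_LEVEL"  # name taken from the commit message
DEFAULT_LEVEL = 2            # assumed default for this sketch
MIN_LEVEL, MAX_LEVEL = 0, 4  # assumed range; 0 is treated as most verbose


def log_level_from_env(environ: Optional[Mapping[str, str]] = None) -> int:
    """Return the log level from the environment, or the default if unset or invalid."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None:
        return DEFAULT_LEVEL
    try:
        level = int(raw)
    except ValueError:
        # Non-numeric values are ignored rather than treated as errors.
        return DEFAULT_LEVEL
    if not MIN_LEVEL <= level <= MAX_LEVEL:
        return DEFAULT_LEVEL
    return level
```

Passing the environment in as a mapping keeps the function easy to exercise without mutating `os.environ`.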
2023-08-22 12:13:52 -07:00
Yi Zhang
61a79436e2
Common pre-build steps of Windows CI (#16970)
### Description
Unify some pre-build common steps.

### Motivation and Context
In the long run, other devs should only focus on build option and test
commands.
It would reduce mistakes and maintenance cost to use common template
steps.
There will be more PRs to achieve the goal.
2023-08-22 18:09:55 +08:00
PeixuanZuo
d5c565156d
[ROCm] add SimplifiedSkipLayerNorm implementation (#17213)
add SimplifiedSkipLayerNorm implementation
2023-08-22 12:06:58 +08:00
cloudhan
4e6cec4d09
Update ck and enable test (#16383)
Apply the fix in https://github.com/ROCmSoftwarePlatform/composable_kernel/issues/728
Introduce more kernel instances and allow the introduction of streamk and splitk.
2023-08-22 11:08:55 +08:00
Baiju Meswani
aae9a52e8b
Avoid pushing cpu package to https://download.onnxruntime.ai/ (#17238) 2023-08-21 15:47:07 -07:00
Dmitri Smirnov
ced0cfbfea
[C#]Fix API Comment (#17236)
### Description
Fix comment reference to a renamed public API.

### Motivation and Context
Avoid confusion of incorrect docs.

We want this in 1.16 release
2023-08-21 15:46:31 -07:00
Hector Li
618f4839d1
[QNN EP] Re-enable some node tests for QNN (#17237)
### Description
Re-enable some node tests for QNN
2023-08-21 13:58:17 -07:00
Changming Sun
e2b6827a59
Add a CUDA 12.x pipeline and improve install_third_party_deps.ps1 (#17231)
### Description
1. Add a CUDA 12.x pipeline
2. Improve install_third_party_deps.ps1: avoid using Start-process.
Directly call the command instead.

### Motivation and Context
Since our official packages and all CI pipelines still use CUDA 11.x, we need extra pipelines to validate our source-code-level compatibility with CUDA 12.x. Note that the prebuilt binaries on our release page are not compatible with CUDA 12.x; please do not report bugs for that.

AB#15152
2023-08-21 13:04:36 -07:00
Emmanuel Ferdman
08ca624d2b
Fix: update hyperlinks to the Jupyter notebooks (#16145)
### Description
This PR fixes broken hyperlinks in the documentation that should lead
users to Jupyter notebooks. Currently, the hyperlinks are not working as
intended. The PR resolves this issue by updating the hyperlinks to
correctly direct users to the Jupyter notebooks.


### Motivation and Context
It fixes broken hyperlinks leading to the Jupyter notebooks.
2023-08-21 09:53:05 -07:00
Sheil Kumar
cbaa008391
Bump DirectML version from 1.12.0 to 1.12.1 (#17225)
Bump DirectML version from 1.12.0 to 1.12.1

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-08-20 09:55:38 -07:00
kunal-vaishnavi
4bea5ec513
Add Whisper export with beam search test cases (#17228)
### Description
This PR adds test cases for the custom export of [Whisper with beam
search](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/python/tools/transformers/models/whisper).



### Motivation and Context
This PR checks that Whisper can be exported and runs with parity.
2023-08-20 00:58:08 -07:00
Chi Lo
9445539e2c
Update dependency for deps.txt (#17220)
https://github.com/microsoft/onnxruntime/pull/17059 updates deps.txt and
we also need to update cgmanifest.json and upload the files to Azure
DevOps


https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=342803&view=results
for testing
2023-08-19 00:43:25 -07:00
Yulong Wang
6fc3fd9ece
[js/webgpu] support Cast operator (#16489)
### Description
support `Cast` operator for webgpu backend.

Cast operator for webgpu backend currently only supports f32, u32, i32
and bool.
2023-08-18 23:51:03 -07:00
Yulong Wang
bf1c62c181
check in build script for webgpu (#17126)
### Description
check in build script for webgpu described in gist
https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce

Once this PR gets merged, I can update the gist to use this file.
2023-08-18 23:50:29 -07:00
Edward Chen
d6cd41cfc1
[CoreML EP] Add Shape, Gather, and Slice ops (#17153)
Add CoreML EP shape related ops:
- Shape
- Gather
- Slice

Add support for int64/int32 inputs in CoreML EP.
2023-08-18 22:34:34 -07:00
Edward Chen
2b4cc24d5c
[CoreML EP] Limit input shapes to at most rank 5 (#17086)
When considering nodes for the CoreML EP, limit input shapes to at most rank 5.
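A rank check of this kind might look like the sketch below. It is written in Python for illustration only and assumes every input shape has a known rank; how the EP treats inputs with unknown rank is not stated here. The name `all_inputs_within_rank_limit` is hypothetical.

```python
from typing import Sequence

MAX_SUPPORTED_RANK = 5  # limit stated in the commit message


def all_inputs_within_rank_limit(input_shapes: Sequence[Sequence[int]]) -> bool:
    """Return True if every input shape has rank <= MAX_SUPPORTED_RANK."""
    # The rank of a shape is simply its number of dimensions.
    return all(len(shape) <= MAX_SUPPORTED_RANK for shape in input_shapes)
```

A node whose inputs fail this check would be left to another execution provider rather than assigned to CoreML.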
2023-08-18 20:33:40 -07:00
Yulong Wang
3426954525
disable browser stack tests (#17224)
### Description
disable browser stack tests
2023-08-18 17:14:12 -07:00
Changming Sun
3cec88bd12
FIX: memory leak checker is incompatible with std::stacktrace (#17209)
### Description
When I worked on PR #17173, I didn't notice that
onnxruntime\core\platform\windows\debug_alloc.cc also needs to call
dbghelp functions like SymInitialize. So, if we use vc runtime's
stacktrace functionality, vc runtime will initialize/uninitialize the
dbghelp library independently and vc runtime's stacktrace helper DLLs
get unloaded before our memory leak checker starts to work. Then, when we
call SymSetOptions, it crashes.

More details:
In VC runtime the C++23 stacktrace functions are implemented on top of
dbgeng.dll. In C:\Program Files\Microsoft Visual
Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\crt\src\stl\stacktrace.cpp,
you can see it has:
```
                dbgeng = LoadLibraryExW(L"dbgeng.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32);
```
The dbgeng.dll is a wrapper around dbghelp.dll. It calls SymInitialize
and SymCleanup. dbgeng.dll gets unloaded before our memory leak check
starts to run. In theory we should be able to call SymInitialize again
if the previous user who called SymInitialize has also called
SymCleanup. However, users can use
SymRegisterCallback/SymRegisterCallback64/SymRegisterCallbackW64 to
register callback functions to dbghelp.dll. These callback functions
need to be alive when SymSetOptions(and some other dbghelp APIs) get
called.

2023-08-18 17:10:33 -07:00
Changming Sun
6db72165eb
Fix python packaging test pipeline (#17204)
### Description
1. Fix python packaging test pipeline. There was an error in
tools/ci_build/github/linux/run_python_tests.sh: it installed a
released version of the onnxruntime python package from pypi.org to run the
test. It is supposed to pick the one from the current build.
2. Refactor the pipeline to allow choosing the cmake build type from the web
UI when manually triggering a build. For now this feature is for Linux only,
because I don't want to change too much when we are about to cut a
release branch. After that I will expand it to all platforms. This
feature is useful for debugging pipeline issues. We may also consider
having a nightly pipeline that runs all tests in Debug mode, which may catch
extra bugs because in debug mode we can enforce range checks.

Test run:
https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=342674&view=results

### Motivation and Context
Currently the pipeline has a crash error. 

AB#18580
2023-08-18 14:51:26 -07:00
xhcao
dd3b2cefd6
[js/webgpu] Support int32 type for binary (#16901)
### Description
Enable typed binary and support int32 type for binary.

Co-authored-by: Xing Xu <xing.xu@intel.com>

2023-08-18 12:19:01 -07:00
Adam Louly
c0b6c6c94b
Add SGDOptimizer in the on-device training offline tooling (onnxblock) (#17085)
### Description
Adding SGDOptimizer to on device training onnxblock
2023-08-18 10:50:39 -07:00
Changming Sun
ee09a5ff35
Add DISABLE_CUSPARSE_DEPRECATED flag to CUDA build (#17207)
This is to suppress a warning and make Windows CUDA 12.2 build work.
2023-08-18 10:25:49 -07:00
Hariharan Seshadri
a476dbf430
[JS/WebGPU] Support Tile operator (#17123)
### Description
As title

### Motivation and Context
Improve WebGPU op coverage
2023-08-18 10:07:21 -07:00
satyajandhyala
7d1a5635a0
[JS/Web] Added SkipLayerNormalization operator. (#17102)
### Description
Add SkipLayerNormalization operator to JSEP.



2023-08-18 09:59:03 -07:00
RandySheriffH
9266cf1772
Skip setting the name when AzureEP enabled. (#17208)
Skip setting the name when AzureEP enabled.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-08-18 09:53:36 -07:00
Ashwini Khade
68a670c7f8
Move some tests from CUDA only to CPU (#17189)
### Description
Minor PR to move some CUDA only on-device training tests to CPU as well.
This is to make sure we have good coverage for CPU too.



2023-08-18 09:44:57 -07:00
Tianlei Wu
d65aa5400c
clean up transformers scripts (#17179)
(1) Remove class BertOptimizationOptions, which was deprecated a long
time ago
(2) Move sys path settings to `__init__.py`, and update imports
(3) Fix bert_perf_test to run properly.
(4) Fix an onnx path in a whisper test case
(5) Fix a few typos
(6) Update comments in bert_perf_test regarding graph inputs
2023-08-17 23:14:49 -07:00
Jack
78b35652a3
fix issue with obtaining the decoder layer number when converting the T5 model. (#17185)
### Description
fix issue with obtaining the decoder layer number when converting the T5
model.

### Motivation and Context
fix issue: https://github.com/microsoft/onnxruntime/issues/17072

Test with
[byt5-small](https://huggingface.co/google/byt5-small/tree/main) model,
which has 12 encoder layers and 4 decoder layers.
Here is the log.

![image](https://github.com/microsoft/onnxruntime/assets/3481539/ff1b69c5-f485-4301-a333-9ee2a984df07)
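The distinction this fix relies on, that encoder and decoder depths can differ, can be sketched as below. The `T5LikeConfig` class is a hypothetical stand-in whose field names are modeled on the Hugging Face T5 configuration (an assumption of this sketch); the conversion script's actual code is not shown in this log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class T5LikeConfig:
    """Hypothetical stand-in for a T5-style model config."""
    num_layers: int                           # encoder layer count
    num_decoder_layers: Optional[int] = None  # decoder layer count, if set


def decoder_layer_count(config: T5LikeConfig) -> int:
    """Prefer the explicit decoder depth; fall back to the encoder depth."""
    if config.num_decoder_layers is not None:
        return config.num_decoder_layers
    return config.num_layers


# byt5-small, per the commit message: 12 encoder layers and 4 decoder layers.
byt5_small = T5LikeConfig(num_layers=12, num_decoder_layers=4)
```

Using the encoder depth for the decoder would give 12 instead of 4 for this model, which is the kind of mismatch the fix addresses.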
2023-08-17 23:14:22 -07:00
Adrian Lizarraga
6ee4be724b
Update LICENSE name in NuGet packaging pipelines (#17183)
### Description
Updates NuGet packaging pipelines to use the correct license name.

### Motivation and Context
The license name changed. See https://github.com/microsoft/onnxruntime/pull/17170
The QNN_Windows_Nuget and Zip-Nuget-* pipelines will not run without this update.
2023-08-17 22:22:19 -07:00
Dmitri Smirnov
5c54b64a63
Create NodeArgs for all Constant nodes and initializers for functions being inlined (#17089)
### Description
When functions are inlined and constant nodes are being converted to
initializers, we need to create NodeArg for them.
The same applies to the inlined function subgraph, but we choose to give
priority to non-constant nodes and then fill the gaps with constants and
initializers.

### Motivation and Context
This addresses issue
https://github.com/microsoft/onnxruntime/issues/16813 for
`eca_halonext26ts_mod.onnx` model where it fails to remove unused
initializer because `NodeArg` was not created for it.
2023-08-17 14:22:28 -07:00
Changming Sun
0cccbcc47b
Move DML build job's Prefast task to a CPU machine pool (#17192)
### Description
Move DML build job's Prefast task to a CPU machine pool which has larger
memory. The current one runs out of memory in every run.

### Motivation and Context
To fix the broken python packaging pipeline.
2023-08-17 13:16:29 -07:00
Jian Chen
e0022d061f
Set web-ci-pipeline.yml only triggered when related fields are updated (#17148)
That is, trigger only when one of the following paths is updated:
- 'js/web'
- 'js/node'
- 'onnxruntime/core/providers/js'
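The matching logic behind such a path-based trigger can be sketched in Python as below. The real pipeline expresses this as path filters in its YAML definition; `should_trigger` and `WATCHED_PREFIXES` are hypothetical names, and only the three paths come from the commit message.

```python
from typing import Iterable

# Paths listed in the commit message.
WATCHED_PREFIXES = ("js/web", "js/node", "onnxruntime/core/providers/js")


def should_trigger(changed_files: Iterable[str]) -> bool:
    """Return True if any changed file lies under one of the watched paths."""
    for path in changed_files:
        for prefix in WATCHED_PREFIXES:
            # Match the directory itself or anything beneath it, but not
            # sibling paths that merely share the same leading characters.
            if path == prefix or path.startswith(prefix + "/"):
                return True
    return False
```

Appending `/` before the prefix comparison keeps a file such as `js/webgpu.md` from matching the `js/web` filter.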

2023-08-17 12:55:35 -07:00
BoarQing
df124c9313
[VITISAI] 1. Fix reading .dat and .onnx on Linux 2. Fix issue of compiling graph twice (#17108)
### Description
1. Fix reading .dat and .onnx on Linux 2. Fix issue of compiling graph
twice


### Motivation and Context
1. Previously we had not tested large models on Linux. When the model is
separated into .dat and .onnx, it failed to load.
2. Check whether the provider pointer already exists. If it does, do not
create it again.
2023-08-17 12:30:03 -07:00
Chi Lo
2fb148dd88
Temporarily enforce "Debug build" TRT EP with trt oss parser on Windows (#17059)
This PR handles two changes:

1. There is an issue when running "Debug build" TRT EP with "Release
build" TRT builtin parser on Windows. Enforce use of the oss parser for
Debug build.
Note: args.config in build.py is an array, for example ["Debug",
"Release"...]. The code would be much messier if we made the change there.
2. Update to use the latest commit of the oss parser.

Please see https://github.com/microsoft/onnxruntime/issues/16273
2023-08-17 12:17:25 -07:00
Pranav Sharma
59a2801136
Fix NuGet pkging pipeline (#17195)
### Description
Fix NuGet pkging pipeline

### Motivation and Context
Fix NuGet pkging pipeline
2023-08-17 11:23:34 -07:00