Commit graph

9452 commits

Author SHA1 Message Date
Patrice Vignola
bc533a6723
[DML EP] Add dynamic graph compilation (#18199) 2023-11-02 01:29:59 -07:00
Tianlei Wu
c273f7a2d4
Cherry-pick LLaMA/SDXL to rel-1.16.2 (#18202)
Cherry-pick changes related to LLaMA and StableDiffusion XL to 1.16.2 release branch.

### Motivation and Context
---------

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: Patrice Vignola <vignola.patrice@gmail.com>
Co-authored-by: petermcaughan <peter.mcaughan@gmail.com>
Co-authored-by: Peter McAughan <petermca@microsoft.com>
Co-authored-by: Jambay Kinley <jambaykinley@microsoft.com>
Co-authored-by: PeixuanZuo <94887879+PeixuanZuo@users.noreply.github.com>
Co-authored-by: Ye Wang <52801275+wangyems@users.noreply.github.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: aciddelgado <139922440+aciddelgado@users.noreply.github.com>
Co-authored-by: tlwu@microsoft.com <tlwu@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>
Co-authored-by: Yufeng Li <liyufeng1987@gmail.com>
Co-authored-by: JiCheng <wejoncy@163.com>
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2023-11-01 14:39:47 -07:00
Changming Sun
0240274dfd
Add support for GCC 13 (#18178)
It is related to #18155.

The issue has been fixed in the main branch by @jchen351
2023-11-01 10:33:52 -07:00
Patrice Vignola
749bcc7937
[DML EP] Add subgraph fusion support (#18125) 2023-10-31 10:51:22 -07:00
Changming Sun
6ae7c51a2f
Revert "Disable dml stage in windows GPU pipeline temporarily. (#18034)" (#18150) (#18170)
This reverts commit 99b8dcaae2.


### Motivation and Context
Restore the DML stage in the Windows GPU pipeline.
The agent issue is solved by adding Feature.DisableGpuDriver to the pool
properties.

---------

Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
2023-10-30 21:57:00 -07:00
Patrice Vignola
99b0f626ac
[DML EP] Complete python IO binding implementation (#18124)
@fdwr This is part 2 of the pybind work that was started earlier.
This adds the following features to the python IO binding
implementation:

- Use a bucketized allocator in order to reduce the number of resource
allocations
- Implement the following functions: `ortvalue_from_numpy`,
`update_inplace`, `ortvalue_from_shape_and_type` and `numpy`
- Modify the `onnxruntime_test_python_iobinding` tests to also run on
DML
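The bucketized allocation idea in the first bullet can be sketched as follows. This is a minimal illustration in plain Python, not the DML EP's actual allocator: the class name `BucketizedAllocator`, the power-of-two bucket sizes, and the use of `bytearray` as a stand-in for a GPU resource are all assumptions made for the example.

```python
class BucketizedAllocator:
    """Toy allocator (hypothetical): rounds requests up to power-of-two
    buckets and reuses freed buffers from the matching bucket."""

    def __init__(self):
        self._free = {}        # bucket size -> list of free buffers
        self.allocations = 0   # number of fresh allocations actually made

    @staticmethod
    def bucket_size(nbytes):
        # Smallest power of two that is >= nbytes (minimum 1).
        size = 1
        while size < nbytes:
            size *= 2
        return size

    def alloc(self, nbytes):
        size = self.bucket_size(nbytes)
        free_list = self._free.get(size)
        if free_list:
            # Reuse a previously freed buffer from the same bucket.
            return free_list.pop()
        self.allocations += 1
        return bytearray(size)  # stand-in for a real resource allocation

    def free(self, buf):
        # Return the buffer to the bucket that matches its size.
        self._free.setdefault(len(buf), []).append(buf)


allocator = BucketizedAllocator()
a = allocator.alloc(1000)   # lands in the 1024-byte bucket
allocator.free(a)
b = allocator.alloc(900)    # same bucket, so the freed buffer is reused
```

Because both requests round up to the same bucket, the second request is served without a new allocation, which is the kind of saving the bullet describes.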

Co-authored-by: Jeff Bloomfield <jeffbloo@microsoft.com>
2023-10-30 14:34:01 -07:00
Patrice Vignola
53cb9424a4
[DML EP] Enable more MHA masks (#18120)
Those masks are used for MHA in LLaMA.
2023-10-30 14:02:51 -07:00
Changming Sun
c829550180
Increase version number for preparing the 1.16.2 release (#18070)
1. Increase version number for preparing the 1.16.2 release (#18070)
2. cherry-pick 18034
2023-10-26 10:48:54 -07:00
Changming Sun
2a1fd2586f
Upgrade transformers to fix CI (#17830)
### Description

Python package pipeline fails due to "tokenizers" compilation. Since
"tokenizers" is a dep of "transformers", we update the "transformers"
version in the hope that a newer release fixes the problem.

```
error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
--> tokenizers-lib/src/models/bpe/trainer.rs:517:47
```



### Motivation and Context
Cherry-pick from #17823
2023-10-09 13:55:47 -07:00
Yufeng Li
c3fd281620
Fix onnx quantizer activation and weight type attribute
#17651
2023-10-05 09:30:34 -07:00
Yulong Wang
f480a3618a
[hotfix] fix session option access in Node.js binding (#17762)
### Description
fix session option access in Node.js binding


### Motivation and Context
This is a bug that affects transformer.js using the ONNX Runtime Node.js
binding. Issue: #17377

This bug is already fixed in the main branch, but the fix was not picked
into the 1.16 release.
2023-10-03 17:55:01 -07:00
RandySheriffH
6df4211f12
Cancel EP check in python for 1.16.1 (#17768)
Remove the condition to allow an empty provider list.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-10-03 13:02:24 -07:00
Changming Sun
264a7405e5
Cherry-picks for 1.16.1 release (#17741)
Cherry-pick the following PRs to the release branch:

Fix: Fail to skip disabledmodel in winml (#17728) 
Move dotnet build and test into docker in Linux CPU CI (#17417) 
Run Nuget_Test_Linux_GPU in container (#17452) 
Run Final_Jar_Testing_Linux_GPU in docker (#17533) 
TreeEnsemble speed up (#17449) 
Remove onnxruntime extensions from list of gitmodules (#17615) 
Include onnxruntime_float16.h in the package. (#17637) 
Fix static quantization for QDQ and Percentile distribution (#17649) 
[TensorRT EP] Back out the PerThreadContext (#17690) 
Update nodejs to 18.x (#17657) 
Update linux-wasm-ci.yml: remove the ln command (#17735)
2023-10-02 15:04:56 -07:00
Changming Sun
e7a0495a87
Cherry-picks pipeline changes to 1.16.0 release branch (#17577)
### Description
1. Delete Prefast tasks (#17522)
2. Disable yum update (#17551)
3. Avoid calling patchelf (#17365 and #17562) so that we can validate
the above fix

The main problem I'm trying to solve is: our GPU package depends on both
CUDA 11.x and CUDA 12.x. However, it's not easy to see this
because ldd doesn't work with the shared libraries we generate (see issue
#9754). So the patchelf changes are useful for validating that
"Disabling yum update" was successful. As you can see, we call "yum
update" from multiple places. Without some kind of validation it's hard
to say whether I have covered all of them.
The Prefast change is needed because I'm going to update the VM images
in the next few weeks, and we may need to publish a patch release
after that.

### Motivation and Context
Without this fix we will mix CUDA 11.x and CUDA 12.x, and it will
crash every time we use TensorRT.
2023-09-18 15:03:48 -07:00
Vincent Wang
06ea28ba65
[rel-1.16.0] Cherry-pick 16940 and 17523 (#17506) 2023-09-14 10:46:41 -07:00
Chi Lo
0772d54933
[rel-1.16.0] Cherry-pick 17507 (#17520)
Cherry-pick #17507  for rel-1.16.0.

Note: PR 17507 contains part of the engine decryption refactor, which
we don't want to include in the ORT 1.16 release. This cherry-pick PR
excludes that part.
2023-09-12 13:10:29 -07:00
Changming Sun
a9df3aea72
Remove 52 from CMAKE_CUDA_ARCHITECTURES to reduce Nuget package size (#17461)
### Description
Remove 52 from CMAKE_CUDA_ARCHITECTURES to reduce Nuget package size. 

### Motivation and Context
PR #17227 increased binary size by 20%. Right now the package size is about
260MB. However, nuget has a hard limit of 250MB. Without this change we
cannot publish the package.
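The size arithmetic behind this can be worked through as a rough sketch. It uses only the three figures quoted above and assumes, for illustration, that the 20% growth applies to the whole package rather than to individual binaries inside it; the variable names are made up for the example.

```python
NUGET_LIMIT_MB = 250     # hard limit quoted in the commit message
current_size_mb = 260    # approximate package size quoted in the commit message
growth = 0.20            # increase attributed to PR #17227 (assumed to apply to the whole package)

# Estimated size before the 20% increase.
size_before_mb = current_size_mb / (1 + growth)

# How far over the limit the package is, and the minimum relative cut needed.
overshoot_mb = current_size_mb - NUGET_LIMIT_MB
required_reduction = overshoot_mb / current_size_mb

print(f"before: ~{size_before_mb:.1f} MB, over limit by {overshoot_mb} MB, "
      f"need to shrink by at least {required_reduction:.1%}")
```

Under these assumptions the package has to shrink by a little under 4% to fit, which dropping one CUDA architecture from the build is intended to achieve.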
2023-09-08 15:25:07 +08:00
Hector Li
196df08326
[rel-1.16.0] Disable QNN QDQ test for release branch (#17463)
Disable QNN QDQ test for release branch

### Description
Disable QNN QDQ test for release branch to get rid of model test failure
caused by new model update in build image.
2023-09-08 14:43:12 +08:00
Edward Chen
2406e9c567
[rel-1.16.0] Use name of temporary provisioning profile. (#17456)
### Description
Use name of temporary provisioning profile.

### Motivation and Context
The old provisioning profile no longer works. Switched to a temporary
one that we can use before a new one is available. The temporary one has
a different name.

Alternative to #17454.
2023-09-08 12:01:10 +10:00
Vincent Wang
4296043d94
Cherry-pick 2nd Round (#17386)
Cherry-pick 2nd round for 1.16.0 release.
PR List:

#17201
#17270
#17311
#17315
#17320
#17326
#17355
#17227
#17380
#17386
2023-09-07 09:56:22 -07:00
Vincent Wang
198fc901b1
Cherry-pick 1st Round (#17308)
Cherry-pick 1st round for rel-1.16.0 from
https://github.com/microsoft/onnxruntime/issues?q=label%3Arelease%3A1.16+label%3Atriage%3Aapproved+is%3Aclosed
except #17201 because it caused UT failure and is not fixed yet.

PR list:
#16417
#16936
#17000
#17236
#17238
#17240
#17252
#17255
#17258
#17265
#17267
#17277
2023-08-28 12:34:27 -07:00
Sheil Kumar
cbaa008391
Bump DirectML version from 1.12.0 to 1.12.1 (#17225)
Bump DirectML version from 1.12.0 to 1.12.1

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-08-20 09:55:38 -07:00
kunal-vaishnavi
4bea5ec513
Add Whisper export with beam search test cases (#17228)
### Description
This PR adds test cases for the custom export of [Whisper with beam
search](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/python/tools/transformers/models/whisper).



### Motivation and Context
This PR checks that Whisper can be exported and runs with parity.
2023-08-20 00:58:08 -07:00
Chi Lo
9445539e2c
Update dependency for deps.txt (#17220)
https://github.com/microsoft/onnxruntime/pull/17059 updates deps.txt and
we also need to update cgmanifest.json and upload the files to Azure
DevOps


https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=342803&view=results
for testing
2023-08-19 00:43:25 -07:00
Yulong Wang
6fc3fd9ece
[js/webgpu] support Cast operator (#16489)
### Description
support `Cast` operator for webgpu backend.

Cast operator for webgpu backend currently only supports f32, u32, i32
and bool.
2023-08-18 23:51:03 -07:00
Yulong Wang
bf1c62c181
check in build script for webgpu (#17126)
### Description
check in build script for webgpu described in gist
https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce

once this PR gets merged, I can update the gist to use this file
2023-08-18 23:50:29 -07:00
Edward Chen
d6cd41cfc1
[CoreML EP] Add Shape, Gather, and Slice ops (#17153)
Add CoreML EP shape related ops:
- Shape
- Gather
- Slice

Add support for int64/int32 inputs in CoreML EP.
2023-08-18 22:34:34 -07:00
Edward Chen
2b4cc24d5c
[CoreML EP] Limit input shapes to at most rank 5 (#17086)
When considering nodes for the CoreML EP, limit input shapes to at most rank 5.
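The rank check can be sketched like this. It is an illustration in Python rather than the EP's C++ code; the function name `inputs_within_rank_limit` and the choice to reject inputs whose shape is unknown (`None`) are assumptions, not something the commit message states.

```python
MAX_RANK = 5  # limit stated in the commit message


def inputs_within_rank_limit(input_shapes, max_rank=MAX_RANK):
    """Return True if every input shape is known and has rank <= max_rank.

    Each shape is a sequence of dimensions; None stands for an unknown
    shape, which this sketch conservatively rejects (an assumption).
    """
    return all(
        shape is not None and len(shape) <= max_rank
        for shape in input_shapes
    )


# A node with rank-4 and rank-5 inputs would be considered.
ok = inputs_within_rank_limit([(1, 3, 224, 224), (1, 2, 3, 4, 5)])

# A node with a rank-6 input would be left to another execution provider.
too_deep = inputs_within_rank_limit([(1, 2, 3, 4, 5, 6)])
```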
2023-08-18 20:33:40 -07:00
Yulong Wang
3426954525
disable browser stack tests (#17224)
### Description
disable browser stack tests
2023-08-18 17:14:12 -07:00
Changming Sun
3cec88bd12
FIX: memory leak checker is incompatible with std::stacktrace (#17209)
### Description
When I worked on PR #17173, I didn't notice that
onnxruntime\core\platform\windows\debug_alloc.cc also needs to call
dbghelp functions like SymInitialize. If we use the VC runtime's
stacktrace functionality, the VC runtime initializes and uninitializes the
dbghelp library independently, and its stacktrace helper DLLs
get unloaded before our memory leak checker starts to work. When we then
call SymSetOptions, it crashes.

More details:
In VC runtime the C++23 stacktrace functions are implemented on top of
dbgeng.dll. In C:\Program Files\Microsoft Visual
Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\crt\src\stl\stacktrace.cpp,
you can see it has:
```
                dbgeng = LoadLibraryExW(L"dbgeng.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32);
```
The dbgeng.dll is a wrapper around dbghelp.dll. It calls SymInitialize
and SymCleanup. dbgeng.dll gets unloaded before our memory leak check
starts to run. In theory we should be able to call SymInitialize again
if the previous user who called SymInitialize has also called
SymCleanup. However, users can use
SymRegisterCallback/SymRegisterCallback64/SymRegisterCallbackW64 to
register callback functions to dbghelp.dll. These callback functions
need to be alive when SymSetOptions (and some other dbghelp APIs) get
called.

2023-08-18 17:10:33 -07:00
Changming Sun
6db72165eb
Fix python packaging test pipeline (#17204)
### Description
1. Fix python packaging test pipeline. There was an error in
tools/ci_build/github/linux/run_python_tests.sh: it installed a
released version of the onnxruntime python package from pypi.org to run the
tests, when it should have picked the one from the current build.
2. Refactor the pipeline to allow choosing the cmake build type from the web
UI when manually triggering a build. For now this feature is Linux only,
because I don't want to change too much when we are about to cut a
release branch. After that I will expand it to all platforms. This
feature is useful for debugging pipeline issues. We may also consider
having a nightly pipeline that runs all tests in Debug mode, which may catch
extra bugs because in debug mode we can enforce range checks.

Test run:
https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=342674&view=results

### Motivation and Context
Currently the pipeline has a crash error. 

AB#18580
2023-08-18 14:51:26 -07:00
xhcao
dd3b2cefd6
[js/webgpu] Support int32 type for binary (#16901)
### Description
Enable typed binary and support int32 type for binary.


Co-authored-by: Xing Xu <xing.xu@intel.com>
2023-08-18 12:19:01 -07:00
Adam Louly
c0b6c6c94b
Add SGDOptimizer in the on-device training offline tooling (onnxblock) (#17085)
### Description
Adding SGDOptimizer to on device training onnxblock
2023-08-18 10:50:39 -07:00
Changming Sun
ee09a5ff35
Add DISABLE_CUSPARSE_DEPRECATED flag to CUDA build (#17207)
This is to suppress a warning and make Windows CUDA 12.2 build work.
2023-08-18 10:25:49 -07:00
Hariharan Seshadri
a476dbf430
[JS/WebGPU] Support Tile operator (#17123)
### Description
As title

### Motivation and Context
Improve WebGPU op coverage
2023-08-18 10:07:21 -07:00
satyajandhyala
7d1a5635a0
[JS/Web] Added SkipLayerNormalization operator. (#17102)
### Description
Add SkipLayerNormalization operator to JSEP.



2023-08-18 09:59:03 -07:00
RandySheriffH
9266cf1772
Skip setting the name when AzureEP enabled. (#17208)
Skip setting the name when AzureEP enabled.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-08-18 09:53:36 -07:00
Ashwini Khade
68a670c7f8
Move some tests from CUDA only to CPU (#17189)
### Description
Minor PR to move some CUDA only on-device training tests to CPU as well.
This is to make sure we have good coverage for CPU too.



2023-08-18 09:44:57 -07:00
Tianlei Wu
d65aa5400c
clean up transformers scripts (#17179)
(1) Remove class BertOptimizationOptions, which was deprecated a long
time ago
(2) Move sys path settings to `__init__.py`, and update imports
(3) Fix bert_perf_test to run properly.
(4) Fix an onnx path in a whisper test case
(5) Fix a few typos
(6) Update comments in bert_perf_test regarding graph inputs
2023-08-17 23:14:49 -07:00
Jack
78b35652a3
fix issue with obtaining the decoder layer number when converting the T5 model. (#17185)
### Description
fix issue with obtaining the decoder layer number when converting the T5
model.

### Motivation and Context
fix issue: https://github.com/microsoft/onnxruntime/issues/17072

Test with
[byt5-small](https://huggingface.co/google/byt5-small/tree/main) model,
which has 12 encoder layers and 4 decoder layers.
Here is the log.

![image](https://github.com/microsoft/onnxruntime/assets/3481539/ff1b69c5-f485-4301-a333-9ee2a984df07)
2023-08-17 23:14:22 -07:00
Adrian Lizarraga
6ee4be724b
Update LICENSE name in NuGet packaging pipelines (#17183)
### Description
Updates NuGet packaging pipelines to use the correct license name.

### Motivation and Context
The license name changed. See https://github.com/microsoft/onnxruntime/pull/17170
The QNN_Windows_Nuget and Zip-Nuget-* pipelines will not run without this update.
2023-08-17 22:22:19 -07:00
Dmitri Smirnov
5c54b64a63
Create NodeArgs for all Constant nodes and initializers for functions being inlined (#17089)
### Description
When functions are inlined and constant nodes are converted to
initializers, we need to create a NodeArg for each of them.
The same applies to inlined function subgraphs, but there we give priority to
non-constant nodes and then fill the gaps with constants and
initializers.

### Motivation and Context
This addresses issue
https://github.com/microsoft/onnxruntime/issues/16813 for the
`eca_halonext26ts_mod.onnx` model, where it fails to remove an unused
initializer because a `NodeArg` was not created for it.
2023-08-17 14:22:28 -07:00
Changming Sun
0cccbcc47b
Move DML build job's Prefast task to a CPU machine pool (#17192)
### Description
Move DML build job's Prefast task to a CPU machine pool with more
memory. The current one runs out of memory in every run.

### Motivation and Context
To fix the broken python packaging pipeline.
2023-08-17 13:16:29 -07:00
Jian Chen
e0022d061f
Set web-ci-pipeline.yml only triggered when related fields are updated (#17148)
The pipeline is triggered only when one of these paths is updated:
- 'js/web'
- 'js/node'
- 'onnxruntime/core/providers/js'
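The effect of such a path-based trigger can be sketched as follows. This is plain Python for illustration, not the pipeline's YAML, and the matching rule (a changed file triggers the pipeline if it equals a listed path or lives underneath it) is an assumption; the real pipeline's path-filter semantics may differ.

```python
# Paths listed in the commit message.
TRIGGER_PATHS = ("js/web", "js/node", "onnxruntime/core/providers/js")


def should_trigger(changed_files, trigger_paths=TRIGGER_PATHS):
    """Return True if any changed file is one of, or lies under, a trigger path."""
    return any(
        changed == path or changed.startswith(path + "/")
        for changed in changed_files
        for path in trigger_paths
    )


# A change under js/web triggers the pipeline; a docs-only change does not.
web_change = should_trigger(["js/web/lib/index.ts", "docs/README.md"])
docs_only = should_trigger(["docs/README.md"])
```

Appending "/" before the prefix test keeps a sibling directory such as a hypothetical `js/webgpu` from matching `js/web` by accident.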

2023-08-17 12:55:35 -07:00
BoarQing
df124c9313
[VITISAI] 1. Fix reading .dat and .onnx on Linux 2. Fix issue of compiling graph twice (#17108)
### Description
1. Fix reading .dat and .onnx on Linux 2. Fix issue of compiling graph
twice


### Motivation and Context
1. Previously we had not tested large models on Linux. When the model is
separated into .dat and .onnx files, it failed to load.
2. Check whether the provider pointer already exists. If it does, do not
create it again.
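The "create only if it does not already exist" guard in item 2 can be sketched as below. This is a generic Python illustration, not the VitisAI EP code; `ProviderHolder`, its `factory` argument, and the `create_calls` counter are hypothetical names introduced for the example.

```python
class ProviderHolder:
    """Creates the provider at most once and returns the same object afterwards."""

    def __init__(self, factory):
        self._factory = factory   # callable that builds the provider (hypothetical)
        self._provider = None
        self.create_calls = 0     # counts how often the factory actually ran

    def get(self):
        # Only create the provider if it does not exist yet.
        if self._provider is None:
            self.create_calls += 1
            self._provider = self._factory()
        return self._provider


holder = ProviderHolder(factory=object)
first = holder.get()
second = holder.get()   # returns the existing provider, no second creation
```

With a guard like this, a second request finds the existing object, so any work tied to creation (here, compiling the graph) happens once rather than twice.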
2023-08-17 12:30:03 -07:00
Chi Lo
2fb148dd88
Temporarily enforce "Debug build" TRT EP with trt oss parser on Windows (#17059)
This PR handles two changes:

1. There is an issue when running a "Debug build" TRT EP with the "Release
build" TRT builtin parser on Windows. Enforce use of the oss parser for Debug
builds.
Note: args.config in build.py is an array, for example ["Debug",
"Release"...]. The code would be much messier if we made the change there.
2. Update to use the latest commit of the oss parser.
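To illustrate the note about `args.config` being a list: when several configurations are built in one invocation, the parser choice differs per configuration, so a single flag set once in the build script cannot express it. The sketch below is hypothetical Python, not code from build.py; the function name and the "oss"/"builtin" labels are made up, and using the builtin parser for non-Debug configurations is an assumption.

```python
def parser_per_config(configs, is_windows=True):
    """Map each build configuration to a parser choice (illustrative labels only)."""
    return {
        # Debug builds on Windows are forced onto the oss parser;
        # everything else keeps the builtin parser (assumed default).
        config: "oss" if (is_windows and config == "Debug") else "builtin"
        for config in configs
    }


choices = parser_per_config(["Debug", "Release"])
```

For `["Debug", "Release"]` the two configurations get different answers, which is why the decision fits more naturally where the configuration is known per build.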

Please see https://github.com/microsoft/onnxruntime/issues/16273
2023-08-17 12:17:25 -07:00
Pranav Sharma
59a2801136
Fix NuGet pkging pipeline (#17195)
### Description
Fix NuGet pkging pipeline

### Motivation and Context
Fix NuGet pkging pipeline
2023-08-17 11:23:34 -07:00
cloudhan
049adb9f31
[ROCm] Remove redundant ep field in softmax (#17048) 2023-08-17 11:53:30 +08:00
Changming Sun
5249b7ab7c
Re-implement stacktrace (#17173)
### Description
Re-implement stacktrace. The new implementation doesn't directly use
the Windows API, and hence avoids problems with
initializing/uninitializing the dbghelp library.

2023-08-16 16:07:49 -07:00
Dmitri Smirnov
f45eef399e
Fix visualization issues with Attribute/Tensor protos (#17188)
### Description
Protobuf Natvis
2023-08-16 13:56:51 -07:00