Commit graph

396 commits

Author SHA1 Message Date
cloudhan
4e6cec4d09
Update ck and enable test (#16383)
Apply the fix in https://github.com/ROCmSoftwarePlatform/composable_kernel/issues/728
Introduce more kernel instances and allow the introduction of streamk and splitk.
2023-08-22 11:08:55 +08:00
Sheil Kumar
cbaa008391
Bump DirectML version from 1.12.0 to 1.12.1 (#17225)
Bump DirectML version from 1.12.0 to 1.12.1

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-08-20 09:55:38 -07:00
Dmitri Smirnov
f45eef399e
Fix visualization issues with Attribute/Tensor protos (#17188)
### Description
Protobuf Natvis
2023-08-16 13:56:51 -07:00
Matthieu Darbois
5e971bc51a
Rework WIL dependency retrieval/usage (#17130)
### Description
1. `onnxruntime_fetchcontent_makeavailable` works around unconditional
install commands so that can be used instead of `FetchContent_Populate`
2. This dependency is Windows specific, mark it as such.

### Motivation and Context
1. This simplifies `cmake/external/wil.cmake` not to do anything
specific wether WIL was fetched or found
2. Given it's specific to Windows, it might not be available on other OS
in specific air-gapped environment such as
[conan-center-index](https://github.com/conan-io/conan-center-index).
This allows downstream builds not to require specific patches for
something not required by the build in the first place.
2023-08-15 09:11:46 -07:00
Wenbing Li
d052c8a45c
Remove the extensions submodule (#17097)
### Description
Remove the onnxruntime-extensions submodule since it now was used via
cmake FetchContent


### Motivation and Context
The submodule relies on an outdated version of the extensions, and the
build instructions should be updated to eliminate any confusion.
2023-08-14 10:16:33 -07:00
Yulong Wang
9cd4e5af68
[wasm] upgrade emsdk to 3.1.44 (#17069)
### Description
This change upgrade emsdk to 3.1.44.

Because backend is upgraded to LLVM 16, so need to fix a lot of build
failures caused by "-Wshorten-64-to-32".

most of the build failures comes from generated `onnx.pb.h`, and this
can be fixed by including "core/graph/onnx_protobuf.h", which detects
and ignore shorten-64-to-32 warnings.
2023-08-10 16:08:36 -07:00
Bowen Bao
6986981482
Bump ONNX version (#16325)
### Description
Bump ONNX version to https://github.com/onnx/onnx/tree/rel-1.14.1 to
include a fix for segfault when shape inferencing nested onnx functions.



### Motivation and Context
Resolves #16170
2023-08-10 11:27:28 -07:00
Dmitri Smirnov
07dfe34714
Fix FunctionProto visualization (#17063)
### Description
Title

### Motivation and Context
Need to debug function protos
2023-08-09 11:05:52 -07:00
Dmitri Smirnov
d5e4bdbe7d
Fix protobuf TaggedStringPtr display (#17008)
### Description
<!-- Describe your changes. -->
Adjust nativs to display tagged strings.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Hard to debug without seeing names.
2023-08-04 17:51:01 -07:00
Changming Sun
73ddba964f
Update the MacOS/Linux build scripts that build/install protobuf from source (#16906)
### Description
1. As a follow-up of #16761, this PR allows build ORT on iOS/Android
without the need to explicitly specify a protoc path. #16761 is for
WASM. This one is for iOS/Android
2. Update the MacOS/Linux build scripts that build/install protobuf from
source. Make them be more flexible. Add the support for
RedHatEnterprise(ubi), which will needed for upgrading the base image
from centos:7 to ubi:8.
3. Update tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile :
the docker file's base image has preinstalled protobuf in /usr/local, we
should uninstall them to avoid conflicts.
2023-07-31 10:51:48 -07:00
Dmitri Smirnov
50764362ac
Update protobuf Natvis visualization (#16911)
### Description
Protobuf library update broke debug visualization.

### Motivation and Context
Hard to debug
2023-07-31 09:35:21 -07:00
Arthur Islamov
210d29b40e
Allow --build_wasm on a mac system (#16761)
### Description
Changes allow downloading prebuilt protoc compiler when building
WebAssebly version on mac systems.
Otherwise it tries to build a js/wasm version of protoc and throws an
error while executing it: "protoc.js permission denied"


### Motivation and Context
I need to switch between my main working computer and a PC to make
changes to WebAssebly build. Would like not to do that anymore.
2023-07-21 14:21:37 -07:00
Jeff Daily
bb136f86c8
[ROCm][MIGraphX] for googletest dep, set OVERRIDE_FIND_PACKAGE (#16715)
Otherwise, an unsupported version of gtest/gmock will be found at
/opt/conda/include for ROCm builds. Though this issue was initially
found for ROCm builds, the issue is generic. onnxruntime requires a
specific version of googletest and should not rely on locating
googletest using find_package.

The ROCm error was:

```
In file included from /opt/conda/include/gmock/gmock-spec-builders.h:75,
                 from /opt/conda/include/gmock/gmock-generated-function-mockers.h:47,
                 from /opt/conda/include/gmock/gmock-function-mocker.h:39,
                 from /opt/conda/include/gmock/gmock.h:61,
                 from /stage/onnxruntime/onnxruntime/test/util/test_utils.cc:17:
/opt/conda/include/gmock/gmock-matchers.h: In instantiation of ‘bool testing::internal::PointwiseMatcher<TupleMatcher, RhsContainer>::Impl<LhsContainer>::
MatchAndExplain(LhsContainer, testing::MatchResultListener*) const [with LhsContainer = const gsl::span<const float>&; TupleMatcher = testing::internal::
FloatingEq2Matcher<float>; RhsContainer = gsl::span<const float>]’:
/opt/conda/include/gmock/gmock-matchers.h:2303:10:   required from here
/opt/conda/include/gmock/gmock-matchers.h:2312:48: error: no type named ‘const_iterator’ in ‘testing::internal::PointwiseMatcher<testing::internal::
FloatingEq2Matcher<float>, gsl::span<const float> >::Impl<const gsl::span<const float>&>::LhsStlContainer’ {aka ‘class gsl::span<const float>’}
```
2023-07-21 00:57:38 +08:00
RandySheriffH
6e29e185f3
Clean AzureEP logics (#16367)
Moving out AzureEP invokers out of core runtime.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-06-21 09:38:52 -07:00
Edward Chen
1261d0b8ba
Fix some build issues on MacOS with Xcode 14.3. (#15878)
- Fix flatbuffers flatc warning, unused-but-set-variable.
- Address `-Wshorten-64-to-32` warnings (fix in our code, allow in dependencies' code).
- Update CI builds to use Xcode 14.3.
- Update minimum iOS version to 12.0.
- Update Mac hosted agents to MacOS 13 where possible.
2023-06-07 12:07:11 -07:00
神楽坂帕琪
abd94b65b7
eigen.cmake use url info from deps.txt (#16129)
### Description

`eigen.cmake` use url info provided by deps.txt instead of using raw
url.
2023-05-30 11:07:20 -07:00
Sumit Agarwal
70d2dc8209
[DML EP] Fix issue with --dml_path build option (#15972)
### Description
DML_PACKAGE_DIR cmake variable is not getting set properly when dml_path
build options is used.


### Motivation and Context
- Why is this change required? What problem does it solve?
It is required for DML Perf dashboard.
<!--- If it fixes an open issue, please link to the issue here. -->
2023-05-24 19:20:40 -05:00
RandySheriffH
d35361bf9d
Fix python pipeline for AzureEP without using root (#16023)
Fix python pipeline for AzureEP without using root, this is for 1.15.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-22 16:38:47 -07:00
Changming Sun
0204594f90
Cleanup WASM cmake code (#15996)
### Description
Remove the "onnxruntime_BUILD_WEBASSEMBLY" cmake option. Use `if
(CMAKE_SYSTEM_NAME STREQUAL "Emscripten")` instead. It makes some code
look more nature.
For example,

```cmake
if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR onnxruntime_BUILD_WEBASSEMBLY)
```
becomes
```cmake
if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR CMAKE_SYSTEM_NAME STREQUAL "Emscripten")
```
2023-05-20 18:07:39 -07:00
Patrice Vignola
310b22aa0c
[DML EP] Update DirectML version to 1.12.0 (#16011) 2023-05-18 19:37:12 -07:00
Changming Sun
842b1a3472
Revert a change in #15797: restore the correct version of emsdk (#15995)
### Description
Revert a change in #15797: restore the correct version of emsdk


### Motivation and Context
Without change, when you build it on Windows you will see:
```
2023-05-17 19:41:30,093 build [INFO] - Activating emsdk...
2023-05-17 19:41:30,093 util.run [INFO] - Running subprocess in 'C:\src\onnxruntime2\cmake\external\emsdk'
  'C:\src\onnxruntime2\cmake\external\emsdk\emsdk.bat' activate 3.1.37
error: tool or SDK not found: '3.1.37'
```
2023-05-18 07:41:38 -07:00
RandySheriffH
7c4e8267e7
Implement openAI endpoint invoker for nuget (#15797)
Implement openAI audio endpoint, and enable nuget packaging.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-11 22:04:02 -07:00
Jian Chen
1a73d61829
Update eigen to 3.4 and remove the eigen from git submodule (#15875)
### Description
Update eigen to 3.4 and remove the eigen from git submodule

### Motivation and Context
We need to have eigen 3.4 for c++20
2023-05-11 11:56:59 -07:00
sdegrande
cf062dbdb1
FlatBuffers fails to compile with gcc13. (#15787)
When building the FlatBuffers dependencies, gcc13 emits a
stringop-overflow warning. All warnings being turned into errors, that
fails the compilation of FlatBuffers, and as a consequence also fails
the build of onnxruntime.

This commit adds the application of a patch to FlatBuffers's
CMakeList.txt, to add -Wno-error=stringop-overflow to the
CMAKE_CXX_FLAGS.
2023-05-11 11:20:19 -07:00
liqun Fu
ac9ae9f7c5
update onnx release 1.14 for docker files (#15680)
### Description
this is for ort 1.15 release to work with onnx 1.14
It shall be merged after onnx 1.14 release and before ort 1.15 release.


### Motivation and Context

---------

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
2023-05-10 13:15:56 -07:00
Sumit Agarwal
b473e3f3c6
[DML EP] Update DirectML version to 1.11.0 (#15858)
### Description
- Update DML version to 1.11.0
- Disable Gemm+Softmax fusion



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-05-09 12:48:15 -07:00
Yulong Wang
0457fd0b40
upgrade emsdk to 3.1.37 (#15817)
### Description
upgrade emsdk to 3.1.37

WIP branch to debug the mystery memory issue in web assembly
multi-thread build.
2023-05-08 16:49:47 -07:00
Changming Sun
328cabb194
Download protoc from Github Release instead of Nuget (#15731)
### Description
Download protoc from Github Release instead of Nuget to avoid having
dependency on nuget.exe on Linux

### Motivation and Context
To avoid having dependency on nuget.exe on Linux. Many users' build
environment do not have nuget or dotnet.
2023-05-02 12:18:59 -07:00
Sumit Agarwal
4c4f688a93
[DML EP] Fix dml_external_project (#15656)
### Description
While building ORT for DML EP with `dml_EXTERNAL_PROJECT` flag, 2
variables (`DML_SHARED_LIB`, `DML_PACKAGE_DIR`) value is not set
properly.



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-05-01 12:02:56 -07:00
PeixuanZuo
59ea35d592
[ROCm] add CK GroupNorm to GroupNormTunable (#15510)
- Add CK GroupNorm to GroupNormTunable.
- Reduce configuration of GroupNormNHWCOp because CK implementation is
better.

The performance gain on stable diffusion v1.5.
Before:
```
'height': 512
'width': 512
'steps': 50
'batch_size': 1
'batch_count': 5
'num_prompts': 1
'average_latency': 2.4782688856124877
'median_latency': 2.4783748388290405
'provider': 'ROCMExecutionProvider'
'disable_safety_checker': True 
```

After:
```
'height': 512, 
'width': 512, 
'steps': 50, 
'batch_size': 1,
'batch_count': 5,
'num_prompts': 1, 
'average_latency': 2.107170510292053,
 'median_latency': 2.1067750453948975,
 'first_run_memory_MB': -1, 
'second_run_memory_MB': -1,
'provider': 'ROCMExecutionProvider', 
'disable_safety_checker': True
```
2023-04-19 13:54:59 +08:00
Yi Zhang
698e9f71cd
Improve cache hit rate in windows build (#15538)
### Description
1.  Update /Zi to /Z7 in abseil project while using cache
2.  Skip target_precompile_headers while using cache


### Motivation and Context
There're about 1/4 uncacheable calls in Windows GPU compilation with
cache.
```
Uncacheable calls:                   441 / 1641 (26.87%)
  Could not use precompiled header:  361 /  441 (81.86%)
  Preprocessing failed:                1 /  441 ( 0.23%)
  Unsupported compiler option:        79 /  441 (17.91%)
```

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=961916&view=logs&j=5076e696-f193-5f12-2d8a-703dda41a79b&t=9b927034-e3ef-5e25-c6df-387bc37acd63&l=21

The root cause of `Unsupported compiler option` is that /Zi in Abseil
isn't updated to /Z7.
The root cause of `Could not use precompiled header` is the
`target_precompile_headers` creates cmake_pch.pch every time and it's
hash value is changed too.

### Result
It could reduce compilation time by another 20%. 
For example:
It took 16m43 in CUDA training compilation on Windows.
It takes 13m32 after the change.

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=964002&view=logs&s=959c6b43-5937-53e5-5f36-e53cb0249117


### N.B.
In winml project, it's using own target_precompile**d**_header
https://github.com/microsoft/onnxruntime/blob/main/cmake/precompiled_header.cmake.
Just let it be.
2023-04-18 09:31:35 -07:00
Edward Chen
9f5aa8e021
Add clog back to onnxruntime_EXTERNAL_LIBRARIES. (#15363)
### Description
<!-- Describe your changes. -->

Add clog back to onnxruntime_EXTERNAL_LIBRARIES.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Fix iOS packaging pipeline build failure.
2023-04-05 09:11:19 -07:00
Matthieu Darbois
85bb13345d
Rework some external targets to ease building with -DFETCHCONTENT_FULLY_DISCONNECTED=ON (#15323)
### Description
Rework some external targets to ease building with
`-DFETCHCONTENT_FULLY_DISCONNECTED=ON`
This will allow package managers to more easily provide an onnxruntime
package by reducing the amount of patching needed downstream at each
version.

### Motivation and Context
Availability of onnxruntime in some C++ package managers
https://github.com/microsoft/onnxruntime/issues/7150
https://github.com/conan-io/conan-center-index/issues/16699
https://github.com/microsoft/vcpkg/issues/20548

My initial intent is to get this in conan but the PR would most likely
be useful (though not tested) to vcpkg as well (and maybe others).
I tried to get only a first batch of not too specific patches (i.e. not
specific to conan).

The first commit reworks `flatbuffers` and just extends what @snnn did
in https://github.com/microsoft/onnxruntime/pull/13991
The second commit reworks `pytorch_cpuinfo`
The third commit reworks `google_nsync`
2023-04-03 17:45:12 -07:00
Changming Sun
15f7dca9fb
Update protobuf to 3.21.x (#15245)
### Description

Fixed
[AB#10092](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/10092),
[AB#11753](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/11753),
[AB#11759](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/11759)

### Motivation and Context
The one we use has a security issue in Java, though we don't use that
version's protobuf java package.
2023-03-29 14:08:18 -07:00
Changming Sun
4a0b86eba6
Update the post-merge pipeline (#14965)
### Description
1.  Remove Linux jobs for ORT-Extension combined build
2.  Add a macOS build job for ORT-Extension combined build
3. Adjust the yaml file so that it can support two different ADO
instances.


### Motivation and Context
To test our code better. And it will enable us to run such tests for
every commit in the main branch. It would be easier for us to figure out
which change caused a build break.

See
[AB#13435](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/13435)
2023-03-29 13:12:07 -07:00
Changming Sun
ffcfb1ec98
Remove protobuf submodule (#15190)
### Description
Remove protobuf submodule as a follow-up of #13523

"Android CI Pipeline" and "Zip-Nuget-Java-Nodejs Packaging Pipeline"
need to be tested.


### Motivation and Context
It is related to
[AB#11753](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/11753)

Fixed
[AB#14027](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/14027)
2023-03-27 10:35:49 -07:00
Ye Wang
2ee822d483
Extend memory efficient attention coverage in Attention/MHA cuda op (#15064)
### Description
<!-- Describe your changes. -->

1. upgrade cutlass to 3.0 that containing attn_bias support.
2. extend Attention/MHA to use memory efficient attention when
rel_pos_bias with [1, num_head, s, s*] and 1d mask with [2 * batch_size
+ 1] are present.

new mask format introduction:
MASK_1D_KEY_SEQ_LEN_START,  
[3 * batch_size + 2] with [key_len[0], ..., key_len[batch_size - 1],
query_start[0], ..., query_start[batch_size - 1], query_end[batch_size -
1], key_start[0], ..., key_start[batch_size - 1], key_end[batch_size -
1]]

e.g
2D mask with [[1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]] converts to this
1D mask is [3, 5, 0, 6, 12, 0, 6, 12]


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

It potentially benefits tnlrv6 and t5(encoder)

---------

Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-03-23 11:05:17 -07:00
cloudhan
a5ab88247b
ROCm Flash Attention (#14838)
Adds flash attention via composable kernel for ROCm EP
2023-03-16 10:39:58 +08:00
Maximilian Müller
ad4db12699
TensorRT EP - timing cache (#14767)
### Description

This will enable a user to use a TensorRT timing cache based on #10297
to accelerate build times on a device with the same compute capability.
This will work across models as it simply store kernel runtimes for
specific configurations. Those files are usually very small (only a few
MB) which makes them very easy to ship with an application to accelerate
the build time on the user end.

### Motivation and Context
Especially for workstation use cases TRT build times can be a roadblock.
With a few model from ONNX model zoo i evaluated speedups when a timing
cache is present.
`./build/onnxruntime_perf_test -e tensorrt -I -t 5 -i
"trt_timing_cache_enable|true" <onnx_path>`

|Model | no Cache | with Cache|
| ------------- | ------------- | ------------- |
|efficientnet-lite4-11 | 34.6 s | 7.7 s|
|yolov4 | 108.62 s | 9.4 s|

To capture this is had to modify the onnxruntime_perf_test. The time is
sometimes not captured within "Session creation time cost:" which is why
i introduced "First inference time cost:".

---------

Co-authored-by: Chi Lo <Chi.Lo@microsoft.com>
2023-03-10 09:02:27 -08:00
Yulong Wang
69c5edb11b
[wasm] upgrade emsdk from 3.1.19 to 3.1.32 (#14818)
### Description
upgrade emsdk from 3.1.19 to 3.1.32

also add explicit config for stack size (1MB).
2023-02-28 11:06:09 -08:00
Jian Chen
62ee0c8110
Migrating ORT Extensions from Git submodule to cmake FetchContent (#14298)
### Description
<!-- Describe your changes. -->

Merging extensions from Git submodule to cmake FetchContent


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: Jian Chen <jchen351@MacBook-Pro.local>
2023-02-22 19:42:36 -08:00
Erick Muñoz
8372c86e7f
[oneDNN] Update to oneDNN v3.0 (#14267)
### Description
Update oneDNN version from 2.7 to 3.0



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-02-17 09:56:29 -08:00
Valery Chernov
ba8a00f62f
[TVM EP] Support zero copying TVM EP output tensor to ONNX Runtime output tensor (#12593)
**Description**:
Support new feature of TVM Virtual Machine (method `set_outputs`) on TVM
Execution Provider side. It allows to avoid excess copying from TVM EP
output tensor to ONNX Runtime one

**Motivation and Context**
Tests with multiple output topologies and big output tensors shows that
there is overheads spent on copying from TVM EP to ONNX Runtime.
Returning output(s) on preallocated memory for VirtualMachine was
implemented on TVM side.

**Details**
`set_output_zero_copy` provider option for TVM EP switches on/off this
feature. It is true by default.
The feature works for both GraphExecutor and VirtualMachine from TVM.

---------

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2023-02-08 10:02:20 -08:00
Hector Li
cd7098fdf4
fix snpe build (#14616)
### Description
Fix SNPE build issue caused by cmake dependency refactor

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
fix issue: https://github.com/microsoft/onnxruntime/pull/14547
2023-02-07 15:33:05 -08:00
Yi Zhang
80f807c03d
upgrade protobuf to 3.20.2 and onnx to 1.13 (#14279)
### Description
upgrade protobuf to 3.20.2, same as onnx 1.13.0

### Motivation and Context
Per component governance requirement and Fixes #14060

unused-parameter error occurs in 2 conditions.
1. compile protolbuf

`onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66:
error: unused parameter ‘prototype’ [-Werror=unused-parameter]`
2. include onnx_pb.h
```
2023-01-28T10:20:15.0410853Z FAILED: CMakeFiles/onnxruntime_pybind11_state.dir/onnxruntime_src/onnxruntime/python/onnxruntime_pybind_iobinding.cc.o 
......
2023-01-28T10:20:15.0466024Z                  from /build/Debug/_deps/onnx-src/onnx/onnx_pb.h:51,
2023-01-28T10:20:15.0466958Z                  from /onnxruntime_src/include/onnxruntime/core/framework/to_tensor_proto_element_type.h:10,
....
2023-01-28T10:20:15.0609678Z /build/Debug/_deps/onnx-build/onnx/onnx-operators-ml.pb.h:1178:25:   required from here
2023-01-28T10:20:15.0610895Z /onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66: error: unused parameter ‘prototype’ [-Werror=unused-parameter]
2023-01-28T10:20:15.0611707Z cc1plus: all warnings being treated as errors

```

https://dev.azure.com/onnxruntime/2a773b67-e88b-4c7f-9fc0-87d31fea8ef2/_apis/build/builds/874605/logs/22
2023-01-31 12:55:09 -08:00
Sumit Agarwal
edb377f2cb
[DML EP] Upgrade DML to 1.10.1 (#14433)
### Description
Updated DirectML version to 1.10.1
(https://www.nuget.org/packages/Microsoft.AI.DirectML/1.10.1)



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-01-25 21:07:10 -08:00
Tianlei Wu
94b1791974
Upgrade CUTLASS to v2.11 and add sequence length threshold for cutlass FMHA (#14401)
### Description
Add sequence length threshold for triggering cutlass FMHA in FP32. See
performance test results in
https://github.com/microsoft/onnxruntime/pull/14343 to see how this
threshold is selected.

Upgrade cutlass to v2.11 and update deps.txt and cgmanifest for nuget
pipeline build (test build:
https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=268574&view=results)
2023-01-25 09:43:48 -08:00
Tianlei Wu
414b012f42
Add memory efficient attention from CUTLASS (#14343)
### Description
Add memory efficient attention from CUTLASS.

TODO (in next pull request): 
(1) Need performance tests on different GPUs, then add a sequence length
threshold (only activate it for long sequence length).
(2) Merge changes from https://github.com/NVIDIA/cutlass/pull/773 when
it is in cutlass master.
2023-01-20 12:33:01 -08:00
Guenther Schmuelling
60290393f3
enable ort-extensions in wasm release builds (#14239)
enable ort-extensions in wasm release builds. sentence piece, gpt2, bert
and word piece tokenizers for now.

wasm size will grow from 8.4MB to 8.9MB.
2023-01-17 12:39:13 -08:00
RandySheriffH
83ad562826
Rename CloudEP to AzureEP (#14175)
Rename CloudEP to AzureEP.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-01-11 12:25:04 -08:00