Commit graph

348 commits

Author SHA1 Message Date
Guenther Schmuelling
60290393f3
enable ort-extensions in wasm release builds (#14239)
enable ort-extensions in wasm release builds. sentence piece, gpt2, bert
and word piece tokenizers for now.

wasm size will grow from 8.4MB to 8.9MB.
2023-01-17 12:39:13 -08:00
RandySheriffH
83ad562826
Rename CloudEP to AzureEP (#14175)
Rename CloudEP to AzureEP.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-01-11 12:25:04 -08:00
Ashwini Khade
d92c663f28
Create dedicated build for training api (#14136)
### Description
Enable creating dedicated build for on device training. With this PR we
can build a lean binary for on device training using flag
--enable_training_apis. This binary includes only the essentials like
training ops, optimizers etc and NOT features like Aten fallback,
strided tensors, gradient builders etc . This binary also removes all
the deprecated components like training::TrainingSession and OrtTrainer
etc

### Motivation and Context
This enables our partners to create a lean binary for on device
training.
2023-01-10 20:58:04 -08:00
Guenther Schmuelling
6b8c72cfa6
pin ort-ext to 81e7799c69044c745239202085eb0a98f102937b (#14044)
pin onnxruntime-extension to 81e7799c69044c745239202085eb0a98f102937b in
preparation to in enable extension in wasm build.
2023-01-10 10:10:17 -08:00
liqun Fu
1be36913cc
to work with onnx 1.13 rc, implement ver 18 reduce and optioanl ops, … (#13765) 2023-01-09 10:26:16 -08:00
Yi Zhang
2ce7b1c1dc
Enable cache for msbuild (#14085)
### Description
Enable ccache in windows CPU compilation.
The windows compilation in CI could be reduced to 1 more minute at most.

![image](https://user-images.githubusercontent.com/16190118/210294061-86742cf4-65c7-4cc2-9725-e102c3c64abd.png)
2023-01-06 11:19:57 +08:00
PeixuanZuo
4eac0db3af
[ROCm] Add GemmFastGelu CK implementation (#13759)
### Description
<!-- Describe your changes. -->

Add GemmFastGelu CK implementation.

TODO 
1. The performance of CK GemmFastGelu in ORT is not good as using CK
directly, still need to investigate the reason and improve the CK in
ORT.
`GemmFastGeluUnfused float16 NN m=49152 n=3072 k=768 2298.8064 us 100.89
tflops`
`withbias DeviceGemmMultipleD_Xdl_CShuffle<256, 256, 128, 32, 8, 8,
Default> LoopScheduler: Default, PipelineVersion: v1 float16 NN m=49152
n=3072 k=768 2401.9799 us 96.56 tflops`

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2023-01-05 17:53:30 +08:00
RandySheriffH
587e891cae
CloudEP (#13855)
Implement CloudEP for hybrid inferencing.
The PR introduces zero new API, customers could configure session and
run options to do inferencing with Azure [triton
endpoint.](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-with-triton?tabs=azure-cli%2Cendpoint)
Sample configuration in python be like:

```
sess_opt.add_session_config_entry('cloud.endpoint_type', 'triton');
sess_opt.add_session_config_entry('cloud.uri', 'https://cloud.com');
sess_opt.add_session_config_entry('cloud.model_name', 'detection2');
sess_opt.add_session_config_entry('cloud.model_version', '7'); // optional, default 1
sess_opt.add_session_config_entry('cloud.verbose', '1'); // optional, default '0', meaning no verbose
...
run_opt.add_run_config_entry('use_cloud', '1') # 0 for local inferencing, 1 for cloud endpoint.
run_opt.add_run_config_entry('cloud.auth_key', '...')
...
sess.run(None, {'input':input_}, run_opt)
```

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-01-03 10:03:15 -08:00
Dmitri Smirnov
d762aa2a4c
Let Cmake decide where to place abseil (#14057)
### Description
Remove Abseil module placement specifications


### Motivation and Context
Allow Cmake defaults take place and possible redirection of all
submodules for sharing between the local builds.
2022-12-23 12:08:13 -08:00
Changming Sun
05137e6ec4
Use target name for flatbuffers (#13991)
### Description

Use target name for flatbuffers.
Add version range for flatbuffers. It is similar to #13870 
### Motivation and Context
To fix a build error:
```
CMake Error at onnxruntime_graph.cmake:88 (add_dependencies):
  The dependency target "flatbuffers" of target "onnxruntime_graph" does not
  exist.
Call Stack (most recent call first):
  CMakeLists.txt:1490 (include)
```

It happens when flatbuffers library is already installed. For example,
on Ubuntu people may get it from apt-get. But, the one provided by
Ubuntu 20.04 is not compatible with our code. The one in Ubuntu 22.04
works fine.
2022-12-20 11:44:02 -08:00
Changming Sun
fc2a6db573
Update absl to the latest release (#13990)
### Description
Update absl to a new version

### Motivation and Context
The new version contains fixes that are needed for Nvidia GPU build.
Once we update it to that version, we don't need to maintain our private
patches for Nvidia GPU build.
2022-12-19 14:25:13 -08:00
Changming Sun
05dc1165a5
Add protobuf version constraint (#13870)
To fix a build error:


/home/xxxxxxxxxxxxx/onnxruntime/build/Linux/Debug/tensorboard/compat/proto/cost_graph.pb.cc:17:8:
error:
‘PROTOBUF_INTERNAL_EXPORT_tensorboard_2fcompat_2fproto_2ftensor_5fshape_2eproto’
does not name a type
17 | extern
PROTOBUF_INTERNAL_EXPORT_tensorboard_2fcompat_2fproto_2ftensor_5fshape_2eproto
::PROTOBUF_NAMESPACE_ID::internal::SCCInfo<1>
scc_info_TensorShapeProto_tensorboard_2fcompat_2fproto_2ftensor_5fshape_2eproto;
2022-12-08 16:14:16 -08:00
Changming Sun
81c2defd3b
Remove unused git submodules (#13830) 2022-12-07 21:59:16 -08:00
Changming Sun
04900f96c1
Improve dependency management (#13523)
## Description
1. Convert some git submodules to cmake external projects
2. Update nsync from
[1.23.0](https://github.com/google/nsync/releases/tag/1.23.0) to
[1.25.0](https://github.com/google/nsync/releases/tag/1.25.0)
3. Update re2 from 2021-06-01 to 2022-06-01
4. Update wil from an old commit to 1.0.220914.1 tag
5. Update gtest to a newer commit so that it can optionally leverage
absl/re2 for parsing command line flags.

The following git submodules are deleted:

1. FP16
2. safeint
3. XNNPACK
4. cxxopts
5. dlpack
7. flatbuffers
8. googlebenchmark
9. json
10. mimalloc
11. mp11
12. pthreadpool

More will come.

## Motivation and Context
There are 3 ways of integrating 3rd party C/C++ libraries into ONNX
Runtime:
1. Install them to a system location, then use cmake's find_package
module to locate them.
2.  Use git submodules 
6.  Use cmake's external projects(externalproject_add). 

At first when this project was just started, we considered both option 2
and option 3. We preferred option 2 because:

1. It's easier to handle authentication. At first this project was not
open source, and it had some other non-public dependencies. If we use
git submodule, ADO will handle authentication smoothly. Otherwise we
need to manually pass tokens around and be very careful on not exposing
them in build logs.
2. At that time, cmake fetched dependencies after "cmake" finished
generating vcprojects/makefiles. So it was very difficult to make cflags
consistent. Since cmake 3.11, it has a new command: FetchContent, which
fetches dependencies when it generates vcprojects/makefiles just before
add_subdirectories, so the parent project's variables/settings can be
easily passed to the child projects.

And when the project went on,  we had some new concerns:
1. As we started to have more and more EPs and build configs, the number
of submodules grew quickly. For more developers, most ORT submodules are
not relevant to them. They shouldn't need to download all of them.
2. It is impossible to let two different build configs use two different
versions of the same dependency. For example, right now we have protobuf
3.18.3 in the submodules. Then every EP must use the same version.
Whenever we have a need to upgrade protobuf, we need to coordinate
across the whole team and many external developers. I can't manage it
anymore.
3. Some projects want to manage the dependencies in a different way,
either because of their preference or because of compliance
requirements. For example, some Microsoft teams want to use vcpkg, but
we don't want to force every user of onnxruntime using vcpkg.
7. Someone wants to dynamically link to protobuf, but our build script
only does static link.
8. Hard to handle security vulnerabilities. For example, whenever
protobuf has a security patch, we have a lot of things to do. But if we
allowed people to build ORT with a different version of protobuf without
changing ORT"s source code, the customer who build ORT from source will
be able to act on such things in a quicker way. They will not need to
wait ORT having a patch release.
9. Every time we do a release, github will also publish a source file
zip file and a source file tarball for us. But they are not usable,
because they miss submodules.
 
### New features

After this change, users will be able to:
1. Build the dependencies in the way they want, then install them to
somewhere(for example, /usr or a temp folder).
2. Or download the dependencies by using cmake commands from these
dependencies official website
3. Similar to the above, but use your private mirrors to migrate supply
chain risks.
4. Use different versions of the dependencies, as long as our source
code is compatible with them. For example, you may use you can't use
protobuf 3.20.x as they need code changes in ONNX Runtime.
6.  Only download the things the current build needs.
10. Avoid building external dependencies again and again in every build.

### Breaking change
The onnxruntime_PREFER_SYSTEM_LIB build option is removed you could think from now 
it is default ON. If you don't like the new behavior, you can set FETCHCONTENT_TRY_FIND_PACKAGE_MODE to NEVER.
Besides, for who relied on the onnxruntime_PREFER_SYSTEM_LIB build
option, please be aware that this PR will change find_package calls from
Module mode to Config mode. For example, in the past if you have
installed protobuf from apt-get from ubuntu 20.04's official repo,
find_package can find it and use it. But after this PR, it won't. This
is because that protobuf version provided by Ubuntu 20.04 is too old to
support the "config mode". It can be resolved by getting a newer version
of protobuf from somewhere.
2022-12-01 09:51:59 -08:00
Patrice Vignola
4128e44b4f
[DML EP] Upgrade DML to 1.10.0 (#13796)
### Description
Upgrade DML to 1.10.0
2022-11-30 21:32:14 -08:00
Changming Sun
29ed8811e5
Move C/C++ deps' URLs to deps.txt (#13769)
### Description
1. Move C/C++ deps' URLs to deps.txt, and download the dependencies from
Azure Devops Artifacts instead of github.
2. Add "EXCLUDE_FROM_ALL" keyword to the cmake external projects, so
that we only build the parts we need and avoid installing the 3rd-party
dependencies when people run `make install` in ORT's build directory.
However, at this moment cmake itself doesn't have the feature. So I
copied their code to cmake/external/helper_functions.cmake and modified
it.

This PR is split from #13523, to make that one smaller. 

### Motivation and Context
1. Secure the supply chain
2. Make it be possible to automatically detect if ORT has an old
dependency that hasn't been updated from a long time.
2022-11-29 18:06:35 -08:00
Peter Salas
b383312f4c
[tvm] Add support for int8 models, update TVM revision (#13519)
### Description
In the TVM EP, this adds more entries to the conversion from
`ONNXTensorElementDataType` to `DLDataType`. Additionally, it removes an
unused function and updates the TVM revision to allow running models
from recent revisions of TVM.

### Motivation and Context
In the TVM EP, the mapping from `ONNXTensorElementDataType` to
`DLDataType` was incomplete and neglected several integer types (in
particular `ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8` and
`ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8`) which prevented some models from
running.

Co-authored-by: Peter Salas <psalas@octoml.ai>
2022-11-08 11:28:32 -08:00
Changming Sun
efcbdac58e
Remove the cmake option: onnxruntime_DEV_MODE (#13573)
1. Remove the cmake option onnxruntime_DEV_MODE and replace it with
"--compile-no-warning-as-error"
2. Suppress some GSL warnings because now we treat nvcc diag warnings as
errors
2022-11-07 09:06:28 -08:00
Edward Chen
4401f50c5e
Change GSL download to use HTTPS URL. (#13563) 2022-11-04 18:01:18 -07:00
cloudhan
2de883c592
Update CK and fix performance issue on dev machine (#13531)
1. Update CK to its latest develop branch
2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's
performance, ensure it is properly configured.
- The flags are propagated from target `hip-lang::device`'s
`INTERFACE_COMPILE_OPTIONS`, we must not manually add the flags.
- Instead, we must ensure this target is properly configured by checking
_CMAKE_HIP_DEVICE_RUNTIME_TARGET is set.

TL,DR

`hip-lang::device` sometime will be not be properly configured if our
`CMAKE_PREFIX_PATH` is not configured carefully. In the CI docker, the
configuration is in good state, but on dev machine it is not, which then
silently result poor performance for kernels. We fixed it in this PR and
add a guard to avoid unsuccessful future editing and to prevent
convoluted debugging process.

`_CMAKE_HIP_DEVICE_RUNTIME_TARGET ` is shared in
`/opt/rocm/lib/cmake/hip-lang/hip-lang-config.cmake` and it is internal
to
[CMake](https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6121/diffs),
the variable name will not be changed in the foreseeable future.
2022-11-03 19:32:30 +08:00
George Nash
77be22f379
[oneDNN ep] Update from oneDNN v2.7.0 to oneDNN v2.7.1 (#13536)
The oneDNN 2.7.1 release includes multiple functional and performance
improvements.

Signed-off-by: George Nash <george.nash@intel.com>

### Description
Update the oneDNN library from 2.7.0 to 2.7.1. This contains multiple
functional and performance improvements.



### Motivation and Context
This is a minor point release from the oneDNN library that gives
performance and functional fixes that were found in the oneDNN 2.7
library shortly after release.

Signed-off-by: George Nash <george.nash@intel.com>
2022-11-02 15:57:49 -07:00
Changming Sun
b1e1b25e04
Delete CUB (#13534)
### Description
Delete CUB

### Motivation and Context
Because it is already in CUDA SDK.
2022-11-02 13:06:22 -07:00
Edward Chen
2ecd1d6622
Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
JiCheng
20c3c35c33
[XNNPACK] support building xnnpack EP for IOS (#13461)
### Description
support building xnnpack for IOS


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-10-28 15:03:04 +08:00
Changming Sun
4a20c0d98b
Delete zlib.cmake (#13467)
Delete the file because it is not included by any other file.
2022-10-27 15:36:04 -07:00
Rui Ren
136e15bfaf
revert cmake external file (#13459) 2022-10-26 11:38:15 -07:00
Ted Themistokleous
a561fde126
MIGraphX Execution Provider: Stream Synchronization (#12899)
**Description**: Changes to the MIGraphx execution provider code to
allow for stream synchronization on the gpu side

**Motivation and Context**
Performance boost by removing redundant host to device synchronizations 

The current implementation of the execution provider continuously calls
hipDeviceSynchronize() between computations which adds overhead and an
idle wait between the GPU's computations. This is noticeable during
device

This change leverages new functionality that's been added to MIGraphX to
allow for GPU side synchronization which avoids the need for
host->device waits.

To maintain backwards compatibility with older MIGraphX versions, the
compile time define MIGRAPHX_STREAM_SYNC has been added to the API to
allow for older version operate with newer builds of onnxruntime without
loss of functionality to the current feature set as of (08/09/22)

Co-authored-by: Ted Themistokleous <tthemist@amd.com>
2022-10-14 10:23:51 -07:00
Dmitri Smirnov
25c0a66934
Natvis adjustments to make debugging bearable (#13237)
### Description

- Fix Abseil::InlinedVector inlined storage visualization
- Fix typo in protobuf natvis.
- Add basic gsl.natvis


### Motivation and Context
Debugging is hard.
2022-10-10 10:06:55 -07:00
cloudhan
72076b1eb2
Update ROCm CI to use HIP LANGUAGE (#13214)
Update for ROCm CI before reland tunable GEMM #12853. This PR also update
composable kernel to use CMakes's HIP language support so that we can
mix C/C++ compiler with HIP compiler instead of locking to hip-clang
2022-10-05 16:15:16 +08:00
Yulong Wang
054464dce2
fix XNNPACK on WebAssembly SIMD (#13161)
### Description

fix XNNPACK on WebAssembly SIMD.

Flag "-msimd128" need to be applied to every source file when compiling
WASM SIMD. Currently only a part of the source files are compiled with
this flag so we get inconsistent result for
`sizeof(xnn_f32_minmax_params)` because the type definition include a
`#ifdef` for `__wasm_simd128__`. The inconsistency causes writing
garbage data to a stack variable and eventually cause the crash.

XNNPACK libraries are C libraries so need to apply the build flags not
only to `CMAKE_CXX_FLAGS` but also to `CMAKE_C_FLAGS`.
2022-09-30 16:34:15 -07:00
George Nash
b76a65c784
Upgrade the oneDNN ep to use oneDNNv2.7 (#13175)
### Description
This updates the oneDNN library used by oneDNN ep from version 2.6 to
version 2.7



### Motivation and Context
This brings in the many improvements incorporated into the oneDNN
library to the oneDNN execution provider.

Signed-off-by: George Nash <george.nash@intel.com>
2022-09-30 12:29:17 -07:00
Changming Sun
b25437ec41
Upgrade protobuf version (#13100)
Upgrade protobuf version from 3.18.1 to 3.18.3 to address CVE-2022-1941
2022-09-26 21:30:28 -07:00
RandySheriffH
77a066c700
Drop nuphar from java API (#13107)
Drop nuphar from:

- java API
- tvm.cmake
- run_build.sh
2022-09-26 17:06:08 -07:00
sumitsays
363c695dad
Update DML 1.9.0 to 1.9.1 (#12966)
Update DML to 1.9.1

Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
2022-09-15 10:54:22 -07:00
cloudhan
10f9a69707
Use CMake EXCLUDE_FROM_ALL for composable kernels to avoid building of conv related kernels (#12855) 2022-09-14 22:11:31 -07:00
Chun-Wei Chen
d819b56fba
Consume ONNX 1.12.1 to prevent vulnerability issue while loading external file (#12915)
* consume ONNX 1.12.1 to prevent vulnerability issue while loading external tensors

* update ONNX 1.12.1

* test updated PR

* use official rel-1.12.1 commit
2022-09-14 21:10:24 -07:00
Scott McKay
022d9e2d0c
Get files for XNNPACK wasm build from BUILD.bazel. (#12892)
Get files for wasm build from BUILD.bazel.
2022-09-09 12:38:57 -07:00
Guenther Schmuelling
f856be162e
fix xnnpack wasm build (#12845) 2022-09-06 19:20:07 -07:00
Yulong Wang
82a28cc2c3
upgrade emsdk to 3.1.19 (#12690)
* upgrade emsdk to 3.1.19

* fix build break

* ignore '-Wunused-but-set-variable' in eigen

* add malloc and free in exported functions

* EXPORTED_FUNCTIONS
2022-08-30 13:42:45 -07:00
cloudhan
46c074a6c8
Update composable kernel and enable experimental inter wave scheduling (#12626)
Update ck to latest master and enable interwave scheduling
2022-08-25 22:19:41 -07:00
Dmitri Smirnov
616677104a
ONNX Protobuf natvis with some google::protobuf (#12580)
ONNX Protobuf natvis with some google::protobuf structures
  Add leading underscore to local Intrinsic
2022-08-15 09:59:07 -07:00
Cheng
819c36701f
[xnnpack] basic QDQ operators support (#11912)
* basic ops for mobilenet,qconv,qsoftmax,qavgpool

update Xnnpack to latest

unit test

* NodeUnit: use outputedge to replace output-node

* qdq model e2e test

* use inlinedvector to replace vector

* conv bias check

* tensorshape helpers

* Refactor xnn_op minmax

* Qlinearsoftmax schema update

* Remove qlinearsoftmax registration

Co-authored-by: Jicheng Wen <jicwen@microsoft.com>
2022-08-11 10:12:51 +08:00
cloudhan
f39354d7cb
Add composable kernel GEMM baseline for kernel explorer (#12364)
* Split GemmBase RocBlasGemm

* Add composable kernel GEMM baseline

* Make linter happy

* Address review comment

* Update bert cases with batchsize

* Adjust includes to fix IWYU lint

* Only builds and links used ck kernels to improve building time

* Remove warmup run on SelectImpl

* Add comment to utility function

* Mute cpplint

* Make RocBlasGemm<T>::SelectImpl semantically correct

* Add reduced basic test cases for ck gemm

* More robust gemm testing

* Fix warnings

* Fix grammar
2022-08-04 17:32:20 -07:00
Dmitri Smirnov
a4ef0e7f7b
Remove dynamic allocation for ThreadPool ParallelSection (#12429)
Use InlinedVector in a TP
Store per thread parallel section in std::optional and avoid memory allocation
2022-08-04 09:46:16 -07:00
Dmitri Smirnov
eebaf5f270
Adjust and fixx abseil-cpp debugging visualization (#12415)
Move abseil-cpp.natvis file, add it to PDB, adjust visualization
2022-08-02 15:08:17 -07:00
Valery Chernov
1a4868e5c4
[TVM EP] Hot fix of build on Windows of TVM EP with ipp-crypto (#12381)
fix of build on Windows with ipp-crypto. cmake warnings fix

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-31 14:36:54 +02:00
Valery Chernov
e2423bb55c
[TVM EP] Build on Windows with ipp-crypto support (#12336)
* update TVM EP docs for ipp-crypto build conditions

* add ipp-crypto by ExternalProject_Add

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-28 15:40:19 +02:00
Valery Chernov
3b0aaa9e0e
[TVM EP] support build on Windows (#11851)
* add description of build ORT+TVM EP on Windows

* fix cmake error related to symlink creation on Windows

* add llvm config path to build flags for correct build on Windows

* update TVM_EP.md for llvm_config build arg

* fix warnings skipping during build on Windows

* fix using string or wstring for model path to correct build on Windows (MSVC error)

* fix error in custom logger for correct build on Windows

* implement glob algorithm for Windows

* additional build fixes

* update TVM with export of VM symbols for dll

* description of nasm issue and workaround

* update TVM with export of Executable from VM symbols for dll

* description of installation of ipp-crypto dependencies on Windows

* cmake key for ipp-crypto build

* fix wstring for TVMso EP

* fix ipp-crypto build

* cmake key onnxruntime_TVM_USE_HASH switch off not specific methods, but full hash functionality

* fix absolute path to compiled lib

* update TVM_EP.md, fix lint warnings

* update TVM_EP.md

* small fixes after review

* switch on handshake functionality for Linux workflow

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
2022-07-13 10:48:42 +02:00
Dwayne Robinson
32a8751dc4
DML EP Update to DML 1.9 (#12090)
* Update to DML 1.9

* Appease obnoxious Python formatting tool
2022-07-05 16:30:54 -07:00
Wenbing Li
479e71a7a8
enable the extensions custom build for java and android (#11823) 2022-07-05 10:34:14 -07:00