onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-31 23:27:43 +00:00

Author	SHA1	Message	Date
Edward Chen	4901987d1d	Remove SafeInt dependency from Objective-C API. (#13698 )	2022-11-18 17:06:12 -08:00
Changming Sun	3e9e5e9d6d	Patch Protobuf and ONNX's cmake files and enforce BinSkim check (#13694 ) Patch Protobuf and ONNX's cmake files and enforce BinSkim check. This PR overlaps with #13523. I would prefer to get this one merged first so that we can finish the BinSkim work, and I tried to make this PR as small as possible.	2022-11-18 10:09:47 -08:00
Changming Sun	7a57976d1a	Make natvis files work better (#13665 ) ### Description After this change, the GSL.natvis and wil.natvis files will be added to every onnxruntime_xxx project, like this: ![image](https://user-images.githubusercontent.com/856316/202081013-314145a8-7a0f-4f45-bf85-f9ed0e247c63.png) This is because in onnxruntime_common.cmake we have: ```cmake if (MSVC) set(ABSEIL_NATVIS_FILE "abseil-cpp.natvis") target_sources( onnxruntime_common INTERFACE $<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/external/${ABSEIL_NATVIS_FILE}>) endif() ``` It sets a property, INTERFACE_SOURCES, on the target "onnxruntime_common". Then if anyone else uses: ``` target_link_libraries(mytarget PRIVATE onnxruntime_common) ``` the natvis file will be added to `mytarget`. However, in this project we don't use such things for the targets that are static libraries. For example, onnxruntime_graph is a static library. Instead, we use the `onnxruntime_add_include_to_target` function to explicitly control what we want to propagate. The function was written before we started to have natvis files, so it doesn't pass a source file from one static library to another. Now we have the need, probably only for Windows. ### Motivation and Context Add natvis files to every project.	2022-11-17 19:13:40 -08:00
Jian Chen	8442d9df2c	Cjian/c4244 round 6 (#13663 ) ### Description Fix round 6	2022-11-16 16:26:11 -05:00
cloudhan	369a822409	Share TunableOp between CUDA and ROCM EP (#13560 ) Make TunableOp to support CUDA kernel authoring and add the corresponding supports for kernel explorer	2022-11-11 13:56:44 +08:00
Patrice Vignola	3482180ec2	DML EP add a registration for Shape and Size (#13442 ) ### Description Add a DML registration for Shape to avoid copying back to the CPU just to get the shape of a GPU tensor. ### Motivation and Context When using free dimensions, many Transformers models extensively use the `Shape` operator. This causes hundreds of GPU->CPU copy that should be completely avoidable. Note that this change also uses the same heuristics as other providers (e.g. CUDA) to force some tensors on the CPU in certain situations. Co-authored-by: Patrice Vignola <pavignol@microsoft.com>	2022-11-08 19:29:37 -08:00
Peter Salas	b383312f4c	[tvm] Add support for int8 models, update TVM revision (#13519 ) ### Description In the TVM EP, this adds more entries to the conversion from `ONNXTensorElementDataType` to `DLDataType`. Additionally, it removes an unused function and updates the TVM revision to allow running models from recent revisions of TVM. ### Motivation and Context In the TVM EP, the mapping from `ONNXTensorElementDataType` to `DLDataType` was incomplete and neglected several integer types (in particular `ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8` and `ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8`) which prevented some models from running. Co-authored-by: Peter Salas <psalas@octoml.ai>	2022-11-08 11:28:32 -08:00
Changming Sun	efcbdac58e	Remove the cmake option: onnxruntime_DEV_MODE (#13573 ) 1. Remove the cmake option onnxruntime_DEV_MODE and replace it with "--compile-no-warning-as-error" 2. Suppress some GSL warnings because now we treat nvcc diag warnings as errors	2022-11-07 09:06:28 -08:00
Changming Sun	23da468154	Upgrade cmake version to 3.24 (#13569 ) ### Description Upgrade cmake version to 3.24 because I need to use a new feature that is only provided in that version and later. Starting from cmake 3.24, the [FetchContent](https://cmake.org/cmake/help/latest/module/FetchContent.html#module:FetchContent) module and the [find_package()](https://cmake.org/cmake/help/latest/command/find_package.html#command:find_package) command now support integration capabilities, which means calls to "FetchContent" can be implicitly redirected to "find_package", and vice versa. Users can use a cmake variable to control the behavior. So, we don't need to provide such a build option. We can delete our "onnxruntime_PREFER_SYSTEM_LIB" build option and let cmake handle it. And it would be easier for who wants to use vcpkg. ### Motivation and Context Provide a unified package management method, and get aligned with the community. This change is split from #13523 for easier review.	2022-11-04 22:58:51 -07:00
George Nash	0296bc74c1	oneDNN ep bf16 enabling (#13484 ) ### Description This adds bfloat16 support to the oneDNN ep. When using the oneDNN ep this enables bfloat16 support for the following ops: Exp, Sigmoid, Tanh, Relu, MatMul, Gelu, BiasGelu, Add, Sub, Mul, Div, Sqrt, Pow, ReduceMean, Abs, Cast, Equal, FastGelu, FusedMatMul, Gemm, Greater, GreaterOrEqual, LeakyRelu, Less, LessOrEqual, LRN, ReduceOps, Reshape, Squeeze, Transpose, and Unsqueeze. LayerNorm is supported with some internal casting. BatchNorm only enables BFloat16 for input and output; scale and bias still need fp32 input. Added bfloat16 unit tests for all of the operators in question. When possible we reused the already existing unit tests that were added by the CUDA and ROCM eps. In many of the unit tests an unusual pattern will be seen: #if defined(USE_DNNL) TEST(Test, bfloat16_test) { #if defined(USE_DNNL) // oneDNN ep specific code #endif //test code } #endif Although it looks unusual, this was done on purpose: if another ep implements bfloat16 support for that operator, they will be able to enable the unit test by adding their execution provider to the first line without needing to edit inside the test. Example: `#if defined(USE_CUDA) || defined(USE_DNNL)`. See the MatMul_float16 test in matmul_test.cc for an example of how this is useful. Additionally, two new ISA checks (AVX512_BF16 and AMX-BF16) were added to the cpuid_info code. This was important for detecting whether bfloat16 operations are supported by the CPU. ### Motivation and Context This expands the capabilities of the oneDNN execution provider to support models containing bfloat16 operations. Signed-off-by: George Nash <george.nash@intel.com> Signed-off-by: Ruihan-Yin <ruihan.yin@intel.com>	2022-11-04 18:25:09 -07:00
Edward Chen	4401f50c5e	Change GSL download to use HTTPS URL. (#13563 )	2022-11-04 18:01:18 -07:00
cloudhan	2de883c592	Update CK and fix performance issue on dev machine (#13531 ) 1. Update CK to its latest develop branch 2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's performance, ensure it is properly configured. - The flags are propagated from target `hip-lang::device`'s `INTERFACE_COMPILE_OPTIONS`, so we must not manually add the flags. - Instead, we must ensure this target is properly configured by checking that _CMAKE_HIP_DEVICE_RUNTIME_TARGET is set. TL;DR: `hip-lang::device` will sometimes not be properly configured if our `CMAKE_PREFIX_PATH` is not configured carefully. In the CI docker, the configuration is in a good state, but on a dev machine it is not, which then silently results in poor performance for kernels. We fixed it in this PR and added a guard to avoid unsuccessful future editing and to prevent a convoluted debugging process. `_CMAKE_HIP_DEVICE_RUNTIME_TARGET` is shared in `/opt/rocm/lib/cmake/hip-lang/hip-lang-config.cmake` and it is internal to [CMake](https://gitlab.kitware.com/cmake/cmake/-/merge_requests/6121/diffs); the variable name will not be changed in the foreseeable future.	2022-11-03 19:32:30 +08:00
George Nash	77be22f379	[oneDNN ep] Update from oneDNN v2.7.0 to oneDNN v2.7.1 (#13536 ) The oneDNN 2.7.1 release includes multiple functional and performance improvements. Signed-off-by: George Nash <george.nash@intel.com> ### Description Update the oneDNN library from 2.7.0 to 2.7.1. This contains multiple functional and performance improvements. ### Motivation and Context This is a minor point release from the oneDNN library that gives performance and functional fixes that were found in the oneDNN 2.7 library shortly after release. Signed-off-by: George Nash <george.nash@intel.com>	2022-11-02 15:57:49 -07:00
Changming Sun	b1e1b25e04	Delete CUB (#13534 ) ### Description Delete CUB ### Motivation and Context Because it is already in CUDA SDK.	2022-11-02 13:06:22 -07:00
Wei-Sheng Chin	b5904c40dd	Enable ORT in TorchDynamo (#13259 ) This PR enables ORT to execute graphs captured by TorchDynamo. Major compilation code is in `OrtBackend.compile` in ort_backend.py. `register_backend.py` is for plugging `OrtBackend` into TorchDynamo as a compiler.	2022-11-01 11:19:29 -07:00
PeixuanZuo	c8886c5b4c	Revert "Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493 )" This reverts commit `4dd053cc15`.	2022-11-01 13:05:55 +08:00
Baiju Meswani	c557a55816	Fix on-device training ExportModelForInferencing api (#13510 )	2022-10-31 21:29:06 -07:00
Edward Chen	2ecd1d6622	Switch GSL to MS GSL 4.0.0 (#13416 )	2022-10-29 04:15:20 -07:00
cloudhan	4dd053cc15	Update CK and fix performance due to lacking -amdgpu-early-inline-all=true (#13493 ) 1. Update CK to its latest develop branch 2. `-mllvm -amdgpu-early-inline-all=true` is critical to CK's performance, add it.	2022-10-28 09:36:00 -07:00
JiCheng	20c3c35c33	[XNNPACK] support building xnnpack EP for IOS (#13461 ) ### Description support building xnnpack for IOS	2022-10-28 15:03:04 +08:00
Changming Sun	4a20c0d98b	Delete zlib.cmake (#13467 ) Delete the file because it is not included by any other file.	2022-10-27 15:36:04 -07:00
Rui Ren	136e15bfaf	revert cmake external file (#13459 )	2022-10-26 11:38:15 -07:00
cloudhan	2748f38362	Drop hip_add_library (#13406 ) Switching to use CMake's builtin hip language support.	2022-10-25 12:57:48 +08:00
cloudhan	928c9fc348	Hipify during build instead of before cmake config (#13333 ) ### Description Currently, hipify happens before cmake is configured and then cmake globs the directories. This gets rid of that customized python threading logic and opts for the build system itself to generate the files. This also supersedes the half-baked branch [sukha/hipify-with-cmake](https://github.com/microsoft/onnxruntime/tree/sukha/hipify-with-cmake)	2022-10-20 22:46:22 -07:00
Ted Themistokleous	a561fde126	MIGraphX Execution Provider: Stream Synchronization (#12899 ) Description: Changes to the MIGraphx execution provider code to allow for stream synchronization on the gpu side Motivation and Context Performance boost by removing redundant host to device synchronizations The current implementation of the execution provider continuously calls hipDeviceSynchronize() between computations which adds overhead and an idle wait between the GPU's computations. This is noticeable during device This change leverages new functionality that's been added to MIGraphX to allow for GPU side synchronization which avoids the need for host->device waits. To maintain backwards compatibility with older MIGraphX versions, the compile time define MIGRAPHX_STREAM_SYNC has been added to the API to allow for older version operate with newer builds of onnxruntime without loss of functionality to the current feature set as of (08/09/22) Co-authored-by: Ted Themistokleous <tthemist@amd.com>	2022-10-14 10:23:51 -07:00
cloudhan	790e363909	Reland: Change ROCm to use tunable GEMM (#13231 ) Reland: Change ROCm to use tunable GEMM (#12853)	2022-10-13 21:49:42 -07:00
Wei-Sheng Chin	dc324b1d90	[LazyTensor] Make LORT Build Again with Latest PyTorch (#13303 ) `python setup.py develop` doesn't install PyTorch as a normal package in site-packages anymore, and the user must stay in PyTorch's root directory to call `import torch`. This breaks LORT tests because they contain `import torch` and are called outside PyTorch's root directory. To make PyTorch a normal package again, this PR builds PyTorch with `python setup.py install`.	2022-10-13 13:56:17 -07:00
Dmitri Smirnov	25c0a66934	Natvis adjustments to make debugging bearable (#13237 ) ### Description - Fix Abseil::InlinedVector inlined storage visualization - Fix typo in protobuf natvis. - Add basic gsl.natvis ### Motivation and Context Debugging is hard.	2022-10-10 10:06:55 -07:00
cloudhan	51ac6617f5	Fix warnings and enable dev mode for ROCm CI (#13223 ) Fix warnings and enable dev mode for ROCm CI: * Fix ROCm headers complaining "This file is deprecated. Use the header file from ..." * Disable the signed and unsigned compare warning for kernel explorer * Fix unused and nodiscard warnings * Enable dev mode for ROCm CI * Work around error "unknown warning option '-Wno-nonnull-compare'" in kernel explorer by using '-Wno-unknown-warning-option' to ignore the unknown option * Fix error "unused parameter 'mask'" * Fix warning "instantiation of variable 'onnxruntime::rocm::Consts<float>::One' required here, but no definition is available", etc. Fixed by using C++17's inline (implied by constexpr) static initialization. * Remove unused variable * Add the missing `override` specifier	2022-10-07 09:45:01 +08:00
Edward Chen	4e37464cc5	Add build configuration to binary size checks pipeline. (#13208 ) Add another build configuration to binary size checks pipeline. Enable additional configurations to be added more easily.	2022-10-05 12:39:19 -07:00
cloudhan	72076b1eb2	Update ROCm CI to use HIP LANGUAGE (#13214 ) Update for ROCm CI before relanding tunable GEMM #12853. This PR also updates composable kernel to use CMake's HIP language support so that we can mix a C/C++ compiler with the HIP compiler instead of locking to hip-clang	2022-10-05 16:15:16 +08:00
Yulong Wang	054464dce2	fix XNNPACK on WebAssembly SIMD (#13161 ) ### Description fix XNNPACK on WebAssembly SIMD. The flag "-msimd128" needs to be applied to every source file when compiling WASM SIMD. Currently only some of the source files are compiled with this flag, so we get inconsistent results for `sizeof(xnn_f32_minmax_params)` because the type definition includes an `#ifdef` for `__wasm_simd128__`. The inconsistency causes garbage data to be written to a stack variable and eventually causes the crash. XNNPACK libraries are C libraries, so the build flags need to be applied not only to `CMAKE_CXX_FLAGS` but also to `CMAKE_C_FLAGS`.	2022-09-30 16:34:15 -07:00
George Nash	b76a65c784	Upgrade the oneDNN ep to use oneDNNv2.7 (#13175 ) ### Description This updates the oneDNN library used by oneDNN ep from version 2.6 to version 2.7 ### Motivation and Context This brings in the many improvements incorporated into the oneDNN library to the oneDNN execution provider. Signed-off-by: George Nash <george.nash@intel.com>	2022-09-30 12:29:17 -07:00
cloudhan	c93cb8f949	Revert "Enable ROCm to use tunable GEMM" (#13160 ) Reverts microsoft/onnxruntime#12853 due to CI pipeline problem.	2022-09-30 14:01:16 +08:00
cloudhan	32c2c4b480	Change ROCm to use tunable GEMM (#12853 ) Change ROCm to use tunable GEMM. It is not enabled in this PR. This will drastically improve GEMM performance for some shape and dtype configurations. This will benefit the overall performance for BERT inference and hopefully, training, when enabled.	2022-09-28 16:21:54 +08:00
Rachel Guo	9a44a69653	Refactor NNAPI EP OpBuilder/OpSupportChecker structure (#13065 ) ### Description As title - Split long OpBuilder and OpSupportChecker files into individual operator files. - Add OpBuilder/SupportChecker registry factories. - Combine the functionality of op_builder and op_support_checker into one op_builder. ### Motivation and Context The NNAPI OpBuilder was split into OpBuilder (for EP::Compile) and OpSupportChecker (for EP::GetCapability). At the time this was a reasonable choice, but OpBuilder and OpSupportChecker share some logic and have to use an additional helper. Clean up now to merge NNAPI OpBuilder/OpSupportChecker into a single OpBuilder (similar to what the CoreML EP has)	2022-09-27 17:12:09 -07:00
Changming Sun	b25437ec41	Upgrade protobuf version (#13100 ) Upgrade protobuf version from 3.18.1 to 3.18.3 to address CVE-2022-1941	2022-09-26 21:30:28 -07:00
RandySheriffH	77a066c700	Drop nuphar from java API (#13107 ) Drop nuphar from: - java API - tvm.cmake - run_build.sh	2022-09-26 17:06:08 -07:00
RandySheriffH	a83a9ed6b0	Remove miscellaneous nuphar configs (#13070 ) Remove a handful of nuphar related configurations after deprecation. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-09-26 13:41:28 -07:00
Dale Phurrough	2ae33b3613	fix CuDNN lib path for Windows (#12974 ) Fixes microsoft/onnxruntime#12969 ### Motivation and Context Build is broken, can't find cudnn.lib with nvidia official install of cuDNN Alternative method is to use `IF(EXISTS ${onnxruntime_CUDNN_HOME}/lib/x64/cudnn.lib)` to test for legacy location and only add the legacy dir to the path, else add the current official `lib/` dir.	2022-09-26 13:23:38 -07:00
Changming Sun	eafd67b8fd	Update CUDA version to 11.6 and refactor python packaging pipeline (#13002 ) 1. Update CUDA version from 11.4 to 11.6. 2. Update Manylinux version 3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet. 4. Refactor python packaging pipeline: a. Split the Linux GPU build job into two parts, build and test, so that the build part doesn't need to use a GPU machine b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file. 5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipelines to fail. I have created an ADO task for this.	2022-09-23 00:29:27 -07:00
cloudhan	a24b41d92e	Move all TunableOp related facilities to EP level directory (#12857 ) Some Ops in the EP directory instead of the contrib_ops directory will require TunableOp. We will also need to add EP level session tuning options for it. So move all of that code at once. Also remove duplicated utility functions.	2022-09-23 11:10:19 +08:00
wangxiyuan	952c99304a	Add CANN EP (#12416 ) Description: This PR adds Ascend CANN execution provider support. Motivation and Context: As shown in the linked issue, CANN is the API layer for Ascend processors. Adding a CANN EP allows users to run ONNX models on Ascend hardware via onnxruntime. The detailed changes: 1. Added CANN EP framework. 2. Added the basic operators to support ResNet and VGG models. 3. Added C/C++ and Python API support. Related issue: https://github.com/microsoft/onnxruntime/issues/11477 Author: lijiawei <lijiawei19@huawei.com> wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: FFrog <ljw1101.vip@gmail.com>	2022-09-22 14:53:40 -07:00
sfatimar	cccbe90764	Openvino ep 2022.2 v4.2 (#13023 ) These changes align the OV 2022.2 release with ORT. Changes: CPU FP16 support, dGPU support, RHEL Dockerfile, Ubuntu 20 Dockerfile. Motivation and Context: This change is required to ensure the ORT-OpenVINO Execution Provider is aligned with the latest changes. Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: shamaksx <shamax.kshirsagar@intel.com> Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com> Co-authored-by: pratiksha <mohsinx.mohammad@intel.com> Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: nmaajidk <n.maajid.khan@intel.com> Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com> Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>	2022-09-22 12:31:40 -07:00
Adam Louly	268bfe2a5d	python training api bindings (#12610 ) Description: Python API Bindings for on device training. Motivation and Context - This PR contains api bindings so python users can perform a whole training loop. Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-09-16 09:38:24 -07:00
sumitsays	363c695dad	Update DML 1.9.0 to 1.9.1 (#12966 ) Update DML to 1.9.1 Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>	2022-09-15 10:54:22 -07:00
cloudhan	10f9a69707	Use CMake EXCLUDE_FROM_ALL for composable kernels to avoid building of conv related kernels (#12855 )	2022-09-14 22:11:31 -07:00
Chun-Wei Chen	d819b56fba	Consume ONNX 1.12.1 to prevent vulnerability issue while loading external file (#12915 ) * consume ONNX 1.12.1 to prevent vulnerability issue while loading external tensors * update ONNX 1.12.1 * test updated PR * use official rel-1.12.1 commit	2022-09-14 21:10:24 -07:00
Scott McKay	022d9e2d0c	Get files for XNNPACK wasm build from BUILD.bazel. (#12892 ) Get files for wasm build from BUILD.bazel.	2022-09-09 12:38:57 -07:00
pallavides	6ebb7b91eb	Re-apply fix for mkl issue for eager mode (#12881 ) * reapply fix for mkl issue for eager mode * add comment, update link libs	2022-09-08 12:29:24 -07:00
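
The nested guard pattern described in the oneDNN bf16 commit (0296bc74c1) can be sketched without a test framework roughly as follows. This is only an illustration under stated assumptions: `BFloat16TestProviders` is a hypothetical helper, the `USE_DNNL` define is hard-coded here to stand in for a build flag, and none of this is code from the repository. The outer `#if` decides whether the test exists at all, while the inner `#if` blocks hold the per-provider setup, so another execution provider could opt in by extending only the outer condition and adding its own inner block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption for this sketch: pretend the build enabled the oneDNN EP.
// In a real build this macro would come from the build system instead.
#define USE_DNNL 1

// Outer guard: the test body is compiled if ANY listed provider is enabled.
#if defined(USE_CUDA) || defined(USE_DNNL)
// Hypothetical helper standing in for the body of a bfloat16 unit test.
std::vector<std::string> BFloat16TestProviders() {
  std::vector<std::string> providers;
#if defined(USE_CUDA)
  providers.emplace_back("CUDA");  // CUDA-specific setup would go here
#endif
#if defined(USE_DNNL)
  providers.emplace_back("DNNL");  // oneDNN-specific setup would go here
#endif
  // The outer guard guarantees at least one provider block was compiled in.
  assert(!providers.empty());
  return providers;
}
#endif
```

With only `USE_DNNL` defined, calling `BFloat16TestProviders()` yields a single entry, `"DNNL"`; defining `USE_CUDA` as well would add a second entry without touching the shared part of the function.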

1190 commits