onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00

Author	SHA1	Message	Date
Yi Zhang	80f807c03d	upgrade protobuf to 3.20.2 and onnx to 1.13 (#14279 ) ### Description upgrade protobuf to 3.20.2, same as onnx 1.13.0 ### Motivation and Context Per component governance requirement and Fixes #14060 unused-parameter error occurs in 2 conditions. 1. compile protobuf `onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66: error: unused parameter ‘prototype’ [-Werror=unused-parameter]` 2. include onnx_pb.h ``` 2023-01-28T10:20:15.0410853Z FAILED: CMakeFiles/onnxruntime_pybind11_state.dir/onnxruntime_src/onnxruntime/python/onnxruntime_pybind_iobinding.cc.o ...... 2023-01-28T10:20:15.0466024Z from /build/Debug/_deps/onnx-src/onnx/onnx_pb.h:51, 2023-01-28T10:20:15.0466958Z from /onnxruntime_src/include/onnxruntime/core/framework/to_tensor_proto_element_type.h:10, .... 2023-01-28T10:20:15.0609678Z /build/Debug/_deps/onnx-build/onnx/onnx-operators-ml.pb.h:1178:25: required from here 2023-01-28T10:20:15.0610895Z /onnxruntime_src/cmake/external/protobuf/src/google/protobuf/repeated_ptr_field.h:752:66: error: unused parameter ‘prototype’ [-Werror=unused-parameter] 2023-01-28T10:20:15.0611707Z cc1plus: all warnings being treated as errors ``` https://dev.azure.com/onnxruntime/2a773b67-e88b-4c7f-9fc0-87d31fea8ef2/_apis/build/builds/874605/logs/22	2023-01-31 12:55:09 -08:00
pengwa	e2dd1315c7	Fix build for --enable_language_interop_ops + DISABLE_ABSEIL=ON (#14469 ) ### Fix build error on Windows when building with " --enable_language_interop_ops -cmake_extra_defines onnxruntime_DISABLE_ABSEIL=ON" This is a subsequent fix after https://github.com/microsoft/onnxruntime/pull/14309, which fixed build for onnxruntime_DISABLE_ABSEIL=ON build. Going further, if we enable --enable_language_interop_ops, there are the following two errors: ``` test_symm_qgemm.cpp test_transpose.cpp onnxruntime_session.lib(inference_session.obj) : error LNK2019: unresolved external symbol "void __cdecl onnxruntime::L oadInterOp(class std::basic_string<wchar_t,struct std::char_traits<wchar_t>,class std::allocator<wchar_t> > const &,cla ss std::vector<struct Ort::CustomOpDomain,class std::allocator<struct Ort::CustomOpDomain> > &,class std::function<void __cdecl(char const )> const &)" (?LoadInterOp@onnxruntime@@YAXAEBV?$basic_string@_WU?$char_traits@_W@std@@V?$allocato r@_W@2@@std@@AEAV?$vector@UCustomOpDomain@Ort@@V?$allocator@UCustomOpDomain@Ort@@@std@@@3@AEBV?$function@$$A6AXPEBD@Z@3 @@Z) referenced in function "public: __cdecl <lambda_f3a907e0b0a0e11d80d305605215cce8>::operator()(class std::shared_pt r<class onnxruntime::Model> &)const " (??R<lambda_f3a907e0b0a0e11d80d305605215cce8>@@QEBA@AEAV?$shared_ptr@VModel@onnxr untime@@@std@@@Z) [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_trainer.vcxproj] onnxruntime_session.lib(inference_session.obj) : error LNK2019: unresolved external symbol "void __cdecl onnxruntime::L oadInterOp(class onnx::ModelProto const &,class std::vector<struct Ort::CustomOpDomain,class std::allocator<struct Ort: :CustomOpDomain> > &,class std::function<void __cdecl(char const )> const &)" (?LoadInterOp@onnxruntime@@YAXAEBVModelP roto@onnx@@AEAV?$vector@UCustomOpDomain@Ort@@V?$allocator@UCustomOpDomain@Ort@@@std@@@std@@AEBV?$function@$$A6AXPEBD@Z@ 5@@Z) referenced in function "public: __cdecl 
<lambda_340b7b787b9c0f81848d348e60fe6c91>::operator()(class std::shared_p tr<class onnxruntime::Model> &)const " (??R<lambda_340b7b787b9c0f81848d348e60fe6c91>@@QEBA@AEAV?$shared_ptr@VModel@onnx runtime@@@std@@@Z) [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_trainer.vcxproj] C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxruntime_test_trainer.exe : fatal error LNK1120: 2 unresolved externals [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_trainer. vcxproj] onnxruntime.vcxproj -> C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxruntime.dll onnxruntime_test_utils.vcxproj -> C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo\onnxrun time_test_utils.lib CUDACOMPILE : nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). [C:\Users\pengwa\dev\onnxruntime \build\Windows\RelWithDebInfo\custom_op_library.vcxproj] cuda_ops.cu CUDACOMPILE : nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
[C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_cuda_ops_lib.vcxproj] ``` ``` kernel_type_str_resolver_utils_test.cc local_kernel_registry_test.cc C:\Users\pengwa\dev\onnxruntime\onnxruntime\test\framework\allocation_planner_test.cc(1388,9): error C2220: the following warning is treated as an error [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_all.vcxproj] C:\Users\pengwa\dev\onnxruntime\onnxruntime\test\framework\allocation_planner_test.cc(1388,9): warning C4067: unexpected tokens following preprocessor directive - expected a newline [C:\Users\pengwa\dev\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_test_all.vcxproj] ```	2023-01-31 12:34:45 +08:00
Ankit	a5b620e79d	[Build] Fix arm64 Docker build (#14283 )	2023-01-30 16:25:19 -08:00
Wei-Sheng Chin	679ae7ff33	[Java] Fix warnings (#14076 ) Fix C6011, C6385, C6386 found by Visual Studio. Basically, I set the maximum number of options for every EP to 128. To my knowledge, 128 is big enough to support all EPs. To support an arbitrary number of EP options, we probably need #13999 and a "std::vector"-like struct in C.	2023-01-30 09:22:28 -08:00
Ashwini Khade	764202d740	fix prefast warning (#14446 ) ### Description Fixes a prefast warning: https://aiinfra.visualstudio.com/ONNX%20Runtime/_workitems/edit/11113	2023-01-30 09:13:39 -08:00
cloudhan	3b6d551c35	Enable ccache for HIP objects (#14465 ) This enables HIP compiler to be launched with `ccache` when build with `--use_cache`	2023-01-28 22:34:24 +08:00
Vincent Wang	7aecb2150f	Fix onnxruntime-CI-nightly-ort-pipeline Failure (#14464 ) PyTorch skipped version 1.14 and jumped to 2.0, while the image for the onnxruntime-CI-nightly-ort-pipeline is still using nightly-ubuntu2004-cu116-py38-torch1140dev. Switch to the new torch version image to fix the failure of the pipeline.	2023-01-28 16:05:56 +08:00
Vincent Wang	91d42e9d85	Tool to Convert ONNX Model to TFEvents (#14160 ) A tool to convert ONNX model to tfevents so that we can use tensorboard to open it for visualization. This is especially useful for debugging when the ONNX model is too large to open by Netron. usage: onnx2tfevents.py [-h] [--logdir LOGDIR] [--model MODEL]	2023-01-28 15:09:15 +08:00
Yulong Wang	d9219685ad	always set OpSchema in CreateNodeHelper() (#14356 ) ### Description as a more generic solution to #13660, always set OpSchema in CreateNodeHelper() so that added nodes by transformers will have OpSchema set	2023-01-27 16:56:14 -08:00
dependabot[bot]	b5b70eaa8c	Bump ua-parser-js from 0.7.31 to 0.7.33 in /js/web (#14435 )	2023-01-27 23:22:48 +00:00
Zhang Lei	f87dd408f6	Support long sequence in attention (#14371 ) Support long sequence in attention operator for (1) raw mask of 2/3/4-D, (2) no mask. Set longer greedy search max length.	2023-01-27 09:39:09 -08:00
shalvamist	368d2fc11e	Added E2E test for Image Tensor API (#14406 ) ### Description Added E2E test. Currently covering: URL --> Tensor; ImageData --> Tensor; HTML Image Element --> Tensor; Tensor --> ImageData. Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2023-01-27 08:54:27 -08:00
Wei-Sheng Chin	4ef64f3681	Fix warning c26409 (#14079 ) We should avoid using `new` and `delete` in C/C++ code whenever possible as suggested by VC compiler.	2023-01-26 15:43:53 -08:00
Yulong Wang	de11527d76	[js] fix js/web bundle (#14434 ) ### Description make sure "crypto" is not processed by webpack for browser configuration	2023-01-26 14:43:09 -08:00
Rui Ren	eacd829d23	Bump ORT version number (#14226 ) ### Description Bump ort version after the creation of release candidate of 1.14 Co-authored-by: ruiren <ruiren@microsoft.com>	2023-01-26 12:33:47 -08:00
Ye Wang	d9c744ed9a	Fix a bug in t5 beamsearch with half precision (#14436 ) the CreateEncoderInputs functor was passed to the ctor as nullptr when type is MLFloat16.	2023-01-26 11:14:22 -08:00
liqun Fu	2b1a59f01a	cpu support of LpPool(18) (#14205 ) Signed-off-by: Liqun Fu <liqfu@microsoft.com> ### Description To support LpPool (18) ### Motivation and Context for Ort 1.14 release Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-01-25 23:14:56 -08:00
Sumit Agarwal	edb377f2cb	[DML EP] Upgrade DML to 1.10.1 (#14433 ) ### Description Updated DirectML version to 1.10.1 (https://www.nuget.org/packages/Microsoft.AI.DirectML/1.10.1)	2023-01-25 21:07:10 -08:00
Pranav Sharma	3b8dfe2e27	Don't use free to satisfy Prefast requirements (#14354 ) ### Description Don't use free to satisfy Prefast requirements ### Motivation and Context Fix ADO#9004	2023-01-25 18:50:18 -08:00
Yulong Wang	4d9ddb5193	[js] upgrade packages in js/web/test/e2e (#14334 ) ### Description upgrade versions to latest to avoid security vulnerabilities.	2023-01-25 18:03:48 -08:00
Thiago Crepaldi	32c05fcdd1	Add Col2Im CPU op (#12311 ) Description This PR implements N-dimensional Col2Im as a contrib CPU Op as specified by ONNX's https://github.com/onnx/onnx/pull/3948 Motivation and Context - Col2Im enables models such as: - [SS-DCNet](https://github.com/xhp-hust-2018-2011/SS-DCNet) - [DSTT](https://github.com/ruiliu-ai/DSTT) - It also serves to document the ORT's obscure `math::Col2ImNd` utility Signed-off-by: Liqun Fu <liqfu@microsoft.com> Co-authored-by: Liqun Fu <liqfu@microsoft.com>	2023-01-25 12:23:00 -08:00
Tianlei Wu	94b1791974	Upgrade CUTLASS to v2.11 and add sequence length threshold for cutlass FMHA (#14401 ) ### Description Add sequence length threshold for triggering cutlass FMHA in FP32. See performance test results in https://github.com/microsoft/onnxruntime/pull/14343 to see how this threshold is selected. Upgrade cutlass to v2.11 and update deps.txt and cgmanifest for nuget pipeline build (test build: https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=268574&view=results)	2023-01-25 09:43:48 -08:00
Edward Chen	7cc9aed314	Android package custom build script update (#14403 ) Update Android package custom build script. - Use later version of various dependencies (CMake, JDK, Android command line tools, Android NDK, Ubuntu). The CMake version was too old for the current ORT code. - Do in-container build in a directory that is not shared with the host. Resolves some file permission issues and speeds up file access. Add a nightly build to make sure the script works with the latest ORT.	2023-01-25 09:19:05 -08:00
Edward Chen	3bc092b1ea	Update ORT format v5 change docs to cover limited backwards compatibility in 1.14. (#14413 )	2023-01-25 08:23:12 -08:00
Adrian Lizarraga	85d7e9c596	Fix unused variable for CUDA EP builds with USE_FLASH_ATTENTION off (#14404 ) ### Description Fixes unused `use_memory_efficient_attention` variable in contrib_ops/cuda/bert/attention_impl.cu. ### Motivation and Context ORT with CUDA version < 11.6 fails to build for release configurations due to an unused variable. ```shell c:\...\onnxruntime\onnxruntime\contrib_ops\cuda\bert\attention_impl.cu(420): error : variable "use_memory_efficient_attention" was declared but never referenced [C:\...\onnxruntime\build\Windows\RelWithDebInfo\onnx runtime_providers_cuda.vcxproj] detected during instantiation of "onnxruntime::common::Status onnxruntime::contrib::cuda::QkvToContext(const cudaDeviceProp &, cublasHandle_t &, cudaStream_t, onnxruntime::contrib::AttentionParameters &, onnxruntime::contrib::cuda::AttentionData<T> &) [wit h T=float]" (923): here ``` This happens for CUDA < 11.6. Our cmake script turns off onnxruntime_USE_FLASH_ATTENTION for CUDA < 11.6, which leaves the aforementioned variable unused outside of asserts (which are removed in release builds). The USE_FLASH_ATTENTION option was added by https://github.com/microsoft/onnxruntime/pull/14343	2023-01-24 09:31:57 -08:00
Edward Chen	3c1ef7dee6	Fix CI build with no Abseil. (#14400 ) Use '\|\|' instead of 'or' in onnxruntime/core/optimizer/attention_fusion_helper.h.	2023-01-24 09:17:35 -08:00
Kevin Chen	81120e9e8b	Add custom tolerance option for onnx_test_runner (#13683 ) Signed-off-by: Kevin Chen <kevinch@nvidia.com> ### Description Add a `-t` option for `onnx_test_runner` to allow users to specify custom tolerance values when running ONNX models. ### Motivation and Context For some backends, the default tolerance of 1e-5 is too tight to pass accuracy checks with ONNX model zoo reference values, especially if only one or two values are mismatched. Having a custom option will allow different backends to specify their own custom tolerance when running these models. Signed-off-by: Kevin Chen <kevinch@nvidia.com>	2023-01-23 16:42:36 -08:00
liqun Fu	7b6d880b28	cpu to support bitwise ops (#14197 )	2023-01-23 16:42:18 -08:00
sfatimar	77b455b969	Ort openvino 4.3 cli (#14341 ) ### Description Introduce cache_dir CLI for graph serialisation. Replace the existing use_compile_network and blob_dump_path CLI options for OpenVINO with a single command-line option, "cache_dir", that specifies the path used for blob dump/load, improving the developer experience. ### Motivation and Context There were two values for setting the cache directory, which was unnecessary. Co-authored-by: Preetha <preetha.veeramalai@intel.com>	2023-01-23 14:17:52 -08:00
Scott McKay	c252a7f992	Remove exclusions for ONNX model tests that now pass. (#14337 ) ### Description Remove exclusions for ONNX model tests that now pass due to kernels being implemented. Update ONNX update doc to point to correct location for tests. ### Motivation and Context Run as many tests as possible. Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-01-24 08:04:27 +10:00
liqun Fu	05915d8393	support Pad(18) (#14219 )	2023-01-23 12:14:35 -08:00
Hector Li	f03c507cf0	Fix fuzz test (#14385 ) Fix fuzz test	2023-01-22 22:17:43 -08:00
Nat Kershaw (MSFT)	abaed6f474	Add link to Python API examples (#14345 )	2023-01-21 16:23:16 -08:00
Tianlei Wu	a95fcb4345	UNet fusion and fp16 conversion for stable diffusion (#14248 ) Add script to fuse nodes to optimized operators in stable diffusion 1.5 models, and a script to convert fp32 models to fp16 models. Tested with stable diffusion 1.5. Note that the optimized model needs onnxruntime-gpu v1.14 (release candidate will be available soon). Note: We will update the script to work with latest diffusers and stable diffusion v2 and v2.1 models.	2023-01-21 10:16:44 -08:00
Nat Kershaw (MSFT)	e57c312f9d	Pin sphinx to avoid broken link (#14383 )	2023-01-21 09:50:56 -08:00
Yi Zhang	cf3661ff6d	Revert "Allow PostAnalysis@2 task to continue on error for Windows_Packaging_CPU_x86_default (#14332)" (#14375 ) This reverts commit `a491f33f54`. ### Motivation and Context It looks like an ADO issue. It has now recovered, so it can be re-enabled.	2023-01-21 09:32:39 +08:00
Nat Kershaw (MSFT)	0d40119624	Fix broken link (#14368 ) Fixes #11661	2023-01-20 15:55:03 -08:00
Ye Wang	de7a868d5f	Update quantization_defs.cc (#14380 )	2023-01-20 15:03:50 -08:00
Hariharan Seshadri	2d8ee5251c	Misc transformer fixes - 3 (#14320 )	2023-01-20 13:57:57 -08:00
kunal-vaishnavi	72821a6113	Add PyTorch 2.0 to ORT transformer benchmarking (#14300 ) ### Description This PR adds PyTorch 2.0 as an option when running the ORT transformer benchmarking script. ### Motivation and Context PyTorch released [PyTorch 2.0](https://pytorch.org/get-started/pytorch-2.0/) in the nightly binaries and a stable release of PyTorch 2.0 is expected in March 2023.	2023-01-20 12:50:53 -08:00
Tianlei Wu	414b012f42	Add memory efficient attention from CUTLASS (#14343 ) ### Description Add memory efficient attention from CUTLASS. TODO (in next pull request): (1) Need performance tests on different GPUs, then add a sequence length threshold (only activate it for long sequence length). (2) Merge changes from https://github.com/NVIDIA/cutlass/pull/773 when it is in cutlass master.	2023-01-20 12:33:01 -08:00
Zhang Lei	e64f357ad4	Fix some prefast checking found problems. (#14342 ) Fix : BUG 8989, BUG 9014	2023-01-20 11:04:52 -08:00
Edward Chen	3b382ea7e1	Free OrtStatus in ASSERT_ORT_STATUS_OK, make run_android_emulator.py work with newer JDK version (#14369 ) - Free OrtStatus in ASSERT_ORT_STATUS_OK in model_tests.cc - Make run_android_emulator.py work with newer JDK version	2023-01-20 09:27:47 -08:00
cao lei	22fdc31667	remove unnecessary waitOnEPStep when current node and the consumer node are in the same stream (#14173 ) ### Description Remove the unnecessary WaitOnEPStep if the current operator node and its consumer are in the same stream while there are notifications filed in the current node ### Motivation and Context In the current code, the WaitOnEPStep is always launched as long as the notification is filed in the input node, regardless of whether the current node and the input node are in the same stream, which is unnecessary. This PR removes the WaitOnEPStep for this case. Co-authored-by: Lei Cao <leca@microsoft.com>	2023-01-20 07:35:15 -08:00
Kyushick Lee	cd24f0794a	Extend ort_backend.py for another ep (#14349 ) ### Description This PR extends OrtBackend to allow configuring an EP by name, and falls back to the existing mechanism that infers the EP from tensor affinity if nothing is provided. ### Motivation and Context Currently OrtBackend needs `get_ort_device()` with the device tag inferred from torch.Tensor, but ort device is not yet supported for dort. The change allows running dort with a supported EP, by configuring dort with a desired EP and letting the dort (ort InferenceSession) take CPU-affined PyTorch Tensors as inputs and then inject data transfer nodes internally.	2023-01-20 07:30:00 -08:00
Yi Zhang	3d6cea14f4	Remove intermedia obj files once build finished (#14361 ) ### Description Remove intermediate obj files and re-enable the cache ### Motivation and Context Recently, the training_debug_x64 pipeline often failed due to insufficient disk space. Deleting obj files can free nearly 8 GB, so the compilation cache can be re-enabled.	2023-01-20 13:37:15 +08:00
Ye Wang	668586e8f8	Support muP in Attention (#14348 ) Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>	2023-01-19 20:36:55 -08:00
Tianlei Wu	1dd07d147d	fix windows build error (#14362 ) ### Description Fix https://github.com/microsoft/onnxruntime/issues/14359 test\greedy_search_top_one.cc(21,44): warning C4244: '=': conversion from 'int32_t' to '_Ty', possible loss of data [C:\Users\11000978\onnxruntime\build\Windows\Debug\onnxruntime_providers_cuda.vcxproj]	2023-01-19 18:20:46 -08:00
Wei-Sheng Chin	432a9912a3	Fix LORT CI failure due to PyTorch change (#14367 ) As title. The fuser in LORT doesn't like "scalar". With a recent PyTorch change, a scalar is introduced somewhere it was not present before. For now, a simple fix is to check that all inputs are tensors, or fall under some specially allowed cases, before sending ops to ORT.	2023-01-19 16:02:40 -08:00
RandySheriffH	36ba3d8d21	Exclude a multi-stream case from reduced ops build (#14351 ) Exclude a multi-stream case from reduced ops build to unblock [pipeline](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=120&_a=summary). Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-01-19 14:39:25 -08:00

8076 commits