onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-08 17:17:15 +00:00

Author	SHA1	Message	Date
Baiju Meswani	e464588a0e	Avoid generating training documentation during packaging (#15795 )	2023-05-03 19:09:07 -07:00
Jian Chen	5eedd884c8	Adding support for conv fp16 fusion on Resnet50v1 (#15474 ) ### Description Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1 ### Motivation and Context Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1	2023-05-03 15:48:06 -07:00
Changming Sun	1fb2f2605b	Update VERSION_NUMBER (#15773 ) ### Description 1. Update VERSION_NUMBER for preparing the upcoming release. This PR's commit will not be included in the 1.15 release branch 2. Delete package/rpm/onnxruntime.spec since it was not used in past years. ### Motivation and Context Preparing the release. Fixed [AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)	2023-05-03 15:07:34 -07:00
Baiju Meswani	ba7b83ff3c	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776 )	2023-05-03 13:08:35 -07:00
Changming Sun	41c082fdde	Add a Github workflow for Prefast (#15763 )	2023-05-03 11:42:51 -07:00
Changming Sun	d53324d4a7	Update cmake version in a few places (#15775 ) ### Description They were missed in #15707 , because they are not in common places for Dockerfiles. Though this commit updated tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile, it won't automatically take effect. The image needs to be manually generated and pushed to a place, and before doing that our CMakeLists.txt also needs to be tweaked a little bit.	2023-05-02 22:56:28 -07:00
RandySheriffH	e3ec2b3a8e	Exclude cases from reduced build (#15779 ) Exclude cases from reduced build to unblock pipeline. Fixed [AB#15326](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15326) Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-02 21:05:54 -07:00
Yulong Wang	ef1f17f3dc	[wasm/JSEP] add threaded build to artifacts (#15777 ) ### Description This is the first step toward creating WebAssembly artifacts for the ort-web WebGPU EP (wasm build). Follow-up steps will consume the artifacts in the web build.	2023-05-02 17:53:44 -07:00
Baiju Meswani	2d519d21af	Python documentation for onnxruntime-training (#15765 )	2023-05-02 16:58:16 -07:00
Jian Chen	abdd4f518a	Update TRT Windows Cuda 11.6 to 11.8 (#15746 ) ### Description Update TRT Windows CUDA 11.6 to 11.8 ### Motivation and Context We are adopting a newer version of CUDA systemwide.	2023-05-02 12:23:13 -07:00
Changming Sun	328cabb194	Download protoc from Github Release instead of Nuget (#15731 ) ### Description Download protoc from Github Release instead of Nuget to avoid a dependency on nuget.exe on Linux ### Motivation and Context To avoid a dependency on nuget.exe on Linux. Many users' build environments do not have nuget or dotnet.	2023-05-02 12:18:59 -07:00
Nat Kershaw (MSFT)	e901cdbf54	Add wildcard paths to the API docs generation workflows (#15313 )	2023-05-02 10:43:45 -07:00
Chen Fu	4b8025e492	Parallelize fp16 pooling operators (#15766 ) ### Description Parallelize fp16 pooling operators	2023-05-02 08:48:56 -07:00
Chen Fu	bc58fd5413	fix compilation error in no absl build (#15769 ) ### Description Fix no-absl build error:	2023-05-02 08:20:49 -07:00
Sohaib Iftikhar	92309187b3	Allow compilation using clang when using cuda. (#15672 ) ### Description Currently compiling with clang + cuda leads to: ``` /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:33:6: error: call to function 'operator<<' that is neither visible in the template definition nor found by argument-dependent lookup ss << t; ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:39:3: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<gsl::span<const long, 18446744073709551615>>' requested here MakeStringImpl(ss, args...); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:46:3: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<const char , gsl::span<const long, 18446744073709551615>>' requested here MakeStringImpl(ss, args...); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/make_string.h:93:18: note: in instantiation of function template specialization 'onnxruntime::detail::MakeStringImpl<const char , gsl::span<const long, 18446744073709551615>>' requested here return detail::MakeStringImpl(detail::if_char_array_make_ptr_t<Args const&>(args)...); ^ /code/build/_deps/onnxruntime-src/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc:73:12: note: in instantiation of function template specialization 'onnxruntime::MakeString<char[39], gsl::span<const long, 18446744073709551615>>' requested here return ORT_MAKE_STATUS(ONNXRUNTIME, INVALID_ARGUMENT, "Shape not meet clean tile requirement!", dims); ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/common/common.h:188:48: note: expanded from macro 'ORT_MAKE_STATUS' ::onnxruntime::MakeString(__VA_ARGS__)) ^ /code/build/_deps/onnxruntime-src/include/onnxruntime/core/framework/tensor_shape.h:201:15: note: 'operator<<' should be declared prior to the call site or in namespace 'gsl' 
std::ostream& operator<<(std::ostream& out, const TensorShape& shape); ^ 1 error generated. ```	2023-05-02 01:12:39 -07:00
Changming Sun	034698cf6a	Revert "Implement lite custom op API (#15590 )" (#15768 ) This reverts commit `cdf4fc49fc` because it breaks the "debug_node_input_output" build in "Post Merge" pipeline	2023-05-02 01:10:10 -07:00
Prathik Rao	090312af71	add local state dict option (#15759 ) ### Description Adds an option to load local state dictionary for whisper model export. ### Motivation and Context This is useful to demonstrate workflow of using ORT Training to get model weights, downloading said weights onto a local gpu-enabled device, exporting the custom model using `convert_to_onnx.py`, and then nicely feeding the .onnx file into ORT InferenceSession.	2023-05-01 22:08:11 -07:00
Ye Wang	391f897983	Bring back SLN cuda kernel and use provider options to switch to standard implementation (#15660 )	2023-05-01 18:35:26 -07:00
Nat Kershaw (MSFT)	9219615471	Fix python API docs generation (#15760 ) Docs are failing on the operator generation step. Remove this temporarily so that we can publish.	2023-05-01 18:31:59 -07:00
Changming Sun	5352f6d9b0	Make "--cuda_version" build arg optional (#15758 ) ### Description This change will allow us to build the CUDA EP without installing the CUDA SDK on Windows. ### Motivation and Context Nvidia's CUDA installer comes with a VS extension. In the past, we required installing the extension. That is a little inconvenient because: 1. Visual Studio must be installed before the CUDA SDK. CUDA's installer will not install the extension if your machine doesn't have Visual Studio. 2. We need to install the CUDA SDK on our build machines, instead of just downloading it and using it. After this change, we will not need to install the CUDA SDK on our build machines, so it will be easier to add support for a different CUDA version. Also, fix two PreFast warnings.	2023-05-01 18:00:47 -07:00
Ye Wang	7f293d065a	Remove some D2D memcpy in T5 BeamSearch (#15706 ) ### Description Replace D2D memcpy with CUDA kernels when creating decoder initial feeds. Before: ![image](https://user-images.githubusercontent.com/52801275/234746326-209110e3-1cd4-4e8c-a488-515dd34c06c7.png) After: ![image](https://user-images.githubusercontent.com/52801275/234736352-0bd027a0-e382-47d7-a1c5-ae9fbc9f1c9d.png) --------- Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>	2023-05-01 13:03:21 -07:00
Ashwini Khade	0ffae8073b	Creating Nuget and Android packages for Training (#15712 ) ### Description This PR creates Nuget and Android for Training. ### Motivation and Context These packages are intended to be released in ORT 1.15 to enable On-Device Training Scenarios. ## Packaging Story for Learning On The Edge Release ### Nuget Packages: 1. New Native package -> Microsoft.ML.OnnxRuntime.Training (Native package will contain binaries for: win-x86, win-x64, win-arm, win-arm64, linux-x64, linux-arm64, android) 2. C# bindings will be added to existing package -> Microsoft.ML.OnnxRuntime.Managed ### Android Package published to Maven: 1. New package for training (full build) -> onnxruntime-training-android-full-aar ### Python Package published to PyPi: 1. Python bindings and offline tooling will be added to the existing ort training package -> onnxruntime-training	2023-05-01 12:59:56 -07:00
Sumit Agarwal	4c4f688a93	[DML EP] Fix dml_external_project (#15656 ) ### Description When building ORT for the DML EP with the `dml_EXTERNAL_PROJECT` flag, the values of two variables (`DML_SHARED_LIB`, `DML_PACKAGE_DIR`) are not set properly.	2023-05-01 12:02:56 -07:00
liqun Fu	62fc6ed5a8	[Feature Request] Support Resize opset 19 (#15633 )	2023-05-01 10:49:17 -07:00
cao lei	d58fa9805b	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618 ) ### Description ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice ### Motivation and Context Currently “Location” is represented as OrtMemoryInfo, which is OrtDevice + OrtMemType, while OrtDevice is represented as DeviceType + DeviceId + MemType. This hierarchy is unnecessarily deep, so the proposal is to define Location clearly by using OrtDevice as its abstraction. --------- Co-authored-by: Lei Cao <leca@microsoft.com>	2023-05-01 10:06:00 -07:00
Baiju Meswani	bb33285ec2	C# training api updates for on device training (#15720 )	2023-05-01 10:01:38 -07:00
shalvamist	c10a6a9d17	Tensor <--> image - Adding per channel compute for Norm mean & Bias (#14705 ) ### Description Enabled the use of per channel Bias and Mean normalization when converting an image <--> tensor. Added a few bug fixes and updates to the relevant E2E tests. --------- Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2023-05-01 09:37:50 -07:00
Scott McKay	0770cf3699	Remove C# SessionOptions.RegisterCustomOpsUsingFunction. (#15754 ) Symbol visibility from DllImport is inconsistent across platforms resulting in the symbol not necessarily being visible to ORT native code that tries to look it up by name. Best solution is to use DllImport to load the library and to call the registration function directly. That requires the native SessionOptions handle and OrtApiBase struct. We could either make those public, or provide a helper where the user passes in a delegate from their DllImport. Can add when needed.	2023-05-01 09:14:21 -07:00
RandySheriffH	cdf4fc49fc	Implement lite custom op API (#15590 ) Implement a set of new APIs for lightweight custom ops registration, to save efforts on schema-composing. A few highlights: 1. Support build-time type inference; 2. Support function-as-op for "stateless" ops; 3. Support structure-as-op for "stateful" ops; 4. Support varied input/output forms such as span, scalar, and tensors, either optional or non-optional. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-01 08:45:26 -07:00
Chen Fu	0e9472d391	NHWC graph optimizer (#15724 ) ### Description Augment the NHWC graph optimizer to accommodate fp16 operators. ### Motivation and Context A new fp16 conv operator has been added, and it prefers the NHWC data layout. We need to augment the existing graph optimizers to better utilize the new operator.	2023-05-01 08:44:07 -07:00
Chunye Wang@AMD	d35850c142	[VitisAI]Update VitisAI EP to be compatible with VitisAI 3.5 (#15673 ) ### Description Originally the VitisAI EP only worked with old versions of the VitisAI release. ### Motivation and Context Update the VitisAI EP so that it works with the current VitisAI 3.5 and future versions of VitisAI. We try our best to make it forward compatible. --------- Co-authored-by: Wang Chunye <chunywan@xilinx.com> Co-authored-by: mingyue <mingyue@amd.com> Co-authored-by: mingyueliuh <131847423+mingyueliuh@users.noreply.github.com> Co-authored-by: liumingyue <mingyue@xilinx.com> Co-authored-by: moore-ch <129165652+moore-ch@users.noreply.github.com> Co-authored-by: shoucair <shoucai.ren@amd.com> Co-authored-by: zz002 <zhenze.wang@amd.com> Co-authored-by: BoarQing <yuz75@Pitt.edu> Co-authored-by: Yueqing Zhang <yueqingz@amd.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-01 08:28:26 -07:00
Jeff Bloomfield	3df3a85114	Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered (#15725 ) This addresses a performance regression in some INT8 models with the DirectML EP by defaulting OrtSessionOptionsDisableQuantQDQ to 1 when the EP is registered. This regression occurred due to the introduction of the QDQ propagation transformer, which is based on this session option. That transformer maximizes the number of nodes which are executed as quantized by logically propagating quantize operators upstream and dequantize operators downstream. However, it does this simply by inserting QDQ pairs, with an expectation that something will recognize sequences of DQ->Op->Q. This logic and related L2 transformers are not currently enabled for the DirectML EP. This change also removes a noisy warning when the session option for memory pattern is overridden as the DirectML EP is registered.	2023-05-01 08:26:03 -07:00
Tianlei Wu	10dff4f665	only add type info from symbolic shape inference for fp16 conversion (#15617 ) ### Description Workaround for https://github.com/microsoft/onnxruntime/issues/15521.	2023-04-30 23:22:11 -07:00
Chi Lo	6e652d0554	Support explicit TRT profiles from provider options (#15546 ) Previously, the TRT EP set TRT optimization profiles for dynamic-shape inputs based on input tensor values, and users couldn't explicitly specify the profiles. This PR lets users specify min/max/opt profiles through three newly added provider options: `trt_profile_min_shapes`, `trt_profile_max_shapes` and `trt_profile_opt_shapes`, with the format "input1:dim1xdim2...,input2:dim3xdim4...". (Note: It's similar to --minShapes, --maxShapes and --optShapes of trtexec command-line [flags](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#trtexec-flags)) For example, if you are using onnxruntime_perf_test, you can try this: `./onnxruntime_perf_test -e tensorrt -r 1 -i "trt_profile_min_shapes|imgs:1x3x384x288 trt_profile_max_shapes|imgs:32x3x384x288 trt_profile_opt_shapes|imgs:16x3x384x288" your_model_path` If the engine cache is enabled, you still need to provide these three explicit provider options in order to use this feature. ORT TRT will compare the min/max/opt profile shapes with the ones saved in the .profile file to decide whether to rebuild the engine. Constraints on using these provider options: (1) min/max/opt profile shapes must be specified for all dynamic-shape inputs. This feature was also requested by other users: https://github.com/microsoft/onnxruntime/issues/13851	2023-04-30 22:30:26 -07:00
Scott McKay	31e7d3d7d4	Disable TestRegisterCustomOpsWithFunction on Linux (#15747 ) ### Description Disable new test that is failing on Linux. Not required for this release. Will fix in the next week. Marshal.Prelink can be used on Windows to make the symbol available but Linux appears to work differently. Also need to update the pre-checkin tests so this is tested early as it's only failing in the E2E tests run in the packaging pipeline. ### Motivation and Context Fix packaging pipeline error.	2023-04-30 14:39:02 +10:00
Changming Sun	176161348e	Revert "make nuget workflow easy to debug. (#15693 )" (#15744 ) This reverts commit `53ff50d19a` because it makes the nuget pipeline fail.	2023-04-29 19:05:01 -07:00
kunal-vaishnavi	7ae01cec15	Update wheel path to Whisper custom export script (#15739 ) ### Description This PR updates the documentation for using the Whisper custom export scripts via the wheel. ### Motivation and Context The path should say `onnxruntime.transformers.models.whisper.convert_to_onnx` instead of `onnxruntime.transformers.models.convert_to_onnx`.	2023-04-29 17:32:34 -07:00
sfatimar	4fbc08e3c2	VPUX config fix and dynamic_shape bug fixed. (#15737 ) Dynamic shapes were not working with the serialized model, so we are switching to the compile model method ### Motivation and Context Dynamic shapes were not working with the serialized model Signed-off-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: MaajidKhan <n.maajid.khan@intel.com>	2023-04-29 15:48:34 -07:00
Adrian Lizarraga	d32c540b2d	[QNN EP] Support LRN operator (#15741 ) ### Description Adds support for the LRN operator to QNN EP. ### Motivation and Context Enables basic models like googlenet and alexnet to run entirely on QNN EP.	2023-04-29 13:23:42 -07:00
Changming Sun	65020d433e	Prefast fixes for CUDA EP (#15726 ) ### Description 1. Adjust cmake flags. Do not modify CMAKE_CXX_FLAGS globally. Only apply the flags to ORT code. 2. Fix some SDL warnings.	2023-04-29 12:43:12 -07:00
Jian Chen	ec2f038c6d	Update Nuget pipeline's Linux CUDA job to cuda 11.8 (#15516 ) ### Description Fixed AB#14497	2023-04-29 07:38:18 -07:00
Rachel Guo	c8bd34f975	[js/rn] Package dependency change to manage ort-extensions for react_native app (#15641 ) ### Description js/react_native package dependency change to manage ort-extensions for react-native app. Enable optional inclusion of the ort-ext aar / ort-ext pods for react-native extensions apps when `ortExtensionsEnabled` is specified in the user's package.json file. --------- Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local> Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2023-04-29 00:07:12 -07:00
Yuhong Guo	41dcf0d32e	Expose build information in dynamic lib (#15643 ) ### Description 1. Add Build Info API to onnx. 2. Fix compile error while building onnxruntime_benchmark on macOS. ### Motivation and Context 1. When the Onnxruntime lib is serving online, we need a way to detect how this lib was built. This PR helps the developer get the build information (such as git branch, git commit id, build type and cmake cxx flags) using `strings`, as shown here: ![image](https://user-images.githubusercontent.com/19584326/233794371-b2f95a2c-27fb-4709-a6dd-bf4bb12b0b5b.png) ![image](https://user-images.githubusercontent.com/19584326/233794360-f96f5d2e-332c-405c-83f1-370ccc2b86f8.png) If the build env has no git, there will be no git-related info: ![image](https://user-images.githubusercontent.com/19584326/234558596-298c1b01-9a90-41bf-9372-7259a8f8e5be.png) 2. Fix the following compile error while building the benchmark on macOS. ![image](https://user-images.githubusercontent.com/19584326/233793571-c261ac1f-47b2-434d-a293-7e9edc6c8a66.png) --------- Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>	2023-04-28 21:57:31 -07:00
Adrian Lizarraga	191deb4235	[QNN EP] Nuget package (#15711 ) Adds pipeline for QNN NuGet package (x64 and arm64).	2023-04-28 19:33:14 -07:00
Rachel Guo	6a6091a519	[rn] Add support for loading model from buffer on iOS (#13802 ) ### Description - Add support for loading model from buffer on iOS - Update OnnxruntimeModuleTest to use updated loadModelFromBuffer Based on #12676 ### Motivation and Context Issue: #12500 --------- Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>	2023-04-28 17:34:26 -07:00
kunal-vaishnavi	fe1ddd7b61	Fix bug when adding Whisper to wheel (#15708 ) ### Description This PR adds `onnxruntime.transformers.models.whisper` to the wheel. ### Usage There is a README.md document that shows sample commands. The following command will show how to use the custom Whisper export script in more detail. ``` $ python3 -m onnxruntime.transformers.models.whisper.convert_to_onnx --help ``` ### Motivation and Context This fixes an issue with adding the Whisper custom export scripts to the wheel. The Whisper folder now appears in the wheel. ![Screenshot 2023-04-26 143705](https://user-images.githubusercontent.com/115581922/234708587-6d1b7d34-71a9-4f9f-a491-657ceb25afcb.jpg)	2023-04-28 16:03:55 -07:00
liqun Fu	2802c547a1	update OnnxMl.cs (#15702 )	2023-04-28 11:20:29 -07:00
pengwa	29d13cea42	Cumulative update on optimizers and tests (on-device training) (#15499 )	2023-04-28 09:55:39 -07:00
Adam Pocock	8a1a40ac63	[Java] CheckpointState AddProperty & GetProperty support (#15730 )	2023-04-28 09:52:52 -07:00
Chen Fu	be08b47e7b	Refine cast optimizer for safety (#15658 ) ### Description The cast optimizer may convert an fp16 node to fp32. This used to be safe because all fp16 kernels had fp32 implementations. As this assumption is no longer true, we need to check the validity of the operation. ### Motivation and Context The main work here is to introduce an API to check whether a kernel is registered. Currently we don't have a way to do that without an operator node. This needs to be augmented. We need to query whether a kernel is registered by its properties only, so that we can judge whether it is safe to construct a node long before we actually do so.	2023-04-28 09:32:54 -07:00
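
Commit 6e652d0554 above describes the value format for the three TensorRT profile provider options as "input1:dim1xdim2...,input2:dim3xdim4...". The following is a minimal Python sketch of producing strings in that format. The helper names `format_profile_shapes` and `build_trt_profile_options` are hypothetical and are not part of onnxruntime; only the option names and the string format come from the commit message. The check that all three profiles name the same inputs is an assumption derived from the stated constraint that min/max/opt shapes must be given for every dynamic-shape input.

```python
from typing import Dict, Sequence


def format_profile_shapes(shapes: Dict[str, Sequence[int]]) -> str:
    """Render {"imgs": (1, 3, 384, 288)} as "imgs:1x3x384x288".

    Hypothetical helper: multiple inputs are joined with commas, and the
    dimensions of each input are joined with "x", as in the commit message.
    """
    parts = []
    for name, dims in shapes.items():
        if not dims:
            raise ValueError(f"input {name!r} has no dimensions")
        parts.append(f"{name}:{'x'.join(str(d) for d in dims)}")
    return ",".join(parts)


def build_trt_profile_options(
    min_shapes: Dict[str, Sequence[int]],
    opt_shapes: Dict[str, Sequence[int]],
    max_shapes: Dict[str, Sequence[int]],
) -> Dict[str, str]:
    """Build the three provider-option strings named in the commit message."""
    # Assumption: every dynamic-shape input must appear in all three profiles.
    if not (set(min_shapes) == set(opt_shapes) == set(max_shapes)):
        raise ValueError("min/opt/max profiles must name the same inputs")
    return {
        "trt_profile_min_shapes": format_profile_shapes(min_shapes),
        "trt_profile_opt_shapes": format_profile_shapes(opt_shapes),
        "trt_profile_max_shapes": format_profile_shapes(max_shapes),
    }


if __name__ == "__main__":
    options = build_trt_profile_options(
        min_shapes={"imgs": (1, 3, 384, 288)},
        opt_shapes={"imgs": (16, 3, 384, 288)},
        max_shapes={"imgs": (32, 3, 384, 288)},
    )
    for key, value in options.items():
        print(f"{key}={value}")
```

In a Python session, a dictionary like this would presumably be passed as the options half of a `("TensorrtExecutionProvider", options)` entry in the `providers` list of `onnxruntime.InferenceSession`; that usage is not shown in the commit message and should be checked against the TensorRT EP documentation.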

8728 commits