onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-03 03:58:54 +00:00

Author	SHA1	Message	Date
cloudhan	049adb9f31	[ROCm] Remove redundant ep field in softmax (#17048 )	2023-08-17 11:53:30 +08:00
Changming Sun	5249b7ab7c	Re-implement stacktrace (#17173 ) ### Description Re-implement stacktrace. The new implementation doesn't directly use the Windows API, hence can avoid problems with initializing/uninitializing the dbghelp library.	2023-08-16 16:07:49 -07:00
Dmitri Smirnov	f45eef399e	Fix visualization issues with Attribute/Tensor protos (#17188 ) ### Description Protobuf Natvis	2023-08-16 13:56:51 -07:00
RandySheriffH	3dd2c1b4d7	EP context for custom op (#16454 ) Implement infrastructures to allow EP resources surfaced to custom ops. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-08-16 13:03:40 -07:00
Maximilian Müller	7b9d1f18c7	NVTX windows include and link fixes (#16831 ) ### Description On Windows, the headers are not duplicated into the normal CUDA include directory. On Linux they are: ``` (base) maximilianm@maximilianm-dt-linux:~$ ls /usr/local/cuda/include/nvtx3 | grep nvTool nvToolsExt.h nvToolsExtCuda.h nvToolsExtCudaRt.h nvToolsExtOpenCL.h nvToolsExtSync.h (base) maximilianm@maximilianm-dt-linux:~$ ls /usr/local/cuda/include | grep nvTool nvToolsExt.h nvToolsExtCuda.h nvToolsExtCudaRt.h nvToolsExtOpenCL.h nvToolsExtSync.h ``` Is the preference via those added defines, or should the include just be changed to be `nvtx3/`? Also, there is no library linking needed on Windows and the library is not even present.	2023-08-16 11:53:58 -07:00
Yulong Wang	cbee84ddfb	[js/web] allow optional input/output in operator test (#17184 ) ### Description allow optional input/output in operator test	2023-08-16 11:50:11 -07:00
Adrian Lizarraga	96b1ff610b	Add CI and PR validation triggers to QNN Windows x64 Pipeline yaml (#17178 ) ### Description Adds continuous integration and pull-request validation triggers directly to the yaml file for the Windows x64 QNN CI Pipeline. ### Motivation and Context There have been various unit test failures that break the QNN_Windows_Nuget pipeline, which builds QNN EP for Windows x64. This PR ensures that QNN EP is built and tested on a Windows x64 image for every pull request.	2023-08-16 11:44:54 -07:00
Hariharan Seshadri	66df11769c	[JS/WebGPU] Expand operator fixes (#17137 )	2023-08-16 11:24:26 -07:00
Tianlei Wu	99349e58d7	dump tensor statistics (#15761 ) Dump statistics of input and/or output tensors of each node. It could help to find out why a model outputs NaN. To use this tool, just add `--cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1` when building the onnxruntime package. Then set some environment variables before running the model with onnxruntime: ``` export ORT_DEBUG_NODE_IO_DUMP_INPUT_DATA=1 export ORT_DEBUG_NODE_IO_DUMP_OUTPUT_DATA=1 export ORT_DEBUG_NODE_IO_DUMP_STATISTICS_DATA=1 ``` Then statistics data will be appended after the dumping of input and output tensors. One possible cause of an FP16 or mixed-precision model outputting NaN: some number exceeds the limit of FP16 (the max FP16 value is 65504). When an FP32 model has a value > 65504 in a node output, it will become INF when converting the node to FP16. In this case, you need to keep the related nodes in FP32 to avoid the issue. You can dump tensor statistics of the FP32 model to find such candidate nodes.	2023-08-16 10:53:48 -07:00
satyajandhyala	89b682e3f3	[JS/Web] The bias input is optional, not required, for LayerNormalization operator (#17143 ) ### Description Fix a typo. LayerNormalization takes 2 or 3 inputs. The third input, bias, is optional.	2023-08-16 10:41:20 -07:00
Preetha Veeramalai	2ae930333b	Add checks for session options and fix gsubgraph fallback exceptions (#17095 ) ### Description Bug fix for OVEP graph provider options and fallback ### Motivation and Context Bug-fix logic is added to handle the fallback to CPU EP. Corner-case assertions are added for ProviderOptions in OpenVINO. --------- Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: Saurabh Kale <saurabh1.kale@intel.com>	2023-08-16 10:06:25 -07:00
Yulong Wang	133af1385c	[js/webgpu] update shader cache key to include input tensor datatype (#17176 ) ### Description update shader cache key to include input tensor datatype, and make the key a little bit easier to read	2023-08-16 09:14:19 -07:00
Jian Chen	8998b6811d	Fix NPM Packaging Pipeline (#17182 )	2023-08-15 22:56:38 -07:00
PeixuanZuo	ebcd9b5cae	Fix deprecated optimum interface (#17112 ) The `latest_model_name` argument to create an {self.__class__.__name__} is deprecated since optimum 1.6.0. Replace it with `model_name`	2023-08-16 12:33:36 +08:00
Tianlei Wu	6b29837ed2	Move attention test data to file (#17158 ) (1) Move attention test data from code to file to avoid prefast crash (which blocks python packaging pipeline) (2) Enable some test cases that were previously disabled on Windows (3) Fix an assertion error in `MultiHeadAttentionTest.CrossAttention_WithPastPassedInDirectly_NoMask` This test case is for Whisper cross attention. When memory efficient attention was enabled, the format was converted to BNSH, which triggers an assertion error since memory efficient attention asserts BSNH format. Temporarily disable memory efficient attention for this case. I also disabled the test since Whisper does not use it anymore, and ROCm fails in the test.	2023-08-15 21:31:57 -07:00
xhcao	33ecde9af1	[js/webgpu] Fix reshape int32 test case (#17113 ) Co-authored-by: Xing Xu <xing.xu@intel.com>	2023-08-15 21:18:13 -07:00
Guenther Schmuelling	8289e8b6ef	[js/webgpu] fix a few shader errors (#17171 ) Fix for segment anything decoder, reduceMax with rank1 and concat.	2023-08-15 21:14:20 -07:00
Yulong Wang	35363dd9a5	[js/web] a few optimizations for test runner (#17174 ) ### Description 1. allows passing session options to operator test (eg. graph optimization level) 2. add a short flag '-x' for '--wasm-number-threads' as it is frequently used.	2023-08-15 21:00:23 -07:00
Justin Chu	2575b9aaa1	Improve comments in winml/ (#17163 ) Follow up of #17144. Manually fixed indentation in block comments and replaced all tabs with spaces.	2023-08-15 23:30:56 -04:00
dependabot[bot]	178e5991ac	Bump protobufjs from 6.11.3 to 6.11.4 in /js/node (#17177 )	2023-08-16 02:00:38 +00:00
Arthur Islamov	ccf14e891e	[js/web] JSEP node assignment optimization (#17128 ) ### Description Since WebGPU supports only float32 and int32, having Gather, Reshape, Shape, Squeeze and Unsqueeze ops with other data types creates additional MemCpy ops and slows down the overall execution, as all other ops with other tensor types will be done on CPU. Before this patch SD Unet had these numbers: Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1141 Node(s) placed on [JsExecutionProvider]. Number of nodes: 4025 memcpy tokens: 2001 After patch: Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1735 Node(s) placed on [JsExecutionProvider]. Number of nodes: 2243 memcpy tokens: 813 It also gives more than a 5X performance benefit: from 12sec for one Unet step to 2.2sec on RTX 3090 Ti, so we are almost getting to native performance. UPD: with latest changes from main branch and multi-threading it went down to 1.6sec. Will try re-exporting my model to onnx with maximum optimizations, like using MultiHeadAttention to decrease node count. Maybe after implementing that it can go in less than 1 sec	2023-08-15 18:58:05 -07:00
shaahji	3cdf42548f	Issue #17098 : Shape inferencing fails during quantization for large models (#17100 )	2023-08-15 18:38:14 -07:00
Wanming Lin	789bac1dc8	[WebNN EP] Support BatchNormalization op (#17071 ) Adds support for BatchNormalization via WebNN meanVarianceNormalization.	2023-08-15 17:52:09 -07:00
Pranav Sharma	c0f8197157	Add README to Nuget and fix license file name (#17170 ) ### Description Add README to Nuget and fix license file name ### Motivation and Context Fixes https://github.com/microsoft/onnxruntime/issues/17055	2023-08-15 16:04:34 -07:00
RandySheriffH	39dfcd5d84	Allow RunAsync with global TP (#17157 ) Allow RunAsync called with a global thread pool. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-08-15 14:29:10 -07:00
Adam Louly	c647e3e8ab	Run nightly pipeline tests from the commit id. (#17162 ) ### Description The onnxruntime-CI-nightly-ort-pipeline encounters occasional failures due to synchronization discrepancies between the ACPT nightly image and the repository. We are addressing this by executing tests using the commit ID associated with the ort build within the ACPT image. --------- Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-08-15 12:07:38 -07:00
dependabot[bot]	f086bd7bff	Bump github/issue-labeler from 2.5 to 3.2 (#16639 )	2023-08-15 18:00:19 +00:00
Changming Sun	8e203efc69	Cleanup cmake file (#17154 ) ### Description 1. Clean up cmake files. Remove some unused code 2. Remove the "Semmle" task from tools/ci_build/github/azure-pipelines/templates/win-ci.yml. Semmle is deprecated and replaced by CodeQL.	2023-08-15 10:51:33 -07:00
Changming Sun	2a22325005	Explicitly set JDK version when building ORT java package (#17147 ) ### Description Explicitly set JDK version when building ORT java package. This is to fix an internal build error.	2023-08-15 10:36:05 -07:00
Severin Simmler	90754fc077	Fix invalid escape sequence (#17145 ) ### Description - Removed one unused import - Escaped a backslash in a path ### Motivation and Context I see this `DeprecationWarning` when I import `onnxruntime`: ``` onnxruntime/capi/_pybind_state.py:28: DeprecationWarning: invalid escape sequence '\S' "(other than %SystemRoot%\System32), " ``` A future version of Python (maybe 3.13?) will raise a `SyntaxError` for invalid escape sequences.	2023-08-15 10:29:54 -07:00
Tianlei Wu	3aba736ee2	Refactoring of Stable Diffusion scripts (#17138 ) Reduce duplicated code in two stable diffusion pipelines (CUDA and TensorRT). Move the common code to models.py	2023-08-15 09:36:31 -07:00
Matthieu Darbois	5e971bc51a	Rework WIL dependency retrieval/usage (#17130 ) ### Description 1. `onnxruntime_fetchcontent_makeavailable` works around unconditional install commands, so it can be used instead of `FetchContent_Populate` 2. This dependency is Windows specific, mark it as such. ### Motivation and Context 1. This simplifies `cmake/external/wil.cmake` so that it does nothing specific depending on whether WIL was fetched or found 2. Given it's specific to Windows, it might not be available on other OSes in specific air-gapped environments such as [conan-center-index](https://github.com/conan-io/conan-center-index). This allows downstream builds not to require specific patches for something not required by the build in the first place.	2023-08-15 09:11:46 -07:00
Tianlei Wu	412b0d0831	Update BERT and GPT-2 optimization notebooks for CPU EP (#17057 ) The notebooks are not up to date. (1) Update BERT and GPT-2 optimization notebooks for CPU EP with latest PyTorch and ONNX Runtime. (2) Add links to quantization example ### Motivation and Context https://github.com/microsoft/onnxruntime/issues/16515	2023-08-15 00:55:03 -07:00
pengwa	abf9765d73	PythonOp Enhancement: Bool and Tuple[Bool] Constants, Materialize Grads, Empty Inputs, Save In Context (#16828 ) ### PythonOp Enhancement: Bool and Tuple[Bool] Constants, Materialize Grads, Empty Inputs, Save In Context 1. Support `bool` or `Tuple[bool]` constant type in inputs. 2. Support `ctx.set_materialize_grads(True|False)` 3. Backward op can accept empty inputs (those that don't require grad) 4. Special handling for ORT tensors that are saved in context Scenario: a tensor is generated by ORT, then it might be saved for backward by `ctx.save_for_backward(tensor)`, while `tensor`'s reference count is not increased in ORT's allocation plan, so ORT may release the tensor data before it is used in backward. Currently: we copy every tensor before running autograd.Function.forward(); this might be a problem in cases where there are many PythonOps (for example zero stage 3). Proposal: To avoid those unnecessary copies for tensors that are not saved in context, this change introduces a `_GlobalOpKernelInfoMap`. During the kernel's first run, we still copy all tensors generated by ORT and give them to torch.autograd.Function to run; then we check which inputs need to be saved in context, and save the indices of those inputs in `_GlobalOpKernelInfoMap`. For later iterations, we just copy what is needed.	2023-08-15 13:31:04 +08:00
Adrian Lizarraga	b734db1924	[QNN EP] Fix CI build on Windows x64 pipelines (#17152 ) ### Description - Disables Resize tests that use nearest mode on QNN CPU. - Fixes indentation problems on yaml for win x64 qnn pipeline. ### Motivation and Context The QNN windows Nuget pipeline does not run due to failing unit tests on Windows x64. These tests should not be enabled until we determine the rounding behavior of QNN's ResizeNearestNeighbor operator.	2023-08-14 21:03:14 -07:00
Justin Chu	416dc2e84d	Fix clang-format comment indents on Windows for winml/ (#17144 ) On Windows, clang-format has a bug when AlignTrailingComments.Kind is set to `Leave` (https://clang.llvm.org/docs/ClangFormatStyleOptions.html#aligntrailingcomments), where it will keep adding indentation to comments after each formatting run. This PR changes to always align comments so we do not hit the bug. As a consequence of the options change we need to reformat some of the files. Note that this option is aligned with the rest of the repository.	2023-08-14 23:50:14 -04:00
xhcao	24e0bd37b4	[JS/WebGPU] Support Log operator (#17045 )	2023-08-14 18:04:12 -07:00
Baiju Meswani	289600b47d	ONNX Runtime training cpu package name for ADO (#17109 )	2023-08-14 11:32:35 -07:00
PeixuanZuo	be2200c00b	[ROCm] fix python package pipeline (#17136 ) The ROCm python package pipeline failed because this PR (https://github.com/microsoft/onnxruntime/pull/16325) changed the onnx version to a commit, so we need to build onnx from source. A low protobuf version will cause build errors. This PR removes `cmake` and `protobuf` from the Dockerfile; these two will be installed by `install_os_deps.sh`.	2023-08-14 11:22:43 -07:00
Jian Chen	45f52987a2	Web CI Pipeline Isolation (#17005 )	2023-08-14 10:37:37 -07:00
Wenbing Li	d052c8a45c	Remove the extensions submodule (#17097 ) ### Description Remove the onnxruntime-extensions submodule since it is now used via cmake FetchContent ### Motivation and Context The submodule relies on an outdated version of the extensions, and the build instructions should be updated to eliminate any confusion.	2023-08-14 10:16:33 -07:00
Jian Chen	68ea9631af	Fix typo onnxruntimecpubuilpython (#17120 ) ### Description The correct name should be onnxruntimecpubuildpython Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2023-08-14 08:34:43 -07:00
RandySheriffH	f71b6944bf	Fix nuget pipeline (#17110 ) Fix nuget pipeline by correcting the calling convention on c# delegate. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-08-14 09:04:37 -04:00
Vincent Wang	e55e1b7da9	Mark end of version 16 C API (#17107 ) Mark end of version 16 C API in preparation for ORT 1.16 release.	2023-08-14 14:01:55 +08:00
pengwa	cd7b3f54da	Allow defining customized PythonOp shape inferer (#17093 ) ### Allow defining customized PythonOp shape inferer A `torch.autograd.Function` is converted to a PythonOp in MSDomain, and there are two places that do shape inferencing for it: 1. in SymbolicShapeInfer. 2. in the PythonOp op definition. For a common PythonOp, since we don't know the relationship between inputs and outputs, we only infer the rank from output ranks and generate symbolic dimensions for each dim. This introduces many meaningless symbolic dimensions, sometimes blocking our graph transformers from doing op fusion. This PR provides a way to define custom shape inferencing for a `torch.autograd.Function` we defined, to propagate the original dimensions across the PythonOp on a best-effort basis. The 2nd place is not covered yet; we could refine that later. Fixing the 1st one is enough for ORTModule training/evaluation.	2023-08-14 09:13:32 +08:00
Guenther Schmuelling	9204cd7392	[js/webgpu] Add C++ registration for operator Tanh in JSEP (#17124 ) add webgpu/tanh Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2023-08-12 11:43:39 -07:00
Yulong Wang	e7adbb38f6	[js/webgpu] disable test case 'test_batchnorm_epsilon_training_mode' temporarily (#17129 ) ### Description test case 'test_batchnorm_epsilon_training_mode' on webgpu is failing. the issue needs time to investigate, so comment it out and re-enable it when the root cause is fixed.	2023-08-12 08:53:10 -07:00
Chen Fu	f2e1b91634	add int4 quantization code in python (#17077 ) ### Description Adding int4 quantization code in python ### Motivation and Context Python quantization tool no longer needs to invoke shell to call a native exe	2023-08-11 15:17:58 -07:00
Yulong Wang	5704e71b89	update onnx.patch to apply wasm build break fix (#17104 ) ### Description This PR fixes build break for WebAssembly introduced in `6986981482` (`435ad2b1d8`). This change updates onnx.patch in onnxruntime repo. the corresponding PR in onnx repo is: https://github.com/onnx/onnx/pull/5495. It may take a while for the next onnx version bump.	2023-08-11 15:00:39 -07:00
liqun Fu	6697635b91	To support size opset 19 (#15689 )	2023-08-11 14:48:53 -07:00
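
Commit 99349e58d7 (dump tensor statistics) suggests using FP32 tensor statistics to find nodes whose outputs exceed the FP16 limit of 65504. A minimal sketch of that check follows. The `node_stats` layout, the node names and the function name are hypothetical, chosen only for illustration; this is not the format onnxruntime writes when dumping statistics.

```python
import math

# Largest finite FP16 value, as quoted in the commit message.
FP16_MAX = 65504.0


def find_fp16_overflow_candidates(node_stats):
    """Return names of nodes whose FP32 outputs fall outside the FP16 range.

    node_stats maps a node name to a (min, max) pair of its output values.
    This layout is an assumption for the sketch, not ORT's dump format.
    """
    candidates = []
    for name, (lo, hi) in node_stats.items():
        # Flag outputs that are already NaN/INF in FP32.
        if not (math.isfinite(lo) and math.isfinite(hi)):
            candidates.append(name)
        # Flag outputs whose magnitude exceeds the FP16 limit.
        elif max(abs(lo), abs(hi)) > FP16_MAX:
            candidates.append(name)
    return candidates


# Made-up statistics, for illustration only.
stats = {
    "MatMul_1": (-12.5, 310.0),
    "Add_7": (-80000.0, 1024.0),
    "Softmax_3": (0.0, 1.0),
}
print(find_fp16_overflow_candidates(stats))
```

Nodes returned by such a check are the ones the commit message recommends keeping in FP32 when converting the rest of the model to FP16.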


9405 commits