onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00

Author	SHA1	Message	Date
BoarQing	b8bbc898c6	fix errors for node with empty name for vitis ai (#16949 ) ### Description Fixed the issue of finding nodes with empty name for vitis ai. ### Motivation and Context It is required because we encountered this error when testing newly created models.	2023-08-02 19:08:49 -07:00
Dmitri Smirnov	246cb3a197	Simplify shrink, replace Eigen in Sign implementation (#16975 ) ### Description Simplify Shrink. Replace the Eigen code in Sign with an implementation that does not require fp16 conversion.	2023-08-02 18:24:38 -07:00
Guenther Schmuelling	0df2e14038	js/webgpu: argmax,argmin,softmax support (#16882 ) argmax and argmin are similar to reduce. Eventually we need to add optimized flavors of the shader. softmax is optimized but only works on the last axis for now which should be the common use case. todo: enable more ut for argmax/argmin	2023-08-02 18:16:19 -07:00
Hariharan Seshadri	506ddb3d5d	[js/WebGPU] Support int32 Transpose in WebGPU (#16952 )	2023-08-02 16:27:24 -07:00
BoarQing	6361b22103	vitis ai support generic data type (#16902 ) ### Description Support more data types for vitis ai. ### Motivation and Context It is required because the models we are testing now have uint8 data type. To solve this once and for all, we changed the code to support generic data types.	2023-08-02 15:56:39 -07:00
satyajandhyala	d399648869	[JS/Web] Added Resize kMSInternalNHWCDomain domain registration. (#16946 ) ### Description Added Resize NHWC domain kernel registration.	2023-08-02 14:16:21 -07:00
Michael Klimenko	07e6648e12	Enable Intel oneAPI DPC++/C++ compiler build (#16587 ) Last week I fixed error #16484 found when trying to build onnxruntime with the icpx compiler. Another thing I found out is that icpx uses -ffast-math flag by default. You can check it by running the compiler with -v flag like following: ```bash # Setup the environment . /opt/intel/oneapi/setvars.sh # Compile any file to see all the implicit flags icpx -v main.cpp ``` This leads to a bunch of warnings during the build like: ```bash In file included from /mnt/f/wsl_home/onnxruntime/onnxruntime/test/providers/cpu/tensor/upsample_op_test.cc:5: In file included from /mnt/f/wsl_home/onnxruntime/onnxruntime/test/providers/provider_test_utils.h:6: In file included from /mnt/f/wsl_home/onnxruntime/onnxruntime/test/providers/checkers.h:10: In file included from /mnt/f/wsl_home/onnxruntime/onnxruntime/core/util/math_cpuonly.h:68: In file included from /mnt/f/wsl_home/onnxruntime/build/Linux/RelWithDebInfo/_deps/eigen-src/Eigen/Core:172: /mnt/f/wsl_home/onnxruntime/build/Linux/RelWithDebInfo/_deps/eigen-src/Eigen/src/Core/MathFunctions.h:1019:12: warning: comparison with NaN always evaluates to false in fast floating point modes [-Wtautological-constant-compare] return isnan EIGEN_NOT_A_MACRO (x); ^~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` And some tests are failing as well, usually with infinities involved. To list a few: ```bash # ... 1: [ FAILED ] IsInfTest.test_isinf_float 1: [ FAILED ] IsInfTest.test_isinf_double 1: [ FAILED ] IsInfTest.test_isinf_positive_float 1: [ FAILED ] IsInfTest.test_isinf_positive_double 1: [ FAILED ] IsInfTest.test_isinf_negative_float 1: [ FAILED ] IsInfTest.test_isinf_negative_double 1: [ FAILED ] IsNaNOpTest.IsNaNFloat 1: [ FAILED ] IsNaNOpTest.IsNaNDouble # ... 
``` This PR adds a quick global check for the IntelLLVM compiler, as in the way its name is reported by CMake and then, depending on the compiler driver, sets either MSVC-like or GCC-like switch to disable fast-maths. Probably a bit cleaner solution would be to use ```target_compile_options(${TARGET} PRIVATE MEOW)``` instead of a global-wide ```set(CMAKE_CXX_FLAGS MEOW)```, but then we'd be required to add it to all the individual targets and execution providers and this will lead to a lot of code duplication.	2023-08-02 12:50:35 -07:00
Tianlei Wu	76aff63f37	Update bert_perf_test to test inputs with different padding ratio (#16963 ) Add --average_sequence_length and --random_sequence_length so that we can test the performance of model on different padding ratio.	2023-08-02 10:28:39 -07:00
RandySheriffH	c392fdeb1b	RunAsync Python API (#16760 ) Implement python binding for RunAsync API. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-08-02 10:15:34 -07:00
Dmitri Smirnov	bd4d011142	[C#] Rename unreleased API, add utilities (#16806 ) ### Description 1. rename OrtValue.FillStringTensorElement to StringTensorSetElementAt . To the API user I think we're conceptually setting the string at an offset in the tensor which is roughly equivalent to `List<string> list ... list[index] = "value"`. 2. While working on new inference examples, I noticed that I am still inclined to use `DenseTensor` for N-D indexing. Added `GetStrides()` and `GetIndex()` from strides for long dims, so the user can obtain strides and translate N-D indices into a flat index to operate directly on the native `OrtValue` buffers. Expose these functions to the user. 3. Make sure we generate docs for C# public static functions.	2023-08-02 10:06:42 -07:00
Chi Lo	f4faceab28	Ignore deprecated declarations warning for TRT EP build (#16948 ) In addition to `onnxruntime_test_all`, `onnxruntime_shared_lib_test` and `onnxruntime_customopregistration_test` should also add the "-Wno-deprecated-declarations" flag to ignore the compiler warning	2023-08-02 09:51:58 -07:00
satyajandhyala	f8d933df31	[JS/Web] Register JSEP contrib ops only once per process. (#16950 ) ### Description Register contrib ops only once. ### Motivation and Context Fix the earlier commit adding Gelu contrib op to the JSEP.	2023-08-02 00:27:11 -07:00
pengwa	b9d80131a7	Save optimized pre_grad graph once ready (#16816 ) ### Save optimized pre_grad graph once it's ready `graph_builder.build()` does two things for training: 1. optimizes the forward graph, e.g. pre_grad graph optimization. 2. builds the gradient graph. Originally, the pre_grad graph was saved only after `graph_builder.build()` completed. So if pre_grad graph optimization completed but the gradient graph build failed, we could not get the pre_grad graph to investigate. This PR saves the pre_grad graph as soon as it is ready (if save_model is enabled) in the C++ backend.	2023-08-02 14:05:26 +08:00
Wanming Lin	ba49d64f67	[WebNN EP] Support LpPool, GlobalLpPool, and Log ops (#16954 ) BTW, reset the minimal supported opset to 1, because a minimal supported opset of 7 ignores all ops whose latest since-version is less than 7, e.g. GlobalLpPool, which has only two opset versions: 1 and 2.	2023-08-01 22:35:10 -07:00
Yulong Wang	4a2a248dd7	remove unused comments in mac CI yml file (#16964 ) ### Description remove unused comments in mac CI yml file	2023-08-01 20:52:12 -07:00
zesongw	5912837791	[WebNN EP] Fix bug when Pad has negative padding value. (#16878 ) A padding value in ONNX Pad can be negative, which indicates removing pixels. WebNN EP cannot support such an operation, so it needs to use slice to handle this case.	2023-08-01 19:41:02 -07:00
Yi Zhang	36c5b0dcdd	Fix onnxruntime_tvm (#16933 ) ### Description it works but it's ugly. ### Motivation and Context Fix tvm ci	2023-08-02 07:51:00 +08:00
Tianlei Wu	50bf310dea	[CUDA] RelativePositionBias supports input with padding removed (#16923 ) update RelativePositionBias to support input with padding removed. - [x] add bias transpose kernel - [x] add test - [x] update operator document	2023-08-01 16:39:09 -07:00
Yulong Wang	afac67bcc3	[build] fix the CI pipeline (#16962 ) ### Description There are currently multiple failures blocking the CI pipelines, so this PR includes all of the fixes needed to pass the CI; any single fix alone would still fail. Includes: #16960 #16958 Please help make sure this PR gets merged once the CI passes. @snnn @carzh @guschmue Fixed: [AB#18118](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/18118) --------- Co-authored-by: Caroline Zhu <carolinezhu@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-08-01 16:22:45 -07:00
Tianlei Wu	1fbd1ed179	[CUDA] PackedMultiHeadAttention support Bias and separated Q, K and V inputs (#16913 ) ### Description Follow-up change for PackedMultiHeadAttention added in https://github.com/microsoft/onnxruntime/pull/16779: - [x] Add Bias input - [x] Add CUDA kernels to support separated query, key and values inputs. - [x] Update operator documents - [x] Add unit tests	2023-08-01 15:30:41 -07:00
Changming Sun	e412d93b00	Add lsb-release package to android custom build (#16944 ) ### Description Add lsb-release package to android custom build ### Motivation and Context To fix a build issue: /workspace/onnxruntime/tools/ci_build/github/linux/docker/inference/x64/python/cpu/scripts/install_protobuf.sh: line 27: lsb_release: command not found	2023-08-01 11:27:29 -07:00
Changming Sun	1333f73a68	Add ONNX 1.14 test data (#16943 ) This PR is similar to #15256	2023-08-01 11:19:27 -07:00
Yulong Wang	969c95f73f	[js/common] a few fixes/revises to onnxruntime-common (#16853 ) ### Description - enable unit test for js/common in CI - add debug config in js/.vscode/launch.json - enable source map for js/common/test for debugging purposes; add source map files to ignore list - ignore js/common/test folder for npm packaging	2023-08-01 11:17:39 -07:00
Yi Zhang	c4e4b98fb2	replace one pool with onnxruntime-Win2022-GPU-T4 (#16953 ) ### Description replace one pool ### Motivation and Context onnxruntime-gpu-tensorrt8-winbuild-t4 would be deprecated	2023-08-01 21:02:56 +08:00
Yulong Wang	6046456bb6	build break: apply formatter fix (#16947 ) ### Description build break: apply formatter fix	2023-08-01 01:10:55 -07:00
Patrice Vignola	49512e558a	[DML EP] Add I/O binding and `If` operator (#16859 ) Being able to leverage I/O binding for DML and registering `If` for the DML EP allows us to avoid copying the past/present key/values back and forth between the CPU and the GPU after every token. This gives us a 25% performance increase for Dolly V2 with 128 tokens on an RTX 4090.	2023-07-31 19:45:59 -07:00
Artyom Stepanishchev	ba23e5b234	[JS/Common] Fix malformed result of Tensor.fromImage(ImageBitmap) (#16919 ) ### Description Set `canvas` dimensions to the `ImageBitmap` dimensions, thus fixing a malformed Tensor creation. ### Motivation and Context According to the [HTMLCanvasElement.drawImage() spec](https://html.spec.whatwg.org/multipage/canvas.html#drawing-images): > When the destination rectangle is outside the destination image (the output bitmap), the pixels that land outside the output bitmap are discarded, as if the destination was an infinite canvas whose rendering was clipped to the dimensions of the output bitmap. meaning that `ImageBitmap` pixels exceeding the canvas dimensions will be discarded. Since no canvas dimensions are set for `Tensor.fromImage(ImageBitmap)` if-case, the default 300x150px canvas dimensions are used leading to the creation of malformed Tensors where all the exceeding pixels are discarded and equal to `0, 0, 0, 0` during the subsequent `pixels2DContext.getImageData()` call.	2023-07-31 18:18:06 -07:00
Jiajia Qin	fa8487ea3a	[js/webgpu] Check profilingMode in each run (#16897 ) ### Description This PR moves the profilingMode check from the initialization stage to each run. In this way, users can start/stop profiling at any time. Otherwise, profiling only takes effect at the very beginning and can't be stopped.	2023-07-31 17:37:24 -07:00
kunal-vaishnavi	3c72f43f78	Extend saving models optimized by inference session (#16912 ) ### Description This PR adds support for saving model optimizations after loading a model that contains external data into an `InferenceSession`. ### Motivation and Context This PR is a follow-up to a [previous PR](https://github.com/microsoft/onnxruntime/pull/16716) for saving a model optimized by an `InferenceSession`.	2023-07-31 16:39:35 -07:00
Changming Sun	73ddba964f	Update the MacOS/Linux build scripts that build/install protobuf from source (#16906 ) ### Description 1. As a follow-up of #16761, this PR allows building ORT on iOS/Android without the need to explicitly specify a protoc path. #16761 is for WASM. This one is for iOS/Android 2. Update the MacOS/Linux build scripts that build/install protobuf from source. Make them more flexible. Add support for RedHatEnterprise(ubi), which will be needed for upgrading the base image from centos:7 to ubi:8. 3. Update tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile : the docker file's base image has protobuf preinstalled in /usr/local, so we should uninstall it to avoid conflicts.	2023-07-31 10:51:48 -07:00
Yi Zhang	28a099fca8	unify the steps of downloading cuda sdk and setup env (#16896 ) ### Description The `%AGENT_TEMPDIRECTORY%\v11.8` is created in azcopy step. So, the set env step should be after the azcopy step. ### Motivation and Context Correct the previous logic Unify the step since multiple jobs are using it.	2023-07-31 10:25:04 -07:00
Dmitri Smirnov	50764362ac	Update protobuf Natvis visualization (#16911 ) ### Description Protobuf library update broke debug visualization. ### Motivation and Context Hard to debug	2023-07-31 09:35:21 -07:00
satyajandhyala	77b2b618b2	[JS/WebGPU] Add Resize operator (#16680 ) ### Description Implemented Resize operator support in JSEP	2023-07-31 09:35:06 -07:00
Hector Li	3fd1d3b9bd	Improve graph transformer DoubleQDQPairsRemover (#16910 ) Improve graph transformer DoubleQDQPairsRemover ### Description Improve DoubleQDQPairsRemover to not reset the scale & zero point if the existing values are the same on the target DQ & Q nodes. ### Motivation and Context Fix a bug where DoubleQDQPairsRemover resets the scale value while removing unnecessary DQ & Q nodes.	2023-07-31 09:24:46 -07:00
satyajandhyala	dd24d52737	[JS/Web] Added Gelu contrib operator support to JSEP (#16909 ) ### Description Added Gelu operator to JSEP	2023-07-31 09:18:58 -07:00
Tianlei Wu	92b6e10d37	skip test_smooth_quant to unblock Python Package Pipeline (#16914 ) ### Description Python Package Pipeline failed since there is exception raised in test_smooth_quant (from #16288): ``` File "/home/cloudtest/.local/lib/python3.8/site-packages/onnxruntime/quantization/quantize.py", line 384, in quantize_static importlib.import_module("neural_compressor.adaptor.ox_utils.smooth_quant") File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/__init__.py", line 24, in <module> from .contrib import * File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/__init__.py", line 19, in <module> from .strategy import * File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/strategy/__init__.py", line 26, in <module> __import__(basename(f)[:-3], globals(), locals(), level=1) File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/contrib/strategy/sigopt.py", line 22, in <module> from neural_compressor.strategy.strategy import strategy_registry, TuneStrategy File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/strategy/__init__.py", line 20, in <module> from .strategy import STRATEGIES File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/strategy/strategy.py", line 41, in <module> from ..algorithm import AlgorithmScheduler, ALGORITHMS File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/algorithm/__init__.py", line 20, in <module> from .algorithm import ALGORITHMS, Algorithm, AlgorithmScheduler, algorithm_registry File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/algorithm/algorithm.py", line 21, in <module> from neural_compressor.utils.create_obj_from_config import get_algorithm File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/utils/create_obj_from_config.py", line 20, in <module> from neural_compressor.metric import METRICS File 
"/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/metric/__init__.py", line 30, in <module> __import__(basename(f)[:-3], globals(), locals(), level=1) File "/home/cloudtest/.local/lib/python3.8/site-packages/neural_compressor/metric/coco_tools.py", line 54, in <module> from pycocotools import coco File "/usr/local/lib/python3.8/dist-packages/pycocotools/coco.py", line 52, in <module> from . import mask as maskUtils File "/usr/local/lib/python3.8/dist-packages/pycocotools/mask.py", line 3, in <module> import pycocotools._mask as _mask File "pycocotools/_mask.pyx", line 1, in init pycocotools._mask ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject ``` The cause is pycocotools package uses "oldest-supported-numpy", which might cause older version numpy in build pycocotools: `9e9164f979/PythonAPI/pyproject.toml (L4)` Related issue: https://github.com/cocodataset/cocoapi/issues/248 ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-07-29 11:24:28 -07:00
Tianlei Wu	742edec5e8	[CUDA] Add PackedMultiHeadAttention operator (#16779 ) ### Description Add a new operator for MultiHeadAttention whose inputs have padding removed. This only supports the packed QKV format.	2023-07-28 16:35:38 -07:00
Alexey Kamenev	7c05f7bab1	Fix IRFFT contrib op output dimension calculation (#15662 ) ### Description Fixes the issue with IRFFT output dimension calculation as described in #13236 ### Motivation and Context Please refer to #13236 for detailed description. Specifically, [this code](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cuda/math/fft_ops.cc#L103) computes the output dimension as: ``` out_dim = in_dim * 2 - 1 ``` while it should be this instead: ``` out_dim = 2 * (in_dim - 1) ``` (assuming the original signal has even number of samples, of course). For example, if the original signal has 4 samples, then the round trip should look something like: ``` 4 -> (one-sided RFFT) -> 3 (complex) -> (one-sided IRFFT) -> 4 ``` with the current code the output will be a signal with 5 points. --------- Co-authored-by: Alexey Kamenev <akamenev@nvidia.com> Co-authored-by: Nick Geneva <nicholasgeneva@gmail.com>	2023-07-28 15:52:37 -07:00
Yulong Wang	1743e9a615	[js] enable formatter for more file types (#16888 ) ### Description enable formatter for .js/.json/.jsonc/.md files	2023-07-28 15:46:58 -07:00
Paul Willot	65534ff9ef	Update setup.py to add py311 (#16899 ) ### Description Update setup.py to add python 3.11 ### Motivation and Context Python 3.11 is supported since release 1.15	2023-07-28 13:04:50 -07:00
Scott McKay	21a71d52bd	Enable CodeQL for Android build as per 1CS requirement. (#16875 ) ### Description Split stages for CPU and CPU+NNAPI builds as CodeQL is enabled at the stage level. We run it for CPU+NNAPI as that covers all the Android code. We don't want to run it for both as duplicate issues would be created for a problem in code included in both builds.	2023-07-28 17:54:23 +10:00
Yi Zhang	9f21f694cf	stop support to VS 2019 (#16892 ) ### Description Remove VS 2019 code.	2023-07-28 13:09:35 +08:00
pengwa	a021cb1b6e	Allow creating ConstantScalarNode for double type (#16797 ) ### Allow creating ConstantScalarNode for double type Allow creating ConstantScalarNode for double type. It looks like the double type is not respected when creating a constant, so fix it. ``` onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder*, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Sub) bound to different types (tensor(double) and tensor(float) in node (/_original_module/_original_model/gpt_neox/layers.0/input_layernorm/Pow_Grad/Sub_1). ```	2023-07-28 12:41:22 +08:00
Wanming Lin	634a3f2f28	[WebNN EP] Support Max and Min ops (#16858 )	2023-07-27 18:22:03 -07:00
Changming Sun	9dcbcf1d2f	Delete unused files (#16887 ) ### Description These yaml files and docker files are not used by any pipeline. If I were wrong, feel free to submit a PR to get the wrongly deleted file back from git history (git keeps everything forever).	2023-07-27 16:46:09 -07:00
Changming Sun	161a9d1d6d	Add some safety check for conv op (#16839 ) ### Description Add some safety check for conv op. It is to validate if the attributes coming from a conv op are in a valid range. (shouldn't be too large or too small).	2023-07-27 16:37:55 -07:00
satyajandhyala	e67547b978	[JS/WebGPU] Added Flatten operator support. (#16860 ) ### Description Added Flatten operator support to JSEP.	2023-07-27 12:50:45 -07:00
Hector Li	ec935a5533	[QNN EP] Check the axis attribute for LayerNorm for HTP (#16872 ) Check the axis attribute for LayerNorm for HTP ### Description Add code to check the axis value for LayerNorm for HTP explicitly to make sure only the last dimension is allowed.	2023-07-27 09:59:46 -07:00
Hector Li	2748f51603	[QNN EP] Remove duplicate string define (#16877 ) [QNN EP] Remove duplicate string define ### Description Remove duplicate string define for QNN op parameters, use them directly from QNN header file.	2023-07-27 09:08:55 -07:00
Prathik Rao	779fba1666	ORT Cache (#16744 ) ### Description This PR adds support to cache the exported training/evaluation ONNX model in `ORTModule`. On future runs, instead of exporting the model again, we can pick up the model from a location on disk and run `ORTModule` training/evaluation. ### Motivation and Context ORT Training DRI Contribution --------- Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com> Co-authored-by: pengwa <pengwa@microsoft.com>	2023-07-27 09:00:43 -07:00
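
Note on commit 7c05f7bab1 (IRFFT output dimension): the round trip described in that message, `4 -> 3 (complex) -> 4`, can be checked with a small standalone sketch. This is not the onnxruntime CUDA kernel; the helper names `rfft`, `irfft` and `irfft_output_length` are hypothetical, and a naive O(N^2) DFT stands in for the real FFT. It assumes, as the commit does, that the original signal has an even number of samples.

```python
import cmath


def rfft(signal):
    """Naive one-sided DFT of a real signal: keeps bins 0..N//2 only."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n // 2 + 1)
    ]


def irfft_output_length(in_dim):
    """Corrected formula from the commit message (even-length original signal)."""
    return 2 * (in_dim - 1)


def irfft(spectrum):
    """Naive inverse of rfft() for an even-length real signal."""
    n = irfft_output_length(len(spectrum))
    # Rebuild the full spectrum: the missing upper bins are the complex
    # conjugates of the interior bins, in reverse order (Hermitian symmetry).
    full = list(spectrum) + [
        spectrum[k].conjugate() for k in range(len(spectrum) - 2, 0, -1)
    ]
    return [
        (sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(full)) / n).real
        for t in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0]
spectrum = rfft(signal)    # one-sided spectrum of a 4-sample signal
restored = irfft(spectrum)  # back to the original length
print(len(signal), "->", len(spectrum), "->", len(restored))
```

The old formula, `in_dim * 2 - 1`, would give 5 output points for the same 3-bin spectrum, which is the mismatch the commit describes.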


9276 commits