onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-20 19:12:24 +00:00

Author	SHA1	Message	Date
Vincent Wang	04f7c2deda	FP16_Optimizer Support for more Deepspeed Versions (#12046 ) * fp16_optimizer for more ds versions * change ds version * bugfix * fix bug	2022-06-30 18:36:17 +08:00
Tianlei Wu	ecca6f4d16	Move beamsearch shared initializers from subgraphs to main graph (#12025 ) * move shared initializers to parent graph * add --disable_shared_initializers	2022-06-29 22:43:41 -07:00
zhijxu	9f260fb60f	resolve comments	2022-06-30 11:26:13 +08:00
zhijxu	100aebbd26	resolve comments	2022-06-30 11:26:13 +08:00
zhijxu	2295b24cd5	support optimizer opt for deepspeed 0.5.9	2022-06-30 11:26:13 +08:00
George Wu	102d01b206	update roialign cuda impl to onnx opset16 (#12036 ) * roialign opset16 * fix * fix	2022-06-29 17:32:59 -07:00
Yi-Hong Lyu	c8cd36da01	Resize optimization for all architectures (#11956 ) With this patch, it optimizes Resize when the input X is 4D int8/uint8 tensor and the mode is linear by: * Transforming NCHW Resize to NHWC variant * Using the NHWC Resize kernel without floating-point computation It improves DeepLab V3 with uint8 quantization by 19% on X64. It also improves Resize of DeepLab V3 with int8 quantization by 15%~18% on X64.	2022-06-29 09:19:19 -07:00
Chun-Wei Chen	4eb54ff9a5	Add warning about future computation change for ConvTranspose with auto_pad (#11984 ) * Add warning about future computation change for Convtranspose with auto_pad * improve msg * update TODO to make lint happy * update more contents for warning and add if * valid was not infected * move it into kernel registration * parse auto_pad myself * try to use conv_transpose_attrs_.auto_pad directly	2022-06-29 06:53:31 -07:00
Valery Chernov	8ba8146650	[TVM] handshake mechanism for support of TVMso EP (#11437 ) * infrastructure for handshake mechanism was implemented. sha256 was selected as first hash algorithm * check hash during compile in TVMso EP * add IPP-CRYPTO to external dependencies for TVM EP * made checkHash method constant * removed the public implementation of the SHA-256 algorithm so as not to cause a license conflict * implemented SHA-256 calculation using ipp-crypto library * fix dependency for ipp-crypto * add provider options for hash check * update documentation for added provider options * add hash check condition * fix docs * fix lint * fix ORT_THROW Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2022-06-29 14:57:18 +02:00
dependabot[bot]	c0dd9be7ba	Bump electron from 13.6.6 to 15.5.5 in /js/web (#11884 ) Bumps [electron](https://github.com/electron/electron) from 13.6.6 to 15.5.5. - [Release notes](https://github.com/electron/electron/releases) - [Changelog](https://github.com/electron/electron/blob/main/docs/breaking-changes.md) - [Commits](https://github.com/electron/electron/compare/v13.6.6...v15.5.5) --- updated-dependencies: - dependency-name: electron dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-28 15:50:44 -07:00
Yosshi999	0702364d7a	[js/web][bugfix] fix negative axes for unsqueeze (#11944 ) [js/web] fix negative axes for unsqueeze	2022-06-28 11:28:35 -07:00
Tianlei Wu	9be2b6046b	convert_beam_search supports large gpt2 model (#11989 ) (1) add --run_shape_inference to make shape inference optional (2) add --vocab_mask to make the input optional (3) add --overwrite in gpt2 convert_to_onnx to allow overwrite existed raw onnx from PyTorch (4) save gpt2 model tensors to one external data file by default (5) group convert_beam_search arguments to multiple groups (6) make --decoder_onnx optional for gpt2 model (7) replace print by logger (8) update shape inference function to support external data. (9) when saving external data, show warning if onnx version < 1.12	2022-06-28 10:02:35 -07:00
sumitsays	4552dd38c6	[DML EP] Pad operator: Handle negative pad counts (#11974 ) * Pad fallback to CPU * Added queryPad in operatorRegistration.cpp * Acknowledged PR comments * Used any_of * used none_of instead of any_of Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>	2022-06-28 00:41:57 -07:00
RandySheriffH	d5fcb432fa	Generalize native op creation (#11539 ) * create op from ep * read input count from context * create holder to host nodes * fix typo * cast type before comparison * throw error on API fail * silence warning from minimal build * switch to unique_ptr with deleter to host nodes * fix typo * fix build err for minimal * fix build err for minimal * add UT for conv * enable test on CUDA * add comment * fix typo * use gsl::span and string view for Node constructor * Added two APIs - CopyKernelInfo and ReleaseKernelInfo * pass gsl::span by value * switch to span<NodeArg* const> to allow for reference to const containers * fix typo * fix reduced build err * fix reduced build err * refactoring node construction logic * rename exceptions * add input and output count as arguments for op creation * refactor static member * use ORT_CATCH instead of catch * cancel try catch * add static value name map * format input definition and set err code * fix comments * fix typo	2022-06-27 21:12:15 -07:00
Dwayne Robinson	fc0143fe68	DML EP ResNet50 opset 15 fails in ONNX checker for FusedBatchNormalization lacking training_mode attribute (#12010 ) FusedBatchNormalization include training_mode attribute	2022-06-27 19:41:34 -07:00
Edward Chen	f045994389	[NNAPI EP] Update NNAPI headers (#11954 ) Update the NNAPI headers to a more recent version (copied from TF Lite v2.9.1).	2022-06-27 18:54:06 -07:00
Edward Chen	466b2d9f3d	[C# Tests] Add support for double tensor output in TestPreTrainedModels. (#12008 ) Add support for double tensor output in TestPreTrainedModels.	2022-06-27 18:49:19 -07:00
Sheil Kumar	7d712c8f8b	Fix WinML Tests are still targetting deprecated (deleted) experimental signal op definitions (#12006 ) * fix winml tests * remove legacy test * switch idft -> dft+inverse attr * upgrade opset 13->17 for signal ops tests	2022-06-27 16:35:50 -07:00
Yulong Wang	bd973bcf1e	[js/rn] upgrade dependencies for e2e test (#11863 ) * [js/rn] upgrade dependencies for e2e test * use JDK11 only for gradle * expand variable	2022-06-27 14:56:49 -07:00
Dwayne Robinson	8cd02508c8	Include opset 15 in Conv+BatchNormalization fusion (#11960 )	2022-06-27 10:59:14 -07:00
dependabot[bot]	68afa2d362	Bump async from 2.6.3 to 2.6.4 in /js/react_native/e2e (#11280 ) Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4. - [Release notes](https://github.com/caolan/async/releases) - [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md) - [Commits](https://github.com/caolan/async/compare/v2.6.3...v2.6.4) --- updated-dependencies: - dependency-name: async dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-27 10:30:01 -07:00
George Nash	9583841ef7	Improve performance of BiasGelu on oneDNN execution provider (#11935 ) Improve performance of BiasGelu on OneDNN execution provider This modifies how BiasGelu is handled by the OneDNN execution provider by executing the gelu_erf primitive as a postop of the binary_add primitive. Also fixes extra data copies made when running on GPU. Signed-off-by: George Nash <george.nash@intel.com>	2022-06-27 08:34:11 -07:00
Scott McKay	f72288b453	Fix a couple of typos (#11943 ) Fix couple of typos	2022-06-27 10:32:14 +10:00
Gary Miguel	dc5d6b9515	register signal ops for opset 17 (#11778 ) * Register signal ops for op set 17 Note code is mostly being moved, not added. These ops were previously only registered as Microsoft contrib ops and only built if `BUILD_MS_EXPERIMENTAL_OPS=1`. They've been added to the ai.onnx standard op set in version 17. Main components of this change: * Move the kernels from the contrib_ops directory to the core directory. * Add function bodies for ms experimental ops. This will allow old models that use the contrib ops to continue to function. All the function bodies consist of a single op (the new standard op), so performance overhead should be minimal. Minor clean-up also in this change: * De-duplicate get_scalar_value_from_tensor: put it in a new utils.h. * Fix some bugs that caused compilation errors with the experimental ops. Tested with `build.sh --ms_experimental` * Fix some spelling errors and lint violations. * Replace a couple of switch statements with `MLTypeCallDispatcher`. * Use `InlineVector` instead of `std::vector`. Unblocks https://github.com/microsoft/onnxruntime/issues/11640	2022-06-27 10:26:55 +10:00
Hubert Lu	f4ba199bad	Optimize FastGelu with float2 and float4 vectorized kernels on ROCm (#11491 ) * Using vectorized loads (float2) for fp16 to improve performance * Fix a few warnings from cpplint * Fix a few warnings from cpplint * Use __float2half2_rn and fix some cpplint warnings * Move some computations to LaunchFastGeluKernel * Fix some Lint C++ warning * Using vectorized loads (float4) for fp16 to improve performance * Switch whether to optimize FastGelu with float4 vectorization * Switch to float4 memory access based on input_length in FastGelu * Comment how to set the threshold of float2 and float4 vectorized kernels * Add FastGelu fp16 unit tests for bias_length = 2 and 8 * Make vectorized kernels generic with aligned_vector * Unify the vectorized kernels with/without bias * Refactor the code to suppress cpplint warnings * Solve formatting issues * Remove cudaDeviceProp from FastGeluKernel and LaunchFastGeluKernel * Move fast_gelu_impl.h to rocm/bert * Fix some Lint C++ warnings and code alignment	2022-06-24 12:46:17 -07:00
Dmitri Smirnov	088bc7494b	Deprecate APIs returning raw ptrs and provide replacements (#11922 ) Provide better documentation	2022-06-24 09:50:04 -07:00
G. Ramalingam	b1411c8357	Restructure function inliner (#11731 ) * Add nested function call tests * Add overload for Specialize * Pass symboltable to onnx shape inference * Avoid renaming empty names * Enable sequence_map tests which failed before this change	2022-06-24 09:21:31 -07:00
pengwa	0d6cbc6e57	fix memory profile for partial graph run (#11911 ) * fix mpi build for gcc8 or higher * fix memory profile for partial graph run * Revert "fix mpi build for gcc8 or higher" This reverts commit fb60beb05402cd380597a12fc25880c0c8652ed4. * remove debug code * fix build * fix build * fix cpplint and python black format	2022-06-24 13:08:14 +08:00
Wil Brady	fa7f80c847	Eager mode: Argmax and fixup max and min. (#11861 ) * Eager mode ArgMax support. * Fix basic max and min functionality with minor generator update. Note this does not address all max and min api scope. * Add addmm test.	2022-06-23 15:55:34 -04:00
Tianlei Wu	2c4e4b6afc	MT5 onnx conversion for beam search (#11958 ) * support mt5 * save external data to one file * update default value of --model_name_or_path and --decoder_onnx	2022-06-23 10:23:28 -07:00
Dmitri Smirnov	607b7df060	Allow saving on CPU usage for infrequent inference requests by reducing thread spinning (#11841 ) Introduce Start/Stop threadpool spinning switch Add a session config option to force spinning stop at the end of the Run()	2022-06-23 10:04:37 -07:00
pengwa	c398ad513f	Fix orttraining-linux-ci-pipeline - Symbolic shape infer (#11965 ) fix symbolic shape error due to upgraded numpy + legacy sympy	2022-06-23 08:23:36 -07:00
Ye Wang	e24349b8f2	Optimize t5 encoder in beam search (#11926 ) * optimize t5 encoder * update * update * update * refactor expand impl * cuda tests passed * update * alignment * more alignments * review comments	2022-06-22 12:45:02 -07:00
Dwayne Robinson	f6d2fe8311	MeanVarianceNormalization CPU EP axes attribute validation (#11925 ) Validate axes attribute parameter properly rather than silently returning incorrect results	2022-06-22 12:03:13 -07:00
Preetha Veeramalai	f54476a42f	Dll version fix ovep4.1 (#11953 ) * Setting default version values for ovep dlls as well * Update backend_manager.cc Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: mohsin <mohsinx.mohammad@intel.com>	2022-06-22 11:09:36 -07:00
pengwa	2229c48547	fix mpi in training build (#11855 ) fix mpi build for gcc8 or higher	2022-06-22 10:04:44 +08:00
Vincent Wang	03beed0ceb	Remove Cast before and after Gelu (#11885 ) * fuse cast gelu * use PropagateCastOps * fix ut	2022-06-22 09:07:48 +08:00
Gary Miguel	4bf22e2a40	Update ONNX to 1.12 (#11924 ) Follow-ups that need to happen after this and before the next ORT release: * Support SequenceMap with https://github.com/microsoft/onnxruntime/pull/11731 * Support signal ops with https://github.com/microsoft/onnxruntime/pull/11778 Follow-ups that need to happen after this but don't necessarily need to happen before the release: * Implement LayerNormalization kernel for opset version 17: https://github.com/microsoft/onnxruntime/issues/11916 Fixes #11640	2022-06-21 17:19:52 -07:00
Dwayne Robinson	64f95d400a	Update DML 1.9 Nuget package to fix WindowsAI nuget pipeline build issue (#11934 )	2022-06-21 15:55:51 -07:00
Scott McKay	3b1224dc08	Add .net6 support to the C# nuget package. (#11908 ) * Add .net6 support to the C# nuget package. Currently requires jumping through a lot of hoops due to .net 6 only being supported in the preview release of VS 2022. Build existing targets using msbuild. Add .net6 targets and build using dotnet. Create nuget package with combined targets. A few misc automated changes from VS to spacing and adding a couple of properties.	2022-06-22 08:08:24 +10:00
Arseny	8c8a781cdb	fix: handle setBindingDimensions return value in TensorRT EP (#11929 )	2022-06-21 14:30:27 -07:00
Edward Chen	5646410f65	Enable Pad test cases with initializer inputs only when building NNAPI EP on Android. (#11932 )	2022-06-21 14:16:55 -07:00
sfatimar	61a74f2f4d	Mohsin/enable dynamic shapes (#11867 ) * Add pypi build changes to latest Master * Add ORT training part of OV build * Disabling SqueezeOpTest.BadAxes * Add ONNXruntime branch ARG to Docker build * Changes to include file details versions * Commit File Version Updates * Change naming for linux build * Add fix for pylint format errors * Fix pylint warnings. * Enable Dynamic Shapes for OV_API_20 * Update requirements.txt whl version- internal_ci fix * Update backend_manager.cc MYRIAD Fix * Update wheel version in requirements.txt * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc * Update setup.py * Fix pylint warnings * Fix pylint warnings 2 * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc * Update backend_manager.cc Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: mohsinmx <mohsinx.mohammad@intel.com>	2022-06-21 08:03:58 -07:00
Adrian Lizarraga	b20daeda81	Update Linux Multi GPU TensorRT pipeline to TensorRT 8.4 (#11923 ) * Try manually installing trt8.4 in multi-gpu pipeline * Remove stmts that clean up cmake, ctest. Update tensorrt repository name passed to get_docker_image.py * Update trt and cudnn home * Don't install trtexec cli tool. * Increase job timeout * Revert timeout change and use trt placeholder builder build option	2022-06-21 07:59:11 -07:00
Ye Wang	859ef277a0	apply zcode changes to the beam search op (#11880 ) * apply zcode changes to the beam search op * fix pipeline failure * add doc * workaround for C# * update * update * use name zcode * review comment * review comments * fix cpplint * review comments	2022-06-20 18:39:07 -07:00
RandySheriffH	cefceff5c9	Mark the end of APIs for release 1.12 (#11914 ) * mark the end of APIs for 1.12 * add static assert for C API 1.12	2022-06-20 15:22:55 -07:00
Adrian Lizarraga	ca35ea417a	[EP-Perf] Install new wheel>=0.35.1 dependency (#11917 )	2022-06-20 15:09:27 -07:00
Yi Zhang	7f1e9e8c67	Bash: there should be a whitespace after not operator. (#11910 ) add whitespace after not	2022-06-21 05:14:32 +08:00
Chi Lo	457ce6cb89	Make symbolic shape inference script support external weight (#11909 ) * add support for external data * fix format * fix format * fix typo * fix typo	2022-06-20 13:07:45 -07:00
Dwayne Robinson	c1577d08ca	DML EP QuantizeLinear defer axis validation for test_quantizelinear_cpu (#11906 ) DML EP QuantizeLinear defer axis validation	2022-06-20 11:03:32 -07:00

6940 commits