onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-23 02:38:28 +00:00

Author	SHA1	Message	Date
Ella Charlaix	ec00d28a2f	Fix a typo in quantization tools (#10940 )	2022-04-15 22:23:50 +00:00
PeixuanZuo	69e2d319ed	[FIX] symbolic shape infer error with onnx-1.11.0 (#10674 ) * [FIX] symbolic shape infer error with onnx-1.11.0 * [FIX] consider inputs name contains 'unk__' * [TEST] enable gpt2 test * [FIX] gpt2_megatron_opt.onnx graph	2022-04-15 22:22:35 +00:00
Chi Lo	b713855a98	Release 1.11.0 cherry pick round 1 (#10915 ) * Update to flatbuffers v2.0.0 (#10866) * Fix Reduced ops pipeline (#10861) * Fix a couple of issues with the python package tools (#10858) * Tweaks to the model utils * Add handling for a dim_value of -1 when replacing the entire input shape. This occurs in models exported from PaddlePaddle * make pytorch helpers accessible in package * make QDQ helpers accessible in package * Fix wrong percentile values returned during calibration (#10847) * Use numpy.percentile to get the lookup value. * Use 1.0 as float value rather than integer. * Add missing cdf parameter for `np.percentile`. * Use 100. instead of 1.0 * Remove print. * Update from @yufenglee * Add support for opset 16 to transpose optimizer. (#10841) * Add support for opset 16 to transpose optimizer. Only change required is for GridSample to be added to the layout sensitive ops. The existing handling for layout transpose works with that as the first input and first output are layout sensitive. Update the optimize to be able to return an error message if it fails. * Use separate build directories for full and mobile iOS packages. (#10835) * Address performance issue with abseil flat_hash_table. (#10819) When returning by value in a cross DLL call, the hash table even though containing all the entries that are originally there can not find at least some of them. Reverting to std::unordered_set pending further investigation. * Mark end of version 11 C API. 
(#10803) * Mark end of version 11 C API * Add static_assert * avoid using LocalFree on FormatMessageW buffer (#10796) * remove local free * Remove local free from onnxruntime * don't allocate * Change to use constexpr to satisfy CPU build warning * Integrate C-API tests into Pipelines for release packages (#10794) * add c-api test for package * fix bug for running c-api test for package * refine run application script * remove redundant code * include CUDA test * Remove testing CUDA EP temporarily * fix bug * Code refactor * try to fix YAML bug * try to fix YAML bug * try to fix YAML bug * fix bug for multiple directories in Pipelines * fix bug * add comments and fix bug * Update c-api-noopenmp-packaging-pipelines.yml * Remove failOnStandardError flag in Pipelines * Detect runtime CUDA JIT and warn the user (#10781) * Use cudaMalloc vs cudaDeviceSynchronize and show the total time * Update convert_onnx_models_to_ort.py to support runtime optimizations. (#10765) Add runtime optimization support to ONNX -> ORT format conversion script. Replace `--optimization_level`, `--use_nnapi`, and `--use_coreml` with a new `--optimization_style` option. * Add multithreading test and put a lock on nvinfer1::createInferRuntime() for TRT EP (#10714) * Add multithread unit test and put lock on library call * update code * remove debug code * add comment * add one session multi-threads inference * Put lock for build engine all the time * Update naming and comment * remove unnecessary lock * Revert "remove unnecessary lock" This reverts commit 9c2317b1d2273dec0ebdeb52160bc757839e5edc. * Fix handling of nodes inserted by NHWC transformer. (#10904) (#10925) * Revert "Upsample support NHWC (#10554)" (#10917) This reverts commit `bd08f11a58`. 
Co-authored-by: Yufeng Li <liyufeng1987@gmail.com> * [python API] Change raise import error when `C:\Windows\System32\vcruntime140_1.dll` is not found to warning (#10927) * remove throw if C:\\Windows\\System32\\vcruntime140_1.dll cannot be found * Add comments and update warning message * adding back accidentally removed line Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com> * [js] Create npm packaging pipeline (#10886) * create npm packaging pipeline * fix indentations * Update npm-packaging-pipeline.yml for Azure Pipelines * Update npm-packaging-pipeline.yml for Azure Pipelines * Update npm-packaging-pipeline.yml for Azure Pipelines * react-native-ci as a template * fix typos * fix template paths * add a depencendy * change a stage name * set different artifact name for each package * fix typo * Update npm-packaging-pipeline.yml for Azure Pipelines Set a build Id for node npm package as a parameter * Update npm-packaging-pipeline.yml for Azure Pipelines Set a build Id for node npm package as a parameter * Update npm-packaging-pipeline.yml for Azure Pipelines * Follow up update for python API checking if `vcruntime140_1.dll` is available (#10927) (#10933) Co-authored-by: Hariharan Seshadri <hasesh@microsoft.com> Co-authored-by: Scott McKay <skottmckay@gmail.com> Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com> Co-authored-by: Pranav Sharma <prs@microsoft.com> Co-authored-by: Ryan Lai <rylai@microsoft.com> Co-authored-by: Ryan Hill <38674843+RyanUnderhill@users.noreply.github.com> Co-authored-by: Yi-Hong Lyu <yilyu@microsoft.com> Co-authored-by: Yufeng Li <liyufeng1987@gmail.com> Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com> Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com> Co-authored-by: Sunghoon 
<35605090+hanbitmyths@users.noreply.github.com>	2022-03-18 11:16:30 -07:00
Chun-Wei Chen	e0cec5c4a6	skip optional related models from opset16 (#10840 )	2022-03-10 23:36:06 -08:00
Hariharan Seshadri	7f0b0d0274	Fix bug in MemcpyToHost (#10829 ) * Fix bug in MemcpyToHost * Fix comment * Fix warnings * Address PR comments	2022-03-10 08:17:16 -08:00
Chun-Wei Chen	e6456260b5	remove final six (#10822 )	2022-03-10 08:16:40 -08:00
Edward Chen	22c475520e	Make QDQSelectorActionTransformer() is_int8_allowed parameter required. (#10823 ) Make QDQSelectorActionTransformer() is_int8_allowed parameter required. Set it to QDQIsInt8Allowed() in places it was previously set to false.	2022-03-09 16:17:00 -08:00
Changming Sun	cc6bc34c8c	Update protobuf submodule (#10801 )	2022-03-09 09:37:58 -08:00
Dmitri Smirnov	58521fb822	Make training CUDA kernels adhere to established code structure patterns (#10735) Current training optimizer kernels include CPU headers, which affects the changes that we can make in the CPU code with a C++14 compiler and in other refactoring efforts. Rearrange the kernel according to the established patterns and do not include headers that are not needed.	2022-03-09 09:06:45 -08:00
Adam Pocock	4ef81b142d	Making the Java tests faster by optionally disabling ones which require running multiple JVMs. (#10811 )	2022-03-08 22:19:37 -08:00
Hariharan Seshadri	ae97ecf05b	Fix CPU, CUDA Selu activation logic (#10771 )	2022-03-08 19:53:27 -08:00
Edward Chen	c147c9dda6	Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD. (#10778 ) Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD as it is now implied by ORT_EXTENDED_MINIMAL_BUILD. Remove related CMake option.	2022-03-08 16:18:49 -08:00
George Wu	769aa8363d	update onnx-tensorrt to bring in https://github.com/onnx/onnx-tensorrt/pull/812 (#10810 )	2022-03-08 14:51:07 -08:00
Jingqiao Fu	f4fd67cc2c	Revert "add load from buffer (#10162 )" (#10590 ) This reverts commit `5cd57bb726`.	2022-03-08 13:35:23 -08:00
dependabot[bot]	7e04dccca7	Bump numpy in /tools/ci_build/github/linux/docker/scripts (#10385 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.16.6 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.16.6...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-03-08 11:02:36 -08:00
Sunghoon	68c8f5a1ef	Change a pipeline vmImage from windows-latest to windows-2019 (#10804 )	2022-03-08 10:49:59 -08:00
Yufeng Li	33c6819196	add qdq support of Sigmoid (#10800 )	2022-03-08 10:29:15 -08:00
Changming Sun	6260733533	Fix eager mode pipeline (#10802 ) It was still using python 3.6	2022-03-08 09:26:20 -08:00
Hariharan Seshadri	a9d9c6b486	Register CPU, CUDA and ROCM opset-16 kernels for some operators (#10643 )	2022-03-08 09:18:39 -08:00
Changming Sun	ce07dc30fd	Change how we apply patches to absl (#10799 )	2022-03-08 02:03:06 -08:00
George Wu	1e4a4bfe58	update onnx-tensorrt reference. (#10795 )	2022-03-07 21:45:46 -08:00
liqun Fu	da885a72e8	update with onnx 1.11 release (#10441 )	2022-03-07 21:10:55 -08:00
Yulong Wang	80917342b7	[js] upgrade mocha@8.2.1 to 9.2.1 (#10793 )	2022-03-07 20:40:24 -08:00
dependabot[bot]	4d943c9bd3	Bump numpy from 1.16.6 to 1.21.0 in /tools/ci_build/github/linux/docker/scripts/manylinux (#10387 ) * Bump numpy in /tools/ci_build/github/linux/docker/scripts/manylinux	2022-03-07 20:39:49 -08:00
PeixuanZuo	c07a27a008	[FIX] delete python3.6 from AMD python package docker image builder (#10790 ) * [UPDATE] delete python3.6 to cooperate numpy==1.21.0 * [UPDATE] delete python3.6 to cooperate numpy==1.21.0	2022-03-07 18:21:43 -08:00
Vincent Wang	4a38f9e31d	enable strided tensor for training only (#10748 )	2022-03-08 08:31:28 +08:00
zhangyaobit	b7f00b9682	Refactor the common code per operator into an abstract base class. (#10785 )	2022-03-07 13:15:49 -08:00
Daigo HIROOKA	a08036da09	correct symbolic name of GridSample operation (#10782 ) Function name needs to match PyTorch ATen op name, which is `aten::grid_sampler`.	2022-03-07 12:49:12 -08:00
dependabot[bot]	3e54f94bb0	Bump karma from 6.3.14 to 6.3.16 in /js/web Bumps [karma](https://github.com/karma-runner/karma) from 6.3.14 to 6.3.16. - [Release notes](https://github.com/karma-runner/karma/releases) - [Changelog](https://github.com/karma-runner/karma/blob/master/CHANGELOG.md) - [Commits](https://github.com/karma-runner/karma/compare/v6.3.14...v6.3.16) --- updated-dependencies: - dependency-name: karma dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>	2022-03-07 11:47:23 -08:00
Yulong Wang	25fdcfbd14	[js/web] allow multiple inference session creating concurrently (#10784 ) * test case * bugfix * fix * support multi session init	2022-03-07 11:35:06 -08:00
RandySheriffH	a4b5fa334a	Add type and shape information to profiled numbers (#10773 ) * add func to collect type shape * reformat * refactor perf view * remove obsolete	2022-03-07 10:17:58 -08:00
Changming Sun	d8bf9a479b	Remove python 3.6 from training pipelines (#10780) The numpy version we use doesn't support python 3.6, and the inference pipelines have already removed python 3.6.	2022-03-07 09:57:24 -08:00
Hariharan Seshadri	9d30262422	Fix AMD training pipeline (#10788 )	2022-03-07 08:53:08 -08:00
Chen Fu	50a6f095cd	Symmetric QGEMM kernel for ARMv8 A55 chip (#10754) The ARM A55 micro-architecture (with dot product instructions), like the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware: a 128-bit load instruction blocks the pipeline for 2 whole cycles, during which no other instructions can be executed, whereas a 64-bit load instruction can be dual-issued with many other instructions. This change adds a Symmetric QGEMM kernel for the A55 micro-architecture that replaces `ldr q4,[x1],#16` with the sequence `ldr d4,[x1],#8`, `ldr x11,[x1],#8`, `ins v4.d[1],x11`, so that the memory load cycles can be hidden behind compute cycles in the kernel. Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-03-07 08:41:13 -08:00
PeixuanZuo	55af7a96a7	update the amd ci pipeline (#10723 ) * [TEST] test to get amd pipeline information * [FIX] lower the threshold * [UPDATE] add retry task * [UPDATE] add retry task * [ERROR] error to occur retry * [FIX] error * [UPDATE] update retryCountOnTaskFailure to 1 time * [UPDATE] add showmeminfo	2022-03-07 18:39:42 +08:00
Fei Hu	60acfd3dd8	Support CUDA Graph in the CUDA EP (#9978 )	2022-03-06 20:47:31 -08:00
Tianlei Wu	0e335aba37	Update BeamSearch operator spec to support t5 (#10777) * change BeamSearch op to support encoder decoder model * check model_type and decoder attribute * fix * update comments * warn shape inference issue with onnx v1.11 or T5 * skip parity test when temperature != 1.0 * fix build	2022-03-04 21:52:45 -08:00
George Nash	6be5185088	Update dnnl Add, Mul, Sub, Div ops to handle scalar values (#10756 ) * Update dnnl Add, Mul, Sub, Div ops to handle scalar values Signed-off-by: George Nash <george.nash@intel.com> * Add additional scalar support for dnnl execution provider This will add scalar support for: Eltwise operators: Abs, Elu, Exp, LeakyRelu, Log, Relu, Round, Sigmoid, Softplus, Sqrt, and Tanh Gelu operators: BiasGelu, FastGelu, and Gelu Softmax operator Signed-off-by: George Nash <george.nash@intel.com>	2022-03-04 19:28:25 -08:00
Ye Wang	259ade2557	Add ability to modify num_hidden_layers from benchmark script (#10760) * add ability to modify num_hidden_layers from benchmark script * comment * Revert "comment" This reverts commit 28794b0e4f86506dcc937738894fcef97fc84e48. * Revert "add ability to modify num_hidden_layers from benchmark script" This reverts commit 96f36ed7f751721bcf4e3ab8748a715f19a4e044. * review comments Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-03-04 18:28:51 -08:00
Ella Charlaix	fde847473b	Add min max moving average calibration method (#10753) * Add min max moving average calibration method * Modify the calibration extra options dictionary creation	2022-03-04 14:55:31 -08:00
Maxiwell	43ff27c7c8	ppc64le: optimizing the MlasQuantizeLinear() with VSX (#10644 ) This code is valid only when -mcpu is set to utilize POWER9 technology or above. A compatible code for POWER8 was created as well, but it was not tuned for performance.	2022-03-04 14:54:56 -08:00
Tianlei Wu	379b3cdef6	T5 to ONNX conversion script (#10766 ) * T5 onnx conversion script	2022-03-04 14:42:04 -08:00
Olivia Jain	12eb660415	Compare TRT vs ORT-TRT Accurately (#10565 ) * get inputs independently for trtexec * track one process only * remove engine and profile files * change time to commit time * add runtime option for io binding * move to commit date * fixes * add option for graph optimization * cleanup docker script * include remaining changes * choose graph optimization option * add space in option	2022-03-04 10:14:18 -08:00
dependabot[bot]	e3c85d4262	Bump numpy Bumps [numpy](https://github.com/numpy/numpy) from 1.19.5 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.5...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2022-03-04 09:51:32 -08:00
dependabot[bot]	b780a3784e	Bump numpy in /tools/ci_build/github/linux/docker/scripts/training Bumps [numpy](https://github.com/numpy/numpy) from 1.19.5 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.5...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2022-03-04 09:38:38 -08:00
dependabot[bot]	0b0e8ccf92	Bump numpy Bumps [numpy](https://github.com/numpy/numpy) from 1.19.5 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.5...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2022-03-04 09:34:58 -08:00
Changming Sun	283d0c47b4	Update our absl cmake files (#10762 )	2022-03-04 09:28:04 -08:00
zhangyaobit	4c88fa5971	Add micro-benchmark for FastGelu (#10744 ) * Add micro-benchmark for FastGelu * Delete the bert-base case, as it is very similar to the bert-large one. * Add argument parsing and more user-friendly provider type assertion.	2022-03-04 08:51:15 -08:00
Valery Chernov	46d0b20ac2	upstream TVM. small code cleaning (#10515 ) Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-03-04 12:15:29 +01:00
Edward Chen	395a7242d6	[iOS packaging] Minor updates. (#10755 ) * Change storage container, simplify build definition parameters. * Remove explicit version from Objective-C docs. * Increase timeout. * Use real storage account. * Get static website URL with az cli.	2022-03-04 16:02:53 +10:00
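
The load-splitting idea described in commit 50a6f095cd (#10754) can be illustrated with a small model. The sketch below is not code from the repository. It assumes little-endian byte order, models registers as Python integers, and uses the hypothetical helper names `load_q` and `load_split`, purely to show that the three-instruction sequence produces the same 128-bit value and the same post-incremented pointer as the single `ldr q4,[x1],#16`.

```python
from typing import Tuple


def load_q(mem: bytes, offset: int) -> Tuple[int, int]:
    """Model `ldr q4,[x1],#16`: one 128-bit load with post-increment.

    Returns (128-bit register value, updated offset).
    """
    value = int.from_bytes(mem[offset:offset + 16], "little")
    return value, offset + 16


def load_split(mem: bytes, offset: int) -> Tuple[int, int]:
    """Model `ldr d4,[x1],#8`, `ldr x11,[x1],#8`, `ins v4.d[1],x11`.

    Two 64-bit loads, then the second half is inserted into the
    upper 64-bit lane of the vector register.
    """
    # ldr d4: fills the low 64 bits of v4 (upper bits are zeroed)
    v4 = int.from_bytes(mem[offset:offset + 8], "little")
    offset += 8
    # ldr x11: loads the next 64 bits into a general-purpose register
    x11 = int.from_bytes(mem[offset:offset + 8], "little")
    offset += 8
    # ins v4.d[1], x11: place x11 into the high 64-bit lane of v4
    v4 |= x11 << 64
    return v4, offset


# Illustrative buffer; any 16-byte window gives the same result both ways.
mem = bytes(range(32))
assert load_q(mem, 0) == load_split(mem, 0)
assert load_q(mem, 16) == load_split(mem, 16)
```

The model only captures functional equivalence. The performance benefit claimed in the commit message comes from how the A55 issues 64-bit loads, which a model like this cannot show.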

6484 commits