onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00

Author	SHA1	Message	Date
Changming Sun	2d2eebb844	Correct a comment "WINVER=0x0602" means Windows 8. source: https://docs.microsoft.com/en-us/cpp/porting/modifying-winver-and-win32-winnt?view=msvc-170	2022-03-11 11:42:41 -08:00
Ryan Lai	2e7592ddf8	avoid using LocalFree on FormatMessageW buffer (#10796 ) * remove local free * Remove local free from onnxruntime * don't allocate * Change to use constexpr to satisfy CPU build warning	2022-03-11 11:11:40 -08:00
Kotaro Yamamoto	64556888a1	add python binding for RunOptions config entry (#10694 )	2022-03-11 08:49:22 -08:00
pengwa	d478a53d43	don't clear grad_fns & add test (#10671 )	2022-03-11 14:31:54 +08:00
Edward Chen	1a62306db7	Use separate build directories for full and mobile iOS packages. (#10835 )	2022-03-10 19:33:06 -08:00
Chun-Wei Chen	5202efd11e	remove unused six in code and CIs (#10832 )	2022-03-10 15:38:44 -08:00
Changming Sun	f87a06cd96	Patch absl so that it doesn't disable important VC++ warnings (#10836 ) This PR is just for making onnxruntime pass BinSkim rules. Below is how I made it: (1) git clone the absl repo and check out the version we are using; (2) apply our patch file; (3) make modifications; (4) regenerate the patch file with "git diff > C:\src\onnxruntime\cmake\patch\xxx.patch"; (5) submit the change to our repo. You will need to repeat the steps when you need to advance the absl commit or add more changes to it.	2022-03-10 15:35:39 -08:00
Pranav Sharma	97ae44d060	Mark end of version 11 C API. (#10803 ) * Mark end of version 11 C API * Add static_assert	2022-03-10 15:11:02 -08:00
Abhishek Jindal	3ae2bfaefe	Abjindal/torch api change gelu (#10833 ) * changing gelu backward op and adding required files * cleaning up file and adding comments * version comparison issue	2022-03-10 11:56:30 -08:00
Dmitri Smirnov	1d545dfe87	Address performance issue with abseil flat_hash_table. (#10819 ) When returning by value in a cross DLL call, the hash table even though containing all the entries that are originally there can not find at least some of them. Reverting to std::unordered_set pending further investigation.	2022-03-10 09:49:55 -08:00
Hariharan Seshadri	e80ff63274	Fix bug in MemcpyToHost (#10816 )	2022-03-10 07:02:27 -08:00
Ryan Hill	9853eaa14f	Detect runtime CUDA JIT and warn the user (#10781 ) * Use cudaMalloc vs cudaDeviceSynchronize and show the total time	2022-03-09 19:15:16 -08:00
Changming Sun	cc3a3476ed	Uninstall onnxruntime-training before running local tests (#10827 ) * Uninstall onnxruntime-training before running local tests	2022-03-09 18:45:04 -08:00
zhangyaobit	9cbcc93e03	Add micro-benchmarks for Attention and SkipLayerNormalization ops. (#10798 ) * Add micro-benchmarks for Attention and SkipLayerNormalization ops. * Add choices for argument provider and precision. * Automatically select CUDA or ROCM execution provider.	2022-03-09 18:18:51 -08:00
Abhishek Jindal	1c313f4476	changing gelu backward op and adding required files (#10813 ) * changing gelu backward op and adding required files * cleaning up file and adding comments	2022-03-09 16:54:51 -08:00
Edward Chen	0293e525ea	Make QDQSelectorActionTransformer() is_int8_allowed parameter required. (#10820 ) Make QDQSelectorActionTransformer() is_int8_allowed parameter required. Set it to QDQIsInt8Allowed() in places it was previously set to false.	2022-03-09 16:19:43 -08:00
Changming Sun	cc6bc34c8c	Update protobuf submodule (#10801 )	2022-03-09 09:37:58 -08:00
Dmitri Smirnov	58521fb822	Make training CUDA kernels adhere to established code structure patterns (#10735 ) Current training optimizer kernels include CPU headers, which limits the changes that we can make in the CPU code with a C++14 compiler and other refactoring efforts. Rearrange the kernels according to the established patterns and do not include headers that are not needed.	2022-03-09 09:06:45 -08:00
Adam Pocock	4ef81b142d	Making the Java tests faster by optionally disabling ones which require running multiple JVMs. (#10811 )	2022-03-08 22:19:37 -08:00
Hariharan Seshadri	ae97ecf05b	Fix CPU, CUDA Selu activation logic (#10771 )	2022-03-08 19:53:27 -08:00
Edward Chen	c147c9dda6	Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD. (#10778 ) Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD as it is now implied by ORT_EXTENDED_MINIMAL_BUILD. Remove related CMake option.	2022-03-08 16:18:49 -08:00
George Wu	769aa8363d	update onnx-tensorrt to bring in https://github.com/onnx/onnx-tensorrt/pull/812 (#10810 )	2022-03-08 14:51:07 -08:00
Jingqiao Fu	f4fd67cc2c	Revert "add load from buffer (#10162 )" (#10590 ) This reverts commit `5cd57bb726`.	2022-03-08 13:35:23 -08:00
dependabot[bot]	7e04dccca7	Bump numpy in /tools/ci_build/github/linux/docker/scripts (#10385 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.16.6 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.16.6...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-03-08 11:02:36 -08:00
Sunghoon	68c8f5a1ef	Change a pipeline vmImage from windows-latest to windows-2019 (#10804 )	2022-03-08 10:49:59 -08:00
Yufeng Li	33c6819196	add qdq support of Sigmoid (#10800 )	2022-03-08 10:29:15 -08:00
Changming Sun	6260733533	Fix eager mode pipeline (#10802 ) It was still using python 3.6	2022-03-08 09:26:20 -08:00
Hariharan Seshadri	a9d9c6b486	Register CPU, CUDA and ROCM opset-16 kernels for some operators (#10643 )	2022-03-08 09:18:39 -08:00
Changming Sun	ce07dc30fd	Change how we apply patches to absl (#10799 )	2022-03-08 02:03:06 -08:00
George Wu	1e4a4bfe58	update onnx-tensorrt reference. (#10795 )	2022-03-07 21:45:46 -08:00
liqun Fu	da885a72e8	update with onnx 1.11 release (#10441 )	2022-03-07 21:10:55 -08:00
Yulong Wang	80917342b7	[js] upgrade mocha@8.2.1 to 9.2.1 (#10793 )	2022-03-07 20:40:24 -08:00
dependabot[bot]	4d943c9bd3	Bump numpy from 1.16.6 to 1.21.0 in /tools/ci_build/github/linux/docker/scripts/manylinux (#10387 ) * Bump numpy in /tools/ci_build/github/linux/docker/scripts/manylinux	2022-03-07 20:39:49 -08:00
PeixuanZuo	c07a27a008	[FIX] delete python3.6 from AMD python package docker image builder (#10790 ) * [UPDATE] delete python3.6 to cooperate numpy==1.21.0 * [UPDATE] delete python3.6 to cooperate numpy==1.21.0	2022-03-07 18:21:43 -08:00
Vincent Wang	4a38f9e31d	enable strided tensor for training only (#10748 )	2022-03-08 08:31:28 +08:00
zhangyaobit	b7f00b9682	Refactor the common code per operator into an abstract base class. (#10785 )	2022-03-07 13:15:49 -08:00
Daigo HIROOKA	a08036da09	correct symbolic name of GridSample operation (#10782 ) Function name needs to match PyTorch ATen op name, which is `aten::grid_sampler`.	2022-03-07 12:49:12 -08:00
dependabot[bot]	3e54f94bb0	Bump karma from 6.3.14 to 6.3.16 in /js/web Bumps [karma](https://github.com/karma-runner/karma) from 6.3.14 to 6.3.16. - [Release notes](https://github.com/karma-runner/karma/releases) - [Changelog](https://github.com/karma-runner/karma/blob/master/CHANGELOG.md) - [Commits](https://github.com/karma-runner/karma/compare/v6.3.14...v6.3.16) --- updated-dependencies: - dependency-name: karma dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>	2022-03-07 11:47:23 -08:00
Yulong Wang	25fdcfbd14	[js/web] allow multiple inference session creating concurrently (#10784 ) * test case * bugfix * fix * support multi session init	2022-03-07 11:35:06 -08:00
RandySheriffH	a4b5fa334a	Add type and shape information to profiled numbers (#10773 ) * add func to collect type shape * reformat * refactor perf view * remove obsolete	2022-03-07 10:17:58 -08:00
Changming Sun	d8bf9a479b	Remove python 3.6 from training pipelines (#10780 ) Because the numpy we use doesn't support python 3.6. And inference pipelines already removed python 3.6.	2022-03-07 09:57:24 -08:00
Hariharan Seshadri	9d30262422	Fix AMD training pipeline (#10788 )	2022-03-07 08:53:08 -08:00
Chen Fu	50a6f095cd	Symmetric QGEMM kernel for ARMv8 A55 chip (#10754 ) The ARM A55 micro-architecture (with dot product instructions), similar to the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware, where a 128b load instruction blocks the pipeline for 2 whole cycles, during which no other instructions can be executed. On the other hand, a 64b load instruction can be dual-issued with many other instructions. This change adds a Symmetric QGEMM kernel for the A55 micro-architecture, where we replace `ldr q4,[x1],#16` with the sequence `ldr d4,[x1],#8`; `ldr x11,[x1],#8`; `ins v4.d[1],x11` so that we can try to hide the memory load cycles behind computing cycles in the kernel. Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-03-07 08:41:13 -08:00
PeixuanZuo	55af7a96a7	update the amd ci pipeline (#10723 ) * [TEST] test to get amd pipeline information * [FIX] lower the threshold * [UPDATE] add retry task * [UPDATE] add retry task * [ERROR] error to occur retry * [FIX] error * [UPDATE] update retryCountOnTaskFailure to 1 time * [UPDATE] add showmeminfo	2022-03-07 18:39:42 +08:00
Fei Hu	60acfd3dd8	Support CUDA Graph in the CUDA EP (#9978 )	2022-03-06 20:47:31 -08:00
Tianlei Wu	0e335aba37	Update BeamSearch operator spec to support t5 (#10777 ) * change BeamSearch op to support encoder decoder model * check model_type and decoder attribute * fix * update comments * warn shape inference issue with onnx v1.11 or T5 * skip parity test when temperature != 1.0 * fix build	2022-03-04 21:52:45 -08:00
George Nash	6be5185088	Update dnnl Add, Mul, Sub, Div ops to handle scalar values (#10756 ) * Update dnnl Add, Mul, Sub, Div ops to handle scalar values Signed-off-by: George Nash <george.nash@intel.com> * Add additional scalar support for dnnl execution provider This will add scalar support for: Eltwise operators: Abs, Elu, Exp, LeakyRelu, Log, Relu, Round, Sigmoid, Softplus, Sqrt, and Tanh Gelu operators: BiasGelu, FastGelu, and Gelu Softmax operator Signed-off-by: George Nash <george.nash@intel.com>	2022-03-04 19:28:25 -08:00
Ye Wang	259ade2557	Add ability to modify num_hidden_layers from benchmark script (#10760 ) * add ability to modify num_hidden_layers from benchmark script * comment * Revert "comment" This reverts commit 28794b0e4f86506dcc937738894fcef97fc84e48. * Revert "add ability to modify num_hidden_layers from benchmark script" This reverts commit 96f36ed7f751721bcf4e3ab8748a715f19a4e044. * review comments Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-03-04 18:28:51 -08:00
Ella Charlaix	fde847473b	Add min max moving average calibration method (#10753 ) * Add min max moving average calibration method * Modify the calibration extra options dictionary creation	2022-03-04 14:55:31 -08:00
Maxiwell	43ff27c7c8	ppc64le: optimizing the MlasQuantizeLinear() with VSX (#10644 ) This code is valid only when -mcpu is set to utilize POWER9 technology or above. A compatible code for POWER8 was created as well, but it was not tuned for performance.	2022-03-04 14:54:56 -08:00


6493 commits