onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-26 03:00:54 +00:00

Author	SHA1	Message	Date
Justin Chu	3d2bcb3386	Use unregister_custom_op_symbolic to unregister torch symbolics (#12146 ) Description: Use unregister_custom_op_symbolic to unregister torch symbolics Motivation and Context Fixes #11305	2022-07-21 10:47:53 -07:00
Rachel Guo	496618594f	Update supported ops md for NNAPI/CoreML EP (#12245 ) * update supported ops md * address pr comments * address pr comments * wording	2022-07-21 10:23:08 -07:00
LironKesem	7dc45bc311	Implementing aten::gt.Scalar_out and aten::lt.Scalar_out (#12181 ) * Implementing aten::gt.Scalar_out and aten::lt.Scalar_out * modified the code according to code review	2022-07-21 10:36:43 -04:00
Yi Zhang	007ef42749	Fix: Test coverage is undercounting and profiling errors (#12260 ) add data relocation for onnx_test_runner	2022-07-21 16:19:24 +08:00
Ye Wang	5066ef1185	Fix a bug in beam search custom attention mask allocation (#12240 )	2022-07-20 23:42:54 -07:00
Yulong Wang	0c78b71352	prepare test folder from GitHub (#12220 ) * consume onnx test data from github * ensure tests * update script and allow opset specification * fix python format * fix python format * consume new filter format * fix linting error	2022-07-20 22:01:08 -07:00
Tianlei Wu	568d08994f	fix test_optimizer.py (#12219 ) * fix optimizer test * update message and skip test instead of uncomment * fix deprecated warning	2022-07-20 19:21:26 -07:00
101arrowz	c72bb8aaa9	[js/web] add OffscreenCanvas support to WebGL backend (#12159 ) * Add OffscreenCanvas support to WebGL backend * fix format * fix lint	2022-07-20 14:06:03 -07:00
Rachel Guo	471dbfc250	[NNAPI] Add int32_t as supported input data type and other minor gather op updates (#12171 ) * update (including commented out code for gather) * update tests etc. * update * minor updates * fix typo * fix build * minor update * address pr comment * refine comments * address pr comment * update condition check and UTs * refine code comments * address lint warning	2022-07-20 12:07:46 -07:00
Tianlei Wu	5651d91c32	Fix onnx version comparison (#12223 ) use version.parse to compare version	2022-07-20 11:14:06 -07:00
Jian Chen	43e1e89453	Update aarch64 building pool to aiinfra-linux-ARM64-CPU-2019 (#12243 ) * Setting new pool for arm64 * Setting default pool name * adding DockerInstaller stage * try to install docker from apt-get * change to specific * adding chmod to docker.sock * install dotnet sdk * specific dotnet 3.1.x * add manual step to install dotnet * typo bass * remove inputs * change dotnet installation dir * skipComponentGovernanceDetection on arm64 linux * variables typo * variables: - name: skipComponentGovernanceDetection value: true * update variables * skipComponentGovernanceDetection set to true * moving variables * moving the variables again * setting condition on cgd * indentation * indentation again * conditional variable * if * remove cgd * conditional on cgd * condition * parameters * clean up	2022-07-20 12:08:02 -04:00
msftlincoln	424120d0fa	cpplint & Eager mode: refactor and add comments to empty_* functions, general lint cleanup in ort_aten (#12238 ) * empty* comments and code reuse * lint * more cpplint * add cpplint settings * test empty	2022-07-20 11:47:57 -04:00
Vincent Wang	72c689a502	[CUDA] Use dim3.z to Handle Large Input For GatherGrad (#12250 ) * use dim3.z to handle large input size * less blocks	2022-07-20 18:42:52 +08:00
pengwa	ebfd81e67e	Fix BiasGeluGrad bug (#12200 ) * use 3D grid to avoid the upper limit of grid dimension * enrich tests * Revert "use 3D grid to avoid the upper limit of grid dimension" This reverts commit 2d5badf2fe8cd985f3f29ee2cb18fff13d07c2ab. * change to a fix: switch the 1st and 2nd dim	2022-07-20 17:59:29 +08:00
Vincent Wang	3cdc6d7775	[ORTModule] Bugfix of torch.chunk's Custom Symbolic when chunks==1 (#12249 ) handle custom chunk with chunks==1	2022-07-20 17:00:41 +08:00
cloudhan	a0074ba9bc	Add baseline gemm for kernel explorer (#12050 ) Use rocblasGemmHelper gemm wrapper from ORT and profile for bert param size only.	2022-07-20 13:49:26 +08:00
mindest	add631410a	[ROCm] Re-enable ReduceL1, L2 and related tests (#12209 ) Re-enable ReduceL1,L2 and related tests	2022-07-20 13:13:02 +08:00
Juan Paez	9b6ef17c5f	Eager opgen support for in-place operations with variadic args (#12125 ) * use torch library binding frontend for tensorlist * fix test * allow in-place modification of variadic args * fix lint issues * update ORT eager readme Co-authored-by: Juan Paez <juanpaez@microsoft.com>	2022-07-19 21:01:00 -07:00
Xinya Zhang	5e2109f7ef	[ROCm] Enable GridSample Op. (#11969 )	2022-07-19 20:44:30 -07:00
Dmitri Smirnov	4f106d2b3b	Eliminate unnecessary status lock acquisition in TP (#12196 ) Eliminate unnecessary status lock acquisition in the Thread Pool	2022-07-19 14:16:12 -07:00
Tianlei Wu	972e5e7300	Improve symbolic shape inference in transformers tools (#12217 ) improve symbolic shape inference handling in transformers tools: avoid infinite loop and suppress duplicated warnings	2022-07-19 13:27:35 -07:00
Jameson Miller	975bb56e8c	Eager mode - argmax_out: set output tensor (#12233 ) This change updates the implementation of the argmax_out operator to 1) set the output tensor correctly and 2) remove the unnecessary use of a temporary tensor to store the intermediate result of the onnx ArgMax operation. Previously, the argmax_out operator did not correctly update the out tensor - it replaced the OrtValue instead of the memory backing the OrtValue. To properly update the output tensor, we need to calculate the expected shape of the out tensor. We add the helper function calculate_reduction_shape to calculate the shape of the reduced tensor from the input tensor, the dimension to reduce, and the option to keep the reduced dimension or not. This is based on the utility functions in aten/src/ATen/native/ReduceOpsUtils.h in the PyTorch repository, but is tailored to be a bit more specific to our current needs. Notes: We considered directly leveraging PyTorch's utility functions (e.g. get_reduction_shape) from aten/src/ATen/native/ReduceOpsUtils.h to calculate the shape of the reduced tensor, but including this header file resulted in warnings around unused functions that we would need to handle. As we only need limited functionality at the moment, we instead implemented our own utility function to calculate the reduction shape for our specific current needs. If we need a utility function to calculate the reduction shape more generally, we could consider switching to the utility methods in PyTorch.	2022-07-19 14:37:03 -04:00
Dmitri Smirnov	555e88982f	Fix GH issue 12208 (#12224 )	2022-07-19 10:03:43 -07:00
Changming Sun	2cb642927b	Simplify get_docker_image.py (#12166 ) Simplify get_docker_image.py by leveraging Docker's own remote cache functionality.	2022-07-19 09:53:01 -07:00
Tianlei Wu	0c319d6e94	Exclude implicit inputs from dump of encoder feeds in beam search (#12222 ) fix encoder feeds dump	2022-07-19 09:44:12 -07:00
Alexey Gladyshev	66978c7ef5	[TVM EP][CI] Added TVMso EP testing into CI (#12188 ) * refactor test for model with undefined shapes * add test for TVMso EP * update build script for TVM EP tests * fix pylint * disable test for Windows * fix black * fix python format * fix pylint * fix python format * replace Path.resolve with os.path.join * fix python path issue Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-07-19 16:05:28 +02:00
Wil Brady	4235ebc161	Add eager mode support for mm.out (matrix multiplication). (#12214 ) * Add eager mode support for mm.out (matrix multiplication). * Fallback to cpu when mm requirements not met so cpu can print error message.	2022-07-19 07:28:48 -04:00
Michael Melesse	bb5bd08545	[ROCM] Navi21 fixes pr (#11368 ) * add scripts * update docker scripts * update build script * create run script * add test script * add log 3 flags * use the right build function * build navi * add clean script * add pytorch like soln * only build gfx 1030 * use HOST side var * ignore logs * update scripts * GPU_WARP_SIZE_HOST * update scripts * remove scripts/amd * match main * add GPU_WARP_SIZE_HOST on cuda side * match main * correct gfx1030 * remove print * move gfx add to rocm5.0 * remove inline * make constexpr on cuda side	2022-07-18 22:26:57 -07:00
Vincent Wang	173bcdbc71	[CUDA] Split/Concat Kernel Optimization (#12175 ) * split concat optimization * bugfix * fix ut * deprecate LooseVersion	2022-07-19 08:10:46 +08:00
Yulong Wang	ced7c2deac	[js/web] use windowed Chrome for perf mode (#12157 )	2022-07-18 14:04:27 -07:00
Tianlei Wu	b81b652608	Add --disable_shape_inference option to optimizer.py (#12215 )	2022-07-18 13:52:02 -07:00
Sean Murray	93229949d4	Fix bug where onnxruntime_USE_NCCL flag would default to ON (#12195 ) Fix bug where onnxruntime_USE_NCCL flag would default to ON, causing ORT to not build properly. New functionality: flag is ON when training is enabled and NCCL is not disabled. Flag is OFF otherwise	2022-07-18 12:13:08 -07:00
Tianlei Wu	17b84c78f7	remove identity in transformers model graph fusion (#12194 ) * remove identity in fusion	2022-07-18 09:59:42 -07:00
caoting-dotcom	4d38b84e26	Add file mapping for windows platform. (#12183 ) * Add file mapping for windows platform. * Add unit test for file mapping for windows. Also add an error message for mis-aligned offset * Add unit test for file mapping for windows. Also add an error message for mis-aligned offset * Update data type to avoid warnings * Compatible data type to avoid warnings. Update CreateFileMapping2 condition for winml compiling. * Add type conversion to avoid warnings for X86 release build. Co-authored-by: Ting Cao <ticao@microsoft.com>	2022-07-18 09:24:12 -07:00
leqiao-1	09af4a7fdd	remove wrong placed libs (#12201 )	2022-07-18 09:22:22 -07:00
Alexey Gladyshev	d31db1aa57	[TVM EP][CI] Integrate TVM EP into ORT public CI on Windows (#12161 ) * Integrate TVM EP into ORT public CI on Windows * empty commit for restart pylint * empty commit for restart pylint	2022-07-18 11:12:16 +02:00
msftlincoln	52095fb042	Fix line spacing/break issue, extend existing tests (#12191 ) * fix line length * extend test cases * lint	2022-07-15 19:32:34 -04:00
msftlincoln	a2dc6d32fc	OnnxRuntime Eager: Implement log_softmax with ONNX Ops (#12190 ) * share CHECK_STATUS * log_softmax	2022-07-15 15:03:08 -04:00
msftlincoln	9bca8405aa	bitwise_and ONNX support (#12189 ) * bitwise_and ONNX support * whitespace lint	2022-07-15 12:59:56 -04:00
Wil Brady	89bf6c9b5d	Simple eager training models (#12180 ) * Simple NN using ort, and added or modified ort op support.	2022-07-15 09:18:00 -04:00
msftlincoln	fafb24142f	add comment to explain local scalar dense (#12179 ) * add comment to explain local scalar dense * spacing	2022-07-15 09:03:43 -04:00
Viswanath Boga	05c31a036d	fixing positions for beam search gpt2 (#12156 ) * fixing positions for beam search gpt2 Co-authored-by: Tianlei Wu <tlwu@microsoft.com>	2022-07-14 13:31:59 -07:00
Wil Brady	9ebef91a6f	Update eager Readme.md (#12170 )	2022-07-14 06:05:50 -04:00
PeixuanZuo	7b53b223b8	[UPDATE] update AMD CI pipeline to Rocm5.2 with torch1.11 (#12162 ) * [UPDATE] update ci to rocm5.2 + torch1.11 * [Revert] disable ort module test * [DELETE] delete Rocm5.1.1 ci test result * [UPDATE] update the comments	2022-07-14 16:38:16 +08:00
Vincent Wang	a7eb9fe3ac	Remove Apex Dependency For Deepspeed FP16_Optimizer (#12077 ) * remove apex dependency * fix amd build	2022-07-14 11:15:53 +08:00
Wil Brady	5da1e5d36d	Eager mode: Fix some python warnings. (#12167 )	2022-07-13 20:24:42 -04:00
Maxiwell S. Garcia	51f8456c4d	ppc64le: Optimizing the MlasQLinearMulKernel() to use VSX instructions (#12051 )	2022-07-13 11:11:29 -07:00
Chen Fu	040c2f4517	x86/64 U8S8 Gemm Precision Fix (#12088 ) Add a graph optimization that converts u8s8 matrix multiplication to u8u8 if needed. On x86/64 platforms, SSE4.1, AVX2 and AVX512 CPUs specifically provide better performance when computing u8s8 matrix multiplications. Unfortunately, the higher performance comes with value overflow problems, as described in: https://www.intel.com/content/www/us/en/develop/documentation/onednn-developer-guide-and-reference/top/advanced-topics/nuances-of-int8-computations.html In this change we added a session option "session.x64quantprecision" (default off). For operators that call u8s8 matrix multiplications, e.g. QAttention, we convert them to u8u8 when the following conditions are all satisfied: 1. Current CPU is SSE4.1, AVX2 or AVX512 with no VNNI support 2. Session option "session.x64quantprecision" is on. 3. Constant weight tensor contains values outside of [-64, 63] range Note that when weight tensor is not constant, QDQS8ToU8Transformer should already convert it to u8.	2022-07-13 10:12:25 -07:00
Wil Brady	48647bc7d7	Fix NonZero eager impl. (#12143 )	2022-07-13 05:50:33 -04:00
Valery Chernov	3b0aaa9e0e	[TVM EP] support build on Windows (#11851 ) * add description of build ORT+TVM EP on Windows * fix cmake error related to symlink creation on Windows * add llvm config path to build flags for correct build on Windows * update TVM_EP.md for llvm_config build arg * fix warnings skipping during build on Windows * fix using string or wstring for model path to correct build on Windows (MSVC error) * fix error in custom logger for correct build on Windows * implement glob algorithm for Windows * additional build fixes * update TVM with export of VM symbols for dll * description of nasm issue and workaround * update TVM with export of Executable from VM symbols for dll * description of installation of ipp-crypto dependencies on Windows * cmake key for ipp-crypto build * fix wstring for TVMso EP * fix ipp-crypto build * cmake key onnxruntime_TVM_USE_HASH switch off not specific methods, but full hash functionality * fix absolute path to compiled lib * update TVM_EP.md, fix lint warnings * update TVM_EP.md * small fixes after review * switch on handshake functionality for Linux workflow Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2022-07-13 10:48:42 +02:00
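
Commit 975bb56e8c above describes a helper, calculate_reduction_shape, that derives the shape of the reduced tensor from the input shape, the dimension to reduce, and the keep-dimension option. The following is a minimal Python sketch of that idea only. The actual helper lives in the eager-mode C++ code, and the signature, error handling, and negative-dimension behavior shown here are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def calculate_reduction_shape(
    shape: Sequence[int], dim: int, keepdim: bool
) -> Tuple[int, ...]:
    """Shape of a tensor after reducing one dimension (as ArgMax does).

    Hypothetical Python rendering; not the signature used in onnxruntime.
    """
    rank = len(shape)
    if not -rank <= dim < rank:
        raise IndexError(f"dim {dim} is out of range for rank {rank}")
    dim %= rank  # map a negative dim onto its non-negative equivalent
    if keepdim:
        # The reduced dimension stays in place with extent 1.
        return tuple(1 if i == dim else s for i, s in enumerate(shape))
    # The reduced dimension is dropped entirely.
    return tuple(s for i, s in enumerate(shape) if i != dim)


print(calculate_reduction_shape((2, 3, 4), 1, keepdim=False))
print(calculate_reduction_shape((2, 3, 4), -1, keepdim=True))
```

Knowing the target shape up front is what lets an out-variant operator resize and fill the existing output buffer rather than swap in a new value, which is the bug the commit message describes.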


7863 commits
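
Commit 040c2f4517 above converts u8s8 matrix multiplications to u8u8 only when a constant weight tensor holds values outside [-64, 63]. The sketch below illustrates why that range is plausible, assuming the overflow comes from summing two adjacent u8 x s8 products into a saturating 16-bit intermediate, which is how the linked Intel note is understood here. The commit message itself says only that values overflow. Both function names are hypothetical and do not appear in onnxruntime.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def saturating_pair_dot(u8_pair, s8_pair):
    """Return (saturated, exact) for u8[0]*s8[0] + u8[1]*s8[1].

    Models an assumed 16-bit saturating accumulation of two products.
    """
    exact = u8_pair[0] * s8_pair[0] + u8_pair[1] * s8_pair[1]
    saturated = max(INT16_MIN, min(INT16_MAX, exact))
    return saturated, exact


def weights_outside_safe_range(weights, low=-64, high=63):
    """True if any s8 weight falls outside [low, high] (condition 3 above)."""
    return any(w < low or w > high for w in weights)


# Worst-case activations (255) with full-range weights lose precision...
print(saturating_pair_dot((255, 255), (127, 127)))
# ...while weights at the edges of [-64, 63] still fit in 16 bits.
print(saturating_pair_dot((255, 255), (63, 63)))
print(saturating_pair_dot((255, 255), (-64, -64)))
print(weights_outside_safe_range([-64, 0, 63]), weights_outside_safe_range([64]))
```

Under this assumption, the largest magnitude a pair can reach with weights in [-64, 63] is 2 * 255 * 64 = 32640, which stays inside the 16-bit range, so such weight tensors need no conversion.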