onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-09 17:28:58 +00:00

Author	SHA1	Message	Date
Wil Brady	1163294699	Fixing up some python warnings. (#12319 )	2022-07-27 07:24:37 -04:00
Adam Louly	f3dcbf539a	Checkpoint load inference (#12168) * LoadCheckPoint to tensor cpp functions (draft) * Load Checkpoint into inference model * fix python lint * fix python lint * Fixing lint and some unused imports * added assert for zero weights model, resolved other issues * resolved issues * Solved issues * changed variable names for get_models * fix mismatched parameter names Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-07-26 11:08:50 -05:00
Wil Brady	de57daaab0	Eager mode: binary ops more complete behavior and testing. (#12293 ) * Remove hand written add_.Tensor as it can now be generated. * Generate .out for tensor version of basic math ops. Add.out testing added too. * Remove sin tests as they are covered by parameterized tests. Also, moved all parameterized tests to the end in their own section. * Add binary ops tests for tensors. Scalar tests are calling the aten .out which is for tensor. * Add support for scalar input to add, div, mul, and sub.	2022-07-26 09:14:57 -04:00
Vincent Wang	c40f73ae0c	Remove aten::binary_cross_entropy_with_logits from ATen Fallback (#12301 )	2022-07-26 07:29:56 +08:00
Dmitri Smirnov	3bf614fd47	Eliminate memory allocations per recent profiling (#12225 ) * Alloc begin FeedsFetches refactoring Refactor Tensor class Fix buffer deletor Remove new/delete deleted Adjust alloc move Fix up xnnpack provider Clarifying the comment on Create()	2022-07-25 14:14:38 -07:00
Baiju Meswani	ddb45e9126	On device training CI pipeline (#11987 )	2022-07-25 10:07:17 -07:00
Jameson Miller	8d0e86dec8	Apply project formatting rules to ort_aten.cpp (#12294 ) * Apply project formatting rules to ort_aten.cpp Formatting applied by formatting the file in VS Code. This file is under active development and the inconsistent formatting was causing friction due to: 1. cpplint job on Pipeline was flagging a lot of style issues, resulting in a lot of noisy annotations. 2. local edits would result in changes that are not part of the core change. While there are other files in this part of the source tree with inconsistent formatting, this file was causing the most friction. We can come back and address the other files later, which would be a much larger change. * Apply consistent pattern for invoker.Invoke(...)	2022-07-25 07:26:35 -04:00
Vincent Wang	0fa3aeb65c	[CUDA] Add Strided Tensor Support for Expand->GatherElements for Training (#11976 ) * strided tensor for expand and gather_elements * bugfix * simplify CoalesceDimensions * resolve comments * resolve more comments.	2022-07-25 16:05:26 +08:00
pengwa	75bda9f267	CPU AdamW implementation (#11978) * cpu adamwoptimizer implementation * unit tests for cpu kernel pass * refine based on comments * parallelize the weights loop in PrepareForCompute. * fix wrong test data path * fix kernel hash * fix rocm ci pipeline	2022-07-25 09:43:52 +08:00
Juan Paez	4f57da78cf	OrtModule fix pytorch version comparison (#12280 ) * fix torch version comparison * remove patchfile Co-authored-by: Juan Paez <juanpaez@microsoft.com>	2022-07-22 09:11:28 -07:00
pengwa	feabafe58b	Fix memory consumption discrepancy (#12266) * release cached cuda memory after temp model_copy run * op schema change only: remove PythonOp forward output from PythonOpGrad inputs. * always export model using torch.no_grad * 1. update PythonOP's "input_requires_grads" attribute according to ORT gradient graph. 2. remove PythonOp's "output_tensor_requires_grads" attribute because in torch.no_grad mode, the exported value is not correct. 3. [related to 2] remove PythonOPGrad's "input_tensor_requires_grads" because it comes from corresponding PythonOP's "output_tensor_requires_grads". * fix uts * refine based on wschin's comments && fix pylint * fix comments * fix unused variable	2022-07-22 16:55:50 +08:00
Ashwini Khade	ceb76429db	Merge pull request #12056 from microsoft/bmeswani/merge-training_dev/on_device_poc Merge On-Device-Training Offline Tooling and C/C++ APIs	2022-07-21 15:09:48 -07:00
Wil Brady	45c0be8a25	Modify generator for eager to use all inputs for determining promote type. (#12268 ) * Sort supported types order so we get a consistently generated order of types. * Fix promote type to include all the input types and not just the first one.	2022-07-21 17:21:10 -04:00
Baiju Meswani	cbf08c7a7b	Make GetTrainingApi as a part of the OrtApis, add Training API documentation and address other pull request review comments	2022-07-21 18:11:48 +00:00
LironKesem	7dc45bc311	Implementing aten::gt.Scalar_out and aten::lt.Scalar_out (#12181 ) * Implementing aten::gt.Scalar_out and aten::lt.Scalar_out * modified the code according to code review	2022-07-21 10:36:43 -04:00
msftlincoln	424120d0fa	cpplint & Eager mode: refactor and add comments to empty_* functions, general lint cleanup in ort_aten (#12238 ) * empty* comments and code reuse * lint * more cpplint * add cpplint settings * test empty	2022-07-20 11:47:57 -04:00
Vincent Wang	72c689a502	[CUDA] Use dim3.z to Handle Large Input For GatherGrad (#12250 ) * use dim3.z to handle large input size * less blocks	2022-07-20 18:42:52 +08:00
pengwa	ebfd81e67e	Fix BiasGeluGrad bug (#12200 ) * use 3D grid to avoid the upper limit of grid dimension * enrich tests * Revert "use 3D grid to avoid the upper limit of grid dimension" This reverts commit 2d5badf2fe8cd985f3f29ee2cb18fff13d07c2ab. * change to a fix: switch the 1st and 2nd dim	2022-07-20 17:59:29 +08:00
Vincent Wang	3cdc6d7775	[ORTModule] Bugfix of torch.chunk's Custom Symbolic when chunks==1 (#12249 ) handle custom chunk with chunks==1	2022-07-20 17:00:41 +08:00
Juan Paez	9b6ef17c5f	Eager opgen support for in-place operations with variadic args (#12125 ) * use torch library binding frontend for tensorlist * fix test * allow in-place modification of variadic args * fix lint issues * update ORT eager readme Co-authored-by: Juan Paez <juanpaez@microsoft.com>	2022-07-19 21:01:00 -07:00
Jameson Miller	975bb56e8c	Eager mode - argmax_out: set output tensor (#12233) This change updates the implementation of the argmax_out operator to 1) set the output tensor correctly and 2) remove the unnecessary use of a temporary tensor to store the intermediate result of the onnx ArgMax operation. Previously, the argmax_out operator did not correctly update the out tensor - it replaced the OrtValue instead of the memory backing the OrtValue. To properly update the output tensor, we need to calculate the expected shape of the out tensor. We add the helper function calculate_reduction_shape to calculate the shape of the reduced tensor from the input tensor, dimension to reduce, and option to keep the reduced dimension or not. This is based on the utility functions in aten/src/ATen/native/ReduceOpsUtils.h in the PyTorch repository, but is tailored to be a bit more specific to our current needs. Notes: We considered just directly leveraging PyTorch's utility functions (e.g. get_reduction_shape) to calculate the shape of the reduced tensor from aten/src/ATen/native/ReduceOpsUtils.h in the PyTorch repository, but including this header file resulted in warnings around unused functions that we need to handle. As we only need limited functionality at the moment, we instead implemented our own utility function to calculate the reduction shape for our specific current needs. If we need a utility function to more generally calculate the reduction shape, we could consider switching to leveraging the utility methods in PyTorch.	2022-07-19 14:37:03 -04:00
Wil Brady	4235ebc161	Add eager mode support for mm.out (matrix multiplication). (#12214 ) * Add eager mode support for mm.out (matrix multiplication). * Fallback to cpu when mm requirements not met so cpu can print error message.	2022-07-19 07:28:48 -04:00
Michael Melesse	bb5bd08545	[ROCM] Navi21 fixes pr (#11368 ) * add scripts * update docker scripts * update build script * create run script * add test script * add log 3 flags * use the right build function * build navi * add clean script * add pytorch like soln * only build gfx 1030 * use HOST side var * ignore logs * update scripts * GPU_WARP_SIZE_HOST * update scripts * remove scripts/amd * match main * add GPU_WARP_SIZE_HOST on cuda side * match main * correct gfx1030 * remove print * move gfx add to rocm5.0 * remove inline * make constexpr on cuda side	2022-07-18 22:26:57 -07:00
Vincent Wang	173bcdbc71	[CUDA] Split/Concat Kernel Optimization (#12175 ) * split concat optimization * bugfix * fix ut * deprecate LooseVersion	2022-07-19 08:10:46 +08:00
msftlincoln	52095fb042	Fix line spacing/break issue, extend existing tests (#12191 ) * fix line length * extend test cases * lint	2022-07-15 19:32:34 -04:00
msftlincoln	a2dc6d32fc	OnnxRuntime Eager: Implement log_softmax with ONNX Ops (#12190 ) * share CHECK_STATUS * log_softmax	2022-07-15 15:03:08 -04:00
msftlincoln	9bca8405aa	bitwise_and ONNX support (#12189 ) * bitwise_and ONNX support * whitespace lint	2022-07-15 12:59:56 -04:00
Wil Brady	89bf6c9b5d	Simple eager training models (#12180 ) * Simple NN using ort, and added or modified ort op support.	2022-07-15 09:18:00 -04:00
msftlincoln	fafb24142f	add comment to explain local scalar dense (#12179 ) * add comment to explain local scalar dense * spacing	2022-07-15 09:03:43 -04:00
Wil Brady	9ebef91a6f	Update eager Readme.md (#12170 )	2022-07-14 06:05:50 -04:00
PeixuanZuo	7b53b223b8	[UPDATE] update AMD CI pipeline to Rocm5.2 with torch1.11 (#12162 ) * [UPDATE] update ci to rocm5.2 + torch1.11 * [Revert] disable ort module test * [DELETE] delete Rocm5.1.1 ci test result * [UPDATE] update the comments	2022-07-14 16:38:16 +08:00
Vincent Wang	a7eb9fe3ac	Remove Apex Dependency For Deepspeed FP16_Optimizer (#12077 ) * remove apex dependency * fix amd build	2022-07-14 11:15:53 +08:00
Wil Brady	5da1e5d36d	Eager mode: Fix some python warnings. (#12167 )	2022-07-13 20:24:42 -04:00
Wil Brady	48647bc7d7	Fix NonZero eager impl. (#12143 )	2022-07-13 05:50:33 -04:00
jingyanwangms	a9d0d3323e	Use updated symbolic_helper.check_training_mode (#11900 ) Co-authored-by: Jingyan Wang, Baiju Meswani	2022-07-12 17:26:06 -07:00
msftlincoln	a6fd1a3b85	Eager mode generator improvements for multiple onnx operators and extra test cases (#12111 ) * test case for masked_select * isolate variables per onnx_op, include line numbers for ORT errors * format errors * correct masked_select impl, broadcast test * node attrs naming fixed	2022-07-12 16:05:09 -04:00
LironKesem	9647a3be40	Add tests for all unary aten ops supported in eager mode (#12087) * Add tests for all unary aten ops supported in eager mode * fixing the PR draft * fixing the merge * changing eval to be at compile time * adding requirements for eager * 1. adding function to {ops}_out 2. cleaning the code and adding comments * editing the code according to code review Co-authored-by: root <root@AHA-LIRONKESE-1>	2022-07-12 08:53:19 -04:00
Wil Brady	f1047e0456	Fix minor python and cpp warnings from previous PR. (#12140) Description: In PR 12018 a few fixable python and cpp warnings were introduced that this PR cleans up. Also adding a comment on the intent of test_mul_bool and out testing on test_ones. Motivation and Context: When iterating in Python, use a list instead of a set and don't use reserved words. Fix long line in cpp. Clarify test_mul_bool intent for future developers. fill_ implements torch.ones under the covers, but in the previous PR verification of the out param was not added, so it is added here.	2022-07-11 16:18:40 -04:00
Wil Brady	418cfdc766	Update create_ort_attribute to set the tensor dimension and value correctly. Implement eager fill_ (#12018 ) * Update create_ort_attribute to set the tensor dimension and value correctly. * Eager mode support for fill_ and mm.out (mm uses mm.out).	2022-07-11 11:18:04 -04:00
Wil Brady	c04afae9a9	Add eager ops for unary ops with out. (#12106 )	2022-07-08 12:09:26 -04:00
Wil Brady	1948b7c726	Add eager support for eq and ne ops. (#12031) * Add eager support for aten::eq and aten::ne. * Add generator support for resizing output param.	2022-07-06 12:39:04 -04:00
ytaous	7b8f45dd60	[ROCm] Enable build option for autograd (#11945 ) * add autograd build option * disable UTs * disable UTs * UT-step1 * UT-step1 * UT-step2 * UT-step2 * UT-step2 * UT-step2 * UT-step2 * UT-step2 * Fix UTs * increase shm * code clean up Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-07-05 18:11:29 -07:00
Baiju Meswani	3d53ce5e7b	Remove ROCM macro for InPlaceAccumulatorV2 test	2022-07-05 21:05:20 +00:00
Baiju Meswani	1aa27e127c	Resolve build conflicts with master	2022-07-05 19:53:54 +00:00
Jameson Miller	ae88f43550	Eager mode: structure for supporting out= operators (#12066 ) * Add utility methods for resize_output * Eager mode: implement abs.out This is an initial hand written implementation of an out= operator to demonstrate how to structure out= methods using resize_out helper methods. This is meant to be used as a reference when we update the code generator to generate implementations for out= operations.	2022-07-01 13:35:12 -04:00
Jameson Miller	3e6b8d159a	Eager mode: implement resize_ operation (#12004) Add support for PyTorch `resize_` operation. The PyTorch API method is documented here: https://pytorch.org/docs/stable/generated/torch.Tensor.resize_.html Implementation notes: There are some implementation details that might deviate from expectations: - As the onnxruntime::tensor does not support a resize operation, this functionality is supported on the TensorImpl by swapping out the backing tensor if the size changes. - In the ORT model the shape of the TensorImpl is defined by the backing onnxruntime::tensor, so it is not supported to have a TensorImpl with a different shape / size than the backing onnxruntime::tensor. This means that when resizing to a smaller TensorImpl, where other implementations might keep the same backing storage, ORT will re-allocate a new onnxruntime::tensor and copy over as many of the existing elements as fit. Functionally, you will end up with the same output, but the underlying buffer will be re-allocated. A future change could be to allow ORTTensorImpl to have a different size / shape than the onnxruntime::tensor backing it, and then we could improve this behavior. The canonical CPU / CUDA implementations in the PyTorch repository: CPU: aten/src/ATen/native/Resize.cpp CUDA: aten/src/ATen/native/cuda/Resize.cpp	2022-06-30 22:14:37 -04:00
Baiju Meswani	a457ddc41d	Merge branch 'master' of https://github.com/microsoft/onnxruntime into bmeswani/merge_pr	2022-06-30 21:53:07 +00:00
Wil Brady	0fa2041f68	Add eager support for aten::equal. (#12020)	2022-06-30 15:46:14 -04:00
Vincent Wang	04f7c2deda	FP16_Optimizer Support for more Deepspeed Versions (#12046 ) * fp16_optimizer for more ds versions * change ds version * bugfix * fix bug	2022-06-30 18:36:17 +08:00
zhijxu	9f260fb60f	resolve comments	2022-06-30 11:26:13 +08:00

1037 commits