onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-01 23:30:35 +00:00

Author	SHA1	Message	Date
Wil Brady	b0e027c661	Add aten::_softmax to eager ops. (#11820 )	2022-06-13 13:05:26 -04:00
Vincent Wang	f745eb1d3f	fix gradient ut (#11797 )	2022-06-10 12:14:19 +08:00
Vincent Wang	5ecfaef042	ATen Fallback for Inference (#11597 ) * aten op for inference * fix build error * more some code to training only * remove domain from operator name * move aten_op_executor ext out from ortmodule * add pipeline * add exec mode * fix script * fix ut script * fix test pipeline * failure test * rollback * bugfix * resolve comments * enable aten for python build only * fix win build * use target_compile_definitions * support io binding * turn off aten by default * fix ut Co-authored-by: Vincent Wang <weicwang@microsoft.com> Co-authored-by: zhijxu <zhijxu@microsoft.com>	2022-06-09 16:07:30 +08:00
PeixuanZuo	908e19dc16	[FIX] using torch.version.cuda/hip to ensure build ORTModule Torch C++ CUDA extension for docker build (#11675 ) * [FIX] cpp ext * Update orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/install.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * [FIX] fix python format Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2022-06-07 07:51:26 +08:00
Changming Sun	3c1dd9514d	Revert "fixed point based requantization on arm64 (#11540 )" (#11732 ) This reverts commit `1f2c926`. Because it makes our packaging pipeline crash Error message: [ RUN ] QLinearConvTest.Conv3D_S8S8_Depthwise Test #1: onnxruntime_test_all ...................Subprocess killed***Exception: 838.24 sec We haven't successfully reproduced the bug on a real ARM64 hardware. Currently we only saw it showed up with qemu. More investigations are on-going.	2022-06-03 19:12:25 -07:00
Yufeng Li	1f2c92673b	fixed point based requantization on arm64 (#11540 ) * fixed point based requantization on arm64 * reverse MlasConvSymDepthwiseKernel u8s8 and s8s8 order	2022-06-02 12:34:17 -07:00
Vincent Wang	54d1573d2f	[ORTModule] Enable SimplifiedLayerNormalization Fusion (#11580 ) * enable SimplifiedLayerNormalization fuse * remove allow_layer_norm_mod_precision flag	2022-06-01 15:09:39 +08:00
pengwa	44f7b1bf2c	MTA AdamWOptimizer (#11506 ) * skeleton change * adam compute kernels * add rtol/atol for tests * some clean up * optional outputs * more clean up * add tests * adamw mode=1 test pass * clean up tests * add HF AdamW test cases * refactor adam test file * make test pass * all test pass, fix comments * rename to adamw * make test pass again * fix cpplint * minor fixes * fix python lint * Fix build and tests * fix builds * fix windows build * fix win build * minor fix * Refine based on comments * resolve comments * formatting * resolve comments * add ut	2022-05-27 19:52:04 +08:00
Vincent Wang	02724c54ff	[CUDA] Implement BitmaskDropout, BitmaskBiasDropout and BitmaskDropoutGrad (#11534 ) * Implement BitmaskDropout and associated unit tests. * Implement BitmaskDropoutGrad and associated unit tests. * Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests. * Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule. This commit does not yet include unit tests for this rewrite rule. This commit also introduces improved documentation for all changes which will be grouped into this PR. * bitmask dropout * fix win build * bugfix for rocm * bugfix * fix code format * fix ut * fix build break * fix ut in win * resolve comments * fix ut in trt * resolve comments * fix rocm build error * fix typo Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>	2022-05-27 17:24:47 +08:00
Vincent Wang	eadb1a3128	Speed Up GradientChecker Running (#11579 ) * fix gradient tester * test size adjust * fix win build	2022-05-27 15:14:53 +08:00
Thiago Crepaldi	427230431a	Fix torch cpp ext build when CPU wheel is installed but GPU card is present (#11608 ) * Fix torch cpp ext build when CPU wheel is installed but GPU card is present Also there is a minor improvement for ATen operator that allows both "::op" and "aten::op" name for operators * Fix flake8 false positive	2022-05-25 09:44:26 -04:00
PeixuanZuo	a67994316a	Update rocm ci to ROCm5.1.1 + torch1.10.0 * [UPDATE] update amd ci pipeline 2 rocm5.1.1 * [FIX] json format error * [ERROR] disable unit tests * [FIX] ucx error * [FIX] cmake version * [FIX] units test	2022-05-20 11:07:21 +08:00
Vincent Wang	436c4f9b79	Add BFloat16 (bf16) support for ATen (#11546 ) Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2022-05-19 10:04:08 -04:00
Vincent Wang	084165c748	Change MinGrad/MaxGrad to Use Distributed Logic (#11388 ) * change min max grad * resolve comments	2022-05-05 11:49:40 +08:00
Tang, Cheng	ae043e3963	Support ort device tensor in ortmodule's inference (#11112 ) * support ort device tensor in ort module inference * fallback aten equal to cpu; add ortmodule inference test case * fix python format Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-05-03 14:28:30 -07:00
Changming Sun	5023f6750b	Revert "Call pluggable EP's shutdown function in Environment::~Environment() (#11120 )" (#11393 ) This reverts commit `4983d6e5d6`. We can't destroy OrtEnv through python's atexit function, because at that time there might be many other ORT python objects alive.	2022-05-02 14:38:31 -07:00
G. Ramalingam	024747bff4	Allow int32 as shape type (#11345 ) Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>	2022-05-02 10:10:30 -07:00
Tang, Cheng	4b875e3543	Re-implment the function support in onnxruntime (#11167 ) * initial fix * refactor the function handle * update the implementation * fix linux build break * fix training build * fix minmal build * fix gradient checker * deprecate the local function members in graph. host it in model * fix changming's comments * fix comments about inlined containers * fix a missed inlined container * fix training build * avoid const for std string_view Co-authored-by: Cheng Tang <chenta@microsoft.com>	2022-04-29 10:15:58 -07:00
mindest	c8270c2940	Add ATen export and gradient for torch.max/min (#11275 ) * add aten export for max, max.dim * rewrite grad of max (no dim); add cases for min * update UT cases * mod sym shape infer * resolve comments: shape infer, add comments, etc. * add test for torch.max of two tensors * resolve peng's comments: keepdim; test case * correct python format * fix recently introduced lint error	2022-04-28 17:30:33 +08:00
Vincent Wang	1c64351e09	Create Tensor with Strides (#11294 ) * create tensor with strides * resolve comments * refactor Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2022-04-28 16:49:37 +08:00
Justin Chu	d64769c38e	Set black's target version (#11370 ) Description: Set black's target version to be py37 - py310 Motivation and Context Black by default targets its format for py3.10. Since our project supports python 3.7, we need to target version to all the python versions supported. Re-ran black. 13 files reformatted.	2022-04-27 14:52:19 -07:00
Justin Chu	fdce4fa6af	Format all python files under onnxruntime with black and isort (#11324 ) Description: Format all python files under onnxruntime with black and isort. After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame. #11315, #11316	2022-04-26 09:35:16 -07:00
Gary Miguel	7aa4af238a	Add strict_shape_type_inference config option (#11081 ) Prior to this, certain shape and type errors were surfaced only when the model was using the latest known op set version. Providing users an explicit option allows for better testing of code that produces models, which includes unit tests within this repo and other repos such as the TF-ONNX and PT-ONNX converters. Remove the previous behavior which seems quite counter-intuitive: an otherwise identical model with a later op set version should be treated identically in this regard. The option defaults to false to avoid causing errors for users that rely on the previous permissive behavior. Turned on the strict enforcement by default in OpTester, which revealed a few disagreements between ORT and ONNX on what the correct output shape should be. Fix shape inference bug in ReduceSumTraining with noop_with_empty_axes=1 which was revealed. Fix TensorOpTest.Unsqueeze_scalar, which was testing negative axes on an op set version where the op did not actually support negative axes. Fixes #9506.	2022-04-21 08:32:40 -07:00
Vincent Wang	06026fe8e6	SizeInBytes Fix for Strided Tensor (#11224 ) * SizeInBytes Fix for Strided Tensor * resolve comments	2022-04-19 15:13:00 +08:00
pengwa	9765ef8b4e	fix build warnings (#11213 ) * fix build warning	2022-04-18 21:09:09 +08:00
ashbhandare	ddb17294b2	Fix gradient builder for Cast (#11008 ) * fix grad builder for cast * reviw comments Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-04-12 16:08:21 -07:00
Vincent Wang	bcc62e0cbf	move some process out of training step (#11150 )	2022-04-08 17:30:11 +08:00
Changming Sun	4983d6e5d6	Call pluggable EP's shutdown function in Environment::~Environment() (#11120 ) I disabled some tests temporarily. I will move them to a separated executable file in another PR. In the future, I want to combine onnxruntime::Environment and OrtEnv classes. Now we have 3 env classes, it is too confusing: 1. onnxruntime::Env 2. onnxruntime::Environment 3. OrtEnv Our python binding uses onnxruntime::Environment, while all other language bindings use OrtEnv. So python doesn't unload EPs but the others do. It's better to make them consistent. Please note even I added the call, currently the unload function still is a no-op on Linux. So, currently on Windows we must unload the EPs while on Linux we must not do it.	2022-04-07 14:11:29 -07:00
Dmitri Smirnov	2700261f7c	Provide an API to supply external initializers data from user buffers (#11109 ) Imlpement AddExternalInitializers	2022-04-07 12:21:53 -07:00
Xavier Dupré	3f42665a40	Improve transfered time from ort to torch (#9610 ) * Improve transfered time from ort to torch * Use static_cast * fix call to Python API for python <= 3.8 * investigation * fix ref counts * disable import if no training * one function to convert multiple ortvalues * add proto_type * enforce dlpack->deleter to be not null * fix _ortvalues_to_torch_tensor for eager mode * rename proto_type into element_type in the Python API * conversion from ort to torch 2x times faster * fix conversion of list of OrtValue * replace has_bool_tensor by bool_tensor_indices * introduce _ortvalues_to_torch_tensor_list * use _ortvalues_to_torch_tensor_list for cache * fix ambiguity between c and python classes Co-authored-by: xavier dupré <xavier.dupre@gmail.com> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2022-04-06 09:12:58 +02:00
Abhishek Jindal	91c940b619	adding fill scalar for torch ones direct initialization on ort device (#10898 ) * adding fill scalar for torch ones direct initialization on device and adding test case for it * using ConstantOfShape to for implementing fill Scalar in atenops * adding case for handling at::Tensor attribute * handling the at::Tensor type for ConstantOfShape * handling the at::Tensor type for ConstantOfShape with attr type * handling the at::Tensor type case * converting the data to tensor in case of aten tensor mapping is needed * handling aten tensor case * handling aten tensor case and reversing the string case * changing type of scalar	2022-04-05 11:17:25 -07:00
G. Ramalingam	2c2408814f	Add function body for SoftmaxCrossEntropyLossGrad (#10779 ) * Add function definition for SoftmaxCrossEntropyLossGrad Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Cleanup Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Eliminate unused variable Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Fix index of weight tensor Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * A few fixes to handle typing and weight Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Fix for zero D dimensions Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Add function body to internal op also Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * A few fixes Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Fix type variable name Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Fix type constraint var Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Fix ignore_index handling in testcase Signed-off-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> * Add fun def for SoftmaxCrossEntropyLossInternal Signed-off-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> * Specify opset Signed-off-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> * Handle opset in NLL function Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Address PR feedback Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Modify onehot Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Eliminate duplicate statement Co-authored-by: Ganesan Ramalingam <grama@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-04-05 10:52:40 -07:00
Vincent Wang	6a6840d5c6	Fuse LayerNormalization for Apex O2 (#10233 )	2022-03-29 21:22:04 +08:00
Vincent Wang	3b6cee8059	[CUDA] Optimize Conv and ConvGrad for Training (#10999 ) * Optimize Conv and ConvGrad for Training * add provider option to control * fix typo	2022-03-29 07:31:36 +08:00
Baiju Meswani	9c6cc018a9	Add utility to get the gradient graph from GradientGraphBuilder (#10995 ) * Add pybind method to get the gradient graph * Fix segmentation fault because of logging for gradien building	2022-03-25 17:13:56 -07:00
Chandru Ramakrishnan	cb31b7eab1	Fixed creation of ORT_Value to pass offset of 0 (#11004 )	2022-03-25 15:52:10 -04:00
Scott McKay	47c09e6701	Clarify usage of kOnnxDomainAlias. (#10962 ) * Clarify usage of kOnnxDomainAlias.	2022-03-25 09:52:59 +10:00
pengwa	89ef987ab1	Improve NonZero on CUDA/ROCM (#10307 ) * improve NonZero * fix megatron_fp16 optimzier, fix the doc * multi_tensor_applier * resolve comment * fix building warning * fix build error when enabling training and use tensorrt	2022-03-25 07:35:45 +08:00
mpapdiwala	1e917c879e	Adding support for saving and loading train step info properties in the state dict and checkpoint file. (#10569 ) * Adding optimization step and step parameter to the ORTTrainer constructor * Added ORTTrainerOptions for optimization step * Adding Train Step Info Settings to State Dictionary * Adding train step info key * Updating comments * Reverting changes * Updating test case for new state dict entry train_step_info	2022-03-24 11:50:45 -07:00
mindest	3c5853dcbc	register custom_op_symbolic for squeeze (#10970 ) * register custom_op_symbolic for squeeze * remove misleading warning msg from symbolic_opset9	2022-03-24 10:28:21 +08:00
Chandru Ramakrishnan	07201726ed	Fixed macros for graph transformer registration. (#10983 )	2022-03-23 14:55:17 -04:00
raviskolli	480c793125	Update training packages to Pytorch 1.11.0 (#10851 ) * Update ortmodule training packages to Pytorch 1.11.0 Co-authored-by: Harshitha Venkata <havenka@microsoft.com> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-03-22 16:45:51 -07:00
Baiju Meswani	565318ce86	Support ORT WASM compilation with the training flag (#10973 ) * Add training support for ORT web assembly compilation * Use wrapper for eigen includes in training	2022-03-22 16:13:35 -07:00
Chandru Ramakrishnan	4a5b5328a4	Added support to Eager CodeGen for multiple in-place parameters. (#10945 ) * Added support to CodeGen for multiple inplace output parameters. * Updated output Tensor to references.	2022-03-21 13:10:22 -07:00
G. Ramalingam	8703d37517	Extend DropoutGrad function to support bfloat16 (#10662 ) * Update DropoutGrad function to support bfloat16 * Eliminate dead comments * Set opset version for testcase Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Update to new builder Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>	2022-03-20 15:11:08 -07:00
Vincent Wang	8860fded02	Disable Some Einsum ORTModule Tests Due to Issue from PyTorch Exporter (#10906 ) * disable some einsum tests due to pytorch issue * disable tests on specific torch versions * use skipif	2022-03-18 21:28:18 +08:00
mindest	d7d7665023	restore random states after export_model (#10705 ) * restore random states after export_model * move get/set_random_states inside _export_model * add comments for random state save/restore * add unit test for random state check * resolve comments * fix error	2022-03-17 11:56:25 +08:00
Edward Chen	f468ea40e5	Refactor Node::AddAttribute() (#10869 )	2022-03-16 14:53:00 +10:00
Edward Chen	e53422c6d0	Update convert_onnx_models_to_ort.py to support runtime optimizations. (#10765 ) Add runtime optimization support to ONNX -> ORT format conversion script. Replace `--optimization_level`, `--use_nnapi`, and `--use_coreml` with a new `--optimization_style` option.	2022-03-14 16:50:41 -07:00
Abhishek Jindal	03181caeae	Creating test case for printing ort tensor (#10850 ) * creating a test for printing ort tensor * modifying comment for error case * Using Output Grabber to assert the print output * modifying the print ort test * removing comments * removing sys import	2022-03-11 21:39:48 -08:00

958 commits