onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00

Author	SHA1	Message	Date
Changming Sun	4bfff45859	Downgrade Eigen (#8817 )	2021-08-23 18:06:23 -07:00
Chandru Ramakrishnan	2693af9799	Ported changes / bug fixes from torch/ort. (#8784 ) * Ported changes / bug fixes from torch/ort. * Fixed formatting * Renamed function * Renamed module_ to module. * Revert "Renamed module_ to module." This reverts commit b17fc114b3db20d174283811d90592b5b8154c19. * Include pybind common header to fix linker errors on windows debug. * Fix to generation of > 1 custom op. Co-authored-by: Ashwin Hari <ashari@microsoft.com>	2021-08-23 17:45:40 -04:00
George Nash	d4a88cfe3f	Add Gemm op to DNNL Execution provider (#8799 ) * Implement Gemm op for DNNL execution provider Signed-off-by: George Nash <george.nash@intel.com> * Remove KernelRegistry and Gemm op for dnnl ep The KernelRegistry for the dnnl execution provider only registered a Gemm op that as best we can tell was never actually used and also was not using the dnnl library. We have implemented a Gemm op in the DNNL execution provider subgraph code and thus are removing the unused Gemm op that was in the dnnl KernelRegistry. Signed-off-by: George Nash <george.nash@intel.com> * Fix duplicated output and kernelshape inference fix getcapability to make sure subgraph outputs do not have duplicates fix kernelshape inference in pool Signed-off-by: Wang <zhaoyang.wang@intel.com> * Removed most dnnl specialized ifdefs from gradient_ops_test code Re-enable GlobalAveragePoolGrad test for dnnl ep The bugs that were exposed by the GlobalAveragePoolGrad test have been fixed and this test no longer needs to be disabled for DNNL. Removed the ReluGradDnnl test. We are getting the testing from the already existing ReluGrad test. MaxPoolGrad test no longer has specialized execution provider enabling for DNNL execution provider. It will now run without the extra enabling. ConvGrad is the only test that still has dnnl specialized ifdefs. However, the ConvGrad code was not being executed by the code unless it was listed first in the list of execution providers. Signed-off-by: George Nash <george.nash@intel.com> * Fix transpose issue on Gemm On transposing square matrices, getmemoryandreshape will fail to reshape fix by adding a bool Signed-off-by: Wang <zhaoyang.wang@intel.com> * Save memory space by reusing internal tensor for output The intermediate matmul output tensor can be used as the output tensor for the binary calculation. Remove the unused IsAttributeSupported from the DnnlGemmNodeCapability class since we now support all of the Gemm attributes in our implementation. Signed-off-by: George Nash <george.nash@intel.com> Co-authored-by: Wang <zhaoyang.wang@intel.com>	2021-08-23 08:45:34 -07:00
Suffian Khan	9fa0d8392a	Extend node debugging utilities to push tensors and node placement to SQL database (#8672 ) * adding support for tracing to sqldb instead of files * use compiled statements * script to pull tensors from db * link sqlite3 * remove node info redundant with onnx graph * addressing PR comments * address PR comments and include program counter * third party notice * use find_package * add to cgmanifests.json * address thread safety and add pid suffix * build fi * python script to select on devicetype * remove unpopulated and redundant Shape and Type fields * comment * comment * PR comments * add graph execution counter to session state * move increment to inference session * std::endl to \n * ifdef on graph execution counter * add ifdef to inference session * move DEBUG_NODE_INPUTS_OUTPUTS to CMakeLists.txt	2021-08-21 00:40:12 -07:00
Sherlock	81889a1cf6	Invertible ReluGrad (#8773 ) * Invertible Relu Grad	2021-08-19 11:29:05 -07:00
Aaron Bockover	b2813656f5	eager: fix build against latest PyTorch master (#8745 ) Improve README as well.	2021-08-18 14:27:21 -04:00
pengwa	0983d61969	refine glue code and tests (#8510 )	2021-08-18 11:38:00 +08:00
ashbhandare	cc275e7529	Gradient Accumulation optimization verified for correctness (#8273 ) * Fetching frontier tensors to frontend * Move before session initialize call * Fetch tensor and add to cache * Rest of the changes for using cache * Review comments * Review changes * Review comments * switch to shared_ptr * Fix bug after rebase * FE docstring change	2021-08-17 16:24:44 -07:00
baijumeswani	871eeb4dbd	Support dicts as inputs to ORTModule (#8718 )	2021-08-17 13:40:55 -07:00
Thiago Crepaldi	ed254c283f	Add support for experimental json config for fallback (#8759 )	2021-08-17 13:35:42 -07:00
Thiago Crepaldi	419834d285	Add PyTorch fallback for ORTModule forward exceptions (#8346 )	2021-08-17 10:41:15 -07:00
M. Zeeshan Siddiqui	0fb82f0f8a	Memory aware gradient builder. (#8582 )	2021-08-16 19:01:22 -07:00
Nat Kershaw (MSFT)	aa12d68c37	Update ORTModule API docstrings (#8309 )	2021-08-16 16:53:01 -07:00
George Nash	e695cd304a	Dnnl refactor (#8627 ) * dnnl ep rework rework DnnlTensor,DnnlNode,DnnlSubgraph to support arbitrary graph topology and tensor data types rework GetCapability to claim nodes in graph greedily from node topological ordering and delay creation of DnnlSubgraph until Compile rework compile to have DnnlSubgraphPrimitive as the object to handle primitive creation and execution instead of thread local primitive pool which duplicates intermediate memory allocated by the EP across threads DnnlSubgraphPrimitive provides helpers to handle many common functions for each dnnl primitive builder and become the centralized place to store input, output, intermediate memories, initializer memories and etc it provides functions to obtain input memories with automatic reordering/reshaping and moving between engines it provides interfaces to add primitive, set output memory for single node and etc add CONCURRENT_EXEC compile flag for dnnl library as without it, convolution primitive cannot be created and executed on different threads enable unit tests to run on dnnl ep as well if built with dnnl ep add dnnl ep support for Matmulinteger * Add Relu to the DNNL refactor Signed-off-by: George Nash <george.nash@intel.com> * Add Convolution op to the DNNL rework Signed-off-by: George Nash <george.nash@intel.com> * Add Pooling ops to the DNNL rework This adds the following ops: - AveragePool - GlobalAveragePool - GlobalMaxPool - MaxPool Note: Pooling with dilation is not yet supported. Note: GlobalLpPool, LpPool, MaxRoiPool, and MaxUnpool are not supported yet. Signed-off-by: George Nash <george.nash@intel.com> * Add Sum op to the DNNL rework Signed-off-by: George Nash <george.nash@intel.com> * Add ConvGrad op to the DNNL rework Signed-off-by: George Nash <george.nash@intel.com> * Add MaxPoolGrad and AveragePoolGrad ops to DNNL rework Signed-off-by: George Nash <george.nash@intel.com> * Added lrn operator to the refactored code Signed-off by chethan.palangoutu.keshava@intel.com * Added ReduceMean DNNL op to the refactor code Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com> * Added Softmax DNNL op for the refactored code Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com> * Added BatchNorm DNNL op inference-only for refactored code Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com> * Added Binary Ops to DNNL rework Signed-off-by: Wang <zhaoyang.wang@intel.com> * Added ReluGrad to DNNL Rework Signed-off-by: Wang <zhaoyang.wang@intel.com> * Update OneDNN tag to v2.3 Signed-off-by: Wang <zhaoyang.wang@intel.com> * Added support for memory up to dim size 12 this is to fix the CI test cases that contain binary ops of input dim size > 5 Signed-off-by: Wang <zhaoyang.wang@intel.com> * Prevent claiming support for float16 and bfloat16 when only float is supported The string.find used was causing the code to claim support for float16 and bfloat16 when we only supported float. We now explicitly check the code for the data type or the data type with a 7 letter prefix basically prefixed with "tensor(" Signed-off-by: George Nash <george.nash@intel.com> * Disable uint8 mul and div, improve type conversion Disable mul_uint8 and div_uint8 test cases as they use modulo for overflow handling while onednn uses saturation improve type conversion using enum instead of string comparison as well as adding more types Signed-off-by: Wang <zhaoyang.wang@intel.com> Co-authored-by: Wang <zhaoyang.wang@intel.com> Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>	2021-08-13 14:15:43 -07:00
Changming Sun	436ac6dd5f	Rename ml_value.h to ort_value.h (#8726 )	2021-08-13 07:04:56 -07:00
baijumeswani	217b2c9f93	Removing filelock import from ORTModule (#8722 )	2021-08-12 21:19:49 -07:00
Tang, Cheng	de2a53e46d	[eager mode] fix build and support customize shared provider entry point (#8680 ) * fix build break * support customize the name of shared provide lib's entry point * fix non training build * check error code * check return code	2021-08-11 15:10:35 -07:00
harshithapv	c24335246b	Support bool type for Pad Op and fix Unsqueeze in Tile grad for Opset 13 (#8602 ) * changes * tile grad unsqueeze fix for opset 13 * clean up * remove bool support for opset 2 to 12 for Pad as it is not supported. * Copy OperatorKernels.md from artifacts of Windows CI build.	2021-08-11 11:21:02 -07:00
mindest	a56e325eb8	constrain inputs for min/max grad UT (#8632 ) * fix inputs for min/max grad UT * use random inputs (truncated)	2021-08-07 18:29:06 +08:00
Tang, Cheng	6d3c2c85ef	Integrate eager mode source code into onnxruntime repo (#8584 ) * integrate eager mode source code; build with cmake and integrate the python test * Adding the python path for importing libraries in the Eager mode * fix clang break;check if training and python enabled * handling the linking of torch libraries across multiple platforms * merge and fix the naming * add build instruction Co-authored-by: Abhishek Jindal <abjindal@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: ajindal1 <abjindal@microsoft.com>	2021-08-06 08:30:27 -07:00
Ashwini Khade	96eb9810ba	Update onnx (#8458 ) * updates for picking pnnx commit * add tests filter to c# tests * plus test fixes * fix versioning for contrib ops * fix tests * test filter for optional ops * more versioning related updates * fix test * fix layernorm spec * more updates * update docs * add more test filters * more filters * update binary size threshold * update docs * plus more fixes * updates per review * update to release commit * add filters for optional type tests * plus updates	2021-08-05 09:21:44 -07:00
Changming Sun	0510688411	Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471 ) 1. Update SDLNativeRules from v2 to v3. The new one allows us setting excluded paths. 2. Update TSAUpload from v1 to v2. And add a config file ".gdn/.gdntsa" for it. 3. Fix some parentheses warnings 4. Update cmake to the latest. 5. Remove "--x86" build option from pipeline yaml files. Now we can auto-detect cpu architecture from python. So we don't need to ask user to specify it.	2021-07-30 17:16:37 -07:00
baijumeswani	816ad86d14	Configuring ORTModule - Internal Options (#8537 )	2021-07-30 13:05:32 -07:00
satyajandhyala	5e2f4263db	Enable cast propagation in the frontend. (#8517 )	2021-07-28 17:06:49 -07:00
baijumeswani	2e28cbaa64	Configuring ORTModule - End User Facing Options (#8470 )	2021-07-28 10:51:43 -07:00
Sherlock	1370cbe256	[ORTModule] Extract output schema in module's true train/eval mode (#8516 ) * Extract output schema in module's true train/eval mode	2021-07-28 09:55:07 -07:00
mindest	a71dab691d	Implement BatchNormInternal for cuda (#8172 ) * correct batchnorm replacement output order; remove bn replacement in grad graph builder * update op defs and kernel class * implement batch norm internal and grad. * change saved_var into saved_inv_std * cuda test case: bn internal * remove redundant include * fix comment; add support and UT for 1d input. * exclude batch_norm_internal in amd_hipify * run BNInternal UT for CUDA only * fix CI error * fix comment errors * fix error * add comment for inconsistency with cudnnBN doc * additional comments for cudnnBN inconsistency	2021-07-28 16:04:49 +08:00
Vincent Wang	1798698545	avgpool2d atenop (#8507 )	2021-07-28 14:04:55 +08:00
Sherlock	686f9b530b	ORTModule set_seed in int (#8511 )	2021-07-27 15:43:13 -07:00
Oliver Rausch	1685ab8138	Implement Concat with Strided copy (#8336 ) Adds a StridedCopy function that implements a copy from strided tensor to another. This parallelizes the Concat operator, and can also be used in the future to parallelize many other data movement operators (e.g. Transpose, Split, etc.). This operation is also required for the proposed data layout extensions to ORT.	2021-07-27 18:27:56 +02:00
ytaous	1ae32655b3	fix t5 assert error (#8501 ) Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-07-27 09:04:01 -07:00
ytaous	ab5289f109	Performance: enable faster training with skip checks config (#8411 ) * freeze/fastpath support * more comments on _fast_path * per comments * minor fix * IntFlag improve * address comments Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-07-23 10:23:13 -07:00
Vincent Wang	c8d210de29	Decouple Forward and Backward of ATenOp (#8301 ) * atenop for inference * assert if dtype mismatch * atenop config in frontend * fix orttrainer test * gradient def not only for ATenOp * bugfix * fix gradient input shape and type issue * fix after merge master	2021-07-23 16:53:26 +08:00
Thiago Crepaldi	9073c094d4	Update torch lightning and re-enable test	2021-07-22 14:18:07 -07:00
pengwa	892ac9f55a	code structure update (rename only) (#8410 )	2021-07-22 23:50:19 +08:00
Edward Chen	695536a7ac	Make some common macros safer to use. (#8445 )	2021-07-21 12:14:36 -07:00
Sherlock	28527b4867	Handle duplicated names for output_grads (#8431 )	2021-07-20 10:17:31 -07:00
Sherlock	4931ef666d	Update ORTModule frontend code owner file (#8335 )	2021-07-14 09:26:04 -07:00
pengwa	7db4fc8c2a	Fix segment fault for custom function (#8331 ) * unregister registered python functions upon normal interpreter termination * atexit.register(unregister_python_functions) should be called by __init__.py * minor fix	2021-07-13 18:01:33 +08:00
Tang, Cheng	e467d78a11	fix a typo (#8334 ) Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-09 09:24:43 -07:00
Tang, Cheng	598454bb5f	Fix the mix precision handle for square case (#8333 ) * handle unsqueeze change in opset13 * fix the node arguments index check for square case (x * x) * Revert "fix the node arguments index check for square case (x * x)" This reverts commit c66344f0a82c35d8c24d31f2264cf7e9b235ce22. * handle the square case (x * x) for node argument search Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-09 09:24:19 -07:00
Hariharan Seshadri	46e5c8d4b9	Cosmetic change in test infrastructure (#8292 )	2021-07-08 21:52:02 -07:00
pengwa	5454af4b95	decouple the shared python dependency (#8294 ) * remove warning message for non-training build * move to/from dlpack for onnxruntime_python back into python project	2021-07-09 11:47:11 +08:00
satyajandhyala	84bc20fe9d	Enable cast propagation with level one by default. (#8286 )	2021-07-08 14:38:09 -07:00
pengwa	6dbfb8db0e	autograd function fallback perf (#8312 ) * fix known issues * Update orttraining/orttraining/test/python/orttraining_test_ortmodule_autograd.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2021-07-09 00:29:40 +08:00
baijumeswani	6652d17dcd	Support lists as inputs to ORTModule (#8311 )	2021-07-07 13:04:19 -07:00
Tang, Cheng	d7c3703371	handle unsqueeze change in opset13 (#8308 ) Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-07-06 22:30:24 -07:00
pengwa	2347a0aca8	Autograd Function Fallback bug fix - moe support (#8105 ) * Support forward inputs orders like "Non_tensor/Tensor/Non_tensor". Correspondingly, support "None/Tensor_Grad/None" for backward outputs. * Report RuntimeError when PythonOp detected but _enable_custom_autograd_function is enabled. * Fix "PoliCheck ] - Defect : Term "hang", Component : orttraining\orttraining\python\training\ortmodule\__init__.py (1 issue)" * rename call_convention->input_convention, input_tensor_requires_grads->input_requires_grads * fix minor comment * revert polycheck fix in case of conflict * Update orttraining/orttraining/core/graph/training_op_defs.cc Co-authored-by: Tim Harris <tiharr@microsoft.com> * Apply suggestions from code review Refine the schema description Co-authored-by: Tim Harris <tiharr@microsoft.com> * Resolve review comments Co-authored-by: Tim Harris <tiharr@microsoft.com>	2021-07-07 08:58:01 +08:00
Suffian Khan	036eee5b66	register softmaxinternal with rocm (#8289 )	2021-07-02 16:29:18 -07:00
Vincent Wang	88ec95ea96	Support OrtMemTypeCPUInput for ATenOp/ATenOpGrad (#8116 )	2021-07-02 23:04:43 +08:00

727 commits