onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-11 17:48:34 +00:00

Author	SHA1	Message	Date
Chandru Ramakrishnan	d8bcb3d6a4	Added virtual destructor to adasum_interface.h (#7882 )	2021-05-30 11:11:10 -04:00
Ryan Hill	5a63904aa9	Remove some templated versions of functions that are no longer needed (#7868 ) * Switch to non template version of function	2021-05-28 13:22:45 -07:00
baijumeswani	ddf4aaaae1	Resolve issue with wrapped ORTModule load_state_dict (#7847 ) * Encapsulate children modules inside a ModuleAccessor object to prevent erroneous iteration over children while loading the state dictionary * Add named_models, models, apply methods, change ModuleAccessor to ModuleMetadata and modify unit tests * Change ModuleMetadata module getter logic, raise NotImplementedError for add_modules * Add comment explaining why overriding _load_from_state_dict method is needed	2021-05-27 16:11:37 -07:00
Edward Chen	45a7352622	Update Mac CI builds to use macOS-10.15 image, Xcode 12.4. (#7437 ) Update Mac CI builds to use macOS-10.15 image, Xcode 12.4.	2021-05-27 09:39:34 -07:00
Sherlock	fc472a04be	Relax tol for Conv1D fp16 test (#7844 ) * Relax tol for Conv1D fp16 test Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-26 17:04:35 -07:00
Thiago Crepaldi	c5ea5907c0	Fix permission error for ORTModule lock file (#7814 )	2021-05-26 14:18:25 -07:00
harshithapv	4fe59c8b29	delete model_copy to save memory allocated in forward call (#7832 ) * delete model copy * add flag * address comments * address flag comment Co-authored-by: root <root@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-25 22:22:13 -07:00
Jesse Benson	29c68888af	Update BERT convergence baseline.	2021-05-25 17:11:46 -07:00
ytaous	ff655175ff	Eliminate no op node - add 0 (#7798 ) * eliminate add 0 * typo * rank check * fix build Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-25 13:01:34 -07:00
baijumeswani	13a129054f	Prevent unnecessary re-initialization of the graph when model has unused parameters (#7799 )	2021-05-22 20:52:26 -07:00
baijumeswani	a6ca9f0a40	Use list comprehensions instead of list appends where possible (#7753 ) * Use list comprehensions instead of list appends where possible * Add OrtValueVector class as an opaque object in pybind * Add dlpack methods to the OrtValueVector pybind class	2021-05-21 10:28:09 -07:00
Sherlock	2a02871157	Disable reuse for YieldOp's inputs (FW partial graph's output) (#7767 ) * Disable reuse for YieldOp's input Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-20 21:39:36 -07:00
Peng	c2435d24ec	Clean up ROCm4.1 Dockerfile build directory (#7732 ) * Clean up ROCm4.1 Dockerfile build directory * remove the UCX and OMPI build directories after installation	2021-05-20 10:04:49 -07:00
Ryan Hill	c99aa3a3f3	Ryanunderhill/cuda shared (#7626 ) * First iteration of making cuda a shared provider. Separated out shared OpKernel change, so doing this to merge with that change. * More cuda shared library refactoring * More cuda shared library refactoring * More build options tested, converted the training ops over. * Fix merge breaks * Fix submodules * Fix submodules * Fix submodules * Fix python * Fix compile errors * Duplicate symbol fix * Test fix for ROCM provider * Another ROCM test workaround * ROCM Build Test * ROCM build fix * ROCM * ROCM * ROCM * ROCM * ROCM * ROCM test * Reduce header dependencies * Remove redundant namespace * Test fix for linux * Fix linux build * Fix Eigen build error * Fix unused parameter warning * Test link error * Another linker test * Linker test * Linker test * Another test * Another build test * Fix linux link error * Build test * Fix control flow ops to use common base class with core code * Remove extra qualifiers * Fix template syntax for linux * Fix cuda memory leak * Fix pybind * Test disabling cast * Cleanup * Restore cuda in test * Remove more header dependencies * Test not adding cuda provider to session * Make GetProviderInfo_CUDA throw * No-op cuda provider creation * Fix some setup issues * Fix memory cleanup on unload * Diagnostics * Don't unload library * Add diagnostics * Fix deleting registry at right time. * Test disabling profiler * Fix merge break * Revert profiler change * Move unloading of shared providers into Environment * Free more global allocations before library unloads * Add more diagnostics * Move unloading back to the OrtEnv as there are multiple Environments created during a session. Remove some library dependencies for tests. 
* Fix more cmake files * ERROR -> WARNING * Fix python shutdown * Test not using dml in pipeline * Change python version and disable dml * Update python version * Test adding unload method for shared providers * Disable DLL test * Python test * Revert "Python test" This reverts commit `c7ec2cfe98`. * Revert "Disable DLL test" This reverts commit `e901cb93aa`. * Revert "Test adding unload method for shared providers" This reverts commit `c427b78799`. * Point to RyanWinGPU * Revert python version * Fix id_to_allocator_map * Another python exit test * Remove extra debug messages Try a more clean python shutdown through DllMain * Revert DllMain idea, it didn't work * Merge conflicts * Fix merge with master issues. * Comments * Undo edit to file * Cleanup + new training ops * Revert yml changes * Fix another merge error * ROCM fix * ROCM fix v2 * Put back Linux hack, it is necessary * Stupid fixes * Fix submodule out of sync * ROCM fix 3 * ROCM 4 * Test java fix * Fix typos * Java test on my VM * Fix build error * Spotless fix * Leave temp file around to load properly * Fix cleanup on exit * Fix break * Java comments * Remove LongformerAttentionBase workaround * Spotless fix * Switch yml back to regular build pool * Revert "Switch yml back to regular build pool" This reverts commit `be35fc2a5a`. * Code review feedback * Fix errors due to merge * Spotless fix * Fix minimal build * Java fix for non cuda case * Java fix for CPU build * Fix Nuphar? * Fix nuphar 2 * Fix formatting * Revert "Remove LongformerAttentionBase workaround" This reverts commit `648679b370`. * Training fix * Another java fix * Formatting * Formatting * For orttraining * Last orttraining build fix... * training fixes * Fix test provider error * Missing pass command * Removed in wrong spot * Python typo * Python typos * Python crash on exit, possibly due to unloading of libraries. 
* Remove test_execution_provider from training build Only enable python atexit on windows Remove assert on provider library exit * Still can't unload providers in python, alas. * Disable Nvtx temporarily * MPI Kernels for Training * MPI Kernels part 2 * Patch through INcclService * Oops, wrong CMakeLists * Missing namespace * Fix missing () * Move INcclService::GetInstance around to link nicer * Missing } * Missing MPI libraries for Cuda * Add extra GetType functions used by MPI * Missing Nccl library * Remove LOGS statements as a test * Add in a couple more missing GetType methods * Update comments * Missed a logging reference in mpi_context.h * Convert aten_op to shared (due to merge with master) * Test moving DistributedRunContext instance into shared provider layer (with purpose error to verify it's being built properly) * Test passed, now with fix * Missing static * Oops, scope DistributedRunContext to just NCCL * Merge related issues and code review feedback. * Merge error * Bump to rel-1.9.1 (#7684) * Formatting * Code review feedback for Java build on non Windows * Remove cupti library dependency from core library * Test Java pipeline fix * Linux build fix * Revert "Linux build fix" This reverts commit `a73a811516`. * Revert "Remove cupti library dependency from core library" This reverts commit `6a889ee8bf`. * Packaging pipeline fixes to copy cuda shared provider for tensorrt & standard packages * Add cuda to Tensorrt nuget package * onnxruntime_common still has a cuda header dependency Co-authored-by: ashbhandare <ash.bhandare@gmail.com>	2021-05-20 07:53:47 -07:00
Vincent Wang	47b3cc4bde	GatherGrad Bugfix (#7752 ) * gathergrad bugfix * fix win build	2021-05-19 18:53:57 +08:00
Thiago Crepaldi	e05b15175d	Add cpp ext lock file check during ORTModule init (#7740 ) * Add cpp ext lock file check during ORTModule init * Address comments	2021-05-18 12:57:05 -07:00
baijumeswani	e161213f8e	Handle model with no parameters (#7736 ) * Handle model with no parameters * Set the minimum module_output_grads as 0 to handle parameterless models	2021-05-18 09:33:57 -07:00
baijumeswani	c873f5589d	Fix bug where the output names were sorted lexicographically (#7709 )	2021-05-17 10:27:20 -07:00
Thiago Crepaldi	6c41ed597b	Add custom autograd function to prevent input passthrough on ORTModule (#7694 ) * Changes for investigation * Gradient for Identity * Keep Identity between YieldOp and GraphOutput * Revert debugging changes * Add custom autograd fn to prevent input passthrough on ORTModule * Add comment Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-17 09:56:02 -07:00
Thiago Crepaldi	4fe2ffae16	Fix ORTModule python doc generation (#7704 ) * Fix ORTModule python doc generation * Address comment	2021-05-17 09:55:49 -07:00
ashbhandare	bfbcc89db1	Add MLFloat16 support for SoftmaxCrossEntropyLoss for CUDA EP (#7679 ) * Forward op changes * Add tests, improve kernel * add opset 13 registration, remove unnecessary changes * Add fp16 grad for SCELoss, review comments	2021-05-14 09:00:27 -07:00
baijumeswani	37f69fcee5	Regain performance by caching initializer names in ORTModule (#7685 )	2021-05-13 20:54:49 -07:00
raviskolli	4b37901f10	Aten support for rocm (#7680 ) * Aten support for rocm * Removed aten_ops.cc as it is reused from cuda version	2021-05-13 15:56:03 -07:00
Aswin John Mathews	4afdc19958	ROCm optimized layernorm for MI100 (#7682 ) * layernorm optimizations * Changed HIP flag from HIP_VERSION to __HIP_PLATFORM_HCC__	2021-05-13 15:54:06 -07:00
satyajandhyala	d90a99aad5	Fix the build on dev machines by replacing std::tuple with two arguments with std::pair (#7683 )	2021-05-13 15:11:51 -07:00
harshithapv	31ca21b782	Replace Where Grad "Mul" with "Where" (#7672 ) * replace where grad mul with where * clean up * auto formatting * remove not for second input	2021-05-13 08:54:43 -07:00
Vincent Wang	dac24f7d63	Add ATenOp and call aten::embedding and its Backward Op from ORT (#7590 ) * build with libtorch and impl torchembedding * fix op shape infer * local commit * atenfunctionop * call aten operator from online extension * rollback build.py * resolve comments * bugfix * fix build * fix ortmodule test * remove external outputs, resolve comments * resolve comments * export embedding to microsoft::atenop * bugfix	2021-05-13 09:24:27 +08:00
Weixing Zhang	9241f62e4c	enable MatMulScale and cast propagation for ROCm EP. (#7657 )	2021-05-12 13:43:24 -07:00
M. Zeeshan Siddiqui	5d9885f706	Fix BadNames. (#7658 )	2021-05-11 16:06:10 -07:00
baijumeswani	c5aeaa9419	Support for unused model initializers (#7631 ) * Support for unused model initializers * Change graph_info.initializer* to sets	2021-05-11 12:26:56 -07:00
satyajandhyala	9f69b2f291	Added InsertAndReduce strategy to PropagateCastOps transformation in addition to FloodFill strategy (#7454 ) * Moved GraphTransformerConfiguration to a separate file and added strategy option to PropagateCastOps transformation. * Added testing both FloodFill and InsertAndReduce strategies for cast propagation. * Added AddConsumer and RemoveConsumer functions to graph.h for efficient graph editing. * Added PropagateCastOps code documentation * Added GraphTransformationConfiguration class hierarchy information * Added RemoveInputOutputUpDownCasts	2021-05-10 20:46:28 -07:00
baijumeswani	08fbfe9607	Resolve issue where a registered buffer was parsed incorrectly as a user input (#7617 )	2021-05-10 19:04:27 -07:00
Pranav Prakash	a684e9aa52	Add pre-training transform to convert BatchNorm to BatchNormInternal (#7539 ) * Add transformer for BatchNorm -> BN Internal * Add test for BN replacement transformer	2021-05-10 15:13:59 -07:00
baijumeswani	88c95ef06b	Support for primitive types in ortmodule (#7588 )	2021-05-10 10:59:47 -07:00
Hariharan Seshadri	4b691a5c0d	Add ability for memory arenas to "shrink" periodically (#7284 )	2021-05-08 07:53:21 -07:00
Scott McKay	9fc4116d51	Use ASSERT_STATUS_OK so the error message is output if there's a failure. (#7515 )	2021-05-07 20:23:34 +10:00
Vincent Wang	0c91b643fe	Bugfix for Scatter and GatherElementsGrad (#7593 ) * bugfix for scatter and gather elements grad * resolve comments	2021-05-07 14:02:26 +08:00
Derek Murray	94c97ac8c2	Fix compiler warnings treated as errors in GistEncodeDecode. (#7568 ) * Fix compiler warning in GistEncodeDecode. * Fix other use of member variable. * Make `compression_type_` const. * Change floor to floorf in CUDA code. * Statically cast size_t to int in GIST CUDA kernels * Add explicit cast to `long` in gist.cc Co-authored-by: Derek Murray <demurra@microsoft.com>	2021-05-05 09:05:11 -07:00
Xavier Dupré	ade6ed51eb	Speed up Reduce operators for consecutive reduced axes (#7206 ) * Improves Reduction for three specific configurations * Support ReduceMean * add ReduceMax, ReduceMin * refactoring	2021-05-05 09:14:00 +02:00
Sergii Dymchenko	a647da3e1a	Fix 2 input Gemm grad (#7561 ) * Add test for 2 input Gemm grad. * Fix 2 input Gemm grad.	2021-05-04 12:00:14 -07:00
harshithapv	d812354ebd	Tile grad fix (#7556 ) * tile grad fix * code clean up	2021-05-04 11:16:26 -07:00
Fanny Nina Paravecino	c3c4db2c1b	Upgrade GIST memory compression nodes, kernels, optimizer rule, and cli (#6262 ) * Add gist nodes, kernels, optimizer rule, and cli * Add Gist CUDA kernels * Added/updated gist compression cli to bert, gpt2, mnist * Fix decode priority generator for large models * Fix hardcoded decode priority generator, update gist training test * Fix incomplete if/else sequence for CI build * Added MSFP15 for gist compression type * fix Msfp15 bug * Resolved azure pipeline errors - unsupported ORT_RETURN macro format, cudastream argument * Resolved hardcoded cudastream argument, Pack8 zero error * Resolved PR comments - except gist tests * Added TypeInference to Gist Nodes, To attribute to Gist Decoder, Updated Gist Test Cases * Reverted error in merge commit * Updated logger usage in Gist rule, Updated GistPackMSFP15 compressed tensor's explanation * Converted onnxruntime::make_unique to std::make_unique based on PR 7502 Co-authored-by: Fanny Nina Paravecino <faninapa@microsoft.com> Co-authored-by: Aayush Ankit <aayushankit@microsoft.com> Co-authored-by: Aayush Ankit <Aayush-Ankit@users.noreply.github.com> Co-authored-by: Fanny Nina Paravecino <fanny.nina@microsoft.com>	2021-05-04 10:33:35 -07:00
Sherlock	c1ed647170	ORTModule enable run_symbolic_shape_infer by default (#7423 ) * ORTModule enable run_symbolic_shape_infer by default * Fix UTs by replacing Relu with Softmax Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-04 10:08:14 -07:00
Sherlock	6714f2f85d	Improve tol value logging in ORTModule test (#7544 ) Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-05-03 09:43:40 -07:00
Pranav Prakash	8ba6ed953f	Fix batch norm training op on CPU (#6946 ) * Fix batch norm training op on CPU * Add BatchNorm 14 Op Support * Update hashes for BN * Exclude TRT and OpenVINO for BatchNorm training test	2021-05-01 11:25:19 -07:00
Sherlock	668a65f1a7	Complete GetGlobalAveragePoolGradient (#7514 ) * Improve GetGlobalAveragePoolGradient Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-04-30 18:04:01 -07:00
Thiago Crepaldi	9ba9da0c95	Fix unused registered buffers issue on ORTModule (#7525 )	2021-04-30 13:50:23 -07:00
Tang, Cheng	54db6648af	kernel invoker api for eager mode (#7473 ) * initial draft for kernel invoke api * initial implementation of kernel invoker * [eager] fix build on Mac * [eager] increment input name in kernel invoker * temp fix for type in eager mode * use global default log manager * rollback the previous commit since it break linux build * Revert "rollback the previous commit since it break linux build" This reverts commit `58c2c3423a`. * Eager Mode: fix linking on macOS * optimizer_execution_frame: ignore unused lambda capture (model_path) * fix link issue * ORTInvoker: set correct input argument tensor element proto types Do not set a type proto on output arguments to allow ORT to deduce them * ORTInvoker: create only one logging manager * Minor fix to set execution provider type correctly. (#7000) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * training fix * support config output ml values in frame, so we can use it to implement inplace update * Fix range loop error while building. (#7087) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * Conditionally link with nsync_cpp if not windows. (#7151) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * Fixed initialization order in ORT kernel invoker (#7342) * Updated constructor of ort_kernel_invoker to take a logger. * Changed linking order. * Updated test. * add inplace ut * add build option * Update include/onnxruntime/core/eager/ort_kernel_invoker.h Co-authored-by: Derek Murray <Derek.Murray@microsoft.com> * resolve comments in pr * fix build break;merge from master * fix build break Co-authored-by: Cheng Tang <chenta@microsoft.com> Co-authored-by: Aaron Bockover <abock@microsoft.com> Co-authored-by: Chandru Ramakrishnan <41447659+chandru-r@users.noreply.github.com> Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> Co-authored-by: Derek Murray <Derek.Murray@microsoft.com>	2021-04-30 13:33:58 -07:00
Changming Sun	1012535dab	Change onnxruntime::make_unique to std::make_unique (#7502 ) 1. Change onnxruntime::make_unique to std::make_unique 2. Add "-std=c++14" to ROCM EP's build flags.	2021-04-29 17:04:53 -07:00
sabreshao	e6a3308db7	Optimize cuComputeGradInput performance. (#7479 ) Move the checking of gamma to host and specialize both case through template.	2021-04-28 17:08:31 -07:00

649 commits