onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-21 02:18:09 +00:00

Author	SHA1	Message	Date
Hariharan Seshadri	aa2622efb2	Support multiple dynamic inputs in custom ops (#6666 )	2021-02-16 20:54:30 -08:00
Weixing Zhang	299ace0759	Support to allow user to specify compute stream per session (#3723 ) * Support to allow user to specify compute stream per session Create computation cuda stream explicitly rather than use default legacy stream or per-thread default stream. remove some redundant cudaStreamSynchronize fix gpt2 model test failures don't use default stream in nccl either. add stream synchronization in OnRunEnd() using cub::DeviceScan::InclusiveSum which can be called with stream specified. fix topK failure due to latest rebase fix tensorrt support user specified stream add user_stream support in tensorrt EP use same stream for both tensorrt and CUDA EP. fix ScatterND specify stream for adasum and p2p kernels. fix loop fix CApiTest.custom_op_handler fix CApiTest.varied_input_custom_op_handler change for cudaMemcpyFromSymbol improve provider options for user specified compute stream * add changes for ROCM EP * fix GatherGrad UT for ROCM EP * clean code and fix NonMaxSuppression * use default stream for ROCM now * fix CApiTest.custom_op_handler:OrtFormatCustomOpTests.ConvertOnnxModelToOrt * fix tensorrt ut: CApiTest.io_binding_cuda Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-02-05 15:48:18 -08:00
Scott McKay	c84bb9df9f	Add ability to track per operator types in reduced build config. (#6428 ) * Add ability to generate configuration that includes required types for individual operators, to allow build size reduction based on that. - Add python bindings for ORT format models - Add script to update bindings and help info - Add parsing of ORT format models - Add ability to enable type reduction to config generation - Update build.py to only allow operator/type reduction via config - simpler to require config to be generated first - can't mix a type aware (ORT format model only) and non-type aware config as that may result in insufficient types being enabled - Add script to create reduced build config - Update CIs	2021-01-29 07:59:51 +10:00
Hector Li	b5d1a49b30	Share allocator between CUDA EP & TRT EP. (#6332 ) * Share allocator between CUDA EP & TRT EP. limitation: 1. Does not cover the per-thread allocator created by CUDA EP, still need to figure out the way to remove it 2. Need to have more identifiers to make it able to share CPU allocator across all EPs	2021-01-27 00:14:43 -08:00
Scott McKay	e1dc268e45	Add support for custom ops to minimal build. (#6228 ) * Add support for custom ops to minimal build. Cost is only ~8KB so including in base minimal build.	2021-01-25 10:41:00 +10:00
Edward Chen	d761571afc	Deprecate Python global configuration functions [Part 2] (#6171 ) Update Python API to allow more flexibility for setting providers and provider options. The providers argument (InferenceSession/TrainingSession constructors, InferenceSession.set_providers()) now also accepts a tuple of (name, options dict). Fix get_available_providers() API (and the corresponding function in the C API) to return the providers in default priority order. Now it can be used as a starting point for the providers argument and maintain the default priority order. Convert some usages of the deprecated global configuration functions to use EP-specific options instead. Update some EP-specific option parsing to fail on unknown options. Other clean up.	2021-01-07 10:10:55 -08:00
Hariharan Seshadri	2347de4a9e	Fix Linux/Mac error message on input type mismatch (#6256 )	2021-01-05 22:21:24 -08:00
Hariharan Seshadri	d42399e1b0	Allow querying a GraphProto's doc_string as part of ModelMetadata (#6248 )	2021-01-05 22:18:03 -08:00
Hector Li	ffb4b62826	Fix allocator issue for TensorRT IOBinding (#6240 ) * Fix issue: https://github.com/microsoft/onnxruntime/issues/6094 Root cause: we didn't expose the OrtMemoryInfo for TRT, which causes an issue if a user wants to use IOBinding for TensorRT. Short-term fix: add the OrtMemoryInfo for TRT. Long term, we should unify the allocator for CUDA and TRT	2020-12-31 20:15:43 -08:00
Changming Sun	1fc7f92f25	Fix a memory leak in test_inference.cc (#6201 ) * Fix a memory leak in test_inference.cc	2020-12-25 13:02:21 -08:00
RandySheriffH	404982ded5	Enable varied input type for custom op (#6066 ) * allow custom op taking varied types * refactor test case * add test model * refactor test case * enable copy elision * update test case * fix issue in ToString function	2020-12-09 15:10:42 -08:00
Hariharan Seshadri	d46dbeafd3	Expose knobs to create and share (CPU) allocators across sessions in C# and Python (#5634 )	2020-11-21 14:12:33 -08:00
RandySheriffH	20ae1ea21f	Remerge custom gpu op (#5818 ) * add case for cpu custom op on gpu * format doc * restrict GPU custom op on Linux GPU CI only * separate cu file to an independent project * fix typo * include cuda_add lib * move lib def * add file header Co-authored-by: RandySheriffH <rashuai@microsoft.com>	2020-11-16 09:27:46 -08:00
RandySheriffH	c23fbba463	Fix reduce pipeline by replacing model (#5813 ) * update model and better comment * fix parameter Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2020-11-14 20:17:23 -08:00
Guoyu Wang	dc0f7b8f82	Remove onnxruntime_session_options_config_keys.h from c_api (#5772 ) * Remove session config keys header from c_api * remove copy session config header in release package * Keep the session option config header in the package	2020-11-12 09:12:13 -08:00
Hariharan Seshadri	63b85fc696	Fix VS 2017 build break (#5745 )	2020-11-10 10:25:43 -08:00
Dmitri Smirnov	2bf5046d4e	Add tag types for Ort::Float16_t and Ort::Bfloat16_t structs (#5716 ) Add tag types for Ort::Float16_t and Ort::Bfloat16_t structs that contain uint16_t values for float16 and bfloat16. These will serve as type dispatching types for C++ API. They are of uint16_t size and arrays of these types can be used to create Tensors of the corresponding types. Make documentation Doxygen compliant.	2020-11-06 16:41:26 -08:00
Wenbing Li	5b44982971	Change the OrtCustomOp invocation as a constant. (#5506 ) * Change the OrtCustomOp invocation as a constant. * fix build on macos * build fixing	2020-11-02 10:38:07 -08:00
Changming Sun	d9293f38e6	Revert "Custom Op on GPU (#5620 )" This reverts commit `2c63196600`.	2020-10-30 21:23:51 -07:00
Hariharan Seshadri	7a80a4b526	Support more C# APIs (#5608 )	2020-10-30 19:19:50 -07:00
RandySheriffH	2c63196600	Custom Op on GPU (#5620 ) * add case for cpu custom op on gpu * format doc * restrict GPU custom op on Linux GPU CI only * separate cu file to an independent project * fix typo Co-authored-by: RandySheriffH <rashuai@microsoft.com>	2020-10-30 12:25:44 -07:00
Hariharan Seshadri	44773c60e3	Add a CUDA based IOBinding test (#5572 )	2020-10-26 10:57:36 -07:00
Hariharan Seshadri	4b29423656	Re-enable custom op shared library test for debug builds (#5475 )	2020-10-19 17:14:31 -07:00
Hariharan Seshadri	b9f90e297e	Support sharing of initializers between session via the Python API (#5407 )	2020-10-09 20:26:28 -07:00
Changming Sun	1a04b8f8b7	Add valgrind support to our cmake files (#5296 )	2020-09-28 09:31:08 -07:00
Pranav Sharma	974b9bfc09	Allow sharing of initializers between sessions. (#5092 ) * Allow sharing of initializers between sessions. * Allow sharing of initializers between sessions (2). * Add test for C# * Add test for C#; address PR comments * Address PR comments Moved AddInitializer logic to internal session options Added tests for owned buffer Clarified documentation Fix bug where memory info and not device was getting compared * Fix test * Fix training build * Add ver 5 end marker and ver 6 starter, add scenario and usage examples.	2020-09-21 14:09:37 -07:00
Dmitri Smirnov	a90ab12589	Refactor onnx_test_runner (#5169 ) Refactor onnx_test_runner for better object ownership, code readability and maintainability.	2020-09-18 13:19:35 -07:00
Dmitri Smirnov	e6f85f338e	Refactor TensorAt, prepare for release (#5180 ) * Refactor TensorAt locations* must be const and int64_t since our dims are int64_t Remove unnecessary copy of locations. Remove unnecessary casting and C-casting. Simplify implementation. Add a check for string type. Make CXX api return T& to fully expose C API in C++, const std::vector& by value as it covers more ground and eliminate redundant copy. Eliminate inner loop, compute strides first.	2020-09-16 10:20:45 -07:00
Chun-Wei Chen	7f3aa3a163	Add GetStartTime() for profiler to get private profiling_start_time_ (#4994 ) * add GetStartTime() for profiler * add function in inference_session * remove qualified name * add the api in cxx_api.h * rename starttime to StartTimeNs, expose profiling object * rename GetProfilingStartTime * move Ortapis to the right place * move to the end * add const for session * const the right place * use const auto instead of const auto* for session * remove const for auto getstarttime * remove const for auto getstarttime add unit tests * nit: update test name and add comments	2020-09-16 00:17:04 -07:00
Pranav Sharma	2c1410afe7	Remove usage of macros for constants in public header. (#5061 ) * Remove usage of macros for constants * Fix linkage issue	2020-09-05 01:27:20 -07:00
xkszltl	44b3accb74	Missing header for `std::once_flag` and `std::call_once`. (#5010 )	2020-09-02 00:46:59 -07:00
Pranav Sharma	ad1701dfb1	Rename DeviceAllocatorRegistrationInfo to a more generic name; Use OrtArenaCfg for arena members; Remove unused OrtMemType; Simplify CreateAllocator interface. (#4970 ) * Rename DeviceAllocatorRegistrationInfo to a more generic name; Remove OrtMemType; Simplify CreateAllocator interface. * - fix builds - fixed mixed aggregation + constructor calls (which were coded before this PR) - changed default value of max_mem in API header - added some validation of values for arena_extend_strategy * fix tensorrt and cuda tests	2020-09-01 09:25:32 -07:00
RandySheriffH	14b51d6502	CiPipeline@ReducedOpsBuild (#4917 ) * cancel night build on pyop * setup ci pipeline for build of reduced ops * add back c# test * remove debugging print * add testing model * add more arg in pipeline script * disable pipeline trigger temporarily * fix yaml format * fix yaml format * fix pipeline error * rid c# test * add ops for test cases * add Conv from domain com.microsoft.nchwc * remove --reduce_ops * fix typo * remove --build_java * add test case for excluded op * update doc with --skip_test * formatting code, renaming files and simplify yaml * remove debug build from yaml * remove surplus ops from included_ops.txt * add MinSizeRel build to yaml * rename test cases and models * exclude ir test from minimum build * restrict ir test to be only applied to reduced ops build	2020-08-31 21:21:18 -07:00
Hariharan Seshadri	6c26e52134	Support accessing a model's metadata in C# (#4867 ) Implement access to model's metadata in C#	2020-08-25 11:13:49 -07:00
Changming Sun	26546f81fe	Remove the private ONNX protobuf definition file (#4878 )	2020-08-24 12:40:33 -07:00
Pranav Sharma	29dcfb24ab	Allow multiple sessions to share an allocator, optimize constant folding memory usage, expose arena configs. (#4813 ) * Add support for sharing allocators * Incremental update * Address some PR comments, add unit tests, add documentation. * Address PR comments, add tests and some documentation. * Fix build and test issues * Remove RegisterAllocator API restoring the OrtAllocator interface changes. Changed docs to reflect this. Also fixed the orttraining segfault. The segfault was because in the case of training session, the CPU exec prov is not available at the time the transformers are applied. Changed it to create a new one.	2020-08-22 10:03:17 -07:00
Josh Bradley	b7254551f0	Add new api function At() (#4457 ) * add modern standards to function arguments * add first version of At for better tensor element access	2020-08-11 18:34:03 -07:00
Dmitri Smirnov	3530ce541c	Expose IOBinding features via C/C++/C# language bindings. (#4646 ) Expose I/O Binding in C/C++/C# Expose OrtAllocator, OrtMemoryAllocation, OrtMemoryInfo and OrtIoBinding	2020-08-10 13:33:49 -07:00
RandySheriffH	e802b0498f	EnrichPyOpUT (#4681 ) * cancel night build on pyop * enrich PyOp UTs * init script only once * remove space * update models * Show usage of kwargs in doc	2020-08-05 14:11:56 -07:00
RandySheriffH	948a33bdfc	FixPyOpSegFault&MakeItStaticLib (#4600 ) * remove pyop wrapper * add py threading logic * fix doc * fix doc * fix doc * format doc * format doc * format doc * reenable test Co-authored-by: RandySheriffH <rashuai@microsoft.com>	2020-07-28 11:45:25 -07:00
Alisha Sonawalla	1e67fff93c	Add GetStringTensorElement, GetStringTensorElementLength and FillStringTensorElement API (#4374 ) Add new string tensor APIs and unit tests	2020-07-24 21:35:46 -07:00
Changming Sun	8ada440961	Move model tests to onnxruntime_test_all (#4521 ) 1. Move model tests to onnxruntime_test_all 2. Publish TestResults of Windows CI build.	2020-07-15 16:46:18 -07:00
Prabhat	151ef1c8a5	Add C++ wrapper for GetAvailableProviders() C API (#4313 )	2020-06-25 13:11:55 +05:30
Prabhat	57fabfba7a	Added GetAvailableProviders() to C API (#4247 ) * Added GetAvailableProviders to C API * Fix API version and Windows build error * Changed function name * Changed ORT_API_VERSION to 4 * Moved all_providers array to constants.h * Move check for providers to constants.h * Changed name of array to avoid warning * Address review comment * Added unit test	2020-06-22 10:10:25 +08:00
George Wu	6f729b100f	use LOAD_WITH_ALTERED_SEARCH_PATH for LoadLibraryExA (#3908 )	2020-05-11 19:53:34 -07:00
Ryan Hill	408f62dd57	Load provider shared libraries relative to core runtime executable (#3884 ) * Load provider DLL relative to core runtime executable * Use LoadLibraryEx to fix dependent DLL loading * Fix custom op DLL loading path issue.	2020-05-09 20:49:15 -07:00
Changming Sun	7c89f38a34	Fix static analysis warnings found by VC++ (#3530 ) 1. Fix static analysis warnings found by VC++ 2. Add a new pipeline for static analysis 3. Merge all the Windows CI builds into one single yaml file (easier to queue them all). 4. Make DNNL build faster by disabling building the tests and examples. 5. Enable custom op unit test.	2020-04-16 01:46:47 -07:00
Hariharan Seshadri	abfb275ac0	Support listing keys in custom metadata map via C/C++ API (#3477 ) * Support listing keys in custom metadata map via C/C++ API * nit * PR feedback * Nit	2020-04-15 12:14:03 -07:00
Changming Sun	b63349c8d6	Fix custom op test failure (#3525 )	2020-04-14 20:36:42 -07:00
Pranav Sharma	3568f8d186	Allow a custom op with the same name to be registered for several providers. (#3400 )	2020-04-02 15:38:51 -07:00

92 commits