onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-05 04:17:53 +00:00

Author	SHA1	Message	Date
Vincent Wang	3b6cee8059	[CUDA] Optimize Conv and ConvGrad for Training (#10999 ) * Optimize Conv and ConvGrad for Training * add provider option to control * fix typo	2022-03-29 07:31:36 +08:00
Chi Lo	8ba52b0a05	Bump master version to 1.12 (#10797 ) * bump master version to 1.11 * bump master version to 1.12	2022-03-28 12:30:11 -07:00
Scott McKay	47c09e6701	Clarify usage of kOnnxDomainAlias. (#10962 ) * Clarify usage of kOnnxDomainAlias.	2022-03-25 09:52:59 +10:00
Leandro Gracia Gil	1cc2cfb7b8	Move #ifndef ORT_CXX_API_THROW to the no exceptions case. (#10937 ) This is related to https://github.com/microsoft/onnxruntime/issues/10564 which introduced a fix in the wrong case where exceptions are enabled.	2022-03-21 11:12:56 -07:00
Valery Chernov	625a1f7673	[TVM EP] code refactor (#10655 ) * rename info to options for TVM EP * transfer options processing from TVMExecutionProvider to TVMEPOptions * transfer TVMRunner to separated files * implement TVMCompiler class * replace CompileFunc by TVMCompiler object. update TVMRunner. now it does not depend on TvmExecutionProvider * correct logging of TVM EP options * RunnerImpl, GERunnerImpl and VMRunnerImpl were implemented * add prepareComputeInfo method * remove update_output_shapes flag * embed all TVM EP dependences to tvm namespace. transfer model compilation from TVMRunner. connect TVMRunnerImpl to TVMRunner * refactor compileModel method * small cleaning * separate TVM EP options data store and processing * replace TvmTensorShape by InlinedVector with max_size 5 * correct indentation * update TVM hash Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-03-16 13:55:04 +01:00
Edward Chen	f468ea40e5	Refactor Node::AddAttribute() (#10869 )	2022-03-16 14:53:00 +10:00
Edward Chen	e53422c6d0	Update convert_onnx_models_to_ort.py to support runtime optimizations. (#10765 ) Add runtime optimization support to ONNX -> ORT format conversion script. Replace `--optimization_level`, `--use_nnapi`, and `--use_coreml` with a new `--optimization_style` option.	2022-03-14 16:50:41 -07:00
Hariharan Seshadri	e80ff63274	Fix bug in MemcpyToHost (#10816 )	2022-03-10 07:02:27 -08:00
Edward Chen	c147c9dda6	Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD. (#10778 ) Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD as it is now implied by ORT_EXTENDED_MINIMAL_BUILD. Remove related CMake option.	2022-03-08 16:18:49 -08:00
Vincent Wang	4a38f9e31d	enable strided tensor for training only (#10748 )	2022-03-08 08:31:28 +08:00
Fei Hu	60acfd3dd8	Support CUDA Graph in the CUDA EP (#9978 )	2022-03-06 20:47:31 -08:00
Scott McKay	e337f5faf3	Enable QDQ cleanup and NHWC optimizers in an extended minimal build. (#10729 ) * Enable QDQ cleanup and NHWC optimizers in an extended minimal build.	2022-03-04 15:45:42 +10:00
Rachel Guo	a9dc50ba8b	Add option to force QDQIsInt8Allowed to return true when exporting to ORT format (#10719 ) * wip * save * minor update * fix * fix * Revert "fix" This reverts commit `a76f364b2d`. * revert * revert * revert submodule removal * address pr comments * minor fix * address cr comments * fix format Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-03-02 23:26:14 -08:00
Yulong Wang	f4b2d3af2b	Upgrade emsdk to 3.1.3 (#10577 )	2022-02-28 23:52:41 -08:00
Vincent Wang	9a22b5d253	Strided Tensor Support for Eager Mode (#10578 ) * strided tensor for eager mode * fix build and resolve comments * fix win x86 build	2022-03-01 14:25:31 +08:00
Dmitri Smirnov	e23a224518	Fix CUDA 10.2 compile error due to inlined_containers.h inclusion (#10702 ) Fix CUDA 10.2 compile error due to inlined_containers.h inclusion into a common CUDA header. Use NumberOfNodes() to reserve space in a hash table Prefer separate call to reserve() rather than passing in the hash table constructor. They have somewhat different meaning.	2022-02-28 19:56:44 -08:00
cloudhan	3243c9579f	Fix VLOG?_DEFAULT macros usability. (#10568 ) * Add `set_default_logger_verbosity` api. * fix docs * make flake8 happy	2022-03-01 13:16:26 +10:00
Scott McKay	1f6d8248da	Add optional optimizer to remove leftover Q->DQ pairs after all other QDQ processing has completed (#10659 ) Add an optimizer that can remove leftover Q->DQ pairs. Depending on the model this may help with performance and/or improve accuracy. Optional as it could make things worse so user needs to be aware of this and test what works best for their scenario. Enable with SessionOptions config param `session.enable_quant_qdq_cleanup`	2022-03-01 08:05:02 +10:00
Thiago Crepaldi	e788cc2a23	Convert com.microsoft::ATen into org.pytorch.aten::ATen onnx op (#10060 ) Signed-off-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2022-02-28 14:14:45 -05:00
Ryan Hill	eb116595d4	Add ability to customize ORT_CXX_API_THROW (#10688 )	2022-02-28 00:15:10 -08:00
Dmitri Smirnov	b30e0e2283	Remove inline_containers include from tensor_shape (#10682 ) Hide Inlined Hash set and maps guts behind template forward declarations. Currently CUDA 10.2 compiler can not compile abseil but provider interfaces use those types in their signatures. InlinedVector seems to be fine. Introduce core/common/inlined_containers_fwd.h header	2022-02-26 20:07:18 -08:00
Dmitri Smirnov	2679711bee	Refactor transformers and other code to reduce memory allocation calls (#10523 ) Work on minimizing memory management calls by reducing number of allocations and copies. Replace std::unordered_set to InlinedHashSet and add usage of InlinedVector. Employ std::move() to minimize copying and memory allocations. Remove copying of the const shared data into each of the PropagateCast transformer instances. Move inlined_containers.h header to include/common Adjust AsSpan implementation for C++ < 17	2022-02-24 16:17:14 -08:00
RandySheriffH	e056fbaa51	Add restrictions for hybrid cpus for thread pool task distribution (#10393 ) * add restrictions for hybrid cpus * add unit test to mock hybrid cpu * attach hybrid flag * add mocking interface to CpuInfo * make is_hybrid * make mock function const * add force_hybrid for thread pool * remove header	2022-02-17 14:34:09 -08:00
Ashwini Khade	f436d3437e	Add layout transformer for NNAPI (#10371 ) * Add layout transformer for NNAPI * plus merge fixes * plus some more merge fixes * test fixes * comments + cleanup * plus updates * post merge changes * enable layout transformer in extended minimal build * plus more comments * more tests + fix CI * plus updates per review * more updates per review * fix file name * fix qdq tests * plus more updates * plus updates * typo fix * fix qdq selection in 2nd optimization pass * fix typo * fix a test * update dependency structure for layout transformer * plus updates * more updates * plus change * more updates to fix linker error in minimal build * remove unnecessary headers	2022-02-15 20:25:29 -08:00
Vincent Wang	ceb1e2b1a6	[ROCm] Bugfix of BFloat16-float conversion and Add FastGelu Kernel for AMD (#10557 ) * bf16 bugfix on amd * enable fastgelu ut on amd	2022-02-16 11:11:08 +08:00
Valery Chernov	1cdc23aba4	[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260 ) * update java API for STVM EP. Issue is from PR#10019 * use_stvm -> use_tvm * rename stvm worktree * STVMAllocator -> TVMAllocator * StvmExecutionProviderInfo -> TvmExecutionProviderInfo * stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict * STVMRunner -> TVMRunner * StvmExecutionProvider -> TvmExecutionProvider * tvm::env_vars * StvmProviderFactory -> TvmProviderFactory * rename factory funcs * StvmCPUDataTransfer -> TvmCPUDataTransfer * small clean * STVMFuncState -> TVMFuncState * USE_TVM -> NUPHAR_USE_TVM * USE_STVM -> USE_TVM * python API: providers.stvm -> providers.tvm. clean TVM_EP.md * clean build scripts #1 * clean build scripts, java frontend and others #2 * once more clean #3 * fix build of nuphar tvm test * final transfer stvm namespace to onnxruntime::tvm * rename stvm->tvm * NUPHAR_USE_TVM -> USE_NUPHAR_TVM * small fixes for correct CI tests * clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files * update CUDA support for TVM EP * roll back CudaNN home check * ERROR for not positive input shape dimension instead of WARNING * update documentation for CUDA * small corrections after review * update GPU description * update GPU description * misprints were fixed * cleaned up error msgs Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru> Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>	2022-02-15 10:21:02 +01:00
Chi Lo	0f5d0a091a	Make user capable of adding new field in OrtTensorRTProviderOptionsV2 as new provider option (#10450 ) * modify code for add additional field in OrtTensorRTProviderOptionsV2 * add include file * fix typo * fix bug * add comment * fix code * revert change	2022-02-05 11:15:12 -08:00
Edward Chen	c43c1691ad	Enable transpose optimizer in minimal extended build (#10349 ) Enable transpose optimizer and infrastructure it depends on in a minimal extended build.	2022-01-31 09:41:04 -08:00
Dwayne Robinson	b02f4ece5e	Remove cbegin and cend calls which do not exist in std::span or gsl::span (#10426 )	2022-01-28 14:25:12 -08:00
Edward Chen	0e951d7d6b	Add some more documentation for the C/C++ API tensor creation functions. (#10394 )	2022-01-27 13:19:11 -08:00
Changming Sun	ec4362f8f3	Enable more static analysis warnings and enable the analyzer for training cpu (#10176 )	2022-01-27 11:17:20 -08:00
Dmitri Smirnov	3367ddc5ba	Add abseil cgmanifest declaration. Update coding standards. (#10374 ) Add abseil cgmanifest declaration. Update coding standards for InlinedContainers Adjust coding guidelines. Add default N calculation for InlinedVector<T, N> for general use. Rename T from InlinedShapeVectorT. Fix Eager build Add LLVM Copyright with modified derived code notice.	2022-01-27 08:32:05 -08:00
Weixing Zhang	ea9c8a7cdc	Support MIGraphXEP to work with ROCMEP for inference on AMD GPU (#10368 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2022-01-26 15:52:56 -08:00
Edward Chen	df16c605e8	Add "available since" message for C API additions since v1.10.0. (#10348 )	2022-01-25 10:15:34 -08:00
Edward Chen	4b87d2c172	Fix dockerfiles/Dockerfile.arm32v7 build. (#10360 ) Install CMake, ignore some Eigen warnings.	2022-01-24 19:06:09 -08:00
Dmitri Smirnov	7e092a7e3f	Reduce number of memory allocations based on a customer profiling case (#10193 ) Add abseil and inlined containers typedefs Introduce TensorShapeVector for shape building. Use gsl::span<const T> to make interfaces accept different types of vector like args. Introduce InlinedShapeVectorT for shape capacity typed instantiations Refactor cuda slice along with provider shared interfaces Refactor Concat, Conv, Pad Build with Conv Einsum and ConvTranspose refactored. Remove TensorShape::GetDimsAsVector() Refactor SliceIterator and SliceIteratorBase Refactor broadcast Refactor Pads for twice as long Remove memory planner intermediate shapes vector Refactor orttraining Fix passing TensorShapeVector to tests Remove abseil copy and submodule, use FetchContent_Declare/Fetch Path with separate command Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.	2022-01-24 10:40:46 -08:00
Vincent Wang	44e2db9397	CUDA BFloat16 Refactor (#10085 )	2022-01-14 19:38:56 +08:00
RandySheriffH	79d2a0d185	Dynamic cost model to mitigate high E2E perf variance (#9833 ) * commit dynamic block size * summarize granularity * add configure * add test case * call std stoi * add comments * fix typo * rename var * update comment * reset default * better comments * extend LoopCounter for dynamic blocking * fix comments and add more UT * update comments * switch type to std::ptrdiff_t * format code with better indentation * cast ptrdiff_t * fix typo	2022-01-11 17:26:41 -08:00
Shucai Xiao	ce103ace93	Amdmigraphx fix build error (#9272 ) * fix build error * rename a missing api for the MIGraphX EP	2022-01-10 15:18:43 -08:00
Dwayne Robinson	1f5b073508	Minor DirectML EP provider factory comments (#9965 )	2022-01-10 02:06:31 -08:00
Nat Kershaw (MSFT)	d52d3c0052	Update C/C++ API docs automation to create a PR (instead of push to publish branch) (#10093 )	2022-01-07 16:16:47 -08:00
Hariharan Seshadri	0552a47ec2	Enable CUDA provider option configuration for C# (#10188 )	2022-01-06 11:03:14 -08:00
Edward Chen	792db33f01	Enable loading of ORT format model graph runtime optimizations (#9901 ) Initial implementation of load/replay of runtime optimizations in an ORT format model.	2022-01-04 12:09:07 -08:00
stevenlix	05d20343ee	Remove duplicated constant initializer copies for TensorRT nodes (#10105 ) * add new field constant_initializers in metadef and remove constant initializers from trt node inputs * remove redundancy * use GetConstantInitializer() to get constant initializers * add ORT_ENFORCE check Co-authored-by: Ubuntu <azureuser@orteplinuxdev.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>	2021-12-22 12:19:56 -08:00
Changming Sun	4e9e01cb3c	Fix SDL warnings in CPU EP (#9975 )	2021-12-19 20:54:29 -08:00
Edward Chen	3466ee45a3	Add hash value typedef. (#9710 ) Add a typedef for the various hash value variables. Use of a typedef conveys some additional meaning.	2021-12-15 19:07:17 -08:00
Valery Chernov	b327e89efa	Standalone TVM Executor Provider (#10019 ) * squashed commit for standalone tvm execution provider * critical fix for correct python build with stvm ep * get tuning log file from ep options. It has priority over AUTOTVM_TUNING_LOG * updates and fixes * update parsing of stvm provider options * add support of external data for onnx model * add conditional dump of subgraphs * remove unused code * get input tensor shapes through provider options. get output shapes for fixed input ones by TVM API * support AUTO_TVM tuning log file inside ORT. Selector for Ansor and Auto_TVM is provider option (tuning_type) * add fp16 * add functionality of conversion of model layout to NHWC if need. Necessary parameter was added to STVM provider options * fix license text in header. fix log format * small fixes * fix issues from flake8 * remove model proto construction from GetCapability * reserve memory for vector of DLTensors * add simple tutorial for STVM EP * STVM docs * jroesch/tvm -> apache/tvm * remove dead code, unnecessary logs and comments * fix in readme * improve tutorial notebook * tvm update * update STVM_EP.md * fix default value * update STVM_EP.md * some TODOs for the future development * shorten long lines * add hyperlink to STVM_EP.md * fix Linux CI error * fix error in csharp test Co-authored-by: Jared Roesch <jroesch@octoml.ai> Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2021-12-15 16:59:20 -08:00
Changming Sun	20f8a06f1f	Remove OpenMP code (#10032 )	2021-12-15 00:58:42 -08:00
Changming Sun	9d9ebd3b85	Fix some static analysis warnings in the core framework (#10033 )	2021-12-14 14:41:42 -08:00
Ryan Hill	343a76945b	Fix some documentation errors plus ones generating doxygen warnings (#9993 )	2021-12-09 17:42:34 -08:00

634 commits