onnxruntime/include/onnxruntime/core/framework
edgchen1 6c7da5e9d3
Optimize CUDA Sum op kernel and refactor CUDA elementwise variadic input op kernels (#4418)
For the special case where all variadic inputs of a kernel are the same shape (i.e. no broadcasting is required) and there are few enough of them, we perform the entire computation in a single kernel. The general implementation (which was previously used for this special case) handles broadcasting by repeatedly invoking a binary kernel on successive inputs.
2020-07-10 10:20:23 -07:00
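The two strategies named in the commit description can be contrasted with a small CPU-side sketch. This is not the CUDA code from #4418: `Buffer`, `SumSameShape`, `AddBroadcast`, and `SumGeneral` are hypothetical names, tensors are flattened to 1-D, broadcasting is reduced to the single-element case, and the limit on the number of inputs for the fused path is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a flattened tensor; not an onnxruntime type.
using Buffer = std::vector<float>;

// Special case: every input has the same element count, so no broadcasting
// is needed. One pass reads each input once and writes each output element
// once (the analogue of a single fused kernel).
Buffer SumSameShape(const std::vector<Buffer>& inputs) {
  assert(!inputs.empty());
  const std::size_t n = inputs[0].size();
  Buffer out(n);
  for (std::size_t i = 0; i < n; ++i) {
    float acc = 0.0f;
    for (const Buffer& in : inputs) {
      assert(in.size() == n);  // the fused path requires identical shapes
      acc += in[i];
    }
    out[i] = acc;
  }
  return out;
}

// Binary add with simplified broadcasting (assumption for this sketch):
// an operand holding exactly one element is broadcast across the other.
Buffer AddBroadcast(const Buffer& a, const Buffer& b) {
  const std::size_t n = a.size() > b.size() ? a.size() : b.size();
  assert(a.size() == n || a.size() == 1);
  assert(b.size() == n || b.size() == 1);
  Buffer out(n);
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = a[a.size() == 1 ? 0 : i] + b[b.size() == 1 ? 0 : i];
  }
  return out;
}

// General case: fold the binary operation over successive inputs. Each step
// produces a full intermediate result, which is the overhead the fused path
// avoids when shapes match.
Buffer SumGeneral(const std::vector<Buffer>& inputs) {
  assert(!inputs.empty());
  Buffer acc = inputs[0];
  for (std::size_t k = 1; k < inputs.size(); ++k) {
    acc = AddBroadcast(acc, inputs[k]);
  }
  return acc;
}
```

In this sketch, N same-shape inputs cost one pass over the output on the fused path, versus N - 1 passes with an intermediate buffer each on the general path.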
alloc_kind.h Avoid copy of pre-existing value to subgraph output (#637) 2019-03-19 06:55:59 +10:00
allocator.h Cleanup SessionState. Move allocator lookup to SessionState. (#4194) 2020-06-28 14:55:42 +10:00
customregistry.h CustomRegistry should use composition instead of inheritence 2019-04-05 14:14:10 -07:00
data_types.h address PR comments (#3312) 2020-03-25 19:35:12 -07:00
data_types_internal.h Fix some warnings on Windows (#2560) 2020-01-22 15:59:11 -08:00
endian.h Edgchen1/endian utils (#2181) 2019-10-21 22:28:35 -07:00
execution_provider.h Cleanup SessionState. Move allocator lookup to SessionState. (#4194) 2020-06-28 14:55:42 +10:00
fence.h Remove unnecessary casts from OrtValue to MLValue(#1051) 2019-05-17 07:52:59 -07:00
framework_common.h Combine OrtValue and MLValue into one type (#1043) 2019-05-16 10:22:49 -07:00
func_api.h Ryanunderhill/mkldnn dll (#3314) 2020-05-06 00:57:09 -07:00
kernel_def_builder.h Introduce training changes. 2020-03-11 14:39:03 -07:00
kernel_registry.h Parallel all the activations ops (#3722) 2020-05-05 01:18:17 -07:00
ml_value.h Introduce container type runtime checks and other improvements (#2522) 2019-12-04 16:04:17 -08:00
op_kernel.h Optimize CUDA Sum op kernel and refactor CUDA elementwise variadic input op kernels (#4418) 2020-07-10 10:20:23 -07:00
op_kernel_info.h Replace GSL with GSL-LITE submodule and fix up refs (#1920) 2019-10-01 12:43:29 -07:00
op_node_proto_helper.h Replace GSL with GSL-LITE submodule and fix up refs (#1920) 2019-10-01 12:43:29 -07:00
run_options.h Support training_mode flag in eval (#4324) 2020-07-08 10:38:54 -07:00
sparse_tensor.h Replace GSL with GSL-LITE submodule and fix up refs (#1920) 2019-10-01 12:43:29 -07:00
tensor.h View Op - new unit tests and add support for tensor memcpy by offset/size (#3439) 2020-04-07 13:07:11 -07:00
tensor_shape.h Filter out info from non-const initializers during shape inferencing (#1806) 2019-09-26 13:44:33 +10:00