mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-30 23:18:20 +00:00
* A few performance improvements:
  - Make the iteration in NonZero more efficient by using a raw pointer and simplifying the increment logic
    - add another unit test to check the new logic works with 3 dimensional tensor
    - gains about 2% for ssd_mobilenet
  - Avoid floating point operations on each iteration on Concat
    - about 0.5% for ssd_mobilenet and ssd_resnet34
  - Put common case first in ExecutionFrame::AllocateAsPerAllocationPlan to avoid unnecessary call to IsSparseTensor
    - about 0.05% for ssd_mobilenet
  - Minor tweak to put some ctors in the TensorShape header so they can be inlined more easily

| | | |
|---|---|---|
| .. | ||
| alloc_kind.h | ||
| allocator.h | ||
| customregistry.h | ||
| data_types.h | ||
| execution_provider.h | ||
| fence.h | ||
| framework_common.h | ||
| func_api.h | ||
| kernel_def_builder.h | ||
| kernel_registry.h | ||
| ml_value.h | ||
| op_kernel.h | ||
| op_kernel_info.h | ||
| op_node_proto_helper.h | ||
| run_options.h | ||
| sparse_tensor.h | ||
| tensor.h | ||
| tensor_shape.h | ||
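
The commit message above mentions iterating in NonZero with a raw pointer and simpler increment logic. The following is a minimal sketch of that general idea, not the actual onnxruntime kernel: `NonZeroCoords` is a hypothetical helper, and the row-major layout and the `float` element type are assumptions made for illustration. It walks the flat buffer once with a raw pointer and advances a coordinate counter with a carry, so no division or modulo is needed per element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not onnxruntime code): returns the coordinates of every
// non-zero element of a row-major tensor, one coordinate vector per hit.
std::vector<std::vector<int64_t>> NonZeroCoords(const std::vector<float>& data,
                                                const std::vector<int64_t>& dims) {
  // Precondition: the buffer holds exactly as many elements as the shape describes.
  size_t expected = 1;
  for (int64_t d : dims) expected *= static_cast<size_t>(d);
  assert(data.size() == expected);

  std::vector<std::vector<int64_t>> result;
  if (data.empty()) return result;

  std::vector<int64_t> coord(dims.size(), 0);  // current multi-dimensional index
  const float* p = data.data();
  const float* const end = p + data.size();
  for (; p != end; ++p) {
    if (*p != 0.0f) result.push_back(coord);
    // Odometer-style increment: bump the last axis, carry leftwards on overflow.
    for (size_t axis = dims.size(); axis-- > 0;) {
      if (++coord[axis] < dims[axis]) break;
      coord[axis] = 0;
    }
  }
  return result;
}
```

For a 2x2x2 tensor holding `{0, 1, 0, 0, 0, 0, 2, 3}`, the helper reports the coordinates `(0,0,1)`, `(1,1,0)` and `(1,1,1)`. The carry loop usually exits after touching only the last axis, which is why this kind of increment tends to be cheaper than recomputing each coordinate from the flat offset.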