onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-29 23:06:41 +00:00

Scott McKay 6e430c0526 A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis (#1578)	2019-08-08 07:20:00 +10:00

* A few performance improvements:
  - Make the iteration in NonZero more efficient by using a raw pointer and simplifying the increment logic
    - add another unit test to check the new logic works with 3 dimensional tensor
    - gains about 2% for ssd_mobilenet
  - Avoid floating point operations on each iteration on Concat
    - about 0.5% for ssd_mobilenet and ssd_resnet34
  - Put common case first in ExecutionFrame::AllocateAsPerAllocationPlan to avoid unnecessary call to IsSparseTensor
    - about 0.05% for ssd_mobilenet
  - Minor tweak to put some ctors in the TensorShape header so they can be inlined more easily

common	Remove unneeded C APIs + some refactoring. (#1555)	2019-08-07 11:05:29 -07:00
framework	A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis (#1578)	2019-08-08 07:20:00 +10:00
graph	Don't create implicit input for outer scope value if there is a subgraph input with the same name. (#1186)	2019-08-02 07:23:41 +10:00
optimizer	Add/correct missing SAL annotations + avoid using unsigned types (except where counts are involved). (#1451)	2019-07-22 23:25:53 -07:00
platform	Enable use of session based threadpool. (#854)	2019-04-18 10:20:46 -07:00
providers	Expose provider factory C API, especially for CUDA users (#1461)	2019-07-22 19:03:06 -07:00
session	Remove unneeded C APIs + some refactoring. (#1555)	2019-08-07 11:05:29 -07:00