onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-26 03:00:54 +00:00

Dmitri Smirnov e1901a7e10 Improve performance of CUDA implementations for GatherElements and Greater, Equal and Less (#4989): Make GatherElements kernel process 16 items each. Unroll the constant loop. Quit loops early for zero dividend. Optimize Binary CompareFunction and remove Impl_Cast invocation.		2020-09-02 10:17:39 -07:00
codegen	Convert TensorRT provider into a shared library (#4721)	2020-08-10 21:17:16 -07:00
common	Add Cmake config for onnxruntime_NO_EXCEPTIONS (#4975)	2020-09-01 10:17:50 -07:00
dll
flatbuffers	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
framework	Rename DeviceAllocatorRegistrationInfo to a more generic name; Use OrtArenaCfg for arena members; Remove unused OrtMemType; Simplify CreateAllocator interface. (#4970)	2020-09-01 09:25:32 -07:00
graph	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
language_interop_ops	FixPyOpSegFault&MakeItStaticLib (#4600)	2020-07-28 11:45:25 -07:00
mlas	Add option ORT_NO_EXCEPTIONS to disable most exception/throw in /onnxruntime/ (#4894)	2020-08-28 23:03:51 -07:00
optimizer	Pass Model Path to TensorProtoToMLValue from Constant Folding for External Inputs (#5000)	2020-09-02 21:54:40 +08:00
platform	Remove evaluate telemetry due to redundancy (#4996)	2020-09-01 17:02:00 -07:00
profile
protobuf	implement per-channel for quantizelinear and dequantizelinear (#4759)	2020-08-21 12:08:50 -07:00
providers	Improve performance of CUDA implementations for GatherElements and Greater, Equal and Less (#4989)	2020-09-02 10:17:39 -07:00
session	Remove evaluate telemetry due to redundancy (#4996)	2020-09-01 17:02:00 -07:00
util