onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-11 17:48:34 +00:00

History

Yufeng Li 8c5db7f973 use legacy stream mode (#2076 )		2019-10-14 16:03:04 -07:00

In ORT, there are only 3 CUDA streams: default, HtoD, and DtoH. Both HtoD and DtoH are non-blocking streams, so per-thread stream mode brings no benefit. I also tried a multi-threaded environment, and legacy mode is still faster than per-thread mode.

Below is the perf of a 3-layer BERT on V100. Unit is ms.

batch size 1:
concurrency | c=1 | c=2 | c=4
legacy | 0.54 | 1.17 | 2.68
per-thread | 0.66 | 1.37 | 2.86

batch size 4:
concurrency | c=1 | c=2 | c=4
legacy | 1.1 | 2.22 | 4.6
per-thread | 1.21 | 2.44 | 4.98

batch size 64:
concurrency | c=1 | c=2 | c=4
legacy | 8.09 | 16.13 | 32.37
per-thread | 8.18 | 16.26 | 32.45
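
The latency figures in the commit message above are easier to compare as relative overheads. The following is a minimal sketch that computes how much slower per-thread mode is than legacy mode at each concurrency level. The numbers are copied from the commit message. The names `LATENCY_MS`, `CONCURRENCY`, and `per_thread_overhead_pct` are hypothetical helpers introduced only for this illustration and are not part of onnxruntime.

```python
# Latencies in ms, copied from the commit message (3-layer BERT on V100).
# Layout: batch size -> stream mode -> latencies at concurrency c=1, c=2, c=4.
LATENCY_MS = {
    1: {"legacy": [0.54, 1.17, 2.68], "per-thread": [0.66, 1.37, 2.86]},
    4: {"legacy": [1.1, 2.22, 4.6], "per-thread": [1.21, 2.44, 4.98]},
    64: {"legacy": [8.09, 16.13, 32.37], "per-thread": [8.18, 16.26, 32.45]},
}
CONCURRENCY = [1, 2, 4]


def per_thread_overhead_pct(batch_size):
    """Percent by which per-thread mode is slower than legacy, per concurrency level."""
    row = LATENCY_MS[batch_size]
    return [
        100.0 * (per_thread - legacy) / legacy
        for legacy, per_thread in zip(row["legacy"], row["per-thread"])
    ]


if __name__ == "__main__":
    for batch_size in LATENCY_MS:
        overheads = per_thread_overhead_pct(batch_size)
        cells = ", ".join(
            f"c={c}: {pct:.1f}%" for c, pct in zip(CONCURRENCY, overheads)
        )
        print(f"batch size {batch_size}: {cells}")
```

In these measurements the gap is largest for small batches and narrows as the batch size grows. That pattern would fit a roughly fixed per-launch cost rather than a throughput difference, though the commit message itself does not state a cause.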
..
external	Update nGraph to version 0.26 (#1965 )	2019-10-14 10:37:48 -07:00
onnx	make builds more robust (#906 ) (#932 )	2019-04-29 12:58:20 -07:00
patches	Update nGraph to version 0.26 (#1965 )	2019-10-14 10:37:48 -07:00
CMakeLists.txt	use legacy stream mode (#2076 )	2019-10-14 16:03:04 -07:00
ConfigureVisualStudioCodeAnalysis.props
EnableVisualStudioCodeAnalysis.props
get_boost.cmake	restore ninja compatibility	2019-05-15 10:18:52 -07:00
onnxruntime.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_automl_featurizers.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_codegen.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_common.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_config.h.in	Ignore some gcc warnings (#1996 )	2019-10-07 16:32:34 -07:00
onnxruntime_csharp.cmake	Conditionally export execution provider apis in chsarp (#1724 )	2019-09-09 11:17:44 -07:00
onnxruntime_dependencies.dot
onnxruntime_framework.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_graph.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_language_interop_ops.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_mlas.cmake	Introduce a separate check and conditional for AVX512BW build (#2083 )	2019-10-10 16:14:00 -07:00
onnxruntime_nuphar_extern.cmake	Weba/merge ngemm (#2021 )	2019-10-05 12:09:22 -07:00
onnxruntime_optimizer.cmake	Cleanup some aspects of the Initializer class used by optimizers (#2005 )	2019-10-09 10:37:44 +10:00
onnxruntime_providers.cmake	Update TensorRT to version 6.0.1.5 (#1966 )	2019-10-06 10:40:53 -07:00
onnxruntime_pyop.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_python.cmake	pack pyop in nightly build (#2018 )	2019-10-08 12:02:45 -07:00
onnxruntime_server.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_session.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
onnxruntime_unittests.cmake	Replace std::regex with re2 bc CentOS std::regex is broken (#2017 )	2019-10-04 18:47:03 -07:00
onnxruntime_util.cmake	Replace GSL with GSL-LITE submodule and fix up refs (#1920 )	2019-10-01 12:43:29 -07:00
protobuf_function.cmake	Use protobuf-lite to reduce onnxruntime.dll size. (#639 )	2019-03-21 14:06:38 -07:00