onnxruntime/onnxruntime/test/framework
Weixing Zhang 299ace0759
Support to allow user to specify compute stream per session (#3723)
* Support to allow user to specify compute stream per session

Create computation cuda stream explicitly rather than use default legacy stream or per-thread default stream.

remove some redundant cudaStreamSynchronize

fix gpt2 model test failures

don't use default stream in nccl either.

add stream synchronization in OnRunEnd()

using cub::DeviceScan::InclusiveSum which can be called with stream specified.

fix topK failure due to latest rebase

fix tensorrt

support user specified stream

add user_stream support in tensorrt EP

use same stream for both TensorRT and CUDA EP.

fix ScatterND

specify stream for adasum and p2p kernels.

fix loop

fix CApiTest.custom_op_handler

fix CApiTest.varied_input_custom_op_handler

change for cudaMemcpyFromSymbol

improve provider options for user specified compute stream

* add changes for ROCM EP

* fix GatherGrad UT for ROCM EP

* clean code and fix NonMaxSuppression

* use default stream for ROCM now

* fix CApiTest.custom_op_handler:OrtFormatCustomOpTests.ConvertOnnxModelToOrt

* fix tensorrt ut: CApiTest.io_binding_cuda

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2021-02-05 15:48:18 -08:00
cuda Fix CUDA test hang: (#6138) 2020-12-16 16:32:56 +10:00
allocation_planner_test.cc
allocator_test.cc Unify activation and initializer alignment value (#6109) 2020-12-14 13:13:41 -08:00
bfc_arena_test.cc Fix edge case in BFCArena where allocation failures could lead to an infinite loop. (#6145) 2020-12-17 07:52:31 +10:00
data_types_test.cc Add tag types for Ort::Float16_t and Ort::BFloat16_t structs (#5716) 2020-11-06 16:41:26 -08:00
distance_test.cc
dummy_allocator.cc
dummy_allocator.h
dummy_provider.cc
dummy_provider.h
endian_test.cc
execution_frame_test.cc Unify activation and initializer alignment value (#6109) 2020-12-14 13:13:41 -08:00
execution_provider_test.cc Helper for compiling EP to generate deterministic unique ids for use in MetaDef names (#6156) 2020-12-21 12:17:58 +10:00
float_16_test.cc
inference_session_test.cc Support to allow user to specify compute stream per session (#3723) 2021-02-05 15:48:18 -08:00
insert_cast_transformer_test.cc
kernel_registry_test.cc
local_kernel_registry_test.cc
math_test.cc
mem_pattern_planner_test.cc Exclude some training specific code from the minimal build. Cleanup some related aspects of allocation planner. (#5861) 2020-11-20 20:25:46 +10:00
memcpy_transformer_test.cc
model_builder_utils.h
opaque_kernels_test.cc
ort_model_only_test.cc Add support for custom ops to minimal build. (#6228) 2021-01-25 10:41:00 +10:00
parallel_executor_test.cc
provider_options_utils_test.cc Deprecate Python global configuration functions [Part 2] (#6171) 2021-01-07 10:10:55 -08:00
random_test.cc
session_state_test.cc Remove onnxruntime_session_options_config_keys.h from c_api (#5772) 2020-11-12 09:12:13 -08:00
shape_inference_test.cc
sparse_kernels_test.cc add external data support to tensor proto utils (#6257) 2021-01-13 14:14:18 -08:00
tensor_test.cc enable arena for arm64 (#5613) 2020-10-28 08:40:43 -07:00
tensorutils_test.cc add external data support to tensor proto utils (#6257) 2021-01-13 14:14:18 -08:00
test_tensor_loader.cc
test_utils.cc [CoreML EP] Add CI for CoreML EP (macOS) and add coreml_flags for EP options (#6481) 2021-01-28 12:25:46 -08:00
test_utils.h [CoreML EP] Add CI for CoreML EP (macOS) and add coreml_flags for EP options (#6481) 2021-01-28 12:25:46 -08:00
TestAllocatorManager.cc
TestAllocatorManager.h