onnxruntime/onnxruntime/core
Weixing Zhang b4b1c6440a
Enable ORT with CUDA 11 toolkit (#4168)
* ORT on CUDA 11

1. Separate HOROVOD and MPI
2. Separate NCCL from HOROVOD in CMakeLists.txt
3. Remove dependency on external cub
4. cudnnSetRNNDescriptor is changed in cuDNN 8.0

* polish the code about MPI/NCCL in CMakeLists.txt and build.py

* check CUDA version

* ${MPI_INCLUDE_DIRS} should be PUBLIC

* sm30, sm50 are deprecated in CUDA 11 Toolkit

* update changes based on code review feedback.

* add sm_52

* improve MPI/NCCL build path

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-06-15 08:47:03 -07:00
codegen
common Fix Windows Inbox build failing on 1) building raw api tests and 2) referencing _winml namespace in onnxruntime.dll (#3872) 2020-05-08 15:59:16 -07:00
dll
framework fix optional input/outputs (#4229) 2020-06-15 08:10:22 +10:00
graph Enable static memory planning for pipeline. (#4204) 2020-06-12 21:43:50 -07:00
language_interop_ops
mlas MLAS: fuse float output into quantized GEMM (#4215) 2020-06-12 17:50:40 -07:00
optimizer Fix subgraph based reshape fusion (#4185) 2020-06-14 21:10:08 -07:00
platform use LOAD_WITH_ALTERED_SEARCH_PATH for LoadLibraryExA (#3908) 2020-05-11 19:53:34 -07:00
profile Use OrtMutex and OrtCondVar everywhere instead of std::mutex/std::condition_variable for consistency. 2020-06-03 08:42:16 -07:00
protobuf Address PR comments and clean up. (#3536) 2020-04-15 15:51:52 -07:00
providers Enable ORT with CUDA 11 toolkit (#4168) 2020-06-15 08:47:03 -07:00
session Fix crash reported in #4070. (#4091) 2020-06-01 15:27:14 -07:00
util MLAS: fuse float output into quantized GEMM (#4215) 2020-06-12 17:50:40 -07:00