onnxruntime/cmake
Weixing Zhang aec4cb489e
ROCm EP for AMD GPU (#5480)
The ROCm EP is designed and implemented on top of ROCm, AMD's GPU software stack. For details about ROCm, see https://rocmdocs.amd.com/en/latest/

The ROCm EP builds on the following components:
1. AMD GPU programming language: HIP
2. AMD GPU HIP language runtime: amdhip64
3. BLAS: rocBLAS, hipBLAS
4. DNN: MIOpen
5. Collective Communication library: RCCL
6. cub: hipCUB
7. …

Current status:
BERT-L and GPT2 training can be run on AMD GPUs with data parallelism.

Next:
1. Share more GPU code between the ROCm EP and the CUDA EP, since the HIP language and HIP runtime API are very close to CUDA.
2. Continue improving the implementation.
3. Continue GPU kernel optimization.
4. Support model parallelism on ROCm EP.
……

The ROCm kernels have been removed from this commit and will be submitted in a separate PR. Because the original PR was too big (~180 files), it was suggested to split it into two parts: one with the ROCm kernels and one with everything else.

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
Co-authored-by: sabreshao <sabre.shao@amd.com>
Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com>
Co-authored-by: Suffian Khan <sukha@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2020-10-29 17:13:04 -07:00
..
external Remove MKLML build config (#5559) 2020-10-21 13:11:25 -07:00
horovod Address PR comments and clean up. (#3536) 2020-04-15 15:51:52 -07:00
patches OpenVINO EP v2.0 (#3585) 2020-04-24 04:06:02 -07:00
tensorboard Introduce training changes. 2020-03-11 14:39:03 -07:00
CMakeLists.txt ROCm EP for AMD GPU (#5480) 2020-10-29 17:13:04 -07:00
CMakeSettings.json Fork the WinML APIs into the Microsoft namespace (#3503) 2020-04-17 06:18:54 -07:00
codeconv.runsettings
ConfigureVisualStudioCodeAnalysis.props
EnableVisualStudioCodeAnalysis.props
flake8.cmake Older flake8 versions report false positives and don't handle the same things in the config file. (#3983) 2020-05-20 07:29:22 +10:00
onnxruntime.cmake ROCm EP for AMD GPU (#5480) 2020-10-29 17:13:04 -07:00
onnxruntime_codegen.cmake [ORT Mobile] file format schema and file I/O code (#4973) 2020-09-01 11:51:31 +10:00
onnxruntime_common.cmake Various armv7 related fixes (#5394) 2020-10-09 22:34:32 +10:00
onnxruntime_config.h.in Thread pool changes (#3153) 2020-03-30 12:18:40 -07:00
onnxruntime_csharp.cmake Add amd migraphx execution provider to onnx runtime (#2929) 2020-05-27 04:24:59 +08:00
onnxruntime_flatbuffers.cmake fix build break (#5306) 2020-09-28 00:10:48 -07:00
onnxruntime_framework.cmake Ryanunderhill/backout 5014 (#5167) 2020-09-14 22:48:00 -07:00
onnxruntime_fuzz_test.cmake Onnxruntime fuzzing (#4341) 2020-07-06 16:34:34 -07:00
onnxruntime_graph.cmake Replace MPI Send and Recv with NCCL Send and Recv (#5054) 2020-09-09 09:39:56 -07:00
onnxruntime_ios.toolchain.cmake Add iOS test pipeline and a sample app. (#5298) 2020-09-29 13:53:11 -07:00
onnxruntime_java.cmake [Android NNAPI EP] Remove dependency on external JD/DNNLibrary (#4576) 2020-07-22 14:08:12 -07:00
onnxruntime_java_unittests.cmake [java] Adds a CUDA test (#3956) 2020-05-18 12:05:51 -07:00
onnxruntime_language_interop_ops.cmake [ORT Mobile] file format schema and file I/O code (#4973) 2020-09-01 11:51:31 +10:00
onnxruntime_mlas.cmake MLAS: Add support for AVXVNNI (#5592) 2020-10-26 16:27:48 -07:00
onnxruntime_nodejs.cmake build: split nodejs binding build and test to avoid timeout issue (#4188) 2020-06-10 19:16:32 -07:00
onnxruntime_nuphar_extern.cmake
onnxruntime_optimizer.cmake [ORT Mobile] file format schema and file I/O code (#4973) 2020-09-01 11:51:31 +10:00
onnxruntime_providers.cmake ROCm EP for AMD GPU (#5480) 2020-10-29 17:13:04 -07:00
onnxruntime_pyop.cmake [ORT Mobile] file format schema and file I/O code (#4973) 2020-09-01 11:51:31 +10:00
onnxruntime_python.cmake ROCm EP for AMD GPU (#5480) 2020-10-29 17:13:04 -07:00
onnxruntime_session.cmake [ORT Mobile] file format schema and file I/O code (#4973) 2020-09-01 11:51:31 +10:00
onnxruntime_training.cmake ROCm EP for AMD GPU (#5480) 2020-10-29 17:13:04 -07:00
onnxruntime_unittests.cmake ROCm EP for AMD GPU (#5480) 2020-10-29 17:13:04 -07:00
onnxruntime_util.cmake Create Utils for Adding Range and Marker (#4013) 2020-05-24 22:55:24 -07:00
precompiled_header.cmake
protobuf_function.cmake Last major set of ORT format model changes (#5056) 2020-09-05 07:59:01 +10:00
set_winapi_family_desktop.h Fix WCOS/Win32 linking bugs (#3126) 2020-03-19 08:52:40 -07:00
store_toolchain.cmake Use onecore umbrella lib in onecore builds (#5182) 2020-09-16 10:46:27 -07:00
target_delayload.cmake Use onecore umbrella lib in onecore builds (#5182) 2020-09-16 10:46:27 -07:00
wcos_rules_override.cmake Use onecore umbrella lib in onecore builds (#5182) 2020-09-16 10:46:27 -07:00
wil.cmake
winml.cmake Store/containerized apps support (#4651) 2020-09-09 14:36:35 -07:00
winml_cppwinrt.cmake Add Experimental WinRT API IDL as placeholder for adding new winrt features (#4736) 2020-08-12 12:45:19 -07:00
winml_sdk_helpers.cmake
winml_unittests.cmake Add WinML Model testing (#5417) 2020-10-15 19:04:12 -07:00