onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-19 19:00:47 +00:00

Weixing Zhang fff85a6a35 Add GPU kernels for ROCm EP (#5655)

* Add kernels for AMD GPU. This PR is mostly about GPU kernels for the ROCm EP. Because CUDA and HIP are similar GPU programming languages with similar math library calls, one principle of the ROCm EP design is to share CUDA kernels with ROCm as much as possible. The script amd_hipify.py was therefore created to convert CUDA kernels to ROCm HIP kernels automatically during the compilation phase. However, for reasons such as perf issues and syntax differences, some converted kernels need manual intervention. These kernels are checked into the repo for now. To avoid manual intervention, the plan is to refactor the CUDA kernels to make them as portable as possible between the CUDA EP and the ROCm EP. Please refer to the "HIP Porting Guide" for details.
* Like lamb, multi-tensor-apply needs to be disabled for IsAllFiniteOp and ReduceAllL2: the current AMD GPU compiler has a perf issue with a kernel parameter that is a structure passed by value.
* Use hipMemsetAsync and add checks on HIP calls.
* Move the generated files to the build folder.

Co-authored-by: Jesse Benson <jesseb@microsoft.com>
2020-11-06 16:11:06 -08:00
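
The commit message describes converting CUDA kernels to HIP kernels by script. The sketch below illustrates the general idea of such a source-to-source rename pass. It is not the contents of amd_hipify.py: the function name `hipify_source` and the small mapping table are hypothetical, chosen for illustration, and a real conversion would cover many more identifiers and edge cases.

```python
import re

# Hypothetical, deliberately tiny mapping from CUDA names to HIP names.
# This is an illustrative assumption, not the table used by amd_hipify.py.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMemsetAsync": "hipMemsetAsync",
    "cudaMemcpyAsync": "hipMemcpyAsync",
    "cudaStream_t": "hipStream_t",
    "cudaError_t": "hipError_t",
    "cudaSuccess": "hipSuccess",
}

# Match whole identifiers only, trying longer names first so that a short
# name never shadows a longer one that starts with the same characters.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def hipify_source(text: str) -> str:
    """Return `text` with every known CUDA name replaced by its HIP name."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(1)], text)


if __name__ == "__main__":
    cuda_snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaError_t err = cudaMemsetAsync(ptr, 0, bytes, stream);\n"
    )
    print(hipify_source(cuda_snippet))
```

A plain rename pass like this cannot handle the perf issues or syntax differences the commit message mentions, which is consistent with some converted kernels still needing manual changes.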
external	Upgrade optional implementation to https://github.com/martinmoene/optional-lite . (#5563)	2020-11-03 15:27:47 -08:00
horovod
patches
tensorboard
CMakeLists.txt	Modify logic to determine OV Version (#5701)	2020-11-05 15:12:02 -08:00
CMakeSettings.json
codeconv.runsettings
ConfigureVisualStudioCodeAnalysis.props
EnableVisualStudioCodeAnalysis.props
flake8.cmake	Older flake8 versions report false positives and don't handle the same things in the config file. (#3983)	2020-05-20 07:29:22 +10:00
onnxruntime.cmake	ROCm EP for AMD GPU (#5480)	2020-10-29 17:13:04 -07:00
onnxruntime_codegen.cmake	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
onnxruntime_common.cmake	Upgrade optional implementation to https://github.com/martinmoene/optional-lite . (#5563)	2020-11-03 15:27:47 -08:00
onnxruntime_config.h.in
onnxruntime_csharp.cmake	Add amd migraphx execution provider to onnx runtime (#2929)	2020-05-27 04:24:59 +08:00
onnxruntime_flatbuffers.cmake	fix build break (#5306)	2020-09-28 00:10:48 -07:00
onnxruntime_framework.cmake	Ryanunderhill/backout 5014 (#5167)	2020-09-14 22:48:00 -07:00
onnxruntime_fuzz_test.cmake	Onnxruntime fuzzing (#4341)	2020-07-06 16:34:34 -07:00
onnxruntime_graph.cmake	Replace MPI Send and Recv with NCCL Send and Recv (#5054)	2020-09-09 09:39:56 -07:00
onnxruntime_ios.toolchain.cmake	Add iOS test pipeline and a sample app. (#5298)	2020-09-29 13:53:11 -07:00
onnxruntime_java.cmake	[Android NNAPI EP] Remove dependency on external JD/DNNLibrary (#4576)	2020-07-22 14:08:12 -07:00
onnxruntime_java_unittests.cmake	[java] Adds a CUDA test (#3956)	2020-05-18 12:05:51 -07:00
onnxruntime_language_interop_ops.cmake	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
onnxruntime_mlas.cmake	MLAS: Add support for AVXVNNI (#5592)	2020-10-26 16:27:48 -07:00
onnxruntime_nodejs.cmake	build: split nodejs binding build and test to avoid timeout issue (#4188)	2020-06-10 19:16:32 -07:00
onnxruntime_nuphar_extern.cmake
onnxruntime_optimizer.cmake	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
onnxruntime_providers.cmake	Add GPU kernels for ROCm EP (#5655)	2020-11-06 16:11:06 -08:00
onnxruntime_pyop.cmake	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
onnxruntime_python.cmake	ROCm EP for AMD GPU (#5480)	2020-10-29 17:13:04 -07:00
onnxruntime_session.cmake	[ORT Mobile] file format schema and file I/O code (#4973)	2020-09-01 11:51:31 +10:00
onnxruntime_training.cmake	ROCm EP for AMD GPU (#5480)	2020-10-29 17:13:04 -07:00
onnxruntime_unittests.cmake	Revert "Custom Op on GPU (#5620)"	2020-10-30 21:23:51 -07:00
onnxruntime_util.cmake	Create Utils for Adding Range and Marker (#4013)	2020-05-24 22:55:24 -07:00
precompiled_header.cmake
protobuf_function.cmake	Last major set of ORT format model changes (#5056)	2020-09-05 07:59:01 +10:00
set_winapi_family_desktop.h
store_toolchain.cmake	Use onecore umbrella lib in onecore builds (#5182)	2020-09-16 10:46:27 -07:00
target_delayload.cmake	Use onecore umbrella lib in onecore builds (#5182)	2020-09-16 10:46:27 -07:00
wcos_rules_override.cmake	Use onecore umbrella lib in onecore builds (#5182)	2020-09-16 10:46:27 -07:00
wil.cmake
winml.cmake	Store/containerized apps support (#4651)	2020-09-09 14:36:35 -07:00
winml_cppwinrt.cmake	Add Experimental WinRT API IDL as placeholder for adding new winrt features (#4736)	2020-08-12 12:45:19 -07:00
winml_sdk_helpers.cmake
winml_unittests.cmake	Add WinML Model testing (#5417)	2020-10-15 19:04:12 -07:00