pytorch/cmake
Peter Bell fa86874bbd Fix intermittent link errors in NCCL build (#84245)
Should fix #13362 and fix #83790

I think I've discovered the root cause of the intermittent NCCL link
failures. Consider the variable name in the redefinition error:
```
_02021d91_11_sendrecv_cu_0bc7b9c8_11152
```

This is the name of the file being compiled plus some form of unique ID.
As part of NCCL's build process, the same file is compiled multiple
times with different macro definitions depending on which operator and
dtype are being compiled, e.g.
```
nvcc -DNCCL_OP=0 -DNCCL_TYPE=0 -dc sendrecv.cu -o sendrecv_sum_i8.o
```

Since the filename parts are the same, the entire identifier collides
whenever the unique IDs also happen to collide, and the link then
fails. The fix here is to generate a unique `.cu` file for each
object file. I've implemented this as a `.patch` file that gets
applied from our cmake code, though forking NCCL instead would be
cleaner.
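
As a rough illustration of the idea (not the actual patch, which modifies NCCL's own build files), the sketch below copies the shared source once per object file so that each compile sees a distinct filename. The function name `plan_unique_sources`, the `variants` mapping, and the `sum_u8` entry are hypothetical; only `sendrecv.cu`, `sendrecv_sum_i8.o`, and the `nvcc` flags come from the example above.
```python
import shutil
from pathlib import Path


def plan_unique_sources(base_src, variants, build_dir):
    """Copy base_src once per variant; return (source, object, command) tuples.

    variants maps a hypothetical suffix such as "sum_i8" to the
    (NCCL_OP, NCCL_TYPE) macro values used for that object file.
    """
    base_src = Path(base_src)
    build_dir = Path(build_dir)
    build_dir.mkdir(parents=True, exist_ok=True)

    plan = []
    for suffix, (op, dtype) in sorted(variants.items()):
        # A distinct source name per object, so the filename part of any
        # generated identifier differs between variants.
        unique_src = build_dir / f"{base_src.stem}_{suffix}.cu"
        shutil.copyfile(base_src, unique_src)
        obj = unique_src.with_suffix(".o")
        # Assumption: a real build would also need include paths pointing
        # back at the original source directory.
        cmd = [
            "nvcc", f"-DNCCL_OP={op}", f"-DNCCL_TYPE={dtype}",
            "-dc", str(unique_src), "-o", str(obj),
        ]
        plan.append((unique_src, obj, cmd))
    return plan
```

With `variants = {"sum_i8": (0, 0)}`, the planned command compiles `sendrecv_sum_i8.cu` into `sendrecv_sum_i8.o` rather than compiling `sendrecv.cu` directly. The function only plans the commands and creates the copies; it does not invoke `nvcc`.
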
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84245
Approved by: https://github.com/janeyx99, https://github.com/malfet
2022-09-13 19:55:52 +00:00
External Fix intermittent link errors in NCCL build (#84245) 2022-09-13 19:55:52 +00:00
Modules Fix false positive AVX, AVX2 and AVX512 detection with MSVC (#82554) 2022-08-01 23:52:49 +00:00
Modules_CUDA_fix
public Remove caffe2 mobile (#84338) 2022-09-08 01:49:55 +00:00
Allowlist.cmake
BuildVariables.cmake
Caffe2Config.cmake.in [ROCm] Load ROCm if Torch is used as a dependency (#80469) 2022-07-05 21:04:07 +00:00
Caffe2ConfigVersion.cmake.in
cmake_uninstall.cmake.in
Codegen.cmake [retake2][mobile] Fix lightweight dispatch OOM error by introducing selective build (#80791) 2022-07-15 18:04:25 +00:00
DebugHelper.cmake
Dependencies.cmake Remove caffe2 mobile (#84338) 2022-09-08 01:49:55 +00:00
FlatBuffers.cmake
GoogleTestPatch.cmake
IncludeSource.cpp.in
iOS.cmake
Metal.cmake
MiscCheck.cmake
ProtoBuf.cmake
ProtoBufPatch.cmake
Summary.cmake [Build] Replace message() in caffe2/CMakeLists.txt with message in cmake/Summary.cmake (#84814) 2022-09-12 16:32:32 +00:00
TorchConfig.cmake.in reorder cpuinfo and clog deps in TorchConfig.cmake (#79551) 2022-06-16 18:23:26 +00:00
TorchConfigVersion.cmake.in
VulkanCodegen.cmake Consolidate all python targets in the tools folder (#80408) 2022-06-29 23:27:47 +00:00
VulkanDependencies.cmake