Mirror of https://github.com/saymrwulf/pytorch.git, synced 2026-05-14 20:57:59 +00:00

Summary: The NCCL library is built using [CUDA separate compilation](https://devblogs.nvidia.com/separate-compilation-linking-cuda-device-code/), which builds intermediate CUDA binaries and then links them into GPU code that can be executed on the device. The intermediate CUDA code is stored in the `__nv_relfatbin` section, and the code that can be launched is stored in `.nv_fatbin`. When `nvcc` links an executable or shared library, it removes those intermediate binaries, but the default host linker is not aware of them, so they are kept inside the host executable. Work around this by removing the `__nv_relfatbin` sections from the object files inside `libnccl_static.a`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/35843

Test Plan: Build pytorch with CUDA and run `test_distributed.py`

Differential Revision: D20882224

Pulled By: malfet

fbshipit-source-id: f23dd4aa416518324cb38b9bd6846e73a1c7dd21

| Name | | |
|---|---|---|
| External | ||
| Modules | ||
| Modules_CUDA_fix | ||
| public | ||
| BuildVariables.cmake | ||
| Caffe2Config.cmake.in | ||
| Caffe2ConfigVersion.cmake.in | ||
| cmake_uninstall.cmake.in | ||
| Codegen.cmake | ||
| Dependencies.cmake | ||
| GoogleTestPatch.cmake | ||
| iOS.cmake | ||
| MiscCheck.cmake | ||
| ProtoBuf.cmake | ||
| ProtoBufPatch.cmake | ||
| Summary.cmake | ||
| TorchConfig.cmake.in | ||
| TorchConfigVersion.cmake.in | ||
| Utils.cmake | ||
| Whitelist.cmake | ||
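
The section-stripping step described in the commit summary can be sketched roughly as follows. This is a minimal illustration, not the change made in the pull request: the helper name `slim_archive_commands`, the output name `libnccl_slim_static.a`, and the object file names are hypothetical, and the sketch assumes GNU binutils (`ar`, `objcopy`) are on the `PATH`. It only builds the command lines; running them is left to the caller.

```python
import shlex
from typing import List, Sequence


def slim_archive_commands(
    archive: str,
    objects: Sequence[str],
    out_archive: str,
    section: str = "__nv_relfatbin",
) -> List[List[str]]:
    """Build (but do not run) the commands that strip `section` from each
    object of a static archive and repack the result.

    Hypothetical helper for illustration. The commands are meant to be run
    in an empty scratch directory, with `archive` and `out_archive` given as
    absolute paths, because `ar x` extracts into the current directory.
    """
    # 1. Unpack every member of the archive into the current directory.
    cmds: List[List[str]] = [["ar", "x", archive]]
    # 2. Drop the intermediate (relocatable) device code from each object.
    for obj in objects:
        cmds.append(["objcopy", "--remove-section", section, obj])
    # 3. Repack the slimmed objects into a new static archive.
    cmds.append(["ar", "rcs", out_archive, *objects])
    return cmds


if __name__ == "__main__":
    # Object names below are placeholders, not NCCL's real member list.
    for cmd in slim_archive_commands(
        "/tmp/nccl/lib/libnccl_static.a",
        ["all_reduce.o", "broadcast.o"],
        "/tmp/nccl/lib/libnccl_slim_static.a",
    ):
        print(shlex.join(cmd))
```

Each returned command could be passed to `subprocess.run(cmd, check=True, cwd=scratch_dir)`. Real objects may need extra handling, for example when relocations still reference the removed section, so treat this as a starting point rather than a drop-in build step.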