[NCCL][CUDA] Set PYTORCH_C10_DRIVER_API_SUPPORTED in ProcessGroupNCCL.cpp compilation (#137828)

Without this define, `expandable_segments()` is hardcoded to return false in `CUDAAllocatorConfig.h`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137828
Approved by: https://github.com/yifuwang, https://github.com/Skylion007
eqy 2024-10-14 19:38:20 +00:00 committed by PyTorch MergeBot
parent 19918a1863
commit 914c90dcea

@@ -562,6 +562,7 @@ if(USE_CUDA)
 ${TORCH_SRC_DIR}/csrc/distributed/c10d/intra_node_comm.cpp
 ${TORCH_SRC_DIR}/csrc/distributed/c10d/CudaDMAConnectivity.cpp
 ${TORCH_SRC_DIR}/csrc/distributed/c10d/CUDASymmetricMemory.cu
+${TORCH_SRC_DIR}/csrc/distributed/c10d/ProcessGroupNCCL.cpp
 PROPERTIES COMPILE_FLAGS "-DPYTORCH_C10_DRIVER_API_SUPPORTED=1"
 )
 endif()
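
The effect described in the commit message can be sketched with a macro-guarded accessor. This is a minimal illustration and not the actual contents of `CUDAAllocatorConfig.h`: `AllocatorConfigSketch`, its `requested` field, and `check_sketch()` are hypothetical names introduced here. Only the macro `PYTORCH_C10_DRIVER_API_SUPPORTED` and the function name `expandable_segments()` come from the commit.

```cpp
#include <cassert>

// Hypothetical stand-in for an allocator config class (assumption, not PyTorch code).
struct AllocatorConfigSketch {
  // What the user asked for, e.g. through an environment setting.
  bool requested = false;

  // The return value depends on whether the macro was defined when
  // the including source file was compiled.
  bool expandable_segments() const {
#ifndef PYTORCH_C10_DRIVER_API_SUPPORTED
    // Macro absent: the request is ignored and the result is always false.
    return false;
#else
    // Macro present: the requested value is honored.
    return requested;
#endif
  }
};

// Small self-check that passes in both build configurations.
inline void check_sketch() {
  AllocatorConfigSketch cfg;
  cfg.requested = true;
#ifndef PYTORCH_C10_DRIVER_API_SUPPORTED
  assert(!cfg.expandable_segments());
#else
  assert(cfg.expandable_segments());
#endif
}
```

Compiling a file that includes this sketch with `-DPYTORCH_C10_DRIVER_API_SUPPORTED=1` selects the branch that honors the request. Compiling without the flag selects the branch that always returns false. Because the flag in the diff is attached to individual source files, a file missing from that list would see the second behavior even when the rest of the build sees the first.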