pytorch/cmake
Stefan-Alin Pahontu 0674ab7e33 solve apl dependency issue (#145215)
According to the [APL documentation](https://developer.arm.com/documentation/101004/2404/General-information/Arm-Performance-Libraries-example-programs), libraries ending with `_mp` are OpenMP multi-threaded libraries.

When a project is compiled with MSVC and the `-openmp` flag, the `vcomp` library (the Visual C++ implementation of OpenMP) handles the OpenMP runtime calls.

However, the current APL implementation uses the `libomp.dll` (LLVM) variant.

As a result, the two OpenMP runtimes are mixed in one process, and unexpected behavior occurs at runtime.

---

For example:

```python
import torch

# Create a sparse tensor
# Input (Sparse Tensor):
# [[0, 1],
#  [1, 0]]
indices = torch.tensor([[0, 1], [1, 0]])
values = torch.tensor([1, 1], dtype=torch.float32)
size = torch.Size([2, 2])

sparse_tensor = torch.sparse_coo_tensor(indices, values, size)

# Convert sparse tensor to dense tensor
dense_tensor = sparse_tensor.to_dense()

# Expected Output (Dense Tensor):
# [[0, 1],
#  [1, 0]]
print("\nDense Tensor:")
print(dense_tensor)
```

Instead, it prints unexpected output such as:

```python
# [[0, 11],
#  [10, 0]]
```

The issue arises because the following code does not function as expected at runtime:

https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/ParallelOpenMP.h#L30

```c++
// Returns 1; however, since OpenMP is enabled, it should return the total number of threads.
int64_t num_threads = omp_get_num_threads();
```
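To illustrate why a wrong thread count can corrupt results, here is a minimal sketch in plain Python. It is a simplified model of static chunking in a parallel loop, not the code in `ParallelOpenMP.h`. The helper names (`divup`, `simulate_parallel_for`, `accumulate_ones`) are hypothetical, and the scenario in which every thread is also told its id is 0 is our assumption, not something the commit message confirms.

```python
def divup(x, y):
    """Ceiling division for non-negative integers."""
    return (x + y - 1) // y


def simulate_parallel_for(begin, end, actual_threads, reported_threads,
                          reported_tid, body):
    """Call `body(lo, hi)` once per simulated thread, on that thread's chunk.

    actual_threads:   how many threads really execute the region.
    reported_threads: the team size each thread is told (hypothetical knob).
    reported_tid:     maps a real thread index to the id it is told.
    """
    chunk_size = divup(end - begin, reported_threads)
    for real_tid in range(actual_threads):
        tid = reported_tid(real_tid)
        begin_tid = begin + tid * chunk_size
        if begin_tid < end:
            body(begin_tid, min(end, begin_tid + chunk_size))


def accumulate_ones(n, actual_threads, reported_threads, reported_tid):
    """Add 1 to every element of a length-n list through the simulated loop."""
    out = [0] * n

    def body(lo, hi):
        for i in range(lo, hi):
            out[i] += 1

    simulate_parallel_for(0, n, actual_threads, reported_threads,
                          reported_tid, body)
    return out


# Consistent runtime: 4 threads, each sees a team of 4 and its own id.
healthy = accumulate_ones(8, 4, 4, lambda t: t)

# Assumed failure mode: 4 threads run, but each is told "team of 1, id 0".
broken = accumulate_ones(8, 4, 1, lambda t: 0)

# Team size misreported as 1, but thread ids still correct.
serial = accumulate_ones(8, 4, 1, lambda t: t)

print(healthy)
print(broken)
print(serial)
```

In this model, the consistent case touches each element exactly once. When every thread believes it is the only one, each thread processes the whole range and the accumulation is repeated once per thread. If only the team size is wrong, a single thread does all the work and the result stays correct, just without parallelism.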

---

At runtime, loading multiple OpenMP libraries in the same process (in this case `libomp` and `vcomp`) causes unexpected behavior.
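
One way to check for this situation is to scan the list of modules loaded into a process for more than one OpenMP runtime. The sketch below is illustrative only: `find_openmp_runtimes` and the name-prefix table are hypothetical, the example paths are made up, and obtaining the real module list is platform-specific and left to the caller.

```python
import os

# Assumption: file-name prefixes of common OpenMP runtime libraries.
KNOWN_OPENMP_RUNTIMES = {
    "vcomp": "MSVC (vcomp)",
    "libomp": "LLVM (libomp)",
    "libiomp": "Intel (libiomp)",
    "libgomp": "GNU (libgomp)",
}


def find_openmp_runtimes(module_paths):
    """Group loaded-module paths by the OpenMP runtime they appear to belong to."""
    found = {}
    for path in module_paths:
        # Normalize Windows separators so basename works on any host OS.
        name = os.path.basename(path.replace("\\", "/")).lower()
        for prefix, label in KNOWN_OPENMP_RUNTIMES.items():
            if name.startswith(prefix):
                found.setdefault(label, []).append(path)
    return found


# Hypothetical module list for a process that loaded both runtimes.
modules = [
    r"C:\Windows\System32\vcomp140.dll",
    r"C:\apl\bin\libomp.dll",
    r"C:\Python\python312.dll",
]

runtimes = find_openmp_runtimes(modules)
has_conflict = len(runtimes) > 1
print(runtimes)
print("multiple OpenMP runtimes loaded:", has_conflict)
```

A result with more than one key indicates that two different OpenMP runtimes are present in the same process, which is the condition described above.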

To fix this, we switched from the `_mp` libraries to their non-`_mp` counterparts and use `vcomp` for OpenMP calls.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145215
Approved by: https://github.com/ozanMSFT, https://github.com/malfet

Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
2025-01-27 13:02:16 +00:00
External Let aotriton.cmake detect the best binary package to use, and deprecate aotriton_version.txt (#137443) 2025-01-09 00:00:02 +00:00
Modules solve apl dependency issue (#145215) 2025-01-27 13:02:16 +00:00
Modules_CUDA_fix [NVIDIA] Full Family Blackwell Support codegen (#145436) 2025-01-24 04:36:00 +00:00
public [ROCm] hipblaslt rowwise f8 gemm (#144432) 2025-01-15 18:23:44 +00:00
Allowlist.cmake
BuildVariables.cmake
Caffe2Config.cmake.in
CheckAbi.cmake
cmake_uninstall.cmake.in
Codegen.cmake [Build] Add COMMIT_SHA to caffe2::GetBuildOptions (#141313) 2024-11-26 00:09:36 +00:00
DebugHelper.cmake
Dependencies.cmake Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392)" (#145505) 2025-01-23 18:50:59 +00:00
FlatBuffers.cmake
GoogleTestPatch.cmake
IncludeSource.cpp.in
iOS.cmake
Metal.cmake [MPS] Support includes in metal objects (#145087) 2025-01-18 05:35:22 +00:00
MiscCheck.cmake Add SVE implementation of embedding_lookup_idx (#133995) 2024-10-15 18:52:44 +00:00
prioritized_text.txt
ProtoBuf.cmake
ProtoBufPatch.cmake
Summary.cmake Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392)" (#145505) 2025-01-23 18:50:59 +00:00
TorchConfig.cmake.in Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392)" (#145505) 2025-01-23 18:50:59 +00:00
TorchConfigVersion.cmake.in
VulkanCodegen.cmake
VulkanDependencies.cmake