pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

History

Stefan-Alin Pahontu 0674ab7e33 solve apl dependency issue (#145215 ) According to the [APL documentation](https://developer.arm.com/documentation/101004/2404/General-information/Arm-Performance-Libraries-example-programs), libraries ending with _mp are OpenMP multi-threaded libraries. When a project is compiled with MSVC and the -openmp flag, the vcomp library (Visual C++ implementation of OpenMP) is used for runtime calls. However, the current APL implementation uses the libomp.dll (LLVM) variant. As a result, there are unexpected behaviors at runtime. --- For Example: ```python import torch # Create a sparse tensor # Input (Sparse Tensor): # [[0, 1], # [1, 0]] indices = torch.tensor([[0, 1], [1, 0]]) values = torch.tensor([1, 1], dtype=torch.float32) size = torch.Size([2, 2]) sparse_tensor = torch.sparse_coo_tensor(indices, values, size) # Convert sparse tensor to dense tensor dense_tensor = sparse_tensor.to_dense() # Expected Output (Dense Tensor): # [[0, 1], # [1, 0]] print("\nDense Tensor:") print(dense_tensor) ``` However, it prints unexpected outputs such as: ```python # [[0, 11], # [10, 0]] ``` The issue arises because the following code does not function as expected at runtime: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/ParallelOpenMP.h#L30 ```c++ // returns 1 , however since OpenMP is enabled it should return total number of threads int64_t num_threads = omp_get_num_threads(); ``` --- In the runtime, loading multiple OpenMP libraries (in this case `libomp` and `vcomp`) is causing unexpected behaviours. So, we've changed libraries from `_mp` to non `_mp` versions and we used `vcomp` for OpenMP calls. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145215 Approved by: https://github.com/ozanMSFT, https://github.com/malfet Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>		2025-01-27 13:02:16 +00:00
..
External	Let aotriton.cmake detect the best binary package to use, and deprecate aotriton_version.txt (#137443 )	2025-01-09 00:00:02 +00:00
Modules	solve apl dependency issue (#145215 )	2025-01-27 13:02:16 +00:00
Modules_CUDA_fix	[NVIDIA] Full Family Blackwell Support codegen (#145436 )	2025-01-24 04:36:00 +00:00
public	[ROCm] hipblaslt rowwise f8 gemm (#144432 )	2025-01-15 18:23:44 +00:00
Allowlist.cmake
BuildVariables.cmake
Caffe2Config.cmake.in
CheckAbi.cmake
cmake_uninstall.cmake.in
Codegen.cmake	[Build] Add `COMMIT_SHA` to `caffe2::GetBuildOptions` (#141313 )	2024-11-26 00:09:36 +00:00
DebugHelper.cmake
Dependencies.cmake	Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392 )" (#145505 )	2025-01-23 18:50:59 +00:00
FlatBuffers.cmake
GoogleTestPatch.cmake
IncludeSource.cpp.in
iOS.cmake
Metal.cmake	[MPS] Support includes in metal objects (#145087 )	2025-01-18 05:35:22 +00:00
MiscCheck.cmake	Add SVE implementation of embedding_lookup_idx (#133995 )	2024-10-15 18:52:44 +00:00
prioritized_text.txt
ProtoBuf.cmake
ProtoBufPatch.cmake
Summary.cmake	Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392 )" (#145505 )	2025-01-23 18:50:59 +00:00
TorchConfig.cmake.in	Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392 )" (#145505 )	2025-01-23 18:50:59 +00:00
TorchConfigVersion.cmake.in
VulkanCodegen.cmake
VulkanDependencies.cmake