pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Author	SHA1	Message	Date
Ting Lu	d0e70c4fd3	windows Magma build for cu128 (#146653 ) https://github.com/pytorch/pytorch/issues/145570 Removes `.ci/pytorch/windows/internal/cuda_install.bat`, as it duplicates `.github/scripts/windows/cuda_install.bat`. The latter is the one in use - https://github.com/pytorch/pytorch/pull/146653/files#diff-613791f266f2f7b81148ca8f447b0cd6c6544f824f5f46a78a2794006c78957bR8 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146653 Approved by: https://github.com/atalman Co-authored-by: atalman <atalman@fb.com>	2025-02-10 13:48:55 +00:00
PyTorch MergeBot	f17109bd96	Revert "windows Magma build for cu128 (#146653 )" This reverts commit `9e27d36e2b`. Reverted https://github.com/pytorch/pytorch/pull/146653 on behalf of https://github.com/atalman due to Broke nightly builds ([comment](https://github.com/pytorch/pytorch/pull/146653#issuecomment-2643882976))	2025-02-07 19:37:16 +00:00
Ting Lu	9e27d36e2b	windows Magma build for cu128 (#146653 ) https://github.com/pytorch/pytorch/issues/145570 Removes `.ci/pytorch/windows/internal/cuda_install.bat`, as it duplicates `.github/scripts/windows/cuda_install.bat`. The latter is the one in use - https://github.com/pytorch/pytorch/pull/146653/files#diff-613791f266f2f7b81148ca8f447b0cd6c6544f824f5f46a78a2794006c78957bR8 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146653 Approved by: https://github.com/atalman	2025-02-07 18:09:30 +00:00
Catherine Lee	97b64f2e5c	Fix workflow for closing nonexistent disable issues (#146447 ) The workflow could not update issues because it didn't have permissions, and it looked green because it didn't check return codes. Tested by running the workflow and seeing that issues did get closed Fixes https://github.com/pytorch/pytorch/issues/145382 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146447 Approved by: https://github.com/huydhn	2025-02-05 22:29:05 +00:00
Ting Lu	9e45bc82e9	[aarch64] CUDA 12.8 aarch64 builds to nightly binaries (#146378 ) https://github.com/pytorch/pytorch/issues/145570 Adding Cuda 12.8 and keeping 12.6 for the sbsa build, supported CUDA_ARCH: 9.0, 10.0, 12.0 Refactor the binaries matrix for cuda sbsa build. Previously cuda-aarch64 was hardcoded to cuda 12.6. Now reads 12.6 and 12.8, new build naming example [manywheel-py3_9-cuda-aarch64-12_8-build](https://github.com/pytorch/pytorch/actions/runs/13132625006/job/36640885079?pr=146378#logs) TODO: once 12.8 is stable, remove 12.6 in sbsa Pull Request resolved: https://github.com/pytorch/pytorch/pull/146378 Approved by: https://github.com/atalman	2025-02-05 02:55:21 +00:00
atalman	a7cc6d3e84	Manylinux 2.28 migration - remove pre-cxx11 abi libtorch builds (#146200 ) Related to: https://github.com/pytorch/pytorch/issues/123649 Removing pre-cxx11 abi builds. As per announcement : https://dev-discuss.pytorch.org/t/pytorch-linux-wheels-switching-to-new-wheel-build-platform-manylinux-2-28-on-november-12-2024/2581 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146200 Approved by: https://github.com/kit1980, https://github.com/huydhn	2025-01-31 21:43:12 +00:00
Aleksei Nikiforov	eb5a0718c2	S390x nightly builds timeouts (#146041 ) Sometimes the build times out near the end. Increasing the timeout should fix this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146041 Approved by: https://github.com/huydhn, https://github.com/malfet	2025-01-31 17:29:11 +00:00
Ting Lu	9232355bb0	Add CUDA 12.8 manywheel x86 Builds to Binaries Matrix (#145792 ) https://github.com/pytorch/pytorch/issues/145570 Adding cuda 12.8.0 x86 builds first Pull Request resolved: https://github.com/pytorch/pytorch/pull/145792 Approved by: https://github.com/nWEIdia, https://github.com/malfet, https://github.com/atalman	2025-01-31 16:12:02 +00:00
PyTorch MergeBot	967cf85f3a	Revert "Update mi300 labels to account for multiple clusters. (#145923 )" This reverts commit `3e135993bd`. Reverted https://github.com/pytorch/pytorch/pull/145923 on behalf of https://github.com/atalman due to reverting back to one cluster ([comment](https://github.com/pytorch/pytorch/pull/145923#issuecomment-2625022826))	2025-01-30 16:45:50 +00:00
Benjamin Glass	933b6d9830	cpp_wrapper: enable in aarch64 and x86 nightly dashboard performance runs (#145791 ) Adds `cpp_wrapper` mode to the nightly inductor benchmark runs, as well as optionally for manually triggered runs. This is justified by `aot_inductor` already being in those runs. Additionally, re-enables `aot_inductor` in the nightly aarch64 runs. It was disabled 5 months ago to deal with a performance instability, which has likely gone away at this point. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145791 Approved by: https://github.com/desertfire	2025-01-30 02:55:45 +00:00
Yang Wang	a9ed7bd78e	[utilization] pipeline to create clean db records (#145327 ) upload_utilization_script to generate db-ready-insert records to s3 - generate two files: metadata and timeseries in ossci-utilization buckets - convert log record to db format ones - add unit test job for tools/stats/ Related Prs: setup composite action for data pipeline: https://github.com/pytorch/pytorch/pull/145310 add permission for composite action to access S3 bucket: https://github.com/pytorch-labs/pytorch-gha-infra/pull/595 add insert logic in s3 replicator: https://github.com/pytorch/test-infra/pull/6217 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145327 Approved by: https://github.com/huydhn Co-authored-by: Huy Do <huydhn@gmail.com>	2025-01-29 23:48:50 +00:00
saienduri	3e135993bd	Update mi300 labels to account for multiple clusters. (#145923 ) We now have multiple Kubernetes clusters of mi300x resources, and this commit updates labels accordingly to target both clusters evenly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145923 Approved by: https://github.com/jeffdaily	2025-01-29 16:56:43 +00:00
Ting Lu	354fe48db9	Add magma cuda build 12.8 (#145765 ) https://github.com/pytorch/pytorch/issues/145570 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145765 Approved by: https://github.com/malfet	2025-01-29 08:43:38 +00:00
Ting Lu	f4ca98950e	Add CUDA 12.8 libtorch image (#145789 ) https://github.com/pytorch/pytorch/issues/145570 Builds 12.8 libtorch docker/deprecate 12.1 meanwhile Pull Request resolved: https://github.com/pytorch/pytorch/pull/145789 Approved by: https://github.com/nWEIdia, https://github.com/atalman	2025-01-29 02:59:37 +00:00
Wei Wang	6bcb545d9c	[CI][CUDA][cuSPARSELt] cusparselt 0.6.3 and cu121 related cleanups (#145793 ) Make ci cusparselt installation be consistent with nightly binary Remove cu121 related docker build jobs and inductor runs Update test failures relating to cu121 Retry of https://github.com/pytorch/pytorch/pull/145696 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145793 Approved by: https://github.com/eqy, https://github.com/tinglvv	2025-01-28 21:01:58 +00:00
atalman	5382ab57d7	Move trunk windows builds to CUDA-12.4 (#145844 ) Same as : https://github.com/pytorch/pytorch/pull/130446 That should catch build regressions that were previously only detectable during the nightly builds for 12.4 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145844 Approved by: https://github.com/janeyx99, https://github.com/malfet	2025-01-28 18:00:51 +00:00
Huy Do	56915b093a	Fix environment deployment spam (#145823 ) With https://github.com/pytorch-labs/pytorch-gha-infra/pull/598 in place, the environment can now be removed. Fixes https://github.com/pytorch/pytorch/issues/145704 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145823 Approved by: https://github.com/clee2000	2025-01-28 17:46:31 +00:00
Zain Rizvi	097ccd9c39	Move ROCm MI300 jobs to unstable to make CI green (#145790 ) This is a temporary change to reduce intermittent tests failures. Jobs can be moved back once those machines get better runner isolation. This also sneaks in a small fix to all the rocm job's build step to be run on Linux Foundation runners (the get-label-type dependency). The inductor-rocm-mi300 workflow already had it, but it was missing in the rocm-mi300 workflow. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145790 Approved by: https://github.com/yangw-dev	2025-01-28 17:25:15 +00:00
Aaron Orenstein	60f98262f1	PEP585: .github (#145707 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145707 Approved by: https://github.com/huydhn	2025-01-27 21:21:01 +00:00
Ting Lu	93dd6bc4d8	Add CUDA 12.8 installation and manylinux-cuda12.8 (#145567 ) Breaking https://github.com/pytorch/pytorch/pull/145557 into two parts. Need to have manylinux-cuda12.8 in order to build magma. Issue: https://github.com/pytorch/pytorch/issues/145570 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145567 Approved by: https://github.com/nWEIdia, https://github.com/atalman	2025-01-27 20:49:07 +00:00
Huy Do	2f8ad8f4b9	Run inductor perf benchmark on ROCm (#145763 ) This requires https://github.com/pytorch/pytorch/pull/144594. The test run on PT2 dashboard is at https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Mon%2C%2020%20Jan%202025%2019%3A46%3A14%20GMT&stopTime=Mon%2C%2027%20Jan%202025%2019%3A46%3A14%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=rocm&lBranch=144594&lCommit=9f5cb037965aa2990b2e4593610bca92526ebb3b&rBranch=144594&rCommit=9f5cb037965aa2990b2e4593610bca92526ebb3b Pull Request resolved: https://github.com/pytorch/pytorch/pull/145763 Approved by: https://github.com/jeffdaily	2025-01-27 20:19:03 +00:00
Huy Do	5d01a2874f	Increase the number of perf benchmark shards (#145534 ) Per the discussion on https://github.com/pytorch/pytorch/issues/140332#issuecomment-2610805551, this adds 2 more shards for HF, 2 more for TorchBench, and 1 more for TIMM. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145534 Approved by: https://github.com/jeanschmidt	2025-01-27 16:20:42 +00:00
amdfaa	e57cdb8402	[ROCm] trunk.yml only runs pre-merge via ciflow/trunk label (#145629 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145629 Approved by: https://github.com/jeffdaily	2025-01-24 18:31:33 +00:00
amdfaa	ce371ab4c6	[ROCm] Create inductor-rocm-mi300 (#145621 ) - Adds an mi300 inductor workflow to main. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145621 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-01-24 17:04:17 +00:00
atalman	5d24a9a274	Advance docker release latest version to cuda 12.4 (#145566 ) Fixed the latest tag in ghcr.io to be the cuda 12.4 docker image. TODO: add this to https://github.com/pytorch/builder/blob/main/CUDA_UPGRADE_GUIDE.MD We will need to check whether this can be automated by introducing a cuda_stable variable or something similar. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145566 Approved by: https://github.com/nWEIdia, https://github.com/kit1980, https://github.com/malfet	2025-01-24 15:27:25 +00:00
Yang Wang	6d4f5f7688	[Utilization][Usage Log] Add data model for record (#145114 ) Add data model for consistency and data model change in the future. The data model will be used during the post-test-process pipeline Pull Request resolved: https://github.com/pytorch/pytorch/pull/145114 Approved by: https://github.com/huydhn	2025-01-23 19:04:41 +00:00
amdfaa	c9e12d6a3b	[ROCm] Update rocm.yml and add rocm-mi300.yml (#145398 ) - Added another workflow to run the mi300 jobs post-merge. - Updated rocm.yml to use mi200s instead of mi300s. - Required to get an idea of how PRs are landing on our mi200s and mi300s. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145398 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-01-23 00:07:50 +00:00
Huy Do	266fd35c58	Fix ExecuTorch, XLA, Triton hash updates (#145314 ) Fix some stale hash updates https://github.com/pytorch/pytorch/pulls/pytorchupdatebot reported by @izaitsevfb * XLA and ExecuTorch now wait for all jobs in pull instead of hardcoding job names, which are no longer correct and left the bot waiting forever * The Triton commit hash hasn't been updated automatically since 2023, and people have been updating the pin manually with their own testing from time to time, so I doubt it is a useful thing to keep. The vision update failures look more complex, though, and I need to take a closer look, so I will handle them in another PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145314 Approved by: https://github.com/izaitsevfb	2025-01-21 23:24:21 +00:00
Catherine Lee	7dd9d1f243	Update clickhouse-connect to 0.8.14 (#144915 ) Corresponds to https://github.com/pytorch/test-infra/pull/6177 I only tested the slow test script but I also did testing on the new version with scripts in https://github.com/pytorch/test-infra/pull/6177 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144915 Approved by: https://github.com/huydhn	2025-01-21 21:43:18 +00:00
Huy Do	eb553ae3cf	Fix broken gpt_fast micro benchmark after #144315 (#145235 ) The benchmark is failing with the following error ``` File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 333, in <module> main(output_file=args.output, only_model=args.only) File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 308, in main lst = func(device) File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 66, in run_mlp_layer_norm_gelu us_per_iter = benchmarker.benchmark(compiled_mod, (x,)) * 1000 File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/_inductor/runtime/benchmarking.py", line 39, in wrapper return fn(self, *args, **kwargs) TypeError: benchmark() missing 1 required positional argument: 'fn_kwargs' ``` An example error is https://github.com/pytorch/pytorch/actions/runs/12862761823/job/35858912555 I also assign `oncall: pt2` as the owner of this job going forward. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145235 Approved by: https://github.com/nmacchioni	2025-01-21 17:42:24 +00:00
atalman	2cffbff7da	Add 3.13t Windows and MacOS binary builds (#141806 ) Related to: https://github.com/pytorch/pytorch/issues/130249 For conda uses approach described here: https://conda-forge.org/blog/2024/09/26/python-313/ Create Python 3.13t conda env like so: ``` conda create -n py313 python=3.13 python-freethreading -c conda-forge ``` For windows executable installation we need to pass additional parameter to enable 3.13t: ``` Include_freethreaded=1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/141806 Approved by: https://github.com/albanD	2025-01-21 17:16:19 +00:00
Wang, Chuanqi	225a10febe	[CI] Add xpu linux build into pull workflow (#145084 ) To mitigate the XPU build failure risk introduced by non-XPU specific PRs. Refer #144967 & #143803 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145084 Approved by: https://github.com/huydhn, https://github.com/atalman	2025-01-20 19:31:48 +00:00
Aleksei Nikiforov	53e2408015	Improve cleanup of cancelled jobs on s390x for tests too (#144968 ) Follow up to https://github.com/pytorch/pytorch/pull/144149 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144968 Approved by: https://github.com/huydhn	2025-01-20 12:56:07 +00:00
Huy Do	cf28d613f1	Allow ROCm runner to upload benchmark results if found (#144710 ) https://github.com/pytorch/pytorch/wiki/How-to-integrate-with-PyTorch-OSS-benchmark-database. This will unblock AMD when they try to run MI300 benchmarks on CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144710 Approved by: https://github.com/kit1980	2025-01-16 19:31:45 +00:00
atalman	519269a415	[BE] - Remove conda test and upload scripts and env variables from Workflows Part 1 (#144870 ) Remove conda test and upload scripts and env variables from Workflows Related to: https://github.com/pytorch/pytorch/issues/138506 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144870 Approved by: https://github.com/malfet	2025-01-16 17:20:14 +00:00
Yutao Xu	6470b0ea6f	Update torch-xpu-ops commit pin (#144739 ) Update the torch-xpu-ops commit to 22cc419e4e60f469341712a5a103fa309a7dfd48, which includes: - Fix for building issue https://github.com/intel/torch-xpu-ops/issues/1279 - Aten operator coverage improvement Note: the new torch-xpu-ops commit doesn't support bundle 0.5.3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144739 Approved by: https://github.com/EikanWang, https://github.com/malfet	2025-01-16 15:12:37 +00:00
Huy Do	05095a45f2	Fix the wrong artifact in remaining workflows (#144812 ) I missed them in https://github.com/pytorch/pytorch/pull/144694 as they weren't run often. But they are still failing nonetheless, i.e. https://github.com/pytorch/pytorch/actions/runs/12762640334/job/35578870178 The issue was from https://github.com/pytorch/pytorch/pull/125401 where it added `use-gha: ${{ inputs.use-gha }}` to linux_test workflow. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144812 Approved by: https://github.com/clee2000	2025-01-15 20:36:40 +00:00
Wang, Chuanqi	b4b4e57469	[CD] Enable profiling for XPU Windows nightly wheels (#144316 ) PR https://github.com/pytorch/pytorch/pull/144034 added profiling support for torch XPU Windows binary, enable it in PyTorch XPU Windows CD Works for https://github.com/pytorch/pytorch/issues/114850 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144316 Approved by: https://github.com/xuhancn, https://github.com/atalman	2025-01-14 19:01:27 +00:00
Nikita Shulga	6053242890	[CD] Enable python3.13t builds for aarch64 (#144698 ) But make sure that right numpy version is picked (2.0.2 does not support 3.13) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144698 Approved by: https://github.com/atalman ghstack dependencies: #144696, #144697, #144716	2025-01-14 02:29:01 +00:00
Huy Do	b221f88fc1	Leave SCCACHE_S3_KEY_PREFIX empty to share the cache among all build jobs (#144704 ) This is a follow-up of https://github.com/pytorch/pytorch/pull/144112#pullrequestreview-2528451214. After leaving https://github.com/pytorch/pytorch/pull/144112 running for more than a week, all build jobs were fine, but I failed to see any improvement in build time. So, let's try @malfet's suggestion of removing the prefix altogether to keep it simple. After this lands, I will circle back to see if there are any improvements. Otherwise, it's still a simple BE change I guess. Here is the query I'm using to gather build time data for reference: ``` with jobs as ( select id, name, DATE_DIFF('minute', created_at, completed_at) as duration, DATE_TRUNC('week', created_at) as bucket from workflow_job where name like '%/ build' and html_url like concat('%', {repo: String }, '%') and conclusion = 'success' and created_at >= (CURRENT_TIMESTAMP() - INTERVAL 6 MONTHS) ), aggregated_jobs_in_bucket as ( select --groupArray(duration) as durations, --quantiles(0.9)(duration), avg(duration), bucket from jobs group by bucket ) select * from aggregated_jobs_in_bucket order by bucket desc ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/144704 Approved by: https://github.com/clee2000	2025-01-14 02:19:38 +00:00
atalman	c15d6508bd	Binary builds Docker images - remove cuda 12.1 (#144575 ) Remove cuda 12.1 from manylinux, libtorch and almalinux builds Pull Request resolved: https://github.com/pytorch/pytorch/pull/144575 Approved by: https://github.com/seemethere, https://github.com/kit1980, https://github.com/malfet, https://github.com/Skylion007	2025-01-13 22:44:59 +00:00
Huy Do	5129d6ef51	Fix inductor periodic smoke test wrong artifact (#144694 ) I'm not entirely sure why this failure starts to show up in periodic since Friday https://github.com/pytorch/pytorch/actions/runs/12716967189/job/35463656803. The artifact was uploaded to S3, but `use-gha: anything-non-empty-to-use-gh` was set and it was working. Maybe this is related to https://github.com/pytorch/pytorch/issues/144479 I also clean up the GCP/AWS A100 selection logic as the GCP cluster doesn't exist anymore. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144694 Approved by: https://github.com/clee2000	2025-01-13 21:42:39 +00:00
Nikita Shulga	92ddb3d3d3	[MPS] Expose `MPSProfiler::start/stopCapture` to Python (#144561 ) I.e. when the `MTL_CAPTURE_ENABLED` environment variable is set to 1, one should be able to wrap code with `torch.mps.profiler.metal_capture` to generate a gputrace for shaders invoked inside the context manager. For example, code below: ```python import torch import os def foo(x): return x[:,::2].sin() + x[:, 1::2].cos() if __name__ == "__main__": os.environ["MTL_CAPTURE_ENABLED"] = "1" x = torch.rand(32, 1024, device="mps") with torch.mps.profiler.metal_capture("compiled_shader"): torch.compile(foo)(x) ``` should capture the execution of a `torch.compile` generated shader. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144561 Approved by: https://github.com/manuelcandales ghstack dependencies: #144559, #144560	2025-01-11 02:05:36 +00:00
Aleksei Nikiforov	4143312e67	S390x ci periodic tests (#125401 ) Periodically run testsuite for s390x Dependencies update Package z3-solver is updated from version 4.12.2.0 to version 4.12.6.0. This is a minor version update, so no functional change is expected. The reason for update is build on s390x. pypi doesn't provide binary build for z3-solver for versions 4.12.2.0 or 4.12.6.0 for s390x. Unfortunately, version 4.12.2.0 fails to build with newer gcc used on s390x builders, but those errors are fixed in version 4.12.6.0. Due to this minor version bump fixes build on s390x. ``` # pip3 install z3-solver==4.12.2.0 ... In file included from /tmp/pip-install-756iytc6/z3-solver_ce6f750b780b4146a9a7c01e52672071/core/src/util/region.cpp:53: /tmp/pip-install-756iytc6/z3-solver_ce6f750b780b4146a9a7c01e52672071/core/src/util/region.cpp: In member function ‘void* region::allocate(size_t)’: /tmp/pip-install-756iytc6/z3-solver_ce6f750b780b4146a9a7c01e52672071/core/src/util/tptr.h:29:62: error: ‘uintptr_t’ does not name a type 29 \| #define ALIGN(T, PTR) reinterpret_cast<T>(((reinterpret_cast<uintptr_t>(PTR) >> PTR_ALIGNMENT) + \ \| ^~~~~~~~~ /tmp/pip-install-756iytc6/z3-solver_ce6f750b780b4146a9a7c01e52672071/core/src/util/region.cpp:82:22: note: in expansion of macro ‘ALIGN’ 82 \| m_curr_ptr = ALIGN(char , new_curr_ptr); \| ^~~~~ /tmp/pip-install-756iytc6/z3-solver_ce6f750b780b4146a9a7c01e52672071/core/src/util/region.cpp:57:1: note: ‘uintptr_t’ is defined in header ‘<cstdint>’; did you forget to ‘#include <cstdint>’? 
56 \| #include "util/page.h" +++ \|+#include <cstdint> 57 \| ``` Python paths update On AlmaLinux 8 s390x, old paths: ``` python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())' /usr/lib/python3.12/site-packages ``` Total result is `/usr/lib/python3.12/site-packages/torch;/usr/lib/python3.12/site-packages` New paths: ``` python -c 'import site; print(";".join([x for x in site.getsitepackages()] + [x + "/torch" for x in site.getsitepackages()]))' /usr/local/lib64/python3.12/site-packages;/usr/local/lib/python3.12/site-packages;/usr/lib64/python3.12/site-packages;/usr/lib/python3.12/site-packages;/usr/local/lib64/python3.12/site-packages/torch;/usr/local/lib/python3.12/site-packages/torch;/usr/lib64/python3.12/site-packages/torch;/usr/lib/python3.12/site-packages/torch ``` ``` # python -c 'import torch ; print(torch)' <module 'torch' from '/usr/local/lib64/python3.12/site-packages/torch/__init__.py'> ``` `pip3 install dist/*.whl` installs torch into `/usr/local/lib64/python3.12/site-packages`, and later it's not found by cmake with old paths: ``` CMake Error at CMakeLists.txt:9 (find_package): By not providing "FindTorch.cmake" in CMAKE_MODULE_PATH this project has asked CMake to find a package configuration file provided by "Torch", but CMake did not find one. ``` https://github.com/pytorch/pytorch/actions/runs/10994060107/job/30521868178?pr=125401 Builders availability Build took 60 minutes Tests took: 150, 110, 65, 55, 115, 85, 50, 70, 105, 110 minutes (split into 10 shards) 60 + 150 + 110 + 65 + 55 + 115 + 85 + 50 + 70 + 105 + 110 = 975 minutes used. Let's double it. It would be 1950 minutes. We have 20 machines * 24 hours = 20 * 24 * 60 = 20 * 1440 = 28800 minutes We currently run 5 nightly binaries builds, each on average 90 minutes build, 15 minutes test, 5 minutes upload, 110 minutes total for each, 550 minutes total. Doubling would be 1100 minutes. That leaves 28800 - 1100 = 27700 minutes total. 
Periodic tests would use 1950 minutes, which leaves 27700 - 1950 = 25750 minutes. Nightly binaries build + nightly tests = 1100 + 1950 = 3050 minutes. 25750 / 3050 = 8.44. So we could do both 8 more times for additional CI runs for any reason. And that is with a pretty good safety margin. Skip test_tensorexpr On s390x, pytorch is built without llvm. Even if it were built with llvm, llvm currently doesn't support the features used on s390x and the test fails with errors like: ``` JIT session error: Unsupported target machine architecture in ELF object pytorch-jitted-objectbuffer unknown file: Failure C++ exception with description "valOrErr INTERNAL ASSERT FAILED at "/var/lib/jenkins/workspace/torch/csrc/jit/tensorexpr/llvm_jit.h":34, please report a bug to PyTorch. Unexpected failure in LLVM JIT: Failed to materialize symbols: { (main, { func }) } ``` Disable cpp/static_runtime_test on s390x Quantization is not fully supported on s390x in pytorch yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125401 Approved by: https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2025-01-10 18:21:07 +00:00
Benjamin Glass	08eaaa61ea	Inductor dashboard benchmarks: swap unused freeze_autotune_cudagraphs workflow for cppwrapper workflow (#144427 ) GitHub limits us to 10 inputs per workflow_dispatch job, so this PR swaps out an input that is no longer used for the cppwrapper input. See [the HUD](https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Thu%2C%2002%20Jan%202025%2016%3A30%3A07%20GMT&stopTime=Thu%2C%2009%20Jan%202025%2016%3A30%3A07%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=cuda%20(a100)&lBranch=gh/benjaminglass1/53/orig&lCommit=4c3d3ad3c7886cbda9705b41c6db5fa7da0d6fe9&rBranch=main&rCommit=00df63f09f07546bacec734f37132edc58ccf574) for an example showing that it works and displays sane output. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144427 Approved by: https://github.com/desertfire, https://github.com/huydhn	2025-01-09 23:56:00 +00:00
Aleksei Nikiforov	127f836881	S390x cancelled jobs cleanup (#144149 ) Sometimes a job is cancelled during nested docker container creation. This leads to the nested docker container not being stopped and the worker hanging forever in the job. Improve nested docker container cleanup for these cases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144149 Approved by: https://github.com/seemethere	2025-01-09 20:45:19 +00:00
Jithun Nair	1365ae859c	[ROCm][CI] upgrade CI to ROCm 6.3 (#142152 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142152 Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-01-09 17:14:16 +00:00
Dmitry Nikolaev	d4871750d9	[ROCm] Enable post-merge trunk workflow on MI300 runners; skip and fix MI300 related failed tests (#143673 ) This PR * makes changes to the workflow files and scripts so we can run CI workflows on the MI300 runners * skips and fixes several tests, failed on MI300, observed in https://github.com/pytorch/pytorch/pull/140989 Skipped due to unsupported Float8_e4m3fn data type on MI300 (need to update test code to use datatypes supported by MI300): - distributed.tensor.parallel.test_micro_pipeline_tp.py::MicroPipelineTPTest::test_fuse_all_gather_scaled_matmul_A_dims_\_gather_dim_\ (24 tests across inductor/distributed configs) - distributed.tensor.parallel.test_micro_pipeline_tp.py::test_fuse_scaled_matmul_reduce_scatter_A_dims_\_scatter_dim_\ (12 tests across inductor/distributed configs)) - inductor.test_loop_ordering::LoopOrderingTest::test_fp8_cast_and_t - inductor.test_loop_ordering::LoopOrderingTest::test_fp8_pattern_2 Skipped due to AssertionError on MI300: - inductor.test_mkldnn_pattern_matcher.py::test_qconv2d_int8_mixed_bf16 - distributed._tools.test_sac_ilp::TestSACILP::test_sac_ilp_case1 Skipped: - test_cuda.py::TestCudaMallocAsync::test_clock_speed - test_cuda.py::TestCudaMallocAsync::test_power_draw - test_torch.py::TestTorchDeviceTypeCUDA::test_deterministic_cumsum_cuda Skipped flaky tests on MI300: - distributed.test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda - inductor.test_cpu_repro::CPUReproTests::test_lstm_packed_unbatched_False* (256 tests) Fixed: - test_matmul_cuda.py::TestFP8MatmulCudaCUDA::test_float8_basics_cuda Features: - inductor/test_fp8.py - declare a new function to convert FP8 datatypes to ROCm supported FP8 datatypes. 
It keeps test names for CUDA and ROCm and allows to enable Inductor FP8 tests on CPU Pull Request resolved: https://github.com/pytorch/pytorch/pull/143673 Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/pruthvistony Co-authored-by: saienduri <saimanas.enduri@amd.com> Co-authored-by: Jithun Nair <jithun.nair@amd.com> Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2025-01-09 05:18:57 +00:00
Huy Do	aa7d01ea22	Use sccache 0.9.0 on ROCm build job (#144125 ) TSIA, sccache 0.9.0 seems to work fine with ROCm build job Pull Request resolved: https://github.com/pytorch/pytorch/pull/144125 Approved by: https://github.com/jithunnair-amd, https://github.com/wdvr, https://github.com/jeffdaily	2025-01-04 08:56:48 +00:00
Huy Do	f3968373c1	Migrate the rest of CUDA 12.1 jobs to 12.4 (#144118 ) CUDA 12.4 is the default now and we don't build nightly 12.1 anymore, so it's time to move the rest of CI jobs to 12.4. I also clean up some redundant CI jobs on periodic and inductor-periodic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144118 Approved by: https://github.com/atalman	2025-01-03 17:45:41 +00:00
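
The s390x builder-capacity estimate in the commit message of 4143312e67 (#125401) can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: every figure is copied from that commit message, while the variable names are hypothetical and do not come from the PyTorch repository.

```python
# All figures are taken from the commit message of 4143312e67 (#125401).
# Variable names are hypothetical and exist only for this illustration.

build_minutes = 60
test_shard_minutes = [150, 110, 65, 55, 115, 85, 50, 70, 105, 110]  # 10 shards

# One periodic run: build plus all test shards, then doubled as a safety margin.
periodic_run = build_minutes + sum(test_shard_minutes)
periodic_budget = 2 * periodic_run

# Daily capacity of the builder pool: 20 machines running 24 hours.
machines = 20
daily_capacity = machines * 24 * 60

# Five nightly binary builds: 90 min build + 15 min test + 5 min upload each,
# also doubled as a safety margin.
nightly_builds = 5
nightly_minutes_each = 90 + 15 + 5
nightly_budget = 2 * nightly_builds * nightly_minutes_each

# Minutes left after reserving both budgets, and how many additional full
# rounds (nightly builds + periodic tests) would fit into that remainder.
remaining = daily_capacity - nightly_budget - periodic_budget
combined = nightly_budget + periodic_budget
extra_rounds = remaining / combined

print(f"periodic run: {periodic_run} min, doubled: {periodic_budget} min")
print(f"daily capacity: {daily_capacity} min, remaining: {remaining} min")
print(f"extra rounds that fit: {extra_rounds:.2f}")
```

Under these numbers the remainder is 25750 minutes against a combined budget of 3050 minutes, which matches the ratio of roughly 8.44 quoted in the commit message.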

2250 commits