pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Author	SHA1	Message	Date
Michael Lazos	1252c1933d	Update to remind users to use torch.compile template (#145960 ) Users have been submitting fuzzer issues without meeting the requirements outlined in the torch.compile issue template. This updates the note to remind users to use the torch.compile template for torch.compile bugs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145960 Approved by: https://github.com/eellison	2025-01-30 21:34:40 +00:00
Michael Lazos	d14046b58d	Update fuzzer guidance to include rng (#145962 ) Add another condition to fuzzer issue guidance. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145962 Approved by: https://github.com/eellison	2025-01-30 21:33:57 +00:00
PyTorch MergeBot	967cf85f3a	Revert "Update mi300 labels to account for multiple clusters. (#145923 )" This reverts commit `3e135993bd`. Reverted https://github.com/pytorch/pytorch/pull/145923 on behalf of https://github.com/atalman due to reverting back to one cluster ([comment](https://github.com/pytorch/pytorch/pull/145923#issuecomment-2625022826))	2025-01-30 16:45:50 +00:00
Benjamin Glass	933b6d9830	cpp_wrapper: enable in aarch64 and x86 nightly dashboard performance runs (#145791 ) Adds `cpp_wrapper` mode to the nightly inductor benchmark runs, as well as optionally for manually triggered runs. This is justified by `aot_inductor` already being in those runs. Additionally, re-enables `aot_inductor` in the nightly aarch64 runs. It was disabled 5 months ago to deal with a performance instability, which has likely gone away at this point. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145791 Approved by: https://github.com/desertfire	2025-01-30 02:55:45 +00:00
Yang Wang	a9ed7bd78e	[utilization] pipeline to create clean db records (#145327 ) upload_utilization_script to generate db-ready-insert records to s3 - generate two files: metadata and timeseries in ossci-utilization buckets - convert log record to db format ones - add unit test job for tools/stats/ Related Prs: setup composite action for data pipeline: https://github.com/pytorch/pytorch/pull/145310 add permission for composite action to access S3 bucket: https://github.com/pytorch-labs/pytorch-gha-infra/pull/595 add insert logic in s3 replicator: https://github.com/pytorch/test-infra/pull/6217 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145327 Approved by: https://github.com/huydhn Co-authored-by: Huy Do <huydhn@gmail.com>	2025-01-29 23:48:50 +00:00
bglass@quansight.com	40ccb7a86d	cpp_wrapper: Move #includes to per-device header files (#145932 ) Summary: This prepares us for the next PR in the stack, where we introduce pre-compiled per-device header files to save compilation time. Reland https://github.com/pytorch/pytorch/pull/143909 after merge conflicts. Co-authored-by: Benjamin Glass <bglass@quansight.com> Differential Revision: D68656960 Pulled By: benjaminglass1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145932 Approved by: https://github.com/yushangdi, https://github.com/benjaminglass1 Co-authored-by: bglass@quansight.com <bglass@quansight.com>	2025-01-29 21:08:45 +00:00
saienduri	3e135993bd	Update mi300 labels to account for multiple clusters. (#145923 ) We now have multiple Kubernetes clusters of mi300x resources, and this commit updates labels accordingly to target both clusters evenly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145923 Approved by: https://github.com/jeffdaily	2025-01-29 16:56:43 +00:00
Ting Lu	354fe48db9	Add magma cuda build 12.8 (#145765 ) https://github.com/pytorch/pytorch/issues/145570 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145765 Approved by: https://github.com/malfet	2025-01-29 08:43:38 +00:00
Ting Lu	f4ca98950e	Add CUDA 12.8 libtorch image (#145789 ) https://github.com/pytorch/pytorch/issues/145570 Builds 12.8 libtorch docker/deprecate 12.1 meanwhile Pull Request resolved: https://github.com/pytorch/pytorch/pull/145789 Approved by: https://github.com/nWEIdia, https://github.com/atalman	2025-01-29 02:59:37 +00:00
albanD	02dd7a7803	Extend abi-stable nitpick message to all the c stable files (#145862 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145862 Approved by: https://github.com/ezyang	2025-01-28 23:22:23 +00:00
Wei Wang	6bcb545d9c	[CI][CUDA][cuSPARSELt] cusparselt 0.6.3 and cu121 related cleanups (#145793 ) Make the CI cusparselt installation consistent with the nightly binary. Remove cu121-related docker build jobs and inductor runs. Update test failures relating to cu121. Retry of https://github.com/pytorch/pytorch/pull/145696 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145793 Approved by: https://github.com/eqy, https://github.com/tinglvv	2025-01-28 21:01:58 +00:00
atalman	5382ab57d7	Move trunk windows builds to CUDA-12.4 (#145844 ) Same as : https://github.com/pytorch/pytorch/pull/130446 That should catch build regressions that were previously only detectable during the nightly builds for 12.4 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145844 Approved by: https://github.com/janeyx99, https://github.com/malfet	2025-01-28 18:00:51 +00:00
Huy Do	56915b093a	Fix environment deployment spam (#145823 ) With https://github.com/pytorch-labs/pytorch-gha-infra/pull/598 in place, the environment can now be removed. Fixes https://github.com/pytorch/pytorch/issues/145704 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145823 Approved by: https://github.com/clee2000	2025-01-28 17:46:31 +00:00
Zain Rizvi	097ccd9c39	Move ROCm MI300 jobs to unstable to make CI green (#145790 ) This is a temporary change to reduce intermittent test failures. Jobs can be moved back once those machines get better runner isolation. This also sneaks in a small fix so that the build step of all the rocm jobs runs on Linux Foundation runners (the get-label-type dependency). The inductor-rocm-mi300 workflow already had it, but it was missing in the rocm-mi300 workflow. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145790 Approved by: https://github.com/yangw-dev	2025-01-28 17:25:15 +00:00
saienduri	7eb51e5464	Ensure GPU isolation for kubernetes pod MI300 runners. (#145829 ) Fixes the reason behind moving the tests to unstable initially. (https://github.com/pytorch/pytorch/pull/145790) We ensure gpu isolation for each pod within kubernetes by propagating the drivers selected for the pod from the Kubernetes layer up to the docker run in pytorch here. Now we stick with the GPUs assigned to the pod in the first place and there is no overlap between the test runners. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145829 Approved by: https://github.com/jeffdaily	2025-01-28 17:20:46 +00:00
Aaron Orenstein	60f98262f1	PEP585: .github (#145707 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145707 Approved by: https://github.com/huydhn	2025-01-27 21:21:01 +00:00
Ting Lu	93dd6bc4d8	Add CUDA 12.8 installation and manylinux-cuda12.8 (#145567 ) Breaking https://github.com/pytorch/pytorch/pull/145557 into two parts. Need to have manylinux-cuda12.8 in order to build magma. Issue: https://github.com/pytorch/pytorch/issues/145570 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145567 Approved by: https://github.com/nWEIdia, https://github.com/atalman	2025-01-27 20:49:07 +00:00
Huy Do	2f8ad8f4b9	Run inductor perf benchmark on ROCm (#145763 ) This requires https://github.com/pytorch/pytorch/pull/144594. The test run on PT2 dashboard is at https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Mon%2C%2020%20Jan%202025%2019%3A46%3A14%20GMT&stopTime=Mon%2C%2027%20Jan%202025%2019%3A46%3A14%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=rocm&lBranch=144594&lCommit=9f5cb037965aa2990b2e4593610bca92526ebb3b&rBranch=144594&rCommit=9f5cb037965aa2990b2e4593610bca92526ebb3b Pull Request resolved: https://github.com/pytorch/pytorch/pull/145763 Approved by: https://github.com/jeffdaily	2025-01-27 20:19:03 +00:00
Edward Z. Yang	635b98fa08	Add nitpick warning that aoti_torch/c/shim.h is ABI stable (#145745 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/145745 Approved by: https://github.com/albanD	2025-01-27 19:25:37 +00:00
Huy Do	5d01a2874f	Increase the number of perf benchmark shards (#145534 ) Per the discussion on https://github.com/pytorch/pytorch/issues/140332#issuecomment-2610805551, this adds 2 more shards for HF, 2 more for TorchBench, and 1 more for TIMM. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145534 Approved by: https://github.com/jeanschmidt	2025-01-27 16:20:42 +00:00
amdfaa	e57cdb8402	[ROCm] trunk.yml only runs pre-merge via ciflow/trunk label (#145629 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145629 Approved by: https://github.com/jeffdaily	2025-01-24 18:31:33 +00:00
amdfaa	ce371ab4c6	[ROCm] Create inductor-rocm-mi300 (#145621 ) - Adds an mi300 inductor workflow to main. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145621 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-01-24 17:04:17 +00:00
atalman	5d24a9a274	Advance docker release latest version to cuda 12.4 (#145566 ) Fixed the latest tag in ghcr.io to be the cuda 12.4 docker image. TODO: need to add it to https://github.com/pytorch/builder/blob/main/CUDA_UPGRADE_GUIDE.MD Will need to check if we can automate this by introducing a cuda_stable variable or something like this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145566 Approved by: https://github.com/nWEIdia, https://github.com/kit1980, https://github.com/malfet	2025-01-24 15:27:25 +00:00
Yang Wang	6d4f5f7688	[Utilization][Usage Log] Add data model for record (#145114 ) Add data model for consistency and data model change in the future. The data model will be used during the post-test-process pipeline Pull Request resolved: https://github.com/pytorch/pytorch/pull/145114 Approved by: https://github.com/huydhn	2025-01-23 19:04:41 +00:00
amdfaa	c9e12d6a3b	[ROCm] Update rocm.yml and add rocm-mi300.yml (#145398 ) - Added another workflow to run the mi300 jobs post-merge. - Updated rocm.yml to use mi200s instead of mi300s. - Required to get an idea of how PRs are landing on our mi200s and mi300s. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145398 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-01-23 00:07:50 +00:00
PyTorch UpdateBot	3917053f63	[audio hash update] update the pinned audio hash (#145328 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned audio hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145328 Approved by: https://github.com/pytorchbot	2025-01-22 19:39:03 +00:00
Huy Do	266fd35c58	Fix ExecuTorch, XLA, Triton hash updates (#145314 ) Fix some stale hash updates https://github.com/pytorch/pytorch/pulls/pytorchupdatebot reported by @izaitsevfb * XLA and ExecuTorch now wait for all jobs in pull instead of hardcoding the job names, which are not correct anymore and cause the bot to wait forever * The Triton commit hash hasn't been updated automatically since 2023 and people have been updating the pin manually with their own testing from time to time, so I doubt that it would be a useful thing to keep. The vision update failures look more complex though, and I would need to take a closer look. So, I will keep that for another PR Pull Request resolved: https://github.com/pytorch/pytorch/pull/145314 Approved by: https://github.com/izaitsevfb	2025-01-21 23:24:21 +00:00
Catherine Lee	7dd9d1f243	Update clickhouse-connect to 0.8.14 (#144915 ) Corresponds to https://github.com/pytorch/test-infra/pull/6177 I only tested the slow test script but I also did testing on the new version with scripts in https://github.com/pytorch/test-infra/pull/6177 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144915 Approved by: https://github.com/huydhn	2025-01-21 21:43:18 +00:00
Huy Do	eb553ae3cf	Fix broken gpt_fast micro benchmark after #144315 (#145235 ) The benchmark is failing with the following error
```
  File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 333, in <module>
    main(output_file=args.output, only_model=args.only)
  File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 308, in main
    lst = func(device)
  File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 66, in run_mlp_layer_norm_gelu
    us_per_iter = benchmarker.benchmark(compiled_mod, (x,)) * 1000
  File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/_inductor/runtime/benchmarking.py", line 39, in wrapper
    return fn(self, *args, **kwargs)
TypeError: benchmark() missing 1 required positional argument: 'fn_kwargs'
```
An example error is https://github.com/pytorch/pytorch/actions/runs/12862761823/job/35858912555 I also assign `oncall: pt2` as the owner of this job going forward. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145235 Approved by: https://github.com/nmacchioni	2025-01-21 17:42:24 +00:00
atalman	2cffbff7da	Add 3.13t Windows and MacOS binary builds (#141806 ) Related to: https://github.com/pytorch/pytorch/issues/130249 For conda, this uses the approach described here: https://conda-forge.org/blog/2024/09/26/python-313/ Create a Python 3.13t conda env like so: ``` conda create -n py313 python=3.13 python-freethreading -c conda-forge ``` For the Windows executable installation, we need to pass an additional parameter to enable 3.13t: ``` Include_freethreaded=1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/141806 Approved by: https://github.com/albanD	2025-01-21 17:16:19 +00:00
Wang, Chuanqi	225a10febe	[CI] Add xpu linux build into pull workflow (#145084 ) To mitigate the XPU build failure risk introduced by non-XPU specific PRs. Refer #144967 & #143803 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145084 Approved by: https://github.com/huydhn, https://github.com/atalman	2025-01-20 19:31:48 +00:00
Aleksei Nikiforov	53e2408015	Improve cleanup of cancelled jobs on s390x for tests too (#144968 ) Follow up to https://github.com/pytorch/pytorch/pull/144149 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144968 Approved by: https://github.com/huydhn	2025-01-20 12:56:07 +00:00
atalman	a215e174a1	[BE] Remove conda from scripts and build files Part 2 (#145015 ) Continuation of https://github.com/pytorch/pytorch/pull/144870 Remove conda logic from scripts: 1. Remove conda build from triton build script 2. Remove conda checks from setup.py 3. Remove conda from release scripts 4. Script read_conda_versions.sh is not used (checked via git grep) Related to: https://github.com/pytorch/pytorch/issues/138506 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145015 Approved by: https://github.com/malfet, https://github.com/Skylion007	2025-01-17 16:26:24 +00:00
Wang, Eikan	dbed747aae	Add Intel GPU specific CMake files to merge rules (#135110 ) As the title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135110 Approved by: https://github.com/atalman	2025-01-17 09:44:13 +00:00
Huy Do	cf28d613f1	Allow ROCm runner to upload benchmark results if found (#144710 ) https://github.com/pytorch/pytorch/wiki/How-to-integrate-with-PyTorch-OSS-benchmark-database. This will unblock AMD when they try to run MI300 benchmarks on CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144710 Approved by: https://github.com/kit1980	2025-01-16 19:31:45 +00:00
Nikita Shulga	ad15436db6	Fix `pt2-bug-report.yml` formatting (#144987 ) This is a 2nd regression caused by https://github.com/pytorch/pytorch/pull/144574 Test plan: `python3 -c "import yaml; foo=yaml.safe_load(open('pt2-bug-report.yml'));print(foo['body'][0])"` Before it printed ``` % python3 -c "import yaml; foo=yaml.safe_load(open('pt2-bug-report.yml'));print(foo['body'][0])" {'type': 'markdown', 'attributes': {'value': ''}} ``` After ``` % python3 -c "import yaml; foo=yaml.safe_load(open('pt2-bug-report.yml'));print(foo['body'][0])" {'type': 'markdown', 'attributes': {'value': '#### Note: Please write your bug report in English to ensure it can be understood and addressed by the development team.\n'}} ``` Fixes https://github.com/pytorch/pytorch/issues/144970 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144987 Approved by: https://github.com/Skylion007, https://github.com/zou3519	2025-01-16 18:58:07 +00:00
atalman	519269a415	[BE] - Remove conda test and upload scripts and env variables from Workflows Part 1 (#144870 ) Remove conda test and upload scripts and env variables from Workflows Related to: https://github.com/pytorch/pytorch/issues/138506 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144870 Approved by: https://github.com/malfet	2025-01-16 17:20:14 +00:00
Yutao Xu	6470b0ea6f	Update torch-xpu-ops commit pin (#144739 ) Update the torch-xpu-ops commit to 22cc419e4e60f469341712a5a103fa309a7dfd48, which includes: - Fix building issue https://github.com/intel/torch-xpu-ops/issues/1279 - Aten operator coverage improvement Note: the new torch-xpu-ops commit doesn't support bundle 0.5.3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144739 Approved by: https://github.com/EikanWang, https://github.com/malfet	2025-01-16 15:12:37 +00:00
Huy Do	05095a45f2	Fix the wrong artifact in remaining workflows (#144812 ) I missed them in https://github.com/pytorch/pytorch/pull/144694 as they weren't run often. But they are still failing nonetheless, i.e. https://github.com/pytorch/pytorch/actions/runs/12762640334/job/35578870178 The issue was from https://github.com/pytorch/pytorch/pull/125401 where it added `use-gha: ${{ inputs.use-gha }}` to linux_test workflow. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144812 Approved by: https://github.com/clee2000	2025-01-15 20:36:40 +00:00
Wang, Chuanqi	b4b4e57469	[CD] Enable profiling for XPU Windows nightly wheels (#144316 ) PR https://github.com/pytorch/pytorch/pull/144034 added profiling support for torch XPU Windows binary, enable it in PyTorch XPU Windows CD Works for https://github.com/pytorch/pytorch/issues/114850 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144316 Approved by: https://github.com/xuhancn, https://github.com/atalman	2025-01-14 19:01:27 +00:00
Nikita Shulga	6053242890	[CD] Enable python3.13t builds for aarch64 (#144698 ) But make sure that right numpy version is picked (2.0.2 does not support 3.13) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144698 Approved by: https://github.com/atalman ghstack dependencies: #144696, #144697, #144716	2025-01-14 02:29:01 +00:00
Huy Do	b221f88fc1	Leave SCCACHE_S3_KEY_PREFIX empty to share the cache among all build jobs (#144704 ) This is a follow-up of https://github.com/pytorch/pytorch/pull/144112#pullrequestreview-2528451214. After leaving https://github.com/pytorch/pytorch/pull/144112 running for more than a week, all build jobs were fine, but I failed to see any improvement in build time. So, let's try @malfet's suggestion of removing the prefix altogether to keep it simple. After this lands, I will circle back to see if there are any improvements. Otherwise, it's still a simple BE change I guess. Here is the query I'm using to gather build time data for reference:
```
with jobs as (
  select
    id,
    name,
    DATE_DIFF('minute', created_at, completed_at) as duration,
    DATE_TRUNC('week', created_at) as bucket
  from
    workflow_job
  where
    name like '%/ build'
    and html_url like concat('%', {repo: String }, '%')
    and conclusion = 'success'
    and created_at >= (CURRENT_TIMESTAMP() - INTERVAL 6 MONTHS)
),
aggregated_jobs_in_bucket as (
  select
    --groupArray(duration) as durations,
    --quantiles(0.9)(duration),
    avg(duration),
    bucket
  from
    jobs
  group by
    bucket
)
select
  *
from
  aggregated_jobs_in_bucket
order by
  bucket desc
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144704 Approved by: https://github.com/clee2000	2025-01-14 02:19:38 +00:00
atalman	c15d6508bd	Binary builds Docker images - remove cuda 12.1 (#144575 ) Remove cuda 12.1 from manylinux, libtorch and almalinux builds Pull Request resolved: https://github.com/pytorch/pytorch/pull/144575 Approved by: https://github.com/seemethere, https://github.com/kit1980, https://github.com/malfet, https://github.com/Skylion007	2025-01-13 22:44:59 +00:00
Huy Do	5129d6ef51	Fix inductor periodic smoke test wrong artifact (#144694 ) I'm not entirely sure why this failure starts to show up in periodic since Friday https://github.com/pytorch/pytorch/actions/runs/12716967189/job/35463656803. The artifact was uploaded to S3, but `use-gha: anything-non-empty-to-use-gh` was set and it was working. Maybe this is related to https://github.com/pytorch/pytorch/issues/144479 I also clean up the GCP/AWS A100 selection logic as the GCP cluster doesn't exist anymore. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144694 Approved by: https://github.com/clee2000	2025-01-13 21:42:39 +00:00
Nikita Shulga	d44c3906b8	[EZ] [CD] Add 3.13 to FULL_PYTHON_VERSIONS (#144697 ) Separation was necessary for Conda codegen, but now it's gone Pull Request resolved: https://github.com/pytorch/pytorch/pull/144697 Approved by: https://github.com/atalman, https://github.com/izaitsevfb ghstack dependencies: #144696	2025-01-13 19:12:12 +00:00
Nikita Shulga	d2f905760d	[EZ] [CD] Eliminate stale TODO (#144696 ) As 3.13 has been enabled across the board, which one can verify by running `./github/regenerate.sh` and observing that none of the configs have changed Pull Request resolved: https://github.com/pytorch/pytorch/pull/144696 Approved by: https://github.com/izaitsevfb, https://github.com/atalman	2025-01-13 19:12:12 +00:00
Huy Do	e4b2e90e54	Fix broken YAML template after #144574 (#144604 ) The YAML syntax is wrong and GitHub complains about it https://github.com/pytorch/pytorch/blob/main/.github/ISSUE_TEMPLATE/pt2-bug-report.yml Pull Request resolved: https://github.com/pytorch/pytorch/pull/144604 Approved by: https://github.com/wdvr	2025-01-11 05:09:06 +00:00
Nikita Shulga	92ddb3d3d3	[MPS] Expose `MPSProfiler::start/stopCapture` to Python (#144561 ) I.e. when the `MTL_CAPTURE_ENABLED` environment variable is set to 1, one should be able to wrap the code with `torch.mps.profiler.metal_capture` to generate a gputrace for shaders invoked inside the context manager. For example, the code below:
```python
import torch
import os

def foo(x):
    return x[:,::2].sin() + x[:, 1::2].cos()

if __name__ == "__main__":
    os.environ["MTL_CAPTURE_ENABLED"] = "1"
    x = torch.rand(32, 1024, device="mps")
    with torch.mps.profiler.metal_capture("compiled_shader"):
        torch.compile(foo)(x)
```
should capture the execution of a `torch.compile` generated shader. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144561 Approved by: https://github.com/manuelcandales ghstack dependencies: #144559, #144560	2025-01-11 02:05:36 +00:00
Sahan Paliskara	9ec8ecea71	Update documentation.yml	2025-01-10 15:27:28 -08:00
Sahan Paliskara	1ff8a1c4eb	Update documentation.yml to request english	2025-01-10 15:26:43 -08:00


3626 commits