pytorch/.github/actions
saienduri 7eb51e5464 Ensure GPU isolation for kubernetes pod MI300 runners. (#145829)
Fixes the underlying issue that originally caused the tests to be moved to unstable (https://github.com/pytorch/pytorch/pull/145790).
We ensure GPU isolation for each pod within Kubernetes by propagating the drivers selected for the pod from the Kubernetes layer through to the docker run invocation in PyTorch.
Each runner now uses only the GPUs originally assigned to its pod, so there is no overlap between test runners.
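
The propagation idea described above can be illustrated with a minimal sketch. This is not the code from the PR: the helper name `docker_device_args` and the example device paths are assumptions made for illustration. It only shows how a list of devices granted to a pod could be turned into `--device` flags, so that the container sees those devices and no others.

```python
from typing import Iterable, List


def docker_device_args(assigned_devices: Iterable[str]) -> List[str]:
    """Build `docker run` --device flags for exactly the devices given.

    Hypothetical helper: the actual setup-rocm action may do this differently.
    """
    args: List[str] = []
    seen = set()
    for dev in assigned_devices:
        dev = dev.strip()
        # Skip blank entries and duplicates so each device is passed once.
        if not dev or dev in seen:
            continue
        seen.add(dev)
        args.append(f"--device={dev}")
    return args


if __name__ == "__main__":
    # Assumed example: device nodes the Kubernetes layer granted to this pod.
    pod_devices = ["/dev/kfd", "/dev/dri/renderD128"]
    # IMAGE is a placeholder, not a real image name.
    print(" ".join(["docker", "run", *docker_device_args(pod_devices), "IMAGE"]))
```

Passing an explicit device list, rather than exposing every device node on the host, is what keeps two pods on the same machine from running tests on the same GPU.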

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145829
Approved by: https://github.com/jeffdaily
2025-01-28 17:20:46 +00:00
build-android Don't pass credentials explicitly to sccache (#140611) 2024-11-14 04:44:55 +00:00
checkout-pytorch [BE] Get rid of malfet/checkout@silent-checkout (#143516) 2024-12-19 00:36:36 +00:00
chown-workspace
diskspace-cleanup [ROCm] Enable post-merge trunk workflow on MI300 runners; skip and fix MI300 related failed tests (#143673) 2025-01-09 05:18:57 +00:00
download-build-artifacts Update to upload-artifacts and download-artifacts to v4 (#139808) 2024-11-06 05:57:41 +00:00
download-td-artifacts Silent TD warnings when there is no td_results.json (#142083) 2024-12-04 23:43:29 +00:00
filter-test-configs
get-workflow-job-id
linux-test [Utilization Monitor] input to disable utilization monitor (#140857) 2024-11-18 23:26:03 +00:00
pytest-cache-download Upload artifacts during test run (#125799) 2024-10-22 16:48:57 +00:00
pytest-cache-upload Upload artifacts during test run (#125799) 2024-10-22 16:48:57 +00:00
setup-linux updated EC2 fetching of metadata to use IMDSv2 (#138286) 2024-10-18 20:58:47 -07:00
setup-rocm Ensure GPU isolation for kubernetes pod MI300 runners. (#145829) 2025-01-28 17:20:46 +00:00
setup-win updated EC2 fetching of metadata to use IMDSv2 (#138286) 2024-10-18 20:58:47 -07:00
setup-xpu
teardown-rocm
teardown-win
teardown-xpu
test-pytorch-binary Remove builder repo from workflows and scripts (#143776) 2024-12-24 14:11:51 +00:00
upload-sccache-stats Upload sccache stats into benchmark database with build step time (#140839) 2024-11-21 22:38:45 +00:00
upload-test-artifacts Update to upload-artifacts and download-artifacts to v4 (#139808) 2024-11-06 05:57:41 +00:00