pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Author	SHA1	Message	Date
Jack Taylor	bad69044d8	[ROCm] upgrade ROCm CI builds to py3.10 (#134108 ) Upgrade ROCm CI builds to py3.10 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134108 Approved by: https://github.com/jeffdaily, https://github.com/jithunnair-amd, https://github.com/atalman	2024-09-18 17:39:34 +00:00
PyTorch MergeBot	7fe004f7cf	Revert "Add CI for Triton CPU backend (#135342 )" This reverts commit `426580a67d`. Reverted https://github.com/pytorch/pytorch/pull/135342 on behalf of https://github.com/jeanschmidt due to Broke internal signals, see D62737208 for more details ([comment](https://github.com/pytorch/pytorch/pull/133408#issuecomment-2353623816))	2024-09-16 18:33:33 +00:00
Jon Janzen	13bd1256f9	Delete stable prototype (#135911 ) This project ended up going in an entirely different direction, so we can close out all this Pull Request resolved: https://github.com/pytorch/pytorch/pull/135911 Approved by: https://github.com/izaitsevfb, https://github.com/malfet	2024-09-16 15:32:17 +00:00
Jez Ng	426580a67d	Add CI for Triton CPU backend (#135342 ) Where possible, I have marked failing tests (which we intend to fix or triage) as `@xfail_if_triton_cpu`. This will help us track progress of the Triton CPU backend over time. Tests that I don't think we need to address, or that are flaky, have been marked as skips. Successful CI run: https://github.com/pytorch/pytorch/actions/runs/10822238062/job/30028284549 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135342 Approved by: https://github.com/jansel ghstack dependencies: #133408	2024-09-14 21:45:19 +00:00
Huy Do	db5e1b44d2	Fix inductor-micro-benchmark results upload (take 2) (#136052 ) I had a brain freeze when I wrote the original fix. The parameters were in the wrong order. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136052 Approved by: https://github.com/clee2000, https://github.com/kit1980, https://github.com/malfet	2024-09-13 22:05:10 +00:00
Nikita Shulga	a30d5ba16c	Fix bug in split-build workflows codegen (#136043 ) By just deleting a few rogue lines left out in https://github.com/pytorch/pytorch/pull/135510 If a file in the workflows folder does not have a `.yml` extension, it will not be launched at all, will it? Pull Request resolved: https://github.com/pytorch/pytorch/pull/136043 Approved by: https://github.com/kit1980, https://github.com/atalman	2024-09-13 21:29:06 +00:00
atalman	a3d827a28c	Use python 3.11 for Large Wheel build (#136042 ) Use Python 3.11 in nightly Large wheel builds. Required for Colab testing Pull Request resolved: https://github.com/pytorch/pytorch/pull/136042 Approved by: https://github.com/kit1980, https://github.com/malfet Co-authored-by: Sergii Dymchenko <kit1980@gmail.com>	2024-09-13 20:27:11 +00:00
Huy Do	a130ed828a	Fix the upload of x86 micro benchmark results (#135780 ) Upload stats workflow currently skips this https://github.com/pytorch/pytorch/actions/runs/10807251335/job/29977650639, this is a miss from https://github.com/pytorch/pytorch/pull/135042. So, the workflow is running but nothing has been uploaded yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135780 Approved by: https://github.com/atalman	2024-09-12 01:16:38 +00:00
Zain Rizvi	09519eb195	Support rolling over a percentage of workflows (#134816 ) In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py. Details of the new format are in the comments up top. On the plus side, this now includes some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816 Approved by: https://github.com/PaliC, https://github.com/zxiiro	2024-09-11 18:01:26 +00:00
Jithun Nair	82a4df2d5f	[CI] [ROCm] Run rocm workflow on every push to main branch (#135644 ) Dial the frequency back up from https://github.com/pytorch/pytorch/pull/131637 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135644 Approved by: https://github.com/huydhn	2024-09-11 17:21:05 +00:00
Catherine Lee	4ca65d3323	[CI] Increase sharding for jobs that are timing out (#135582 ) Increase sharding for * slow grad check * slow cuda tests slow / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test * avx Pull Request resolved: https://github.com/pytorch/pytorch/pull/135582 Approved by: https://github.com/huydhn, https://github.com/malfet	2024-09-10 19:45:13 +00:00
Thanh Ha	5e0788befb	Migrate remaining jobs to use runner determinator (#134867 ) At this point all self-hosted runner jobs should be using the runner determinator to switch between LF and Meta runners. This change updates the remaining jobs that have not yet been migrated over. Issue: https://lf-pytorch.atlassian.net/browse/PC-25 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134867 Approved by: https://github.com/ZainRizvi	2024-09-10 18:14:00 +00:00
atalman	9b764491e3	Use upload-artifact@v4.4.0 for create_release.yml (#135528 ) Fixes failure: https://github.com/pytorch/pytorch/actions/runs/10780281005/job/29895846007 Due to broken sync ``` actions/upload-artifact@v2 and actions/download-artifact@v4.1.7 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/135528 Approved by: https://github.com/kit1980, https://github.com/malfet	2024-09-09 20:48:52 +00:00
Sahan Paliskara	a4e6a0b240	[split build] move periodic split builds into own concurrency group (#135510 ) To avoid nightly workflows cancelling each other Pull Request resolved: https://github.com/pytorch/pytorch/pull/135510 Approved by: https://github.com/clee2000, https://github.com/huydhn, https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2024-09-09 19:35:57 +00:00
Sahan Paliskara	0c661f3e1a	[Split Build] Refactor split build binary builds into their own workflows and move split build binary builds to periodic (#134624 ) As we need to move split build binary tests from trunk to periodic, this PR refactors those jobs out into their own workflow to achieve this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134624 Approved by: https://github.com/malfet	2024-09-06 23:57:56 +00:00
atalman	b46a1b9e2d	Use Python 3.9 on all libtorch jobs (#135245 ) Part of the migration py3.8->3.9 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135245 Approved by: https://github.com/izaitsevfb	2024-09-06 02:27:22 +00:00
PyTorch MergeBot	8f66995459	Revert "Support rolling over a percentage of workflows (#134816 )" This reverts commit `fc890b55b5`. Reverted https://github.com/pytorch/pytorch/pull/134816 on behalf of https://github.com/malfet due to Causes lint to intermittently fail ([comment](https://github.com/pytorch/pytorch/pull/134816#issuecomment-2332902609))	2024-09-05 23:39:41 +00:00
Edward Z. Yang	3825607144	Add torch._logging.scribe (#135224 ) See https://github.com/pytorch/pytorch/pull/135138 for a usage example. Meta only, see https://docs.google.com/document/d/1JpbAQvRhTmuxjnKKjT7qq57dsnV84nxSLpWJo1abJuE/edit#heading=h.9wi46k7np6xw for context fbscribelogger is a library that allows us to write to scribe, which is Meta's logging infrastructure, when you have an appropriate access token (this token is available for jobs running on main, as well as authorized jobs with the ci-scribe label). The resulting data is accessible via Scuba (a real time in-memory database) and Hive (a more traditional SQL persisted database). Here's the motivating use case. Suppose there is somewhere in PyTorch's codebase where you'd like to log an event, and then you'd like to find all the situations where this log is called. If PyTorch is rolled out to our internal users, we have some FB-oriented APIs (like torch._utils_internal.signpost_event) with which you can do this. But you have to actually land your PR to main, wait for it to be ingested to fbcode, and then wait for us to actually roll out this version, before you get any data. But what if you want the results within the next few hours? Instead, you can use torch._logging.scribe to directly write to our logging infrastructure from inside CI jobs. The most convenient approach is to log unstructured JSON blobs to `open_source_signpost` (added in this PR; you can also add your own dedicated table as described in the GDoc above). After adding logging code to your code, you can push your PR to CI, add 'ci-scribe' label, and in a few hours view the results in Scuba, e.g., (Meta-only) https://fburl.com/scuba/torch_open_source_signpost/z2mq8o4l If you want continuous logging on all commits on master, you can land your PR and it will continuously get logging for all CI runs that happen on main. 
Eventually, if your dataset is important enough, you can consider collaborating with PyTorch Dev Infra to get the data collected in our public AWS cloud so that OSS users can view it without access to Meta's internal users. But this facility is really good for prototyping / one-off experiments. It's entirely self serve: just add your logging, run your PR CI with ci-scribe, get results, do analysis in Scuba. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135224 Approved by: https://github.com/Skylion007	2024-09-05 22:37:13 +00:00
Zain Rizvi	fc890b55b5	Support rolling over a percentage of workflows (#134816 ) In order to support adding a rollover percentage, this ended up being a complete rewrite of runner_determinator.py. Details of the new format are in the comments up top. On the plus side, this now includes some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134816 Approved by: https://github.com/PaliC, https://github.com/zxiiro	2024-09-05 22:21:45 +00:00
PyTorch MergeBot	f63571060c	Revert "Use actions/upload-artifact@v4.4.0 for rest of workflows (#135264 )" This reverts commit `9c0b03020b`. Reverted https://github.com/pytorch/pytorch/pull/135264 on behalf of https://github.com/atalman due to broke CI ([comment](https://github.com/pytorch/pytorch/pull/135264#issuecomment-2332674607))	2024-09-05 21:43:05 +00:00
Huy Do	24a223c49d	Run inductor micro benchmark on x86 metal runner (#135042 ) This enables inductor micro benchmark on CPU (x86): * Running on AWS metal runner for more accurate benchmark * I add a new `arch` column, which will be either x86_64 or arm64 for CPU or GPU name for GPU. We can use this later to differentiate between different setup, i.e. cuda (a100) vs cuda (a10g) or cpu (x86_64) vs cpu (arm64) The next step would be to run this one cpu arm64, and cuda (a10g). ### Testing Here is the CSV results from my test run https://github.com/pytorch/pytorch/actions/runs/10709344180 ``` name,metric,target,actual,dtype,device,arch,is_model mlp_layer_norm_gelu,flops_utilization,0.8,17.36,bfloat16,cpu,x86_64,False gather_gemv,memory_bandwidth(GB/s),990,170.80,int8,cpu,x86_64,False gather_gemv,memory_bandwidth(GB/s),1060,204.78,bfloat16,cpu,x86_64,False Mixtral-8x7B-v0.1,token_per_sec,175,26.68,int8,cpu,x86_64,True Mixtral-8x7B-v0.1,memory_bandwidth(GB/s),1130,171.91,int8,cpu,x86_64,True Mixtral-8x7B-v0.1,compilation_time(s),162,47.36,int8,cpu,x86_64,True gemv,memory_bandwidth(GB/s),870,236.36,int8,cpu,x86_64,False gemv,memory_bandwidth(GB/s),990,305.71,bfloat16,cpu,x86_64,False Llama-2-7b-chat-hf,token_per_sec,94,14.01,bfloat16,cpu,x86_64,True Llama-2-7b-chat-hf,memory_bandwidth(GB/s),1253,185.18,bfloat16,cpu,x86_64,True Llama-2-7b-chat-hf,compilation_time(s),162,74.99,bfloat16,cpu,x86_64,True Llama-2-7b-chat-hf,token_per_sec,144,25.09,int8,cpu,x86_64,True Llama-2-7b-chat-hf,memory_bandwidth(GB/s),957,165.83,int8,cpu,x86_64,True Llama-2-7b-chat-hf,compilation_time(s),172,70.69,int8,cpu,x86_64,True layer_norm,memory_bandwidth(GB/s),950,172.03,bfloat16,cpu,x86_64,False ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/135042 Approved by: https://github.com/yanboliang	2024-09-05 21:31:36 +00:00
atalman	9c0b03020b	Use actions/upload-artifact@v4.4.0 for rest of workflows (#135264 ) To be consistent with https://github.com/pytorch/pytorch/pull/135263 and rest of workflows. Use v4.4.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135264 Approved by: https://github.com/kit1980, https://github.com/malfet	2024-09-05 21:05:06 +00:00
Jack Taylor	034717a029	[ROCm] remove triton-rocm commit pin and merge pins with triton.txt (#133438 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133438 Approved by: https://github.com/jithunnair-amd, https://github.com/malfet Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>	2024-09-05 20:36:45 +00:00
atalman	8efe547046	Use actions/upload-artifact@v4.4.0 for triton builds (#135263 ) Same as: https://github.com/pytorch/pytorch/pull/135139 Fixes upload failure: https://github.com/pytorch/pytorch/actions/runs/10722567217/job/29748125015 fix regression introduced by https://github.com/pytorch/pytorch/pull/135068 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135263 Approved by: https://github.com/kit1980, https://github.com/huydhn	2024-09-05 20:03:39 +00:00
Edward Z. Yang	2e2fb668fa	Upgrade expecttest to 0.2.1 (#135136 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/135136 Approved by: https://github.com/albanD, https://github.com/atalman, https://github.com/Skylion007	2024-09-05 16:05:35 +00:00
Stonepia	9d24f945ba	[CI] Use larger instance for building triton whl (#135201 ) When running CI jobs of "Build Triton Wheels", it failed due to the lack of resources. This PR uses a larger runner to avoid these issues. The failure message is like: ``` Process completed with exit code 137. ``` Related running actions: Failed actions: https://github.com/pytorch/pytorch/actions/runs/10714445036 Success actions: https://github.com/pytorch/pytorch/actions/runs/10716710830 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135201 Approved by: https://github.com/chuanqi129, https://github.com/atalman	2024-09-05 14:36:23 +00:00
Nikita Shulga	105ac2418c	Fix binary builds artifact download (#135139 ) By upgrading upload-artifacts action to v4.4.0 As artifact store layout is different between v3 and v4 actions and artifacts uploaded by v3 can not be downloaded by v4 Should fix `Unable to download artifact(s): Artifact not found for name: libtorch-cpu-shared-with-deps-release`, which could be seen for example [here](https://github.com/pytorch/pytorch/actions/runs/10707740040/job/29690137218#step:7:29) I.e. fix regression introduced by https://github.com/pytorch/pytorch/pull/135068 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135139 Approved by: https://github.com/atalman, https://github.com/huydhn	2024-09-05 00:43:34 +00:00
chuanqiw	977a909250	[CI] Build pytorch wheel with Torch XPU Operators on Windows (#133151 ) # Description This pipeline enables the CI build on Windows with PR labeled with ciflow/xpu. This will build torch binary with Torch XPU Operators on Windows using Visual Studio BuildTools 2022. # Changes 1. Install xpu batch file (install_xpu.bat) - Check if the build machine has oneAPI in its environment, and if its version is the latest. If not, install the latest publicly released oneAPI on the machine. 2. GHA callable pipeline (_win-build.yml) - Set vc_year and use_xpu as parameters to set the build wheel environment. 3. GHA workflow (xpu.yml) - Add a new windows build job and pass parameters to it. 4. Build wheels script (.ci/pytorch/win-test-helpers/build_pytorch.bat) - Prepare environment for building, e.g. install oneAPI bundle. # Note 1. For building wheels on Intel GPU, you need Visual Studio BuildTools version >= 2022 2. This pipeline requires Visual Studio BuildTools 2022 to build wheels. For now, we specify "windows.4xlarge.nonephemeral" as build machine label in the yaml file. We will request to add self-hosted runners with Intel GPU and Visual Studio BuildTools 2022 installed soon. Work for #114850 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133151 Approved by: https://github.com/chuanqi129, https://github.com/atalman Co-authored-by: chuanqiw <chuanqi.wang@intel.com>	2024-09-05 00:02:46 +00:00
atalman	60dfe1b35e	Fix lint after Bump actions/download-artifact update (#135109 ) Fixes lint after auto-generated PR: `367a78495f` Pull Request resolved: https://github.com/pytorch/pytorch/pull/135109 Approved by: https://github.com/ezyang, https://github.com/huydhn	2024-09-04 15:26:17 +00:00
chuanqiw	67208f08bd	[CD] Enable XPU nightly build on Windows (#134312 ) Depends on https://github.com/pytorch/builder/pull/1975 land. Works for https://github.com/pytorch/pytorch/issues/114850 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134312 Approved by: https://github.com/atalman	2024-09-04 14:46:36 +00:00
Thanh Ha	dcf05fcb14	Fix stale job using non-existent ARC runner (#134863 ) The ARC CI system has been shut down so this job is currently using a runner that doesn't exist. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/134863 Approved by: https://github.com/ZainRizvi	2024-09-04 12:57:10 +00:00
dependabot[bot]	367a78495f	Bump actions/download-artifact from 2 to 4.1.7 in /.github/workflows (#135068 ) Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 2 to 4.1.7. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/v2...v4.1.7) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-03 20:33:57 -07:00
Zain Rizvi	f05b716d6d	Add validator to ensure runner determinator script is kept in sync (#134800 ) We keep two copies of the runner-determinator script: 1. In runner_determinator.py, for ease of testing. This however is not actually executed during CI 2. Embedded in _runner-determinator.yml. This is what CI uses. Why the duplication? Short version: Because of how GitHub CI works, during a given CI run the workflow yml files could actually come from the main branch, while the remaining files get read from the local commit. This can lead to a newer version of _runner-determinator.yml trying to invoke an older version of runner_determinator.py than it was actually designed for. Chaos ensues. We mitigate this by embedding the script into the yml file. But we still keep the script around because it's much easier to run tests against. This workflow's job is to ensure that if one edits the script in one of those two locations then they remember to update it in the other location as well Pull Request resolved: https://github.com/pytorch/pytorch/pull/134800 Approved by: https://github.com/zxiiro, https://github.com/PaliC ghstack dependencies: #134796	2024-09-03 23:29:04 +00:00
Zain Rizvi	469429b959	Refactor runner determinator (#134796 ) Some minor refactorings to make the code easier to parse and easier to add unit tests for. Keeping this as a separate PR for ease of review, since it should have zero functional behavior changes Pull Request resolved: https://github.com/pytorch/pytorch/pull/134796 Approved by: https://github.com/zxiiro, https://github.com/PaliC	2024-09-03 23:29:04 +00:00
PyTorch MergeBot	a1ba8e61d1	Revert "[ROCm] remove triton-rocm commit pin and merge pins with triton.txt (#133438 )" This reverts commit `5e8bf29148`. Reverted https://github.com/pytorch/pytorch/pull/133438 on behalf of https://github.com/ZainRizvi due to This still breaks linux binary builds. Added the appropriate labels to ensure tests can pass. See [GH job link](https://github.com/pytorch/pytorch/actions/runs/10626427003/job/29460479554) [HUD commit link](`5e8bf29148`) ([comment](https://github.com/pytorch/pytorch/pull/133438#issuecomment-2322246198))	2024-08-30 20:00:41 +00:00
Wouter Devriendt	db17a9898d	regenerate ci workflows for binary builds with new g4dn runners (#133404 ) Fixes #103104 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133404 Approved by: https://github.com/ZainRizvi	2024-08-30 19:53:22 +00:00
Jack Taylor	5e8bf29148	[ROCm] remove triton-rocm commit pin and merge pins with triton.txt (#133438 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133438 Approved by: https://github.com/jithunnair-amd, https://github.com/malfet Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>	2024-08-30 03:38:35 +00:00
atalman	6180574771	Move py 3.8->3.9 pull, trunk, inductor, periodic CI tests (#133624 ) Part of Deprecation of python 3.8 and moving to 3.9. Related to: https://github.com/pytorch/pytorch/issues/120718 Except XPU and ROCM jobs Pull Request resolved: https://github.com/pytorch/pytorch/pull/133624 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/ZainRizvi	2024-08-29 19:15:59 +00:00
Ivan Zaitsev	41e36e2b46	Reflect check_labels status as a signal (#134711 ) Fixes the workflow when meta-exported diff (co-dev) doesn't have the required labels, but the signal is suppressed due to job failure (e.g. [see this run](https://github.com/pytorch/pytorch/actions/runs/10590994706/job/29347663526?pr=134484)). With this change the workflow status correctly reflects the status of the check. # Testing * [illegal pr_num](https://github.com/pytorch/pytorch/actions/runs/10603163898/job/29386843591) * [successful run](https://github.com/pytorch/pytorch/actions/runs/10603279052/job/29387230110) (topic label present) * no labels: [check fails](https://github.com/pytorch/pytorch/actions/runs/10603310368/job/29387333864) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134711 Approved by: https://github.com/clee2000	2024-08-29 03:11:16 +00:00
Bin Bao	e6bf1710ff	[Inductor][Refactor] Rename CPU benchmark test configs (#134639 ) Summary: benchmarks/dynamo/ci_expected_accuracy/update_expected.py expects a benchmark run config is named as {config}_{benchmark}, and CPU tests should follow the same naming convention. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134639 Approved by: https://github.com/huydhn	2024-08-28 14:49:55 +00:00
atalman	d5aefadb17	[CD] Fix docker builds by installing setuptools (#134595 ) Seeing failures like this: ``` #49 844.6 //build_scripts/manylinux1-check.py:6: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives ..... [python 3/3] RUN bash build_scripts/build.sh && rm -r build_scripts: 846.9 ...it did, yay. 846.9 + for PYTHON in '/opt/python/*/bin/python' 846.9 + /opt/python/cpython-3.12.0/bin/python build_scripts/manylinux1-check.py 847.0 Traceback (most recent call last): 847.0 File "//build_scripts/manylinux1-check.py", line 55, in <module> 847.0 if is_manylinux1_compatible(): 847.0 ^^^^^^^^^^^^^^^^^^^^^^^^^^ 847.0 File "//build_scripts/manylinux1-check.py", line 6, in is_manylinux1_compatible 847.0 from distutils.util import get_platform 847.0 ModuleNotFoundError: No module named 'distutils' ------ ``` PR: https://github.com/pytorch/pytorch/pull/134455 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134595 Approved by: https://github.com/kit1980, https://github.com/seemethere, https://github.com/malfet	2024-08-27 19:31:44 +00:00
Zain Rizvi	f480385277	Remove explicit Amz2023 reference from jobs (#134355 ) Changes jobs to go back to using the default AMI. Note: This is only a cleanup PR. It does NOT introduce any behavior changes in CI Now that the default variant uses the Amazon 2023 AMI and has been shown to be stable for a week, it's time to remove the explicit amz2023 references and go back to using the default variant. After a week or two, when this is rolled out to most people, we can remove the variants from scale config as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134355 Approved by: https://github.com/jeanschmidt	2024-08-27 08:51:42 +00:00
atalman	78128cbdd8	[CD] Use ephemeral arm64 runners for nightly and docker builds (#134473 ) Follow up after adding linux arm64 ephemeral instances: https://github.com/pytorch/pytorch/pull/134469 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134473 Approved by: https://github.com/malfet	2024-08-26 17:47:20 +00:00
atalman	a6fac0e969	Use ephemeral runners for windows nightly builds (#134463 ) This is definition of windows.4xlarge: ``` windows.4xlarge: disk_size: 256 instance_type: c5d.4xlarge is_ephemeral: true max_available: 420 os: windows ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/134463 Approved by: https://github.com/jeanschmidt	2024-08-26 16:33:19 +00:00
Thanh Ha	bb67ff2ba7	Migrate Windows bin jobs to runner determinator (#134231 ) Update Windows binary workflows to use the runner determinator script. Closes: pytorch/ci-infra#262 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134231 Approved by: https://github.com/ZainRizvi	2024-08-26 14:56:00 +00:00
PyTorch MergeBot	4648848696	Revert "[ROCm] remove triton-rocm commit pin and merge pins with triton.txt (#133438 )" This reverts commit `f71c3d265a`. Reverted https://github.com/pytorch/pytorch/pull/133438 on behalf of https://github.com/jeanschmidt due to seems to have introduced breakages in linux binary builds ([comment](https://github.com/pytorch/pytorch/pull/133438#issuecomment-2308787310))	2024-08-25 11:20:30 +00:00
Jack Taylor	f71c3d265a	[ROCm] remove triton-rocm commit pin and merge pins with triton.txt (#133438 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133438 Approved by: https://github.com/jithunnair-amd, https://github.com/malfet	2024-08-24 18:26:49 +00:00
chuanqiw	6245d5b87b	[CI] Update XPU ci test python version to 3.9 (#134214 ) Works for https://github.com/pytorch/pytorch/issues/114850 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134214 Approved by: https://github.com/EikanWang, https://github.com/malfet	2024-08-24 18:11:36 +00:00
atalman	ff77c67d16	Use ephemeral runners for linux nightly builds (#134367 ) Should be landed with https://github.com/pytorch/test-infra/pull/5590 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134367 Approved by: https://github.com/kit1980, https://github.com/malfet, https://github.com/seemethere	2024-08-24 12:49:07 +00:00
Nikita Shulga	09a82f3d24	[EZ][BE] Delete references to non-existing `AWS_SCCACHE` secrets (#134370 ) First of all, none of the binary builds should be using sccache for security and reliability reasons (as distributed cache can become corrupted/compromised), but even if they do, all authentication to AWS services should be done via OIDC Pull Request resolved: https://github.com/pytorch/pytorch/pull/134370 Approved by: https://github.com/seemethere, https://github.com/atalman	2024-08-23 22:23:48 +00:00
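
Note on f05b716d6d (sync validator): that entry describes keeping one copy of the runner-determinator script in runner_determinator.py and a second copy embedded in _runner-determinator.yml, with a validator that checks the two stay identical. The snippet below is only a minimal sketch of what such a check could look like. It is not the validator from the PR: the function name `embedded_script_matches` is hypothetical, and it assumes the script is embedded verbatim in the workflow file with uniform extra indentation.

```python
import textwrap


def embedded_script_matches(workflow_text: str, script_text: str) -> bool:
    """Return True if script_text appears in workflow_text, ignoring a uniform indent.

    Hypothetical helper: assumes the workflow embeds the script line for line,
    shifted right by a constant number of spaces.
    """
    script_lines = [line.rstrip() for line in script_text.strip().splitlines()]
    workflow_lines = [line.rstrip() for line in workflow_text.splitlines()]
    n = len(script_lines)

    # Slide a window of the script's length over the workflow file and
    # compare after removing the window's common leading whitespace.
    for start in range(len(workflow_lines) - n + 1):
        window = "\n".join(workflow_lines[start:start + n])
        if textwrap.dedent(window).splitlines() == script_lines:
            return True
    return False


if __name__ == "__main__":
    # Illustrative inputs only; the real files live in the repository.
    script = "import sys\n\ndef main():\n    return 0\n"
    workflow = (
        "jobs:\n"
        "  run: |\n"
        "    cat <<EOF > script.py\n"
        "    import sys\n"
        "\n"
        "    def main():\n"
        "        return 0\n"
        "    EOF\n"
    )
    print(embedded_script_matches(workflow, script))
```

A check of this kind would run in CI and fail whenever one copy is edited without the other, which is the behavior the commit message describes.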


1980 commits