Commit graph

720 commits

Author SHA1 Message Date
Catherine Lee
34638c82a6 [mergebot] No unique behavior for facebook bot re pending jobs (#119735)
if fb bot says merge without -f, do normal behavior and wait for pending checks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119735
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn
2024-02-13 20:07:24 +00:00
Huy Do
5acd1f0f7d Add cherry-pick workflow (#119352)
After https://github.com/pytorch/test-infra/pull/4758, we can create a new workflow on PyTorch to receive `try-cherry-pick` dispatch event from the bot, and create the cherry pick PR.

* [x] Cherry pick a PR after it has been landed and create a cherry pick PR to the target release branch.
* [ ] The second part after this is to update the release tracker with the info.  This will be done in a subsequent PR.
* [ ] ghstack is not yet supported
* [ ] Cherry pick a reverted commit is not yet supported (from @kit1980 comment)

### Testing

The script can be used locally:

```
python cherry_pick.py --onto release/2.2 --classification release --github-actor huydhn 118907
The cherry pick PR is at https://github.com/pytorch/pytorch/pull/119351
```

The test cherry pick PR is created at https://github.com/pytorch/pytorch/pull/119351

Unit testing this on CI is tricky, so I test this out on canary instead.

* https://github.com/pytorch/pytorch-canary/pull/193#issuecomment-1933162707 creates the PR at https://github.com/pytorch/pytorch-canary/pull/201
  * One more test on canary with the new token https://github.com/pytorch/pytorch-canary/pull/193#issuecomment-1933229483.  The minimum required permission from what I see is `workflow`
* Cherry picking conflicts could happen and needs to be handled manually https://github.com/pytorch/pytorch-canary/pull/194#issuecomment-1933142975
* ~Require a linked issue when cherry picking regressions, critical fixes, or fixing new features https://github.com/pytorch/pytorch-canary/pull/193#issuecomment-1933174520~ Relax this requirement to a suggestion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119352
Approved by: https://github.com/atalman
2024-02-12 23:12:10 +00:00
Catherine Lee
ad217d4266 [ez] Add try catch for deleting old branches (#119696)
I think some chars in branch names affect the api calls, so just assume they're protected
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119696
Approved by: https://github.com/huydhn
2024-02-12 21:08:59 +00:00
Catherine Lee
059bf1baa4 Separate clang lint? (#119575)
25 min -> 17 + 13 min, which is still not as fast as I want it to be but I'll take it
Lintrunner provides some parallelism by default, but it's not perfect

Reducing fetch-depth from all to 1 further reduces time by ~2-3 minutes

From non clang's logs:
```
2024-02-09T22:05:39.5297616Z Requirement already satisfied: PyYAML==6.0 in /opt/conda/lib/python3.11/site-packages (6.0)
2024-02-09T22:12:23.6164708Z Collecting black==23.12.1
```
I don't know why this part takes so long, maybe it's just buffering?  Clang version doesn't show this issue

See 5a750c8035
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119575
Approved by: https://github.com/huydhn, https://github.com/malfet
2024-02-12 17:46:31 +00:00
Catherine Lee
3f82e435eb Fix delete branches (#119399)
Due to PR_WINDOW, if the magic string exists in the body but the pr was not updated recently, the query wouldn't find it and would delete the branch.  Instead, query separately for branches with the no-delete-branch label, which I created recently.

Might as well query for branches with open PRs while we're at it so PRs with the stale label won't get their branches deleted either
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119399
Approved by: https://github.com/huydhn
2024-02-09 17:28:00 +00:00
PyTorch MergeBot
c6f39740c7 Revert "Fix delete branches (#119399)"
This reverts commit e1fc7e1ebc.

Reverted https://github.com/pytorch/pytorch/pull/119399 on behalf of https://github.com/clee2000 due to has a bug ([comment](https://github.com/pytorch/pytorch/pull/119399#issuecomment-1936291560))
2024-02-09 17:14:23 +00:00
Catherine Lee
5d6e323549 No TD (test removal) option in CI (#118808)
It currently doesn't do anything, but I will want these env vars later.  Maybe I should start using ghstack

Intention: --enable-td actually gets rid of tests

I am open to better names
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118808
Approved by: https://github.com/huydhn, https://github.com/osalpekar
2024-02-09 16:42:27 +00:00
Catherine Lee
e1fc7e1ebc Fix delete branches (#119399)
Due to PR_WINDOW, if the magic string exists in the body but the pr was not updated recently, the query wouldn't find it and would delete the branch.  Instead, query separately for branches with the no-delete-branch label, which I created recently.

Might as well query for branches with open PRs while we're at it so PRs with the stale label won't get their branches deleted either
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119399
Approved by: https://github.com/huydhn
2024-02-09 16:40:32 +00:00
Catherine Lee
200108c6e6 Delete old branches (#117079)
Example https://github.com/pytorch/pytorch/actions/runs/7562281351/job/20592425611?pr=117079 (The code to delete branches isn't being run, it's just listing the branches it wants to delete)

Internal code: https://fburl.com/code/hdvvbfkj

Threshold for branch with PR is 30 days regardless of whether or not the PR is merged or not (compared to 3 days if merged and 30 days if closed).  Threshold for branch without PR is 1.5 years (same internally).

Threshold of ~400 queries to github so it doesn't hit token usage limits.  Currently this leads to about 350 branches deleted per run.

Only query for the last 90 days of updated PRs to reduce token usage, so if a branch has a PR but it was updated 90+ days ago, it will think it doesn't have a PR and will wait for the 1.5 years branch update check instead, regardless of whether the PR is open or closed.

I tested that it could delete my own branch and it worked.

labeled with test-config/crossref because I just want the smallest test config possible to reduce CI usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117079
Approved by: https://github.com/malfet
2024-02-05 20:50:05 +00:00
Nikita Shulga
4b59bfe8e5 [CI] Filter should not fail if pr_body is empty (#118934)
Otherwise it will fail with `TypeError: argument of type 'NoneType' is not iterable` (see https://github.com/pytorch/pytorch/actions/runs/7748725174/job/21131915226 for example)

```
% gh api /repos/pytorch/pytorch/issues/118927|
{
  "url": "https://api.github.com/repos/pytorch/pytorch/issues/118927",
  ...
  "body": null,
  ...
  "state_reason": null
}
```

TODO: Can we add a test for it?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118934
Approved by: https://github.com/clee2000, https://github.com/seemethere, https://github.com/huydhn
2024-02-02 00:49:20 +00:00
Edward Z. Yang
82b0341af3 s/verison/version/ (#118749)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118749
Approved by: https://github.com/malfet, https://github.com/albanD
2024-01-31 19:23:55 +00:00
Nikita Shulga
3011a4406f [BE][GHF] Do not hardcode default branch name (#118530)
Instead rely on `GitHubPR.default_branch()` which is the name of the repo's default branch.

Do not pass branch name `merge_changes` is called, as it is set to default branch inside the function

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118530
Approved by: https://github.com/clee2000
2024-01-29 17:18:23 +00:00
Nikita Shulga
7cc7bf9dda [GHF] Add co-authors to PR (#118347)
Mention co-authors in PR body

Modify `CommitAuthors` to include query first two commit `authors`, which makes sure that authors from suggested commits are recognized.

Test plan: CI + check `get_authors()` on a few PRs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118347
Approved by: https://github.com/kit1980
2024-01-27 01:02:49 +00:00
Ivan Zaitsev
b599f5608c Fix mergeability check for ghstack PRs (#118258)
# Changes
* introduce `--check-mergeability` trymerge flag that attempts to merge PR locally, using the same merge logic as the mergebot, but requires just a read-only `GITHUB_TOKEN` and git repo.
* change mergeability workflow to utilize the new --check-mergeability logic

# Alternatives considered

1.
> Rewrite `https://github.com/pytorch/test-infra/actions/workflows/pr-dependencies-check.yml` to correctly support partially merged ghstacks.

That would be a slightly better approach, but ROI is lower, as it requires reimplementing trymerge logic and additional effort to consolidate the codebase (trymerge lives in pytorch repo).

`pr-dependencies-check.yml` still produces human-readable results for partially merged ghstack prs (even if it falsely reports them as non-mergeable).

2.

> Instead of introducing new trymerge flag, use existing flags, including `--dry-run`.

That didn't work, as no combination of existing flags skips the rule checks and ROCKSET lookups.

# Testing

1. Manual testing  `trymerge.py --check-mergeability`  on the regular and ghstack PRs:

```
export GITHUB_TOKEN=
export GIT_REPO_DIR=`pwd`
export GITHUB_REPOSITORY=pytorch/pytorch
export GIT_REMOTE_URL=https://github.com/pytorch/pytorch

# Test 1 (2 prs, 1 is closed)
python3 ../pytorch/.github/scripts/trymerge.py --check-mergeability  117862
Skipping 1 of 2 PR (#117859) as its already been merged

echo $?
0

# Test 2 (3 prs, 1 is closed)
python3 ../pytorch/.github/scripts/trymerge.py --check-mergeability  118125
Skipping 1 of 3 PR (#117859) as its already been merged

echo $?
0

# Test 3 (3 prs, intentional conflicts introduced into `main`):

python3 ../pytorch/.github/scripts/trymerge.py --check-mergeability  118125
Skipping 1 of 3 PR (#117859) as its already been merged
stdout:
Auto-merging torch/_inductor/ir.py
Auto-merging torch/_inductor/lowering.py
CONFLICT (content): Merge conflict in torch/_inductor/lowering.py
error: could not apply 66ba5b8792f... Realize inputs to DynamicScalar before unwrapping
...
RuntimeError: Command `git -C /Users/ivanzaitsev/pytorch2 cherry-pick -x 66ba5b8792fa076c4e512d920651e5b6b7e466f4` returned non-zero exit code 1
```

2.  Workflow run:
https://github.com/pytorch/pytorch/actions/runs/7660736172/job/20878651852?pr=118258

<img width="516" alt="image" src="https://github.com/pytorch/pytorch/assets/108101595/28fbf0d2-ac2a-4518-b41d-b32b41373747">
<img width="621" alt="image" src="https://github.com/pytorch/pytorch/assets/108101595/ddbf8566-a417-43ec-9d0e-f623f4a71313">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118258
Approved by: https://github.com/PaliC, https://github.com/huydhn
2024-01-26 03:15:56 +00:00
Catherine Lee
de9ddd19a5 Various CI settings (#117668)
Test [ci-verbose-test-logs] (this worked, the test logs printing while running and interleaved and are really long)

Settings for no timeout (step timeout still applies, only gets rid of ~30 min timeout for shard of test file) and no piping logs/extra verbose test logs (good for debugging deadlocks but results in very long and possibly interleaved logs).

Also allows these to be set via pr body if the label name is in brackets ex [label name] or the test above.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117668
Approved by: https://github.com/huydhn
2024-01-26 00:17:29 +00:00
Catherine Lee
02a411d4a6 [mergebot] Dry run for labels + easier to read Dr CI result (#118240)
Dry run open for labels so we can run trymerge locally with dryrun without actually affected the PR

Make Dr.CI results easier to read (previously a massive json dump, now just the job names + ids, in a nicer format)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118240
Approved by: https://github.com/huydhn
2024-01-25 23:06:43 +00:00
Huy Do
eebe7e1d37 Migrate update-viablestrict to test-infra (#118163)
In https://github.com/pytorch/test-infra/pull/4905, so that ExecuTorch can use the same GHA on their CI.

### Testing

https://github.com/pytorch/pytorch/actions/runs/7634906738/job/20799502532#step:2:15480
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118163
Approved by: https://github.com/clee2000
2024-01-25 07:07:34 +00:00
DanilBaibak
a545ebc870 Switched macOS runners type to macos-m1-stable (#117651)
Switched macOS runners type to `macos-m1-stable`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117651
Approved by: https://github.com/huydhn
2024-01-24 11:55:13 +00:00
Bin Bao
c6930aad46 Update Triton pin (#117873)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117873
Approved by: https://github.com/shunting314, https://github.com/malfet
2024-01-23 21:05:30 +00:00
Nikita Shulga
98a044d33e [CI] Build M1 conda binaries on M1 runners (#117801)
As usual, almost no work on PyTorch side, all changes are on the builder end, namely:
- 8b67d32929 - depend on `blas * mkl` only on x86 machines
- eb78393f1e - install arm64 conda when running on Apple Silicon
- 0d3aea4ee0 - constrain llvmdev-9 to x86 machines only
- 6c6a33b271 - set correct DEVELOPER_DIR path

TODO:
 - We should auto-detect this `DEVELOPER_DIR` via `xcode-select`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117801
Approved by: https://github.com/atalman
2024-01-19 14:31:12 +00:00
Huy Do
cf470e7b59 Migrate update-commit-hash to test-infra (#117506)
After https://github.com/pytorch/test-infra/pull/4885, the GHA is now reusable on `test-infra`.  This tests the change and we can also land it after https://github.com/pytorch/test-infra/pull/4885 lands.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117506
Approved by: https://github.com/malfet, https://github.com/atalman
2024-01-17 00:15:04 +00:00
Jithun Nair
24c39bb5e5 Upgrade nightly wheels to rocm6.0 (#116983)
Follow-up to https://github.com/pytorch/builder/pull/1647

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116983
Approved by: https://github.com/jeffdaily, https://github.com/atalman
2024-01-11 20:36:00 +00:00
Nikita Shulga
0f0020d76f [GHF] Add support for new style stacks (#116873)
Where base stack targets default branch, rather than base. But as
default branch is likely to advance, since PR was made, search for
mergebase before determining whether `base`..`head` are in sync with `orig` branch
Also, rather than hardcode default branch name, fetch it from `GitHubPR.default_branch()`

Test Plan: https://github.com/malfet/deleteme/pull/77

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116873
Approved by: https://github.com/ezyang
2024-01-05 20:32:24 +00:00
Nikita Shulga
93b86bf531 [GHF] Implement stacked revert (#116447)
By adding `get_ghstack_dependent_prs` that using `git branch --contains`
finds all PRs containing stacked branch, selecting longest one (in
terms of distance between origin and default branch) and skipping all
open PRs

Please note, that reverts should be applied in a reversed order with the
one how PRs were landed originally.

Use a bit of a defensive programming, i.e. revert single PR if attempt to fetch dependencies fails for some reason.

Test plan:
 - Lint
 -  ```
    >>> from trymerge import GitRepo, GitHubPR, get_ghstack_prs, get_ghstack_dependent_prs
    >>> pr=GitHubPR("pytorch", "pytorch", 115188)
    >>> pr1=GitHubPR("pytorch", "pytorch", 115210)
    >>> repo=GitRepo("/Users/nshulga/git/pytorch/pytorch")
    >>> get_ghstack_dependent_prs(repo, pr1)
    [('22742d93a5357c9b5b45a74f91a6dc5599c9c266', <trymerge.GitHubPR object at 0x100f32f40>)]
    >>> get_ghstack_dependent_prs(repo, pr)
    [('22742d93a5357c9b5b45a74f91a6dc5599c9c266', <trymerge.GitHubPR object at 0x10102eaf0>), ('76b1d44d576c20be79295810904c589241ca1bd2', <trymerge.GitHubPR object at 0x10102eb50>)]
    >>> rc=get_ghstack_dependent_prs(repo, pr)
    rc[0]>>> rc[0][1].pr_num
    115210
    >>> rc[1][1].pr_num
    115188
    ```
 - see: https://github.com/malfet/deleteme/pull/59#issuecomment-1869904714 and https://github.com/malfet/deleteme/pull/74#issuecomment-1870542702

Fixes https://github.com/pytorch/test-infra/issues/4845
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116447
Approved by: https://github.com/huydhn
ghstack dependencies: #116446
2023-12-27 23:01:16 +00:00
Nikita Shulga
5fcc2519f5 [GHF] Refactors (#116446)
Prep change for allowing stacked reverts

This is a no-op that factors out some helper function that would be
useful later:
 - `get_pr_commit_sha` finds a committed sha for a given PR
 - `_revlist_to_prs` converts a revlist to GitHubPRs conditionally
   filtering some out
 - `do_revert_prs` reverts multiple PRs in a batch, but so far is
   invoked with only one PR
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116446
Approved by: https://github.com/huydhn, https://github.com/seemethere
2023-12-27 23:01:16 +00:00
Nikita Shulga
87da0e1d23 [GHF] Fix gh_get_labels for small repos (#116444)
Not sure if this is recent API change or what but  `gh_get_labels('malfet', 'deleteme')` used to raise an exception (see https://github.com/malfet/deleteme/actions/runs/7334535266/job/19971328673#step:6:37 )
```
  File "/home/runner/work/deleteme/deleteme/.github/scripts/label_utils.py", line 50, in get_last_page_num_from_header
    link_info[link_info.rindex(prefix) + len(prefix) : link_info.rindex(suffix)]
AttributeError: 'NoneType' object has no attribute 'rindex'
```

And with this fix it returns the expected list

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116444
Approved by: https://github.com/huydhn
2023-12-27 15:50:42 +00:00
Huy Do
d6de2df6b6 Improve the error message when a PR lacks the necessary approvals (#116161)
The error message from https://github.com/pytorch/pytorch/pull/115329#issuecomment-1857135047 is pretty confusing because it lists some random `pytorch/metamates` folks from `superuser` merge rule.  My attempt here is to make the error message clearer by pointing out:

* All the matching merge rules and
* Their list of approvers

The message will now become:

```
Approvers from one of the follow rules are needed:
- Core Reviewers (1, 2, 3, 4, 5, ...)
- Core Maintainers (1, 2)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116161
Approved by: https://github.com/malfet, https://github.com/PaliC, https://github.com/atalman, https://github.com/ZainRizvi
2023-12-22 00:22:43 +00:00
atalman
7b6210e8a4 Use matrix generate script for docker release workflows (#115949)
Enable both supported CUDA version builds for docker release. Rather then building only 1 version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115949
Approved by: https://github.com/huydhn
2023-12-18 20:20:59 +00:00
Nikita Shulga
7ed2bc7c67 [GHF] Do not block reverts with internal changes (#115903)
As check is more often than not is unreliable, so better just post a
warning and let the revert proceed.

Fixes https://github.com/pytorch/test-infra/issues/4797

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115903
Approved by: https://github.com/clee2000, https://github.com/atalman
2023-12-15 17:00:07 +00:00
Nikita Shulga
28e37d4f3b Update Trition pin (#115743)
To include a cherry-pick of https://github.com/openai/triton/pull/2771 that should fix  cuda-11.8 runtime issues

Also, tweak build wheel script to update both ROCm and vanilla Trition builds version to 2.2 (even though on trunk it should probably be 3.3 already)

TODO: Remove `ROCM_TRITION_VERSION` once both trunk and ROCM version are in sync again

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115743
Approved by: https://github.com/davidberard98
2023-12-14 18:54:24 +00:00
Aaron Gokaslan
794545c11f [BE]: Enable RUF015 codebase wide (#115507)
Constant time access of first value in collection. This is a constant time operation instead of converting the item to a list to get the first item which is linear. The rule is turned on which automatically autofixes and enforces this.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507
Approved by: https://github.com/malfet
2023-12-11 15:51:01 +00:00
Nikita Shulga
bf16fec463 Fix up triton builds (#115039)
Follow ups after https://github.com/pytorch/pytorch/pull/114772 and https://github.com/pytorch/pytorch/pull/108187

- Triton builds should be published from `main` rather than `nightly` branch, as:
   - They are independent of any PyTorch changes
   - Every nightly is pinned to a specific commit therefore publishing updated triton binaries will not affect previous nightlies
   - If this is not the case, nightly promotion will never happen as binary builds on main will continue to fail in perpetuity searching for new triton binary
- `patch_setup_py` is still needed to modify name of the package for ROCm builds

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115039
Approved by: https://github.com/seemethere, https://github.com/kit1980, https://github.com/huydhn
2023-12-03 23:14:41 +00:00
Nikita Shulga
a6294d8b9f [RelEng] Enable Py312 conda builds (#114819)
Once [sympy-1.12](https://anaconda.org/anaconda/sympy/files?version=1.12) has been added it can be build across the board

Majority of the changes are in the builder repo:
* 6b8c73fecb tweaks numpy and openssl deps
* fc773dde97 <- tweak MLK requirements for Windows
* ca378c16f8 do not depend on Triton
* 3c7404d80c <- build without GLOO_SSL

And finally, to workaround chicken-and-egg problem from [smoke_test.bat:97](b92da8cd64/windows/internal/smoke_test.bat (L97))
```cmd
call conda install -yq numpy pytorch %CONDA_EXTRA_ARGS%
```

Manually upload binaries to pytorch-nightly channel (will fix it akin to Nova in followup PRs)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114819
Approved by: https://github.com/huydhn
2023-12-03 01:30:03 +00:00
Bin Bao
8a90249bc2 [inductor] Update triton pin (#114772)
Differential Revision: [D51761353](https://our.internmc.facebook.com/intern/diff/D51761353)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114772
Approved by: https://github.com/shunting314, https://github.com/atalman
2023-12-02 19:13:56 +00:00
pbialecki
386b9c2adc build small pip wheels for CUDA 11.8 (#114620)
As discussed, we would like to start building all wheels using the CUDA PyPI dependencies.
Adding the "small wheel" workflow for CUDA 11.8 as it's already used for 12.1U1.

CC @malfet @atalman

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114620
Approved by: https://github.com/atalman, https://github.com/malfet
2023-11-30 20:50:31 +00:00
Huy Do
6f340c6f30 Handle the case when opening a reverted PR with deleted head branch (#114423)
When reopening a reverted PR, `422: Unprocessable Entity` is returned when the head branch has been deleted, for example https://github.com/pytorch/pytorch/pull/112889#issuecomment-1823216686

```
{
  "message": "Validation Failed",
  "errors": [
    {
      "resource": "PullRequest",
      "code": "custom",
      "field": "state",
      "message": "state cannot be changed. The commsplit branch has been deleted."
    }
  ],
  "documentation_url": "https://docs.github.com/rest/pulls/pulls#update-a-pull-request"
}
```

The revert still happens though, only reopening PR fails, which is ok to ignore in this case I think instead of going the complicated route of trying to restore the deleted branch by merge bot.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114423
Approved by: https://github.com/malfet, https://github.com/kit1980
2023-11-23 07:32:46 +00:00
atalman
7a697c4683 [RelEng] Tag docker images for release, pin unstable and disabled jobs, apply release only changes (#114355)
1. This tags docker images using docker pull/tag/push for current release
2. Sets RELEASE_VERSION_TAG var and regenerates the workflows using the new docker tag
3. Remove conda token setting and Binary tests release changes these are already automated
4. Pin unstable and disabled jobs, autumate: https://github.com/pytorch/pytorch/pull/111675

Test:
```
RELEASE_VERSION=2.2 ./scripts/release/apply-release-changes.sh
Tagging pytorch/manylinux-builder:cuda11.8-main to pytorch/manylinux-builder:cuda11.8-2.2 , dry_run: enabled
Tagging pytorch/manylinux-builder:cuda12.1-main to pytorch/manylinux-builder:cuda12.1-2.2 , dry_run: enabled
Tagging pytorch/libtorch-cxx11-builder:cuda11.8-main to pytorch/libtorch-cxx11-builder:cuda11.8-2.2 , dry_run: enabled
Tagging pytorch/libtorch-cxx11-builder:cuda12.1-main to pytorch/libtorch-cxx11-builder:cuda12.1-2.2 , dry_run: enabled
Tagging pytorch/manylinux-builder:rocm5.6-main to pytorch/manylinux-builder:rocm5.6-2.2 , dry_run: enabled
Tagging pytorch/manylinux-builder:rocm5.7-main to pytorch/manylinux-builder:rocm5.7-2.2 , dry_run: enabled
Tagging pytorch/libtorch-cxx11-builder:rocm5.6-main to pytorch/libtorch-cxx11-builder:rocm5.6-2.2 , dry_run: enabled
Tagging pytorch/libtorch-cxx11-builder:rocm5.7-main to pytorch/libtorch-cxx11-builder:rocm5.7-2.2 , dry_run: enabled
Tagging pytorch/manylinux-builder:cpu-main to pytorch/manylinux-builder:cpu-2.2 , dry_run: enabled
Tagging pytorch/libtorch-cxx11-builder:cpu-main to pytorch/libtorch-cxx11-builder:cpu-2.2 , dry_run: enabled
Tagging pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-main to pytorch/manylinuxcxx11-abi-builder:cpu-cxx11-abi-2.2 , dry_run: enabled
Tagging pytorch/manylinuxaarch64-builder:cpu-aarch64-main to pytorch/manylinuxaarch64-builder:cpu-aarch64-2.2 , dry_run: enabled
Tagging pytorch/conda-builder:cuda11.8-main to pytorch/conda-builder:cuda11.8-2.2 , dry_run: enabled
Tagging pytorch/conda-builder:cuda12.1-main to pytorch/conda-builder:cuda12.1-2.2 , dry_run: enabled
Tagging pytorch/conda-builder:cpu-main to pytorch/conda-builder:cpu-2.2 , dry_run: enabled
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-manywheel-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-conda-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-libtorch-cxx11-abi-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-libtorch-pre-cxx11-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-aarch64-binary-manywheel-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-manywheel-main.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-libtorch-cxx11-abi-main.yml
/data/users/atalman/pytorch/.github/workflows/generated-linux-binary-libtorch-pre-cxx11-main.yml
/data/users/atalman/pytorch/.github/workflows/generated-windows-binary-wheel-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-windows-binary-conda-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-windows-binary-libtorch-release-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-windows-binary-libtorch-debug-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-windows-binary-libtorch-release-main.yml
/data/users/atalman/pytorch/.github/workflows/generated-windows-binary-libtorch-debug-main.yml
/data/users/atalman/pytorch/.github/workflows/generated-macos-binary-wheel-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-macos-binary-conda-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-macos-binary-libtorch-cxx11-abi-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-macos-arm64-binary-libtorch-cxx11-abi-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-macos-arm64-binary-wheel-nightly.yml
/data/users/atalman/pytorch/.github/workflows/generated-macos-arm64-binary-conda-nightly.yml
````

Result of pinning unstable and disabled jobs:
```
# The link to the published list of disabled jobs
DISABLED_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/disabled-jobs.json?versionid=kKJlAXdrUbk3CilXbKu.6OwNTGQB8a.B"
# and unstable jobs
UNSTABLE_JOBS_URL = "https://ossci-metrics.s3.amazonaws.com/unstable-jobs.json?versionid=vzaicOxSsh55iXBXwgGrW6dFeVtPfrhr"
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114355
Approved by: https://github.com/malfet
2023-11-23 02:14:22 +00:00
atalman
995fae6060 Move small pypi build as default for linux cuda 12.1 (#114281)
This is first PR to resolve: https://github.com/pytorch/pytorch/issues/113972
Move our small wheel build as default
Test:
```
pip3 install --no-cache-dir --pre torch-2.2.0.dev20231121%2Bcu121-cp310-cp310-linux_x86_64.whl  --index-url https://download.pytorch.org/whl/nightly/cu121
Looking in indexes: https://download.pytorch.org/whl/nightly/cu121
Processing ./torch-2.2.0.dev20231121%2Bcu121-cp310-cp310-linux_x86_64.whl
Collecting filelock (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Collecting typing-extensions>=4.8.0 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/typing_extensions-4.8.0-py3-none-any.whl (31 kB)
Collecting sympy (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/sympy-1.11.1-py3-none-any.whl (6.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 253.4 MB/s eta 0:00:00
Collecting networkx (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/networkx-3.0rc1-py3-none-any.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 387.1 MB/s eta 0:00:00
Collecting jinja2 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/Jinja2-3.1.2-py3-none-any.whl (133 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.1/133.1 kB 365.3 MB/s eta 0:00:00
Collecting fsspec (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/fsspec-2023.4.0-py3-none-any.whl (153 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 154.0/154.0 kB 370.6 MB/s eta 0:00:00
Collecting pytorch-triton==2.1.0+6e4932cda8 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/pytorch_triton-2.1.0%2B6e4932cda8-cp310-cp310-linux_x86_64.whl (125.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 125.4/125.4 MB 384.1 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 404.9 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 402.5 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 383.9 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 406.9 MB/s eta 0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 388.2 MB/s eta 0:00:00
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 410.5 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.2.106 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 272.9 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 381.5 MB/s eta 0:00:00
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 394.6 MB/s eta 0:00:00
Collecting nvidia-nccl-cu12==2.19.3 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (166.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 166.0/166.0 MB 384.7 MB/s eta 0:00:00
Collecting nvidia-nvtx-cu12==12.1.105 (from torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 281.8 MB/s eta 0:00:00
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/cu121/nvidia_nvjitlink_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (19.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.8/19.8 MB 367.3 MB/s eta 0:00:00
Collecting MarkupSafe>=2.0 (from jinja2->torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/MarkupSafe-2.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting mpmath>=0.19 (from sympy->torch==2.2.0.dev20231121+cu121)
  Downloading https://download.pytorch.org/whl/nightly/mpmath-1.2.1-py3-none-any.whl (532 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 532.6/532.6 kB 391.3 MB/s eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, MarkupSafe, fsspec, filelock, pytorch-triton, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114281
Approved by: https://github.com/malfet, https://github.com/huydhn
2023-11-22 00:10:03 +00:00
Catherine Lee
dab272eed8 [td] Consistent pytest cache (#113804)
Move the pytest cache downloading into the build step and store it in additional ci files so that it stays consistent during sharding.

Only build env is taken into account now instead of also test config since we might not have the test config during build time, making it less specific, but I also think this might be better since tests are likely to fail across the same test config (I also think it might be worth not even looking at build env but thats a different topic)

Each cache upload should only include information from the current run.  Do not merge current cache with downloaded cache during upload (shouldn't matter anyways since the downloaded cache won't exist at the time)

From what I cant tell of the s3 retention policy, pytest cache files will be deleted after 30 days (cc @ZainRizvi to confirm), so we never have to worry about space or pulling old versions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113804
Approved by: https://github.com/ZainRizvi
2023-11-17 23:45:47 +00:00
Nikita Shulga
3fc38e6c83 [GHF] Abort merge on rebase failure (#113960)
Abort merges invoked with `-r` if there is nothing to rebase

Make `rebase_onto`/`rebase_ghstack_onto` return False if rebase is no-op and abort merge in that case

Remove `-e` option from both trymerge and tryrebase workflows as  one should never report failures on workflow dispatch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113960
Approved by: https://github.com/clee2000
2023-11-17 23:11:00 +00:00
Catherine Lee
c51827b8ce [ez] Hash update to reuse issues again (#113961)
The bot that creates the issue got changed, but the search did not, so it wasn't finding old PRs and was just making new ones.

This PR makes it reuse PRs again instead of making a new one everytime.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113961
Approved by: https://github.com/huydhn
2023-11-17 19:06:38 +00:00
albanD
25fb88cf23 Add all 3.12 binary build for wheel. Let's see how it goes. V2 (#112882)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112882
Approved by: https://github.com/malfet, https://github.com/sammcj
2023-11-16 18:20:12 +00:00
Eli Uriegas
84ee7453ad ci: Add clickable PR link to trymerge (#113712)
Adds a link to trymerge so that you can quickly click through the job to
the pull request for debugging.

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113712
Approved by: https://github.com/clee2000, https://github.com/malfet
2023-11-15 01:55:33 +00:00
Catherine Lee
6e73ae2022 [ci][ez] Add job_id to emit_metrics (#113099)
As in title.

Also print the job id in the step since I'm struggling to find it
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113099
Approved by: https://github.com/seemethere
2023-11-08 10:32:41 +00:00
Huy Do
dd957138ec Pin Docker images to main (#112692)
This will help prevent a commit like 77901321d9 pushing to release branch from overwrite the Docker images used in main.  In addition, the `DEFAULT_TAG` can be easily updated to `2.1` for example when doing branch cut release.  This basically pins the Docker images like https://github.com/pytorch/pytorch/pull/111971

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112692
Approved by: https://github.com/malfet
2023-11-02 17:39:45 +00:00
Nikita Shulga
1b86d5ef2f [Ci] Add arm64 libtorch CI config (#112474)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112474
Approved by: https://github.com/ZainRizvi, https://github.com/seemethere
ghstack dependencies: #112451, #112452
2023-11-01 19:09:34 +00:00
Nikita Shulga
54c7d0d99d [GHF] Bot should reopen PR after revert (#112614)
Fixes https://github.com/pytorch/test-infra/issues/4692
Test plan, see https://github.com/malfet/deleteme/pull/58#issuecomment-1789365259 / https://github.com/malfet/deleteme/actions/runs/6723011476
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112614
Approved by: https://github.com/seemethere, https://github.com/ezyang
ghstack dependencies: #112613
2023-11-01 18:03:32 +00:00
Nikita Shulga
4a2242e479 [BE] Use GITHUB_API_URL (#112613)
To avoid hardcoding the same string constant over and over again
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112613
Approved by: https://github.com/seemethere
2023-11-01 18:03:32 +00:00
Nikita Shulga
8d6b4322d0 [CI] Limit libtorch builds to shared-with-deps (#112452)
As that is the only variant that is being mentioned on  https://pytorch.org/get-started/locally/

And for MacOS those three flavors were just building and uploading the
same thing 3 times over, see [this](https://github.com/pytorch/pytorch/actions/runs/6689661275/job/18176516410) for example:
```
upload: ../../_temp/artifacts/libtorch-macos-2.2.0.dev20231030.zip to s3://pytorch/libtorch/nightly/cpu/libtorch-macos-2.2.0.dev20231030.zip
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112452
Approved by: https://github.com/huydhn
ghstack dependencies: #112451
2023-10-31 08:40:06 +00:00
Nikita Shulga
0ce8cf7c7a Update small wheel nccl-version to 2.19.3 (#112293)
To keep it in sync with https://github.com/pytorch/pytorch/pull/110827

Added check to `scripts/generate_binary_build_matrix.py` to validate submodule and small wheel nccl versions are the same

Step one in addressing https://github.com/pytorch/pytorch/issues/112285
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112293
Approved by: https://github.com/huydhn
2023-10-31 01:20:01 +00:00