Commit graph

5216 commits

Author SHA1 Message Date
Nikita Shulga
0e83e7d56e [EZ] Add logic to build Metal shader with debug info (#146768)
By appending `-frecord-sources -gline-tables-only` to the compilation command

Helpful when debugging shaders compiled into libtorch

Test plan: Run
`python ../tools/build_with_debinfo.py ../aten/src/ATen/native/mps/kernels/UpSample.metal ../aten/src/ATen/native/mps/operations/UpSample.mm`
Then run the following to capture the shader and check that it contains debug info:
```python
import torch
import os
os.environ["MTL_CAPTURE_ENABLED"]="1"
inp = torch.rand(size=(6, 3, 10, 20), device="mps", dtype=torch.float32)
with torch.mps.profiler.metal_capture("bilinear2d"):
    out = torch.nn.functional.interpolate(inp, scale_factor=(1.7,0.9), mode="bilinear")
```
<img width="769" alt="image" src="https://github.com/user-attachments/assets/e0316c1c-07a4-4da5-97b9-886c56857c1d" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146768
Approved by: https://github.com/dcci
2025-02-08 23:40:23 +00:00
Aaron Gokaslan
292af3cc89 [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408)
Apply the ruff rule about implicit string concatenation (ISC001). It autofixes strings that are all the same type and on the same line. These strings were likely split up by autoformatters in the past. All fixes are automated using the ISC001 autofixes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146408
Approved by: https://github.com/justinchuby, https://github.com/janeyx99
2025-02-04 19:07:04 +00:00
Yang Wang
fd73ae2068 [Utilization] Convert timestamp to str for datetime64 (#145985)
Convert all float timestamps to int timestamps in the data pipeline for the db type datetime64.
Float values do not work when inserting into ClickHouse using jsonExtract.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145985
Approved by: https://github.com/huydhn
2025-02-03 21:05:18 +00:00
Scott Wolchok
3fae5c8509 torchgen: support exception boundary for ExecuTorch functions (#144341)
Needed for ExecuTorch diff D67904052.

Differential Revision: [D67906411](https://our.internmc.facebook.com/intern/diff/D67906411/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144341
Approved by: https://github.com/Jack-Khuu
2025-01-31 01:05:21 +00:00
cyy
d94d816d96 Simplify handling of max jobs in CMake builds (#145820)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145820
Approved by: https://github.com/malfet
2025-01-31 00:55:39 +00:00
Yang Wang
a9ed7bd78e [utilization] pipeline to create clean db records (#145327)
upload_utilization_script to generate db-ready-insert records to s3
- generate two files: metadata and timeseries in ossci-utilization buckets
- convert log record to db format ones
- add unit test job for tools/stats/

Related PRs:
setup composite action for data pipeline: https://github.com/pytorch/pytorch/pull/145310
add permission for composite action to access S3 bucket: https://github.com/pytorch-labs/pytorch-gha-infra/pull/595
add insert logic in s3 replicator: https://github.com/pytorch/test-infra/pull/6217
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145327
Approved by: https://github.com/huydhn

Co-authored-by: Huy Do <huydhn@gmail.com>
2025-01-29 23:48:50 +00:00
Catherine Lee
953e80936e [linter] Grep linter batches long command (#145950)
If the command is too long, the linter fails with
```
Failed due to OSError:
[Errno 7] Argument list too long: 'grep'
```
Fix this by batching the command so it is shorter

The limit of 750k was chosen because `getconf ARG_MAX` returns ~1M on my Mac. My guess is that most people won't hit this unless they run --all-files and the directory length is long.
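
The batching idea can be sketched as follows. This is a minimal illustration, not the linter's actual code: the function name `batch_by_length` and the way argument length is counted (characters plus one separator per argument) are assumptions; only the 750k limit comes from the description above.

```python
from typing import Iterator, List


def batch_by_length(
    prefix: List[str], files: List[str], max_chars: int = 750_000
) -> Iterator[List[str]]:
    """Yield commands (prefix + some files) whose total length stays under max_chars.

    Hypothetical helper: length is approximated as len(arg) + 1 per argument.
    """
    base = sum(len(arg) + 1 for arg in prefix)
    batch: List[str] = []
    size = base
    for name in files:
        cost = len(name) + 1
        # Start a new batch when adding this file would exceed the limit.
        if batch and size + cost > max_chars:
            yield prefix + batch
            batch, size = [], base
        batch.append(name)
        size += cost
    if batch:
        yield prefix + batch
```

Each yielded list could then be passed to `subprocess.run` separately and the results concatenated.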

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145950
Approved by: https://github.com/wdvr
2025-01-29 21:23:27 +00:00
Zain Rizvi
a6e3f294f1 Don't use mypy daemon in CI (#145961)
This is an attempt to fix flaky mypy errors in CI that look like:

```
dmypy status --verbose
connection_name         : /var/folders/rf/qrn1jkgj0b9_tcznwp8ck46w0000gn/T/tmpjoqsid7_/dmypy.sock
pid                     :      32233
error                   :  timed out
Daemon is stuck; consider /Users/zainr/pytorch/venv/bin/dmypy kill
```

"Fix" it by not using the daemon at all, since it doesn't actually provide any perf benefits in CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145961
Approved by: https://github.com/malfet
2025-01-29 21:15:29 +00:00
rzou
ea141d8134 functional compiled autograd (#144707)
This PR squashes together the following commits:

https://github.com/pytorch/pytorch/pull/144115
https://github.com/pytorch/pytorch/pull/143417
https://github.com/pytorch/pytorch/pull/143405
https://github.com/pytorch/pytorch/pull/143387
https://github.com/pytorch/pytorch/pull/143304
https://github.com/pytorch/pytorch/pull/143296

This is a refactor of compiled autograd to use "functional autograd". The end goal is that it gets compiled autograd's initial capture to stop specializing on Tensor metadata, therefore allowing compiled autograd to better handle Tensor subclasses.

For more information, please read the commit messages for each PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144707
Approved by: https://github.com/bdhirsh, https://github.com/xmfan, https://github.com/jansel
2025-01-27 05:20:56 +00:00
Isalia20
b75afa2e2e [MPS] cholesky implementation (#145701)
Requested in #77764

Closed #144193  due to a lot of conflicts when rebasing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145701
Approved by: https://github.com/malfet
2025-01-27 01:53:03 +00:00
Aaron Gokaslan
f3304571fc [BE][Ez]: FURB148 - remove useless enumerate calls (#145619)
Remove useless enumerate calls

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145619
Approved by: https://github.com/drisspg
2025-01-24 23:37:15 +00:00
Aaron Orenstein
1335882b2a If mypy fails it should report the error back to lintrunner (#145550)
This happened to me because I had a bad LD_LIBRARY_PATH and mypy was failing to run (.so load error) - but lintrunner was silent about the underlying problem.
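
A rough sketch of the behavior described, under stated assumptions: the helper name `run_tool`, the dict-shaped message, and the rule "non-zero exit with empty stdout means the tool itself failed" are all illustrative and not taken from the actual adapter.

```python
import subprocess
from typing import Dict, List


def run_tool(cmd: List[str]) -> List[Dict[str, str]]:
    """Run a lint tool and return error messages if the tool itself failed to run."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as err:
        # The binary could not be started at all.
        return [{"severity": "error", "description": f"could not start {cmd[0]}: {err}"}]
    # Assumption: a non-zero exit with no stdout means a crash, not lint findings.
    if proc.returncode != 0 and not proc.stdout.strip():
        detail = proc.stderr.strip()
        return [
            {
                "severity": "error",
                "description": f"{cmd[0]} exited with {proc.returncode}:\n{detail}",
            }
        ]
    return []
```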

Differential Revision: [D68593081](https://our.internmc.facebook.com/intern/diff/D68593081)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145550
Approved by: https://github.com/bobrenjc93, https://github.com/Skylion007
2025-01-24 15:40:30 +00:00
PyTorch MergeBot
6dd8283381 Revert "[compiled autograd] Proxy opaque nodes for built-in autograd nodes (#143296)"
This reverts commit 5531fafffe.

Reverted https://github.com/pytorch/pytorch/pull/143296 on behalf of https://github.com/izaitsevfb due to breaking internal tests T213390054 ([comment](https://github.com/pytorch/pytorch/pull/143296#issuecomment-2611224926))
2025-01-23 23:34:13 +00:00
Yang Wang
6d4f5f7688 [Utilization][Usage Log] Add data model for record (#145114)
Add a data model for consistency and to make future data model changes easier.

The data model will be used during the post-test-process pipeline
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145114
Approved by: https://github.com/huydhn
2025-01-23 19:04:41 +00:00
Andy Lugo
faa10faa2c [ROCm] CK SDPA - Move arch check to CK patch (#144777)
`__gfxXXX__` should only be visible to device code. Move the check to the CK kernel.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144777
Approved by: https://github.com/jeffdaily, https://github.com/xw285cornell, https://github.com/jianyuh
2025-01-23 04:12:25 +00:00
PyTorch MergeBot
dddf52b1b9 Revert "Enable grep_linter to use -a (#144589)"
This reverts commit 3c55669b88.

Reverted https://github.com/pytorch/pytorch/pull/144589 on behalf of https://github.com/clee2000 due to the line parameter is kind of important and -a is not as important as I thought it was so I'm going to revert this ([comment](https://github.com/pytorch/pytorch/pull/144589#issuecomment-2608349155))
2025-01-22 21:55:27 +00:00
rzou
5531fafffe [compiled autograd] Proxy opaque nodes for built-in autograd nodes (#143296)
This PR is on the way to getting compiled autograd's initial capture to
stop specializing on Tensor metadata.

This PR changes compiled autograd's initial capture to proxy an opaque
(w.r.t. Dynamo) function into the graph for all built-in codegen'ed
autograd nodes and validate_outputs.

We changed each codegen'ed apply_with_saved (e.g.
MulBackward0::apply_with_saved) to call into Python to proxy a function
(compiled_autograd.ops.MulBackward0) into the graph. Then, we use the
node's InputMetadata to "guess" at the properties of the output Tensors
to create some new FakeTensors.

Some details:
- MulBackward0::apply_with_saved lives in libtorch_cpu, but needs to
  call into Python via libtorch_python. There is an indirection
  (PyCompilerInterface) to do this.
- MulBackward0::apply_with_saved passes a C++ function to Python. To make
  our lives easier, every codegen'ed apply_with_saved passes a C++
  function with the same signature
  `(variable_list, ivalue_list) -> variable_list`.
- We define how to pack arbitrary C++ types into IValue via a helper
  IValuePacker struct and codegen functional variants of each builtin
  C++ autograd node (e.g. MulBackward0_apply_functional_ivalue).

MulBackward0 before this PR:
https://gist.github.com/zou3519/a80381d5fa38e970e413fcd91b0530de

MulBackward0 after this PR:
https://gist.github.com/zou3519/0c2eee8b3d8d96232b51ef430b53c5b0

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143296
Approved by: https://github.com/jansel
2025-01-22 21:50:29 +00:00
Aaron Orenstein
07669ed960 PEP585 update - benchmarks tools torchgen (#145101)
This is one of a series of PRs to update us to PEP585 (changing Dict -> dict, List -> list, etc).  Most of the PRs were completely automated with RUFF as follows:

Since RUFF UP006 is considered an "unsafe" fix first we need to enable unsafe fixes:

```
--- a/tools/linter/adapters/ruff_linter.py
+++ b/tools/linter/adapters/ruff_linter.py
@@ -313,6 +313,7 @@
                     "ruff",
                     "check",
                     "--fix-only",
+                    "--unsafe-fixes",
                     "--exit-zero",
                     *([f"--config={config}"] if config else []),
                     "--stdin-filename",
```

Then we need to tell RUFF to allow UP006 (as a final PR once all of these have landed this will be made permanent):

```
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -40,7 +40,7 @@

 [tool.ruff]
-target-version = "py38"
+target-version = "py39"
 line-length = 88
 src = ["caffe2", "torch", "torchgen", "functorch", "test"]

@@ -87,7 +87,6 @@
     "SIM116", # Disable Use a dictionary instead of consecutive `if` statements
     "SIM117",
     "SIM118",
-    "UP006", # keep-runtime-typing
     "UP007", # keep-runtime-typing
 ]
 select = [
```

Finally running `lintrunner -a --take RUFF` will fix up the deprecated uses.
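
For illustration, the kind of rewrite UP006 performs looks roughly like this. The function names are hypothetical, and the builtin generics require Python 3.9 or later, which matches the `target-version` bump in the diff above.

```python
from typing import Dict, List


# Before: typing aliases (pre-PEP 585 style).
def count_words_old(words: List[str]) -> Dict[str, int]:
    counts: Dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# After: builtin generics, as PEP 585 allows on Python 3.9+.
def count_words_new(words: list[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

Only the annotations change; runtime behavior is identical.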

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145101
Approved by: https://github.com/bobrenjc93
2025-01-18 05:05:07 +00:00
Tom Ritchford
46fbd63405 Fix unbind_copy and add its decomposition (#134319)
* Fixes https://github.com/pytorch/pytorch/issues/130829

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134319
Approved by: https://github.com/amjames, https://github.com/eellison
2025-01-17 18:21:22 +00:00
PyTorch MergeBot
6c713ccb5e Revert "Make functionalization ViewMeta serializable with pickle. (#143712)"
This reverts commit b8abdaa286.

Reverted https://github.com/pytorch/pytorch/pull/143712 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/143712#issuecomment-2597205261))
2025-01-17 00:52:50 +00:00
Yang Wang
fea9d18d5a [Utilization Log] Concurrently collect aggregate data during the output interval (#143235)
# Overview
Add a worker to collect metrics at short intervals:
1. Worker: add a worker that collects usage metrics every 500ms by default; the interval is configurable.
2. Calculate the avg and max as a data point every 5 seconds by default.

# Other
Clean up the log format to keep only what is needed. Currently we do not need to track GPU processes etc., or all pids from psutil.
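
The sample-then-aggregate pattern described above might look like the sketch below. The class name `UsageCollector`, the `read_metric` callback, and the returned dict shape are hypothetical; only the 500ms sampling default and the avg/max aggregation come from the description.

```python
import threading
from statistics import mean
from typing import Callable, Dict, List, Optional


class UsageCollector:
    """Sample a metric on a background thread; aggregate avg and max on demand."""

    def __init__(self, read_metric: Callable[[], float], sample_interval: float = 0.5):
        self._read = read_metric
        self._interval = sample_interval
        self._samples: List[float] = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            value = self._read()
            with self._lock:
                self._samples.append(value)
            # wait() doubles as an interruptible sleep between samples.
            self._stop.wait(self._interval)

    def flush(self) -> Optional[Dict[str, float]]:
        """Return avg/max of samples since the last flush, or None if there are none."""
        with self._lock:
            samples, self._samples = self._samples, []
        if not samples:
            return None
        return {"avg": mean(samples), "max": max(samples)}
```

A caller would invoke `flush()` once per output interval (every 5 seconds by default in the description) and log the result as one data point.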
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143235
Approved by: https://github.com/huydhn
2025-01-16 23:52:43 +00:00
PyTorch MergeBot
46b92c025d Revert "Cholesky mps implementation (#144193)"
This reverts commit 727ae13318.

Reverted https://github.com/pytorch/pytorch/pull/144193 on behalf of https://github.com/malfet due to Alas, inductor changes broke inductor tests, see aa4a1ff027/1 ([comment](https://github.com/pytorch/pytorch/pull/144193#issuecomment-2596938163))
2025-01-16 21:37:32 +00:00
Yukio Siraichi
b8abdaa286 Make functionalization ViewMeta serializable with pickle. (#143712)
Fix: #141974

This PR makes `ViewMeta` sequence, present in functional tensors,
serializable with pickle. In order to accomplish that, it makes
`ViewMeta` an abstract class with overridable `forward` and `reverse`
functions. In this context, each operation that once instantiated
`ViewMeta` should now create a new specialized class that inherits from
`ViewMeta`. Therefore, this PR also uses codegen for creating these
specializations.

In summary, these are the changes this PR introduces:

- `ViewMeta` is turned into an abstract class (see
  _FunctionalStorageImpl.cpp_). `forward` and `reverse` are pure virtual
  functions that need to be implemented. `to_out_index` should be
  implemented by operations that might return more than 1 output.

- New `ViewMeta` specializations for `resize_` and `_unsafe_view` are
  created (see _FunctionalizeFallbackKernel.h_).

- New templates _ViewMetaClasses.{cpp,h}_ are created. They hold the
  declaration and definition of the `ViewMeta` specializations, which
  are automatically generated in the ATen codegen (see _gen.py_).

- New `_functionalization` Python sub-module is created (see
  _Module.cpp_). It serves as namespace for the `ViewMeta`
  specializations and `InverseReturnMode` enum.

- New template _ViewMetaClassesPythonBinding.cpp_ is created. It holds
  the automatically generated Python bindings for the `ViewMeta`
  specialization, which are generated in the torch codegen (see
  _generate_code.py_).

Note that this PR makes use of codegen at 2 different moments:

- ATen codegen (_gen.py_): generates the `ViewMeta` specialized classes.
- Torch codegen (_generate_code.py_): generates the Python bindings for
  them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143712
Approved by: https://github.com/bdhirsh
2025-01-16 19:41:41 +00:00
Isalia20
727ae13318 Cholesky mps implementation (#144193)
Requested in #77764

PR is still in draft because it needs some cleanups and optimizations to at least reach CPU performance. Tasks:
- [x] Make `upper=True` work, only `upper=False` works now
- [x] Code cleanup
- [x] Optimizations(Though might need some help on this)(tried my best, maybe there is still some more to squeeze out)
- [x] Checks for positive definite input
- [x] Support for (*, N, N) input, currently only supports (B, N, N) input
- [x] Support other dtypes(float16, bfloat16)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144193
Approved by: https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-01-16 16:26:46 +00:00
fduwjj
e3c4d1b7d6 [c10d][fr] Fix the bug when we still mark mismatch when there are match case (#144916)
When we introduced partial match, we accidentally started marking the full match case as a mismatch. This is wrong, and this PR fixes it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144916
Approved by: https://github.com/c-p-i-o
2025-01-16 04:36:30 +00:00
Catherine Lee
3c55669b88 Enable grep_linter to use -a (#144589)
Lintrunner can only apply changes (-a) if only one suggestion is made per file.  The grep_linter makes a suggestion for every line it finds incorrect, so it creates multiple suggestions per file if there are multiple lines that it wants to change

This sets the `line` parameter of the LintMessage to None for all of grep_linter, but I'm not sure if that entry did anything

I'm not sure if enabling -a is the best idea, since it's currently used for tabs and tab width might differ each time. I had one instance where running with -a caused the spacing to change. On the other hand, -a would have already worked if only one line was bad.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144589
Approved by: https://github.com/huydhn
2025-01-13 21:18:24 +00:00
PyTorch MergeBot
99f2491af9 Revert "Use absolute path path.resolve() -> path.absolute() (#129409)"
This reverts commit 45411d1fc9.

Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/jeanschmidt due to Breaking internal CI, @albanD please help get this PR merged ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2571316444))
2025-01-04 14:17:20 +00:00
Xiaodong Wang
0a94bb432e [ROCm] CK Flash Attention Backend (#143695)
Replace https://github.com/pytorch/pytorch/pull/138947 for re-import.

Replaces https://github.com/ROCm/pytorch/pull/1592

This PR contains the initial implementation of SDPA with the composable_kernel (CK) backend. The CK path can be forced by calling torch.backends.cuda.preferred_rocm_fa_library("ck"). Similarly, you can force the incumbent aotriton implementation by passing in "aotriton" or "default". Not setting this option results in aotriton being used as the backend. In the case of CK, if PyTorch deems flash attention usable, it will use the CK path in all the same places aotriton would have been used. This PR makes no changes to the heuristics that select which attention scheme to use (i.e. flash attention vs. memory-efficient attention vs. math, etc.). The CK path only gets called when flash attention is both enabled (via USE_FLASH_ATTENTION) and selected at runtime by the existing heuristics.

Files located in pytorch/aten/src/ATen/native/transformers/hip/flash_attn/ck/mha* have been pulled from https://github.com/Dao-AILab/flash-attention, courtesy of the hard work of @tridao, who is the co-author.

NOTE: In order to use this backend, the user MUST set USE_CK_FLASH_ATTENTION=1 in their environment when they build PyTorch.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143695
Approved by: https://github.com/malfet

Co-authored-by: Andy Lugo <Andy.LugoReyes@amd.com>
Co-authored-by: Jithun Nair <jithun.nair@amd.com>
2025-01-03 22:01:36 +00:00
Xuehai Pan
45411d1fc9 Use absolute path path.resolve() -> path.absolute() (#129409)
Changes:

1. Always call `.absolute()` explicitly: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.
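
The practical difference between the two calls shows up with symlinks: `Path.absolute()` only anchors the path to the current directory, while `Path.resolve()` also follows symlinks. The sketch below demonstrates this in a temporary directory. The function name is hypothetical, it assumes a platform where unprivileged symlink creation works, and the commit message does not state why `.absolute()` is preferred for locating the repo root.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def absolute_vs_resolve() -> Tuple[str, str]:
    """Return the final path component seen by absolute() and by resolve()."""
    with tempfile.TemporaryDirectory() as tmp:
        real = Path(tmp) / "real"
        real.mkdir()
        link = Path(tmp) / "link"
        os.symlink(real, link, target_is_directory=True)
        # absolute() keeps the symlink name; resolve() follows it to the target.
        return link.absolute().name, link.resolve().name
```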

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2025-01-03 20:03:40 +00:00
Catherine Lee
bb5e439f2d Add networkx as bazel dep to fix CI failure (#143995)
Add networkx as a dependency for test_bazel

Example failure: https://github.com/pytorch/pytorch/actions/runs/12551752021/job/34996706301

```

INFO: From Testing //:test_bazel:
==================== Test output for //:test_bazel:
Traceback (most recent call last):
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/test/_test_bazel.py", line 33, in <module>
    test_simple_compile_eager()
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/test/_test_bazel.py", line 27, in test_simple_compile_eager
    opt_foo1 = torch.compile(foo, backend="eager")
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/__init__.py", line 2533, in compile
    backend = _TorchCompileWrapper(backend, mode, options, dynamic)
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/__init__.py", line 2342, in __init__
    self.compiler_fn = lookup_backend(backend)
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/backends/registry.py", line 66, in lookup_backend
    _lazy_import()
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/backends/registry.py", line 102, in _lazy_import
    import_submodule(backends)
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/utils.py", line 2797, in import_submodule
    importlib.import_module(f"{mod.__name__}.{filename[:-3]}")
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/execroot/pytorch/external/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_dynamo/backends/common.py", line 12, in <module>
    from torch._functorch.aot_autograd import (
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_functorch/aot_autograd.py", line 147, in <module>
    from .partitioners import default_partition
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_functorch/partitioners.py", line 31, in <module>
    from ._activation_checkpointing.graph_info_provider import GraphInfoProvider
  File "/var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/sandbox/processwrapper-sandbox/6504/execroot/pytorch/bazel-out/k8-fastbuild/bin/test_bazel.runfiles/pytorch/torch/_functorch/_activation_checkpointing/graph_info_provider.py", line 3, in <module>
    import networkx as nx
ModuleNotFoundError: No module named 'networkx'
```

No periodic runs on this PR or its main branch commit, but I'm pretty sure it started on https://togithub.com/pytorch/pytorch/pull/143539

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143995
Approved by: https://github.com/huydhn
2025-01-02 19:42:18 +00:00
Benjamin Glass
d88a8c41d5 Fix flaky "Upload test stats" job (#143991)
Test stat uploading was intermittently failing due to certain XML strings being opportunistically converted to numbers, when string output was expected. This PR makes the conversion behavior optional, which should fix the stat uploads.
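
To make the failure mode concrete, here is a small sketch of optional numeric conversion when reading XML attributes. It is not the uploader's code: `parse_attrs`, `_maybe_number`, and the `convert_numbers` flag are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union

Value = Union[str, int, float]


def _maybe_number(text: str) -> Value:
    """Convert text to int or float when possible, else return it unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def parse_attrs(xml_text: str, convert_numbers: bool = False) -> Dict[str, Value]:
    """Read the root element's attributes, converting numbers only when asked."""
    elem = ET.fromstring(xml_text)
    return {
        key: _maybe_number(value) if convert_numbers else value
        for key, value in elem.attrib.items()
    }
```

With conversion off, a test named `"123"` stays a string, which is what a string-typed database column expects.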
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143991
Approved by: https://github.com/clee2000, https://github.com/huydhn
2024-12-30 21:40:01 +00:00
Xuehai Pan
b6bdb67f82 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
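
The equivalences behind steps 2 and 4 can be checked with a short sketch. The example path is made up, and `PurePosixPath` is used so the result does not depend on the host platform.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical file location inside a repository checkout.
script = "/repo/tools/linter/adapters/example_linter.py"
path = PurePosixPath(script)

# Step 2: nested dirname calls correspond to chained .parent.
two_up_dirname = posixpath.dirname(posixpath.dirname(script))
two_up_pathlib = str(path.parent.parent)

# Step 4: N chained .parent accesses correspond to .parents[N - 1].
four_up_chained = path.parent.parent.parent.parent
four_up_indexed = path.parents[3]
```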

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-12-29 17:23:13 +00:00
Xuehai Pan
d2f769476f [Easy] add quotes to shell activation commands (#143902)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143902
Approved by: https://github.com/Skylion007, https://github.com/malfet
2024-12-27 19:17:46 +00:00
Xuehai Pan
c4bff71854 [Easy] Add ROCm support to nightly pull tool (#141282)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141282
Approved by: https://github.com/malfet
ghstack dependencies: #143263
2024-12-27 00:07:38 +00:00
Xuehai Pan
51a7ecde80 [Easy] Bump CUDA nightly version to 11.8 / 12.4 / 12.6 in nightly pull tool (#143263)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143263
Approved by: https://github.com/malfet
2024-12-26 19:01:38 +00:00
PyTorch MergeBot
475656fd9c Revert "[BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)"
This reverts commit 2293fe1024.

Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/malfet due to failing internal ROCM builds with error: ModuleNotFoundError: No module named hipify ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2562973920))
2024-12-26 17:32:23 +00:00
PyTorch MergeBot
cc4e70b7c3 Revert "Use absolute path path.resolve() -> path.absolute() (#129409)"
This reverts commit 135c7db99d.

Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/malfet due to need to revert to as dependency of https://github.com/pytorch/pytorch/pull/129374 ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2562969825))
2024-12-26 17:26:06 +00:00
Xuehai Pan
b77406a9ec [BE][CI] bump ruff to 0.8.4 (#143753)
Changes:

1. Bump `ruff` from 0.7.4 to 0.8.4
2. Change `%`-formatted strings to f-string
3. Change arguments with the `__`-prefix to positional-only arguments with the `/` separator in function signature.
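
Changes 2 and 3 correspond to rewrites of roughly the following shape. The function names are made up for illustration. The `/` separator requires Python 3.8 or later and, unlike the `__` prefix convention, is enforced at runtime.

```python
# Before: `__`-prefixed parameters (a type-checker convention for
# positional-only arguments) and %-formatting.
def scale_old(__value, __factor):
    return "%s scaled by %s" % (__value, __factor)


# After: the `/` separator makes the parameters positional-only, and the
# string uses an f-string.
def scale_new(value, factor, /):
    return f"{value} scaled by {factor}"
```

Calling `scale_new(value=2, factor=3)` raises `TypeError`, whereas the old spelling was only a convention.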

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143753
Approved by: https://github.com/Skylion007
2024-12-24 12:24:10 +00:00
Xuehai Pan
135c7db99d Use absolute path path.resolve() -> path.absolute() (#129409)
Changes:

1. Always call `.absolute()` explicitly: `Path(__file__)` -> `Path(__file__).absolute()`
2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409
Approved by: https://github.com/albanD
2024-12-24 08:33:08 +00:00
Jason Ansel
eebc93d41e Better fix for f-strings in set_linter for py3.12 (#143725)
#143628 didn't handle a few cases correctly, for example:
```py
$ python3 tools/linter/adapters/set_linter.py torch/_inductor/scheduler.py
torch/_inductor/scheduler.py:261:24: Builtin `set` is deprecated
  259 |                 multiline=False,
  260 |             )
  261 |         return f"{self}{data_str}"
                               ^
  262 |
  263 |     def log_details(self) -> None:

torch/_inductor/scheduler.py:261:33: Builtin `set` is deprecated
  259 |                 multiline=False,
  260 |             )
  261 |         return f"{self}{data_str}"
                                        ^
  262 |
  263 |     def log_details(self) -> None:
```
Multi-line f-strings were also affected.
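
A likely cause of these false positives is that, starting with Python 3.12, `tokenize` splits f-strings into separate tokens, so the `{` of a replacement field appears as an ordinary `OP` token. The sketch below shows one way to ignore such braces. It is an illustration and not the actual `set_linter` fix: `brace_positions` is a hypothetical helper, and it would also skip a genuine set literal nested inside an f-string field.

```python
import io
import tokenize
from typing import List, Tuple


def brace_positions(source: str, skip_fstrings: bool = True) -> List[Tuple[int, int]]:
    """Return (line, column) of each `{` OP token, optionally ignoring f-strings."""
    # FSTRING_START/FSTRING_END only exist on Python 3.12+.
    fstring_start = getattr(tokenize, "FSTRING_START", None)
    fstring_end = getattr(tokenize, "FSTRING_END", None)
    depth = 0  # nesting level of f-strings we are currently inside
    found: List[Tuple[int, int]] = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if fstring_start is not None and tok.type == fstring_start:
            depth += 1
        elif fstring_end is not None and tok.type == fstring_end:
            depth -= 1
        elif tok.type == tokenize.OP and tok.string == "{":
            if depth and skip_fstrings:
                continue
            found.append(tok.start)
    return found
```

On older interpreters the whole f-string is a single `STRING` token, so the function returns the same result either way.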
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143725
Approved by: https://github.com/yanboliang
2024-12-22 22:51:27 +00:00
Tom Ritchford
f1cbf4b1b5 Enable ruff's unused variable checking everywhere in pytorch (#136965)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136965
Approved by: https://github.com/cyyever, https://github.com/albanD
2024-12-22 02:33:11 +00:00
Xuehai Pan
2293fe1024 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-12-21 22:08:01 +00:00
Jason Ansel
04b26ee1e8 Fix false positive from f-strings in set_linter (#143628)
This linter was going crazy in Python 3.12, for example:
```py
$ python3 tools/linter/adapters/set_linter.py torch/_inductor/runtime/triton_heuristics.py
torch/_inductor/runtime/triton_heuristics.py:192:25: Builtin `set` is deprecated
  190 |     args_str += ", ".join(call_args)
  191 |     for k, v in call_kwargs.items():
  192 |         args_str += f", {k}={v}"
                                ^
  193 |
  194 |     abs_path = os.path.abspath(sys.argv[0])

torch/_inductor/runtime/triton_heuristics.py:192:27: Builtin `set` is deprecated
  190 |     args_str += ", ".join(call_args)
  191 |     for k, v in call_kwargs.items():
  192 |         args_str += f", {k}={v}"
                                  ^
  193 |
  194 |     abs_path = os.path.abspath(sys.argv[0])

torch/_inductor/runtime/triton_heuristics.py:192:29: Builtin `set` is deprecated
  190 |     args_str += ", ".join(call_args)
  191 |     for k, v in call_kwargs.items():
  192 |         args_str += f", {k}={v}"
                                    ^
  193 |
  194 |     abs_path = os.path.abspath(sys.argv[0])

torch/_inductor/runtime/triton_heuristics.py:192:31: Builtin `set` is deprecated
  190 |     args_str += ", ".join(call_args)
  191 |     for k, v in call_kwargs.items():
  192 |         args_str += f", {k}={v}"
                                      ^
  193 |
  194 |     abs_path = os.path.abspath(sys.argv[0])

torch/_inductor/runtime/triton_heuristics.py:195:17: Builtin `set` is deprecated
  193 |
  194 |     abs_path = os.path.abspath(sys.argv[0])
  195 |     with open(f"{abs_path}.launch_params", "a") as f:
                        ^
  196 |         f.write(f"{kernel_name} | {args_str}\n")
  197 |

torch/_inductor/runtime/triton_heuristics.py:195:26: Builtin `set` is deprecated
  193 |
  194 |     abs_path = os.path.abspath(sys.argv[0])
  195 |     with open(f"{abs_path}.launch_params", "a") as f:
                                 ^
  196 |         f.write(f"{kernel_name} | {args_str}\n")
  197 |

torch/_inductor/runtime/triton_heuristics.py:196:19: Builtin `set` is deprecated
  194 |     abs_path = os.path.abspath(sys.argv[0])
  195 |     with open(f"{abs_path}.launch_params", "a") as f:
  196 |         f.write(f"{kernel_name} | {args_str}\n")
                          ^
  197 |
  198 |

torch/_inductor/runtime/triton_heuristics.py:196:31: Builtin `set` is deprecated
  194 |     abs_path = os.path.abspath(sys.argv[0])
  195 |     with open(f"{abs_path}.launch_params", "a") as f:
  196 |         f.write(f"{kernel_name} | {args_str}\n")
                                      ^
  197 |
  198 |

torch/_inductor/runtime/triton_heuristics.py:196:35: Builtin `set` is deprecated
  194 |     abs_path = os.path.abspath(sys.argv[0])
  195 |     with open(f"{abs_path}.launch_params", "a") as f:
  196 |         f.write(f"{kernel_name} | {args_str}\n")
                                          ^
  197 |
  198 |

torch/_inductor/runtime/triton_heuristics.py:196:44: Builtin `set` is deprecated
  194 |     abs_path = os.path.abspath(sys.argv[0])
  195 |     with open(f"{abs_path}.launch_params", "a") as f:
  196 |         f.write(f"{kernel_name} | {args_str}\n")
                                                   ^
  197 |
  198 |

torch/_inductor/runtime/triton_heuristics.py:729:26: Builtin `set` is deprecated
  727 |         exec(
  728 |             f"""
  729 |             def launcher({', '.join(def_args)}, grid, stream):
                                 ^
  730 |                 if callable(grid):
  731 |                     grid_0, grid_1, grid_2 = grid(grid_meta)

torch/_inductor/runtime/triton_heuristics.py:729:46: Builtin `set` is deprecated
  727 |         exec(
  728 |             f"""
  729 |             def launcher({', '.join(def_args)}, grid, stream):
                                                     ^
  730 |                 if callable(grid):
  731 |                     grid_0, grid_1, grid_2 = grid(grid_meta)

torch/_inductor/runtime/triton_heuristics.py:735:24: Builtin `set` is deprecated
  733 |                     grid_0, grid_1, grid_2 = grid
  734 |
  735 |                 args = {', '.join(call_args)},
                               ^
  736 |                 launch_args = get_launch_args(
  737 |                     grid, grid_0, grid_1, grid_2, stream, function,

torch/_inductor/runtime/triton_heuristics.py:735:45: Builtin `set` is deprecated
  733 |                     grid_0, grid_1, grid_2 = grid
  734 |
  735 |                 args = {', '.join(call_args)},
                                                    ^
  736 |                 launch_args = get_launch_args(
  737 |                     grid, grid_0, grid_1, grid_2, stream, function,

torch/_inductor/runtime/triton_heuristics.py:1144:20: Builtin `set` is deprecated
 1142 |     cur_file = inspect.stack()[1].filename
 1143 |     summary_str = (
 1144 |         f"SUMMARY ({cur_file})\n"
                           ^
 1145 |         f"{overall_time:.2f}ms   \t {overall_gb:.2f} GB\t {overall_gb / (overall_time / 1e3):.2f}GB/s"
 1146 |     )

torch/_inductor/runtime/triton_heuristics.py:1144:29: Builtin `set` is deprecated
 1142 |     cur_file = inspect.stack()[1].filename
 1143 |     summary_str = (
 1144 |         f"SUMMARY ({cur_file})\n"
                                    ^
 1145 |         f"{overall_time:.2f}ms   \t {overall_gb:.2f} GB\t {overall_gb / (overall_time / 1e3):.2f}GB/s"
 1146 |     )

torch/_inductor/runtime/triton_heuristics.py:1162:61: Builtin `set` is deprecated
 1160 |                 )
 1161 |                 file.write("====================\n")
 1162 |                 file.write(f"TRITON KERNELS BANDWIDTH INFO ({cur_file})\n")
                                                                    ^
 1163 |                 for ms, num_gb, gb_per_s, kernel_name in sorted_calls:
 1164 |                     # also display the runtime percentage for each kernel

torch/_inductor/runtime/triton_heuristics.py:1162:70: Builtin `set` is deprecated
 1160 |                 )
 1161 |                 file.write("====================\n")
 1162 |                 file.write(f"TRITON KERNELS BANDWIDTH INFO ({cur_file})\n")
                                                                             ^
 1163 |                 for ms, num_gb, gb_per_s, kernel_name in sorted_calls:
 1164 |                     # also display the runtime percentage for each kernel

torch/_inductor/runtime/triton_heuristics.py:1166:36: Builtin `set` is deprecated
 1164 |                     # also display the runtime percentage for each kernel
 1165 |                     percentage = f"{ms / overall_time * 100:.2f}%"
 1166 |                     suffix = f" \t {percentage} \t {kernel_name}"
                                           ^
 1167 |                     bw_info_str = create_bandwidth_info_str(
 1168 |                         ms,

torch/_inductor/runtime/triton_heuristics.py:1166:47: Builtin `set` is deprecated
 1164 |                     # also display the runtime percentage for each kernel
 1165 |                     percentage = f"{ms / overall_time * 100:.2f}%"
 1166 |                     suffix = f" \t {percentage} \t {kernel_name}"
                                                      ^
 1167 |                     bw_info_str = create_bandwidth_info_str(
 1168 |                         ms,

torch/_inductor/runtime/triton_heuristics.py:1166:52: Builtin `set` is deprecated
 1164 |                     # also display the runtime percentage for each kernel
 1165 |                     percentage = f"{ms / overall_time * 100:.2f}%"
 1166 |                     suffix = f" \t {percentage} \t {kernel_name}"
                                                           ^
 1167 |                     bw_info_str = create_bandwidth_info_str(
 1168 |                         ms,

torch/_inductor/runtime/triton_heuristics.py:1166:64: Builtin `set` is deprecated
 1164 |                     # also display the runtime percentage for each kernel
 1165 |                     percentage = f"{ms / overall_time * 100:.2f}%"
 1166 |                     suffix = f" \t {percentage} \t {kernel_name}"
                                                                       ^
 1167 |                     bw_info_str = create_bandwidth_info_str(
 1168 |                         ms,

torch/_inductor/runtime/triton_heuristics.py:1175:30: Builtin `set` is deprecated
 1173 |                     )
 1174 |                     file.write(bw_info_str + "\n")
 1175 |                 file.write(f"{summary_str}\n\n")
                                     ^
 1176 |         except Exception as e:
 1177 |             log.warning(

torch/_inductor/runtime/triton_heuristics.py:1175:42: Builtin `set` is deprecated
 1173 |                     )
 1174 |                     file.write(bw_info_str + "\n")
 1175 |                 file.write(f"{summary_str}\n\n")
                                                 ^
 1176 |         except Exception as e:
 1177 |             log.warning(

torch/_inductor/runtime/triton_heuristics.py:1205:29: Builtin `set` is deprecated
 1203 |         else:
 1204 |             possible_names = _find_names(self)
 1205 |             kernel_name = f"{max(possible_names, key=len)}"
                                    ^
 1206 |             if not re.match(self.regex_filter, kernel_name):
 1207 |                 return

torch/_inductor/runtime/triton_heuristics.py:1205:58: Builtin `set` is deprecated
 1203 |         else:
 1204 |             possible_names = _find_names(self)
 1205 |             kernel_name = f"{max(possible_names, key=len)}"
                                                                 ^
 1206 |             if not re.match(self.regex_filter, kernel_name):
 1207 |                 return

torch/_inductor/runtime/triton_heuristics.py:1241:60: Builtin `set` is deprecated
 1239 |                     "%s",
 1240 |                     create_bandwidth_info_str(
 1241 |                         ms, num_gb, gb_per_s, suffix=f" \t {kernel_name}"
                                                                   ^
 1242 |                     ),
 1243 |                 )

torch/_inductor/runtime/triton_heuristics.py:1241:72: Builtin `set` is deprecated
 1239 |                     "%s",
 1240 |                     create_bandwidth_info_str(
 1241 |                         ms, num_gb, gb_per_s, suffix=f" \t {kernel_name}"
                                                                               ^
 1242 |                     ),
 1243 |                 )

torch/_inductor/runtime/triton_heuristics.py:1256:15: Builtin `set` is deprecated
 1254 |     for cfg in configs:
 1255 |         hasher.update(
 1256 |             f"{sorted(cfg.kwargs.items())} {cfg.num_warps} {cfg.num_stages}\n".encode()
                      ^
 1257 |         )
 1258 |     return hasher.hexdigest()

torch/_inductor/runtime/triton_heuristics.py:1256:42: Builtin `set` is deprecated
 1254 |     for cfg in configs:
 1255 |         hasher.update(
 1256 |             f"{sorted(cfg.kwargs.items())} {cfg.num_warps} {cfg.num_stages}\n".encode()
                                                 ^
 1257 |         )
 1258 |     return hasher.hexdigest()

torch/_inductor/runtime/triton_heuristics.py:1256:44: Builtin `set` is deprecated
 1254 |     for cfg in configs:
 1255 |         hasher.update(
 1256 |             f"{sorted(cfg.kwargs.items())} {cfg.num_warps} {cfg.num_stages}\n".encode()
                                                   ^
 1257 |         )
 1258 |     return hasher.hexdigest()

torch/_inductor/runtime/triton_heuristics.py:1256:58: Builtin `set` is deprecated
 1254 |     for cfg in configs:
 1255 |         hasher.update(
 1256 |             f"{sorted(cfg.kwargs.items())} {cfg.num_warps} {cfg.num_stages}\n".encode()
                                                                 ^
 1257 |         )
 1258 |     return hasher.hexdigest()

torch/_inductor/runtime/triton_heuristics.py:1256:60: Builtin `set` is deprecated
 1254 |     for cfg in configs:
 1255 |         hasher.update(
 1256 |             f"{sorted(cfg.kwargs.items())} {cfg.num_warps} {cfg.num_stages}\n".encode()
                                                                   ^
 1257 |         )
 1258 |     return hasher.hexdigest()

torch/_inductor/runtime/triton_heuristics.py:1256:75: Builtin `set` is deprecated
 1254 |     for cfg in configs:
 1255 |         hasher.update(
 1256 |             f"{sorted(cfg.kwargs.items())} {cfg.num_warps} {cfg.num_stages}\n".encode()
                                                                                  ^
 1257 |         )
 1258 |     return hasher.hexdigest()

torch/_inductor/runtime/triton_heuristics.py:1377:23: Builtin `set` is deprecated
 1375 |         if numel is None:
 1376 |             continue
 1377 |         block = cfg[f"{label}BLOCK"]
                              ^
 1378 |         if numel == 1:
 1379 |             assert block == 1, (

torch/_inductor/runtime/triton_heuristics.py:1377:29: Builtin `set` is deprecated
 1375 |         if numel is None:
 1376 |             continue
 1377 |         block = cfg[f"{label}BLOCK"]
                                    ^
 1378 |         if numel == 1:
 1379 |             assert block == 1, (

torch/_inductor/runtime/triton_heuristics.py:1381:24: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                               ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:38: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                             ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:46: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                     ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:52: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                           ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:58: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                                 ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:64: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                                       ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:71: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                                              ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:77: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                                                    ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:84: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                                                           ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1381:88: Builtin `set` is deprecated
 1379 |             assert block == 1, (
 1380 |                 f"TritonKernel.indexing assumes numel == 1 => BLOCK == 1"
 1381 |                 f" but {label.lower()}numel=={numel} and {label}BLOCK={block} (cfg={cfg})."
                                                                                               ^
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]

torch/_inductor/runtime/triton_heuristics.py:1384:52: Builtin `set` is deprecated
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]
 1384 |         max_block_str = f'config.triton.max_block["{label}"]'
                                                           ^
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"

torch/_inductor/runtime/triton_heuristics.py:1384:58: Builtin `set` is deprecated
 1382 |             )
 1383 |         max_block = TRITON_MAX_BLOCK[label]
 1384 |         max_block_str = f'config.triton.max_block["{label}"]'
                                                                 ^
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"

torch/_inductor/runtime/triton_heuristics.py:1386:45: Builtin `set` is deprecated
 1384 |         max_block_str = f'config.triton.max_block["{label}"]'
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
                                                    ^
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
 1388 |         )

torch/_inductor/runtime/triton_heuristics.py:1386:51: Builtin `set` is deprecated
 1384 |         max_block_str = f'config.triton.max_block["{label}"]'
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
                                                          ^
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
 1388 |         )

torch/_inductor/runtime/triton_heuristics.py:1386:66: Builtin `set` is deprecated
 1384 |         max_block_str = f'config.triton.max_block["{label}"]'
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
                                                                         ^
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
 1388 |         )

torch/_inductor/runtime/triton_heuristics.py:1386:80: Builtin `set` is deprecated
 1384 |         max_block_str = f'config.triton.max_block["{label}"]'
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
                                                                                       ^
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
 1388 |         )

torch/_inductor/runtime/triton_heuristics.py:1387:20: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                           ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:26: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                 ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:33: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                        ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:39: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                              ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:45: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                                    ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:59: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                                                  ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:61: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                                                    ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:71: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                                                              ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:78: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                                                                     ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1387:82: Builtin `set` is deprecated
 1385 |         assert max_block % block == 0, (
 1386 |             f"TritonKernel.indexing assumes {label}BLOCK divides {max_block_str}"
 1387 |             f" but {label}BLOCK={block} and {max_block_str}={max_block} (cfg={cfg})."
                                                                                         ^
 1388 |         )
 1389 |

torch/_inductor/runtime/triton_heuristics.py:1402:19: Builtin `set` is deprecated
 1400 |             assert (
 1401 |                 val <= max_block
 1402 |             ), f"'{var}' too large. Maximum: {max_block}. Actual: {val}."
                          ^
 1403 |
 1404 |

torch/_inductor/runtime/triton_heuristics.py:1402:23: Builtin `set` is deprecated
 1400 |             assert (
 1401 |                 val <= max_block
 1402 |             ), f"'{var}' too large. Maximum: {max_block}. Actual: {val}."
                              ^
 1403 |
 1404 |

torch/_inductor/runtime/triton_heuristics.py:1402:46: Builtin `set` is deprecated
 1400 |             assert (
 1401 |                 val <= max_block
 1402 |             ), f"'{var}' too large. Maximum: {max_block}. Actual: {val}."
                                                     ^
 1403 |
 1404 |

torch/_inductor/runtime/triton_heuristics.py:1402:56: Builtin `set` is deprecated
 1400 |             assert (
 1401 |                 val <= max_block
 1402 |             ), f"'{var}' too large. Maximum: {max_block}. Actual: {val}."
                                                               ^
 1403 |
 1404 |

torch/_inductor/runtime/triton_heuristics.py:1402:67: Builtin `set` is deprecated
 1400 |             assert (
 1401 |                 val <= max_block
 1402 |             ), f"'{var}' too large. Maximum: {max_block}. Actual: {val}."
                                                                          ^
 1403 |
 1404 |

torch/_inductor/runtime/triton_heuristics.py:1402:71: Builtin `set` is deprecated
 1400 |             assert (
 1401 |                 val <= max_block
 1402 |             ), f"'{var}' too large. Maximum: {max_block}. Actual: {val}."
                                                                              ^
 1403 |
 1404 |

torch/_inductor/runtime/triton_heuristics.py:1551:21: Builtin `set` is deprecated
 1549 |     rnumels = {}
 1550 |     for idx in range(num_reduction_dims - 1, -1, -1):
 1551 |         prefix = f"r{idx}_"
                            ^
 1552 |         max_size = min(size_hints[prefix], TRITON_MAX_BLOCK[prefix.upper()])
 1553 |         dim = min(max_size, remaining)

torch/_inductor/runtime/triton_heuristics.py:1551:25: Builtin `set` is deprecated
 1549 |     rnumels = {}
 1550 |     for idx in range(num_reduction_dims - 1, -1, -1):
 1551 |         prefix = f"r{idx}_"
                                ^
 1552 |         max_size = min(size_hints[prefix], TRITON_MAX_BLOCK[prefix.upper()])
 1553 |         dim = min(max_size, remaining)

torch/_inductor/runtime/triton_heuristics.py:1556:34: Builtin `set` is deprecated
 1554 |         assert (
 1555 |             remaining % dim == 0
 1556 |         ), f"Expected dimension '{dim}' to divide remaining size '{remaining}'"
                                         ^
 1557 |         rnumels[prefix] = dim
 1558 |         remaining //= dim

torch/_inductor/runtime/triton_heuristics.py:1556:38: Builtin `set` is deprecated
 1554 |         assert (
 1555 |             remaining % dim == 0
 1556 |         ), f"Expected dimension '{dim}' to divide remaining size '{remaining}'"
                                             ^
 1557 |         rnumels[prefix] = dim
 1558 |         remaining //= dim

torch/_inductor/runtime/triton_heuristics.py:1556:67: Builtin `set` is deprecated
 1554 |         assert (
 1555 |             remaining % dim == 0
 1556 |         ), f"Expected dimension '{dim}' to divide remaining size '{remaining}'"
                                                                          ^
 1557 |         rnumels[prefix] = dim
 1558 |         remaining //= dim

torch/_inductor/runtime/triton_heuristics.py:1556:77: Builtin `set` is deprecated
 1554 |         assert (
 1555 |             remaining % dim == 0
 1556 |         ), f"Expected dimension '{dim}' to divide remaining size '{remaining}'"
                                                                                    ^
 1557 |         rnumels[prefix] = dim
 1558 |         remaining //= dim

torch/_inductor/runtime/triton_heuristics.py:1564:38: Builtin `set` is deprecated
 1562 |     assert (
 1563 |         r == final_numel
 1564 |     ), f"Expected ND reduction size ({rnumels}) to have {r} elements."
                                             ^
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels

torch/_inductor/runtime/triton_heuristics.py:1564:46: Builtin `set` is deprecated
 1562 |     assert (
 1563 |         r == final_numel
 1564 |     ), f"Expected ND reduction size ({rnumels}) to have {r} elements."
                                                     ^
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels

torch/_inductor/runtime/triton_heuristics.py:1564:57: Builtin `set` is deprecated
 1562 |     assert (
 1563 |         r == final_numel
 1564 |     ), f"Expected ND reduction size ({rnumels}) to have {r} elements."
                                                                ^
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels

torch/_inductor/runtime/triton_heuristics.py:1564:59: Builtin `set` is deprecated
 1562 |     assert (
 1563 |         r == final_numel
 1564 |     ), f"Expected ND reduction size ({rnumels}) to have {r} elements."
                                                                  ^
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels

torch/_inductor/runtime/triton_heuristics.py:1567:37: Builtin `set` is deprecated
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels
 1567 |     ), f"rnumels exceed size_hints. {rnumels} > {size_hints}"
                                            ^
 1568 |
 1569 |     return rnumels

torch/_inductor/runtime/triton_heuristics.py:1567:45: Builtin `set` is deprecated
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels
 1567 |     ), f"rnumels exceed size_hints. {rnumels} > {size_hints}"
                                                    ^
 1568 |
 1569 |     return rnumels

torch/_inductor/runtime/triton_heuristics.py:1567:49: Builtin `set` is deprecated
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels
 1567 |     ), f"rnumels exceed size_hints. {rnumels} > {size_hints}"
                                                        ^
 1568 |
 1569 |     return rnumels

torch/_inductor/runtime/triton_heuristics.py:1567:60: Builtin `set` is deprecated
 1565 |     assert all(
 1566 |         rnumels[prefix] <= size_hints[prefix] for prefix in rnumels
 1567 |     ), f"rnumels exceed size_hints. {rnumels} > {size_hints}"
                                                                   ^
 1568 |
 1569 |     return rnumels

torch/_inductor/runtime/triton_heuristics.py:1746:49: Builtin `set` is deprecated
 1744 |
 1745 |     if not configs:
 1746 |         raise NotImplementedError(f"size_hints: {size_hints}")
                                                        ^
 1747 |     return cached_autotune(
 1748 |         size_hints,

torch/_inductor/runtime/triton_heuristics.py:1746:60: Builtin `set` is deprecated
 1744 |
 1745 |     if not configs:
 1746 |         raise NotImplementedError(f"size_hints: {size_hints}")
                                                                   ^
 1747 |     return cached_autotune(
 1748 |         size_hints,

torch/_inductor/runtime/triton_heuristics.py:1928:32: Builtin `set` is deprecated
 1926 |         for prefix in size_hints:
 1927 |             if prefix_is_reduction(prefix):
 1928 |                 c.kwargs.pop(f"{prefix.upper()}BLOCK")
                                       ^
 1929 |
 1930 |     if disable_pointwise_autotuning(inductor_meta):

torch/_inductor/runtime/triton_heuristics.py:1928:47: Builtin `set` is deprecated
 1926 |         for prefix in size_hints:
 1927 |             if prefix_is_reduction(prefix):
 1928 |                 c.kwargs.pop(f"{prefix.upper()}BLOCK")
                                                      ^
 1929 |
 1930 |     if disable_pointwise_autotuning(inductor_meta):

torch/_inductor/runtime/triton_heuristics.py:1975:49: Builtin `set` is deprecated
 1973 |     assert triton_meta is not None
 1974 |     if len(size_hints) != 2:
 1975 |         raise NotImplementedError(f"size_hints: {size_hints}")
                                                        ^
 1976 |
 1977 |     configs = _reduction_configs(size_hints=size_hints, inductor_meta=inductor_meta)

torch/_inductor/runtime/triton_heuristics.py:1975:60: Builtin `set` is deprecated
 1973 |     assert triton_meta is not None
 1974 |     if len(size_hints) != 2:
 1975 |         raise NotImplementedError(f"size_hints: {size_hints}")
                                                                   ^
 1976 |
 1977 |     configs = _reduction_configs(size_hints=size_hints, inductor_meta=inductor_meta)

torch/_inductor/runtime/triton_heuristics.py:2082:56: Builtin `set` is deprecated
 2080 |         xnumel, ynumel, znumel = numels[2], numels[1], numels[0]
 2081 |     else:
 2082 |         raise AssertionError(f"invalid size for numels {len(numels)}")
                                                               ^
 2083 |
 2084 |     def get_grid_dim(numel, block):

torch/_inductor/runtime/triton_heuristics.py:2082:68: Builtin `set` is deprecated
 2080 |         xnumel, ynumel, znumel = numels[2], numels[1], numels[0]
 2081 |     else:
 2082 |         raise AssertionError(f"invalid size for numels {len(numels)}")
                                                                           ^
 2083 |
 2084 |     def get_grid_dim(numel, block):

torch/_inductor/runtime/triton_heuristics.py:2104:57: Builtin `set` is deprecated
 2102 |             torch._check(
 2103 |                 y_grid <= max_y_grid,
 2104 |                 lambda: f"Generated y grid beyond 2^16 ({y_grid}) not supported with z dimension present. File issue",
                                                                ^
 2105 |             )
 2106 |

torch/_inductor/runtime/triton_heuristics.py:2104:64: Builtin `set` is deprecated
 2102 |             torch._check(
 2103 |                 y_grid <= max_y_grid,
 2104 |                 lambda: f"Generated y grid beyond 2^16 ({y_grid}) not supported with z dimension present. File issue",
                                                                       ^
 2105 |             )
 2106 |

torch/_inductor/runtime/triton_heuristics.py:2113:43: Builtin `set` is deprecated
 2111 |         )
 2112 |
 2113 |     setattr(grid_fn, "grid_fn_str", f"grid{numels}")  # noqa: B010
                                                  ^
 2114 |
 2115 |     return grid_fn

torch/_inductor/runtime/triton_heuristics.py:2113:50: Builtin `set` is deprecated
 2111 |         )
 2112 |
 2113 |     setattr(grid_fn, "grid_fn_str", f"grid{numels}")  # noqa: B010
                                                         ^
 2114 |
 2115 |     return grid_fn

torch/_inductor/runtime/triton_heuristics.py:2122:48: Builtin `set` is deprecated
 2120 |         return (meta["RSPLIT"], ceildiv(xnumel, meta.get("XBLOCK", 1)), 1)
 2121 |
 2122 |     grid_fn_str = f"cooperative_reduction_grid({xnumel})"
                                                       ^
 2123 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2124 |     return grid_fn

torch/_inductor/runtime/triton_heuristics.py:2122:55: Builtin `set` is deprecated
 2120 |         return (meta["RSPLIT"], ceildiv(xnumel, meta.get("XBLOCK", 1)), 1)
 2121 |
 2122 |     grid_fn_str = f"cooperative_reduction_grid({xnumel})"
                                                              ^
 2123 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2124 |     return grid_fn

torch/_inductor/runtime/triton_heuristics.py:2135:54: Builtin `set` is deprecated
 2133 |     coop_grid = cooperative_reduction_grid(xnumel)
 2134 |     normal_grid = grid(xnumel)
 2135 |     grid_fn_str = f"maybe_cooperative_reduction_grid({xnumel})"
                                                             ^
 2136 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2137 |     return grid_fn

torch/_inductor/runtime/triton_heuristics.py:2135:61: Builtin `set` is deprecated
 2133 |     coop_grid = cooperative_reduction_grid(xnumel)
 2134 |     normal_grid = grid(xnumel)
 2135 |     grid_fn_str = f"maybe_cooperative_reduction_grid({xnumel})"
                                                                    ^
 2136 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2137 |     return grid_fn

torch/_inductor/runtime/triton_heuristics.py:2145:37: Builtin `set` is deprecated
 2143 |         return (ceildiv(rnumel, meta.get("R0_BLOCK", 1)), xnumel, 1)
 2144 |
 2145 |     grid_fn_str = f"split_scan_grid({xnumel}, {rnumel})"
                                            ^
 2146 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2147 |

torch/_inductor/runtime/triton_heuristics.py:2145:44: Builtin `set` is deprecated
 2143 |         return (ceildiv(rnumel, meta.get("R0_BLOCK", 1)), xnumel, 1)
 2144 |
 2145 |     grid_fn_str = f"split_scan_grid({xnumel}, {rnumel})"
                                                   ^
 2146 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2147 |

torch/_inductor/runtime/triton_heuristics.py:2145:47: Builtin `set` is deprecated
 2143 |         return (ceildiv(rnumel, meta.get("R0_BLOCK", 1)), xnumel, 1)
 2144 |
 2145 |     grid_fn_str = f"split_scan_grid({xnumel}, {rnumel})"
                                                      ^
 2146 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2147 |

torch/_inductor/runtime/triton_heuristics.py:2145:54: Builtin `set` is deprecated
 2143 |         return (ceildiv(rnumel, meta.get("R0_BLOCK", 1)), xnumel, 1)
 2144 |
 2145 |     grid_fn_str = f"split_scan_grid({xnumel}, {rnumel})"
                                                             ^
 2146 |     setattr(grid_fn, "grid_fn_str", grid_fn_str)  # noqa: B010
 2147 |

torch/_inductor/runtime/triton_heuristics.py:2173:42: Builtin `set` is deprecated
 2171 |             assert (
 2172 |                 min_blocks_d is None or min_blocks == min_blocks_d
 2173 |             ), f"inconsistent min_blocks {min_blocks} vs  x grid {numels[-1]}"
                                                 ^
 2174 |     else:
 2175 |         # sequential dispatch

torch/_inductor/runtime/triton_heuristics.py:2173:53: Builtin `set` is deprecated
 2171 |             assert (
 2172 |                 min_blocks_d is None or min_blocks == min_blocks_d
 2173 |             ), f"inconsistent min_blocks {min_blocks} vs  x grid {numels[-1]}"
                                                            ^
 2174 |     else:
 2175 |         # sequential dispatch

torch/_inductor/runtime/triton_heuristics.py:2173:66: Builtin `set` is deprecated
 2171 |             assert (
 2172 |                 min_blocks_d is None or min_blocks == min_blocks_d
 2173 |             ), f"inconsistent min_blocks {min_blocks} vs  x grid {numels[-1]}"
                                                                         ^
 2174 |     else:
 2175 |         # sequential dispatch

torch/_inductor/runtime/triton_heuristics.py:2173:77: Builtin `set` is deprecated
 2171 |             assert (
 2172 |                 min_blocks_d is None or min_blocks == min_blocks_d
 2173 |             ), f"inconsistent min_blocks {min_blocks} vs  x grid {numels[-1]}"
                                                                                    ^
 2174 |     else:
 2175 |         # sequential dispatch
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143628
Approved by: https://github.com/yanboliang, https://github.com/rec
2024-12-20 11:45:26 +00:00
Ryan Guo
629de4da60 [dynamo] Add a lint rule to restrict what 3P library one can import (#143312)
As the title says, this patch prevents developers from importing third-party
libraries to patch things in Dynamo, unless there is no other easy
workaround (in which case one would add the library to the allowlist in
`import_linter.py`, as instructed by the lint error).

For instance, if we remove `einops` from the allowlist, we'd get the following lint errors:
```verbatim
>>> Lint for torch/_dynamo/decorators.py:

  Error (IMPORT) Disallowed import

    importing from einops is not allowed, if you believe there's a valid
    reason, please add it to import_linter.py

        608  |# Note: this carefully avoids eagerly import einops.
        609  |# TODO: we should delete this whole _allow_in_graph_einops logic by approximately 2024 Q2
        610  |def _allow_in_graph_einops():
    >>> 611  |    import einops
        612  |
        613  |    try:
        614  |        # requires einops > 0.6.1, torch >= 2.0

  Error (IMPORT) Disallowed import

    importing from einops is not allowed, if you believe there's a valid
    reason, please add it to import_linter.py

        612  |
        613  |    try:
        614  |        # requires einops > 0.6.1, torch >= 2.0
    >>> 615  |        from einops._torch_specific import (  # type: ignore[attr-defined]  # noqa: F401
        616  |            _ops_were_registered_in_torchdynamo,
        617  |        )
        618  |
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143312
Approved by: https://github.com/zou3519
2024-12-19 20:59:16 +00:00
Eli Uriegas
b247f87845 tools: Add a tool to build wheels for multiple python versions (#143361)
Adds a tool to build bdist_wheels sequentially for multiple Python
versions (if specified).

The goal is to eventually use this tool in our binary build runs and
significantly reduce package build time by reusing the local ccache
populated by the first build.

Tested locally using the following:
```
$ ccache -C # clear cache
# -p could actually reference any python interpreter
$ python tools/packaging/build_wheel.py \
	-p /home/eliuriegas/.local/share/uv/python/cpython-3.12.7-linux-x86_64-gnu/bin/python3.12 \
	-p /home/eliuriegas/.local/share/uv/python/cpython-3.13.0-linux-x86_64-gnu/bin/python3.13 \
	-d dist-multi/
...
2024-12-17 10:48:11,365 - INFO - Build time (3.12.7): 571.440689s
2024-12-17 10:48:11,365 - INFO - Build time (3.13.0): 191.147503s
```
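
The sequential approach can be sketched roughly as follows. This is not the code in `tools/packaging/build_wheel.py`: the helper names `wheel_command` and `build_wheels`, and the assumption that each build is a plain `setup.py bdist_wheel -d <dir>` invocation, are illustrative only.

```python
import subprocess
import time


def wheel_command(python, dest):
    """Command line for one wheel build (assumed form, for illustration)."""
    return [python, "setup.py", "bdist_wheel", "-d", dest]


def build_wheels(pythons, dest, run=subprocess.run):
    """Build one wheel per interpreter, in order, and return the timings.

    Builds are sequential on purpose, so that later builds can reuse
    whatever compiler cache the first build populated.
    """
    timings = {}
    for python in pythons:
        start = time.perf_counter()
        # check=True aborts the remaining builds if one of them fails.
        run(wheel_command(python, dest), check=True)
        timings[python] = time.perf_counter() - start
    return timings
```

The `run` parameter exists only so the loop can be exercised without starting a real build. In the log above, the second build takes about a third of the time of the first.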

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143361
Approved by: https://github.com/malfet, https://github.com/atalman
2024-12-17 21:56:06 +00:00
Chirag Pandya
0bdc173ab6 [fr] recognize all_reduce_barrier as a valid op (#143354)
Summary:
D67068632 introduced a more descriptive profiling name for barrier operations so that the various ops can be told apart.

Unfortunately, this broke Flight Recorder Analysis with the following error, as reported by dmwu:
```
fr_trace -m torchx-param_bench_16g_mi300x-all_to_all -a 0 --mast_job_version 98 -w 16
Traceback (most recent call last):
  File "/usr/local/fbcode/platform010/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/fbcode/platform010/lib/python3.10/runpy.py", line 86, in _run_code
```

Test Plan: Test manually.

Differential Revision: D67305997

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143354
Approved by: https://github.com/wconstab
2024-12-17 21:09:18 +00:00
PyTorch MergeBot
969b07b96f Revert "[ROCm] CK Flash Attention Backend (#138947)"
This reverts commit 500d02921b.

Reverted https://github.com/pytorch/pytorch/pull/138947 on behalf of https://github.com/atalman due to Breaks default windows checkout ([comment](https://github.com/pytorch/pytorch/pull/138947#issuecomment-2548998359))
2024-12-17 16:46:57 +00:00
Andy Lugo
500d02921b [ROCm] CK Flash Attention Backend (#138947)
Replaces https://github.com/ROCm/pytorch/pull/1592

This PR contains the initial implementation of SDPA with the composable_kernel (CK) backend. The CK path can be forced by calling `torch.backends.cuda.preferred_rocm_fa_library("ck")`. Similarly, you can force the incumbent aotriton implementation by passing in "aotriton" or "default". As you'd expect, not setting this option results in aotriton being used as the backend. In the case of CK, if PyTorch deems flash attention usable, it will use the CK path in all the same places aotriton would have been used. This PR makes no changes to the heuristics that select which attention scheme to use (i.e. flash attention vs. memory-efficient attention vs. math, etc.). The CK path is only called when flash attention is both enabled (via `USE_FLASH_ATTENTION`) and selected at runtime by the existing heuristics.

Files located in pytorch/aten/src/ATen/native/transformers/hip/flash_attn/ck/mha* have been pulled from https://github.com/Dao-AILab/flash-attention, courtesy of the hard work of @tridao, who is the co-author.

NOTE: In order to use this backend, the user MUST set USE_CK_FLASH_ATTENTION=1 in their environment when they build PyTorch.
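
As a hedged usage sketch of the selection call described above: the wrapper `prefer_rocm_fa_backend` is a hypothetical helper, not part of PyTorch. It assumes `torch.backends.cuda.preferred_rocm_fa_library` accepts the three names given in this message, and it checks that the function exists first, since a given build may not include this change.

```python
import importlib.util

# Backend names taken from the description above.
VALID_BACKENDS = ("default", "aotriton", "ck")


def prefer_rocm_fa_backend(name):
    """Request a ROCm flash attention backend, if this PyTorch build allows it.

    Returns True when the preference was passed on to PyTorch and False when
    torch, or the selection function itself, is not available.
    """
    if name not in VALID_BACKENDS:
        raise ValueError(f"unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    setter = getattr(torch.backends.cuda, "preferred_rocm_fa_library", None)
    if setter is None:
        # This build predates the change, or does not include it.
        return False
    setter(name)
    return True
```

Calling `prefer_rocm_fa_backend("ck")` before running attention would request the CK path. Per the note above, the backend is only usable in a build made with `USE_CK_FLASH_ATTENTION=1`.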

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138947
Approved by: https://github.com/pruthvistony, https://github.com/xw285cornell, https://github.com/leitian

Co-authored-by: Xiaodong Wang <xw285@cornell.edu>
2024-12-17 02:18:07 +00:00
rzou
557da8014d [gen_autograd_functions] rename some variables (#143166)
This is a follow-up from https://github.com/pytorch/pytorch/pull/141278.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143166
Approved by: https://github.com/soulitzer
2024-12-16 23:18:55 +00:00
Huy Do
39cacc1d81 Fix missing tests on test tool lint job (#143052)
A follow-up from https://github.com/pytorch/pytorch/pull/142476#discussion_r1878888558, where some tests were not discovered correctly by pytest.

### Testing

https://github.com/pytorch/pytorch/actions/runs/12287448581/job/34289531307?pr=143052#step:14:162 shows the correct number of tests now
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143052
Approved by: https://github.com/ZainRizvi
2024-12-12 20:29:32 +00:00