PyTorch Benchmarks

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.
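At a high level, most timing scripts here follow the same recipe: warm up, run the workload repeatedly, and report a robust statistic rather than a single measurement. Below is a minimal stdlib-only sketch of that recipe; the `benchmark` helper and the placeholder workload are illustrative assumptions, not code from this repository (the real scripts time PyTorch operators, often via `torch.utils.benchmark`):

```python
import statistics
import time


def benchmark(fn, warmup=5, repeats=20):
    """Time `fn`: run warmup iterations first, then report the median of repeats."""
    for _ in range(warmup):
        fn()  # warmup amortizes one-time costs (allocator, caches, JIT)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    # Median is less sensitive to scheduler noise than the mean.
    return statistics.median(samples)


# Placeholder workload standing in for a PyTorch operation.
median_sec = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"median: {median_sec * 1e6:.1f} us")
```

Reporting the median over many repeats (after warmup) is what makes the numbers reproducible enough to compare across commits or frameworks.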

Setup environment

Make sure you're on a machine with CUDA, torchvision, and PyTorch installed. Install them in the following order:

# Install torchvision. It comes with the pytorch stable release binary
conda install pytorch torchvision -c pytorch

# Install the latest pytorch master from source.
# It should supersede the installation from the release binary.
cd $PYTORCH_HOME
python setup.py build develop

# Check the pytorch installation version
python -c "import torch; print(torch.__version__)"

Benchmark List

Please refer to each subfolder to discover each benchmark suite. Links are provided where descriptions exist: