mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Wang, Eikan 0bde810572 Add more debug information for Inductor (#90008)	2022-12-02 09:34:48 +00:00

- Add graph index to the profile information of the Inductor kernel for better debuggability. The generated code for different graphs could produce kernels with the same name. The side effect is that it is hard to identify the portion of E2E performance for these kernels because the profiler will aggregate the performance with the same kernel name regardless of different graphs. Hence, this PR added the graph index to the profile information to address this limitation.
- Label arbitrary code ranges for `eager` and `opt` modes for better debuggability. The profile information of dynamo benchmarks mixes the eager mode and opt mode. It is hard to separate the range for different modes. This PR added eager and opt marks to the profile information to address this limitation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90008
Approved by: https://github.com/jgong5, https://github.com/jansel
cpp	[NVFuser] Upstream push 1026 (#87779)	2022-11-04 20:04:34 +00:00
distributed	Fix typos under benchmarks, test, and tools directories (#87975)	2022-10-29 01:26:17 +00:00
dynamo	Add more debug information for Inductor (#90008)	2022-12-02 09:34:48 +00:00
fastrnns
framework_overhead_benchmark
functional_autograd_benchmark
fuser
instruction_counts	Fix typos under benchmarks, test, and tools directories (#87975)	2022-10-29 01:26:17 +00:00
nested	Use tensor cores for NT bmm (#86856)	2022-11-02 21:51:40 +00:00
operator_benchmark	Fix typos under benchmarks, test, and tools directories (#87975)	2022-10-29 01:26:17 +00:00
overrides_benchmark
profiler_benchmark
record_function_benchmark
serialization
sparse
static_runtime	Back out "[static-runtime] change the backend for permute_copy" (#89463)	2022-11-22 06:26:10 +00:00
tensorexpr
transformer	Update sdp dispatch logic to enable fused backward (#89154)	2022-11-21 20:02:09 +00:00
compare-fastrnn-results.py
compare.sh
README.md
upload_scribe.py

PyTorch Benchmarks

This folder contains scripts that produce reproducible timings of various PyTorch features.

It also provides mechanisms to compare PyTorch with other frameworks.
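As a rough illustration of what "reproducible timings" usually involves, the sketch below warms up a callable, repeats the measurement several times, and reports the median. It uses only the Python standard library; the helper name `time_callable` and its default parameters are assumptions made for this example and are not taken from the scripts in this folder.

```python
import statistics
import timeit


def time_callable(fn, warmup=3, repeats=5, number=100):
    """Return the median per-call time in seconds over several repeats.

    Hypothetical helper for illustration; not part of this folder's scripts.
    """
    # Warm-up calls keep one-time costs (caches, lazy initialization)
    # out of the measured runs.
    for _ in range(warmup):
        fn()
    # Each entry is the total time for `number` consecutive calls.
    totals = timeit.repeat(fn, repeat=repeats, number=number)
    # The median is less sensitive to outliers than the mean.
    return statistics.median(totals) / number


if __name__ == "__main__":
    data = list(range(1000))
    per_call = time_callable(lambda: sum(data))
    print(f"{per_call * 1e6:.2f} us per call")
```

For GPU workloads, a plain wall-clock measurement like this one would also need explicit device synchronization; PyTorch ships its own timing utilities in `torch.utils.benchmark` for that purpose.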

Setup environment

Make sure you are on a machine with CUDA, torchvision, and PyTorch installed. Install them in the following order:

# Install torchvision; this also installs the PyTorch stable release binary.
conda install pytorch torchvision -c pytorch

# Install the latest PyTorch master from source.
# It should supersede the installation from the release binary.
cd "$PYTORCH_HOME"
python setup.py build develop

# Check the installed PyTorch version.
python -c "import torch; print(torch.__version__)"
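The version check above can be extended into a small script that also reports torchvision and CUDA availability. This is a minimal sketch: the function name `describe_environment` is hypothetical, and the script only reports what it finds; it does not verify that the source build actually replaced the release binary.

```python
import importlib
import importlib.util


def describe_environment(packages=("torch", "torchvision")):
    """Map each package name to its version string, or None if not installed.

    Hypothetical helper for illustration; not part of this folder's scripts.
    """
    report = {}
    for name in packages:
        # find_spec returns None when a top-level package cannot be found.
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        module = importlib.import_module(name)
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in describe_environment().items():
        print(f"{name}: {version or 'not installed'}")
    # Only query CUDA when torch is actually importable.
    if importlib.util.find_spec("torch") is not None:
        import torch
        print("CUDA available:", torch.cuda.is_available())
```

Running it after the source build shows at a glance whether the benchmark prerequisites listed above are present.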

Benchmark List

Refer to each subfolder for details on its benchmark suite.

Fast RNNs benchmarks