pytorch/tools
Chirag Pandya d549ddfb14 [fr][rfc] use a logger to control output for flight recorder analyzer (#139656)
Summary: Use a logger to control output to the console. This is useful for hiding debug/detail messages from the console versus showing everything together.
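
The underlying pattern is standard Python logging: detail messages go through `logger.debug(...)`, user-facing findings through `logger.info(...)`, and the `-v` flag picks the console level. A minimal illustrative sketch (hypothetical names, not the actual flight recorder module):
```
# Illustrative only; the real module lives under tools/flight_recorder/.
import argparse
import logging

logger = logging.getLogger("flight_recorder")
_console = logging.StreamHandler()
_console.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(_console)


def set_verbosity(verbose: bool) -> None:
    # Detail messages (built groups, timings, ...) are DEBUG and only
    # reach the console with -v; findings stay at INFO and always show.
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)


parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="store_true")
set_verbosity(parser.parse_args().verbose)

logger.debug("built groups, memberships")                     # hidden without -v
logger.info("Not all ranks joining collective 3 at entry 2")  # always printed
```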

Test Plan:
Ran `torchfrtrace` with various switches.

With the `-v` verbose switch:
```
torchfrtrace --prefix "trace_" /tmp/ -v
loaded 2 files in 0.2567298412322998s
built groups, memberships
Not all ranks joining collective 3 at entry 2
group info: 0:default_pg
collective: nccl:all_reduce
missing ranks: {1}
input sizes: [[4, 5]]
output sizes: [[4, 5]]
expected ranks: 2
collective state: scheduled
collective stack trace:
 <module> at /home/cpio/test/c.py:66
appending a non-matching collective
built collectives, nccl_calls
Groups
                  id  desc          size
--------------------  ----------  ------
09000494312501845833  default_pg       2
Memberships
            group_id    global_rank
--------------------  -------------
09000494312501845833              0
09000494312501845833              1
Collectives
  id    group_id
----  ----------
   0           0
   1           0
NCCLCalls
  id    collective_id    group_id    global_rank    traceback_id  collective_type    sizes
----  ---------------  ----------  -------------  --------------  -----------------  --------
   0                0           0              0               0  nccl:all_reduce    [[3, 4]]
   1                0           0              1               0  nccl:all_reduce    [[3, 4]]
   2                1           0              0               0  nccl:all_reduce    [[3, 4]]
   3                1           0              1               0  nccl:all_reduce    [[3, 4]]
   4                            0              0               0  nccl:all_reduce    [[4, 5]]
```

Without the verbose switch:
```
❯ torchfrtrace --prefix "trace_" /tmp/
Not all ranks joining collective 3 at entry 2
group info: 0:default_pg
collective: nccl:all_reduce
missing ranks: {1}
input sizes: [[4, 5]]
output sizes: [[4, 5]]
expected ranks: 2
collective state: scheduled
collective stack trace:
 <module> at /home/cpio/test/c.py:66
```

With the `-j` switch:
```
❯ torchfrtrace --prefix "trace_" /tmp/ -j
Rank 0                                             Rank 1
-------------------------------------------------  -------------------------------------------------
all_reduce(input_sizes=[[3, 4]], state=completed)  all_reduce(input_sizes=[[3, 4]], state=completed)
all_reduce(input_sizes=[[3, 4]], state=completed)  all_reduce(input_sizes=[[3, 4]], state=completed)
all_reduce(input_sizes=[[4, 5]], state=scheduled)
```
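
The dashed tables in these transcripts match the default `simple` format of the `tabulate` package. Whether or not the analyzer uses it internally, the per-rank columns can be reproduced like this (trace strings made up for illustration):
```
from itertools import zip_longest

from tabulate import tabulate  # pip install tabulate

# Hypothetical per-rank collective traces, keyed by global rank.
traces = {
    0: [
        "all_reduce(input_sizes=[[3, 4]], state=completed)",
        "all_reduce(input_sizes=[[3, 4]], state=completed)",
        "all_reduce(input_sizes=[[4, 5]], state=scheduled)",
    ],
    1: [
        "all_reduce(input_sizes=[[3, 4]], state=completed)",
        "all_reduce(input_sizes=[[3, 4]], state=completed)",
    ],
}

# Pad the shorter columns so a rank that stopped early leaves a visible gap.
rows = list(zip_longest(*traces.values(), fillvalue=""))
print(tabulate(rows, headers=[f"Rank {r}" for r in traces]))
```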

Differential Revision: D65438520

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139656
Approved by: https://github.com/fduwjj
2024-11-05 20:14:18 +00:00
alerts
amd_build
autograd Fix and test several NJT reductions (#139317) 2024-10-31 20:55:38 +00:00
bazel_tools
build/bazel Bump certifi from 2024.2.2 to 2024.7.4 in /tools/build/bazel (#130173) 2024-10-28 15:44:49 -07:00
build_defs
code_analyzer
code_coverage
config
coverage_plugins_package
dynamo
flight_recorder [fr][rfc] use a logger to control output for flight recorder analyzer (#139656) 2024-11-05 20:14:18 +00:00
gdb
github
iwyu
jit
linter Use clang-tidy 17 (#139678) 2024-11-05 16:00:25 +00:00
lite_interpreter C10_UNUSED to [[maybe_unused]] (#6357) (#138364) 2024-10-19 13:17:43 +00:00
lldb
onnx [11/N] Fix extra warnings brought by clang-tidy-17 (#139599) 2024-11-04 23:57:41 +00:00
pyi Revert "Tighten type hints for tensor arithmetic (#135392)" 2024-11-04 23:30:15 +00:00
rules
rules_cc
setup_helpers Add torch.xpu.get_arch_list and torch.xpu.get_gencode_flags for XPU (#137773) 2024-10-18 02:28:08 +00:00
shared
stats
test [BE] Format .ci/ / .github/ / benchmarks/ / functorch/ / tools/ / torchgen/ with ruff format (#132577) 2024-10-11 18:30:26 +00:00
testing Move slow test query to ClickHouse (#139322) 2024-10-30 23:58:27 +00:00
__init__.py
bazel.bzl
BUCK.bzl
BUCK.oss
build_libtorch.py
build_pytorch_libs.py
build_with_debinfo.py Improve build_with_deb_info (#138290) 2024-10-18 18:50:12 +00:00
download_mnist.py
extract_scripts.py
gen_flatbuffers.sh
gen_vulkan_spv.py
generate_torch_version.py
generated_dirs.txt
git_add_generated_dirs.sh
git_reset_generated_dirs.sh
nightly.py [tools] fix nightly pull tool when the conda environment not exists (#138448) 2024-10-21 19:35:48 +00:00
nightly_hotpatch.py
nvcc_fix_deps.py
README.md
render_junit.py
substitute.py
update_masked_docs.py
vscode_settings.py

This folder contains a number of scripts which are used as part of the PyTorch build process. This directory also doubles as a Python module hierarchy (thus the `__init__.py`).

Overview

Modern infrastructure:

  • autograd - Code generation for autograd. This includes definitions of all our derivatives.
  • jit - Code generation for JIT.
  • shared - Generic infrastructure that scripts in tools may find useful.
    • module_loader.py - Makes it easier to import arbitrary Python files in a script, without having to add them to the `PYTHONPATH` first (see the sketch below).
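
A minimal sketch of the by-path import pattern such a helper typically wraps, built on the standard `importlib` machinery (the exact API of module_loader.py may differ):
```
import importlib.util


def import_from_path(name: str, path: str):
    """Import a single Python file as a module, no sys.path changes needed."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Hypothetical usage:
# gen = import_from_path("gen", "tools/autograd/gen_autograd.py")
```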

Build system pieces:

  • setup_helpers - Helper code for searching for third-party dependencies on the user system.
  • build_pytorch_libs.py - Cross-platform script that builds all of the constituent libraries of PyTorch, but not the PyTorch Python extension itself.
  • build_libtorch.py - Script for building libtorch, a standalone C++ library without Python support. This build script is tested in CI.

Developer tools which you might find useful:

Important if you want to run on AMD GPU:

  • amd_build - HIPify scripts, for transpiling CUDA into AMD HIP. Right now, PyTorch and Caffe2 share logic for how to do this transpilation, but have separate entry points for transpiling either PyTorch or Caffe2 code.
    • build_amd.py - Top-level entry point for HIPifying our codebase (see the sketch below).
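
At its core, HIPify is a source-to-source rename over the CUDA API surface. A toy sketch of the idea (the real mapping tables in amd_build are far larger and are integrated into the build):
```
import re

# A few real CUDA -> HIP renames; the actual HIPify tables cover the
# whole CUDA runtime/library surface.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaStream_t": "hipStream_t",
}
_PATTERN = re.compile("|".join(re.escape(k) for k in CUDA_TO_HIP))


def hipify(source: str) -> str:
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)


print(hipify("#include <cuda_runtime.h>\ncudaMalloc(&ptr, nbytes);"))
```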

Tools which are only situationally useful: