pytorch/docs/source
Jianyu Huang 0a35986cdb Add option to configure reduced precision math backend for SDPA (#135964)
Summary: Address https://github.com/pytorch/pytorch/issues/135778 by adding a global flag that configures whether the math backend of SDPA uses high or low precision.

Test Plan: buck2 run mode/opt //scripts/feikou/llm:run_attn_kernels

Differential Revision: D62625515

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135964
Approved by: https://github.com/jbschlosser
2024-09-24 07:11:38 +00:00
_static Clean up distributed/CONTRIBUTING.md (#128450) 2024-06-22 02:41:22 +00:00
_templates
community Add Alban and Piotr into Core Maintainers (#130903) 2024-07-20 16:02:42 +00:00
elastic
notes Add option to configure reduced precision math backend for SDPA (#135964) 2024-09-24 07:11:38 +00:00
rpc [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
scripts [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
amp.rst Update document for autocast on CPU (#135299) 2024-09-13 09:11:47 +00:00
autograd.rst
backends.rst Add option to configure reduced precision math backend for SDPA (#135964) 2024-09-24 07:11:38 +00:00
benchmark_utils.rst
bottleneck.rst
checkpoint.rst [checkpoint] Clean up selective activation checkpoint and make public (#125795) 2024-06-18 18:18:50 +00:00
complex_numbers.rst
cond.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
conf.py Revert "[dtensor] move DTensor to public namespace (#133113)" 2024-08-19 05:00:19 +00:00
config_mod.rst
cpp_extension.rst
cpp_index.rst
cpu.rst
cuda._sanitizer.rst
cuda.rst Uses MemPoolContext to route allocations from CUDACachingAllocator (#134685) 2024-08-29 03:56:31 +00:00
cuda.tunable.rst
cuda_environment_variables.rst
cudnn_persistent_rnn.rst
cudnn_rnn_determinism.rst
data.rst
ddp_comm_hooks.rst
debugging_environment_variables.rst
deploy.rst
deterministic.rst
distributed.algorithms.join.rst
distributed.checkpoint.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
distributed.elastic.rst
distributed.optim.rst
distributed.pipelining.rst [PP] Add ZeroBubble schedule (#133467) 2024-08-22 13:32:15 +00:00
distributed.rst [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
distributed.tensor.parallel.rst Update link in distributed.tensor.parallel.rst (#136103) 2024-09-15 19:36:29 +00:00
distributed.tensor.rst [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
distributions.rst
dlpack.rst
docutils.conf
export.ir_spec.rst
export.rst Add some doc for export_for_training (#135918) 2024-09-15 17:08:12 +00:00
fft.rst
fsdp.rst
func.api.rst
func.batch_norm.rst
func.migrating.rst
func.rst
func.ux_limitations.rst
func.whirlwind_tour.rst
future_mod.rst
futures.rst
fx.experimental.rst Only thunkify proxies in some situations (#132421) 2024-08-08 12:03:06 +00:00
fx.rst Consolidate SymDispatchMode into ProxyTensorMode (#132674) 2024-08-08 12:02:54 +00:00
hub.rst
index.rst [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
jit.rst
jit_builtin_functions.rst
jit_language_reference.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
jit_language_reference_v2.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
jit_python_reference.rst
jit_unsupported.rst
jit_utils.rst
library.rst Remove duplicated words in library.rst (#136340) 2024-09-20 03:30:54 +00:00
linalg.rst
logging.rst
masked.rst Add MaskedTensor passthrough: unfold, F.Unfold, F.Fold, stack (#125262) 2024-09-06 19:06:23 +00:00
math-quantizer-equation.png
meta.rst
miscellaneous_environment_variables.rst [RFC] Add support for device extension autoloading (#127074) 2024-07-09 06:14:13 +00:00
mobile_optimizer.rst Add ExecuTorch warning to mobile_optimizer (#134697) 2024-09-04 17:47:14 +00:00
model_zoo.rst
module_tracker.rst
monitor.rst
mps.rst Add support in Python API for the recommended max working set size. (#128289) 2024-06-12 16:03:57 +00:00
mps_environment_variables.rst [MPS] Add mps profiler env vars to docs (#129552) 2024-07-04 06:44:48 +00:00
mtia.rst [MTIA] Support torch.cuda.get_device_capability equivalent API on MTIA (#135889) 2024-09-17 17:42:56 +00:00
multiprocessing.rst
name_inference.rst
named_tensor.rst
nested.rst
nn.attention.bias.rst
nn.attention.flex_attention.rst [Inductor] Added and_masks and or_masks utilities & make fully masked out rows 0 instead of nan (#131552) 2024-07-25 21:29:46 +00:00
nn.attention.rst Make FlexAttention API public (#130755) 2024-07-16 16:21:25 +00:00
nn.functional.rst
nn.init.rst
nn.rst Make adding Buffers more like adding Parameters (#125971) 2024-07-31 10:32:40 +00:00
onnx.rst [ONNX] Improves documentation of ONNX exporter (#135372) 2024-09-09 15:09:01 +00:00
onnx_dynamo.rst [ONNX] Improves documentation of ONNX exporter (#135372) 2024-09-09 15:09:01 +00:00
onnx_dynamo_onnxruntime_backend.rst
onnx_torchscript.rst [ONNX] Remove logging apis from public (#133825) 2024-09-13 22:19:52 +00:00
onnx_torchscript_supported_aten_ops.rst
optim.rst Make optim.swa.util content accessible from the torch.optim doc (#133393) 2024-08-21 00:43:46 +00:00
package.rst
profiler.rst
quantization-accuracy-debugging.rst
quantization-backend-configuration.rst
quantization-support.rst Update pt2e numeric debugger to use node.meta["custom"] field (#134040) 2024-08-27 19:51:03 +00:00
quantization.rst
random.rst
rpc.rst
signal.rst
size.rst
sparse.rst SparseCsrCUDA: cuDSS backend for linalg.solve (#129856) 2024-08-22 07:57:30 +00:00
special.rst
storage.rst
tensor_attributes.rst Refine the logic of device construction when only device index is given (#129119) 2024-07-15 14:34:29 +00:00
tensor_view.rst
tensorboard.rst
tensors.rst add xpu to torch.tensors (#127280) 2024-06-11 18:13:01 +00:00
testing.rst
threading_environment_variables.rst
torch.ao.ns._numeric_suite.rst
torch.ao.ns._numeric_suite_fx.rst
torch.compiler.rst add xpu to torch.compile (#127279) 2024-06-13 21:15:09 +00:00
torch.compiler_aot_inductor.rst
torch.compiler_api.rst [RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712) 2024-08-21 06:36:41 +00:00
torch.compiler_best_practices_for_backends.rst
torch.compiler_cudagraph_trees.rst [CUDAGraph] add more docs for cudagraph trees (#127963) 2024-06-18 02:07:07 +00:00
torch.compiler_custom_backends.rst
torch.compiler_dynamic_shapes.rst
torch.compiler_dynamo_deepdive.rst Stop immediately specializing common constants 0/1 for plain int (#128327) 2024-07-03 16:41:51 +00:00
torch.compiler_dynamo_overview.rst
torch.compiler_fake_tensor.rst [BE] Reroute all uses of proxy_tensor.maybe_disable_fake_tensor_mode to fake_tensor.unset_fake_temporarily (#132770) 2024-08-08 23:07:23 +00:00
torch.compiler_faq.rst [dynamo] Retire CompileProfiler (#135133) 2024-09-05 01:08:40 +00:00
torch.compiler_fine_grain_apis.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
torch.compiler_get_started.rst Revert "[inductor] More fixes on the keys of constants and signature dictionaries (#135406)" 2024-09-16 17:58:02 +00:00
torch.compiler_inductor_profiling.rst
torch.compiler_ir.rst
torch.compiler_nn_module.rst
torch.compiler_performance_dashboard.rst
torch.compiler_profiling_torch_compile.rst [EZ] Fix spelling typo (#136157) 2024-09-16 19:30:30 +00:00
torch.compiler_transformations.rst
torch.compiler_troubleshooting.rst [dynamo] Retire CompileProfiler (#135133) 2024-09-05 01:08:40 +00:00
torch.overrides.rst
torch.rst Autoselect default device in FSDP construction. (#127609) 2024-08-08 05:25:17 +00:00
torch_cuda_memory.rst
torch_environment_variables.rst [Docs][MPS] Add mps environment variable table (#129008) 2024-06-20 03:30:35 +00:00
torch_nccl_environment_variables.rst [c10d][doc] Add docs for ENV variables TORCH_NCCL_ASYNC_ERROR_HANDLING TORCH_NCCL_TRACE_CPP_STACK and TORCH_NCCL_COORD_CHECK_MILSEC (#132920) 2024-08-09 21:08:20 +00:00
type_info.rst
utils.rst
xpu.rst [Intel GPU] Add XPU memory-related APIs (#129919) 2024-09-07 11:15:17 +00:00