pytorch/test/distributed/tensor
Xilun Wu c4d835fbab [DTensor][conv] add DTensor convolution_backward op support for case where the input Tensor has requires_grad=False (#142278)
Fixes #142058

## Summary
The DTensor `convolution_backward` op throws an exception when the input Tensor has `requires_grad=False`, which happens when the conv layer is the first layer in the model.

The ATen `convolution_backward` op returns 3 Tensors (`grad_input`, `grad_weight`, `grad_bias`), but `grad_input` is actually an `Optional[Tensor]` that can be `None` in the case mentioned above.

However, the DTensor sharding propagation rule and the corresponding TP conv backward implementation both assume that `grad_input` exists.
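The failing case can be reproduced with plain autograd (a minimal sketch, not the actual DTensor code path): when a conv layer is the first layer, its input is typically a leaf Tensor with `requires_grad=False`, so the backward pass produces no input gradient at all.

```python
import torch

# First-layer scenario: the input is raw data, so it does not require grad.
conv = torch.nn.Conv2d(3, 8, kernel_size=3)
x = torch.randn(1, 3, 16, 16)  # requires_grad=False by default

out = conv(x)
out.sum().backward()

# No grad_input is produced for the input Tensor...
assert x.grad is None
# ...while the weight gradient still exists.
assert conv.weight.grad is not None
```

Under DTensor, the sharding propagation rule must therefore tolerate a missing `grad_input` rather than assume all three outputs are present.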

## Fix
Allow `grad_input` to be `None` for the `convolution_backward` op.

## Test
`pytest test/distributed/tensor/test_convolution_ops.py`

## Follow-up
The current DTensor conv op implementation also ignores `output_mask`; this may need further care.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142278
Approved by: https://github.com/bdhirsh
2025-02-10 07:06:40 +00:00
debug PEP585 update - test (#145176) 2025-01-22 04:48:28 +00:00
experimental PEP585 update - test (#145176) 2025-01-22 04:48:28 +00:00
parallel [2/N] Enable ruff F841 on distributed tests (#146132) 2025-02-02 03:44:48 +00:00
__init__.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
README.md [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_api.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_attention.py [cp] override compute_log_sumexp to True for aten._scaled_dot_product_efficient_attention.default if False (#145421) 2025-01-24 06:17:54 +00:00
test_common_rules.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_convolution_ops.py [DTensor][conv] add DTensor convolution_backward op support for case where the input Tensor has requires_grad=False (#142278) 2025-02-10 07:06:40 +00:00
test_dtensor.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_dtensor_compile.py [dynamo][fullgraph] Do not skip frame with fullgraph=True (#146527) 2025-02-06 18:56:07 +00:00
test_dtensor_ops.py [DTensor] Add pointwise ops strategy for aten.minimum (#145816) 2025-01-29 01:19:01 +00:00
test_embedding_ops.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_experimental_ops.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_init.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_math_ops.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_matrix_ops.py Revert "[DTensor][Test] Create a simple unit test for tensordot (#146514)" 2025-02-06 11:26:43 +00:00
test_op_strategy.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_optimizers.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_pointwise_ops.py PEP585 update - test (#145176) 2025-01-22 04:48:28 +00:00
test_random_ops.py Tests Generelization for multiple accelerator devices (#139749) 2025-01-14 08:52:46 +00:00
test_redistribute.py Tests Generelization for multiple accelerator devices (#139749) 2025-01-14 08:52:46 +00:00
test_tensor_ops.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_utils.py [dtensor] move all tests to distribute/tensor folder (#144166) 2025-01-08 00:32:33 +00:00
test_view_ops.py PEP585 update - test (#145176) 2025-01-22 04:48:28 +00:00
test_xla_integration.py PEP585 update - test (#145176) 2025-01-22 04:48:28 +00:00

Run distributed tensor tests:

From the repo root, run (either CPU or GPU):

`pytest test/distributed/tensor/test_dtensor.py`

Run specific test cases and print stdout/stderr:

`pytest test/distributed/tensor/test_dtensor.py -s -k test_from_local`