pytorch/test/functorch
Peter Bell a2a8c1fda0 [AOTDispatch] Return mutated inputs directly when keeping mutations (#120514)
Fixes #120242

The example from the issue now results in the following graph:
```python
def forward(self, arg0_1, arg1_1):
    sin = torch.ops.aten.sin.default(arg0_1);  arg0_1 = None
    copy_ = torch.ops.aten.copy_.default(arg1_1, sin);  arg1_1 = sin = None
    return (copy_,)
```

and the corresponding inductor kernel eliminates the intermediate buffer
completely:

```python
def call(args):
    arg0_1, arg1_1 = args
    args.clear()
    assert_size_stride(arg0_1, (5, ), (1, ))
    assert_size_stride(arg1_1, (5, ), (1, ))
    with torch.cuda._DeviceGuard(0):
        torch.cuda.set_device(0)
        # Source Nodes: [sin], Original ATen: [aten.sin]
        stream0 = get_raw_stream(0)
        triton_poi_fused_sin_0.run(arg0_1, arg1_1, 5, grid=grid(5), stream=stream0)
        del arg0_1
    return (arg1_1, )
```
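
For context, a function of roughly the following shape would be expected to trace to a graph like the one above (a `sin` followed by a `copy_` into the second argument). This is only a sketch inferred from the graph, not the reproducer from the issue: the names `f`, `x`, and `out` are hypothetical, and the `aot_eager` backend is an assumption chosen so the snippet runs on CPU without invoking Inductor.

```python
import torch


def f(x, out):
    # Hypothetical reproducer shape: write sin(x) into the preallocated
    # `out` tensor. This mutates an input, which is the case this change
    # is concerned with.
    out.copy_(torch.sin(x))
    return out


# Assumption: "aot_eager" runs the function through AOTDispatch but executes
# the resulting graph eagerly, so no GPU or C++ toolchain is needed.
compiled_f = torch.compile(f, backend="aot_eager")

x = torch.randn(5)
out = torch.zeros(5)
result = compiled_f(x, out)

# The mutation is visible on the caller's tensor, and the returned value
# holds the same data.
print(torch.allclose(out, torch.sin(x)))
print(torch.allclose(result, torch.sin(x)))
```

Swapping the backend for the default Inductor backend on a CUDA device should be what yields generated code similar to the `call` function shown above, though the exact output will depend on the PyTorch version.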

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120514
Approved by: https://github.com/ezyang, https://github.com/oulgen, https://github.com/lezcano
2024-03-08 16:33:26 +00:00
attn_ft.py
attn_positional.py
common_utils.py
discover_coverage.py
functorch_additional_op_db.py
test_aotdispatch.py [AOTDispatch] Return mutated inputs directly when keeping mutations (#120514) 2024-03-08 16:33:26 +00:00
test_control_flow.py Windows Dynamo Error Removal CI Check (#115969) 2024-02-14 21:14:36 +00:00
test_dims.py
test_eager_transforms.py Let torch dynamo inline torch.func.grad (#118407) 2024-02-28 20:05:00 +00:00
test_logging.py
test_memory_efficient_fusion.py
test_minifier.py
test_ops.py Batch Norm Consolidation (#116092) 2024-03-08 15:07:15 +00:00
test_parsing.py
test_rearrange.py
test_vmap.py Batch Norm Consolidation (#116092) 2024-03-08 15:07:15 +00:00
test_vmap_registrations.py
xfail_suggester.py