mirror of
https://github.com/saymrwulf/pytorch.git
synced 2026-05-15 21:00:47 +00:00
Summary:
# context
* enable the `_get_user_embeddings` function
* run failed at P1610151892
```
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
GuardOnDataDependentSymNode: Could not guard on data-dependent expression u22 <= 0 (unhinted: u22 <= 0). (Size-like symbols: u22)
ATTENTION: guard_size_oblivious would fix the error, evaluating expression to False.
Maybe you need to add guard_size_oblivious to framework code, see doc below for more guidance.
Potential framework code culprit (scroll up for full backtrace):
File "/data/users/hhy/fbsource/buck-out/v2/gen/fbcode/38472faba4e3e6c1/aps_models/ads/icvr/__icvr_launcher_live__/icvr_launcher_live#link-tree/torch/_decomp/decompositions.py", line 1692, in native_layer_norm_backward
if M <= 0 or N <= 0:
```
```
N = prod(inner_dims) # type: ignore[arg-type]
M = prod(outer_dims) # type: ignore[arg-type]
if M <= 0 or N <= 0:
return (
input.new_zeros(input_shape) if output_mask[0] else None,
input.new_zeros(input_shape[axis:]) if output_mask[1] else None,
input.new_zeros(input_shape[axis:]) if output_mask[2] else None,
)
```
# changes
* use guard_size_oblivious since the new_zeros return is kind of optimization, shouldn't impact the correctness of the follow up code logic.
* the size `ret[i][j]` could be zero, so the change in V1 isn't valid
* for more details: [post](https://fb.workplace.com/groups/6829516587176185/permalink/8003616173099548/)
```
from torch.fx.experimental.symbolic_shapes import guard_size_oblivious
if guard_size_oblivious(M <= 0) or guard_size_oblivious(N <= 0):
```
# past
* found `u22` was introduced at
```
def _wait_impl(self) -> List[List[int]]:
# Can not use is_torchdynamo_compiling(), as every such condition should be independent for compilation with graph breaks.
if isinstance(self._splits_awaitable, dist.Work):
self._splits_awaitable.wait()
ret = self._output_tensor.view(self.num_workers, -1).T.tolist() # <------ u22 introduced here
if not torch.jit.is_scripting() and is_torchdynamo_compiling():
for i in range(len(ret)):
for j in range(len(ret[i])):
torch._check_is_size(ret[i][j]) # <---------- my question: why the _check_is_size isn't enough??
torch._check(ret[i][j] > 0) # <------ added by diff V1
```
Test Plan:
# run command
```
TORCH_SHOW_CPP_STACKTRACES=1 TORCHDYNAMO_EXTENDED_DEBUG_CPP=1 TORCH_LOGS="+graph_code,output_code,dynamic,aot,guards,verbose_guards,recompiles,graph_breaks" TORCH_TRACE=/var/tmp/tt buck2 run fbcode//mode/opt fbcode//aps_models/ads/icvr:icvr_launcher_live -- mode=fmc/local_ig_fm_v4_mini training.pipeline_type=pt2 2>&1 | tee -a `tagT`.`tagH`.log
```
# results
* before
**without enabling `_get_user_embeddings`**
[14 Failures and Restarts](https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/.tmp2eNI7p/failures_and_restarts.html)
log: P1610151892
{F1889387940}
* V1
enable `_get_user_embeddings`
with `torch._check(ret[i][j] > 0)`
[13 Failures and Restarts](https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/.tmp6J1iY9/failures_and_restarts.html)
{F1889388378}
* V2
enable `_get_user_embeddings`
with `if guard_size_oblivious(M <= 0) or guard_size_oblivious(N <= 0):`
[tlparse](https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/.tmpFhZZyC/index.html)
if guard_size_oblivious(M <= 0) or guard_size_oblivious(N <= 0):
Differential Revision: D63424929
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136798
Approved by: https://github.com/ezyang
|
||
|---|---|---|
| .. | ||
| __init__.py | ||
| decompositions.py | ||
| decompositions_for_jvp.py | ||
| decompositions_for_rng.py | ||