mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-17 21:10:43 +00:00
### Description

torch.optim Adam zero_grad() signature is zero_grad(set_to_none=True)
https://pytorch.org/docs/stable/generated/torch.optim.Adam.html#torch.optim.Adam.zero_grad

We set this flag in initialization, similar to deepspeed:
https://deepspeed.readthedocs.io/en/latest/optimizers.html#deepspeed.ops.adam.FusedAdam

Adding this flag to have signature parity with pytorch Adam

### Motivation and Context

Easier model integration

Co-authored-by: Jingyan Wang <jingywa@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

| | | |
|---|---|---|
| __init__.py | ||
| _apex_amp_modifier.py | ||
| _ds_modifier.py | ||
| _megatron_modifier.py | ||
| _modifier.py | ||
| _modifier_registry.py | ||
| _multi_tensor_apply.py | ||
| config.py | ||
| fp16_optimizer.py | ||
| fused_adam.py | ||
| lr_scheduler.py | ||
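
The commit description above mentions choosing the `zero_grad` behavior once, at initialization. The following is a minimal sketch of that idea in plain Python, not the code in `fused_adam.py`. The class names `Param` and `SketchOptimizer`, the constructor flag name `set_grad_none`, and the rule that an explicit `set_to_none` argument overrides the constructor flag are all assumptions made for illustration.

```python
class Param:
    """Hypothetical stand-in for a tensor parameter with a .grad attribute."""

    def __init__(self, grad):
        self.grad = grad  # a list of floats, or None


class SketchOptimizer:
    """Hypothetical optimizer; illustrates the flag only, performs no update step."""

    def __init__(self, params, set_grad_none=True):
        self.params = list(params)
        # Assumed flag name: the behavior is chosen once, at initialization.
        self.set_grad_none = set_grad_none

    def zero_grad(self, set_to_none=None):
        # Assumed design: an explicit argument wins, otherwise use the
        # value stored by the constructor.
        if set_to_none is None:
            set_to_none = self.set_grad_none
        for p in self.params:
            if p.grad is None:
                continue
            if set_to_none:
                p.grad = None  # drop the gradient entirely
            else:
                p.grad = [0.0] * len(p.grad)  # keep the buffer, fill with zeros


# Usage: the same call behaves differently depending on the constructor flag.
dropped = Param([1.0, 2.0])
SketchOptimizer([dropped]).zero_grad()

zeroed = Param([1.0, 2.0])
SketchOptimizer([zeroed], set_grad_none=False).zero_grad()
```

Setting gradients to `None` rather than zero means code that reads `.grad` afterwards has to handle a missing gradient, which is why the sketch skips parameters whose gradient is already `None`.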