onnxruntime/orttraining
pengwa 2092bebc78
Fix transformer layer detection for recompute (#20106)
### Fix transformer layer detection for recompute

The original logic failed to detect the layer boundary nodes in the
Mistral model. This PR simplifies the search by using a stronger pattern
match, while keeping it flexible enough to cover different transformer
variants.

Also add a unit test.

Add a warning when the user enables layerwise recompute but no layer
boundary nodes are found.
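
As a rough illustration of the idea only (not the code in this PR), the sketch below runs a pattern-based boundary search over a toy graph and warns when nothing is found. The `Node` class, the function names, and the matching rule (a normalization node fed by a residual `Add`) are all assumptions made for the example; the actual pattern used by ONNX Runtime may differ.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not an ONNX Runtime type."""
    name: str
    op_type: str
    inputs: list = field(default_factory=list)  # names of producer nodes


# Assumed set of normalization ops that may open a transformer layer.
NORM_OPS = {"LayerNormalization", "SimplifiedLayerNormalization"}


def find_layer_boundaries(nodes):
    """Return names of normalization nodes fed by a residual Add (assumed pattern)."""
    by_name = {n.name: n for n in nodes}
    boundaries = []
    for n in nodes:
        if n.op_type not in NORM_OPS:
            continue
        producers = [by_name[i] for i in n.inputs if i in by_name]
        if any(p.op_type == "Add" for p in producers):
            boundaries.append(n.name)
    return boundaries


def check_layerwise_recompute(nodes, enabled):
    """Find boundaries and warn if recompute is enabled but none were found."""
    boundaries = find_layer_boundaries(nodes)
    if enabled and not boundaries:
        warnings.warn(
            "Layerwise recompute is enabled but no layer boundary nodes were found."
        )
    return boundaries


# Toy two-layer graph: each layer ends in a residual Add feeding the next norm.
toy_graph = [
    Node("embed", "Gather"),
    Node("norm0", "SimplifiedLayerNormalization", ["embed"]),
    Node("attn", "MatMul", ["norm0"]),
    Node("add1", "Add", ["embed", "attn"]),
    Node("norm1", "SimplifiedLayerNormalization", ["add1"]),
    Node("mlp", "MatMul", ["norm1"]),
    Node("add2", "Add", ["add1", "mlp"]),
    Node("norm2", "SimplifiedLayerNormalization", ["add2"]),
]

print(check_layerwise_recompute(toy_graph, enabled=True))
```

In this toy graph `norm0` is skipped because its producer is not an `Add`, so only `norm1` and `norm2` are reported. A graph with no matching node triggers the warning instead of silently disabling recompute.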
2024-03-29 17:44:38 +08:00
..
orttraining Fix transformer layer detection for recompute (#20106) 2024-03-29 17:44:38 +08:00
tools Bump ruff to 0.3.2 and black to 24 (#19878) 2024-03-13 10:00:32 -07:00