onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-08 17:17:15 +00:00

stevenlix ce0025d3f2 Fallback Pow op in layer norm to FP32 in TRT to avoid overflow (#13639)	2022-11-29 13:37:31 -08:00
onnxruntime/core	Fallback Pow op in layer norm to FP32 in TRT to avoid overflow (#13639)	2022-11-29 13:37:31 -08:00

Fallback Pow op in layer norm to FP32 in TRT to avoid overflow (#13639)

Accuracy loss is observed when transformer models such as BERT, DeBERTa,
and ViT run in TRT FP16 mode. The cause is an overflow at the Pow op in
layer norm.
This PR provides an option to force Pow to run in TRT FP32 precision if
overflow occurs.

Co-authored-by: Ubuntu <azureuser@orteplinuxdev.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>

2022-11-29 13:37:31 -08:00
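
The overflow described in the commit message can be reproduced numerically without TensorRT. The following is only a minimal sketch, not code from this PR: the helper names (`to_fp16`, `to_fp32`, `layer_norm_variance`) and the input values are hypothetical, and the rounding is simulated with Python's `struct` module rather than a GPU kernel. It assumes the variance part of layer norm is computed as mean(Pow(x - mean, 2)), with each intermediate rounded to the working precision.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision; out-of-range values become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the FP16 range (max finite 65504).
        return math.copysign(math.inf, x)


def to_fp32(x: float) -> float:
    """Round x to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def layer_norm_variance(values, cast):
    """mean(Pow(x - mean, 2)), rounding every intermediate result with `cast`."""
    n = len(values)
    mean = cast(sum(values) / n)
    squares = [cast(cast(v - mean) ** 2) for v in values]  # the Pow step
    return cast(sum(squares) / n)


activations = [-300.0, 300.0]  # illustrative values, not taken from a real model
print(layer_norm_variance(activations, to_fp16))  # Pow result 90000 exceeds 65504
print(layer_norm_variance(activations, to_fp32))
```

With these inputs the squared deviation is 90000, which is above the largest finite FP16 value (65504), so the FP16 path yields `inf` while the FP32 path yields `90000.0`. Running only the Pow step in FP32 avoids the infinite intermediate, which is the effect the fallback in this commit aims for.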
