onnxruntime/include/onnxruntime/core
stevenlix ce0025d3f2
Fallback Pow op in layer norm to FP32 in TRT to avoid overflow (#13639)
Accuracy loss is observed when transformer models such as BERT, DeBERTa,
and ViT run in TRT FP16 mode. The cause is an overflow in the Pow op in
layer norm.
This PR adds an option to force Pow to run in TRT FP32 precision if
overflow occurs.
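
As a rough illustration of why Pow can overflow in FP16, the sketch below squares a value in simulated half precision and in single precision. It is not the TensorRT or ONNX Runtime implementation; the helper names (`to_fp16`, `to_fp32`, `pow2_fp16`, `pow2_fp32`) are hypothetical and exist only for this example. It relies on the fact that the largest finite FP16 value is 65504, so squaring any magnitude above roughly 256 no longer fits.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to the nearest FP16 value; return +/-inf if it does not fit."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack out-of-range values; model that as infinity.
        return math.copysign(math.inf, x)


def to_fp32(x: float) -> float:
    """Round x to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def pow2_fp16(x: float) -> float:
    """Square x with the input and the result both held in FP16."""
    h = to_fp16(x)
    return to_fp16(h * h)


def pow2_fp32(x: float) -> float:
    """Square x with the input and the result both held in FP32."""
    s = to_fp32(x)
    return to_fp32(s * s)


if __name__ == "__main__":
    # A deviation from the mean of 300 is enough to overflow FP16 when squared.
    for value in (100.0, 300.0):
        print(value, pow2_fp16(value), pow2_fp32(value))
```

Once a single squared term becomes infinite, the variance computed from it is infinite as well, which is consistent with the accuracy loss described above; running only the Pow step in FP32 avoids that.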

Co-authored-by: Ubuntu <azureuser@orteplinuxdev.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
2022-11-29 13:37:31 -08:00
common Enforce Prefast check in Windows CPU CI pipeline (#13735) 2022-11-23 09:25:02 -08:00
eager support register external ep lib information (#8897) 2021-08-31 20:51:22 -07:00
framework Enable ORT in TorchDynamo (#13259) 2022-11-01 11:19:29 -07:00
graph Ignore saved runtime optimizations when updating ORT format model <v5. (#13393) 2022-11-08 13:36:46 -08:00
optimizer Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD. (#10778) 2022-03-08 16:18:49 -08:00
platform Improve thread pool creation failure handling. (#13313) 2022-10-15 17:57:19 -07:00
providers Fallback Pow op in layer norm to FP32 in TRT to avoid overflow (#13639) 2022-11-29 13:37:31 -08:00
session Allow CUDA EP enable or disable TunableOp via session options and environment variable (#13601) 2022-11-15 14:43:54 +08:00