onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00


Tianlei Wu c0d2472ede Disable fused causal attention (#14732)	2023-02-21 09:53:31 -08:00

There is an accuracy regression in the GPT-2 model: the top-1 match rate (vs. the PyTorch model) drops by about 1%. The cause is that the fused causal attention uses fp16 accumulation. This change disables it by default and adds an environment variable, ORT_ENABLE_FUSED_CAUSAL_ATTENTION=1, to turn it on manually.

It also updates the GPT-2 parity test script to generate left-side padding, reflecting actual usage.

To test:

```
python -m onnxruntime.transformers.models.gpt2.convert_to_onnx -m gpt2 --output gpt2.onnx -o -p fp16 --use_gpu
```

The top1-match-rate in the output is on par with ORT 1.13.1.
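The opt-in described in the commit message can be sketched from Python. This is a minimal illustration, not code from the repository: the helper `set_fused_causal_attention` is hypothetical, and only the variable name and the value `1` come from the commit message. It assumes the variable has to be set in the process environment before an inference session is created.

```python
import os

# Variable name and value are taken from the commit message above.
FUSED_CAUSAL_ATTENTION_VAR = "ORT_ENABLE_FUSED_CAUSAL_ATTENTION"


def set_fused_causal_attention(enabled, env=os.environ):
    """Hypothetical helper: opt in to (or out of) fused causal attention.

    Assumption: the variable should be set before the session is created.
    """
    if enabled:
        env[FUSED_CAUSAL_ATTENTION_VAR] = "1"
    else:
        # Removing the variable falls back to the default (disabled) behavior.
        env.pop(FUSED_CAUSAL_ATTENTION_VAR, None)


# Opt in for the current process, then show the resulting setting.
set_fused_causal_attention(True)
print(os.environ[FUSED_CAUSAL_ATTENTION_VAR])
```

Since the kernel accumulates in fp16, it may be worth re-running the parity test from the commit message after opting in.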
backend	replace 'master' branch ref to 'main' for onnx repo (#12678)	2022-08-30 13:41:42 -07:00
datasets
providers/tvm	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
tools	Disable fused causal attention (#14732)	2023-02-21 09:53:31 -08:00
torch_cpp_extensions	[ORTModule] ATen Support for aten::upsample_nearest (#13364)	2022-10-20 08:30:04 +08:00
training
__init__.py
_ld_preload.py
_pybind_state.py.in	Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460)	2022-08-22 09:40:40 -07:00
exported_symbols.lst
numpy_helper.h
onnxruntime_collect_build_info.py	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
onnxruntime_inference_collection.py	Offline tuning (#14558)	2023-02-15 14:17:34 +08:00
onnxruntime_pybind.h	fix windows ci debug build break (#11495)	2022-05-12 16:54:00 -07:00
onnxruntime_pybind_exceptions.cc
onnxruntime_pybind_exceptions.h
onnxruntime_pybind_iobinding.cc	Adds missing numpy type when looking for the ort correspondance (#10943)	2022-03-22 14:44:48 -07:00
onnxruntime_pybind_mlvalue.cc	Multi-stream execution support (#13495)	2022-12-15 07:39:29 -08:00
onnxruntime_pybind_mlvalue.h	Move OrtValueVector from onnxruntime-training to onnxruntime (#11176)	2022-06-15 09:36:28 +02:00
onnxruntime_pybind_module.cc
onnxruntime_pybind_ortvalue.cc	Enable ORT in TorchDynamo (#13259)	2022-11-01 11:19:29 -07:00
onnxruntime_pybind_schema.cc	Pass SessionOptions to XnnpackProviderFactoryCreator. (#13318)	2022-12-10 14:23:46 +08:00
onnxruntime_pybind_sparse_tensor.cc	Multi-stream execution support (#13495)	2022-12-15 07:39:29 -08:00
onnxruntime_pybind_state.cc	Offline tuning (#14558)	2023-02-15 14:17:34 +08:00
onnxruntime_pybind_state.h
onnxruntime_pybind_state_common.cc	Allow CUDA EP enable or disable TunableOp via session options and environment variable (#13601)	2022-11-15 14:43:54 +08:00
onnxruntime_pybind_state_common.h	[oneDNN] Improved thread handling (#13618)	2023-01-31 14:37:13 -08:00
onnxruntime_validation.py	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
pybind.def
version_script.lds
version_script_expose_onnx_protobuf.lds