onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-01 23:30:35 +00:00

zhijiang 4dc4470cc7 Fix fusion for two LayerNorm sharing same input but with different weights (#15919) In gpt_j_residual (https://arxiv.org/pdf/2204.06745.pdf), two LayerNorm (LN) nodes share the same input. ORT runs the common subexpression elimination (CSE) graph optimization before LN fusion, which modifies the LN graph pattern and causes LN fusion to fail. ![image](https://github.com/microsoft/onnxruntime/assets/10530022/40990fd6-796f-4edf-be0b-3203e8503678)		2023-05-22 08:26:36 +08:00
..
contrib_ops	fix unused var warning in contrib_ops/cuda/bert/attention.cc (#16010)	2023-05-18 17:42:08 -07:00
core	Introduce register-efficient warp-wise Softmax (#15266)	2023-05-22 08:26:03 +08:00
python	optimization for whisper model with decoder masked multihead attention (#15827)	2023-05-18 15:38:31 -07:00
test	Fix fusion for two LayerNorm sharing same input but with different weights (#15919)	2023-05-22 08:26:36 +08:00
tool/etw	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
wasm	[js/webgpu] following up for JSEP/WebGPU code cleanup (#15666)	2023-04-25 21:20:03 -07:00
__init__.py	Update VERSION_NUMBER (#15773)	2023-05-03 15:07:34 -07:00
ReformatSource.ps1	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
ReformatSourcePython.bat
VSCodeCoverage.runsettings