onnxruntime/tools/ci_build/github
aciddelgado 4e27841bdb
fix gqa cpu nan bug (#20521)
### Description
There was a bug in GQA on CPU where, in the token case with batch_size >
1 and past_present_share_buffer off, the output would occasionally
contain NaNs. This PR fixes that. It also updates the documentation and
fixes position ID generation for rotary embedding in CUDA in the prompt case.



### Motivation and Context
This PR fixes the GQA CPU bug, updates the documentation, and makes
seqlens_k irrelevant in the prompt case, which helps prevent user
error.
2024-05-07 15:19:26 -07:00
..
android Bump ruff to 0.3.2 and black to 24 (#19878) 2024-03-13 10:00:32 -07:00
apple Remove usage of 'required reason' iOS API from protobuf (#20529) 2024-05-02 08:21:08 +10:00
azure-pipelines fix gqa cpu nan bug (#20521) 2024-05-07 15:19:26 -07:00
js Add MacOS build to ORT C Pod (#18550) 2023-11-28 10:11:53 -08:00
linux [TensorRT EP] support TensorRT 10-GA (#20506) 2024-05-01 11:10:53 -07:00
pai fix rocm ci pipeline (#19525) 2024-02-15 00:02:08 -08:00
windows [TensorRT EP] support TensorRT 10-GA (#20506) 2024-05-01 11:10:53 -07:00
Doxyfile_csharp.cfg [C#] Rename unreleased API, add utilities (#16806) 2023-08-02 10:06:42 -07:00