onnxruntime/docs
aciddelgado 4e27841bdb
fix gqa cpu nan bug (#20521)
### Description
There was a bug with gqa on cpu where on token case, with batch_size >
1, and with past_present_share_buffer off, the output would occasionally
contain nans. this pr fixes that. it also updates documentation and
fixes posid gen for rotary in cuda in prompt case.



### Motivation and Context
this pr solves the GQA CPU bug as well as updates the documentation and
makes seqlens_k irrelevant for prompt case, which is useful to prevent
user error.
2024-05-07 15:19:26 -07:00
..
c_cxx Remove extraneous javascript includes (#17558) 2023-09-14 20:43:24 -07:00
execution_providers/images
images
python Bump up version in main from 1.18.0 to 1.19.0 (#20489) 2024-04-29 20:21:41 -07:00
ABI_Dev_Notes.md Fix a typo in ABI_Dev_Notes.md (#17832) 2023-10-09 07:51:34 -07:00
Android_testing.md
C_API_Guidelines.md
cmake_guideline.md
Coding_Conventions_and_Standards.md [docs] Specify Objective-C max line length. (#16503) 2023-06-28 16:58:23 -07:00
ContribOperators.md fix gqa cpu nan bug (#20521) 2024-05-07 15:19:26 -07:00
FAQ.md [Technical docs] Fixed a couple of old links in FAQ.md (#17415) 2023-09-26 13:38:24 -07:00
How_To_Update_ONNX_Dev_Notes.md Integration with ONNX 1.16.0 (#19745) 2024-04-12 09:46:49 -07:00
Memory_Optimizer.md Prompt layer-wise recompute when applicable (#20126) 2024-04-10 11:50:28 +08:00
Model_Test.md
NotesOnThreading.md
ONNX_Runtime_Server_Usage.md
onnxruntime_dependencies.dot
onnxruntime_dependencies.png
onnxruntime_extensions.md Remove the extensions submodule (#17097) 2023-08-14 10:16:33 -07:00
OperatorKernels.md [DML EP] Register DFT-20 (#20341) 2024-05-02 11:08:39 -07:00
ORT_Format_Update_in_1.13.md
ORT_Use_Trtion_Kernel.md [ROCm] Add ROCm Triton TunableOp for GroupNorm (#16196) 2023-07-11 13:55:30 +08:00
ORTMobilePackageOperatorTypeSupport.md
ORTModule_Convergence_Notes.md Fix and enable few ORTModule Unit Tests (#19847) 2024-03-12 10:49:19 +08:00
ORTModule_ModuleWithLoss_Wrapper.md add steps to write modulewithloss wrapper (#16486) 2023-07-11 09:07:35 +08:00
ORTModule_PythonOp_Notes.md Add document for PythonOp (#17888) 2023-10-12 08:36:22 +08:00
ORTModule_Training_Guidelines.md Introducing ORTPipelineModule - DeepSpeed Parallel Pipeline Support. (#20287) 2024-04-18 11:30:15 -07:00
PR_Guidelines.md
Privacy.md [C# and Python APIs] Expose knobs to enable/disable platform telemetry collection (#5481) 2020-10-21 10:32:13 -07:00
Python_Dev_Notes.md
Reduced_Operator_Kernel_build.md
ReleaseManagement.md
Roadmap.md
Server.md
TVM_EP.md Fix: update hyperlinks to the Jupyter notebooks (#16145) 2023-08-21 09:53:05 -07:00
Versioning.md
WinML_principles.md