mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-16 21:00:14 +00:00
### Description
1. Make parity_check use a local model so that no Hugging Face token is needed.
2. `del model` did not work because it tried to delete an object defined
outside the function scope, so the reference in the outer scope kept the
model alive. This caused out-of-memory errors on the A10.
3. In principle, 16 GB of GPU memory (one T4) is enough, but the conversion
process is always killed on the T4 while it succeeds on the A10 (24 GB).
Standard_NC4as_T4_v3 has 28 GB of CPU memory;
Standard_NV36ads_A10_v5 has 440 GB.
It appears that the model conversion needs a very large amount of memory.
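The `del` pitfall in item 2 can be illustrated with a minimal sketch (the `Model` class and `convert` function below are hypothetical stand-ins, not code from this repo): deleting a name inside a function only drops the local reference, so the object survives as long as the caller's reference exists.

```python
import gc
import weakref

class Model:
    """Stand-in for a large model object (hypothetical)."""
    pass

def convert(model):
    # `del` here only removes the *local* name; the caller's reference
    # still keeps the object (and any memory it holds) alive.
    del model
    gc.collect()

model = Model()
ref = weakref.ref(model)

convert(model)
assert ref() is not None  # still alive: the outer scope holds a reference

# To actually free the memory, delete the reference in the defining scope.
del model
gc.collect()
assert ref() is None  # now collected
```

This is why the fix has to remove the reference in the scope where the model was created, rather than inside the helper function.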
### Motivation and Context
Last time, I ran into some issues with convert_to_onnx.py, so I used the
ONNX model from https://github.com/microsoft/Llama-2-Onnx for testing.
Those issues have now been fixed, so I use the ONNX model generated by this
repo, and the CI can cover the model conversion.