onnxruntime/tools/ci_build/github
Yi Zhang 435e19953e
Fix llama.covert_onnx to make it runnable in CI (#19372)
### Description
1. Make `parity_check` use a local model so that no HF token is needed (see the first sketch after this list).
2. `del` on the model didn't work because it tried to delete an object defined outside the function scope, so the memory was never released and the A10 ran out of memory (see the second sketch after this list).
3. In principle, 16 GB of GPU memory (one T4) is enough, but the conversion process was always killed on the T4 while it worked on the A10 (24 GB).
     Standard_NC4as_T4_v3 has 28 GB of CPU memory; Standard_NV36ads_A10_v5 has 440 GB.
     It appears the model conversion needs a very large amount of CPU memory.
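
For fix 1, the idea is to point the parity check at a model directory that is already on disk instead of a Hugging Face hub ID. A minimal sketch, assuming the reference model is loaded with `transformers` (the directory path is hypothetical, not the actual CI layout):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pre-downloaded model directory available in the CI environment (hypothetical path).
local_model_dir = "/workspace/models/Llama-2-7b-hf"

# Loading from a local path never contacts huggingface.co, so no HF token
# is required for authentication in CI.
model = AutoModelForCausalLM.from_pretrained(local_model_dir)
tokenizer = AutoTokenizer.from_pretrained(local_model_dir)
```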
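For fix 2, `del` inside a helper function only unbinds the local name; if the caller still holds a reference, nothing is actually freed and GPU memory eventually runs out. A short sketch of the pattern (names are illustrative, not the actual convert_to_onnx.py code):

```python
import gc
import torch

def broken_cleanup(model):
    # Only removes the local name `model`; the caller's reference keeps
    # the object (and its GPU memory) alive.
    del model
    torch.cuda.empty_cache()

def main():
    model = torch.nn.Linear(4096, 4096).cuda()
    broken_cleanup(model)  # model is still alive here -> memory not released

    # Working pattern: drop the last reference in the scope that owns it,
    # then let the allocator return the memory.
    del model
    gc.collect()
    torch.cuda.empty_cache()
```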

### Motivation and Context
Last time, I came across some issues in convert_to_onnx.py, so I used the ONNX model from https://github.com/microsoft/Llama-2-Onnx for testing. These issues have now been fixed, so I use the ONNX model generated by this repo and the CI can cover the model conversion.
2024-02-05 07:26:24 +08:00
android Add LeakyRelu to list of NNAPI operators (#18880) 2023-12-20 14:44:31 +10:00
apple Fix iOS artifacts issue in Microsoft.ML.OnnxRuntime Nuget Package (#19311) 2024-01-30 08:44:20 -08:00
azure-pipelines Fix llama.covert_onnx to make it runnable in CI (#19372) 2024-02-05 07:26:24 +08:00
js Add MacOS build to ORT C Pod (#18550) 2023-11-28 10:11:53 -08:00
linux Fix a build issue: /MP was not enabled correctly (#19190) 2024-01-29 12:45:38 -08:00
pai [ROCm] Fix CI pipeline by fixing pytest version (#19407) 2024-02-04 16:37:36 +08:00
windows Enable Address Sanitizer in CI (#19073) 2024-01-12 07:24:40 -08:00
Doxyfile_csharp.cfg