mirror of
https://github.com/saymrwulf/transformers.git
synced 2026-05-14 20:58:08 +00:00
Fix image preview in multi-GPU inference docs (#35303)
fix: link for img
This commit is contained in:
parent
4302b27719
commit
927c3e39ec
2 changed files with 2 additions and 2 deletions
|
|
@ -64,5 +64,5 @@ You can benefit from considerable speedups for inference, especially for inputs
|
|||
For a single forward pass on [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel) with a sequence length of 512 and various batch sizes, the expected speedup is as follows:
|
||||
|
||||
<div style="text-align: center">
|
||||
<img src="huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Meta-Llama-3-8B-Instruct, seqlen = 512, python, w_ compile.png">
|
||||
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Meta-Llama-3-8B-Instruct%2C%20seqlen%20%3D%20512%2C%20python%2C%20w_%20compile.png">
|
||||
</div>
|
||||
|
|
|
|||
|
|
@ -64,5 +64,5 @@ torchrun --nproc-per-node 4 demo.py
|
|||
以下是 [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel) 模型在序列长度为 512 且不同批量大小情况下的单次前向推理的预期加速效果:
|
||||
|
||||
<div style="text-align: center">
|
||||
<img src="huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Meta-Llama-3-8B-Instruct, seqlen = 512, python, w_ compile.png">
|
||||
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Meta-Llama-3-8B-Instruct%2C%20seqlen%20%3D%20512%2C%20python%2C%20w_%20compile.png">
|
||||
</div>
|
||||
|
|
|
|||
Loading…
Reference in a new issue