From 927c3e39ec1fb78e571c0ec2521ae59ed05720f2 Mon Sep 17 00:00:00 2001
From: Jacky Lee <39754370+jla524@users.noreply.github.com>
Date: Tue, 17 Dec 2024 09:33:50 -0800
Subject: [PATCH] Fix image preview in multi-GPU inference docs (#35303)

fix: link for img
---
 docs/source/en/perf_infer_gpu_multi.md | 2 +-
 docs/source/zh/perf_infer_gpu_multi.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/en/perf_infer_gpu_multi.md b/docs/source/en/perf_infer_gpu_multi.md
index 997509441..ea9421747 100644
--- a/docs/source/en/perf_infer_gpu_multi.md
+++ b/docs/source/en/perf_infer_gpu_multi.md
@@ -64,5 +64,5 @@ You can benefit from considerable speedups for inference, especially for inputs
 For a single forward pass on [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel) with a sequence length of 512 and various batch sizes, the expected speedup is as follows:
+
+