onnxruntime/docs
pengwa 4bfa84487c
Skip module clone for preparing large model export (#18663)
### Skip module clone for preparing large model export

For LLAMA2 13B, when running with Lora, DeepSpeed stage2 on 8 GPUs . It
failed during preparing outputs which will be used for
torch.onnx.export. The reason, we deep copy all the params including
both big sizes of frozen weights, + a little bit of Lora trainable
weight.

This PR will firstly check whether the GPU memmory is enough for a
cloned module, if not, skip the copy.

Copying the module is to guarantee the fw path run may change the
weight, while this case should be rare. But for now, Not-Able-To-Run is
worse than Runnable-with-A-little-bit-different-initial-weight,
especially for large models.
2023-12-05 12:41:17 -08:00
..
c_cxx Remove extraneous javascript includes (#17558) 2023-09-14 20:43:24 -07:00
execution_providers/images
images API Documentation (#8948) 2021-09-09 22:04:51 -07:00
python Openvino ep ort 23.1 (#17911) 2023-11-01 08:39:39 -07:00
ABI_Dev_Notes.md Fix a typo in ABI_Dev_Notes.md (#17832) 2023-10-09 07:51:34 -07:00
Android_testing.md
C_API_Guidelines.md Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00
cmake_guideline.md fix some typo in docs (#13212) 2022-10-07 15:58:18 -07:00
Coding_Conventions_and_Standards.md [docs] Specify Objective-C max line length. (#16503) 2023-06-28 16:58:23 -07:00
ContribOperators.md Bfloat16 support for MatMulBnb4, Training support bitsandbytes>=0.41.2 (#18484) 2023-11-20 09:52:58 -08:00
FAQ.md [Technical docs] Fixed a couple of old links in FAQ.md (#17415) 2023-09-26 13:38:24 -07:00
How_To_Update_ONNX_Dev_Notes.md Remove exclusions for ONNX model tests that now pass. (#14337) 2023-01-24 08:04:27 +10:00
Memory_Optimizer.md Memory optimization refactor and refinement (#17481) 2023-11-23 11:39:00 +08:00
Model_Test.md
NotesOnThreading.md Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00
ONNX_Runtime_Server_Usage.md
onnxruntime_dependencies.dot
onnxruntime_dependencies.png
onnxruntime_extensions.md Remove the extensions submodule (#17097) 2023-08-14 10:16:33 -07:00
OperatorKernels.md Add float16 type support to SplitToSequence and make code type independent (#18594) 2023-11-29 10:44:59 -08:00
ORT_Format_Update_in_1.13.md Update ORT format v5 change docs to cover limited backwards compatibility in 1.14. (#14413) 2023-01-25 08:23:12 -08:00
ORT_Use_Trtion_Kernel.md [ROCm] Add ROCm Triton TunableOp for GroupNorm (#16196) 2023-07-11 13:55:30 +08:00
ORTMobilePackageOperatorTypeSupport.md Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00
ORTModule_Convergence_Notes.md Introduce ZeROOffloadSubscriber for ORTModule (#17006) 2023-08-25 00:15:22 +08:00
ORTModule_ModuleWithLoss_Wrapper.md add steps to write modulewithloss wrapper (#16486) 2023-07-11 09:07:35 +08:00
ORTModule_PythonOp_Notes.md Add document for PythonOp (#17888) 2023-10-12 08:36:22 +08:00
ORTModule_Training_Guidelines.md Skip module clone for preparing large model export (#18663) 2023-12-05 12:41:17 -08:00
PR_Guidelines.md
Privacy.md
Python_Dev_Notes.md
Reduced_Operator_Kernel_build.md replace 'master' branch ref to 'main' for onnx repo (#12678) 2022-08-30 13:41:42 -07:00
ReleaseManagement.md
Roadmap.md Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00
Server.md
TVM_EP.md Fix: update hyperlinks to the Jupyter notebooks (#16145) 2023-08-21 09:53:05 -07:00
Versioning.md replace 'master' branch ref to 'main' for onnx repo (#12678) 2022-08-30 13:41:42 -07:00
WinML_principles.md Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00