transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-14 20:58:08 +00:00

Jerry Zhang 78d78cdf8a Add TorchAOHfQuantizer (#32306)		2024-08-14 16:14:24 +02:00

* Add TorchAOHfQuantizer
  Summary: Enable loading torchao quantized model in huggingface.
  Test Plan: local test
  Reviewers: Subscribers: Tasks: Tags:
* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link

Co-authored-by: Marc Sun <marc@huggingface.co>

agent.md	Add stream messages from agent run for gradio chatbot (#32142)	2024-07-29 20:12:44 +02:00
backbones.md	doc: fix broken BEiT and DiNAT model links on Backbone page (#32029)	2024-07-17 20:24:10 +01:00
callback.md	Update CometCallback to allow reusing of the running experiment (#31366)	2024-07-05 08:13:46 +02:00
configuration.md
data_collator.md	Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (#31629)	2024-07-23 15:56:41 +02:00
deepspeed.md	[docs] DeepSpeed (#28542)	2024-01-24 08:31:28 -08:00
feature_extractor.md
image_processor.md	Fast image processor (#28847)	2024-06-11 15:47:38 +01:00
keras_callbacks.md
logging.md
model.md	Speedup model init on CPU (by 10x+ for llama-3-8B as one example) (#31771)	2024-07-16 09:32:01 -04:00
onnx.md
optimizer_schedules.md	Add WSD scheduler (#30231)	2024-04-25 12:07:21 +01:00
output.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
pipelines.md	Allow FP16 or other precision inference for Pipelines (#31342)	2024-07-05 17:21:50 +01:00
processors.md
quantization.md	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00
text_generation.md	Add Watermarking LogitsProcessor and WatermarkDetector (#29676)	2024-05-14 13:31:39 +05:00
tokenizer.md
trainer.md