transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-14 20:58:08 +00:00

Jerry Zhang 78d78cdf8a Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00

* Add TorchAOHfQuantizer
  Summary: Enable loading torchao quantized model in huggingface.
  Test Plan: local test
  Reviewers: Subscribers: Tasks: Tags:
* Fix a few issues
* style
* Added tests and addressed some comments about dtype conversion
* fix torch_dtype warning message
* fix tests
* style
* TorchAOConfig -> TorchAoConfig
* enable offload + fix memory with multi-gpu
* update torchao version requirement to 0.4.0
* better comments
* add torch.compile to torchao README, add perf number link

Co-authored-by: Marc Sun <marc@huggingface.co>

..
internal	Gemma2: add cache warning (#32279)	2024-08-07 10:03:05 +05:00
main_classes	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00
model_doc	"to be not" -> "not to be" (#32636)	2024-08-12 20:20:17 +01:00
quantization	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00
tasks	[docs] Translation guide (#32547)	2024-08-08 13:43:14 -07:00
_config.py
_redirects.yml	Docs / Quantization: Redirect deleted page (#31063)	2024-05-28 18:29:22 +02:00
_toctree.yml	Add TorchAOHfQuantizer (#32306)	2024-08-14 16:14:24 +02:00
accelerate.md
add_new_model.md
add_new_pipeline.md
agents.md	Agents use grammar (#31735)	2024-08-07 11:42:52 +02:00
attention.md
autoclass_tutorial.md
benchmarks.md
bertology.md
big_models.md
chat_templating.md	Cleanup tool calling documentation and rename doc (#32337)	2024-08-12 16:20:14 +01:00
community.md
contributing.md
conversations.md	[docs] change temperature to a positive value (#32077)	2024-07-23 17:47:51 +01:00
create_a_model.md	Enable HF pretrained backbones (#31145)	2024-06-06 22:02:38 +01:00
custom_models.md
debugging.md
deepspeed.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
fast_tokenizers.md
fsdp.md
generation_strategies.md	Docs: alert for the possibility of manipulating logits (#32467)	2024-08-07 16:34:46 +01:00
gguf.md	Add Qwen2 GGUF loading support (#31175)	2024-06-03 14:55:10 +01:00
glossary.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
hpo_train.md
index.md	Add new model (#32615)	2024-08-12 08:22:47 +02:00
installation.md	Use `HF_HUB_OFFLINE` + fix has_file in offline mode (#31016)	2024-05-29 11:55:43 +01:00
kv_cache.md	Cache: create docs (#32150)	2024-08-06 10:24:19 +05:00
llm_optims.md	Generate: end-to-end compilation (#30788)	2024-07-29 10:52:13 +01:00
llm_tutorial.md	Generate: update links on LLM tutorial doc (#30550)	2024-04-30 18:14:12 +01:00
llm_tutorial_optimization.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
model_memory_anatomy.md
model_sharing.md	Docs: formatting nits (#32247)	2024-07-30 15:49:14 +01:00
model_summary.md
multilingual.md
notebooks.md
pad_truncation.md
peft.md	Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (#31136)	2024-05-30 16:47:35 +02:00
perf_hardware.md	Fix typos (#31819)	2024-07-08 11:52:47 +01:00
perf_infer_cpu.md
perf_infer_gpu_one.md	Add Qwen2-Audio (#32137)	2024-08-08 15:47:24 +02:00
perf_torch_compile.md	fix(docs): Fixed a link in docs (#32274)	2024-07-29 10:50:43 +01:00
perf_train_cpu.md
perf_train_cpu_many.md
perf_train_gpu_many.md	Update perf_train_gpu_many.md (#31451)	2024-06-18 11:00:26 -07:00
perf_train_gpu_one.md	Add torch_empty_cache_steps to TrainingArguments (#31546)	2024-07-04 13:20:49 -04:00
perf_train_special.md
perf_train_tpu_tf.md
performance.md
perplexity.md
philosophy.md
pipeline_tutorial.md	Allow FP16 or other precision inference for Pipelines (#31342)	2024-07-05 17:21:50 +01:00
pipeline_webserver.md
pr_checks.md
preprocessing.md	chore: remove duplicate words (#31853)	2024-07-09 10:38:29 +01:00
quicktour.md	docs: fix broken link (#31370)	2024-06-12 11:33:00 +01:00
run_scripts.md	Fix broken link to Transformers notebooks (#30512)	2024-04-29 10:57:51 +01:00
sagemaker.md
serialization.md
task_summary.md
tasks_explained.md
testing.md	Docs: Fixed WhisperModel.forward’s docstring link (#32498)	2024-08-07 11:01:33 -07:00
tf_xla.md	fix(docs): Fixed a link in docs (#32274)	2024-07-29 10:50:43 +01:00
tflite.md
tokenizer_summary.md	[docs] Spanish translation of tokenizer_summary.md (#31154)	2024-06-03 16:52:23 -07:00
torchscript.md
trainer.md	Add support for GrokAdamW optimizer (#32521)	2024-08-13 13:20:28 +01:00
training.md	Added the necessay import of module (#30804)	2024-05-14 18:45:06 +01:00
troubleshooting.md