Mirror of https://github.com/saymrwulf/transformers.git (synced 2026-05-15 21:01:19 +00:00)
* Add model with cli tool * Remove unwanted stuff * Add new code * Remove inference runner * Style * Fix checks * Test updates * make fixup * fix docs * fix doc * fix test * hopefully fix pipeline tests * refactor * fix CIs * add comment * rename to `GPTBigCodeForCausalLM` * correct readme * make fixup + docs * make fixup * fixes * fixes * Remove pruning * Remove import * Doc updates * More pruning removal * Combine copies * Single MQA implementation, remove kv cache pre-allocation and padding * Update doc * Revert refactor to match gpt2 style * Merge back key and value caches, fix some type hints * Update doc * Fix position ids with padding (PR 21080) * Add conversion script temporarily * Update conversion script * Remove checkpoint conversion * New model * Fix MQA test * Fix copies * try fix tests * FIX TEST!! * remove `DoubleHeadsModel` * add MQA tests * add slow tests * clean up * add CPU checker * final fixes * fixes - fix GPU issue - fixed slow tests - skip disk offload * fix final issue * Simplify and comment baddbmm fix * Remove unnecessary code * Transpose tweaks * Use beta=1 on cpu, improve tests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> |
||
|---|---|---|
| internal | ||
| main_classes | ||
| model_doc | ||
| tasks | ||
| _config.py | ||
| _toctree.yml | ||
| accelerate.mdx | ||
| add_new_model.mdx | ||
| add_new_pipeline.mdx | ||
| add_tensorflow_model.mdx | ||
| attention.mdx | ||
| autoclass_tutorial.mdx | ||
| benchmarks.mdx | ||
| bertology.mdx | ||
| big_models.mdx | ||
| community.mdx | ||
| contributing.md | ||
| converting_tensorflow_models.mdx | ||
| create_a_model.mdx | ||
| custom_models.mdx | ||
| debugging.mdx | ||
| fast_tokenizers.mdx | ||
| generation_strategies.mdx | ||
| glossary.mdx | ||
| hpo_train.mdx | ||
| index.mdx | ||
| installation.mdx | ||
| migration.mdx | ||
| model_sharing.mdx | ||
| model_summary.mdx | ||
| multilingual.mdx | ||
| notebooks.md | ||
| pad_truncation.mdx | ||
| perf_hardware.mdx | ||
| perf_infer_cpu.mdx | ||
| perf_infer_gpu_many.mdx | ||
| perf_infer_gpu_one.mdx | ||
| perf_infer_special.mdx | ||
| perf_train_cpu.mdx | ||
| perf_train_cpu_many.mdx | ||
| perf_train_gpu_many.mdx | ||
| perf_train_gpu_one.mdx | ||
| perf_train_special.mdx | ||
| perf_train_tpu.mdx | ||
| perf_train_tpu_tf.mdx | ||
| performance.mdx | ||
| perplexity.mdx | ||
| philosophy.mdx | ||
| pipeline_tutorial.mdx | ||
| pipeline_webserver.mdx | ||
| pr_checks.mdx | ||
| preprocessing.mdx | ||
| quicktour.mdx | ||
| run_scripts.mdx | ||
| sagemaker.mdx | ||
| serialization.mdx | ||
| task_summary.mdx | ||
| tasks_explained.mdx | ||
| testing.mdx | ||
| tf_xla.mdx | ||
| tokenizer_summary.mdx | ||
| torchscript.mdx | ||
| training.mdx | ||
| troubleshooting.mdx | ||