transformers/tests
Arthur 4bff54f921
Gemma capping (#34282)
* softcapping

* soft cap before the mask

* style

* ...

* super nit

* update

* fixes

* update

* small issue with modular

* fix modular imports

* update

* fixup

* simplify a hell lot

* simplify cleaning imports

* finish fixing

* update our design

* nits

* use a deprecation cycle

* updates

* Fix modular (recursive deps need to always be computed after merges!)

* push

* fix

* update

* fix modular order

* make fix-copies

* updates

* update

* ?

* don't compile for now

* ?

* fix some stuff

* donc!

* fix copies

* update

* fixup

* ?

* fix two tests

* fix?

* for now, don't use head info

* eager when output attention and sdpa or flash as it's the simplest behaviour (for our tests as well :))

* fix-copies

* revert sdpa check

* Apply suggestions from code review

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>

* rebase, fix-copies and push

* add a slow integration test

* update the test

* fix left padding issue

* fix test

* remove duplicate scaling

* quality

* add a small test and make sure it works

* 2b

---------

Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-19 13:52:38 +01:00
agents Agents: Small fixes in streaming to gradio + add tests (#34549) 2024-11-11 20:52:09 +01:00
benchmark
bettertransformer
deepspeed Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
extended [tests] skip tests for xpu (#33553) 2024-09-19 19:28:04 +01:00
fixtures
fsdp FSDP grad accum fix (#34645) 2024-11-15 22:28:06 +01:00
generation Gemma capping (#34282) 2024-11-19 13:52:38 +01:00
models Gemma capping (#34282) 2024-11-19 13:52:38 +01:00
optimization fix: Fixed the 1st argument name in classmethods (#31907) 2024-07-11 12:11:50 +01:00
peft_integration [PEFT] Add warning for missing key in LoRA adapter (#34068) 2024-10-24 17:56:40 +02:00
pipelines Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
quantization Fix use_parallel_residual and qkv_bias for StableLM GGUF config extraction (#34450) 2024-11-05 18:26:20 +01:00
repo_utils Refactor CI: more explicit (#30674) 2024-08-30 18:17:25 +02:00
sagemaker Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
tokenization VLM: special multimodal Tokenizer (#34461) 2024-11-04 16:37:51 +01:00
tp Simplify Tensor Parallel implementation with PyTorch TP (#34184) 2024-11-18 19:51:49 +01:00
trainer Remove FSDP wrapping from sub-models. (#34452) 2024-11-15 23:00:03 +01:00
utils 🧼 remove v4.44 deprecations (#34245) 2024-11-15 23:07:24 +01:00
__init__.py
test_backbone_common.py
test_configuration_common.py Load sub-configs from composite configs (#34410) 2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py
test_image_processing_common.py Add DetrImageProcessorFast (#34063) 2024-10-21 09:05:05 -04:00
test_image_transforms.py
test_modeling_common.py Fix skip of test_training_gradient_checkpointing (#34723) 2024-11-18 15:45:40 +01:00
test_modeling_flax_common.py
test_modeling_tf_common.py [TF] Fix Tensorflow XLA Generation on limited seq_len models (#33903) 2024-10-05 16:20:50 +02:00
test_pipeline_mixin.py Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
test_processing_common.py Uniformize model processors (#31368) 2024-10-02 10:41:08 +02:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py Retain newlines in chat template when continue_final_message=True (#34253) 2024-11-15 14:27:04 +00:00