transformers/tests
Jack Morris 832c6191ed
Add inputs_embeds param to ModernBertModel (#35373)
* update modular_modernbert -- add inputs_embeds param to ModernBertModel

* Fix implementation issues; extend to other classes; docstring

First of all, the inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So, now both input_ids and inputs_embeds are passed to the ModernBertEmbeddings, much like how BertEmbeddings is implemented.

I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.

I also introduced an error that is raised if both or neither of input_ids and inputs_embeds are provided.

Lastly, I fixed an issue where the device used for attention_mask was based solely on input_ids.
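
As a rough illustration of the behaviour described above, here is a minimal pure-Python sketch. It is not the ModernBERT code: `ToyEmbeddings`, its lookup table, and the hand-written normalization are hypothetical stand-ins, and dropout is left out (as it would be in eval mode). It only shows that exactly one of the two inputs is accepted, and that normalization still runs when precomputed embeddings are supplied.

```python
from typing import Dict, List, Optional, Sequence


class ToyEmbeddings:
    """Hypothetical stand-in for an embeddings module (lookup + normalization)."""

    def __init__(self, table: Dict[int, List[float]]) -> None:
        self.table = table  # token id -> embedding vector

    @staticmethod
    def _layer_norm(vec: Sequence[float], eps: float = 1e-5) -> List[float]:
        # Normalize a single vector to zero mean and unit variance.
        mean = sum(vec) / len(vec)
        var = sum((x - mean) ** 2 for x in vec) / len(vec)
        return [(x - mean) / (var + eps) ** 0.5 for x in vec]

    def __call__(
        self,
        input_ids: Optional[Sequence[int]] = None,
        inputs_embeds: Optional[Sequence[Sequence[float]]] = None,
    ) -> List[List[float]]:
        # Reject the "both" and "neither" cases.
        if (input_ids is None) == (inputs_embeds is None):
            raise ValueError(
                "You must specify exactly one of input_ids or inputs_embeds"
            )
        # Only the lookup is skipped when embeddings are given directly.
        if inputs_embeds is None:
            inputs_embeds = [self.table[i] for i in input_ids]
        # Normalization is applied on both paths.
        return [self._layer_norm(v) for v in inputs_embeds]


emb = ToyEmbeddings({0: [1.0, 2.0, 3.0], 1: [0.0, 4.0, 8.0]})
from_ids = emb(input_ids=[0, 1])
from_embeds = emb(inputs_embeds=[[1.0, 2.0, 3.0], [0.0, 4.0, 8.0]])
print(from_ids == from_embeds)
```

Passing the raw table rows as `inputs_embeds` gives the same result as passing the matching ids, because both paths share the normalization step.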

* Propagate inputs_embeds to ModernBertForMaskedLM correctly

Also reintroduce inputs_embeds test

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
2025-01-09 14:17:26 +01:00
agents Change is_soundfile_availble to is_soundfile_available (#35030) 2025-01-03 14:37:42 +01:00
benchmark
bettertransformer
deepspeed Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
extended [tests] skip tests for xpu (#33553) 2024-09-19 19:28:04 +01:00
fixtures
fsdp FSDP grad accum fix (#34645) 2024-11-15 22:28:06 +01:00
generation fix: Qwen2-VL generate with inputs_embeds (#35466) 2025-01-08 16:36:03 +01:00
models Add inputs_embeds param to ModernBertModel (#35373) 2025-01-09 14:17:26 +01:00
optimization
peft_integration added logic for deleting adapters once loaded (#34650) 2025-01-06 18:36:40 +00:00
pipelines Pipeline: simple API for assisted generation (#34504) 2025-01-08 17:08:02 +00:00
quantization Add Gemma2 GGUF support (#34002) 2025-01-03 14:50:07 +01:00
repo_utils Refactor CI: more explicit (#30674) 2024-08-30 18:17:25 +02:00
sagemaker Trainer - deprecate tokenizer for processing_class (#32385) 2024-10-02 14:08:46 +01:00
tokenization VLM: special multimodal Tokenizer (#34461) 2024-11-04 16:37:51 +01:00
tp Simplify Tensor Parallel implementation with PyTorch TP (#34184) 2024-11-18 19:51:49 +01:00
trainer Fix all output_dir in test_trainer.py to use tmp_dir (#35266) 2025-01-08 19:44:39 +01:00
utils More model refactoring! (#35359) 2025-01-09 11:09:09 +01:00
__init__.py
test_backbone_common.py
test_configuration_common.py Load sub-configs from composite configs (#34410) 2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py
test_image_processing_common.py Fix Qwen2VL processor to handle odd number of frames (#35431) 2025-01-08 13:49:00 +01:00
test_image_transforms.py
test_modeling_common.py Fix flaky test_batching_equivalence (#35564) 2025-01-09 14:00:08 +01:00
test_modeling_flax_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_modeling_tf_common.py 🚨All attention refactor🚨 (#35235) 2024-12-18 16:53:39 +01:00
test_pipeline_mixin.py Add image text to text pipeline (#34170) 2024-10-31 15:48:11 -04:00
test_processing_common.py VLMs: major clean up 🧼 (#34502) 2025-01-08 10:35:23 +01:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py Correctly list the chat template file in the Tokenizer saved files list (#34974) 2025-01-07 19:11:02 +00:00