transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-14 20:58:08 +00:00

Jack Morris 832c6191ed Add inputs_embeds param to ModernBertModel (#35373)	2025-01-09 14:17:26 +01:00

* update modular_modernbert -- add inputs_embeds param to ModernBertModel
* Fix implementation issues; extend to other classes; docstring

  First of all, the inputs_embeds shouldn't fully replace `self.embeddings(input_ids)`, because this call also does layer normalization and dropout. So, now both input_ids and inputs_embeds are passed to the ModernBertEmbeddings, much like how BertEmbeddings is implemented.

  I also added `inputs_embeds` to the docstring, and propagated the changes to the other model classes.

  I also introduced an error if input_ids and inputs_embeds are both or neither provided.

  Lastly, I fixed an issue with device being based solely on input_ids with attention_mask.
* Propagate inputs_embeds to ModernBertForMaskedLM correctly

  Also reintroduce inputs_embeds test

---------

Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>
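
The commit message above mentions an error raised when `input_ids` and `inputs_embeds` are both provided or both missing. A minimal sketch of that kind of check follows, in plain Python without any model code. The helper name `select_model_input` and the error wording are hypothetical and are not taken from the commit itself:

```python
from typing import Optional, Sequence


def select_model_input(
    input_ids: Optional[Sequence[int]] = None,
    inputs_embeds: Optional[Sequence[Sequence[float]]] = None,
) -> str:
    """Return the name of the single input that was provided.

    Hypothetical helper for illustration only; not part of transformers.
    """
    # Exactly one of the two inputs must be given.
    # Both None (neither given) or both set (both given) is rejected.
    if (input_ids is None) == (inputs_embeds is None):
        raise ValueError("Specify exactly one of input_ids or inputs_embeds")
    return "input_ids" if input_ids is not None else "inputs_embeds"
```

Calling `select_model_input(input_ids=[1, 2, 3])` returns `"input_ids"`, while passing both arguments or none raises `ValueError`. The same "whichever input is present" choice could also decide which input supplies the device, which is the last fix the message describes.
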
agents	Change `is_soundfile_availble` to `is_soundfile_available` (#35030)	2025-01-03 14:37:42 +01:00
benchmark
bettertransformer
deepspeed	Trainer - deprecate tokenizer for processing_class (#32385)	2024-10-02 14:08:46 +01:00
extended	[tests] skip tests for xpu (#33553)	2024-09-19 19:28:04 +01:00
fixtures
fsdp	FSDP grad accum fix (#34645)	2024-11-15 22:28:06 +01:00
generation	fix: Qwen2-VL generate with inputs_embeds (#35466)	2025-01-08 16:36:03 +01:00
models	Add inputs_embeds param to ModernBertModel (#35373)	2025-01-09 14:17:26 +01:00
optimization
peft_integration	added logic for deleting adapters once loaded (#34650)	2025-01-06 18:36:40 +00:00
pipelines	Pipeline: simple API for assisted generation (#34504)	2025-01-08 17:08:02 +00:00
quantization	Add Gemma2 GGUF support (#34002)	2025-01-03 14:50:07 +01:00
repo_utils	Refactor CI: more explicit (#30674)	2024-08-30 18:17:25 +02:00
sagemaker	Trainer - deprecate tokenizer for processing_class (#32385)	2024-10-02 14:08:46 +01:00
tokenization	VLM: special multimodal Tokenizer (#34461)	2024-11-04 16:37:51 +01:00
tp	Simplify Tensor Parallel implementation with PyTorch TP (#34184)	2024-11-18 19:51:49 +01:00
trainer	Fix all output_dir in test_trainer.py to use tmp_dir (#35266)	2025-01-08 19:44:39 +01:00
utils	More model refactoring! (#35359)	2025-01-09 11:09:09 +01:00
__init__.py
test_backbone_common.py
test_configuration_common.py	Load sub-configs from composite configs (#34410)	2024-11-05 11:34:01 +01:00
test_feature_extraction_common.py
test_image_processing_common.py	Fix Qwen2VL processor to handle odd number of frames (#35431)	2025-01-08 13:49:00 +01:00
test_image_transforms.py
test_modeling_common.py	Fix flaky `test_batching_equivalence` (#35564)	2025-01-09 14:00:08 +01:00
test_modeling_flax_common.py	🚨All attention refactor🚨 (#35235)	2024-12-18 16:53:39 +01:00
test_modeling_tf_common.py	🚨All attention refactor🚨 (#35235)	2024-12-18 16:53:39 +01:00
test_pipeline_mixin.py	Add image text to text pipeline (#34170)	2024-10-31 15:48:11 -04:00
test_processing_common.py	VLMs: major clean up 🧼 (#34502)	2025-01-08 10:35:23 +01:00
test_sequence_feature_extraction_common.py
test_tokenization_common.py	Correctly list the chat template file in the Tokenizer saved files list (#34974)	2025-01-07 19:11:02 +00:00