Mirror of https://github.com/saymrwulf/transformers.git (synced 2026-05-14 20:58:08 +00:00)
Squashed commit message (Moonshine model addition):

* config draft
* full encoder forward
* full decoder forward
* fix sdpa and FA2
* fix sdpa and FA2
* moonshine model
* moonshine model forward
* fix attention with past_key_values
* add MoonshineForConditionalGeneration
* fix cache handling and causality for cross attention
* no causal attention mask for the encoder
* model addition (imports etc)
* small nit
* nits
* Update src/transformers/models/moonshine/convert_usefulsensors_to_hf.py (Co-authored-by: Joshua Lochner <admin@xenova.com>)
* add rope_theta
* nits
* model doc
* Update src/transformers/models/auto/configuration_auto.py (Co-authored-by: Joshua Lochner <admin@xenova.com>)
* imports
* add MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING_NAMES
* updates modular
* make
* make fix-copies
* ruff check examples fix
* fix check_modular_conversion
* nit
* nits
* nits
* copied from -> imports
* imports fix
* integrate attention refacto
* modular edge case
* remove encoder
* convolutions params in config
* run modular_model_converter
* make
* Update docs/source/en/model_doc/moonshine.md (Co-authored-by: Joshua Lochner <admin@xenova.com>)
* MoonshineModelTest
* correct typo
* make style
* integration tests
* make
* modular convert
* name conversion update (up_proj -> fc1 etc)
* update config
* update MLP
* update attention
* update encoder layer
* update decoder layer
* update convolutions parameters
* update encoder
* remove INPUTS_DOCSTRING
* update decoder
* update conditional generation
* update pretrained model
* imports
* modular converted
* update doc
* fix
* typo
* update doc
* update license
* update init
* split config in file
* two classes for MLP
* attention from GLM
* from GlmRotaryEmbedding
* split MLP
* apply arthur's review suggestions
* apply arthur's review suggestions
* apply arthur's review suggestions
* auto feature extractor
* convert modular
* fix + make
* convert modular
* make
* unsplit config
* use correct checkpoint
* wrap generate
* update tests
* typos
* make
* typo
* update doc

Co-authored-by: Joshua Lochner <admin@xenova.com>
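Several of the commits above concern attention causality: causal self-attention in the decoder, no causal mask in the encoder, and non-causal cross attention. As a rough illustration of the distinction (a minimal NumPy sketch, not Moonshine's actual implementation), a causal mask restricts each position to attend only to itself and earlier positions, while encoder-style attention passes no mask at all:

```python
import numpy as np

def causal_mask(seq_len):
    # Lower-triangular boolean mask: position i may attend to positions <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_softmax(scores, mask=None):
    # mask=None corresponds to encoder-style (fully bidirectional) attention;
    # a causal mask corresponds to decoder self-attention.
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# With uniform scores, the masks alone determine the attention pattern.
scores = np.zeros((4, 4))
dec = masked_softmax(scores, causal_mask(4))  # row 0 attends only to itself
enc = masked_softmax(scores)                  # every row is uniform over all 4 positions
```

With uniform scores, the decoder's first row puts all weight on position 0, while every encoder row spreads weight evenly (0.25 each), which is why the encoder commit could drop the causal mask entirely.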
Directory listing:

- agents/
- benchmark/
- bettertransformer/
- deepspeed/
- extended/
- fixtures/
- fsdp/
- generation/
- models/
- optimization/
- peft_integration/
- pipelines/
- quantization/
- repo_utils/
- sagemaker/
- tokenization/
- tp/
- trainer/
- utils/
- __init__.py
- test_backbone_common.py
- test_configuration_common.py
- test_feature_extraction_common.py
- test_image_processing_common.py
- test_image_transforms.py
- test_modeling_common.py
- test_modeling_flax_common.py
- test_modeling_tf_common.py
- test_pipeline_mixin.py
- test_processing_common.py
- test_sequence_feature_extraction_common.py
- test_tokenization_common.py