transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-14 20:58:08 +00:00

Author	SHA1	Message	Date
Yih-Dar	3897f2caf8	Enable pytest live log and show warning logs on GitHub Actions CI runs (#35912 ) * fix * remove * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-10 13:36:20 +01:00
Jingze Shi	48a309d0d2	Support constant lr with cooldown (#35453 ) * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add support for constant learning rate with cooldown * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and cooldown methods to 'get_wsc_schedule' * Add more warmup and decay methods to 'get_wsd_schedule' * support num_training_steps and num_stable_steps for get_wsd_schedule * support num_training_steps and num_stable_steps for get_wsd_schedule * get wsd scheduler before the `num_training_steps` decision * fix code_quality * Update stable branch logic * fix code_quality * Move stable stage decide to `get_wsd_schedule` * Update docstring of `get_wsd_schedule` * Update `num_train_steps` to optional * Update `num_train_steps` to optional * Update docstring of `get_wsd_schedule` * Update src/transformers/optimization.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-02-10 13:21:55 +01:00
Armaghan Shakir	9a6be63fdb	Add Apple's Depth-Pro for depth estimation (#34583 ) * implement config and model building blocks * refactor model architechture * update model outputs * update init param to include use_fov_model * update param name in config * fix hidden_states and attentions outputs for fov * sort config * complete minor todos * update patching * update config for encoder * fix config * use correct defaults in config * update merge for compatibility with different image size * restructure encoder for custom configuration * make fov model compatible with custom config * replace word "decoder" with "fusion" * weight conversion script * fix fov squeeze * update conversion script (without test) * upload ruff image processing * create fast image processing * use torch interpolation for image processing * complete post_process_depth_estimation * config: fix imports and sort args * apply inference in weight conversion * use mllama script instead for weight conversion * clean weight conversion script * add depth-pro status in other files * fill docstring in config * formatting * more formatting * formatting with ruff * formatting with style * fix copied classes * add examples; update weight convert script * fix using check_table.py and isort * fix config docstring * add depth pro to sdpa docs * undo unintentional changes in configuration_gemma.py * minor fixes * test image processing * fixes and tests * more fixes * use output states from image_encoder instead * Revert "use output states from image_encoder instead" This reverts commit 2408ec54e4f27d2abbecdb8374e58f34d91d8e96. * make embeddings dynamic * reshape output hidden states and attentions as part of computation graph * fix ruff formating * fix docstring failure * use num_fov_head_layers in tests * update doc * check consistency with config * ruff formatting * update test case * fix ruff formatting * add tests for fov * use interpolation in postprocess * run and fix slow tests locally * use scaled_images_features for image and fov encoder * return fused_hidden_states in fusion stage * fix example * fix ruff * fix copyright license for all files * add __all__ for each file * minor fixes - fix download spell - add push_to_hub option - fix Optional type hinting - apply single loop for DepthProImageProcessor.preprocess * return list in post_process_depth_estimation * minor fixes - capitalize start of docstring - use ignore copy - fix examples - move docstring templates and custom output classes to top - remove "-> None" typehinting from __init__ - type hinting for forward passes - fix docstrings for custom output classes * fix "ruff check" * update upsample and projection * major changes: (image size and merge optimization) - add support for images of any size - optimize merge operation - remove image_size from config - use full names instead of B, C, H, W - remove interpolation from fusion stage - add interpolation after merge - move validations to config - update integration test - add type hints for functions * fix push_to_hub option in weights conversion * remove image_size in weights conversion * major changes in the architecture - remove all DepthProViT modules and support different backbones using the AutoModel API - set default use_fov_model to False - validate parameters in configuration - update interpolate function: use "nearest" for faster computation - update reshape_feature function: remove all special tokens, possible from different backbones - update merge function: use padding from config instead of merge_out_size - remove patch_to_batch and batch_to_patch conversions for now - calculate out_size dynamically in the encoder - leave head_mask calculation to the backbone - fix bugs with merge - add more comments - update tests * placeholder for unused config attributes * improve docs amid review * minor change in docs * further optimize merge * fix formatting * remove unused patch/batch convertion functions * use original F.interpolate * improve function naming * minor chages - use torch_int instead of int - use proper for newly initialized tensors - use user provided return_dict for patch_encoder - use if-else block instead in self.use_fov_model * rearchitect upsample block for improved modularity * update upsample keys in weight conversion * improve padding in merge_patches * use double-loop for merge * update comments * create feature_extractor, reduce some forward code * introduce config.use_mask_token in dinov2 * minor fixes * minor fixes for onnx * update __init__ to latest format * remove DepthProConfig.to_dict() * major changes in backbone * update config in weight conversion * formatting * converted model is fp32 * improve naming and docs for feature_extractor->reconstruct_feature_maps * minor fixes; amid review * create intermediate vars in func call * use torch.testing.assert_close * use ModuleList instead of Sequential and ModuleDict * update docs * include fov in integraiton tests * update docs * improve initialization of convolution layers * fix unused fov keys * update tests * ruff format * fix test, amid kaimming initialization * add depthpro to toctree * add residual layer to _no_split_modules * architecture rework * Update src/transformers/models/depth_pro/image_processing_depth_pro.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * Update src/transformers/models/depth_pro/image_processing_depth_pro_fast.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * update docs * improve merge_patches * use flatten with fov_output * ruff formatting * update resources section in docs Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix typo "final_kernal_size" Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * fix output typehint for DepthProDepthEstimator Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * residual operation in 2 steps Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * use image_size instead of global patch_size in interpolation * replace all Sequential with ModuleList * update fov * update heads * fix and update conversion script for heads * ruff formatting * remove float32 conversion * use "Fov" instead of "FOV" in class names * use "Fov" instead of "FOV" in config docs * remove prune_heads * update fusion stage * use device in examples * update processor * ruff fixes * add do_rescale in image_processor_dict * skip test: test_fast_is_faster_than_slow * ruff formatting * DepthProImageProcessorFast in other files * revert antialias removal * add antialias in BaseImageProcessorFast * Revert "revert antialias removal" This reverts commit 5caa0bd8f9f7463b98410c04e6cfe8fef3adee18. * Revert "add antialias in BaseImageProcessorFast" This reverts commit 3ae1134780ae236872985523d9c0a444eabcc179. * update processor for grouping and antialias * try test_fast_is_faster_than_slow without "skip" or "flanky" * update checkpoint * update checkpoint * use @is_flanky for processor test * update checkpoint to "apple/DepthPro-hf" --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-02-10 11:32:45 +00:00
Raushan Turganbay	eebd2c972c	Chat template: update for processor (#35953 ) * update * we need batched nested input to always process correctly * update a bit * fix copies	2025-02-10 09:52:19 +01:00
Matt	a18b7fdd9e	Move audio top_k tests to the right file and add slow decorator (#36072 ) * Move audio top_k tests to the right file and add slow decorator because we load a real model * empty commit to trigger tests	2025-02-07 14:32:30 +00:00
Jade Choghari	006d9249ec	Adding RT-DETRv2 for object detection (#34773 ) * cookiecutter add rtdetrv2 * make modular working * working modelgit add . * working modelgit add . * finalize moduar inheritence * finalize moduar inheritence * Update src/transformers/models/rtdetrv2/modular_rtdetrv2.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * update modular and add rename * remove output ckpt * define loss_kwargs * fix CamelCase naming * fix naming + files * fix modular and convert file * additional changes * fix modular * fix import error (switch to lazy) * fix autobackbone * make style * add * update testing * fix loss * remove old folder * fix testing for v2 * update docstring * fix docstring * add resnetv2 (with modular bug to fix) * remove resnetv2 backbone * fix changes * small fixes * remove rtdetrv2resnetconfig * add rtdetrv2 name to convert * make style * Update docs/source/en/model_doc/rt_detr_v2.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/rt_detr_v2/modular_rt_detr_v2.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix modular typo after review * add reviewed changes * add final review changes * Update docs/source/en/model_doc/rt_detr_v2.md Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/rt_detr_v2/__init__.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * Update src/transformers/models/rt_detr_v2/convert_rt_detr_v2_weights_to_hf.py Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> * add review changes * remove rtdetrv2 resnet * removing this weird project change * change ckpt name from jadechoghari to author * implement review and update testing * update naming and remove wrong ckpt * name * make fix-copies * Fix RT-DETR loss * Add resources, fix name * Fix repo in docs * Fix table name --------- Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: qubvel <qubvel@gmail.com>	2025-02-06 19:28:45 +00:00
Matt	4563ba2c6f	Fix StopStringCriteria to handle tokens above len(tokenizer) (#35797 ) * Fix StopStringCriteria to handle tokens above len(tokenizer) This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup. This prevents index errors when model.config.vocab_size > len(tokenizer). The fix: 1. Adds a clamp operation to ensure token IDs are within bounds 2. Adds a test case to verify the behavior * Use self.stop_strings instead of stop_strings * Handle clipping correctly * make fixup * Update test to the new embedding vecs * Use much bigger values in the mismatch test * Typo fix * Slight simplification --------- Co-authored-by: openhands <openhands@all-hands.dev>	2025-02-06 16:53:28 +00:00
Zach Mueller	28f73bc307	Fix model kwargs (#35875 ) * Save state * Make a failing test * Better test * mpt -> done, many more to go * Rm extranious * Bamba * Bert * big_bird * biogpt * bloom * codegen * ctrl * data2vec * dbrx * Through up to Dbrx * electra * ernie * falcon * Fuyu/persimmon * Include noop kwargs to base models * Rebase * Skip musigen * Refactor/skip mllama * Revert makefile * Rm file * Fix PT failing, need to modify rest of loss funcs to not resize * Propagate some * Continue * More * More options * Mostly fixed * Proved that it's the same * Bloom is good * Make ability to override loss func possible * Fixup * Clean * Fix xglm * Quality tests * Skip OCR2 * Make specific loss for xglm * Make order the same/line up 1:1 * xglm * Skip fx output loss bloom model * Didn't pass in pad_token_id * Fix quality	2025-02-06 11:35:25 -05:00
湛露先生	1590c66430	Fix words typos in ggml test. (#36060 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2025-02-06 15:32:40 +00:00
Zach Mueller	1ce0e2992e	Nail in edge case of torch dtype being overriden permantly in the case of an error (#35845 ) * Nail in edge case of torch dtype * Rm unused func * Apply suggestions from code review Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> * Refactor tests to only mock what we need, don't introduce injection functions * SetUp/TearDown * Do super --------- Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>	2025-02-06 09:05:23 -05:00
Raushan Turganbay	3dd1de39bb	Paligemma: fix generation with Gemma2 (#36044 ) * fix paligemma * nit * use `kwargs` in models that can load any LM	2025-02-06 14:31:32 +01:00
Yih-Dar	dce9970884	Update `test_flash_attn_2_can_dispatch_composite_models` (#36050 ) * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-06 12:09:49 +01:00
Yaswanth Gali	7aee036e54	Iterative generation using Input embeds and `past_key_values` (#35890 ) * Iterative generation using input embeds * ruff fix * Added Testcase * Updated comment * ♻️ Refactored testcase * Skip test for these models * Continue generation using input embeds and cache * Skip generate_continue_from_embeds test * Refactor `prepare_input_for_generation` func * Continue generation using input embeds and cache * Modular changes fix * Overwrite 'prepare_inputs_for_generation' function	2025-02-06 11:06:05 +01:00
Sambhav Dixit	0de15c988b	Fix Audio Classification Pipeline top_k Documentation Mismatch and Bug #35736 (#35771 ) * added condition for top_k Doc mismatch fix * initilation of test file for top_k changes * added test for returning all labels * added test for few labels * tests/test_audio_classification_top_k.py * final fix * ruff fix --------- Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com>	2025-02-05 16:25:08 +00:00
Stas Bekman	9dc1efa5d4	DeepSpeed github repo move sync (#36021 ) deepspeed github repo move	2025-02-05 08:19:31 -08:00
Yoni Gozlan	fa56dcc2ab	Refactoring of ImageProcessorFast (#35069 ) * add init and base image processing functions * add add_fast_image_processor to transformers-cli * add working fast image processor clip * add fast image processor to doc, working tests * remove "to be implemented" SigLip * fix unprotected import * fix unprotected vision import * update ViTImageProcessorFast * increase threshold slow fast ewuivalence * add fast img blip * add fast class in tests with cli * improve cli * add fast image processor convnext * add LlavaPatchingMixin and fast image processor for llava_next and llava_onevision * add device kwarg to ImagesKwargs for fast processing on cuda * cleanup * fix unprotected import * group images by sizes and add batch processing * Add batch equivalence tests, skip when center_crop is used * cleanup * update init and cli * fix-copies * refactor convnext, cleanup base * fix * remove patching mixins, add piped torchvision transforms for ViT * fix unbatched processing * fix f strings * protect imports * change llava onevision to class transforms (test) * fix convnext * improve formatting (following Pavel review) * fix handling device arg * improve cli * fix * fix inits * Add distinction between preprocess and _preprocess, and support for arbitrary kwargs through valid_extra_kwargs * uniformize qwen2_vl fast * fix docstrings * add add fast image processor llava * remove min_pixels max_pixels from accepted size * nit * nit * refactor fast image processors docstrings * cleanup and remove fast class transforms * update add fast image processor transformers cli * cleanup docstring * uniformize pixtral fast and make _process_image explicit * fix prepare image structure llava next/onevision * Use typed kwargs instead of explicit args * nit fix import Unpack * clearly separate pops and gets in base preprocess. Use explicit typed kwargs * make qwen2_vl preprocess arguments hashable	2025-02-04 17:52:31 -05:00
David	8d73a38606	Add DAB-DETR for object detection (#30803 ) * initial commit * encoder+decoder layer changes WIP * architecture checks * working version of detection + segmentation * fix modeling outputs * fix return dict + output att/hs * found the position embedding masking bug * pre-training version * added iamge processors * typo in init.py * iterupdate set to false * fixed num_labels in class_output linear layer bias init * multihead attention shape fixes * test improvements * test update * dab-detr model_doc update * dab-detr model_doc update2 * test fix:test_retain_grad_hidden_states_attentions * config file clean and renaming variables * config file clean and renaming variables fix * updated convert_to_hf file * small fixes * style and qulity checks * return_dict fix * Merge branch main into add_dab_detr * small comment fix * skip test_inputs_embeds test * image processor updates + image processor test updates * check copies test fix update * updates for check_copies.py test * updates for check_copies.py test2 * tied weights fix * fixed image processing tests and fixed shared weights issues * added numpy nd array option to get_Expected_values method in test_image_processing_dab_detr.py * delete prints from test file * SafeTensor modification to solve HF Trainer issue * removing the safetensor modifications * make fix copies and hf uplaod has been added. * fixed index.md * fixed repo consistency * styel fix and dabdetrimageprocessor docstring update * requested modifications after the first review * Update src/transformers/models/dab_detr/image_processing_dab_detr.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * repo consistency has been fixed * update copied NestedTensor function after main merge * Update src/transformers/models/dab_detr/modeling_dab_detr.py Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com> * temp commit * temp commit2 * temp commit 3 * unit tests are fixed * fixed repo consistency * updated expected_boxes varible values based on related notebook results in DABDETRIntegrationTests file. * temporarialy config modifications and repo consistency fixes * Put dilation parameter back to config * pattern embeddings have been added to the rename_keys method * add dilation comment to config + add as an exception in check_config_attributes SPECIAL CASES * delete FeatureExtractor part from docs.md * requested modifications in modeling_dab_detr.py * [run_slow] dab_detr * deleted last segmentation code part, updated conversion script and changed the hf path in test files * temp commit of requested modifications * temp commit of requested modifications 2 * updated config file, resolved codepaths and refactored conversion script * updated decodelayer block types and refactored conversion script * style and quality update * small modifications based on the request * attentions are refactored * removed loss functions from modeling file, added loss function to lossutils, tried to move the MLP layer generation to config but it failed * deleted imageprocessor * fixed conversion script + quality and style * fixed config_att * [run_slow] dab_detr * changing model path in conversion file and in test file * fix Decoder variable naming * testing the old loss function * switched back to the new loss function and testing with the odl attention functions * switched back to the new last good result modeling file * moved back to the version when I asked the review * missing new line at the end of the file * old version test * turn back to newest mdoel versino but change image processor * style fix * style fix after merge main * [run_slow] dab_detr * [run_slow] dab_detr * added device and type for head bias data part * [run_slow] dab_detr * fixed model head bias data fill * changed test_inference_object_detection_head assertTrues to torch test assert_close * fixes part 1 * quality update * self.bbox_embed in decoder has been restored * changed Assert true torch closeall methods to torch testing assertclose * modelcard markdown file has been updated * deleted intemediate list from decoder module --------- Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>	2025-02-04 17:28:27 +00:00
Yih-Dar	fe52679e74	Update tests regarding attention types after #35235 (#36024 ) * update * update * update * dev-ci * more changes * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-02-04 18:04:47 +01:00
Marc Sun	9f486badd5	Display warning for unknown quants config instead of an error (#35963 ) * add supports_quant_method check * fix * add test and fix suggestions * change logic slightly --------- Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>	2025-02-04 15:17:01 +01:00
Sumit Vij	bc9a6d8302	Fix device mismatch error in Whisper model during feature extraction (#35866 ) * Fix device mismatch error in whisper feature extraction * Set default device * Address code review feedback --------- Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>	2025-02-04 12:23:08 +01:00
Cyril Vallez	9afb904b15	Refactor (and fix) gpt_neox (#35610 ) * start a nice modular * Update modular_gpt_neox.py * Update modular_gpt_neox.py * Update modular_gpt_neox.py * Update modular_gpt_neox.py * update * Update modular_gpt_neox.py * convert * fix attribute * fix attrs * oups * fix * fix * fix * fix * fix * fix order to pass test (see with accelerate team) * trigger CIs * modular * update * up * Update test_modeling_gpt_neox.py * Update test_modeling_gpt_neox.py * trigger CIs * correctly pass arg * simplify * remove key warning * update tp -> it's compatible since the view is before * trigger CIs	2025-02-04 11:18:43 +01:00
Ryoo Kwangrok	b1954fd64a	layernorm_decay_fix (#35927 ) * layernorm_decay_fix * W293 fix * ruff format fix * black format * ruff format * erase last layer * add test_get_parameter_names_rmsnorm * rmsnorm fix	2025-02-04 11:01:49 +01:00
Dmitry Tarasov	2ba040a71f	apply_chat_template: consistent behaviour for return_assistant_tokens_mask=True return_tensors=True (#35582 ) * apply_chat_template: consistent return_tensors behaviour with return_assistant_tokens_mask flag * test_chat_template_return_assistant_tokens_mask: support tokenizers with no attention mask * test_chat_template_return_assistant_tokens_mask: skip tokenizers with no padding token * test_chat_template_return_assistant_tokens_mask: force tokenizer padding_side=right --------- Co-authored-by: Eduard Allakhverdov <goncharova@airi.net> Co-authored-by: d.tarasov <d.tarasov@airi.net>	2025-02-04 10:27:52 +01:00
Raushan Turganbay	5d75a25b03	Qwen2-VL: fix rope delta calculation (#36013 ) * fix rope delats calculation * add test * style	2025-02-04 09:48:29 +01:00
Alex Brooks	e284c7e954	Update Granite Vision Model Path / Tests (#35998 ) * Update granite vision model path Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Enable granite vision test Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-02-03 20:06:03 +01:00
Arthur	7eecdf2a86	Update-tp test (#35844 ) * update test for now * up * cleanup * update todo	2025-02-03 09:37:02 +01:00
Yoni Gozlan	2b46943195	Add GOT-OCR 2.0 to Transformers (#34721 ) * init modular got_ocr2 * Get correct got_ocr architecture * add processing * run modular with processing * add working inference * apply modular * Refactor and fix style * Refactor, cleanup, fix style * fix init order * Fix docs * add base modeling tests * fix style and consistency * rename doc file * fix repo consistency * fix inference with box * add image processing and support for crop_to_multi_page * Fix batch inference * add tests * fixup * fix slow test * fix docstrings * Add model doc * update to new init * fix input autocast pixel_values dtype * update doc * move doc to multimodal * Reformat crop_image_to_patches and add docstrings * Fix example in forward docstring * Address Pablo review * [run slow] got_ocr2 * remove defaults defined twice * apply modular * add torch_device to integration tests * update modular * follow-up Pavel review * add device variable in doc * fix doc multi-page * Force eager attention for vision encoder to avoid attn implementation conflict * revert qwen2vl doc changes * use Qwen2ForCausalLM instead of Qwen2Model * make fixup * refactor gotocr2 to llava style * uniformize function names and reduce checks * final nits * fix pixel_values dtype error * change checkpoint names * fix modular	2025-01-31 11:28:13 -05:00
Yoni Gozlan	d7188ba600	Add support for nested images to LLava and VipLLava (#35558 ) * move make_flat_list_of_images and make_batched_videos to image_utils * remove unnecessary is_vision_available * move make_nested_list_of_images to image_utils * fix fast pixtral image processor * fix import mllama * fix make_nested_list_of_images * add tests * convert 4d arrays/tensors to list * add test_make_batched_videos * add support nested batch of videos * fix image processing qwen2vl	2025-01-30 16:49:20 -05:00
Marcel	e4227eb4d4	Handle empty change indices in SAM's mask to rle conversion (#35665 ) * Handle empty change indices in RLE conversion for masks * [test] Add unit tests for RLE encoding of masks in SamProcessor * [test] Update RLE conversion tests to use TensorFlow implementation * [test] Fix formatting in SamProcessorTest according to check_code_quality action * [test] Fix formatting in SamProcessorTest according to check_code_quality * [test] Refactored rle test cases into one test and used tf tensors in tf test cases * [test] Fix: removed self parameter from refactored methods * [test] Removed nested methods in run-length encoding tests for PyTorch and TensorFlow * [test] Added description to individual to run-length encoding tests for PyTorch and TensorFlow.	2025-01-30 19:08:38 +00:00
Yih-Dar	5757681837	Less flaky for `TimmBackboneModelTest::test_batching_equivalence` (#35971 ) * fix * remove is_flaky * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-30 16:56:26 +01:00
Raushan Turganbay	365fecb4d0	Whisper: fix static cache CI (#35852 ) * fix * remove overriden method * small change	2025-01-30 12:43:00 +01:00
Raushan Turganbay	9725e5be2f	Pixtral: vectorize patch embeddings and enable tests (#35122 ) * initial POC * - batch mix feature * fix tests * fix tests * make style * do not skip and instead fix tests * update * return back the test * correct text with the correct ckpt	2025-01-30 12:40:18 +01:00
Joao Gante	8bc4c89ee9	[bart] minor test fixes (#35965 ) fix tests	2025-01-30 10:00:11 +00:00
Joao Gante	4d3b1076a1	[generate] move max time tests (#35962 ) * move max time tests to their right place * move test to the right place	2025-01-29 17:56:46 +00:00
Fanli Lin	f0ae65c198	[tests] further fix `Tester object has no attribute '_testMethodName'` (#35781 ) * bug fix * update with more cases * more entries * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 16:05:33 +01:00
Jonas Rohw	23d782ead2	Output dicts support in text generation pipeline (#35092 ) * Support for generate_argument: return_dict_in_generate=True, instead of returning a error * fix: call test with return_dict_in_generate=True * fix: Only import torch if it is present * update: Encapsulate output_dict changes * fix: added back original comments --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2025-01-29 14:44:46 +00:00
Yih-Dar	cf90404807	Fix flaky `test_assisted_decoding_matches_greedy_search` (#35951 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:50:07 +01:00
Yih-Dar	c600e89f5c	Update `unwrap_and_save_reload_schedule` to use `weights_only=False` (#35952 ) * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-29 14:30:57 +01:00
Nadav Timor	42c8ccfd4c	fix `test_generated_length_assisted_generation` (#34935 ) fix test_generated_length_assisted_generation	2025-01-29 12:03:45 +00:00
Joao Gante	ece8c42488	Test: generate with `torch.compile(model.forward)` as a fast test (#34544 )	2025-01-28 14:10:38 +00:00
Cyril Vallez	f48ecd7608	Fix TP initialization (#35860 ) * fix tp * Update modeling_utils.py * style * style * Update test_tp.py * Update test_tp.py * style * Update test_tp.py * Update test_tp.py * Update test_tp.py * Update test_tp.py	2025-01-28 15:07:37 +01:00
Raushan Turganbay	f85ba20449	Qwen-2-5-VL: fix CI (#35935 ) fix	2025-01-28 14:51:57 +01:00
Cyril Vallez	3f860dba55	Fix mask slicing for models with HybridCache (#35681 ) * correctly slice * check mask * Update modular_gemma2.py * fix * add tests * fix typo * finally fix mask slicing * Finally correctly slice in all cases!! * add test for all attention functions * small fix in tests * trick around dynamo tracing issue * last update * more robust * kwargs propagation * make it explicit for checkpointing * apply modular	2025-01-28 14:35:00 +01:00
Raushan Turganbay	b764c20b09	Fix: loading DBRX back from saved path (#35728 ) * fix dtype as dict for some models + add test * add comment in tests	2025-01-28 11:38:45 +01:00
Isotr0py	e57b459997	Split and clean up GGUF quantization tests (#35502 ) * clean up ggml test Signed-off-by: Isotr0py <2037008807@qq.com> * port remaining tests Signed-off-by: Isotr0py <2037008807@qq.com> * further cleanup Signed-off-by: Isotr0py <2037008807@qq.com> * format Signed-off-by: Isotr0py <2037008807@qq.com> * fix broken tests Signed-off-by: Isotr0py <2037008807@qq.com> * update comment Signed-off-by: Isotr0py <2037008807@qq.com> * fix Signed-off-by: Isotr0py <2037008807@qq.com> * reorganize tests Signed-off-by: Isotr0py <2037008807@qq.com> * k-quants use qwen2.5-0.5B Signed-off-by: Isotr0py <2037008807@qq.com> * move ggml tokenization test Signed-off-by: Isotr0py <2037008807@qq.com> * remove dead code Signed-off-by: Isotr0py <2037008807@qq.com> * add assert for serilization test Signed-off-by: Isotr0py <2037008807@qq.com> * use str for parameterize Signed-off-by: Isotr0py <2037008807@qq.com> --------- Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-27 15:46:57 +01:00
Mikhail Moskovchenko	5450e7c84a	🔴 🔴 🔴 Added `segmentation maps` support for DPT image processor (#34345 ) * Added `segmentation_maps` support for DPT image processor * Added tests for dpt image processor * Moved preprocessing into separate functions * Added # Copied from statements * Fixed # Copied from statements * Added `segmentation_maps` support for DPT image processor * Added tests for dpt image processor * Moved preprocessing into separate functions * Added # Copied from statements * Fixed # Copied from statements	2025-01-27 15:14:00 +01:00
pglorio	33cb1f7b61	Add Zamba2 (#34517 ) * First commit * Finish model implementation * First commit * Finish model implementation * Register zamba2 * generated modeling and configuration * generated modeling and configuration * added hybrid cache * fix attention_mask in mamba * dropped unused loras * fix flash2 * config docstrings * fix config and fwd pass * make fixup fixes * text_modeling_zamba2 * small fixes * make fixup fixes * Fix modular model converter * added inheritances in modular, renamed zamba cache * modular rebase * new modular conversion * fix generated modeling file * fixed import for Zamba2RMSNormGated * modular file cleanup * make fixup and model tests * dropped inheritance for Zamba2PreTrainedModel * make fixup and unit tests * Add inheritance of rope from GemmaRotaryEmbedding * moved rope to model init * drop del self.self_attn and del self.feed_forward * fix tests * renamed lora -> adapter * rewrote adapter implementation * fixed tests * Fix torch_forward in mamba2 layer * Fix torch_forward in mamba2 layer * Fix torch_forward in mamba2 layer * Dropped adapter in-place sum * removed rope from attention init * updated rope * created get_layers method * make fixup fix * make fixup fixes * make fixup fixes * update to new attention standard * update to new attention standard * make fixup fixes * minor fixes * cache_position * removed cache_position postion_ids use_cache * remove config from modular * removed config from modular (2) * import apply_rotary_pos_emb from llama * fixed rope_kwargs * Instantiate cache in Zamba2Model * fix cache * fix @slow decorator * small fix in modular file * Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * several minor fixes * inherit mamba2decoder fwd and drop position_ids in mamba * removed docstrings from modular * reinstate zamba2 attention decoder fwd * use regex for tied keys * Revert "use regex for tied keys" This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5. * use regex for tied keys * add cpu to slow forward tests * dropped config.use_shared_mlp_adapter * Update docs/source/en/model_doc/zamba2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * re-convert from modular --------- Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-27 10:51:23 +01:00
Arthur	b912f5ee43	use torch.testing.assertclose instead to get more details about error in cis (#35659) * use torch.testing.assertclose instead to get more details about error in cis * fix * style * test_all * revert for I bert * fixes and updates * more image processing fixes * more image processors * fix mamba and co * style * less strick * ok I won't be strict * skip and be done * up	2025-01-24 16:55:28 +01:00
CalOmnie	b5aaf87509	Fix `test_pipelines_video_classification` that was always failing (#35842) * Fix test_pipelines_video_classification that was always failing * Update video pipeline docstring to reflect actual return type --------- Co-authored-by: Louis Groux <louis.cal.groux@gmail.com>	2025-01-23 19:22:32 +01:00
Alex Brooks	71cc8161b2	Granite Vision Support (#35579) * Add multimodal granite support Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Support multiple image feature layres Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Remove failing validation for visual encoders with no cls Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Update llava based models / configs to support list of feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Add tests for multiple feature layers Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Use conditional instead of except for misaligned feature shapes Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * crop cls from each hidden state Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Fix formatting Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Support single vision feature int in vipllava Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> * Fix typo in vision feature selection strategy validation Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add tentative integration test for granite vision models Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Add granite vision docs Replace multimodal granite refs with granite vision Add granite vision / llava next alias Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> * Use image url in granitevision example Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com> --------- Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>	2025-01-23 17:15:52 +01:00

4481 commits