Cyril Vallez
46276f9a7f
Fix modular edge case + modular sorting order ( #35562 )
...
* look-ahead negation
* re add examples by default
* Fix the bug in topological sort
* Update create_dependency_mapping.py
* start adding test
* finalize test
* more tests
* style
* style
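The "Fix the bug in topological sort" commit concerns ordering modular files by their dependencies in `create_dependency_mapping.py`. As an illustrative sketch only (a standard Kahn's-algorithm ordering, not the actual transformers code; the function and argument names here are assumptions):

```python
from collections import deque

def topological_sort(dependencies):
    """Order items so each appears after everything it depends on.

    `dependencies` maps an item to the set of items it depends on.
    Ties are broken alphabetically so the output is deterministic.
    """
    # Count unresolved dependencies for each item.
    indegree = {node: len(deps) for node, deps in dependencies.items()}
    # Reverse edges: who depends on me?
    dependents = {node: [] for node in dependencies}
    for node, deps in dependencies.items():
        for dep in deps:
            dependents[dep].append(node)

    queue = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(dependents[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if len(order) != len(dependencies):
        raise ValueError("Cyclic dependency detected")
    return order
```

A stable tie-break matters here because the PR title also mentions fixing the modular sorting order: without it, the generated file order could change between runs.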
2025-01-09 17:17:52 +01:00
Cyril Vallez
965a2fb320
More model refactoring! ( #35359 )
...
* cohere
* style
* phi3
* style
* small fix
* small fix
* phi3 longrope
* oups
* Update rope (only for phi3 still)
* Update test_modeling_rope_utils.py
* Update modeling_phi3.py
* fix
* fix copies
* style
* Fix copied from bad renaming
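The rope-related commits above ("Update rope", `test_modeling_rope_utils.py`) revolve around rotary position embeddings. A minimal sketch of the underlying math, not the actual `rope_utils` code (Phi-3's LongRoPE additionally rescales these frequencies with per-dimension factors, which is omitted here):

```python
import math

def rope_inv_freq(dim, base=10000.0):
    # Standard RoPE inverse frequencies: 1 / base^(2i/dim) for each pair index i.
    return [1.0 / (base ** (2 * i / dim)) for i in range(dim // 2)]

def rotate_pair(x, y, position, inv_freq):
    # Rotate one (x, y) feature pair by the angle position * inv_freq.
    angle = position * inv_freq
    cos, sin = math.cos(angle), math.sin(angle)
    return x * cos - y * sin, x * sin + y * cos
```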
2025-01-09 11:09:09 +01:00
NielsRogge
8490d3159c
Add ViTPose ( #30530 )
...
* First draft
* Make fixup
* Make forward pass work
* Improve code
* More improvements
* More improvements
* Make predictions match
* More improvements
* Improve image processor
* Fix model tests
* Add classic decoder
* Convert classic decoder
* Verify image processor
* Fix classic decoder logits
* Clean up
* Add post_process_pose_estimation
* Improve post_process_pose_estimation
* Use AutoBackbone
* Add support for MoE models
* Fix tests, improve num_experts
* Improve variable names
* Make fixup
* More improvements
* Improve post_process_pose_estimation
* Compute centers and scales
* Improve postprocessing
* More improvements
* Fix ViTPoseBackbone tests
* Add docstrings, fix image processor tests
* Update index
* Use is_cv2_available
* Add model to toctree
* Add cv2 to doc tests
* Remove script
* Improve conversion script
* Add coco_to_pascal_voc
* Add box_to_center_and_scale to image_transforms
* Update tests
* Add integration test
* Fix merge
* Address comments
* Replace numpy by pytorch, improve docstrings
* Remove get_input_embeddings
* Address comments
* Move coco_to_pascal_voc
* Address comment
* Fix style
* Address comments
* Fix test
* Address comment
* Remove udp
* Remove comment
* [WIP] need to check if the numpy function is same as cv
* add scipy affine_transform
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* refactor convert
* add output_shape
* add atol 5e-2
* Use hf_hub_download in conversion script
* make box_to_center more applicable
* skip test_get_set_embedding
* fix to accept array and fix CI
* add co-contributor
* make it to tensor type output
* add torch
* change to torch tensor
* add more test
* minor change
* CI test change
* import torch should be above ImageProcessor
* make style
* try not use torch in def
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/vitpose_backbone/configuration_vitpose_backbone.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix
* fix
* add caution
* make more detail about dataset_index
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
* add docs
* Update docs/source/en/model_doc/vitpose.md
* Update src/transformers/models/vitpose/configuration_vitpose.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Revert "Update src/transformers/__init__.py"
This reverts commit 7ffa504450bb9dbccf9c7ea668441b98a1939d5c.
* change name
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/vitpose/test_modeling_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* move vitpose only function to image_processor
* raise valueerror when using timm backbone
* use out_indices
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove camel-case of def flip_back
* rename vitposeEstimatorOutput
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix confused camelcase of MLP
* remove in-place logic
* clear scale description
* make consistent batch format
* docs update
* formatting docstring
* add batch tests
* test docs change
* Update src/transformers/models/vitpose/image_processing_vitpose.py
* Update src/transformers/models/vitpose/configuration_vitpose.py
* change ViT to Vit
* change to enable MoE
* make fix-copies
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* extract udp
* add more described docs
* simple fix
* change to accept target_size
* make style
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/vitpose/configuration_vitpose.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change to `verify_backbone_config_arguments`
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove unnecessary copy
* make config immutable
* enable gradient checkpointing
* update inappropriate docstring
* linting docs
* split function for visibility
* make style
* check isinstances
* change to acceptable use_pretrained_backbone
* make style
* remove copy in docs
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/vitpose/modeling_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* simple fix + make style
* change input config of activation function to string
* Update docs/source/en/model_doc/vitpose.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* tmp docs
* delete index.md
* make fix-copies
* simple fix
* change conversion to sam2/mllama style
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/vitpose/image_processing_vitpose.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* refactor convert
* add supervision
* Update src/transformers/models/vitpose_backbone/modeling_vitpose_backbone.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* remove redundant def
* separate code block for visualization
* add validation for num_moe
* final commit
* add labels
* [run-slow] vitpose, vitpose_backbone
* Update src/transformers/models/vitpose/convert_vitpose_to_hf.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* enable all conversion
* final commit
* [run-slow] vitpose, vitpose_backbone
* ruff check --fix
* [run-slow] vitpose, vitpose_backbone
* rename split module
* [run-slow] vitpose, vitpose_backbone
* fix pos_embed
* Simplify init
* Revert "fix pos_embed"
This reverts commit 2c56a4806e30bc9b5753b142fa04b913306c54ff.
* refactor single loop
* allow flag to enable custom model
* efficiency of MoE to not use unused experts
* make style
* Fix range -> arange to avoid warning
* Revert MOE router, a new one does not work
* Fix postprocessing a bit (labels)
* Fix type hint
* Fix docs snippets
* Fix links to checkpoints
* Fix checkpoints in tests
* Fix test
* Add image to docs
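Several bullets above add box-format helpers (`coco_to_pascal_voc`, `box_to_center_and_scale`). As an illustrative sketch of what such conversions do, not the actual transformers implementations (the `padding` factor below is an assumption typical of top-down pose pipelines):

```python
def coco_to_pascal_voc(box):
    # COCO boxes are [x_min, y_min, width, height];
    # Pascal VOC boxes are [x_min, y_min, x_max, y_max].
    x, y, w, h = box
    return [x, y, x + w, y + h]

def box_to_center_and_scale(box, padding=1.25):
    # Center of a COCO box, plus a padded (w, h) scale so the crop
    # fed to the pose model includes some context around the person.
    x, y, w, h = box
    center = (x + w / 2.0, y + h / 2.0)
    scale = (w * padding, h * padding)
    return center, scale
```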
---------
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: sangbumchoi <danielsejong55@gmail.com>
Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 16:02:14 +00:00
Yoni Gozlan
651cfb400f
Add support for modular with fast image processors ( #35379 )
...
* Add support for modular with fast image processors
* fix order and remove copied from
* add comment for "image_processing*_fast"
2025-01-08 08:37:57 -05:00
Raushan Turganbay
d1681ec2b6
VLMs: major clean up 🧼 ( #34502 )
...
only llava models are modified
2025-01-08 10:35:23 +01:00
Jade Choghari
7176e06b52
Add TextNet ( #34979 )
...
* WIP
* Add config and modeling for Fast model
* Refactor modeling and add tests
* More changes
* WIP
* Add tests
* Add conversion script
* Add conversion scripts, integration tests, image processor
* Fix style and copies
* Add fast model to init
* Add fast model in docs and other places
* Fix import of cv2
* Rename image processing method
* Fix build
* Fix Build
* fix style and fix copies
* Fix build
* Fix build
* Fix Build
* Clean up docstrings
* Fix Build
* Fix Build
* Fix Build
* Fix build
* Add test for image_processing_fast and add documentation tests
* some refactorings
* Fix failing tests
* Incorporate PR feedbacks (×5)
* Introduce TextNet
* Fix failures
* Refactor textnet model
* Fix failures
* Add cv2 to setup
* Fix failures
* Fix failures
* Add CV2 dependency
* Fix bugs
* Fix build issue
* Fix failures
* Remove textnet from modeling fast
* Fix build and other things
* Fix build
* some cleanups
* some cleanups
* Some more cleanups
* Fix build
* Incorporate PR feedbacks
* More cleanup
* More cleanup
* More cleanup
* Fix build
* Remove all the references of fast model
* More cleanup
* Fix build
* Incorporate PR feedbacks (×10)
* Fix build (×6)
* Incorporate PR feedbacks
* Fix style
* Fix build
* Incorporate PR feedbacks
* Fix image processing mean and std
* Incorporate PR feedbacks
* fix build failure
* Add assertion to image processor
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* fix style failures
* fix build
* Fix ImageClassification's linear layer, also introduce TextNetImageProcessor
* Fix build
* Fix build
* Fix build
* Fix build
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix build
* Incorporate PR feedbacks
* Remove some script
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix image processing in textnet
* Incorporate PR Feedbacks
* Fix CI failures
* Fix failing test (×6)
* Add textnet to readme
* Improve readability
* Incorporate PR feedbacks
* fix code style
* fix key error and convert working
* tvlt shouldn't be here
* fix test modeling test
* Fix tests, make fixup
* Make fixup
* Make fixup
* Remove TEXTNET_PRETRAINED_MODEL_ARCHIVE_LIST
* improve type annotation
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update tests/models/textnet/test_image_processing_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* improve type annotation
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* space typo
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* improve type annotation
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/configuration_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* make conv layer kernel sizes and strides default to None
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* fix keyword bug
* add batch init and make fixup
* Make fixup
* Update integration test
* Add figure
* Update textnet.md
* add testing and fix errors (classification, imgprocess)
* fix error check
* make fixup
* make fixup
* revert to original docstring
* add make style
* remove conflict for now
* Update modeling_auto.py
got a confusion in `timm_wrapper` - was giving some conflicts
* Update tests/models/textnet/test_modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update tests/models/textnet/test_modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* Update src/transformers/models/textnet/modeling_textnet.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
* add changes
* Update textnet.md
* add doc
* add authors hf ckpt + rename
* add feedback: classifier/docs
---------
Co-authored-by: raghavanone <opensourcemaniacfreak@gmail.com>
Co-authored-by: jadechoghari <jadechoghari@users.noreply.huggingface.co>
Co-authored-by: Niels <niels.rogge1@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
2025-01-08 09:52:51 +01:00
Lysandre Debut
b2f2977533
Applies the rest of the init refactor except to modular files ( #35238 )
...
* [test_all] Applies the rest of the init refactor except to modular files
* Revert modular that doesn't work
* [test_all] TFGPT2Tokenizer
2025-01-05 18:30:08 +01:00
NielsRogge
6e0515e99c
Add DINOv2 with registers ( #35348 )
...
* added changes from 32905
* fixed mistakes caused by select all paste
* rename diff_dinov2...
* ran tests
* Fix modular
* Fix tests
* Use new init
* Simplify drop path
* Convert all checkpoints
* Add figure and summary
* Update paths
* Update docs
* Update docs
* Update toctree
* Update docs
---------
Co-authored-by: BernardZach <bernardzach00@gmail.com>
Co-authored-by: Zach Bernard <132859071+BernardZach@users.noreply.github.com>
2024-12-24 13:21:59 +01:00
Arthur
6fae2a84ae
Update test fetcher when we want to test all ( #35364 )
...
* [test-all]
* style
* [test-all]
* [test_all]
* [test_all]
* style
2024-12-20 15:10:43 +01:00
Yu Chin Fabian Lim
9613933b02
Add the Bamba Model ( #34982 )
...
* initial commit for PR
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
* rename dynamic cache
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add more unit tests
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add integration test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* add integration test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Add modular bamba file
* Remove trainer changes from unrelated PR
* Modify modular and cofig to get model running
* Fix some CI errors and beam search
* Fix a plethora of bugs from CI/docs/etc
* Add bamba to models with special caches
* Update to newer mamba PR for mamba sublayer
* fix test_left_padding_compatibility
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix remaining tests
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* missed this test
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* ran make style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* move slow tag to integration obj
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* make style
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* address comments
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* fix modular
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* left out one part of modular
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* change model
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Make Rotary modular as well
* Update bamba.md
Added overview, updated the model inference card, and added config
* Update bamba.md
* Update bamba.md
* Update bamba.md
Minor fixes
* Add docs for config and model back
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Add warning when using fast kernels
* replaced generate example
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
* Address comments from PR
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Propagate attention fixes
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Fix attention interfaces to the new API
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Fix API for decoder layer
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
* Remove extra weights
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
---------
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
2024-12-18 20:18:17 +01:00
Arthur
2c47618c1a
🚨 All attention refactor 🚨 ( #35235 )
...
* refactor LlamaAttention
* minimal changes
* fix llama
* update
* modular gemmas
* modular nits
* modular updates
* nits
* simplify
* gpt2
* more modular and fixes
* granite
* modular modular modular
* nits
* update
* qwen2 + starcoder2
* mostly gemma2
* Update image_processing_auto.py
* fix
* Update modular_starcoder2.py
* fix
* remove all copied from attentions
* remove gcv
* make fix-copies
* oups
* oups2.0
* fix some modulars + all copied from
* should be good now
* revert unwanted changes
* Update modeling_decision_transformer.py
* finish cleanup
* Update modeling_olmo.py
* consistency
* re-add gradient checkpointing attribute
* fix
* style
* make config necessary
* bis
* bis
* Update modeling_my_new_model2.py
* is_causal attr
* fix
* remove past kv return from decoder layer
* fix
* default rope config
* correctly fix rope config
* fix bias
* fix gpt2 attention output
* fix test
* fix inits
* fix default sdpa
* fix default sdpa implementation
* harmonize classes
* fix mistral
* fix sliding window models
* mixtral
* be more explicit
* style
* fix
* several fixes
* Update modeling_dbrx.py
* fix test
* olmo + phi
* rotary
* syle
* phi
* phi again
* again
* kwargs
* Update test_modeling_common.py
* skip fx tracing tests
* Update modeling_utils.py
* gemma 2
* again
* Update modeling_recurrent_gemma.py
* gemma2
* granite
* style
* starcoder
* Update sdpa_attention.py
* switch args
* Update modeling_mllama.py
* fix
* cache type tests
* gpt2
* Update test_modeling_common.py
* fix
* consistency
* fix shape with encoder
* should be the last one
* tests non model
* most comments
* small oupsi
* be more explicit in modulars
* more explicit modulars
* CIs! it works locally
* add kwargs to _flash_attention_forward
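This refactor moves models onto shared per-backend attention functions (e.g. `sdpa_attention.py`) selected at runtime. The registry idea can be sketched as follows; the dispatch name mirrors the PR, but the pure-Python bodies here are illustrative only, not the transformers API:

```python
import math

def eager_attention(query, key, value):
    # query/key/value: lists of equal-length float vectors (one per position).
    dim = len(query[0])
    out = []
    for q in query:
        # Scaled dot-product scores against every key, then softmax.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in key]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, value)) for j in range(dim)])
    return out

# One entry per backend; real code would also register "sdpa", "flash_attention_2", ...
ALL_ATTENTION_FUNCTIONS = {"eager": eager_attention}

def attend(impl, query, key, value):
    return ALL_ATTENTION_FUNCTIONS[impl](query, key, value)
```

The point of the registry is that every model calls `attend`-style dispatch instead of carrying its own copied-from attention classes, which is what the "remove all copied from attentions" bullet refers to.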
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2024-12-18 16:53:39 +01:00
Yih-Dar
f1b7634fc8
Trigger GitHub CI with a comment on PR ( #35211 )
...
* fix
* fix
* comment
* final
* final
* final
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-18 13:56:49 +01:00
Matt
e0ae9b5974
🚨 🚨 🚨 Delete conversion scripts when making release wheels ( #35296 )
...
* Delete conversion scripts when making release wheels
* make fixup
* Update docstring
2024-12-17 14:18:42 +00:00
Billel Mokeddem
6c08b3b6e5
Add Falcon3 documentation ( #35307 )
...
* Add Falcon3 documentation
* Update Falcon3 documentation
* Change Falcon to Falcon3
* Update docs and run make fix-copies
* Add blog post and huggingface models links
2024-12-17 14:23:13 +01:00
Tony Wu
f33a0cebb3
Add ColPali to 🤗 transformers ( #33736 )
...
* feat: run `add-new-model-like`
* feat: add paligemma code with "copied from"
* feat: add ColPaliProcessor
* feat: add ColPaliModel
* feat: add ColPaliConfig
* feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel`
* fixup modeling colpali
* fix: fix root import shortcuts
* fix: fix `modeling_auto` dict
* feat: comment out ColPali test file
* fix: fix typos from `add-new-model-like`
* feat: explicit the forward input args
* feat: move everything to `modular_colpali.py`
* fix: put back ColPaliProcessor
* feat: add auto-generated files
* fix: run `fix-copies`
* fix: remove DOCSTRING constants to make modular converter work
* fix: fix typo + modular converter
* fix: add missing imports
* feat: no more errors when loading ColPaliModel
* fix: remove unused args in forward + tweak doc
* feat: rename `ColPaliModel` to `ColPaliForRetrieval`
* fix: apply `fix-copies`
* feat: add ColPaliProcessor to `modular_colpali`
* fix: run make quality + make style
* fix: remove duplicate line in configuration_auto
* feat: make ColPaliModel inherit from PaliGemmaForConditionalGeneration
* fix: tweak and use ColPaliConfig
* feat: rename `score` to `post_process_retrieval`
* build: run modular formatter + make style
* feat: convert colpali weights + fixes
* feat: remove old weight converter file
* feat: add and validate tests
* feat: replace hardcoded path to "vidore/colpali-v1.2-hf" in tests
* fix: add bfloat16 conversion in weight converter
* feat: replace pytest with unittest in modeling colpali test
* feat: add sanity check for weight conversion (doesn't work yet)
* feat: add shape sanity check in weight converter
* feat: make ColPaliProcessor args explicit
* doc: add doc for ColPali
* fix: trying to fix output mismatch
* feat: tweaks
* fix: ColPaliModelOutput inherits from ModelOutput instead of PaliGemmaCausalLMOutputWithPast
* fix: address comments on PR
* fix: adapt tests to the Hf norm
* wip: try things
* feat: add `__call__` method to `ColPaliProcessor`
* feat: remove need for dummy image in `process_queries`
* build: run new modular converter
* fix: fix incorrect method override
* Fix tests, processing, modular, convert
* fix tokenization auto
* hotfix: manually fix processor -> fixme once convert modular is fixed
* fix: convert weights working
* feat: rename and improve convert weight script
* feat: tweaks
* feat: remove `device` input for `post_process_retrieval`
* refactor: remove unused `get_torch_device`
* Fix all tests
* docs: update ColPali model doc
* wip: fix convert weights to hf
* fix logging modular
* docs: add acknowledgements in model doc
* docs: add missing docstring to ColPaliProcessor
* docs: tweak
* docs: add doc for `ColPaliForRetrievalOutput.forward`
* feat: add modifications from colpali-engine v0.3.2 in ColPaliProcessor
* fix: fix and upload colpali hf weights
* refactor: rename `post_process_retrieval` to `score_retrieval`
* fix: fix wrong typing for `score_retrieval`
* test: add integration test for ColPali
* chore: rerun convert modular
* build: fix root imports
* Update docs/source/en/index.md
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
* fix: address PR comments
* wip: reduce the prediction gap in weight conversion
* docs: add comment in weight conversion script
* docs: add example for `ColPaliForRetrieval.forward`
* tests: change dataset path to the new one in hf-internal
* fix: colpali weight conversion works
* test: add fine-grained check for ColPali integration test
* fix: fix typos in convert weight script
* docs: move input docstring in a variable
* fix: remove hardcoded torch device in test
* fix: run the new modular refactor
* docs: fix python example for ColPali
* feat: add option to choose `score_retrieval`'s output dtype and device
* docs: update doc for `score_retrieval`
* feat: add `patch_size` property in ColPali model
* chore: run `make fix-copies`
* docs: update description for ColPali cookbooks
* fix: remove `ignore_index` methods
* feat: remove non-transformers specific methods
* feat: update `__init__.py` to new hf format
* fix: fix root imports in transformers
* feat: remove ColPali's inheritance from PaliGemma
* Fix CI issues
* nit remove prints
* feat: remove ColPali config and model from `modular_colpali.py`
* feat: add `ColPaliPreTrainedModel` and update modeling and configuration code
* fix: fix auto-removed imports in root `__init__.py`
* fix: various fixes
* fix: fix `_init_weight`
* temp: comment `AutoModel.from_config` for experiments
* fix: add missing `output_attentions` arg in ColPali's forward
* fix: fix `resize_token_embeddings`
* fix: make `input_ids` optional in forward
* feat: rename `projection_layer` to `embedding_proj_layer`
* wip: fix convert colpali weight script
* fix tests and convert weights from original repo
* fix unprotected import
* fix unprotected torch import
* fix style
* change vlm_backbone_config to vlm_config
* fix unprotected import in modular this time
* fix: load config from Hub + tweaks in convert weight script
* docs: move example usage from model docstring to model markdown
* docs: fix input docstring for ColPali's forward method
* fix: use `sub_configs` for ColPaliConfig
* fix: remove non-needed sanity checks in weight conversion script + tweaks
* fix: fix issue with `replace_return_docstrings` in ColPali's `forward`
* docs: update docstring for `ColPaliConfig`
* test: change model path in ColPali test
* fix: fix ColPaliConfig
* fix: fix weight conversion script
* test: fix expected weights for ColPali model
* docs: update ColPali markdown
* docs: fix minor typo in ColPaliProcessor
* Fix tests and add _no_split_modules
* add text_config to colpali config
* [run slow] colpali
* move inputs to torch_device in integration test
* skip test_model_parallelism
* docs: clarify quickstart snippet in ColPali's model card
* docs: update ColPali's model card
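The `score_retrieval` method renamed above implements ColBERT-style late interaction (MaxSim) between query token embeddings and document patch embeddings. A pure-Python sketch of the scoring rule (the real method operates on torch tensors and batches):

```python
def score_retrieval(query_embeddings, doc_embeddings):
    """Late-interaction (MaxSim) score between one query and one document.

    For every query token embedding, take the maximum dot product over all
    document patch embeddings, then sum those maxima over query tokens.
    """
    score = 0.0
    for q in query_embeddings:
        score += max(sum(qi * di for qi, di in zip(q, d)) for d in doc_embeddings)
    return score
```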
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>
2024-12-17 11:26:43 +01:00
Arthur
a7f5479b45
fix modular order ( #35297 )
...
* fix modular order
* fix
* style
2024-12-17 08:05:35 +01:00
Yih-Dar
66531a1ec3
Aggregate test summary files in CircleCI workflow runs ( #34989 )
...
* fix (×35)
* try 1 (×41)
* fix (×3)
* update
* fix (×2)
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-12-16 11:06:17 +01:00
alexrs-cohere
64478c7631
Add Cohere2 model ( #35224 )
2024-12-13 09:35:50 +01:00
Pavel Iakubovskii
5fcf6286bf
Add TimmWrapper ( #34564 )
...
* Add files
* Init
* Add TimmWrapperModel
* Fix up
* Some fixes
* Fix up
* Remove old file
* Sort out import orders
* Fix some model loading
* Compatible with pipeline and trainer
* Fix up
* Delete test_timm_model_1/config.json
* Remove accidentally committed files
* Delete src/transformers/models/modeling_timm_wrapper.py
* Remove empty imports; fix transformations applied
* Tidy up
* Add image classification model to special cases
* Create pretrained model; enable device_map='auto'
* Enable most tests; fix init order
* Sort imports
* [run-slow] timm_wrapper
* Pass num_classes into timm.create_model
* Remove train transforms from image processor
* Update timm creation with pretrained=False
* Fix gamma/beta issue for timm models
* Fixing gamma and beta renaming for timm models
* Simplify config and model creation
* Remove attn_implementation diff
* Fixup
* Docstrings
* Fix warning msg text according to test case
* Fix device_map auto
* Set dtype and device for pixel_values in forward
* Enable output hidden states
* Enable tests for hidden_states and model parallel
* Remove default scriptable arg
* Refactor inner model
* Update timm version
* Fix _find_mismatched_keys function
* Change inheritance for Classification model (fix weights loading with device_map)
* Minor bugfix
* Disable save pretrained for image processor
* Rename hook method for loaded keys correction
* Rename state dict keys on save, remove `timm_model` prefix, make checkpoint compatible with `timm`
* Managing num_labels <-> num_classes attributes
* Enable loading checkpoints in Trainer to resume training
* Update error message for output_hidden_states
* Add output hidden states test
* Decouple base and classification models
* Add more test cases
* Add save-load-to-timm test
* Fix test name
* Fixup
* Add do_pooling
* Add test for do_pooling
* Fix doc
* Add tests for TimmWrapperModel
* Add validation for `num_classes=0` in timm config + test for DINO checkpoint
* Adjust atol for test
* Fix docs
* dev-ci
* dev-ci
* Add tests for image processor
* Update docs
* Update init to new format
* Update docs in configuration
* Fix some docs in image processor
* Improve docs for modeling
* fix for is_timm_checkpoint
* Update code examples
* Fix header
* Fix typehint
* Increase tolerance a bit
* Fix Path
* Fixing model parallel tests
* Disable "parallel" tests
* Add comment for metadata
* Refactor AutoImageProcessor for timm wrapper loading
* Remove custom test_model_outputs_equivalence
* Add require_timm decorator
* Fix comment
* Make image processor work with older timm versions and tensor input
* Save config instead of whole model in image processor tests
* Add docstring for `image_processor_filename`
* Sanitize kwargs for timm image processor
* Fix doc style
* Update check for tensor input
* Update normalize
* Remove _load_timm_model function
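The "Fix gamma/beta issue for timm models" bullets refer to renaming legacy normalization parameter names when loading checkpoints. As an illustrative sketch only (a plain dict rewrite, not the actual TimmWrapper loading hook):

```python
def rename_timm_keys(state_dict):
    # Older checkpoints store LayerNorm-style parameters as ".gamma"/".beta",
    # while transformers expects ".weight"/".bias". Rewrite the keys on load.
    renamed = {}
    for key, value in state_dict.items():
        new_key = key.replace(".gamma", ".weight").replace(".beta", ".bias")
        renamed[new_key] = value
    return renamed
```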
---------
Co-authored-by: Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
2024-12-11 12:40:30 +00:00
Aymeric Roucher
9ad4c93536
Add Aria ( #34157 )
...
* Add Aria
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-12-06 12:17:34 +01:00
João Marcelo
50189e36a6
Add I-JEPA ( #33125 )
...
* first draft
* add IJepaEmbeddings class
* fix copy-from for IJepa model
* add weight conversion script
* update attention class names in IJepa model
* style changes
* Add push_to_hub option to convert_ijepa_checkpoint function
* add initial tests for I-JEPA
* minor style changes to conversion script
* make fixup related
* rename conversion script
* Add I-JEPA to sdpa docs
* minor fixes
* adjust conversion script
* update conversion script
* adjust sdpa docs
* [run_slow] ijepa
* [run-slow] ijepa
* [run-slow] ijepa
* [run-slow] ijepa
* [run-slow] ijepa
* [run-slow] ijepa
* formatting issues
* adjust modeling to modular code
* add IJepaModel to objects to ignore in docstring checks
* [run-slow] ijepa
* fix formatting issues
* add usage instruction snippet to docs
* change pos encoding, add checkpoint for doc
* add verify logits for all models
* [run-slow] ijepa
* update docs to include image feature extraction instructions
* remove pooling layer from IJepaModel in image classification class
* [run-slow] ijepa
* remove pooling layer from IJepaModel constructor
* update docs
* [run-slow] ijepa
* [run-slow] ijepa
* small changes
* [run-slow] ijepa
* style adjustments
* update copyright in init file
* adjust modular ijepa
* [run-slow] ijepa
2024-12-05 16:14:46 +01:00
Cyril Vallez
1da1e0d7f2
Support for easier multimodal use of modular ( #35056 )
...
* update modular and add examples
* style
* improve example comments
* style
* fix small logic issue for imports
* fix relative order issue when files do not make sense
* Improve comments
* trigger CIs
2024-12-04 15:13:11 +01:00
Yih-Dar
6300212946
Fix utils/check_bad_commit.py (for auto ping in CI) ( #34943 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-28 15:34:38 +01:00
Yoni Gozlan
3a8eb74668
Fix support for image processors modifications in modular ( #34866 )
...
* add fix and examples
* fix camel case naming
2024-11-22 18:14:24 -05:00
Cyril Vallez
4e90b99ed9
Refactor StarCoder2 using modular ( #34015 )
...
* Create modular_starcoder2.py
* Update modular_starcoder2.py
* update
* finalize modular
* revert # no-unravel
* Add support
* style
* Update modular_model_converter.py
* update docstring
2024-11-21 14:52:39 +01:00
Yih-Dar
40821a2478
Fix CI slack reporting issue ( #34833 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-11-20 21:36:13 +01:00
Cyril Vallez
e3a5889ef0
Modular fix ( #34802 )
...
* Modular fix
* style
* remove logger warning
* Update modular_model_converter.py
2024-11-19 16:08:57 +01:00
Arthur
4bff54f921
Gemma capping ( #34282 )
...
* softcapping
* soft cap before the mask
* style
* ...
* super nit
* update
* fixes
* update
* small issue with modular
* fix modular imports
* update
* fixup
* simplify a hell lot
* simplify cleaning imports
* finish fixing
* update our design
* nits
* use a deprecation cycle
* updates
* Fix modular (recursive deps need to always be computed after merges!)
* push
* fix
* update
* fix modular order
* make fix-copies
* updates
* update
* ?
* don't compile for now
* ?
* fix some stuff
* donc!
* fix copies
* update
* fixup
* ?
* fix two tests
* fix?
* for now, don't use head info
* eager when output attention and sdpa or flash as it's the simplest behaviour (for our tests as well :))
* fix-copies
* revert sdpa check
* Apply suggestions from code review
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
* rebase, fix-copies and push
* add a slow integration test
* update the test
* fix left padding issue
* fix test
* remove duplicate scaling
* quality
* add a small test and make sure it works
* 2b
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
2024-11-19 13:52:38 +01:00
Cyril Vallez
e2ac16b28a
Large modular logic refactoring ( #34487 )
...
* rework converter
* Update modular_model_converter.py
* Update modular_model_converter.py
* Update modular_model_converter.py
* Update modular_model_converter.py
* cleaning
* cleaning
* finalize imports
* imports
* Update modular_model_converter.py
* Better renaming to avoid visiting same file multiple times
* start converting files
* style
* address most comments
* style
* remove unused stuff in get_needed_imports
* style
* move class dependency functions outside class
* Move main functions outside class
* style
* Update modular_model_converter.py
* rename func
* add augmented dependencies
* Update modular_model_converter.py
* Add types_to_file_type + tweak annotation handling
* Allow assignment dependency mapping + fix regex
* style + update modular examples
* fix modular_roberta example (wrong redefinition of __init__)
* slightly correct order in which dependencies will appear
* style
* review comments
* Performance + better handling of dependencies when they are imported
* style
* Add advanced new classes capabilities
* style
* add forgotten check
* Update modeling_llava_next_video.py
* Add priority list ordering in check_conversion as well
* Update check_modular_conversion.py
* Update configuration_gemma.py
2024-11-01 10:13:51 +01:00
Yoni Gozlan
203e27059b
Add image text to text pipeline ( #34170 )
...
* Standardize image-text-to-text-models-output
add post_process_image_text_to_text to chameleon and cleanup
Fix legacy kwarg behavior and deprecation warning
add post_process_image_text_to_text to qwen2_vl and llava_onevision
Add post_process_image_text_to_text to idefics3, mllama, pixtral processor
* nit var name post_process_image_text_to_text udop
* nit fix deprecation warnings
* Add image-text-to-text pipeline
* add support for image url in chat template for pipeline
* Reformat to be fully compatible with chat templates
* Add tests chat template
* Fix imports and tests
* Add pipeline tag
* change logic handling of single prompt and multiple images
* add pipeline mapping to models
* fix batched inference
* fix tests
* Add manual batching for preprocessing
* Fix outputs with nested images
* Add support for all common processing kwargs
* Add default padding when multiple text inputs (batch size>1)
* nit change version deprecation warning
* Add support for text only inference
* add chat_template warnings
* Add pipeline tests and add copied from post process function
* Fix batched pipeline tests
* nit
* Fix pipeline tests blip2
* remove unnecessary max_new_tokens
* revert processing kosmos2 and remove unnecessary max_new_tokens
* fix pipeline tests idefics
* Force try loading processor if pipeline supports it
* revert load_processor change
* hardcode loading only processor
* remove unnecessary try except
* skip imagetexttotext tests for kosmos2 as tiny model causes problems
* Make code clearer
* Address review comments
* remove preprocessing logic from pipeline
* fix fuyu
* add BC resize fuyu
* Move post_process_image_text_to_text to ProcessorMixin
* add guard in post_process
* fix zero shot object detection pipeline
* add support for generator input in pipeline
* nit
* change default image-text-to-text model to llava onevision
* fix owlv2 size dict
* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
hlky
9e3d704e23
Fixes for Modular Converter on Windows ( #34266 )
...
* Separator in regex
* Standardize separator for relative path in auto generated message
* open() encoding
* Replace `\` on `os.path.abspath`
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-29 11:40:41 +01:00
Yih-Dar
9360f1827d
Tiny update after #34383 ( #34404 )
...
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-28 12:01:05 +01:00
Yih-Dar
223855314f
no filter ( #34391 )
...
* no filter
* no filter
* no filter
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-25 12:32:39 +02:00
Yih-Dar
a308d28d39
[auto. ping] Avoid sending empty info + add more team members ( #34383 )
...
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-24 19:07:23 +02:00
Raushan Turganbay
21d5025826
Attn implementation for composite models ( #32238 )
...
* first try
* codestyle
* idefics2 is happy
* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma
* fix-copies
* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo
* blip-2 needs to init vision from config
* when was this removed O_o
* minor fix
* tests
* this way?
* tests
* model-agnostic code
* codestyle
* add tests for idefics
* modify general test for VLMs
* no generation test for vlm yet!
* no generation test here also
* wanr in VIT-SDPA if output attn
* add more tests
* user can pass dict as attn impl
* repo consistency
* update
* musicgen
* no prints
* forgot speech enc-dec and clip
* how many composite models we have?
* musicgen melody is same as musicgen
* +siglip
* fix tests + add some more
* remove idefics custom overriden code
* make idefics2 automappable
* nits
* skip tests
* doctests
* Update src/transformers/models/idefics2/configuration_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/clip/test_modeling_clip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/idefics2/test_modeling_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/idefics2/test_modeling_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* major update, no need for automap
* clean up
* add FA2 test
* more tests
* style
* skip tests
* why did these start failing now?
* no attributes for FA2 needed
* one tiny test
* address comment about FA2 false warning
* style
* add new models and resolve conflicts
* fix copies
* let it be this way for now, come back tomorrow to review
* some more fixes
* update
* more updates
* update
* fix copies
* style and tests
* another big update
* fix tests
* fix tests
* update
* another update
* fix tests
* fix copies
* fix tests
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-22 06:54:44 +02:00
Arthur
c1c7e89620
Fix Gradient Accumulation issue ( #34191 )
...
* quick fix
* 3 losses
* oups
* fix
* nits
* check how it scales for special models
* propagate for conditiona detr
* propagate
* propagate
* propagate
* fixes
* propagate changes
* update
* fixup
* nits
* f string
* fixes
* more fixes
* ?
* nit
* arg annoying f string
* nits
* grumble
* update
* nit
* refactor
* fix fetch tests
* nit
* nit
* Update src/transformers/loss/loss_utils.py
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* update
* nit
* fixup
* make pass
* nits
* port code to more models
* fixup
* nits
* arf
* update
* update
* nits
* update
* fix
* update
* nits
* fine
* agjkfslga.jsdlkgjklas
* nits
* fix fx?
* update
* update
* style
* fix imports
* update
* update
* fixup to fix the torch fx?
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-10-17 22:34:40 +02:00
Joao Gante
f51ac9e059
Generate: visit non-llm prepare_inputs_for_generation ( #34199 )
...
* tmp
* all visited
* test all
* Update src/transformers/models/moshi/modeling_moshi.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* delete another one :D
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-17 16:53:48 +01:00
Yih-Dar
fce1fcfe71
Ping team members for new failed tests in daily CI ( #34171 )
...
* ping
* fix
* fix
* fix
* remove runner
* update members
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-17 16:11:52 +02:00
Yoach Lacombe
9ba021ea75
Moshi integration ( #33624 )
...
* clean mimi commit
* some nits suggestions from Arthur
* make fixup
* first moshi WIP
* converting weights working + configuration + generation configuration
* finalize converting script - still missing tokenizer and FE and processor
* fix saving model w/o default config
* working generation
* use GenerationMixin instead of inheriting
* add delay pattern mask
* fix right order: moshi codes then user codes
* unconditional inputs + generation config
* get rid of MoshiGenerationConfig
* blank user inputs
* update convert script: fix conversion, add tokenizer, feature extractor and bf16
* add and correct Auto classes
* update modeling code, configuration and tests
* make fixup
* fix some copies
* WIP: add integration tests
* add dummy objects
* propose better readability and code organisation
* update tokenization tests
* update docstrings, eval and modeling
* add .md
* make fixup
* add MoshiForConditionalGeneration to ignore Auto
* revert mimi changes
* re
* further fix
* Update moshi.md
* correct md formatting
* move prepare causal mask to class
* fix copies
* fix depth decoder causal
* fix and correct some tests
* make style and update .md
* correct config checkpoint
* Update tests/models/moshi/test_tokenization_moshi.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/moshi/test_tokenization_moshi.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* make style
* Update src/transformers/models/moshi/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* change firm in copyrights
* update config with nested dict
* replace einsum
* make style
* change split to True
* add back split=False
* remove tests in convert
* Update tests/models/moshi/test_modeling_moshi.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add default config repo + add model to FA2 docstrings
* remove logits float
* fix some tokenization tests and ignore some others
* make style tokenization tests
* update modeling with sliding window + update modeling tests
* [run-slow] moshi
* remove prepare for generation from CausalLM
* isort
* remove copied from
* ignore offload tests
* update causal mask and prepare 4D mask aligned with recent changes
* further test refine + add back prepare_inputs_for_generation for depth decoder
* correct conditional use of prepare mask
* update slow integration tests
* fix multi-device forward
* remove previous solution to device_map
* save_load is flaky
* fix generate multi-devices
* fix device
* move tensor to int
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
2024-10-16 11:21:49 +02:00
Yoni Gozlan
65442718c4
Add support for inheritance from class with different suffix in modular ( #34077 )
...
* add support for different suffix in modular
* add dummy example, pull new changes for modular
* nide lines order change
2024-10-15 14:55:09 +02:00
Raushan Turganbay
fd70464fa7
Fix flaky tests ( #34069 )
...
* fix mllama only
* allow image token index
2024-10-11 14:41:46 +01:00
Lysandre Debut
f052e94bcc
Fix flax failures ( #33912 )
...
* Few fixes here and there
* Remove typos
* Remove typos
2024-10-11 14:38:35 +02:00
Mohamed Mekkouri
24b82f3cd5
Small Fix to modular converter ( #34051 )
...
* small_fix
* supporting both src/transformers and examples/
* make style
2024-10-10 18:43:27 +02:00
Raushan Turganbay
adea67541a
Phi3: fix attn for sliding window ( #33586 )
...
* fix phi3 attn for sliding window
* fix tests
* address most comments
* style
* update after rebase
* add more models
* fix tests
2024-10-10 11:50:39 +02:00
Arthur
e783f12f20
[Patch helper] update to not have to checkout main ( #34006 )
...
add more support
2024-10-09 09:21:46 +02:00
Cyril Vallez
17806d11ba
Improve modular converter ( #33991 )
...
* improve modular
* style
* Update modular_model_converter.py
* pretty print warning
* style
* Support to remove unused classes as part of added dependencies as well
* nits
* correct bug
* add example
* style
* Add documentation
2024-10-08 14:53:58 +02:00
Yoni Gozlan
e2001c3413
Add auto model for image-text-to-text ( #32472 )
...
* Add Auto model for image-text-to-text
* Remove donut from processing auto, add chameleon ti image text to text models
* add qwen2_vl and llava_onevision
* add pixtral to auto model for image-text-to-text
* add mllama and idefics3
* remove models in IGNORE_NON_AUTO_CONFIGURED
* add AutoModelForImageTextToText to tests and doc
2024-10-08 14:26:43 +02:00
Arthur
a3add29097
Add support for __all__ and potentially deleting functions ( #33859 )
...
* Add support for __all__ and potentially deleting functions
* updates
* update
* nits
* remove dummies
* fix warning
* fixup
* style
* update
* fixup
* skip copied from when # skip
* remove log
* bring dummies back
* fixup
* remove copied from
* fixup
* remove warnings from `make fix-copies`
* fix doc issues
* nits
* Better error message!
* add support for more flexible naming!
* style
* breaking style?
* fix super() renaming issues
* del not needed when you don't call super().__init__()
* style
* no more fmt on :)
* properly remove `self`
* fixup
* fix
* doc nits
* add some doc 🫡
2024-10-08 10:19:17 +02:00
pglorio
f319ba16fa
Add Zamba ( #30950 )
...
* Update index.md
* Rebase
* Rebase
* Updates from make fixup
* Update zamba.md
* Batched inference
* Update
* Fix tests
* Fix tests
* Fix tests
* Fix tests
* Update docs/source/en/model_doc/zamba.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/zamba.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update configuration_zamba.py
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update modeling_zamba.py
* Update modeling_zamba.py
* Update modeling_zamba.py
* Update configuration_zamba.py
* Update modeling_zamba.py
* Update modeling_zamba.py
* Merge branch 'main' of https://github.com/Zyphra/transformers_zamba
* Update ZambaForCausalLM
* Update ZambaForCausalLM
* Describe diffs with original mamba layer
* Moved mamba init into `_init_weights`
* make fixup fixes
* quality test fixes
* Fix Zamba model path
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* circleci fixes
* Update
* circleci fixes
* fix zamba test from merge
* fix ValueError for disabling mamba kernels
* add HF copyright
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* shared_transf --> shared_transformer
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fixes
* Move attention head dim to config
* Fix circle/ci tests
* Update modeling_zamba.py
* apply GenerationMixin inheritance change from upstream
* apply import ordering
* update needed transformers version for zamba
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add contribution author
* add @slow to avoid CI
* Update src/transformers/models/zamba/modeling_zamba.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Define attention_hidden_size
* Added doc for attention_head_size
* trigger CI
* Fix doc of attention_hidden_size
* [run-slow] zamba
* Fixed shared layer logic, swapped up<->gate in mlp
* shared_transformer -> shared_transf
* reformat HybridLayer __init__
* fix docstrings in zamba config
* added definition of _get_input_ids_and_config
* fixed formatting of _get_input_ids_and_config
---------
Co-authored-by: root <root@node-4.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: root <root@node-1.us-southcentral1-a.compute.internal>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-10-04 22:28:05 +02:00
Guillaume LEGENDRE
4df3ccddb7
Migrate the CI runners to the new clusters ( #33849 )
...
* try fixing push-ci
* move to new runners
* move benchmark.yml to new runners
* move doctest_job.yml to new runners
* move doctests.yml to new runners
* move push-important-models.yml to new runners
* move self-pr-slow-ci.yml to new runners
* fix typo
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* fix working directory
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* fix working directory
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* improve code
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-10-03 14:39:49 +02:00