transformers/utils
Yu Chin Fabian Lim 9613933b02
Add the Bamba Model (#34982)
* initial commit for PR

Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>

* rename dynamic cache

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add more unit tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* add integration test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Add modular bamba file

* Remove trainer changes from unrelated PR

* Modify modular and cofig to get model running

* Fix some CI errors and beam search

* Fix a plethora of bugs from CI/docs/etc

* Add bamba to models with special caches

* Updat to newer mamba PR for mamba sublayer

* fix test_left_padding_compatibility

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix remaining tests

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* missed this test

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* ran make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* move slow tag to integration obj

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* make style

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* address comments

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* fix modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* left out one part of modular

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* change model

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Make Rotary modular as well

* Update bamba.md

Added overview, update Model inference card and added config

* Update bamba.md

* Update bamba.md

* Update bamba.md

Minor fixes

* Add docs for config and model back

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Add warning when using fast kernels

* replaced generate example

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>

* Address comments from PR

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Propagate attention fixes

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix attention interfaces to the new API

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Fix API for decoder layer

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

* Remove extra weights

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>

---------

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Gabe Goodhart <gabe.l.hart@gmail.com>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: divya-kumari32 <72085811+divya-kumari32@users.noreply.github.com>
Co-authored-by: Antoni Viros <ani300@gmail.com>
2024-12-18 20:18:17 +01:00
..
test_module
tf_ops
add_pipeline_model_mapping_to_test.py
check_bad_commit.py Fix utils/check_bad_commit.py (for auto ping in CI) (#34943) 2024-11-28 15:34:38 +01:00
check_build.py Fix import of FalconMambaForCausalLM (#33381) 2024-09-10 09:14:54 +02:00
check_config_attributes.py Add the Bamba Model (#34982) 2024-12-18 20:18:17 +01:00
check_config_docstrings.py Add TimmWrapper (#34564) 2024-12-11 12:40:30 +00:00
check_copies.py Generate: visit non-llm prepare_inputs_for_generation (#34199) 2024-10-17 16:53:48 +01:00
check_doc_toc.py
check_docstrings.py Add Aria (#34157) 2024-12-06 12:17:34 +01:00
check_doctest_list.py
check_dummies.py
check_inits.py Fix import of FalconMambaForCausalLM (#33381) 2024-09-10 09:14:54 +02:00
check_model_tester.py
check_modular_conversion.py Large modular logic refactoring (#34487) 2024-11-01 10:13:51 +01:00
check_repo.py Add Aria (#34157) 2024-12-06 12:17:34 +01:00
check_self_hosted_runner.py
check_support_list.py [RoBERTa-based] Add support for sdpa (#30510) 2024-08-28 10:26:00 +02:00
check_table.py Add Falcon3 documentation (#35307) 2024-12-17 14:23:13 +01:00
check_tf_ops.py
create_dependency_mapping.py fix modular order (#35297) 2024-12-17 08:05:35 +01:00
create_dummy_models.py CI: fix efficientnet pipeline timeout and prevent future similar issues due to large image size (#33123) 2024-08-27 11:58:27 +01:00
custom_init_isort.py Import structure & first three model refactors (#31329) 2024-09-10 11:10:53 +02:00
deprecate_models.py
download_glue_data.py
extract_warnings.py
get_ci_error_statistics.py
get_github_job_time.py
get_modified_files.py
get_previous_daily_ci.py Ping team members for new failed tests in daily CI (#34171) 2024-10-17 16:11:52 +02:00
get_test_info.py CI: fix efficientnet pipeline timeout and prevent future similar issues due to large image size (#33123) 2024-08-27 11:58:27 +01:00
important_models.txt
models_to_deprecate.py
modular_model_converter.py fix modular order (#35297) 2024-12-17 08:05:35 +01:00
not_doctested.txt Add support for __all__ and potentilly deleting functions (#33859) 2024-10-08 10:19:17 +02:00
notification_service.py Fix CI slack reporting issue (#34833) 2024-11-20 21:36:13 +01:00
notification_service_doc_tests.py
notification_service_quantization.py Revive Nightly/Past CI (#31159) 2024-06-20 18:57:24 +02:00
past_ci_versions.py
patch_helper.py [Patch helper] update to not have to checkout main (#34006) 2024-10-09 09:21:46 +02:00
pr_slow_ci_models.py Trigger GitHub CI with a comment on PR (#35211) 2024-12-18 13:56:49 +01:00
print_env.py
process_bad_commit_report.py Tiny update after #34383 (#34404) 2024-10-28 12:01:05 +01:00
process_circleci_workflow_test_reports.py Aggeregate test summary files in CircleCI workflow runs (#34989) 2024-12-16 11:06:17 +01:00
process_test_artifacts.py fix the parallel number of CI nodes when it is smaller than number of tests (#33276) 2024-09-03 16:53:21 +02:00
release.py 🚨🚨🚨 Delete conversion scripts when making release wheels (#35296) 2024-12-17 14:18:42 +00:00
set_cuda_devices_for_ci.py Fix Cohere CI (#31263) 2024-06-10 15:16:58 +02:00
slow_documentation_tests.txt
sort_auto_mappings.py
split_doctest_jobs.py
split_model_tests.py
tests_fetcher.py no filter (#34391) 2024-10-25 12:32:39 +02:00
update_metadata.py Add ColPali to 🤗 transformers (#33736) 2024-12-17 11:26:43 +01:00
update_tiny_models.py Mention model_info.id instead of model_info.modelId (#32106) 2024-07-22 14:14:47 +01:00