transformers

mirror of https://github.com/saymrwulf/transformers.git synced 2026-05-15 21:01:19 +00:00


Sanchit Gandhi e93103632b Add bloom flax (#25094)	2023-07-27 18:24:56 +01:00

* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix - do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding albii test
* make CI tests pass:
  - change init weight
  - add correct tuple for output attention
  - add scan test
  - make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test - remove device from build alibi
* refactor call - refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
  Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
  - remove unused code
  - refactor a bit
  - revert import `torch`
* major refactoring
  - change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

internal	Generate: add SequenceBiasLogitsProcessor (#24334)	2023-06-21 11:14:41 +01:00
main_classes	fsdp fixes and enhancements (#24980)	2023-07-21 17:52:48 +05:30
model_doc	Add bloom flax (#25094)	2023-07-27 18:24:56 +01:00
tasks	[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726)	2023-07-25 21:02:49 +02:00
_config.py
_toctree.yml	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629)	2023-07-25 14:32:40 +02:00
accelerate.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
add_new_model.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
add_new_pipeline.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
add_tensorflow_model.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
attention.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
autoclass_tutorial.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
benchmarks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
bertology.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
big_models.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
community.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
contributing.md
create_a_model.md	Update old existing feature extractor references (#24552)	2023-06-29 10:17:36 +01:00
custom_models.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
custom_tools.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
debugging.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
fast_tokenizers.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
generation_strategies.md	Generate: `group_beam_search` requires `diversity_penalty>0.0` (#24456)	2023-06-27 10:46:39 +01:00
glossary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
hpo_train.md	Update RayTune doc link for Hyperparameter tuning (#24422)	2023-06-22 10:38:01 -04:00
index.md	Add bloom flax (#25094)	2023-07-27 18:24:56 +01:00
installation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
model_memory_anatomy.md	[docs] Performance docs tidy up, part 1 (#23963)	2023-07-24 08:57:24 -04:00
model_sharing.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
model_summary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
multilingual.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
notebooks.md
pad_truncation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_hardware.md	🌐 [i18n-KO] Translated `perf_hardware.md` to Korean (#24966)	2023-07-25 07:44:24 -04:00
perf_infer_cpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_infer_gpu_many.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_infer_gpu_one.md	fix: add TOC anchor link (#25066)	2023-07-25 08:02:33 -04:00
perf_infer_special.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_cpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_cpu_many.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_gpu_many.md	deprecate `sharded_ddp` training argument (#24825)	2023-07-17 06:57:42 -04:00
perf_train_gpu_one.md	Set `TF32` flag for PyTorch cuDNN backend (#25075)	2023-07-25 08:04:48 -04:00
perf_train_special.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_tpu.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
perf_train_tpu_tf.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
performance.md	[docs] Performance docs tidy up, part 1 (#23963)	2023-07-24 08:57:24 -04:00
perplexity.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
philosophy.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
pipeline_tutorial.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
pipeline_webserver.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
pr_checks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
preprocessing.md	Removal of deprecated vision methods and specify deprecation versions (#24570)	2023-06-29 15:09:51 +01:00
quicktour.md	🌐 [i18n-KO] Fixed Korean and English `quicktour.md` (#24664)	2023-07-21 08:19:28 -04:00
run_scripts.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
sagemaker.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
serialization.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
task_summary.md	Fix doctest (#25031)	2023-07-25 22:10:06 +02:00
tasks_explained.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
testing.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tf_xla.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tflite.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
tokenizer_summary.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
torchscript.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
training.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
transformers_agents.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
troubleshooting.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00