transformers/utils
Matt ee0d001de7
Add a TF in-graph tokenizer for BERT (#17701)
* Add a TF in-graph tokenizer for BERT

* Add from_pretrained

* Add proper truncation, option handling to match other tokenizers

* Add proper imports and guards

* Add test, fix all the bugs exposed by said test

* Fix truncation of paired texts in graph mode, more test updates

* Small fixes, add a (very careful) test for savedmodel

* Add tensorflow-text dependency, make fixup

* Update documentation

* Update documentation

* make fixup

* Slight changes to tests

* Add some docstring examples

* Update tests

* Update tests and add proper lowercasing/normalization

* make fixup

* Add docstring for padding!

* Mark slow tests

* make fixup

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* Fall back to BertTokenizerFast if BertTokenizer is unavailable

* make fixup

* Properly handle tensorflow-text dummies
2022-06-27 12:06:21 +01:00
test_module Fix from_pretrained with default base_model_prefix (#15814) 2022-02-24 11:43:51 +01:00
tf_ops
check_config_docstrings.py Add a check on config classes docstring checkpoints (#17012) 2022-04-30 10:40:46 +02:00
check_copies.py [Wav2Vec2Conformer] Official release (#17709) 2022-06-15 18:34:15 +02:00
check_dummies.py Add a TF in-graph tokenizer for BERT (#17701) 2022-06-27 12:06:21 +01:00
check_inits.py Make check_init script more robust and clean inits (#17408) 2022-05-25 07:23:56 -04:00
check_repo.py Add LongT5 model (#16792) 2022-06-13 22:36:58 +02:00
check_table.py Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
check_tf_ops.py
custom_init_isort.py explicitly set utf8 for Windows (#17664) 2022-06-13 08:05:45 -04:00
documentation_tests.txt Improve encoder decoder model docs (#17815) 2022-06-24 14:48:19 +02:00
download_glue_data.py
get_modified_files.py Updates the default branch from master to main (#16326) 2022-03-23 03:46:59 -04:00
notification_service.py Attempt to change Push CI to workflow_run (#17753) 2022-06-18 08:35:03 +02:00
notification_service_doc_tests.py explicitly set utf8 for Windows (#17664) 2022-06-13 08:05:45 -04:00
prepare_for_doc_test.py [DocTests Speech] Add doc tests for all speech models (#15031) 2022-01-27 14:29:31 +01:00
print_env.py Print more library versions in CI (#17384) 2022-06-02 10:24:16 +02:00
release.py Clean README in post release job as well. (#17519) 2022-06-02 07:44:03 -04:00
sort_auto_mappings.py Automatically sort auto mappings (#17250) 2022-05-16 13:24:20 -04:00
tests_fetcher.py Properly get tests deps in test_fetcher (#17870) 2022-06-24 16:56:46 -04:00
update_metadata.py Replace commit sha by commit url for update jobs (#14852) 2021-12-21 11:17:11 -05:00