Mirror of https://github.com/saymrwulf/transformers.git, synced 2026-05-14 20:58:08 +00:00
* Conversion from slow to fast tokenizers for BPE SentencePiece vocabs contained an error.
  - Only one test currently (tokenizers + slow) exercised the modified path, and it is Reformer, which does not contain any id modifications, so the bug was silent until now.
  - The real issue is that the `vocab` variable was overwritten by `SentencePieceExtractor`, causing slow-tokenizer-specific vocab oddities to be completely ignored.
  - The bug was reported here: https://github.com/huggingface/transformers/issues/9518
  - Ran the complete tokenization test suite with slow tokenizers without error (`RUN_SLOW=1 pytest -sv tests/test_tokenization_*`).
* Remove rebase error.
* Add the fixture.
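As a minimal sketch of the bug pattern described above (the function and variable names are illustrative assumptions, not the actual `transformers` converter code): a local `vocab` variable built from the slow tokenizer is later overwritten by the vocab returned from the SentencePiece extractor, silently discarding the slow tokenizer's specific entries.

```python
# Hypothetical illustration of the reported bug pattern; names are
# assumptions and do not match the real transformers converter code.

def convert_buggy(slow_vocab: dict, extracted_vocab: dict) -> dict:
    # Start from the slow tokenizer's vocab, which may carry
    # slow-specific id modifications (special tokens, offsets, ...).
    vocab = slow_vocab
    # BUG: the variable is overwritten by the SentencePiece-extracted
    # vocab, so the slow-specific entries above are silently lost.
    vocab = extracted_vocab
    return vocab

def convert_fixed(slow_vocab: dict, extracted_vocab: dict) -> dict:
    # Fixed pattern: start from the extracted vocab, then layer the
    # slow tokenizer's specific entries on top instead of dropping them.
    vocab = dict(extracted_vocab)
    vocab.update(slow_vocab)
    return vocab
```

With a Reformer-like vocab (no id modifications) the two paths agree, which is why the bug stayed silent in the existing test.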
| Name |
|---|
| tests_samples |
| dummy-config.json |
| empty.txt |
| input.txt |
| sample_text.txt |
| sample_text_no_unicode.txt |
| spiece.model |
| test_sentencepiece.model |
| test_sentencepiece_bpe.model |
| test_sentencepiece_no_bos.model |