From f3312515b7cf66de65b71b623a4cb719517d4fc0 Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Tue, 20 Oct 2020 15:42:29 +0200
Subject: [PATCH] Add note for WikiSplit

---
 model_cards/google/roberta2roberta_L-24_wikisplit/README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/model_cards/google/roberta2roberta_L-24_wikisplit/README.md b/model_cards/google/roberta2roberta_L-24_wikisplit/README.md
index 8ba18aaeb..8d4a2b380 100644
--- a/model_cards/google/roberta2roberta_L-24_wikisplit/README.md
+++ b/model_cards/google/roberta2roberta_L-24_wikisplit/README.md
@@ -17,6 +17,9 @@ Disclaimer: The model card has been written by the Hugging Face team.
 
 You can use this model for sentence splitting, *e.g.*
 
+**IMPORTANT**: The model was not trained on the `"` (double quotation mark) character, so before tokenizing the text
+it is advised to replace all `"` (double quotation marks) with two `'` (single quotation marks).
+
 ```python
 from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
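The preprocessing step advised in the note above can be sketched as follows; this is a minimal illustration (not part of the patch itself), assuming the input is a plain Python string passed to the tokenizer afterwards:

```python
# Replace every double quotation mark with two single quotation marks,
# since the model was not trained on the `"` character.
text = 'He said "hello" and left.'
cleaned = text.replace('"', "''")
print(cleaned)  # He said ''hello'' and left.
```

The cleaned string is then what gets passed to `AutoTokenizer` as in the usage snippet the patch precedes.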