From fb7330b30ebfbb3f07b87203f0405ee09905eeda Mon Sep 17 00:00:00 2001
From: Jim Regan
Date: Mon, 17 Aug 2020 21:48:05 +0100
Subject: [PATCH] update with #s of sentences/tokens (#6546)

---
 model_cards/jimregan/BERTreach/README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/model_cards/jimregan/BERTreach/README.md b/model_cards/jimregan/BERTreach/README.md
index c18966e89..72453102f 100644
--- a/model_cards/jimregan/BERTreach/README.md
+++ b/model_cards/jimregan/BERTreach/README.md
@@ -15,6 +15,8 @@ tags:
 * Newscrawl 300k portion of the [Leipzig Corpora](https://wortschatz.uni-leipzig.de/en/download/irish)
 * Private news corpus crawled with [Corpus Crawler](https://github.com/google/corpuscrawler)
 
+(2125804 sentences, 47419062 tokens, as reckoned by wc)
+
 ```
 from transformers import pipeline
 fill_mask = pipeline("fill-mask", model="jimregan/BERTreach", tokenizer="jimregan/BERTreach")