From 8ffd7fb12db877cfa28f8709bc563b4346a560c5 Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Wed, 21 Oct 2020 12:27:09 +0200
Subject: [PATCH] Update README.md

---
 .../prophetnet-large-uncased-cnndm/README.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md b/model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
index 094dbf402..085403067 100644
--- a/model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
+++ b/model_cards/microsoft/prophetnet-large-uncased-cnndm/README.md
@@ -1,3 +1,9 @@
+---
+language: en
+datasets:
+- cnn_dailymail
+---
+
 ## prophetnet-large-uncased-cnndm
 Fine-tuned weights (converted from the [original fairseq version repo](https://github.com/microsoft/ProphetNet)) for [ProphetNet](https://arxiv.org/abs/2001.04063) on the CNN/DailyMail summarization task.
 ProphetNet is a pre-trained language model for sequence-to-sequence learning with a novel self-supervised objective called future n-gram prediction.
@@ -15,8 +21,11 @@
 inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=100, return_tensors='pt')
 # Generate Summary
 summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=512, early_stopping=True)
-tokenizer.batch_decode(summary_ids.tolist())
+tokenizer.batch_decode(summary_ids, skip_special_tokens=True)
+
+# should give: 'ustc was founded in beijing by the chinese academy of sciences in 1958. [X_SEP] ustc\'s mission was to develop a high - level science and technology workforce. [X_SEP] the establishment was hailed as " a major event in the history of chinese education and science "'
 ```
+
 Here, [X_SEP] is used as a special token to separate sentences.
 ### Citation
 ```bibtex
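The patch above notes that `[X_SEP]` is ProphetNet's sentence-separator token in decoded summaries. A minimal post-processing sketch in plain Python (the summary string is the expected output quoted in the diff; the splitting logic itself is an illustrative assumption, not part of the patch):

```python
# Split a decoded ProphetNet summary into sentences on the [X_SEP] marker.
# The summary string below is the expected output quoted in the patch above.
summary = (
    "ustc was founded in beijing by the chinese academy of sciences in 1958. "
    "[X_SEP] ustc's mission was to develop a high - level science and technology workforce. "
    '[X_SEP] the establishment was hailed as " a major event in the history '
    'of chinese education and science "'
)

# Strip surrounding whitespace from each piece after splitting.
sentences = [part.strip() for part in summary.split("[X_SEP]")]

for sentence in sentences:
    print(sentence)
```

Splitting on the literal token keeps the tokenizer output untouched; note that `skip_special_tokens=True` in the patched `batch_decode` call does not remove `[X_SEP]`, since it is decoded as regular text here rather than registered as a special token.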