From aeba4f95bb6065a6335b71b48bf4e731b3056003 Mon Sep 17 00:00:00 2001
From: Darigov Research <30328618+darigovresearch@users.noreply.github.com>
Date: Sun, 28 Feb 2021 13:27:54 +0000
Subject: [PATCH] Adds terms to Glossary (#10443)

* feat: Adds three definitions to glossary from @cronoik

Needed a definition for transformer which in turn needed 2 more definitions

To do with issue https://github.com/huggingface/transformers/issues/9078

* fix: Adjusts definition of neural network to make it easier to read
---
 docs/source/glossary.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/source/glossary.rst b/docs/source/glossary.rst
index 8c52a67d5..be7a46869 100644
--- a/docs/source/glossary.rst
+++ b/docs/source/glossary.rst
@@ -21,6 +21,7 @@ General terms
 - CLM: causal language modeling, a pretraining task where the model reads the texts in order and has to predict the
   next word. It's usually done by reading the whole sentence but using a mask inside the model to hide the future
   tokens at a certain timestep.
+- deep learning: machine learning algorithms which use neural networks with several layers.
 - MLM: masked language modeling, a pretraining task where the model sees a corrupted version of the texts, usually
   done by masking some tokens randomly, and has to predict the original text.
 - multimodal: a task that combines texts with another kind of inputs (for instance images).
@@ -33,10 +34,12 @@ General terms
   involve a self-supervised objective, which can be reading the text and trying to predict the next word (see CLM) or
   masking some words and trying to predict them (see MLM).
 - RNN: recurrent neural network, a type of model that uses a loop over a layer to process texts.
+- self-attention: each element of the input finds out which other elements of the input it should attend to.
 - seq2seq or sequence-to-sequence: models that generate a new sequence from an input, like translation models, or
   summarization models (such as :doc:`Bart </model_doc/bart>` or :doc:`T5 </model_doc/t5>`).
 - token: a part of a sentence, usually a word, but can also be a subword (non-common words are often split in
   subwords) or a punctuation symbol.
+- transformer: a self-attention-based deep learning model architecture.

 Model inputs
 -----------------------------------------------------------------------------------------------------------------------
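
A note for glossary readers: the new ``self-attention`` entry can be made concrete with a minimal sketch of scaled
dot-product self-attention in PyTorch. Everything below (the function name, the toy shapes, the random weights) is
illustrative, not code from the library::

    import torch

    def self_attention(x, w_q, w_k, w_v):
        # x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_head) projection matrices
        q = x @ w_q                                # queries: what each position is looking for
        k = x @ w_k                                # keys: what each position offers
        v = x @ w_v                                # values: the content to be mixed
        scores = (q @ k.T) / (k.shape[-1] ** 0.5)  # (seq_len, seq_len) pairwise similarities
        weights = torch.softmax(scores, dim=-1)    # row i: how much element i attends to every element
        return weights @ v                         # each output is a weighted mix of all positions

    torch.manual_seed(0)
    x = torch.randn(4, 8)                          # 4 tokens with embedding size 8
    w_q, w_k, w_v = (torch.randn(8, 8) for _ in range(3))
    print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([4, 8])

A transformer repeats this computation with several heads in parallel and stacks it over many layers, which is the
architecture the new ``transformer`` entry refers to.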