From df759114c939a2c276085df168141a8a5fa3acaa Mon Sep 17 00:00:00 2001
From: LysandreJik
Date: Fri, 5 Jul 2019 17:35:26 -0400
Subject: [PATCH] Single file documentation for each model, accompanied by the
 Documentation overview.

---
 docs/index.rst                          |   2 -
 docs/source/index.rst                   |  15 +-
 docs/source/model_doc/bert.rst          | 110 +++++++
 docs/source/model_doc/gpt.rst           |  59 ++++
 docs/source/model_doc/gpt2.rst          |  49 +++
 .../{doc.rst => model_doc/overview.rst} | 291 ++----------------
 docs/source/model_doc/transformerxl.rst |  26 ++
 docs/source/model_doc/xlm.rst           |   2 +
 docs/source/model_doc/xlnet.rst         |   2 +
 9 files changed, 290 insertions(+), 266 deletions(-)
 delete mode 100644 docs/index.rst
 create mode 100644 docs/source/model_doc/bert.rst
 create mode 100644 docs/source/model_doc/gpt.rst
 create mode 100644 docs/source/model_doc/gpt2.rst
 rename docs/source/{doc.rst => model_doc/overview.rst} (60%)
 create mode 100644 docs/source/model_doc/transformerxl.rst
 create mode 100644 docs/source/model_doc/xlm.rst
 create mode 100644 docs/source/model_doc/xlnet.rst

diff --git a/docs/index.rst b/docs/index.rst
deleted file mode 100644
index 4639f1d21..000000000
--- a/docs/index.rst
+++ /dev/null
@@ -1,2 +0,0 @@
-Home
-====
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 85125f3cf..d7b60bd66 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -4,16 +4,29 @@ Pytorch-Transformers: The Big & Extending Repository of pretrained Transformers
 
 .. toctree::
     :maxdepth: 2
+    :caption: Notes
 
     installation
     usage
-    doc
     examples
     notebooks
     tpu
     cli
 
+.. toctree::
+    :maxdepth: 2
+    :caption: Package Reference
+
+    model_doc/overview
+    model_doc/bert
+    model_doc/gpt
+    model_doc/transformerxl
+    model_doc/gpt2
+    model_doc/xlm
+    model_doc/xlnet
+
+
 .. image:: https://circleci.com/gh/huggingface/pytorch-pretrained-BERT.svg?style=svg
     :target: https://circleci.com/gh/huggingface/pytorch-pretrained-BERT
     :alt: CircleCI
diff --git a/docs/source/model_doc/bert.rst b/docs/source/model_doc/bert.rst
new file mode 100644
index 000000000..018f3e396
--- /dev/null
+++ b/docs/source/model_doc/bert.rst
@@ -0,0 +1,110 @@
+BERT
+----------------------------------------------------
+
+``BertTokenizer``
+~~~~~~~~~~~~~~~~~~~~~
+
+``BertTokenizer`` performs end-to-end tokenization, i.e. basic tokenization followed by WordPiece tokenization.
+
+This class has five arguments:
+
+
+* ``vocab_file``\ : path to a vocabulary file.
+* ``do_lower_case``\ : convert text to lower case while tokenizing. **Default = True**.
+* ``max_len``\ : maximum length to filter the input of the Transformer. Defaults to the pre-trained value for the model if ``None``. **Default = None**
+* ``do_basic_tokenize``\ : do basic tokenization before WordPiece tokenization. Set to false if the text is pre-tokenized. **Default = True**.
+* ``never_split``\ : a list of tokens that should not be split during tokenization. **Default = ``["[UNK]", "[SEP]", "[PAD]", "[CLS]", "[MASK]"]``\ **
+
+and four methods:
+
+
+* ``tokenize(text)``\ : convert a ``str`` into a list of ``str`` tokens by (1) performing basic tokenization and (2) WordPiece tokenization.
+* ``convert_tokens_to_ids(tokens)``\ : convert a list of ``str`` tokens into a list of ``int`` indices in the vocabulary.
+* ``convert_ids_to_tokens(ids)``\ : convert a list of ``int`` indices into a list of ``str`` tokens in the vocabulary.
+* ``save_vocabulary(directory_path)``\ : save the vocabulary file to ``directory_path``. Returns the path to the saved vocabulary file: ``vocab_file_path``. The vocabulary can be reloaded with ``BertTokenizer.from_pretrained('vocab_file_path')`` or ``BertTokenizer.from_pretrained('directory_path')``.
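+
+A minimal usage sketch (assuming the ``bert-base-uncased`` vocabulary can be downloaded or is already cached):
+
+.. code-block:: python
+
+    from pytorch_pretrained_bert import BertTokenizer
+
+    # Load the vocabulary of a pre-trained checkpoint and round-trip a sentence.
+    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+    tokens = tokenizer.tokenize("Who was Jim Henson?")
+    ids = tokenizer.convert_tokens_to_ids(tokens)
+    assert tokenizer.convert_ids_to_tokens(ids) == tokens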
+
+Please refer to the doc strings and code in `\ ``tokenization.py`` <./pytorch_pretrained_bert/tokenization.py>`_ for the details of the ``BasicTokenizer`` and ``WordpieceTokenizer`` classes. In general it is recommended to use ``BertTokenizer`` unless you know what you are doing.
+
+
+``BertAdam``
+~~~~~~~~~~~~~~~~
+
+``BertAdam`` is a ``torch.optimizer`` adapted to be closer to the optimizer used in the TensorFlow implementation of BERT. It differs from the PyTorch Adam optimizer in the following ways:
+
+
+* BertAdam implements the weight decay fix,
+* BertAdam doesn't compensate for bias as the regular Adam optimizer does.
+
+The optimizer accepts the following arguments (see the sketch after this list):
+
+
+* ``lr`` : learning rate
+* ``warmup`` : portion of ``t_total`` for the warmup, ``-1`` means no warmup. Default : ``-1``
+* ``t_total`` : total number of training steps for the learning
+  rate schedule, ``-1`` means constant learning rate. Default : ``-1``
+* ``schedule`` : schedule to use for the warmup (see above).
+  Can be ``'warmup_linear'``\ , ``'warmup_constant'``\ , ``'warmup_cosine'``\ , ``'none'``\ , ``None`` or a ``_LRSchedule`` object (see below).
+  If ``None`` or ``'none'``\ , the learning rate is always kept constant.
+  Default : ``'warmup_linear'``
+* ``b1`` : Adam's b1. Default : ``0.9``
+* ``b2`` : Adam's b2. Default : ``0.999``
+* ``e`` : Adam's epsilon. Default : ``1e-6``
+* ``weight_decay`` : weight decay. Default : ``0.01``
+* ``max_grad_norm`` : maximum norm for the gradients (\ ``-1`` means no clipping). Default : ``1.0``
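+
+A minimal sketch of setting up ``BertAdam`` for fine-tuning (the model, learning rate and step count are illustrative):
+
+.. code-block:: python
+
+    from pytorch_pretrained_bert import BertAdam, BertForSequenceClassification
+
+    model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
+    num_train_steps = 1000  # total number of optimization steps, known in advance
+    optimizer = BertAdam(model.parameters(),
+                         lr=2e-5,
+                         warmup=0.1,  # warm up over the first 10% of steps
+                         t_total=num_train_steps)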
+
+
+1. ``BertModel``
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertModel
+    :members:
+
+
+2. ``BertForPreTraining``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForPreTraining
+    :members:
+
+
+3. ``BertForMaskedLM``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForMaskedLM
+    :members:
+
+
+4. ``BertForNextSentencePrediction``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForNextSentencePrediction
+    :members:
+
+
+5. ``BertForSequenceClassification``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForSequenceClassification
+    :members:
+
+
+6. ``BertForMultipleChoice``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForMultipleChoice
+    :members:
+
+
+7. ``BertForTokenClassification``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForTokenClassification
+    :members:
+
+
+8. ``BertForQuestionAnswering``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.BertForQuestionAnswering
+    :members:
diff --git a/docs/source/model_doc/gpt.rst b/docs/source/model_doc/gpt.rst
new file mode 100644
index 000000000..59e84a342
--- /dev/null
+++ b/docs/source/model_doc/gpt.rst
@@ -0,0 +1,59 @@
+OpenAI GPT
+----------------------------------------------------
+
+
+``OpenAIGPTTokenizer``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``OpenAIGPTTokenizer`` performs Byte-Pair-Encoding (BPE) tokenization.
+
+This class has four arguments:
+
+
+* ``vocab_file``\ : path to a vocabulary file.
+* ``merges_file``\ : path to a file containing the BPE merges.
+* ``max_len``\ : maximum length to filter the input of the Transformer. Defaults to the pre-trained value for the model if ``None``. **Default = None**
+* ``special_tokens``\ : a list of tokens to add to the vocabulary for fine-tuning. If SpaCy is not installed and BERT's ``BasicTokenizer`` is used as the pre-BPE tokenizer, these tokens are not split. **Default = None**
+
+and seven methods:
+
+
+* ``tokenize(text)``\ : convert a ``str`` into a list of ``str`` tokens by performing BPE tokenization.
+* ``convert_tokens_to_ids(tokens)``\ : convert a list of ``str`` tokens into a list of ``int`` indices in the vocabulary.
+* ``convert_ids_to_tokens(ids)``\ : convert a list of ``int`` indices into a list of ``str`` tokens in the vocabulary.
+* ``set_special_tokens(self, special_tokens)``\ : update the list of special tokens (see the arguments above).
+* ``encode(text)``\ : convert a ``str`` into a list of ``int`` indices by performing BPE encoding.
+* ``decode(ids, skip_special_tokens=False, clean_up_tokenization_spaces=False)``\ : decode a list of ``int`` indices into a string and do some post-processing if needed: (i) remove special tokens from the output and (ii) clean up tokenization spaces.
+* ``save_vocabulary(directory_path)``\ : save the vocabulary, merges and special tokens files to ``directory_path``. Returns the paths to the three files: ``vocab_file_path``\ , ``merge_file_path``\ , ``special_tokens_file_path``. The vocabulary can be reloaded with ``OpenAIGPTTokenizer.from_pretrained('directory_path')``.
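+
+A minimal encode/decode sketch (assuming the ``openai-gpt`` vocabulary and merges files can be downloaded or are cached):
+
+.. code-block:: python
+
+    from pytorch_pretrained_bert import OpenAIGPTTokenizer
+
+    # BPE round-trip: text -> vocabulary indices -> text
+    tokenizer = OpenAIGPTTokenizer.from_pretrained('openai-gpt')
+    ids = tokenizer.encode("Who was Jim Henson?")
+    text = tokenizer.decode(ids, clean_up_tokenization_spaces=True)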
+
+Please refer to the doc strings and code in `\ ``tokenization_openai.py`` <./pytorch_pretrained_bert/tokenization_openai.py>`_ for the details of the ``OpenAIGPTTokenizer``.
+
+
+``OpenAIAdam``
+~~~~~~~~~~~~~~~~~~
+
+``OpenAIAdam`` is similar to ``BertAdam``.
+The difference is that ``OpenAIAdam`` compensates for bias as in the regular Adam optimizer, while ``BertAdam`` does not.
+
+``OpenAIAdam`` accepts the same arguments as ``BertAdam``.
+
+
+9. ``OpenAIGPTModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.OpenAIGPTModel
+    :members:
+
+
+10. ``OpenAIGPTLMHeadModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.OpenAIGPTLMHeadModel
+    :members:
+
+
+11. ``OpenAIGPTDoubleHeadsModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.OpenAIGPTDoubleHeadsModel
+    :members:
diff --git a/docs/source/model_doc/gpt2.rst b/docs/source/model_doc/gpt2.rst
new file mode 100644
index 000000000..bfcf26acb
--- /dev/null
+++ b/docs/source/model_doc/gpt2.rst
@@ -0,0 +1,49 @@
+OpenAI GPT2
+----------------------------------------------------
+
+
+``GPT2Tokenizer``
+~~~~~~~~~~~~~~~~~~~~~
+
+``GPT2Tokenizer`` performs byte-level Byte-Pair-Encoding (BPE) tokenization.
+
+This class has three arguments:
+
+
+* ``vocab_file``\ : path to a vocabulary file.
+* ``merges_file``\ : path to a file containing the BPE merges.
+* ``errors``\ : how to handle unicode decoding errors. **Default = ``replace``\ **
+
+and seven methods:
+
+
+* ``tokenize(text)``\ : convert a ``str`` into a list of ``str`` tokens by performing byte-level BPE.
+* ``convert_tokens_to_ids(tokens)``\ : convert a list of ``str`` tokens into a list of ``int`` indices in the vocabulary.
+* ``convert_ids_to_tokens(ids)``\ : convert a list of ``int`` indices into a list of ``str`` tokens in the vocabulary.
+* ``set_special_tokens(self, special_tokens)``\ : update the list of special tokens.
+* ``encode(text)``\ : convert a ``str`` into a list of ``int`` indices by performing byte-level BPE.
+* ``decode(tokens)``\ : convert a list of ``int`` indices back into a ``str``.
+* ``save_vocabulary(directory_path)``\ : save the vocabulary, merges and special tokens files to ``directory_path``. Returns the paths to the three files: ``vocab_file_path``\ , ``merge_file_path``\ , ``special_tokens_file_path``. The vocabulary can be reloaded with ``GPT2Tokenizer.from_pretrained('directory_path')``.
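+
+A minimal byte-level BPE sketch (assuming the ``gpt2`` vocabulary and merges files can be downloaded or are cached):
+
+.. code-block:: python
+
+    from pytorch_pretrained_bert import GPT2Tokenizer
+
+    # Byte-level BPE round-trip: text -> vocabulary indices -> text
+    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
+    ids = tokenizer.encode("Who was Jim Henson?")
+    text = tokenizer.decode(ids)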
+
+Please refer to `\ ``tokenization_gpt2.py`` <./pytorch_pretrained_bert/tokenization_gpt2.py>`_ for more details on the ``GPT2Tokenizer``.
+
+
+14. ``GPT2Model``
+~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.GPT2Model
+    :members:
+
+
+15. ``GPT2LMHeadModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.GPT2LMHeadModel
+    :members:
+
+
+16. ``GPT2DoubleHeadsModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.GPT2DoubleHeadsModel
+    :members:
diff --git a/docs/source/doc.rst b/docs/source/model_doc/overview.rst
similarity index 60%
rename from docs/source/doc.rst
rename to docs/source/model_doc/overview.rst
index 662799053..8f5e94baf 100644
--- a/docs/source/doc.rst
+++ b/docs/source/model_doc/overview.rst
@@ -1,8 +1,7 @@
-Docs
+Overview
 ================================================
 
-
 Here is a detailed documentation of the classes in the package and how to use them:
 
 .. list-table::
@@ -24,6 +23,31 @@ Here is a detailed documentation of the classes in the package and how to use th
     - API of the optimizers
 
 
+Configurations
+^^^^^^^^^^^^^^
+
+Models (BERT, GPT, GPT-2 and Transformer-XL) are defined and built from configuration classes which contain the
+parameters of the models (number of layers, dimensionalities...) and a few utilities to read and write from JSON
+configuration files. The respective configuration classes are:
+
+
+* ``BertConfig`` for ``BertModel`` and BERT class instances.
+* ``OpenAIGPTConfig`` for ``OpenAIGPTModel`` and OpenAI GPT class instances.
+* ``GPT2Config`` for ``GPT2Model`` and OpenAI GPT-2 class instances.
+* ``TransfoXLConfig`` for ``TransfoXLModel`` and Transformer-XL class instances.
+
+These configuration classes contain a few utilities to load and save configurations, illustrated in the sketch below:
+
+
+* ``from_dict(cls, json_object)``\ : a class method to construct a configuration from a Python dictionary of parameters.
+  Returns an instance of the configuration class.
+* ``from_json_file(cls, json_file)``\ : a class method to construct a configuration from a JSON file of parameters.
+  Returns an instance of the configuration class.
+* ``to_dict()``\ : serializes an instance to a Python dictionary. Returns a dictionary.
+* ``to_json_string()``\ : serializes an instance to a JSON string. Returns a string.
+* ``to_json_file(json_file_path)``\ : saves an instance to a JSON file.
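+
+A minimal round-trip sketch (the file name and parameter value are illustrative):
+
+.. code-block:: python
+
+    from pytorch_pretrained_bert import BertConfig
+
+    # Build a configuration, save it to disk as JSON, then reload it.
+    config = BertConfig(vocab_size_or_config_json_file=32000)
+    config.to_json_file('bert_config.json')
+    config = BertConfig.from_json_file('bert_config.json')
+    print(config.to_json_string())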
+
+
 Loading Google AI or OpenAI pre-trained weights or PyTorch dump
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -124,7 +148,7 @@ Usually, if you don't set any specific environment variable, ``pytorch_pretraine
-You can alsways safely delete ``pytorch_pretrained_bert`` cache but the pretrained model weights and vocabulary files wil have to be re-downloaded from our S3.
+You can always safely delete the ``pytorch_pretrained_bert`` cache, but the pretrained model weights and vocabulary files will have to be re-downloaded from our S3.
 
 Serialization best-practices
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-This section explain how you can save and re-load a fine-tuned model (BERT, GPT, GPT-2 and Transformer-XL).
+This section explains how you can save and re-load a fine-tuned model (BERT, GPT, GPT-2 and Transformer-XL).
 There are three types of files you need to save to be able to reload a fine-tuned model:
@@ -212,267 +236,8 @@ Here is another way you can save and reload the model if you want to use specifi
     model.load_state_dict(state_dict)
     tokenizer = OpenAIGPTTokenizer(output_vocab_file)
 
-Configurations
-^^^^^^^^^^^^^^
-
-Models (BERT, GPT, GPT-2 and Transformer-XL) are defined and build from configuration classes which containes the parameters of the models (number of layers, dimensionalities...) and a few utilities to read and write from JSON configuration files. The respective configuration classes are:
-
-
-* ``BertConfig`` for ``BertModel`` and BERT classes instances.
-* ``OpenAIGPTConfig`` for ``OpenAIGPTModel`` and OpenAI GPT classes instances.
-* ``GPT2Config`` for ``GPT2Model`` and OpenAI GPT-2 classes instances.
-* ``TransfoXLConfig`` for ``TransfoXLModel`` and Transformer-XL classes instances.
-
-These configuration classes contains a few utilities to load and save configurations:
-
-
-* ``from_dict(cls, json_object)``\ : A class method to construct a configuration from a Python dictionary of parameters. Returns an instance of the configuration class.
-* ``from_json_file(cls, json_file)``\ : A class method to construct a configuration from a json file of parameters. Returns an instance of the configuration class.
-* ``to_dict()``\ : Serializes an instance to a Python dictionary. Returns a dictionary.
-* ``to_json_string()``\ : Serializes an instance to a JSON string. Returns a string.
-* ``to_json_file(json_file_path)``\ : Save an instance to a json file.
-
-Models
-^^^^^^
-
-1. ``BertModel``
-~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertModel
-    :members:
-
-
-2. ``BertForPreTraining``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForPreTraining
-    :members:
-
-
-3. ``BertForMaskedLM``
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForMaskedLM
-    :members:
-
-
-4. ``BertForNextSentencePrediction``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForNextSentencePrediction
-    :members:
-
-
-5. ``BertForSequenceClassification``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForSequenceClassification
-    :members:
-
-
-6. ``BertForMultipleChoice``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForMultipleChoice
-    :members:
-
-
-7. ``BertForTokenClassification``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForTokenClassification
-    :members:
-
-
-8. ``BertForQuestionAnswering``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.BertForQuestionAnswering
-    :members:
-
-
-9. ``OpenAIGPTModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.OpenAIGPTModel
-    :members:
-
-
-10. ``OpenAIGPTLMHeadModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.OpenAIGPTLMHeadModel
-    :members:
-
-
-11. ``OpenAIGPTDoubleHeadsModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.OpenAIGPTDoubleHeadsModel
-    :members:
-
-
-12. ``TransfoXLModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.TransfoXLModel
-    :members:
-
-
-13. ``TransfoXLLMHeadModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.TransfoXLLMHeadModel
-    :members:
-
-
-14. ``GPT2Model``
-~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.GPT2Model
-    :members:
-
-
-15. ``GPT2LMHeadModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.GPT2LMHeadModel
-    :members:
-
-
-16. ``GPT2DoubleHeadsModel``
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. autoclass:: pytorch_pretrained_bert.GPT2DoubleHeadsModel
-    :members:
-
-
-Tokenizers
-^^^^^^^^^^
-
-``BertTokenizer``
-~~~~~~~~~~~~~~~~~~~~~
-
-``BertTokenizer`` perform end-to-end tokenization, i.e. basic tokenization followed by WordPiece tokenization.
-
-This class has five arguments:
-
-
-* ``vocab_file``\ : path to a vocabulary file.
-* ``do_lower_case``\ : convert text to lower-case while tokenizing. **Default = True**.
-* ``max_len``\ : max length to filter the input of the Transformer. Default to pre-trained value for the model if ``None``. **Default = None**
-* ``do_basic_tokenize``\ : Do basic tokenization before wordpice tokenization. Set to false if text is pre-tokenized. **Default = True**.
-* ``never_split``\ : a list of tokens that should not be splitted during tokenization. **Default = ``["[UNK]", "[SEP]", "[PAD]", "[CLS]", "[MASK]"]``\ **
-
-and three methods:
-
-
-* ``tokenize(text)``\ : convert a ``str`` in a list of ``str`` tokens by (1) performing basic tokenization and (2) WordPiece tokenization.
-* ``convert_tokens_to_ids(tokens)``\ : convert a list of ``str`` tokens in a list of ``int`` indices in the vocabulary.
-* ``convert_ids_to_tokens(tokens)``\ : convert a list of ``int`` indices in a list of ``str`` tokens in the vocabulary.
-* `save_vocabulary(directory_path)`: save the vocabulary file to `directory_path`. Return the path to the saved vocabulary file: ``vocab_file_path``. The vocabulary can be reloaded with ``BertTokenizer.from_pretrained('vocab_file_path')`` or ``BertTokenizer.from_pretrained('directory_path')``.
-
-Please refer to the doc strings and code in `\ ``tokenization.py`` <./pytorch_pretrained_bert/tokenization.py>`_ for the details of the ``BasicTokenizer`` and ``WordpieceTokenizer`` classes. In general it is recommended to use ``BertTokenizer`` unless you know what you are doing.
-
-``OpenAIGPTTokenizer``
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-``OpenAIGPTTokenizer`` perform Byte-Pair-Encoding (BPE) tokenization.
-
-This class has four arguments:
-
-
-* ``vocab_file``\ : path to a vocabulary file.
-* ``merges_file``\ : path to a file containing the BPE merges.
-* ``max_len``\ : max length to filter the input of the Transformer. Default to pre-trained value for the model if ``None``. **Default = None**
-* ``special_tokens``\ : a list of tokens to add to the vocabulary for fine-tuning. If SpaCy is not installed and BERT's ``BasicTokenizer`` is used as the pre-BPE tokenizer, these tokens are not split. **Default= None**
-
-and five methods:
-
-
-* ``tokenize(text)``\ : convert a ``str`` in a list of ``str`` tokens by performing BPE tokenization.
-* ``convert_tokens_to_ids(tokens)``\ : convert a list of ``str`` tokens in a list of ``int`` indices in the vocabulary.
-* ``convert_ids_to_tokens(tokens)``\ : convert a list of ``int`` indices in a list of ``str`` tokens in the vocabulary.
-* ``set_special_tokens(self, special_tokens)``\ : update the list of special tokens (see above arguments)
-* ``encode(text)``\ : convert a ``str`` in a list of ``int`` tokens by performing BPE encoding.
-* `decode(ids, skip_special_tokens=False, clean_up_tokenization_spaces=False)`: decode a list of `int` indices in a string and do some post-processing if needed: (i) remove special tokens from the output and (ii) clean up tokenization spaces.
-* `save_vocabulary(directory_path)`: save the vocabulary, merge and special tokens files to `directory_path`. Return the path to the three files: ``vocab_file_path``\ , ``merge_file_path``\ , ``special_tokens_file_path``. The vocabulary can be reloaded with ``OpenAIGPTTokenizer.from_pretrained('directory_path')``.
-
-Please refer to the doc strings and code in `\ ``tokenization_openai.py`` <./pytorch_pretrained_bert/tokenization_openai.py>`_ for the details of the ``OpenAIGPTTokenizer``.
-
-``TransfoXLTokenizer``
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-``TransfoXLTokenizer`` perform word tokenization. This tokenizer can be used for adaptive softmax and has utilities for counting tokens in a corpus to create a vocabulary ordered by toekn frequency (for adaptive softmax). See the adaptive softmax paper (\ `Efficient softmax approximation for GPUs `_\ ) for more details.
-
-The API is similar to the API of ``BertTokenizer`` (see above).
-
-Please refer to the doc strings and code in `\ ``tokenization_transfo_xl.py`` <./pytorch_pretrained_bert/tokenization_transfo_xl.py>`_ for the details of these additional methods in ``TransfoXLTokenizer``.
-
-``GPT2Tokenizer``
-~~~~~~~~~~~~~~~~~~~~~
-
-``GPT2Tokenizer`` perform byte-level Byte-Pair-Encoding (BPE) tokenization.
-
-This class has three arguments:
-
-
-* ``vocab_file``\ : path to a vocabulary file.
-* ``merges_file``\ : path to a file containing the BPE merges.
-* ``errors``\ : How to handle unicode decoding errors. **Default = ``replace``\ **
-
-and two methods:
-
-
-* ``tokenize(text)``\ : convert a ``str`` in a list of ``str`` tokens by performing byte-level BPE.
-* ``convert_tokens_to_ids(tokens)``\ : convert a list of ``str`` tokens in a list of ``int`` indices in the vocabulary.
-* ``convert_ids_to_tokens(tokens)``\ : convert a list of ``int`` indices in a list of ``str`` tokens in the vocabulary.
-* ``set_special_tokens(self, special_tokens)``\ : update the list of special tokens (see above arguments)
-* ``encode(text)``\ : convert a ``str`` in a list of ``int`` tokens by performing byte-level BPE.
-* ``decode(tokens)``\ : convert back a list of ``int`` tokens in a ``str``.
-* `save_vocabulary(directory_path)`: save the vocabulary, merge and special tokens files to `directory_path`. Return the path to the three files: ``vocab_file_path``\ , ``merge_file_path``\ , ``special_tokens_file_path``. The vocabulary can be reloaded with ``OpenAIGPTTokenizer.from_pretrained('directory_path')``.
-
-Please refer to `\ ``tokenization_gpt2.py`` <./pytorch_pretrained_bert/tokenization_gpt2.py>`_ for more details on the ``GPT2Tokenizer``.
-
-Optimizers
-^^^^^^^^^^
-
-``BertAdam``
-~~~~~~~~~~~~~~~~
-
-``BertAdam`` is a ``torch.optimizer`` adapted to be closer to the optimizer used in the TensorFlow implementation of Bert. The differences with PyTorch Adam optimizer are the following:
-
-
-* BertAdam implements weight decay fix,
-* BertAdam doesn't compensate for bias as in the regular Adam optimizer.
-
-The optimizer accepts the following arguments:
-
-
-* ``lr`` : learning rate
-* ``warmup`` : portion of ``t_total`` for the warmup, ``-1`` means no warmup. Default : ``-1``
-* ``t_total`` : total number of training steps for the learning
-  rate schedule, ``-1`` means constant learning rate. Default : ``-1``
-* ``schedule`` : schedule to use for the warmup (see above).
-  Can be ``'warmup_linear'``\ , ``'warmup_constant'``\ , ``'warmup_cosine'``\ , ``'none'``\ , ``None`` or a ``_LRSchedule`` object (see below).
-  If ``None`` or ``'none'``\ , learning rate is always kept constant.
-  Default : ``'warmup_linear'``
-* ``b1`` : Adams b1. Default : ``0.9``
-* ``b2`` : Adams b2. Default : ``0.999``
-* ``e`` : Adams epsilon. Default : ``1e-6``
-* ``weight_decay:`` Weight decay. Default : ``0.01``
-* ``max_grad_norm`` : Maximum norm for the gradients (\ ``-1`` means no clipping). Default : ``1.0``
-
-``OpenAIAdam``
-~~~~~~~~~~~~~~~~~~
-
-``OpenAIAdam`` is similar to ``BertAdam``.
-The differences with ``BertAdam`` is that ``OpenAIAdam`` compensate for bias as in the regular Adam optimizer.
-
-``OpenAIAdam`` accepts the same arguments as ``BertAdam``.
-
 Learning Rate Schedules
-~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The ``.optimization`` module also provides additional schedules in the form of schedule objects that inherit from ``_LRSchedule``.
 All ``_LRSchedule`` subclasses accept ``warmup`` and ``t_total`` arguments at construction.
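+
+For instance, a minimal sketch using one of the schedule objects described below (``WarmupLinearSchedule`` is assumed here) together with ``BertAdam``:
+
+.. code-block:: python
+
+    import torch
+    from pytorch_pretrained_bert.optimization import BertAdam, WarmupLinearSchedule
+
+    model = torch.nn.Linear(10, 2)  # stand-in for a real model
+    # Warm up linearly over the first 10% of 1000 steps, then decay linearly.
+    schedule = WarmupLinearSchedule(warmup=0.1, t_total=1000)
+    optimizer = BertAdam(model.parameters(), lr=2e-5, schedule=schedule)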
diff --git a/docs/source/model_doc/transformerxl.rst b/docs/source/model_doc/transformerxl.rst
new file mode 100644
index 000000000..c84693b38
--- /dev/null
+++ b/docs/source/model_doc/transformerxl.rst
@@ -0,0 +1,26 @@
+Transformer XL
+----------------------------------------------------
+
+
+``TransfoXLTokenizer``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``TransfoXLTokenizer`` performs word tokenization. This tokenizer can be used for adaptive softmax and has utilities for counting tokens in a corpus to create a vocabulary ordered by token frequency (for adaptive softmax). See the adaptive softmax paper (\ `Efficient softmax approximation for GPUs <https://arxiv.org/abs/1609.04309>`_\ ) for more details.
+
+The API is similar to the API of ``BertTokenizer`` (see the BERT documentation).
+
+Please refer to the doc strings and code in `\ ``tokenization_transfo_xl.py`` <./pytorch_pretrained_bert/tokenization_transfo_xl.py>`_ for the details of these additional methods in ``TransfoXLTokenizer``.
+
+
+12. ``TransfoXLModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.TransfoXLModel
+    :members:
+
+
+13. ``TransfoXLLMHeadModel``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: pytorch_pretrained_bert.TransfoXLLMHeadModel
+    :members:
diff --git a/docs/source/model_doc/xlm.rst b/docs/source/model_doc/xlm.rst
new file mode 100644
index 000000000..70b5fa3b4
--- /dev/null
+++ b/docs/source/model_doc/xlm.rst
@@ -0,0 +1,2 @@
+XLM
+----------------------------------------------------
diff --git a/docs/source/model_doc/xlnet.rst b/docs/source/model_doc/xlnet.rst
new file mode 100644
index 000000000..d2fd996cb
--- /dev/null
+++ b/docs/source/model_doc/xlnet.rst
@@ -0,0 +1,2 @@
+XLNet
+----------------------------------------------------