Minor docs typo fixes (#8797)
* Fix minor typos
* Additional typos
* Style fix

Co-authored-by: guyrosin <guyrosin@assist-561.cs.technion.ac.il>
parent 5ced23dc84
commit 3a08cc1ce7

5 changed files with 10 additions and 9 deletions
@@ -125,7 +125,7 @@ Follow these steps to start contributing:
    $ git checkout -b a-descriptive-name-for-my-changes
    ```
 
-   **do not** work on the `master` branch.
+   **Do not** work on the `master` branch.
 
 4. Set up a development environment by running the following command in a virtual environment:
@@ -2,7 +2,6 @@ Preprocessing data
 =======================================================================================================================
 
-
 In this tutorial, we'll explore how to preprocess your data using 🤗 Transformers. The main tool for this is what we
 call a :doc:`tokenizer <main_classes/tokenizer>`. You can build one using the tokenizer class associated to the model
 you would like to use, or directly with the :class:`~transformers.AutoTokenizer` class.
 
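The paragraph in this hunk describes building a tokenizer via :class:`~transformers.AutoTokenizer`. A minimal sketch, assuming the `bert-base-cased` checkpoint that the same file uses later:

```python
from transformers import AutoTokenizer

# AutoTokenizer inspects the checkpoint name and instantiates the matching
# tokenizer class (here, the one associated with BERT).
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

encoded = tokenizer("Hello, I'm a single sentence!")
print(encoded["input_ids"])  # a list of token ids, ready for the model
```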
@@ -52,7 +51,7 @@ The tokenizer can decode a list of token ids in a proper sentence:
     "[CLS] Hello, I'm a single sentence! [SEP]"
 
 As you can see, the tokenizer automatically added some special tokens that the model expects. Not all models need
-special tokens; for instance, if we had used` gtp2-medium` instead of `bert-base-cased` to create our tokenizer, we
+special tokens; for instance, if we had used `gpt2-medium` instead of `bert-base-cased` to create our tokenizer, we
 would have seen the same sentence as the original one here. You can disable this behavior (which is only advised if you
 have added those special tokens yourself) by passing ``add_special_tokens=False``.
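A minimal sketch of the behavior this hunk documents, assuming the same `bert-base-cased` tokenizer; the first decoded string matches the example output above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# By default the tokenizer adds the special tokens the model expects.
ids = tokenizer("Hello, I'm a single sentence!")["input_ids"]
print(tokenizer.decode(ids))  # [CLS] Hello, I'm a single sentence! [SEP]

# Disable them (only advised if you add the special tokens yourself).
plain = tokenizer("Hello, I'm a single sentence!", add_special_tokens=False)["input_ids"]
print(tokenizer.decode(plain))  # Hello, I'm a single sentence!
```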
@@ -240,7 +240,9 @@ activations of the model.
     [ 0.08181786, -0.04179301]], dtype=float32)>,)
 
 The model can return more than just the final activations, which is why the output is a tuple. Here we only asked for
-the final activations, so we get a tuple with one element. .. note::
+the final activations, so we get a tuple with one element.
+
+.. note::
 
     All 🤗 Transformers models (PyTorch or TensorFlow) return the activations of the model *before* the final activation
     function (like SoftMax) since this final activation function is often fused with the loss.
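Since the note says models return activations *before* the final activation function, a sketch of applying the softmax yourself; the checkpoint name is illustrative, and TensorFlow is used to match the `float32` tensor shown in the hunk:

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Illustrative checkpoint; any TF sequence-classification model behaves the same.
name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = TFAutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("We are very happy.", return_tensors="tf")
outputs = model(inputs)  # a tuple; we only asked for the final activations
logits = outputs[0]      # raw scores, before any final activation function
probabilities = tf.nn.softmax(logits, axis=-1)  # apply SoftMax ourselves
```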
@@ -70,8 +70,8 @@ inference.
 optimizations afterwards.
 
 .. note::
-    For more information about the optimizations enabled by ONNXRuntime, please have a look at the (`ONNXRuntime Github
-    <https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/python/tools/transformers>`_)
+    For more information about the optimizations enabled by ONNXRuntime, please have a look at the `ONNXRuntime Github
+    <https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/python/tools/transformers>`_.
 
 Quantization
 -----------------------------------------------------------------------------------------------------------------------
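The note points to ONNXRuntime's transformer-specific optimizations. As a minimal sketch of turning graph optimizations on at load time (the `model.onnx` path is a placeholder for whatever file you exported):

```python
import onnxruntime as ort

# Placeholder path to a 🤗 Transformers model previously exported to ONNX.
options = ort.SessionOptions()
options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

session = ort.InferenceSession("model.onnx", options)
```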
@@ -20,14 +20,14 @@ DataCollator = NewType("DataCollator", Callable[[List[InputDataClass]], Dict[str
 
 def default_data_collator(features: List[InputDataClass]) -> Dict[str, torch.Tensor]:
     """
-    Very simple data collator that simply collates batches of dict-like objects and erforms special handling for
+    Very simple data collator that simply collates batches of dict-like objects and performs special handling for
     potential keys named:
 
         - ``label``: handles a single value (int or float) per object
         - ``label_ids``: handles a list of values per object
 
-    Des not do any additional preprocessing: property names of the input object will be used as corresponding inputs to
-    the model. See glue and ner for example of how it's useful.
+    Does not do any additional preprocessing: property names of the input object will be used as corresponding inputs
+    to the model. See glue and ner for example of how it's useful.
     """
 
     # In this function we'll make the assumption that all `features` in the batch
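To make the corrected docstring concrete, a small usage sketch; the feature dicts and token ids are made up, and note that the collator emits the batched labels under the ``labels`` key:

```python
from transformers import default_data_collator

# Made-up features: one dict-like object per example, with a single "label" value.
features = [
    {"input_ids": [101, 7592, 102], "label": 0},
    {"input_ids": [101, 2088, 102], "label": 1},
]

batch = default_data_collator(features)
print(batch["labels"])     # tensor([0, 1]); the "label" key becomes "labels"
print(batch["input_ids"])  # tensor of shape (2, 3)
```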