pytorch/docs/source/nn.rst
Emilio Castillo d38a71d579 torch.nn.modules.LazyModuleMixin and torch.nn.LazyLinear (Shape Inference II) (#44538)
Summary:
Retake on https://github.com/pytorch/pytorch/issues/40493 after all the feedback from albanD

This PR implements the generic Lazy mechanism and a sample `LazyLinear` layer with the `UninitializedParameter`.

There are two main differences from the previous PR:
- `torch.nn.Module` now remains untouched.
- We no longer require an explicit initialization or a dummy forward pass before starting training or inference of the actual module, which makes this much simpler to use (see the usage sketch below).
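
As a minimal usage sketch of the behavior this enables (the shapes below are illustrative, not part of the PR):

```python
import torch

lazy = torch.nn.LazyLinear(out_features=10)  # in_features is not known yet
print(lazy.weight)                           # still an UninitializedParameter

x = torch.randn(4, 25)
y = lazy(x)                                  # first forward infers in_features=25
print(lazy.weight.shape)                     # torch.Size([10, 25])
```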

As we discussed offline, there was a suggestion of not using a mixin, but instead changing the `__class__` attribute of `LazyLinear` so that it becomes `Linear` once it's completely initialized. While this can be useful, for the time being we need `LazyLinear` to be a `torch.nn.Module` subclass, since there are many checks that rely on modules being instances of `torch.nn.Module`.
The `__class__` swap can cause problems when we create complex modules such as:
```python
class MyNetwork(torch.nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.conv = torch.nn.Conv2d(20, 4, 2)
        self.linear = torch.nn.LazyLinear(10)
    def forward(self, x):
        y = self.conv(x).clamp(min=0)
        return self.linear(y)
```
Here, when `__setattr__` is called at the time `LazyLinear` is registered, it won't be added to the child modules of `MyNetwork`, so we would have to do that manually later. Currently there is no way to do so, since we can't access the parent module from `LazyLinear` once it becomes the `Linear` module. (We can add a workaround for this if needed.)

TODO:

- Add convolutions once the design is OK
- Fix docstrings

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44538

Reviewed By: ngimel

Differential Revision: D24162854

Pulled By: albanD

fbshipit-source-id: 6d58dfe5d43bfb05b6ee506e266db3cf4b885f0c
2020-10-19 13:13:54 -07:00

.. role:: hidden
    :class: hidden-section

torch.nn
===================================

These are the basic building blocks for graphs:

.. contents:: torch.nn
    :depth: 2
    :local:
    :backlinks: top

.. currentmodule:: torch.nn

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    ~parameter.Parameter
    ~parameter.UninitializedParameter

Containers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    Module
    Sequential
    ModuleList
    ModuleDict
    ParameterList
    ParameterDict

.. currentmodule:: torch

Convolution Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Conv1d
    nn.Conv2d
    nn.Conv3d
    nn.ConvTranspose1d
    nn.ConvTranspose2d
    nn.ConvTranspose3d
    nn.Unfold
    nn.Fold

Pooling layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.MaxPool1d
    nn.MaxPool2d
    nn.MaxPool3d
    nn.MaxUnpool1d
    nn.MaxUnpool2d
    nn.MaxUnpool3d
    nn.AvgPool1d
    nn.AvgPool2d
    nn.AvgPool3d
    nn.FractionalMaxPool2d
    nn.LPPool1d
    nn.LPPool2d
    nn.AdaptiveMaxPool1d
    nn.AdaptiveMaxPool2d
    nn.AdaptiveMaxPool3d
    nn.AdaptiveAvgPool1d
    nn.AdaptiveAvgPool2d
    nn.AdaptiveAvgPool3d

Padding Layers
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.ReflectionPad1d
    nn.ReflectionPad2d
    nn.ReplicationPad1d
    nn.ReplicationPad2d
    nn.ReplicationPad3d
    nn.ZeroPad2d
    nn.ConstantPad1d
    nn.ConstantPad2d
    nn.ConstantPad3d

Non-linear Activations (weighted sum, nonlinearity)
----------------------------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.ELU
    nn.Hardshrink
    nn.Hardsigmoid
    nn.Hardtanh
    nn.Hardswish
    nn.LeakyReLU
    nn.LogSigmoid
    nn.MultiheadAttention
    nn.PReLU
    nn.ReLU
    nn.ReLU6
    nn.RReLU
    nn.SELU
    nn.CELU
    nn.GELU
    nn.Sigmoid
    nn.SiLU
    nn.Softplus
    nn.Softshrink
    nn.Softsign
    nn.Tanh
    nn.Tanhshrink
    nn.Threshold

Non-linear Activations (other)
------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Softmin
    nn.Softmax
    nn.Softmax2d
    nn.LogSoftmax
    nn.AdaptiveLogSoftmaxWithLoss

Normalization Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.BatchNorm1d
    nn.BatchNorm2d
    nn.BatchNorm3d
    nn.GroupNorm
    nn.SyncBatchNorm
    nn.InstanceNorm1d
    nn.InstanceNorm2d
    nn.InstanceNorm3d
    nn.LayerNorm
    nn.LocalResponseNorm

Recurrent Layers
----------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.RNNBase
    nn.RNN
    nn.LSTM
    nn.GRU
    nn.RNNCell
    nn.LSTMCell
    nn.GRUCell

Transformer Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Transformer
    nn.TransformerEncoder
    nn.TransformerDecoder
    nn.TransformerEncoderLayer
    nn.TransformerDecoderLayer

Linear Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Identity
    nn.Linear
    nn.Bilinear
    nn.LazyLinear

Dropout Layers
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Dropout
    nn.Dropout2d
    nn.Dropout3d
    nn.AlphaDropout

Sparse Layers
-------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Embedding
    nn.EmbeddingBag

Distance Functions
------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.CosineSimilarity
    nn.PairwiseDistance

Loss Functions
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.L1Loss
    nn.MSELoss
    nn.CrossEntropyLoss
    nn.CTCLoss
    nn.NLLLoss
    nn.PoissonNLLLoss
    nn.KLDivLoss
    nn.BCELoss
    nn.BCEWithLogitsLoss
    nn.MarginRankingLoss
    nn.HingeEmbeddingLoss
    nn.MultiLabelMarginLoss
    nn.SmoothL1Loss
    nn.SoftMarginLoss
    nn.MultiLabelSoftMarginLoss
    nn.CosineEmbeddingLoss
    nn.MultiMarginLoss
    nn.TripletMarginLoss
    nn.TripletMarginWithDistanceLoss

Vision Layers
----------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.PixelShuffle
    nn.Upsample
    nn.UpsamplingNearest2d
    nn.UpsamplingBilinear2d

Shuffle Layers
----------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.ChannelShuffle

DataParallel Layers (multi-GPU, distributed)
--------------------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.DataParallel
    nn.parallel.DistributedDataParallel

Utilities
---------

From the ``torch.nn.utils`` module

.. currentmodule:: torch.nn.utils

.. autosummary::
    :toctree: generated
    :nosignatures:

    clip_grad_norm_
    clip_grad_value_
    parameters_to_vector
    vector_to_parameters

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    prune.BasePruningMethod

.. autosummary::
    :toctree: generated
    :nosignatures:

    prune.PruningContainer
    prune.Identity
    prune.RandomUnstructured
    prune.L1Unstructured
    prune.RandomStructured
    prune.LnStructured
    prune.CustomFromMask
    prune.identity
    prune.random_unstructured
    prune.l1_unstructured
    prune.random_structured
    prune.ln_structured
    prune.global_unstructured
    prune.custom_from_mask
    prune.remove
    prune.is_pruned
    weight_norm
    remove_weight_norm
    spectral_norm
    remove_spectral_norm

Utility functions in other modules

.. currentmodule:: torch

.. autosummary::
    :toctree: generated
    :nosignatures:

    nn.utils.rnn.PackedSequence
    nn.utils.rnn.pack_padded_sequence
    nn.utils.rnn.pad_packed_sequence
    nn.utils.rnn.pad_sequence
    nn.utils.rnn.pack_sequence
    nn.Flatten
    nn.Unflatten

Quantized Functions
--------------------

Quantization refers to techniques for performing computations and storing
tensors at lower bitwidths than floating point precision. PyTorch supports
both per tensor and per channel asymmetric linear quantization. To learn more
about how to use quantized functions in PyTorch, please refer to the
:ref:`quantization-doc` documentation.
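
As a minimal, illustrative sketch of per-tensor affine quantization (the scale
and zero point below are arbitrary example values, not recommendations)::

    import torch

    x = torch.tensor([-1.0, 0.0, 1.0, 2.0])
    # Quantize to 8-bit unsigned integers using a single scale and zero point.
    q = torch.quantize_per_tensor(x, scale=0.1, zero_point=128, dtype=torch.quint8)
    q.int_repr()    # underlying uint8 storage
    q.dequantize()  # approximate float reconstruction of x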

Lazy Modules Initialization
---------------------------

.. currentmodule:: torch

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.modules.lazy.LazyModuleMixin