Summary: Retake on https://github.com/pytorch/pytorch/issues/40493 after all the feedback from albanD.

This PR implements the generic lazy-initialization mechanism and a sample `LazyLinear` layer backed by `UninitializedParameter`. There are two main differences from the previous PR: `torch.nn.Module` now remains untouched, and we no longer require an explicit initialization or a dummy forward pass before training or inference of the actual module begins. This makes it much simpler to use from the user side.

As we discussed offline, there was a suggestion to avoid a mixin and instead change the `__class__` attribute of `LazyLinear` so that it becomes `Linear` once it is fully initialized. While this can be useful, for the time being we need `LazyLinear` to be a `torch.nn.Module` subclass, since many checks rely on modules being instances of `torch.nn.Module`. Replacing the class can cause problems when we create complex modules such as

```
class MyNetwork(torch.nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.conv = torch.nn.Conv2d(20, 4, 2)
        self.linear = torch.nn.LazyLinear(10)

    def forward(self, x):
        y = self.conv(x).clamp(min=0)
        return self.linear(y)
```

Here, when the `__setattr__` function is called at the time `LazyLinear` is registered, it won't be added to the child modules of `MyNetwork`, so we would have to add it manually later; but currently there is no way to do so, as we can't access the parent module from `LazyLinear` once it becomes the `Linear` module. (We can add a workaround for this if needed.)

TODO:
- Add convolutions once the design is OK
- Fix docstrings

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44538
Reviewed By: ngimel
Differential Revision: D24162854
Pulled By: albanD
fbshipit-source-id: 6d58dfe5d43bfb05b6ee506e266db3cf4b885f0c
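For context, here is a minimal usage sketch, assuming the `torch.nn.LazyLinear` API described above: the layer is constructed without `in_features`, holds an `UninitializedParameter`, and materializes its weight on the first forward pass, with no explicit initialization call or dummy forward pass required beforehand.

```python
import torch

# Only out_features is specified; in_features is inferred from the
# first input the module sees.
layer = torch.nn.LazyLinear(10)
print(layer.weight)        # UninitializedParameter -- shape not known yet

x = torch.randn(4, 20)     # first forward pass: in_features becomes 20
y = layer(x)               # weight is materialized as a regular Parameter

print(layer.weight.shape)  # torch.Size([10, 20])
print(y.shape)             # torch.Size([4, 10])
```
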
.. role:: hidden
    :class: hidden-section

torch.nn
===================================

These are the basic building blocks for graphs:

.. contents:: torch.nn
    :depth: 2
    :local:
    :backlinks: top


.. currentmodule:: torch.nn


.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    ~parameter.Parameter
    ~parameter.UninitializedParameter

Containers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    Module
    Sequential
    ModuleList
    ModuleDict
    ParameterList
    ParameterDict

.. currentmodule:: torch

Convolution Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Conv1d
    nn.Conv2d
    nn.Conv3d
    nn.ConvTranspose1d
    nn.ConvTranspose2d
    nn.ConvTranspose3d
    nn.Unfold
    nn.Fold

Pooling layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.MaxPool1d
    nn.MaxPool2d
    nn.MaxPool3d
    nn.MaxUnpool1d
    nn.MaxUnpool2d
    nn.MaxUnpool3d
    nn.AvgPool1d
    nn.AvgPool2d
    nn.AvgPool3d
    nn.FractionalMaxPool2d
    nn.LPPool1d
    nn.LPPool2d
    nn.AdaptiveMaxPool1d
    nn.AdaptiveMaxPool2d
    nn.AdaptiveMaxPool3d
    nn.AdaptiveAvgPool1d
    nn.AdaptiveAvgPool2d
    nn.AdaptiveAvgPool3d

Padding Layers
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.ReflectionPad1d
    nn.ReflectionPad2d
    nn.ReplicationPad1d
    nn.ReplicationPad2d
    nn.ReplicationPad3d
    nn.ZeroPad2d
    nn.ConstantPad1d
    nn.ConstantPad2d
    nn.ConstantPad3d

Non-linear Activations (weighted sum, nonlinearity)
---------------------------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.ELU
    nn.Hardshrink
    nn.Hardsigmoid
    nn.Hardtanh
    nn.Hardswish
    nn.LeakyReLU
    nn.LogSigmoid
    nn.MultiheadAttention
    nn.PReLU
    nn.ReLU
    nn.ReLU6
    nn.RReLU
    nn.SELU
    nn.CELU
    nn.GELU
    nn.Sigmoid
    nn.SiLU
    nn.Softplus
    nn.Softshrink
    nn.Softsign
    nn.Tanh
    nn.Tanhshrink
    nn.Threshold

Non-linear Activations (other)
------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Softmin
    nn.Softmax
    nn.Softmax2d
    nn.LogSoftmax
    nn.AdaptiveLogSoftmaxWithLoss

Normalization Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.BatchNorm1d
    nn.BatchNorm2d
    nn.BatchNorm3d
    nn.GroupNorm
    nn.SyncBatchNorm
    nn.InstanceNorm1d
    nn.InstanceNorm2d
    nn.InstanceNorm3d
    nn.LayerNorm
    nn.LocalResponseNorm

Recurrent Layers
----------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.RNNBase
    nn.RNN
    nn.LSTM
    nn.GRU
    nn.RNNCell
    nn.LSTMCell
    nn.GRUCell

Transformer Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Transformer
    nn.TransformerEncoder
    nn.TransformerDecoder
    nn.TransformerEncoderLayer
    nn.TransformerDecoderLayer

Linear Layers
----------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Identity
    nn.Linear
    nn.Bilinear
    nn.LazyLinear

Dropout Layers
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Dropout
    nn.Dropout2d
    nn.Dropout3d
    nn.AlphaDropout

Sparse Layers
-------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.Embedding
    nn.EmbeddingBag

Distance Functions
------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.CosineSimilarity
    nn.PairwiseDistance

Loss Functions
--------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.L1Loss
    nn.MSELoss
    nn.CrossEntropyLoss
    nn.CTCLoss
    nn.NLLLoss
    nn.PoissonNLLLoss
    nn.KLDivLoss
    nn.BCELoss
    nn.BCEWithLogitsLoss
    nn.MarginRankingLoss
    nn.HingeEmbeddingLoss
    nn.MultiLabelMarginLoss
    nn.SmoothL1Loss
    nn.SoftMarginLoss
    nn.MultiLabelSoftMarginLoss
    nn.CosineEmbeddingLoss
    nn.MultiMarginLoss
    nn.TripletMarginLoss
    nn.TripletMarginWithDistanceLoss

Vision Layers
----------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.PixelShuffle
    nn.Upsample
    nn.UpsamplingNearest2d
    nn.UpsamplingBilinear2d

Shuffle Layers
----------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.ChannelShuffle

DataParallel Layers (multi-GPU, distributed)
--------------------------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.DataParallel
    nn.parallel.DistributedDataParallel

Utilities
---------

From the ``torch.nn.utils`` module

.. currentmodule:: torch.nn.utils
.. autosummary::
    :toctree: generated
    :nosignatures:

    clip_grad_norm_
    clip_grad_value_
    parameters_to_vector
    vector_to_parameters

.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    prune.BasePruningMethod

.. autosummary::
    :toctree: generated
    :nosignatures:

    prune.PruningContainer
    prune.Identity
    prune.RandomUnstructured
    prune.L1Unstructured
    prune.RandomStructured
    prune.LnStructured
    prune.CustomFromMask
    prune.identity
    prune.random_unstructured
    prune.l1_unstructured
    prune.random_structured
    prune.ln_structured
    prune.global_unstructured
    prune.custom_from_mask
    prune.remove
    prune.is_pruned
    weight_norm
    remove_weight_norm
    spectral_norm
    remove_spectral_norm

Utility functions in other modules

.. currentmodule:: torch
.. autosummary::
    :toctree: generated
    :nosignatures:

    nn.utils.rnn.PackedSequence
    nn.utils.rnn.pack_padded_sequence
    nn.utils.rnn.pad_packed_sequence
    nn.utils.rnn.pad_sequence
    nn.utils.rnn.pack_sequence

    nn.Flatten
    nn.Unflatten

Quantized Functions
--------------------

Quantization refers to techniques for performing computations and storing tensors at lower bit widths than
floating point precision. PyTorch supports both per-tensor and per-channel asymmetric linear quantization. To learn
more about how to use quantized functions in PyTorch, please refer to the :ref:`quantization-doc` documentation.
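
As a brief illustration, per-tensor asymmetric linear quantization stores a float tensor as integers plus a
``scale`` and ``zero_point``, roughly ``q = round(x / scale) + zero_point``. The following is a minimal sketch
using ``torch.quantize_per_tensor``; the quantization documentation above is the authoritative reference:

.. code-block:: python

    import torch

    x = torch.tensor([-1.0, 0.0, 1.0, 2.0])

    # Per-tensor asymmetric linear quantization into unsigned 8-bit integers.
    q = torch.quantize_per_tensor(x, scale=0.1, zero_point=10, dtype=torch.quint8)

    print(q.int_repr())    # underlying uint8 storage: tensor([ 0, 10, 20, 30], dtype=torch.uint8)
    print(q.dequantize())  # approximate reconstruction: tensor([-1., 0., 1., 2.])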

Lazy Modules Initialization
---------------------------

.. currentmodule:: torch
.. autosummary::
    :toctree: generated
    :nosignatures:
    :template: classtemplate.rst

    nn.modules.lazy.LazyModuleMixin