mirror of https://github.com/saymrwulf/stable-baselines3.git (synced 2026-05-29 23:07:07 +00:00)
Updated custom policy docs to better explain the `mlp_extractor`'s dimensions (#1196)
* Updated custom policy docs

Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch.

* Improved custom policy doc

Section: Custom Network Architecture. Explained with greater detail that an action net and a value net will be added on top of the net_arch.

* Improved custom policy doc

Section: Custom Network Architecture. Merged a comment into a note

* Alignment

Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
parent e39bc3da00
commit 6d55a09f81
2 changed files with 21 additions and 0 deletions
@@ -60,6 +60,25 @@ Custom Network Architecture
One way of customising the policy network architecture is to pass arguments when creating the model,
using ``policy_kwargs`` parameter:

.. note::
    An extra linear layer will be added on top of the layers specified in ``net_arch``, in order to have the right output dimensions and activation functions (e.g. Softmax for discrete actions).

    In the following example, as CartPole's action space has a dimension of 2, the final dimensions of the ``net_arch``'s layers will be:


    .. code-block:: none

                obs
                <4>
           /            \
         <32>          <32>
          |              |
         <32>          <32>
          |              |
         <2>            <1>
        action         value


.. code-block:: python

  import gym

@@ -69,6 +88,7 @@ using ``policy_kwargs`` parameter:
  # Custom actor (pi) and value function (vf) networks
  # of two layers of size 32 each with Relu activation function
  # Note: an extra linear layer will be added on top of the pi and the vf nets, respectively
  policy_kwargs = dict(activation_fn=th.nn.ReLU,
                       net_arch=[dict(pi=[32, 32], vf=[32, 32])])
  # Create the agent
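
The layer widths drawn in the diagram can be reproduced with a few lines of plain Python. This is only an illustrative sketch and does not call Stable-Baselines3: the helpers ``layer_dims`` and ``describe_heads`` are hypothetical names introduced here. It assumes, as the note states, that exactly one extra linear layer is appended to each branch (2 outputs for the CartPole actions, 1 output for the value).

```python
from typing import Dict, List


def layer_dims(obs_dim: int, hidden: List[int], out_dim: int) -> List[int]:
    """Return the width of every layer: input first, extra output layer last."""
    return [obs_dim, *hidden, out_dim]


def describe_heads(
    obs_dim: int, n_actions: int, net_arch: Dict[str, List[int]]
) -> Dict[str, List[int]]:
    """Hypothetical helper: widths of the actor (pi) and critic (vf) branches."""
    return {
        # Actor branch: hidden layers from net_arch["pi"], then one output per action.
        "action": layer_dims(obs_dim, net_arch["pi"], n_actions),
        # Critic branch: hidden layers from net_arch["vf"], then a single scalar value.
        "value": layer_dims(obs_dim, net_arch["vf"], 1),
    }


# CartPole numbers taken from the diagram: 4 observations, 2 discrete actions.
dims = describe_heads(obs_dim=4, n_actions=2, net_arch=dict(pi=[32, 32], vf=[32, 32]))
for name, widths in dims.items():
    print(name, " -> ".join(f"<{w}>" for w in widths))
```

Changing the ``pi`` or ``vf`` lists changes only the hidden widths. Under the assumption above, the final width of each branch is fixed by the action space and by the scalar value output, not by ``net_arch``.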
@@ -52,6 +52,7 @@ Documentation:
^^^^^^^^^^^^^^
- Updated Hugging Face Integration page (@simoninithomas)
- Changed ``env`` to ``vec_env`` when environment is vectorized
- Updated custom policy docs to better explain the ``mlp_extractor``'s dimensions (@AlexPasqua)
- Update custom policy documentation (@athatheo)

Release 1.6.2 (2022-10-10)