Updated custom policy docs to better explain the `mlp_extractor`'s dimensions (#1196)

* Updated custom policy docs

Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch.

* Improved custom policy doc

Section: Custom Network Architecture.
Explained in greater detail that an action net and a value net will be added on top of the net_arch.

* Improved custom policy doc

Section: Custom Network Architecture.
Merged a comment into a note

* Alignment

Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
Alex Pasquali 2022-12-12 16:19:51 +01:00 committed by GitHub
parent e39bc3da00
commit 6d55a09f81
2 changed files with 21 additions and 0 deletions


@ -60,6 +60,25 @@ Custom Network Architecture
One way of customising the policy network architecture is to pass arguments when creating the model,
using ``policy_kwargs`` parameter:
.. note::
    An extra linear layer will be added on top of the layers specified in ``net_arch``, in order to have the right output dimensions and activation functions (e.g. Softmax for discrete actions).

    In the following example, as CartPole's action space has a dimension of 2, the final dimensions of the ``net_arch``'s layers will be:

.. code-block:: none

            obs
            <4>
       /            \
     <32>          <32>
      |              |
     <32>          <32>
      |              |
     <2>            <1>
    action         value

.. code-block:: python

    import gym
@ -69,6 +88,7 @@ using ``policy_kwargs`` parameter:
    # Custom actor (pi) and value function (vf) networks
    # of two layers of size 32 each with Relu activation function
    # Note: an extra linear layer will be added on top of the pi and the vf nets, respectively
    policy_kwargs = dict(activation_fn=th.nn.ReLU,
                         net_arch=[dict(pi=[32, 32], vf=[32, 32])])
    # Create the agent


@ -52,6 +52,7 @@ Documentation:
^^^^^^^^^^^^^^
- Updated Hugging Face Integration page (@simoninithomas)
- Changed ``env`` to ``vec_env`` when environment is vectorized
- Updated custom policy docs to better explain the ``mlp_extractor``'s dimensions (@AlexPasqua)
- Update custom policy documentation (@athatheo)
Release 1.6.2 (2022-10-10)
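
The layer dimensions drawn in the note can be reproduced with a small, library-free sketch. It only does the arithmetic: given an observation size, the hidden sizes from ``net_arch``, and an output size, it lists the ``(in_features, out_features)`` pair of each linear layer in one branch. The helper ``linear_layer_shapes`` is hypothetical and is not part of Stable-Baselines3; the values 4, 2, and 1 are the ones shown in the diagram for CartPole.

```python
from typing import List, Tuple


def linear_layer_shapes(
    input_dim: int, hidden_sizes: List[int], output_dim: int
) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) for each linear layer of one branch.

    Hypothetical helper for illustration only; not a Stable-Baselines3 function.
    """
    # Chain the sizes: input -> hidden layers from net_arch -> extra output layer.
    sizes = [input_dim, *hidden_sizes, output_dim]
    return list(zip(sizes[:-1], sizes[1:]))


# Values taken from the CartPole diagram in the note.
obs_dim = 4      # size of the observation vector
n_actions = 2    # size of the discrete action space
net_arch = dict(pi=[32, 32], vf=[32, 32])

# The actor branch ends in a layer of size n_actions, the critic in a layer of size 1.
actor_shapes = linear_layer_shapes(obs_dim, net_arch["pi"], n_actions)
critic_shapes = linear_layer_shapes(obs_dim, net_arch["vf"], 1)

print("actor :", actor_shapes)
print("critic:", critic_shapes)
```

The last pair in each list is the extra linear layer the note describes: it is not listed in ``net_arch`` but is appended so that the actor outputs one value per action and the critic outputs a single state value.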