mirror of https://github.com/saymrwulf/stable-baselines3.git (synced 2026-05-18 21:30:19 +00:00)
Add imitation library docs (#200)
* docs: Add imitation library docs
* Fix doc syntax errors
* Fix internal link; PDF->abstract for DAgger for consistency
* Grammar
* Update migration guide

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Adam Gleave <adam@gleave.me>
parent dd6e361204
commit b252f4212c
4 changed files with 59 additions and 2 deletions
docs/guide/imitation.rst (Normal file, 55 lines added)
@@ -0,0 +1,55 @@
.. _imitation:

Imitation Learning
==================

The `imitation <https://github.com/HumanCompatibleAI/imitation>`__ library implements
imitation learning algorithms on top of Stable-Baselines3, including:

- Behavioral Cloning
- `DAgger <https://arxiv.org/abs/1011.0686>`_ with synthetic examples
- `Adversarial Inverse Reinforcement Learning <https://arxiv.org/abs/1710.11248>`_ (AIRL)
- `Generative Adversarial Imitation Learning <https://arxiv.org/abs/1606.03476>`_ (GAIL)


It also provides `CLI scripts <#cli-quickstart>`_ for training and saving
demonstrations from RL experts, and for training imitation learners on these demonstrations.

Installation
------------

Installation requires Python 3.7+:

::

   pip install imitation
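
Since the only stated requirement is Python 3.7+, it can help to check the interpreter before running ``pip``. The snippet below is a minimal sketch using only the standard library; ``MIN_PYTHON`` and ``meets_minimum`` are hypothetical names introduced for illustration and are not part of imitation.

```python
import sys

# Minimum interpreter version stated in the installation note above.
MIN_PYTHON = (3, 7)


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) version is at least `minimum`."""
    if version_info is None:
        # Default to the interpreter running this script.
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum():
        print("Python version is sufficient; run: pip install imitation")
    else:
        print("Python 3.7+ is required, found %d.%d" % tuple(sys.version_info[:2]))
```

Comparing only the first two components keeps the check independent of patch releases.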


CLI Quickstart
---------------------

::

   # Train PPO agent on cartpole and collect expert demonstrations
   python -m imitation.scripts.expert_demos with fast cartpole log_dir=quickstart

   # Train GAIL from demonstrations
   python -m imitation.scripts.train_adversarial with fast gail cartpole rollout_path=quickstart/rollouts/final.pkl

   # Train AIRL from demonstrations
   python -m imitation.scripts.train_adversarial with fast airl cartpole rollout_path=quickstart/rollouts/final.pkl


.. note::

   You can remove the ``fast`` option to run training to completion. For more CLI options
   and information on reading Tensorboard plots, see the
   `README <https://github.com/HumanCompatibleAI/imitation#cli-quickstart>`_.
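
The two ``train_adversarial`` invocations in the quickstart differ only in the algorithm name, and the ``fast`` option is optional. As a sketch, assuming the command lines are exactly those shown in the quickstart, a small helper can assemble them as argument lists suitable for ``subprocess.run``. ``build_train_command`` is a hypothetical helper, not part of imitation, and it does not check the options against any particular imitation release.

```python
import sys


def build_train_command(algo, rollout_path, fast=True, env="cartpole"):
    """Assemble the train_adversarial command line from the quickstart.

    `algo` is "gail" or "airl". `fast` mirrors the ``fast`` option, which
    can be dropped to run training to completion.
    """
    if algo not in ("gail", "airl"):
        raise ValueError("algo must be 'gail' or 'airl', got %r" % (algo,))
    cmd = [sys.executable, "-m", "imitation.scripts.train_adversarial", "with"]
    if fast:
        cmd.append("fast")
    cmd += [algo, env, "rollout_path=" + rollout_path]
    return cmd


if __name__ == "__main__":
    # Only print the commands; pass a list to subprocess.run(..., check=True)
    # to actually launch training.
    for name in ("gail", "airl"):
        print(" ".join(build_train_command(name, "quickstart/rollouts/final.pkl")))
```

Building the command as a list rather than a single string avoids shell quoting issues if the rollout path contains spaces.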


Python Interface Quickstart
---------------------------

This `example script <https://github.com/HumanCompatibleAI/imitation/blob/master/examples/quickstart.py>`_
uses the Python API to train BC, GAIL, and AIRL models on CartPole data.
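
For readers new to the idea, behavioral cloning (BC) treats imitation as supervised learning on expert (observation, action) pairs. The toy below is a self-contained sketch of that idea only: it does not use imitation or Stable-Baselines3, the demonstrations are made up for illustration, and ``fit_tabular_bc`` is a hypothetical name.

```python
from collections import Counter, defaultdict


def fit_tabular_bc(demonstrations):
    """Fit a tabular policy that picks the expert's most common action per observation."""
    counts = defaultdict(Counter)
    for obs, action in demonstrations:
        counts[obs][action] += 1
    table = {obs: c.most_common(1)[0][0] for obs, c in counts.items()}

    def policy(obs, default=0):
        # Fall back to `default` for observations the expert never visited.
        return table.get(obs, default)

    return policy


# Made-up demonstrations: the observation is the side the pole leans to,
# the action is 0 (push left) or 1 (push right).
demos = [("left", 0), ("left", 0), ("left", 1), ("right", 1), ("right", 1)]
policy = fit_tabular_bc(demos)
print(policy("left"), policy("right"), policy("upright"))
```

A real BC learner replaces the lookup table with a neural network policy trained on the same kind of pairs; the linked example script shows the library's actual interface.
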

@@ -46,8 +46,8 @@ Breaking Changes
 - The features extractor (CNN extractor) is shared between policy and q-networks for DDPG/SAC/TD3 and only the policy loss used to update it (much faster)
 - Tensorboard legacy logging was dropped in favor of having one logger for the terminal and Tensorboard (cf :ref:`Tensorboard integration <tensorboard>`)
 - We dropped ACKTR/ACER support because of their complexity compared to simpler alternatives (PPO, SAC, TD3) performing as good.
-- We dropped GAIL support as we are focusing on model-free RL only, you can however take a look at the `Imitation Learning Baseline Implementations <https://github.com/HumanCompatibleAI/imitation>`_
-  which are based on SB3.
+- We dropped GAIL support as we are focusing on model-free RL only, you can however take a look at the :ref:`imitation project <imitation>` which implements
+  GAIL and other imitation learning algorithms on top of SB3.
 - ``action_probability`` is currently not implemented in the base class

 You can take a look at the `issue about SB3 implementation design <https://github.com/hill-a/stable-baselines/issues/576>`_ for more details.

@@ -44,6 +44,7 @@ Main Features
    guide/callbacks
    guide/tensorboard
    guide/rl_zoo
+   guide/imitation
    guide/migration
    guide/checking_nan
    guide/developer

@@ -40,6 +40,7 @@ Others:
 Documentation:
 ^^^^^^^^^^^^^^
 - Added first draft of migration guide
+- Added intro to `imitation <https://github.com/HumanCompatibleAI/imitation>`_ library (@shwang)
 - Enabled doc for ``CnnPolicies``