.. Stable Baselines3 documentation master file, created by
   sphinx-quickstart on Thu Sep 26 11:06:54 2019.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Stable-Baselines3 Docs - Reliable Reinforcement Learning Implementations
========================================================================

`Stable Baselines3 (SB3) <https://github.com/DLR-RM/stable-baselines3>`_ is a set of reliable implementations of reinforcement learning algorithms in PyTorch.
It is the next major version of `Stable Baselines <https://github.com/hill-a/stable-baselines>`_.

GitHub repository: https://github.com/DLR-RM/stable-baselines3

RL Baselines3 Zoo (training framework for SB3): https://github.com/DLR-RM/rl-baselines3-zoo

RL Baselines3 Zoo provides a collection of pre-trained agents along with scripts for training and evaluating agents, tuning hyperparameters, plotting results, and recording videos.

SB3 Contrib (experimental RL code, latest algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib

Main Features
--------------

- Unified structure for all algorithms
- PEP8 compliant (unified code style)
- Documented functions and classes
- Tests, high code coverage and type hints
- Clean code
- TensorBoard support
- **The performance of each algorithm was tested** (see the *Results* section on each algorithm's page)

.. toctree::
   :maxdepth: 2
   :caption: User Guide

   guide/install
   guide/quickstart
   guide/rl_tips
   guide/rl
   guide/algos
   guide/examples
   guide/vec_envs
   guide/custom_env
   guide/custom_policy
   guide/callbacks
   guide/tensorboard
   guide/rl_zoo
   guide/sb3_contrib
   guide/imitation
   guide/migration
   guide/checking_nan
   guide/developer
   guide/save_format
   guide/export

.. toctree::
   :maxdepth: 1
   :caption: RL Algorithms

   modules/base
   modules/a2c
   modules/ddpg
   modules/dqn
   modules/her
   modules/ppo
   modules/sac
   modules/td3

.. toctree::
   :maxdepth: 1
   :caption: Common

   common/atari_wrappers
   common/env_util
   common/distributions
   common/evaluation
   common/env_checker
   common/monitor
   common/logger
   common/noise
   common/utils

.. toctree::
   :maxdepth: 1
   :caption: Misc

   misc/changelog
   misc/projects

Citing Stable Baselines3
------------------------

To cite this project in publications:

.. code-block:: bibtex

    @misc{stable-baselines3,
      author = {Raffin, Antonin and Hill, Ashley and Ernestus, Maximilian and Gleave, Adam and Kanervisto, Anssi and Dormann, Noah},
      title = {Stable Baselines3},
      year = {2019},
      publisher = {GitHub},
      journal = {GitHub repository},
      howpublished = {\url{https://github.com/DLR-RM/stable-baselines3}},
    }

Contributing
------------

If you are interested in improving the RL baselines, there is still work to be done.
You can check the open issues in the `repo <https://github.com/DLR-RM/stable-baselines3/issues>`_.

If you want to contribute, please read `CONTRIBUTING.md <https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md>`_ first.

Indices and tables
-------------------
* :ref:`genindex`
* :ref:`search`
* :ref:`modindex`