Mirror of https://github.com/saymrwulf/stable-baselines3.git (synced 2026-05-14 20:58:03 +00:00)
Update readme and clarify planned features (#2030)
* Update readme and clarify planned features
* Fix rtd python version
* Fix pip version for rtd
* Update rtd ubuntu and mambaforge
* Add upper bound for gymnasium
* [ci skip] Update readme
parent 3d59b5c86b
commit dd3d0acf15
7 changed files with 36 additions and 21 deletions
|
|
@ -16,6 +16,6 @@ conda:
|
|||
environment: docs/conda_env.yml
|
||||
|
||||
build:
|
||||
os: ubuntu-22.04
|
||||
os: ubuntu-24.04
|
||||
tools:
|
||||
python: "mambaforge-22.9"
|
||||
python: "mambaforge-23.11"
|
||||
|
|
|
|||
|
|
@ -6,7 +6,7 @@ into two categories:
|
|||
- Create an issue about your intended feature, and we shall discuss the design and
|
||||
implementation. Once we agree that the plan looks good, go ahead and implement it.
|
||||
2. You want to implement a feature or bug-fix for an outstanding issue
|
||||
- Look at the outstanding issues here: https://github.com/DLR-RM/stable-baselines3/issues
|
||||
- Look at the outstanding issues here: https://github.com/DLR-RM/stable-baselines3/labels/help%20wanted
|
||||
- Pick an issue or feature and comment on the task that you want to work on this feature.
|
||||
- If you need more context on a particular issue, please ask, and we shall provide.
|
||||
|
||||
|
|
|
|||
32
README.md
|
|
@ -1,6 +1,6 @@
|
|||
<!-- [](https://gitlab.com/araffin/stable-baselines3/-/commits/master) -->
|
||||

|
||||
[](https://stable-baselines3.readthedocs.io/en/master/?badge=master) [](https://gitlab.com/araffin/stable-baselines3/-/commits/master)
|
||||
[](https://github.com/DLR-RM/stable-baselines3/actions/workflows/ci.yml)
|
||||
[](https://stable-baselines3.readthedocs.io/en/master/?badge=master) [](https://github.com/DLR-RM/stable-baselines3/actions/workflows/ci.yml)
|
||||
[](https://github.com/psf/black)
|
||||
|
||||
|
||||
|
|
@ -22,6 +22,8 @@ These algorithms will make it easier for the research community and industry to
|
|||
**The performance of each algorithm was tested** (see *Results* section in their respective page),
|
||||
you can take a look at the issues [#48](https://github.com/DLR-RM/stable-baselines3/issues/48) and [#49](https://github.com/DLR-RM/stable-baselines3/issues/49) for more details.
|
||||
|
||||
We also provide detailed logs and reports on the [OpenRL Benchmark](https://wandb.ai/openrlbenchmark/sb3) platform.
|
||||
|
||||
|
||||
| **Features** | **Stable-Baselines3** |
|
||||
| --------------------------- | ----------------------|
|
||||
|
|
@ -41,7 +43,13 @@ you can take a look at the issues [#48](https://github.com/DLR-RM/stable-baselin
|
|||
|
||||
### Planned features
|
||||
|
||||
Please take a look at the [Roadmap](https://github.com/DLR-RM/stable-baselines3/issues/1) and [Milestones](https://github.com/DLR-RM/stable-baselines3/milestones).
|
||||
Since most of the features from the [original roadmap](https://github.com/DLR-RM/stable-baselines3/issues/1) have been implemented, there are no major changes planned for SB3, it is now *stable*.
|
||||
If you want to contribute, you can search in the issues for the ones where [help is welcomed](https://github.com/DLR-RM/stable-baselines3/labels/help%20wanted) and the other [proposed enhancements](https://github.com/DLR-RM/stable-baselines3/labels/enhancement).
|
||||
|
||||
While SB3 development is now focused on bug fixes and maintenance (doc update, user experience, ...), there is more active development going on in the associated repositories:
|
||||
- newer algorithms are regularly added to the [SB3 Contrib](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib) repository
|
||||
- faster variants are developed in the [SBX (SB3 + Jax)](https://github.com/araffin/sbx) repository
|
||||
- the training framework for SB3, the RL Zoo, has an active [roadmap](https://github.com/DLR-RM/rl-baselines3-zoo/issues/299)
|
||||
|
||||
## Migration guide: from Stable-Baselines (SB2) to Stable-Baselines3 (SB3)
|
||||
|
||||
|
|
@ -79,7 +87,7 @@ Documentation: https://rl-baselines3-zoo.readthedocs.io/en/master/
|
|||
|
||||
We implement experimental features in a separate contrib repository: [SB3-Contrib](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib)
|
||||
|
||||
This allows SB3 to maintain a stable and compact core, while still providing the latest features, like Recurrent PPO (PPO LSTM), Truncated Quantile Critics (TQC), Quantile Regression DQN (QR-DQN) or PPO with invalid action masking (Maskable PPO).
|
||||
This allows SB3 to maintain a stable and compact core, while still providing the latest features, like Recurrent PPO (PPO LSTM), CrossQ, Truncated Quantile Critics (TQC), Quantile Regression DQN (QR-DQN) or PPO with invalid action masking (Maskable PPO).
|
||||
|
||||
Documentation is available online: [https://sb3-contrib.readthedocs.io/](https://sb3-contrib.readthedocs.io/)
|
||||
|
||||
|
|
@ -97,17 +105,16 @@ It provides a minimal number of features compared to SB3 but can be much faster
|
|||
### Prerequisites
|
||||
Stable Baselines3 requires Python 3.8+.
|
||||
|
||||
#### Windows 10
|
||||
#### Windows
|
||||
|
||||
To install stable-baselines on Windows, please look at the [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/install.html#prerequisites).
|
||||
|
||||
|
||||
### Install using pip
|
||||
Install the Stable Baselines3 package:
|
||||
```sh
|
||||
pip install 'stable-baselines3[extra]'
|
||||
```
|
||||
pip install stable-baselines3[extra]
|
||||
```
|
||||
**Note:** Some shells such as Zsh require quotation marks around brackets, i.e. `pip install 'stable-baselines3[extra]'` ([More Info](https://stackoverflow.com/a/30539963)).
|
||||
|
||||
This includes an optional dependencies like Tensorboard, OpenCV or `ale-py` to train on atari games. If you do not need those, you can use:
|
||||
```sh
|
||||
|
|
@ -177,6 +184,7 @@ All the following examples can be executed online using Google Colab notebooks:
|
|||
| ------------------- | ------------------ | ------------------ | ------------------ | ------------------- | ------------------ | --------------------------------- |
|
||||
| ARS<sup>[1](#f1)</sup> | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
|
||||
| A2C | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| CrossQ<sup>[1](#f1)</sup> | :x: | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: |
|
||||
| DDPG | :x: | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: |
|
||||
| DQN | :x: | :x: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
|
||||
| HER | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
|
||||
|
|
@ -191,7 +199,7 @@ All the following examples can be executed online using Google Colab notebooks:
|
|||
|
||||
<b id="f1">1</b>: Implemented in [SB3 Contrib](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib) GitHub repository.
|
||||
|
||||
Actions `gym.spaces`:
|
||||
Actions `gymnasium.spaces`:
|
||||
* `Box`: A N-dimensional box that contains every point in the action space.
|
||||
* `Discrete`: A list of possible actions, where each timestep only one of the actions can be used.
|
||||
* `MultiDiscrete`: A list of possible actions, where each timestep only one action of each discrete set can be used.
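To make the difference between `Discrete` and `MultiDiscrete` concrete, here is a plain-Python sketch of what picking one action per timestep looks like for each. It deliberately does not import `gymnasium`; the helper names `sample_discrete` and `sample_multidiscrete` are hypothetical and only mimic the semantics described in the list above.

```python
import random


def sample_discrete(n, rng):
    """Discrete(n): exactly one of n possible actions per timestep."""
    return rng.randrange(n)


def sample_multidiscrete(nvec, rng):
    """MultiDiscrete(nvec): one action from each discrete set per timestep."""
    return [rng.randrange(n) for n in nvec]


rng = random.Random(0)  # seeded only to make the sketch reproducible

# A single choice among 4 actions, e.g. up/down/left/right.
action = sample_discrete(4, rng)

# Three independent choices, with 3, 2 and 5 options respectively.
actions = sample_multidiscrete([3, 2, 5], rng)

print(action, actions)
```

In Gymnasium itself these would correspond to `gymnasium.spaces.Discrete(4)` and `gymnasium.spaces.MultiDiscrete([3, 2, 5])`, each of which exposes a `sample()` method.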
|
||||
|
|
@ -218,9 +226,9 @@ To run a single test:
|
|||
python3 -m pytest -v -k 'test_check_env_dict_action'
|
||||
```
|
||||
|
||||
You can also do a static type check using `pytype` and `mypy`:
|
||||
You can also do a static type check using `mypy`:
|
||||
```sh
|
||||
pip install pytype mypy
|
||||
pip install mypy
|
||||
make type
|
||||
```
|
||||
|
||||
|
|
@ -252,6 +260,8 @@ To cite this repository in publications:
|
|||
}
|
||||
```
|
||||
|
||||
Note: If you need to refer to a specific version of SB3, you can also use the [Zenodo DOI](https://doi.org/10.5281/zenodo.8123988).
|
||||
|
||||
## Maintainers
|
||||
|
||||
Stable-Baselines3 is currently maintained by [Ashley Hill](https://github.com/hill-a) (aka @hill-a), [Antonin Raffin](https://araffin.github.io/) (aka [@araffin](https://github.com/araffin)), [Maximilian Ernestus](https://github.com/ernestum) (aka @ernestum), [Adam Gleave](https://github.com/adamgleave) (@AdamGleave), [Anssi Kanervisto](https://github.com/Miffyli) (@Miffyli) and [Quentin Gallouédec](https://gallouedec.com/) (@qgallouedec).
|
||||
|
|
|
|||
|
|
@ -1,18 +1,18 @@
|
|||
name: root
|
||||
channels:
|
||||
- pytorch
|
||||
- defaults
|
||||
- conda-forge
|
||||
dependencies:
|
||||
- cpuonly=1.0=0
|
||||
- pip=22.3.1
|
||||
- python=3.8
|
||||
- pytorch=1.13.0=py3.8_cpu_0
|
||||
- pip=24.2
|
||||
- python=3.11
|
||||
- pytorch=2.5.0=py3.11_cpu_0
|
||||
- pip:
|
||||
- gymnasium
|
||||
- gymnasium>=0.28.1,<0.30
|
||||
- cloudpickle
|
||||
- opencv-python-headless
|
||||
- pandas
|
||||
- numpy
|
||||
- numpy>=1.20,<2.0
|
||||
- matplotlib
|
||||
- sphinx>=5,<8
|
||||
- sphinx_rtd_theme>=1.3.0
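As a rough illustration of what the new bounds in this environment file admit, the sketch below checks a few version strings against `gymnasium>=0.28.1,<0.30` and `numpy>=1.20,<2.0`. The `parse` and `satisfies` helpers are hypothetical and deliberately simplified: they compare dotted numeric versions only and do not implement full PEP 440 semantics (pre-releases, wildcards, and so on). The sample versions are illustrative, not taken from the commit.

```python
from typing import Tuple


def parse(version: str, width: int = 3) -> Tuple[int, ...]:
    """Turn '0.29.1' into (0, 29, 1), padding short versions with zeros."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (numeric comparison only)."""
    return parse(lower) <= parse(version) < parse(upper)


# Bounds copied from the pip section of the environment file above.
BOUNDS = {
    "gymnasium": ("0.28.1", "0.30"),
    "numpy": ("1.20", "2.0"),
}

for name, version in [("gymnasium", "0.29.1"), ("gymnasium", "0.30.0"),
                      ("numpy", "1.26.4"), ("numpy", "2.0.0")]:
    lower, upper = BOUNDS[name]
    print(name, version, satisfies(version, lower, upper))
```

The upper bounds are exclusive, so the first release of the next series (`0.30.0`, `2.0.0`) is rejected while every patch release below it is accepted.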
|
||||
|
|
|
|||
|
|
@ -10,6 +10,7 @@ Name ``Box`` ``Discrete`` ``MultiDiscrete`` ``MultiBinary``
|
|||
=================== =========== ============ ================= =============== ================
|
||||
ARS [#f1]_ ✔️ ✔️ ❌ ❌ ✔️
|
||||
A2C ✔️ ✔️ ✔️ ✔️ ✔️
|
||||
CrossQ [#f1]_ ✔️ ❌ ❌ ❌ ✔️
|
||||
DDPG ✔️ ❌ ❌ ❌ ✔️
|
||||
DQN ❌ ✔️ ❌ ❌ ✔️
|
||||
HER ✔️ ✔️ ❌ ❌ ✔️
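The rows visible in this hunk can be read as a lookup from algorithm to supported action spaces. A minimal sketch in plain Python that encodes only the six rows shown here; the `SUPPORT` dictionary and the `supports` and `candidates` helpers are hypothetical names for illustration, not part of the SB3 API:

```python
SPACES = ("Box", "Discrete", "MultiDiscrete", "MultiBinary")

# Rows copied from the table above: True = supported, False = not supported.
# ARS and CrossQ carry the [#f1]_ footnote, i.e. they live in SB3 Contrib.
SUPPORT = {
    "ARS": (True, True, False, False),
    "A2C": (True, True, True, True),
    "CrossQ": (True, False, False, False),
    "DDPG": (True, False, False, False),
    "DQN": (False, True, False, False),
    "HER": (True, True, False, False),
}


def supports(algo: str, space: str) -> bool:
    """Return whether `algo` handles the given action space type."""
    return SUPPORT[algo][SPACES.index(space)]


def candidates(space: str) -> list:
    """List the algorithms (from the rows above) that support `space`."""
    return sorted(algo for algo in SUPPORT if supports(algo, space))


print(candidates("Discrete"))
print(candidates("MultiBinary"))
```

The table contains more rows than this hunk shows, so the lookup is intentionally partial.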
|
||||
|
|
|
|||
|
|
@ -113,12 +113,14 @@ To cite this project in publications:
|
|||
url = {http://jmlr.org/papers/v22/20-1364.html}
|
||||
}
|
||||
|
||||
Note: If you need to refer to a specific version of SB3, you can also use the `Zenodo DOI <https://doi.org/10.5281/zenodo.8123988>`_.
|
||||
|
||||
Contributing
|
||||
------------
|
||||
|
||||
To any interested in making the rl baselines better, there are still some improvements
|
||||
that need to be done.
|
||||
You can check issues in the `repo <https://github.com/DLR-RM/stable-baselines3/issues>`_.
|
||||
You can check issues in the `repository <https://github.com/DLR-RM/stable-baselines3/labels/help%20wanted>`_.
|
||||
|
||||
If you want to contribute, please read `CONTRIBUTING.md <https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md>`_ first.
|
||||
|
||||
|
|
|
|||
|
|
@ -68,6 +68,7 @@ Others:
|
|||
- Updated PyTorch version on CI to 2.3.1
|
||||
- Added a warning to recommend using CPU with on policy algorithms (A2C/PPO) and ``MlpPolicy``
|
||||
- Switched to uv to download packages faster on GitHub CI
|
||||
- Updated dependencies for read the doc
|
||||
|
||||
Bug Fixes:
|
||||
^^^^^^^^^^
|
||||
|
|
@ -75,6 +76,7 @@ Bug Fixes:
|
|||
Documentation:
|
||||
^^^^^^^^^^^^^^
|
||||
- Updated PPO doc to recommend using CPU with ``MlpPolicy``
|
||||
- Clarified documentation about planned features and citing software
|
||||
|
||||
Release 2.3.2 (2024-04-27)
|
||||
--------------------------
|
||||
|
|
|
|||