865 commits

Author SHA1 Message Date
Antonin RAFFIN
daaebd0a52
Drop python 3.8 and add python 3.12 support (#2041)
* Drop python 3.8 support, add python 3.12 support

* Upgrade to python 3.9 syntax

* Fixes for Numpy v2

* Fix doc warning
2024-11-18 15:40:36 +01:00
Antonin RAFFIN
020ee42f4d
Release 2.4.0 (#2040) 2024-11-18 11:03:03 +01:00
Antonin RAFFIN
e4f4f123e3
Add note about SAC ent coeff optimization (#2037)
* Allow new sphinx version

* Add note about SAC ent coeff and add DQN tutorial link
2024-11-08 11:01:04 +01:00
Mark Towers
8f0b488bc5
Update Gymnasium to v1.0.0 (#1837)
* Update Gymnasium to v1.0.0a1

* Comment out `gymnasium.wrappers.monitor` (todo update to VideoRecord)

* Fix ruff warnings

* Register Atari envs

* Update `getattr` to `Env.get_wrapper_attr`

* Reorder imports

* Fix `seed` order

* Fix collecting `max_steps`

* Copy and paste video recorder to prevent the need to rewrite the vec video recorder wrapper

* Use `typing.List` rather than list

* Fix env attribute forwarding

* Separate out env attribute collection from its utilisation

* Update for Gymnasium alpha 2

* Remove assert for OrderedDict

* Update setup.py

* Add type: ignore

* Test with Gymnasium main

* Remove `gymnasium.logger.debug/info`

* Fix github CI yaml

* Run gym 0.29.1 on python 3.10

* Update lower bounds

* Integrate video recorder

* Remove ordered dict

* Update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-11-04 12:03:12 +01:00
Antonin RAFFIN
dd3d0acf15
Update readme and clarify planned features (#2030)
* Update readme and clarify planned features

* Fix rtd python version

* Fix pip version for rtd

* Update rtd ubuntu and mambaforge

* Add upper bound for gymnasium

* [ci skip] Update readme
2024-10-29 12:23:13 +01:00
Antonin RAFFIN
3d59b5c86b
Use uv on GitHub CI for faster download and update changelog (#2026)
* Use uv on GitHub CI for faster download and update changelog

* Fix new mypy issues
2024-10-24 15:20:05 +02:00
Devin White
56c153f048
Add warning when using PPO on GPU and update doc (#2017)
* Update documentation

Added a comment and sample code to the PPO documentation noting that CPU should primarily be used unless a CNN is used. Added a warning for both PPO and A2C, shown when the user runs on GPU without a CNN (see issue #1245).

* Add warning to base class and add test

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-10-07 11:24:47 +02:00
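The check described in the commit above (#2017) can be sketched roughly as follows. This is a minimal illustration and not the code from the pull request: the function name `warn_if_gpu_without_cnn`, its parameters, and the warning text are all hypothetical, and the only behavior taken from the commit message is "warn when running on GPU without a CNN".

```python
import warnings


def warn_if_gpu_without_cnn(device_type: str, uses_cnn: bool) -> bool:
    """Emit a UserWarning when a model runs on a non-CPU device without a CNN policy.

    Hypothetical helper: the name, signature, and message are assumptions
    made for illustration only.
    """
    if device_type != "cpu" and not uses_cnn:
        warnings.warn(
            "Running on GPU without a CNN policy; CPU is usually the better choice here.",
            UserWarning,
        )
        return True
    return False


# Record warnings so the three cases can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned_mlp_gpu = warn_if_gpu_without_cnn("cuda", uses_cnn=False)
    warned_cnn_gpu = warn_if_gpu_without_cnn("cuda", uses_cnn=True)
    warned_mlp_cpu = warn_if_gpu_without_cnn("cpu", uses_cnn=False)

n_warnings = len(caught)
```

Only the first case (GPU, no CNN) triggers the warning in this sketch; where the real check lives and how it detects a CNN policy is not stated in the commit message.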
Antonin RAFFIN
512eea923a
Warn users when using multi-dim MultiDiscrete obs space (#2003)
* Update env checker to warn users when using multi-dim MultiDiscrete obs space

* Update changelog
2024-09-13 13:15:23 +02:00
Antonin RAFFIN
9a3b28bb9f
[ci skip] Update README.md, fix image display 2024-08-23 08:58:43 +02:00
Antonin RAFFIN
4a7631b71d
Fix test device for buffers (#1993)
* Prevent test_device from being a noop

* Update changelog

---------

Co-authored-by: Adrià Garriga-Alonso <adria@far.ai>
2024-08-18 12:33:22 +02:00
Jan-Hendrik Ewers
4a1137ba3a
Add np.ndarray as a recognized type for TB histograms. (#1635)
* Add np.ndarray as a recognized type for TB histograms.

Torch histograms allow th.Tensor, np.ndarray, and caffe2 formatted strings. This commit expands the TensorBoardOutputFormat's capabilities to log the two former types.

* Update changelog to reflect bug fix

* fix: add try/catch for when either np or torch isn't at the required version. See https://github.com/DLR-RM/stable-baselines3/pull/1635 for more details

* fix: Add comment describing the test for when add_histogram should not have been called

* Cleanup

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-08-02 11:55:27 +02:00
Chris Schindlbeck
6ad6fa55b6
Fix various typos (#1981) 2024-07-29 10:44:23 +02:00
Antonin RAFFIN
bd3c0c6530
Fix loading of optimizer with older DQN models (#1978) 2024-07-26 14:57:55 +02:00
Antonin RAFFIN
000544cc1f
Add support for pre and post linear modules in create_mlp (#1975)
* Add support for pre and post linear modules in `create_mlp`

* Disable mypy for python 3.8

* Reformat toml file

* Update docstring

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Add some comments

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2024-07-22 13:42:33 +02:00
Quentin Gallouédec
1a69fc8314
Update examples.rst (#1969) 2024-07-15 23:57:24 +02:00
Corentin
d8148deeaa
Updated DQN optimizer input to only include q_network parameters as input (#1963)
* Updated DQN optimizer input to only include q_network parameters

* Update version

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-07-05 19:07:55 +02:00
Sahit Chintalapudi
0eebde7ca1
Fix typo in examples.rst (#1962)
The variable `env` is not defined. The gym env we want to change is `vec_env`
2024-07-05 15:00:48 +02:00
Dominik Baron
24ebf1a1df
Remove unnecessary SDE resampling in PPO update (#1933)
* Remove unnecessary SDE resampling in PPO update

* Update changelog.rst

* Update version

* Update PyTorch version on CI

* Update ruff

* Limit NumPy version

* Reformat

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-06-29 20:07:32 +02:00
will-maclean
4efee92fba
Set CallbackList children's parent correctly (#1939)
* Fixing #1791

* Update test and version

* Add test for callback after eval

* Fix mypy error

* Remove tqdm warnings

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-06-07 14:07:28 +02:00
Joe Ksiazek
0b06d8ab20
Fix error when loading a model that has net_arch manually set to None (#1937)
* Fix loading a model with net_arch=None

* Remove redundant get

* Dummy commit

* Add to contributors

* Update test and version

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-06-05 17:27:40 +02:00
Ole Petersen
6c00565778
Fix memory leak in base_class.py (#1908)
* Fix memory leak in base_class.py

Loading the data return value is not necessary since it is unused, and loading it causes a memory leak through the ep_info_buffer variable. I found this while loading a PPO learner from storage on a multi-GPU system: the ep_info_buffer is loaded to the memory location it was on when it was saved to disk, instead of the target loading location, and is then not cleaned up.

* Update changelog.rst

* Update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-05-15 15:59:32 +02:00
Chris Schindlbeck
4317c62598
Fix various typos (#1926)
* Fix various typos

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-05-15 15:19:39 +02:00
Andrew James
766b9e9f7d
Avoid torch type-error under torch.compile (#1922)
* Avoid torch type-error under torch.compile

* Update changelog and version

* Update stable_baselines3/common/buffers.py

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-05-13 17:28:23 +02:00
Antonin RAFFIN
285e01f64a
Hotfix: revert loading with weights_only=True (#1913) 2024-04-27 15:08:38 +02:00
Nicolò Lucchesi
35eccaf04f
Fix tensorboard video slow numpy->torch conversion (#1910)
* fixed tb video docs

* updated changelog

* add comment on expected render() output

* Update changelog.rst

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-04-26 12:12:04 +02:00
Corentin
e93175084f
Adding ER-MRL to community project (#1904)
* Add ER_MRL

* Update changelog

* Move ER-MRL at the end of the file

* Improve project description

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2024-04-25 14:31:15 +02:00
Antonin Raffin
4af4a32d1b
Update RL Tips and Tricks section 2024-04-22 10:25:32 +02:00
Mark Smith
9a749389d3
Cast learning_rate to float lambda for pickle safety when doing model.load (#1901)
* create failing test for unpickle error

* Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types

* Updated with feedback from araffin on PR#1901

* Update test and version

* Update changelog and SBX doc

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-04-22 10:04:01 +02:00
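The idea behind the commit above (#1901), wrapping a learning-rate schedule so it always returns a plain Python `float`, can be sketched as follows. This is an illustration under stated assumptions, not the code from the pull request: `as_float_schedule` and `linear_schedule` are hypothetical names, and `fractions.Fraction` merely stands in for any non-float numeric type a user-supplied schedule might return.

```python
from fractions import Fraction
from typing import Callable

# A schedule maps the remaining training progress (1.0 -> 0.0) to a learning rate.
Schedule = Callable[[float], float]


def as_float_schedule(schedule: Callable[[float], object]) -> Schedule:
    """Wrap a schedule so its result is always cast to a built-in float.

    Hypothetical helper for illustration only.
    """

    def wrapped(progress_remaining: float) -> float:
        return float(schedule(progress_remaining))

    return wrapped


def linear_schedule(progress_remaining: float) -> Fraction:
    # Deliberately returns a non-float type (Fraction) to mimic a schedule
    # whose output would not be a plain float.
    return Fraction(3, 10000) * Fraction(progress_remaining)


lr_schedule = as_float_schedule(linear_schedule)
raw_value = linear_schedule(0.5)
safe_value = lr_schedule(0.5)
```

Here `raw_value` is a `Fraction` while `safe_value` is a built-in `float`, which is the property the commit message says matters when a saved model is loaded with `weights_only=True`.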
Chaitanya Bisht
5623d98f9d
Fixed broken link in ppo.rst (#1884) 2024-04-08 15:48:26 +02:00
Antonin RAFFIN
40ba50467c
Fix typo in changelog (#1882) 2024-04-01 16:07:52 +02:00
Antonin RAFFIN
429be93c48
Release v2.3.0 (#1879)
* Release v2.3.0

* Fix typos
2024-03-31 20:25:19 +02:00
Corentin
071226d3e8
Log success rate for on policy algorithms (#1870)
* Add success rate in monitor for on policy algorithms

* Update changelog

* make commit-checks refactoring

* Assert buffers are not none in _dump_logs

* Automatic refactoring of the type hinting

* Add success_rate logging test for on policy algorithms

* Update changelog

* Reformat

* Fix tests and update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-03-22 12:13:48 +01:00
Antonin RAFFIN
8b3723c6d8
Update ruff and documentation for hf sb3 (#1866)
* Update ruff

* Only load weights with `torch.load()` to avoid security issues

* Update doc about HF integration and remote code execution

* Fix doc build

* Revert weights_only=True for policies
2024-03-11 13:53:06 +01:00
Rushit Shah
f375cc3939
Fix docstring for `log_interval` to differentiate between on-policy/off-policy logging frequency (#1855)
* Fix docstring for log_interval inside the learn method in the base class.

* Updated changelog.

* Update docstring

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-03-04 11:42:16 +01:00
StagOverflow
56f20e40a2
Fix sum_independent_dims docstring to reflect output shape (#1851)
Co-authored-by: Heinrick Lumini <heinrl@Heinricks-MacBook-Pro.local>
2024-02-27 14:49:42 +01:00
Antonin RAFFIN
a8e905977f
Update env checker for spaces with non-zero start (#1845)
* Update ruff

* Update env checker for non-zero start
2024-02-19 16:44:02 +01:00
Antonin RAFFIN
1cba1bbd2f
Update to black style v24 (#1834) 2024-02-13 11:36:05 +01:00
Marek Michalik
beee4279eb
Fix example in README.md (#1830)
* Fix example in README.md

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-02-13 10:47:05 +01:00
Antonin RAFFIN
620e58e61f
Update SB3 ONNX export documentation (#1816) 2024-01-30 15:53:25 +01:00
Antonin RAFFIN
a9273f968e
Update TD3/DDPG/DQN defaults for consistency (#1785)
* Update TD3/DDPG/DQN defaults for consistency

* Update changelog
2024-01-12 16:05:14 +01:00
Francesco Capuano
a653aec10d
Docs: Env attributes should be modified using env setters (#1789)
* add: paragraph on how to modify vec envs attributes via setters (solves DLR-RM#1573)

* Update vec env doc

* Update callback doc and SB3 version

* Fix indentation

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2024-01-10 14:46:40 +01:00
Quentin Gallouédec
373166d6ac
Fix doc: Gym to Gymnasium Atari install command in examples.rst (#1773)
* Update examples.rst

* Update changelog.rst
2023-12-05 11:31:11 +01:00
Quentin Gallouédec
c8fda060d4
Adding PokemonRedExperiments project (#1762)
* Adding pokemon red

* update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-11-23 16:18:00 +01:00
Quentin Gallouédec
e3dea4b2e0
Release 2.2.1: Hotfix file closing (#1754)
* new closing policy

* revert #1742

* Add tests and update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-11-17 23:50:23 +01:00
Antonin RAFFIN
e1eac844af
Release v2.2.0 (#1750) 2023-11-16 17:42:10 +01:00
Antonin RAFFIN
23fbeb5975
Fix resource warning (#1742)
* Fix resource warning

* Add test and update changelog

* Fix for new mypy version
2023-11-16 17:11:13 +01:00
Antonin RAFFIN
b413f4c285
Fix VecEnv type hints (#1736)
* Fix VecNormalize type hints

* Fix VecEnv utils type annotations

* Apply suggestions from code review

Co-authored-by: M. Ernestus <maximilian@ernestus.de>

* Remove PyType

---------

Co-authored-by: M. Ernestus <maximilian@ernestus.de>
2023-11-08 09:46:40 +01:00
Antonin RAFFIN
d671402c93
Fix policies type annotations (#1735) 2023-11-06 18:35:28 +01:00
Antonin RAFFIN
a35c08c0d6
Fix offpolicy algo type hints (#1734)
* Fix offpolicy algo type hints

* Update PyTorch to have latest type hints

* Fix pip argument

* Try PyTorch 2.0.1

* Revert "Try PyTorch 2.0.1"

This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69.

* Update changelog
2023-11-06 11:17:36 +01:00
Antonin RAFFIN
018ea5ab67
Fix distributions type hints (#1733)
* Fix distributions type hints

* Add test for multi binary action space

* Fix test
2023-11-06 10:09:01 +01:00