Commit graph

851 commits

Author SHA1 Message Date
Antonin Raffin
955e202258 Try torch compile and other optimization 2024-07-07 15:20:21 +02:00
Corentin
d8148deeaa
Updated DQN optimizer input to only include q_network parameters as input (#1963)
* Updated DQN optimizer input to only include q_network parameters

* Update version

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-07-05 19:07:55 +02:00
Sahit Chintalapudi
0eebde7ca1
Fix typo in examples.rst (#1962)
The variable `env` is not defined. The gym env we want to change is `vec_env`
2024-07-05 15:00:48 +02:00
Dominik Baron
24ebf1a1df
Remove unnecessary SDE resampling in PPO update (#1933)
* Remove unnecessary SDE resampling in PPO update

* Update changelog.rst

* Update version

* Update PyTorch version on CI

* Update ruff

* Limit NumPy version

* Reformat

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-06-29 20:07:32 +02:00
will-maclean
4efee92fba
Set CallbackList children's parent correctly (#1939)
* Fixing #1791

* Update test and version

* Add test for callback after eval

* Fix mypy error

* Remove tqdm warnings

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-06-07 14:07:28 +02:00
Joe Ksiazek
0b06d8ab20
Fix error when loading a model that has net_arch manually set to None (#1937)
* Fix loading a model with net_arch=None

* Remove redundant get

* Dummy commit

* Add to contributors

* Update test and version

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-06-05 17:27:40 +02:00
Ole Petersen
6c00565778
Fix memory leak in base_class.py (#1908)
* Fix memory leak in base_class.py

The data return value is unused, so loading it is unnecessary. Loading it also causes a memory leak through the ep_info_buffer variable. I found this while loading a PPO learner from storage on a multi-GPU system: the ep_info_buffer is loaded to the memory location it was on when it was saved to disk, instead of the target loading location, and is then never cleaned up.

* Update changelog.rst

* Update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-05-15 15:59:32 +02:00
Chris Schindlbeck
4317c62598
Fix various typos (#1926)
* Fix various typos

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-05-15 15:19:39 +02:00
Andrew James
766b9e9f7d
Avoid torch type-error under torch.compile (#1922)
* Avoid torch type-error under torch.compile

* Update changelog and version

* Update stable_baselines3/common/buffers.py

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-05-13 17:28:23 +02:00
Antonin RAFFIN
285e01f64a
Hotfix: revert loading with weights_only=True (#1913) 2024-04-27 15:08:38 +02:00
Nicolò Lucchesi
35eccaf04f
Fix tensorboard video slow numpy->torch conversion (#1910)
* fixed tb video docs

* updated changelog

* add comment on expected render() output

* Update changelog.rst

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-04-26 12:12:04 +02:00
Corentin
e93175084f
Adding ER-MRL to community project (#1904)
* Add ER_MRL

* Update changelog

* Move ER-MRL at the end of the file

* Improve project description

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2024-04-25 14:31:15 +02:00
Antonin Raffin
4af4a32d1b
Update RL Tips and Tricks section 2024-04-22 10:25:32 +02:00
Mark Smith
9a749389d3
Cast learning_rate to float lambda for pickle safety when doing model.load (#1901)
* create failing test for unpickle error

* Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types

* Updated with feedback from araffin on PR#1901

* Update test and version

* Update changelog and SBX doc

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-04-22 10:04:01 +02:00
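Note on the commit above: the message describes casting the learning-rate schedule's output to `float` so that a schedule returning a non-float numeric type does not break loading. The following is only a minimal sketch of that idea, not the code from the PR. The helper name `as_float_schedule` and the `Fraction`-based schedule are hypothetical, chosen so the example runs without third-party packages.

```python
from fractions import Fraction


def fraction_schedule(progress_remaining):
    # Hypothetical schedule that returns a non-float numeric type.
    # Fraction stands in for any such type (for example a NumPy scalar).
    return Fraction(3, 10000) * Fraction(progress_remaining)


def as_float_schedule(schedule):
    # Hypothetical helper: wrap a schedule in a lambda that always
    # returns a plain Python float, whatever the wrapped schedule returns.
    return lambda progress_remaining: float(schedule(progress_remaining))


safe_schedule = as_float_schedule(fraction_schedule)

raw_value = fraction_schedule(0.5)   # a Fraction
safe_value = safe_schedule(0.5)      # a plain float

print(type(raw_value).__name__, type(safe_value).__name__)
```

Wrapping at the point where the schedule is stored means every later call site sees a plain `float`, regardless of what the user-supplied function returns.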
Chaitanya Bisht
5623d98f9d
Fixed broken link in ppo.rst (#1884) 2024-04-08 15:48:26 +02:00
Antonin RAFFIN
40ba50467c
Fix typo in changelog (#1882) 2024-04-01 16:07:52 +02:00
Antonin RAFFIN
429be93c48
Release v2.3.0 (#1879)
* Release v2.3.0

* Fix typos
2024-03-31 20:25:19 +02:00
Corentin
071226d3e8
Log success rate for on policy algorithms (#1870)
* Add success rate in monitor for on policy algorithms

* Update changelog

* make commit-checks refactoring

* Assert buffers are not none in _dump_logs

* Automatic refactoring of the type hinting

* Add success_rate logging test for on policy algorithms

* Update changelog

* Reformat

* Fix tests and update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-03-22 12:13:48 +01:00
Antonin RAFFIN
8b3723c6d8
Update ruff and documentation for hf sb3 (#1866)
* Update ruff

* Only load weights with `torch.load()` to avoid security issues

* Update doc about HF integration and remote code execution

* Fix doc build

* Revert weights_only=True for policies
2024-03-11 13:53:06 +01:00
Rushit Shah
f375cc3939
Fix docstring for `log_interval` to differentiate between on-policy/off-policy logging frequency (#1855)
* Fix docstring for log_interval inside the learn method in the base class.

* Updated changelog.

* Update docstring

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-03-04 11:42:16 +01:00
StagOverflow
56f20e40a2
Fix sum_independent_dims docstring to reflect output shape (#1851)
Co-authored-by: Heinrick Lumini <heinrl@Heinricks-MacBook-Pro.local>
2024-02-27 14:49:42 +01:00
Antonin RAFFIN
a8e905977f
Update env checker for spaces with non-zero start (#1845)
* Update ruff

* Update env checker for non-zero start
2024-02-19 16:44:02 +01:00
Antonin RAFFIN
1cba1bbd2f
Update to black style v24 (#1834) 2024-02-13 11:36:05 +01:00
Marek Michalik
beee4279eb
Fix example in README.md (#1830)
* Fix example in README.md

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-02-13 10:47:05 +01:00
Antonin RAFFIN
620e58e61f
Update SB3 ONNX export documentation (#1816) 2024-01-30 15:53:25 +01:00
Antonin RAFFIN
a9273f968e
Update TD3/DDPG/DQN defaults for consistency (#1785)
* Update TD3/DDPG/DQN defaults for consistency

* Update changelog
2024-01-12 16:05:14 +01:00
Francesco Capuano
a653aec10d
Docs: Env attributes should be modified using env setters (#1789)
* add: paragraph on how to modify vec envs attributes via setters (solves DLR-RM#1573)

* Update vec env doc

* Update callback doc and SB3 version

* Fix indentation

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2024-01-10 14:46:40 +01:00
Quentin Gallouédec
373166d6ac
Fix doc: Gym to Gymnasium Atari install command in examples.rst (#1773)
* Update examples.rst

* Update changelog.rst
2023-12-05 11:31:11 +01:00
Quentin Gallouédec
c8fda060d4
Adding PokemonRedExperiments project (#1762)
* Adding pokemon red

* update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-11-23 16:18:00 +01:00
Quentin Gallouédec
e3dea4b2e0
Release 2.2.1: Hotfix file closing (#1754)
* new closing policy

* revert #1742

* Add tests and update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-11-17 23:50:23 +01:00
Antonin RAFFIN
e1eac844af
Release v2.2.0 (#1750) 2023-11-16 17:42:10 +01:00
Antonin RAFFIN
23fbeb5975
Fix resource warning (#1742)
* Fix resource warning

* Add test and update changelog

* Fix for new mypy version
2023-11-16 17:11:13 +01:00
Antonin RAFFIN
b413f4c285
Fix VecEnv type hints (#1736)
* Fix VecNormalize type hints

* Fix VecEnv utils type annotations

* Apply suggestions from code review

Co-authored-by: M. Ernestus <maximilian@ernestus.de>

* Remove PyType

---------

Co-authored-by: M. Ernestus <maximilian@ernestus.de>
2023-11-08 09:46:40 +01:00
Antonin RAFFIN
d671402c93
Fix policies type annotations (#1735) 2023-11-06 18:35:28 +01:00
Antonin RAFFIN
a35c08c0d6
Fix offpolicy algo type hints (#1734)
* Fix offpolicy algo type hints

* Update PyTorch to have latest type hints

* Fix pip argument

* Try PyTorch 2.0.1

* Revert "Try PyTorch 2.0.1"

This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69.

* Update changelog
2023-11-06 11:17:36 +01:00
Antonin RAFFIN
018ea5ab67
Fix distributions type hints (#1733)
* Fix distributions type hints

* Add test for multi binary action space

* Fix test
2023-11-06 10:09:01 +01:00
Antonin RAFFIN
294f2b4309
Documentation update (#1732)
* Update RL Tips

* Fix grammar

* Update SBX doc

* Fix various typos and grammar mistakes
2023-11-03 17:17:46 +01:00
M. Ernestus
69afefc91d
Add rollout_buffer_class parameter to on-policy algorithms (#1720)
* Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm

* Add rollout_buffer_class and rollout_buffer_kwargs to PPO.

* Add rollout_buffer_class and rollout_buffer_kwargs to A2C.

* Make use of the rollout buffer kwargs.

* Update version

* Add test and update doc

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2023-10-27 17:36:24 +02:00
M. Ernestus
f56ddeda10
Merge pull request #1724 from DLR-RM/chores/update-deps
Update dependencies (shimmy, sphinx), remove `sphinx_autodoc_typehints`
2023-10-24 21:28:05 +02:00
Antonin Raffin
350931f441
Use mambaforge as tool 2023-10-23 20:38:48 +02:00
Antonin Raffin
75a5b68e8a
Remove tools key 2023-10-23 20:34:20 +02:00
Antonin Raffin
c1fc8e4c75
Update RTD config 2023-10-23 20:31:58 +02:00
Antonin Raffin
6c70993c8f
Remove sphinx-autodoc-typehints 2023-10-23 20:26:33 +02:00
Antonin Raffin
d672008a32
Update dependencies (remove sphinx type hint plugin), protect type aliases 2023-10-23 20:14:15 +02:00
Antonin Raffin
80245bccc8
Update dependencies (shimmy, sphinx) 2023-10-23 20:14:09 +02:00
Hosseinkhan Rémy
aab545901f
Add support for setting options at reset with VecEnv (#1606)
* Update signatures, and test with options

* Update changelog and black formatting

* Finish implementation (fixes, doc, tests)

* Use deepcopy to avoid side effects (modif by reference)

* Fix for mypy

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-10-23 13:38:48 +02:00
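Note on the commit above: one of its steps is "Use deepcopy to avoid side effects (modif by reference)". The sketch below illustrates why copying reset options matters when a single options dict is shared across several sub-environments. `TinyVecEnv` and its methods are a hypothetical stand-in written for this example and are not the library's classes.

```python
from copy import deepcopy


class TinyVecEnv:
    """Hypothetical stand-in for a vectorized environment."""

    def __init__(self, num_envs):
        self.num_envs = num_envs
        self._options = [{} for _ in range(num_envs)]

    def set_options(self, options=None):
        # Store options to be used at the next reset.
        if options is None:
            options = {}
        if isinstance(options, dict):
            # One independent deep copy per sub-environment, so that
            # neither the caller nor another sub-env can mutate it later.
            self._options = [deepcopy(options) for _ in range(self.num_envs)]
        else:
            self._options = deepcopy(options)

    def reset(self):
        # Return the options that would be passed to each sub-env,
        # then clear them so they apply to a single reset only.
        used = self._options
        self._options = [{} for _ in range(self.num_envs)]
        return used


caller_options = {"start": [0, 0]}
vec_env = TinyVecEnv(num_envs=2)
vec_env.set_options(caller_options)

# Mutating the caller's dict afterwards does not leak into the env.
caller_options["start"].append(99)
used_options = vec_env.reset()
print(used_options)
```

Without the `deepcopy` calls, all sub-environments would share the same nested list, and the later `append` would silently change what each of them receives at reset.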
Jan-Hendrik Ewers
2ddf015cd9
fix: Follow PEP8 guidelines and evaluate falsy to truthy with not rather than is False. (#1707)
* fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`.

https://docs.python.org/2/library/stdtypes.html#truth-value-testing

* chore: Update changelog in line with intent of changes in PR #1707

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* fix: Change `is False` to `not` as per PEP8

* chore: Remove superfluous comment about `is False`

* test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing

* Update changelog

* chore: Remove EvalCallback as it's not actually required

* Update changelog.rst

* Rm duplicated "others" section in changelog.rst

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-10-09 12:21:12 +02:00
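Note on the commit above: `not x` and `x is False` are not equivalent in Python. The first applies truth-value testing, the second is an identity check against the `False` singleton. The short example below shows the difference on the standard falsy values. The function names are hypothetical and exist only for this illustration.

```python
def is_off_by_identity(flag):
    # True only when flag is the False singleton itself.
    return flag is False


def is_off_by_truthiness(flag):
    # True for every falsy value: False, 0, 0.0, "", [], {}, None, ...
    return not flag


falsy_values = [False, 0, 0.0, "", [], {}, None]

identity_hits = [v for v in falsy_values if is_off_by_identity(v)]
truthiness_hits = [v for v in falsy_values if is_off_by_truthiness(v)]

print(len(identity_hits), len(truthiness_hits))
```

Only `False` itself passes the identity check, while all seven values pass the truthiness check, so switching between the two forms can change behaviour for inputs such as `0`, `None`, or an empty container.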
Quentin Gallouédec
c6bf251d46
Add argument features_extractor to ActorCriticPolicy.extract_features (#1710)
* add argument to extract_features

* remove empty lines

* changelog and version
2023-10-09 11:11:36 +02:00
Antonin RAFFIN
c6c660e51b
Fix type annotations of buffers (#1700)
* Fix type annotation and replay buffer

* Exclude pytype check

* Remove some pytype-specific annotations and update changelog

* Fix HerReplayBuffer type hints

* try remove `# type: ignore[assignment]`

* revert change

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2023-09-28 18:52:46 +02:00
Kyle Sayers
fab6cb339d
BitFlippingEnv argument check and docs clarification (#1698)
* made change, not tested yet

* add back _obs_space with note on purpose

* match formatting

* update documentation
2023-09-27 10:18:30 +02:00