Commit graph

487 commits

Author SHA1 Message Date
Nicolò Lucchesi
35eccaf04f
Fix tensorboard video slow numpy->torch conversion (#1910)
* fixed tb video docs

* updated changelog

* add comment on expected render() output

* Update changelog.rst

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-04-26 12:12:04 +02:00
Corentin
e93175084f
Adding ER-MRL to community project (#1904)
* Add ER_MRL

* Update changelog

* Move ER-MRL at the end of the file

* Improve project description

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2024-04-25 14:31:15 +02:00
Antonin Raffin
4af4a32d1b
Update RL Tips and Tricks section 2024-04-22 10:25:32 +02:00
Mark Smith
9a749389d3
Cast learning_rate to float lambda for pickle safety when doing model.load (#1901)
* create failing test for unpickle error

* Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types

* Updated with feedback from araffin on PR#1901

* Update test and version

* Update changelog and SBX doc

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-04-22 10:04:01 +02:00
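
A minimal sketch of the idea behind the commit above, assuming the fix amounts to wrapping the user-supplied schedule so that it always returns a plain Python float. The helper name `as_float_schedule` and the `linear` schedule are hypothetical and are not taken from the repository; `Fraction` only stands in for any non-float numeric type.

```python
from fractions import Fraction
from typing import Callable


def as_float_schedule(schedule: Callable[[float], object]) -> Callable[[float], float]:
    """Hypothetical helper: wrap a schedule so it always returns a Python float."""

    def wrapped(progress_remaining: float) -> float:
        # Cast whatever the schedule returns to a built-in float.
        return float(schedule(progress_remaining))

    return wrapped


def linear(progress_remaining: float):
    """Illustrative schedule that returns a non-float type (a Fraction)."""
    return Fraction(3, 10000) * Fraction(progress_remaining)


safe_linear = as_float_schedule(linear)
value = safe_linear(0.5)
print(type(value).__name__, value)
```

The wrapped schedule keeps the same call signature, so it can be passed wherever the original callable was accepted.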
Chaitanya Bisht
5623d98f9d
Fixed broken link in ppo.rst (#1884) 2024-04-08 15:48:26 +02:00
Antonin RAFFIN
40ba50467c
Fix typo in changelog (#1882) 2024-04-01 16:07:52 +02:00
Antonin RAFFIN
429be93c48
Release v2.3.0 (#1879)
* Release v2.3.0

* Fix typos
2024-03-31 20:25:19 +02:00
Corentin
071226d3e8
Log success rate for on policy algorithms (#1870)
* Add success rate in monitor for on policy algorithms

* Update changelog

* make commit-checks refactoring

* Assert buffers are not none in _dump_logs

* Automatic refactoring of the type hinting

* Add success_rate logging test for on policy algorithms

* Update changelog

* Reformat

* Fix tests and update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-03-22 12:13:48 +01:00
Antonin RAFFIN
8b3723c6d8
Update ruff and documentation for hf sb3 (#1866)
* Update ruff

* Only load weights with `torch.load()` to avoid security issues

* Update doc about HF integration and remote code execution

* Fix doc build

* Revert weight_only=True for policies
2024-03-11 13:53:06 +01:00
Rushit Shah
f375cc3939
Fix docstring for `log_interval` to differentiate between on-policy/off-policy logging frequency (#1855)
* Fix docstring for log_interval inside the learn method in the base class.

* Updated changelog.

* Update docstring

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2024-03-04 11:42:16 +01:00
StagOverflow
56f20e40a2
Fix sum_independent_dims docstring to reflect output shape (#1851)
Co-authored-by: Heinrick Lumini <heinrl@Heinricks-MacBook-Pro.local>
2024-02-27 14:49:42 +01:00
Antonin RAFFIN
a8e905977f
Update env checker for spaces with non-zero start (#1845)
* Update ruff

* Update env checker for non-zero start
2024-02-19 16:44:02 +01:00
Antonin RAFFIN
1cba1bbd2f
Update to black style v24 (#1834) 2024-02-13 11:36:05 +01:00
Marek Michalik
beee4279eb
Fix example in README.md (#1830)
* Fix example in README.md

* Update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2024-02-13 10:47:05 +01:00
Antonin RAFFIN
620e58e61f
Update SB3 ONNX export documentation (#1816) 2024-01-30 15:53:25 +01:00
Antonin RAFFIN
a9273f968e
Update TD3/DDPG/DQN defaults for consistency (#1785)
* Update TD3/DDPG/DQN defaults for consistency

* Update changelog
2024-01-12 16:05:14 +01:00
Francesco Capuano
a653aec10d
Docs: Env attributes should be modified using env setters (#1789)
* add: paragraph on how to modify vec envs attributes via setters (solves DLR-RM#1573)

* Update vec env doc

* Update callback doc and SB3 version

* Fix indentation

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2024-01-10 14:46:40 +01:00
Quentin Gallouédec
373166d6ac
Fix doc: Gym to Gymnasium Atari install command in examples.rst (#1773)
* Update examples.rst

* Update changelog.rst
2023-12-05 11:31:11 +01:00
Quentin Gallouédec
c8fda060d4
Adding PokemonRedExperiments project (#1762)
* Adding pokemon red

* update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-11-23 16:18:00 +01:00
Quentin Gallouédec
e3dea4b2e0
Release 2.2.1: Hotfix file closing (#1754)
* new closing policy

* revert #1742

* Add tests and update changelog

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-11-17 23:50:23 +01:00
Antonin RAFFIN
e1eac844af
Release v2.2.0 (#1750) 2023-11-16 17:42:10 +01:00
Antonin RAFFIN
23fbeb5975
Fix resource warning (#1742)
* Fix resource warning

* Add test and update changelog

* Fix for new mypy version
2023-11-16 17:11:13 +01:00
Antonin RAFFIN
b413f4c285
Fix VecEnv type hints (#1736)
* Fix VecNormalize type hints

* Fix VecEnv utils type annotations

* Apply suggestions from code review

Co-authored-by: M. Ernestus <maximilian@ernestus.de>

* Remove PyType

---------

Co-authored-by: M. Ernestus <maximilian@ernestus.de>
2023-11-08 09:46:40 +01:00
Antonin RAFFIN
d671402c93
Fix policies type annotations (#1735) 2023-11-06 18:35:28 +01:00
Antonin RAFFIN
a35c08c0d6
Fix offpolicy algo type hints (#1734)
* Fix offpolicy algo type hints

* Update PyTorch to have latest type hints

* Fix pip argument

* Try PyTorch 2.0.1

* Revert "Try PyTorch 2.0.1"

This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69.

* Update changelog
2023-11-06 11:17:36 +01:00
Antonin RAFFIN
018ea5ab67
Fix distributions type hints (#1733)
* Fix distributions type hints

* Add test for multi binary action space

* Fix test
2023-11-06 10:09:01 +01:00
Antonin RAFFIN
294f2b4309
Documentation update (#1732)
* Update RL Tips

* Fix grammar

* Update SBX doc

* Fix various typos and grammar mistakes
2023-11-03 17:17:46 +01:00
M. Ernestus
69afefc91d
Add rollout_buffer_class parameter to on-policy algorithms (#1720)
* Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm

* Add rollout_buffer_class and rollout_buffer_kwargs to PPO.

* Add rollout_buffer_class and rollout_buffer_kwargs to A2C.

* Make use of the rollout buffer kwargs.

* Update version

* Add test and update doc

---------

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2023-10-27 17:36:24 +02:00
Antonin Raffin
6c70993c8f
Remove sphinx-autodoc-typehints 2023-10-23 20:26:33 +02:00
Antonin Raffin
d672008a32
Update dependencies (remove sphinx type hint plugin), protect type aliases 2023-10-23 20:14:15 +02:00
Hosseinkhan Rémy
aab545901f
Add support for setting options at reset with VecEnv (#1606)
* Update signatures, and test with options

* Update changelog and black formatting

* Finish implementation (fixes, doc, tests)

* Use deepcopy to avoid side effects (modif by reference)

* Fix for mypy

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-10-23 13:38:48 +02:00
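
A small sketch of why the commit above mentions `deepcopy`: if a reset routine mutates the options it receives, passing a deep copy keeps the caller's dictionary unchanged. The function `reset_with_options` is a hypothetical stand-in, not the project's VecEnv code.

```python
from copy import deepcopy


def reset_with_options(options: dict) -> dict:
    """Hypothetical stand-in for a reset that mutates the options it is given."""
    options["used"] = True
    options["start"].append(1)
    return options


user_options = {"start": [0, 0]}

# Passing a deep copy means nested objects (the list here) are duplicated too,
# so the mutation inside the callee cannot leak back to the caller.
received = reset_with_options(deepcopy(user_options))

print(user_options)
print(received)
```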
Jan-Hendrik Ewers
2ddf015cd9
fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`. (#1707)
* fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`.

https://docs.python.org/2/library/stdtypes.html#truth-value-testing

* chore: Update changelog inline with intent of changes in PR #1707

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* fix: Change `is False` to `not` as per PEP8

* chore: Remove superfluous comment about `is False`

* test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing

* Update changelog

* chore: Remove EvalCallback as it's not actually required

* Update changelog.rst

* Rm duplicated "others" section in changelog.rst

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-10-09 12:21:12 +02:00
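
For illustration, the difference between the two checks named in the commit above can be seen in plain Python: `not value` is true for every falsy value, while `value is False` is true only for the `False` singleton. The helper `compare_checks` is hypothetical.

```python
def compare_checks(value):
    """Return the result of both checks for the same value."""
    return {"not_value": not value, "is_false": value is False}


# False is the only candidate for which both checks agree.
for candidate in (False, 0, "", [], None):
    print(repr(candidate), compare_checks(candidate))
```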
Quentin Gallouédec
c6bf251d46
Add argument features_extractor to ActorCriticPolicy.extract_features (#1710)
* add argument to extract_features

* remove empty lines

* changelog and version
2023-10-09 11:11:36 +02:00
Antonin RAFFIN
c6c660e51b
Fix type annotations of buffers (#1700)
* Fix type annotation and replay buffer

* Exclude pytype check

* Remove some pytype-specific annotations and update changelog

* Fix HerReplayBuffer type hints

* try remove # type: ignore[assignment]

* revert change

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2023-09-28 18:52:46 +02:00
Kyle Sayers
fab6cb339d
BitFlippingEnv argument check and docs clarification (#1698)
* made change, not tested yet

* add back _obs_space with note on purpose

* match formatting

* update documentation
2023-09-27 10:18:30 +02:00
Antonin RAFFIN
2ca94cb73d
Add check for common mistake when mixing Gym/VecEnv API (#1696) 2023-09-25 12:39:22 +02:00
Corentin
f4c5b1e5e2
Fix check_env for Sequence observation space (#1690)
* Fix Sequence obs env_checker

* Fix Sequence obs env_checker

* Add test : env_checker for Sequence obs

* Add test : env_checker for Sequence obs

* Cleanup and improve env checker messages

---------

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-09-24 12:36:52 +02:00
Nicholas Goldowsky-Dill
1cd6ae42d5
Fix reward of SimpleMultiObsEnv to always be float (#1676)
* Fix reward of SimpleMultiObsEnv to always be float

Previously the reward was sometimes returned as an int.

* changelog

* Update changelog.rst

* Update version.txt

* Fix type annotation

* Fix import

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2023-09-16 08:56:04 +02:00
Antonin RAFFIN
99712760c8
Fix render_mode when loading VecNormalize (#1671)
* Fix render_mode when loading VecNormalize

* Switch from isort to ruff, and cap black version

* Add test and update changelog
2023-09-12 11:28:32 +02:00
Antonin RAFFIN
57dbefe80c
Fix read the doc default theme (#1668)
* Fix doc theme because rtd change default

* Fix doc build
2023-09-07 09:53:05 +02:00
Patrick Helm
e071796549
Fixes replay buffer device after loading in OffPolicyAlgorithm (#1662)
* sets replay buffer device after loading

* update changelog

* update changelog

* correct changelog

* add test for replay buffer device

* Fix test to actually test the bug fix

* [ci skip] Update version

* [ci skip] Update docker images

---------

Co-authored-by: PatrickHelm <patrick.helm@gmx.net>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-09-03 12:50:02 +02:00
PatrickHelm
16c6a886db
Fix squash output unscaling when using gSDE (#1652)
* prevents squash_output if not use_sde, see #1592

* update changelog

* add unscaling of actions taken during training

* add test regarding squashing and unsquashing

* avoids try-except block

* format Gymnasium code with black

* makes mypy pass

* makes pytype pass

* sort imports

* makes error message in assert statement clearer

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

* improves code commenting

* replaces full env with wrapper

* Cleanup code

* Reformat

---------

Co-authored-by: PatrickHelm <patrick.helm@gmx.net>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2023-09-01 17:58:15 +02:00
PatrickHelm
84163b468c
Fixes update_locals() in collect_rollouts() of OnPolicyAlgorithm (#1660)
* calls update_locals() before on_rollout_end()

* update changelog
2023-08-30 17:02:41 +02:00
PatrickHelm
c99d65c664
Fix VectorizedActionNoise in OffPolicyAlgorithm (#1657)
* moves VectorizedActionNoise into _setup_learn()

* update changelog

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2023-08-30 12:37:14 +02:00
PatrickHelm
5c93e9f426
Fix random seed int32 for Windows (#1655)
* reduce high in randint to avoid Windows oob error

* update changelog

* implements @MikhailGerasimov's suggestion

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-08-30 10:56:58 +02:00
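
A hedged sketch of the idea described in the commit above: keep the upper bound of the random seed within the signed 32-bit integer range. The standard-library generator is used here purely for illustration and is an assumption; the helper `draw_seed` is hypothetical and not the project's code.

```python
import random

# Largest value representable in a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def draw_seed(rng: random.Random) -> int:
    """Hypothetical helper: draw a seed that fits in a signed 32-bit integer."""
    # random.Random.randint includes both endpoints.
    return rng.randint(0, INT32_MAX)


seed = draw_seed(random.Random(0))
print(seed)
```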
Antonin RAFFIN
e9f0f23ce4
Fix type hints for callbacks, utils and VecTranspose (#1648)
* Fix type hints in `common/utils.py`

* Fix `VecTranspose` type annotations

* Fix types for callbacks

* Update changelog

* Fix video recorder type hints

* Fix save utils type hints

* Allow BytesIO

* Improve error message

* Make logger and training env properties

* Clarify which open_path fn is called
2023-08-29 16:04:08 +02:00
Antonin RAFFIN
f4ec0f6afa
Release v2.1.0 (#1646) 2023-08-17 21:17:46 +02:00
Alex Pasquali
ff2115d562
[Docs] Added DeepNetSlice to community projects (#1639)
* Added DeepNetSlice to community projects

* Added description of network slice placement

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-08-05 18:12:08 +02:00
Antonin RAFFIN
17f02a8ae1
Fix env checker bounds, expose all invalid indices at once (#1638)
* Fix bug in env_checker.py bounds warning message

* Fix bug where Gym Environment Checker does not output the correct warning message when dealing with observation spaces that have different upper and different lower bounds

* Update test_env_checker.py with more comprehensive tests

* Make naming consistent

* Update version

* Catch all invalid indices at once

---------

Co-authored-by: gabo_tor <gabriel0torre@gmail.com>
2023-08-02 16:43:45 +02:00
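
A rough sketch of what "catch all invalid indices at once" could look like, assuming the check compares each value against per-index lower and upper bounds and reports every offending index rather than stopping at the first. The function `invalid_indices` is hypothetical and uses plain lists instead of the project's array types.

```python
from typing import List, Sequence


def invalid_indices(
    values: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> List[int]:
    """Hypothetical helper: return every index whose value lies outside [low, high]."""
    return [
        index
        for index, (value, lower, upper) in enumerate(zip(values, low, high))
        if not lower <= value <= upper
    ]


# Each index has its own lower and upper bound.
bad = invalid_indices([0.5, 2.0, -3.0], low=[0.0, 0.0, -1.0], high=[1.0, 1.0, 1.0])
print(bad)
```

Collecting the full list first makes it possible to report all offending indices in a single warning message.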
Kyle He
d43400b464
Fix typo in the documentation for Custom Policy Networks (#1620)
* Update custom_policy.rst

* Update changelog.rst

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-08-01 13:20:29 +02:00