stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-18 21:30:19 +00:00

Author	SHA1	Message	Date
Nicolò Lucchesi	35eccaf04f	Fix tensorboard video slow numpy->torch conversion (#1910 ) * fixed tb video docs * updated changelog * add comment on expected render() output * Update changelog.rst --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-04-26 12:12:04 +02:00
Corentin	e93175084f	Adding ER-MRL to community project (#1904 ) * Add ER_MRL * Update changelog * Move ER-MRL at the end of the file * Improve project description * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-04-25 14:31:15 +02:00
Antonin Raffin	4af4a32d1b	Update RL Tips and Tricks section	2024-04-22 10:25:32 +02:00
Mark Smith	9a749389d3	Cast learning_rate to float lambda for pickle safety when doing model.load (#1901 ) * create failing test for unpickle error * Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types * Updated with feedback from araffin on PR#1901 * Update test and version * Update changelog and SBX doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-04-22 10:04:01 +02:00
Chaitanya Bisht	5623d98f9d	Fixed broken link in ppo.rst (#1884 )	2024-04-08 15:48:26 +02:00
Antonin RAFFIN	40ba50467c	Fix typo in changelog (#1882 )	2024-04-01 16:07:52 +02:00
Antonin RAFFIN	429be93c48	Release v2.3.0 (#1879 ) * Release v2.3.0 * Fix typos	2024-03-31 20:25:19 +02:00
Corentin	071226d3e8	Log success rate for on policy algorithms (#1870 ) * Add success rate in monitor for on policy algorithms * Update changelog * make commit-checks refactoring * Assert buffers are not none in _dump_logs * Automatic refactoring of the type hinting * Add success_rate logging test for on policy algorithms * Update changelog * Reformat * Fix tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-03-22 12:13:48 +01:00
Antonin RAFFIN	8b3723c6d8	Update ruff and documentation for hf sb3 (#1866 ) * Update ruff * Only load weights with `torch.load()` to avoid security issues * Update doc about HF integration and remote code execution * Fix doc build * Revert weight_only=True for policies	2024-03-11 13:53:06 +01:00
Rushit Shah	f375cc3939	Fix docstring for ``log_interval`` to differentiate between on-policy/off-policy logging frequency (#1855 ) * Fix docstring for log_interval inside the learn method in the base class. * Updated changelog. * Update docstring --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-03-04 11:42:16 +01:00
StagOverflow	56f20e40a2	Fix `sum_independent_dims` docstring to reflect output shape (#1851 ) Co-authored-by: Heinrick Lumini <heinrl@Heinricks-MacBook-Pro.local>	2024-02-27 14:49:42 +01:00
Antonin RAFFIN	a8e905977f	Update env checker for spaces with non-zero start (#1845 ) * Update ruff * Update env checker for non-zero start	2024-02-19 16:44:02 +01:00
Antonin RAFFIN	1cba1bbd2f	Update to black style v24 (#1834 )	2024-02-13 11:36:05 +01:00
Marek Michalik	beee4279eb	Fix example in README.md (#1830 ) * Fix example in README.md * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-02-13 10:47:05 +01:00
Antonin RAFFIN	620e58e61f	Update SB3 ONNX export documentation (#1816 )	2024-01-30 15:53:25 +01:00
Antonin RAFFIN	a9273f968e	Update TD3/DDPG/DQN defaults for consistency (#1785 ) * Update TD3/DDPG/DQN defaults for consistency * Update changelog	2024-01-12 16:05:14 +01:00
Francesco Capuano	a653aec10d	Docs: Env attributes should be modified using env setters (#1789 ) * add: paragraph on how to modify vec envs attributes via setters (solves DLR-RM#1573) * Update vec env doc * Update callback doc and SB3 version * Fix indentation --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-01-10 14:46:40 +01:00
Quentin Gallouédec	373166d6ac	Fix doc: Gym to Gymnasium Atari install command in `examples.rst` (#1773 ) * Update examples.rst * Update changelog.rst	2023-12-05 11:31:11 +01:00
Quentin Gallouédec	c8fda060d4	Adding PokemonRedExperiments project (#1762 ) * Adding pokemon red * update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-11-23 16:18:00 +01:00
Quentin Gallouédec	e3dea4b2e0	Release 2.2.1: Hotfix file closing (#1754 ) * new closing policy * revert #1742 * Add tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-11-17 23:50:23 +01:00
Antonin RAFFIN	e1eac844af	Release v2.2.0 (#1750 )	2023-11-16 17:42:10 +01:00
Antonin RAFFIN	23fbeb5975	Fix resource warning (#1742 ) * Fix resource warning * Add test and update changelog * Fix for new mypy version	2023-11-16 17:11:13 +01:00
Antonin RAFFIN	b413f4c285	Fix `VecEnv` type hints (#1736 ) * Fix VecNormalize type hints * Fix VecEnv utils type annotations * Apply suggestions from code review Co-authored-by: M. Ernestus <maximilian@ernestus.de> * Remove PyType --------- Co-authored-by: M. Ernestus <maximilian@ernestus.de>	2023-11-08 09:46:40 +01:00
Antonin RAFFIN	d671402c93	Fix policies type annotations (#1735 )	2023-11-06 18:35:28 +01:00
Antonin RAFFIN	a35c08c0d6	Fix offpolicy algo type hints (#1734 ) * Fix offpolicy algo type hints * Update PyTorch to have latest type hints * Fix pip argument * Try PyTorch 2.0.1 * Revert "Try PyTorch 2.0.1" This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69. * Update changelog	2023-11-06 11:17:36 +01:00
Antonin RAFFIN	018ea5ab67	Fix distributions type hints (#1733 ) * Fix distributions type hints * Add test for multi binary action space * Fix test	2023-11-06 10:09:01 +01:00
Antonin RAFFIN	294f2b4309	Documentation update (#1732 ) * Update RL Tips * Fix grammar * Update SBX doc * Fix various typos and grammar mistakes	2023-11-03 17:17:46 +01:00
M. Ernestus	69afefc91d	Add rollout_buffer_class parameter to on-policy algorithms (#1720 ) * Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm * Add rollout_buffer_class and rollout_buffer_kwargs to PPO. * Add rollout_buffer_class and rollout_buffer_kwargs to A2C. * Make use of the rollout buffer kwargs. * Update version * Add test and update doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-10-27 17:36:24 +02:00
Antonin Raffin	6c70993c8f	Remove sphinx-autodoc-typehints	2023-10-23 20:26:33 +02:00
Antonin Raffin	d672008a32	Update dependencies (remove sphinx type hint plugin), protect type aliases	2023-10-23 20:14:15 +02:00
Hosseinkhan Rémy	aab545901f	Add support for setting `options` at reset with `VecEnv` (#1606 ) * Update signatures, and test with options * Update changelog and black formatting * Finish implementation (fixes, doc, tests) * Use deepcopy to avoid side effects (modif by reference) * Fix for mypy --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-10-23 13:38:48 +02:00
Jan-Hendrik Ewers	2ddf015cd9	fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`. (#1707 ) * fix: Follow PEP8 guidelines and evaluate falsy to truth with `not` rather than `is False`. https://docs.python.org/2/library/stdtypes.html#truth-value-testing * chore: Update changelog inline with intent of changes in PR #1707 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * fix: Change `is False` to `not` as per PEP8 * chore: Remove superfluous comment about `is False` * test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing * Update changelog * chore: Remove EvalCallback as it's not actually required * Update changelog.rst * Rm duplicated "others" section in changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-10-09 12:21:12 +02:00
Quentin Gallouédec	c6bf251d46	Add argument `features_extractor` to `ActorCriticPolicy.extract_features` (#1710 ) * add argument to extract_features * remove empty lines * changelog and version	2023-10-09 11:11:36 +02:00
Antonin RAFFIN	c6c660e51b	Fix type annotations of buffers (#1700 ) * Fix type annotation and replay buffer * Exclude pytype check * Remove some pytype specific annotation and update changelog * Fix HerReplayBuffer type hints * try remove # type: ignore[assignment] * revert change --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-09-28 18:52:46 +02:00
Kyle Sayers	fab6cb339d	BitFlippingEnv argument check and docs clarification (#1698 ) * made change, not tested yet * add back _obs_space with note on purpose * match formatting * update documentation	2023-09-27 10:18:30 +02:00
Antonin RAFFIN	2ca94cb73d	Add check for common mistake when mixing Gym/VecEnv API (#1696 )	2023-09-25 12:39:22 +02:00
Corentin	f4c5b1e5e2	Fix check_env for Sequence observation space (#1690 ) * Fix Sequence obs env_checker * Fix Sequence obs env_checker * Add test : env_checker for Sequence obs * Add test : env_checker for Sequence obs * Cleanup and improve env checker messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-24 12:36:52 +02:00
Nicholas Goldowsky-Dill	1cd6ae42d5	Fix reward of SimpleMultiObsEnv to always be float (#1676 ) * Fix reward of SimpleMultiObsEnv to always be float Previously the reward was sometimes returned as an int. * changelog * Update changelog.rst * Update version.txt * Fix type annotation * Fix import --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-16 08:56:04 +02:00
Antonin RAFFIN	99712760c8	Fix render_mode when loading VecNormalize (#1671 ) * Fix render_mode when loading VecNormalize * Switch from isort to ruff, and cap black version * Add test and update changelog	2023-09-12 11:28:32 +02:00
Antonin RAFFIN	57dbefe80c	Fix read the doc default theme (#1668 ) * Fix doc theme because rtd change default * Fix doc build	2023-09-07 09:53:05 +02:00
Patrick Helm	e071796549	Fixes replay buffer device after loading in OffPolicyAlgorithm (#1662 ) * sets replay buffer device after loading * update changelog * update changelog * correct changelog * add test for replay buffer device * Fix test to actually test the bug fix * [ci skip] Update version * [ci skip] Update docker images --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-09-03 12:50:02 +02:00
PatrickHelm	16c6a886db	Fix squash output unscaling when using gSDE (#1652 ) * prevents squash_output if not use_sde, see #1592 * update changelog * add unscaling of actions taken during training * add test regarding squashing and unsquashing * avoids try-except block * format Gymnasium code with black * makes mypy pass * makes pytype pass * sort imports * makes error message in assert statement clearer Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * improves code commenting * replaces full env with wrapper * Cleanup code * Reformat --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-09-01 17:58:15 +02:00
PatrickHelm	84163b468c	Fixes `update_locals()` in `collect_rollouts()` of `OnPolicyAlgorithm` (#1660 ) * calls update_locals() before on_rollout_end() * update changelog	2023-08-30 17:02:41 +02:00
PatrickHelm	c99d65c664	Fix `VectorizedActionNoise` in `OffPolicyAlgorithm` (#1657 ) * moves VectorizedActionNoise into _setup_learn() * update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-08-30 12:37:14 +02:00
PatrickHelm	5c93e9f426	Fix random seed int32 for Windows (#1655 ) * reduce high in randint to avoid Windows oob error * update changelog * implements @MikhailGerasimov's suggestion --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-08-30 10:56:58 +02:00
Antonin RAFFIN	e9f0f23ce4	Fix type hints for callbacks, utils and `VecTranspose` (#1648 ) * Fix type hints in `common/utils.py` * Fix `VecTranspose` type annotations * Fix types for callbacks * Update changelog * Fix video recorder type hints * Fix save utils type hints * Allow BytesIO * Improve error message * Make logger and training env properties * Clarify which open_path fn is called	2023-08-29 16:04:08 +02:00
Antonin RAFFIN	f4ec0f6afa	Release v2.1.0 (#1646 )	2023-08-17 21:17:46 +02:00
Alex Pasquali	ff2115d562	[Docs] Added DeepNetSlice to community projects (#1639 ) * Added DeepNetSlice to community projects * Added description of network slice placement --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-08-05 18:12:08 +02:00
Antonin RAFFIN	17f02a8ae1	Fix env checker bounds, expose all invalid indices at once (#1638 ) * Fix bug in env_checker.py bounds warning message * Fix bug where Gym Environment Checker does not output the correct warning message when dealing with observation spaces that have different upper and different lower bounds * Update test_env_checker.py with more comprehensive tests * Make naming consistent * Update version * Catch all invalid indices at once --------- Co-authored-by: gabo_tor <gabriel0torre@gmail.com>	2023-08-02 16:43:45 +02:00
Kyle He	d43400b464	Fix typo in the documentation for Custom Policy Networks (#1620 ) * Update custom_policy.rst * Update changelog.rst --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-08-01 13:20:29 +02:00

487 commits