stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-06-29 03:31:08 +00:00

Author	SHA1	Message	Date
Antonin RAFFIN	1cba1bbd2f	Update to black style v24 (#1834 )	2024-02-13 11:36:05 +01:00
Marek Michalik	beee4279eb	Fix example in README.md (#1830 ) * Fix example in README.md * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-02-13 10:47:05 +01:00
Antonin RAFFIN	620e58e61f	Update SB3 ONNX export documentation (#1816 )	2024-01-30 15:53:25 +01:00
Antonin RAFFIN	a9273f968e	Update TD3/DDPG/DQN defaults for consistency (#1785 ) * Update TD3/DDPG/DQN defaults for consistency * Update changelog	2024-01-12 16:05:14 +01:00
Francesco Capuano	a653aec10d	Docs: Env attributes should be modified using env setters (#1789 ) * add: paragraph on how to modify vec envs attributes via setters (solves DLR-RM#1573) * Update vec env doc * Update callback doc and SB3 version * Fix indentation --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-01-10 14:46:40 +01:00
Quentin Gallouédec	373166d6ac	Fix doc: Gym to Gymnasium Atari install command in `examples.rst` (#1773 ) * Update examples.rst * Update changelog.rst	2023-12-05 11:31:11 +01:00
Quentin Gallouédec	c8fda060d4	Adding PokemonRedExperiments project (#1762 ) * Adding pokemon red * update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-11-23 16:18:00 +01:00
Quentin Gallouédec	e3dea4b2e0	Release 2.2.1: Hotfix file closing (#1754 ) * new closing policy * revert #1742 * Add tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-11-17 23:50:23 +01:00
Antonin RAFFIN	e1eac844af	Release v2.2.0 (#1750 )	2023-11-16 17:42:10 +01:00
Antonin RAFFIN	23fbeb5975	Fix resource warning (#1742 ) * Fix resource warning * Add test and update changelog * Fix for new mypy version	2023-11-16 17:11:13 +01:00
Antonin RAFFIN	b413f4c285	Fix `VecEnv` type hints (#1736 ) * Fix VecNormalize type hints * Fix VecEnv utils type annotations * Apply suggestions from code review Co-authored-by: M. Ernestus <maximilian@ernestus.de> * Remove PyType --------- Co-authored-by: M. Ernestus <maximilian@ernestus.de>	2023-11-08 09:46:40 +01:00
Antonin RAFFIN	d671402c93	Fix policies type annotations (#1735 )	2023-11-06 18:35:28 +01:00
Antonin RAFFIN	a35c08c0d6	Fix offpolicy algo type hints (#1734 ) * Fix offpolicy algo type hints * Update PyTorch to have latest type hints * Fix pip argument * Try PyTorch 2.0.1 * Revert "Try PyTorch 2.0.1" This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69. * Update changelog	2023-11-06 11:17:36 +01:00
Antonin RAFFIN	018ea5ab67	Fix distributions type hints (#1733 ) * Fix distributions type hints * Add test for multim binary action space * Fix test	2023-11-06 10:09:01 +01:00
Antonin RAFFIN	294f2b4309	Documentation update (#1732 ) * Update RL Tips * Fix grammar * Update SBX doc * Fix various typos and grammar mistakes	2023-11-03 17:17:46 +01:00
M. Ernestus	69afefc91d	Add rollout_buffer_class parameter to on-policy algorithms (#1720 ) * Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm * Add rollout_buffer_class and rollout_buffer_kwargs to PPO. * Add rollout_buffer_class and rollout_buffer_kwargs to A2C. * Make use of the rollout buffer kwargs. * Update version * Add test and update doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-10-27 17:36:24 +02:00
Antonin Raffin	6c70993c8f	Remove sphinx-autodoc-typehints	2023-10-23 20:26:33 +02:00
Antonin Raffin	d672008a32	Update dependencies (remove sphinx type hint plugin), protect type aliases	2023-10-23 20:14:15 +02:00
Hosseinkhan Rémy	aab545901f	Add support for setting `options` at reset with `VecEnv` (#1606 ) * Update signatures, and test with options * Update changelog and black formatting * Finish implementation (fixes, doc, tests) * Use deepcopy to avoid side effects (modif by reference) * Fix for mypy --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-10-23 13:38:48 +02:00
Jan-Hendrik Ewers	2ddf015cd9	fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`. (#1707 ) * fix: Follow PEP8 guidelines and evaluate falsy to truth with `not` rather than `is False`. https://docs.python.org/2/library/stdtypes.html#truth-value-testing * chore: Update changelog inline with intent of changes in PR #1707 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * fix: Change `is False` to `not` as per PEP8 * chore: Remove superfluous comment about `is False` * test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing * Update changelog * chore: Remove EvalCallback as it's not actually required * Update changelog.rst * Rm duplicated "others" section in changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-10-09 12:21:12 +02:00
Quentin Gallouédec	c6bf251d46	Add argument `features_extractor` to `ActorCriticPolicy.extract_features` (#1710 ) * add argument to extract_features * remove empty lines * changelog and version	2023-10-09 11:11:36 +02:00
Antonin RAFFIN	c6c660e51b	Fix type annotations of buffers (#1700 ) * Fix type annotation and replay buffer * Exclude pytype check * Remove some pytype specific annotaiton and update changelog * Fix HerReplayBuffer type hints * try remove # type: ignore[assignment] * revert change --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-09-28 18:52:46 +02:00
Kyle Sayers	fab6cb339d	BitFlippingEnv argument check and docs clarification (#1698 ) * made change, not tested yet * add back _obs_space with note on purpose * match formatting * update documentation	2023-09-27 10:18:30 +02:00
Antonin RAFFIN	2ca94cb73d	Add check for common mistake when mixing Gym/VecEnv API (#1696 )	2023-09-25 12:39:22 +02:00
Corentin	f4c5b1e5e2	Fix check_env for Sequence observation space (#1690 ) * Fix Sequence obs env_checker * Fix Sequence obs env_checker * Add test : env_checker for Sequence obs * Add test : env_checker for Sequence obs * Cleanup and improve env checker messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-24 12:36:52 +02:00
Nicholas Goldowsky-Dill	1cd6ae42d5	Fix reward of SimpleMultiObsEnv to always be float (#1676 ) * Fix reward of SimpleMultiObsEnv to always be float Previously the reward was sometimes returned as an int. * changelog * Update changelog.rst * Update version.txt * Fix type annotation * Fix import --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-16 08:56:04 +02:00
Antonin RAFFIN	99712760c8	Fix render_mode when loading VecNormalize (#1671 ) * Fix render_mode when loading VecNormalize * Switch from isort to ruff, and cap black version * Add test and update changelog	2023-09-12 11:28:32 +02:00
Antonin RAFFIN	57dbefe80c	Fix read the doc default theme (#1668 ) * Fix doc theme because rtd change default * Fix doc build	2023-09-07 09:53:05 +02:00
Patrick Helm	e071796549	Fixes replay buffer device after loading in OffPolicyAlgorithm (#1662 ) * sets replay buffer device after loading * update changelog * update changelog * correct changelog * add test for replay buffer device * Fix test to actually test the bug fix * [ci skip] Update version * [ci skip] Update docker images --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-09-03 12:50:02 +02:00
PatrickHelm	16c6a886db	Fix squash output unscaling when using gSDE (#1652 ) * prevents squash_output if not use_sde, see #1592 * update changelog * add unscaling of actions taken during training * add test regarding squashing and unquashing * avoids try-except block * format Gymnasium code with black * makes mypy pass * makes pytype pass * sort imports * makes error message in assert statement clearer Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * improves code commenting * replaces full env with wrapper * Cleanup code * Reformat --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-09-01 17:58:15 +02:00
PatrickHelm	84163b468c	Fixes `update_locals()` in `collect_rollouts()` of `OnPolicyAlgorithm` (#1660 ) * calls update_locals() before on_rollout_end() * update changelog	2023-08-30 17:02:41 +02:00
PatrickHelm	c99d65c664	Fix `VectorizedActionNoise` in `OffPolicyAlgorithm` (#1657 ) * moves VectorizedActionNoise into _setup_learn() * update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-08-30 12:37:14 +02:00
PatrickHelm	5c93e9f426	Fix random seed int32 for Windows (#1655 ) * reduce high in randint to avoid Windows oob error * update changelog * implements @MikhailGerasimov's suggestion --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-08-30 10:56:58 +02:00
Antonin RAFFIN	e9f0f23ce4	Fix type hints for callbacks, utils and `VecTranspose` (#1648 ) * Fix type hints in `common/utils.py` * Fix `VecTranspose` type annotations * Fix types for callbacks * Update changelog * Fix video recorder type hints * Fix save utils type hints * Allow BytesIO * Improve error message * Make logger and training env properties * Clarify which open_path fn is called	2023-08-29 16:04:08 +02:00
Antonin RAFFIN	f4ec0f6afa	Release v2.1.0 (#1646 )	2023-08-17 21:17:46 +02:00
Alex Pasquali	ff2115d562	[Docs] Added DeepNetSlice to community projects (#1639 ) * Added DeepNetSlice to community projects * Added description of network slice placement --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-08-05 18:12:08 +02:00
Antonin RAFFIN	17f02a8ae1	Fix env checker bounds, expose all invalid indices at once (#1638 ) * Fix bug in env_checker.py bounds warning message * Fix bug where Gym Environment Checker does not output the correct warning message when dealing with observation spaces that have different upper and different lower bounds * Update test_env_checker.py with more comprehensive tests * Make naming consistent * Update version * Catch all invalid indices at once --------- Co-authored-by: gabo_tor <gabriel0torre@gmail.com>	2023-08-02 16:43:45 +02:00
Kyle He	d43400b464	Fix typo in the documentation for Custom Policy Networks (#1620 ) * Update custom_policy.rst * Update changelog.rst --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-08-01 13:20:29 +02:00
Tobias Rohrer	ba77dd7c61	Fix to use float64 actions for off policy algorithms (#1572 ) * Added test cases where off policy algorithms fail with float64 actionspace * casting observations and actions to `np.float32` to unify behaviour between `ReplayBuffer` and `RolloutBuffer`. Fixing issue #1145 * reformatted using black * making test more restrictive by checking models action is float64 * added changelog entry * undo cast of observations as `preprocessing.preprocess_obs()` casts them to float32 anyways. * - Casting to float32 only, if action.dtype is float64 - Added cast to `DictReplayBuffer` as well * Added tests for multiple variations of continuous action types and observation spaces * applied reformatting by `make commit-checks` * Added typing and comment referring to description in merge request * Apply linter for single element slice * Rename helper and refactor tests * Update changelog and docstring --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-07-24 16:38:03 +02:00
Stefan Schneider	5abd50a853	Docs: Add mobile-env to community projects (#1617 ) * Docs: Add mobile-env to community projects * Update docs Readme with correct install command Without the quotes, I get `no matches found: .[docs]` * Add changelog entry for adding mobile-env * Fix format in projects.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-07-21 16:33:01 +02:00
Antonin RAFFIN	a730b9b66a	Relax logger check for Windows (#1615 ) * Relax logger check for Windows * Update tests	2023-07-21 07:02:38 +02:00
Mark Towers	61e1060525	Update Gymnasium to v0.29.0 (#1610 ) * Update setup.py to v0.29.0 * Remove invalid test * Loosen version and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-07-18 14:22:22 +02:00
BertrandDecoster	fa7a3168f3	Update the Callbacks: Evaluate Agent Performance section of the Examples (#1604 ) * Update examples.rst section "Callbacks: Evaluate Agent Performance" Two typos fixed * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-07-18 13:02:47 +02:00
Antonin RAFFIN	d68ff2e17f	Drop python 3.7, add 3.11 and update github templates (#1587 ) * Add missing word in patch error message * Add changelog * Drop python 3.7, add 3.11 and update github templates * [ci skip] Update version in doc * Update minimum PyTorch version * Update conda env and fix mypy --------- Co-authored-by: Lukas Hass <lukas@slucky.de>	2023-07-03 12:44:18 +02:00
Antonin Raffin	cc103ff725	Update doc before 2.0 release	2023-06-23 12:31:14 +02:00
Antonin RAFFIN	1036c05680	Release v2.0.0 (#1571 ) * RUF012: Explicit ClassVar * Prepare v2.0.0 * Update docs/misc/changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-06-23 12:21:58 +02:00
Antonin RAFFIN	4fdb65ecf3	Doc fix and add Stable-Baselines3 Jax (SBX) page (#1566 ) * Fix custom policy example * Add RL Zoo doc link * Add changelog to pypi * Add SBX doc page * Fix small mistake in docstring --------- Co-authored-by: Peter Elmers <peter.elmers@yahoo.com>	2023-06-21 18:54:16 +02:00
Jonathan	f667f086ea	Fixes HER mixed ordering of desired_goal and achieved_goal (#1570 ) * change ordering of achieved_goal and desired_goal to match expected compute_reward order * Update changelog.rst * Update version * Update version.txt * Update changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-06-21 16:27:06 +02:00
Antonin RAFFIN	ffe26ccf95	Fix render bug for vec env wrappers (#1525 ) * Fix render bug for vec env wrappers * Fix tests and update changelog * Better fix, backward compatible * remove render_mode from VecEnv init * Make DictObsVecEnv inherit from VecEnv * format * Fix env_is_wrapped * try/except getting render mode ( (https://github.com/DLR-RM/stable-baselines3/pull/1525#discussion_r1206888921) * update version * Fix env_is_wrapped in test_vec_extract_dict --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2023-06-07 16:20:40 +02:00
Lara Bergmann	32778ddc94	Fix wrong truncation in HER replay buffer (#1543 ) * fix episode start idx that leads to wrong episode length * add episode length test * Update changelog * Reformat files * Use replay_buffer.dones to test HER truncation warning * truncate_last_trajectory: sample truncated episode and handle infinite horizon tasks * make test_truncate_last_trajectory independent of learning * Add timeout comment HER truncate_last_trajectory Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update version.txt * Update version --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-06-07 15:57:12 +02:00

475 commits