stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-07-18 18:52:30 +00:00

Author	SHA1	Message	Date
Antonin RAFFIN	b8b2d30a83	Add `has_attr` for `VecEnv` (#2077 ) * Add `has_attr` for `VecEnv` * Add special case for gymnasium<1.0 * Update changelog.rst * Update black version	2025-02-03 10:43:56 +01:00
Antonin RAFFIN	f8ea2995cb	Doc update: custom envs, IsaacLab, Brax and dm_control (#2072 ) * Add note about start!=0 for Discrete spaces * Update doc for IsaacLab and dm_control * Fix test due to rounding error	2025-01-26 11:42:57 +01:00
Antonin RAFFIN	dba0baa491	Fix mypy error (#2067 ) * Fix mypy error * Ignore new errors	2025-01-07 11:57:54 +01:00
Antonin RAFFIN	57e8b97df5	Fix video recorder and add test (#2063 ) * Fix video recorder and add test * Update github CI * Install ffmpeg * Revert "Update github CI" This reverts commit 07791e97fccae4f003b2909428b23f59557d7034. * Skip VecVideoRecorder test on github	2024-12-21 08:24:25 +01:00
Marc Duclusaud	f432a6fcdc	Adding FRASA to the projects page (#2059 ) * Adding FRASA to the projects page * Updating changelog.rst * Ignore mypy errors for np arrays (python 3.11+) --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-12-17 14:53:07 +01:00
Antonin RAFFIN	98366922b4	Fix linter warnings (order __all__) (#2048 )	2024-11-29 13:55:18 +01:00
Antonin RAFFIN	daaebd0a52	Drop python 3.8 and add python 3.12 support (#2041 ) * Drop python 3.8 support, add python 3.12 support * Upgrade to python 3.9 syntax * Fixes for Numpy v2 * Fix doc warning	2024-11-18 15:40:36 +01:00
Mark Towers	8f0b488bc5	Update Gymnasium to v1.0.0 (#1837 ) * Update Gymnasium to v1.0.0a1 * Comment out `gymnasium.wrappers.monitor` (todo update to VideoRecord) * Fix ruff warnings * Register Atari envs * Update `getattr` to `Env.get_wrapper_attr` * Reorder imports * Fix `seed` order * Fix collecting `max_steps` * Copy and paste video recorder to prevent the need to rewrite the vec video recorder wrapper * Use `typing.List` rather than list * Fix env attribute forwarding * Separate out env attribute collection from its utilisation * Update for Gymnasium alpha 2 * Remove assert for OrderedDict * Update setup.py * Add type: ignore * Test with Gymnasium main * Remove `gymnasium.logger.debug/info` * Fix github CI yaml * Run gym 0.29.1 on python 3.10 * Update lower bounds * Integrate video recorder * Remove ordered dict * Update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-11-04 12:03:12 +01:00
Antonin RAFFIN	3d59b5c86b	Use uv on GitHub CI for faster download and update changelog (#2026 ) * Use uv on GitHub CI for faster download and update changelog * Fix new mypy issues	2024-10-24 15:20:05 +02:00
Devin White	56c153f048	Add warning when using PPO on GPU and update doc (#2017 ) * Update documentation Added comment to PPO documentation that CPU should primarily be used unless using CNN as well as sample code. Added warning to user for both PPO and A2C that CPU should be used if the user is running GPU without using a CNN, reference Issue #1245. * Add warning to base class and add test --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-10-07 11:24:47 +02:00
Antonin RAFFIN	512eea923a	Warn users when using multi-dim MultiDiscrete obs space (#2003 ) * Update env checker to warn users when using multi-dim MultiDiscrete obs space * Update changelog	2024-09-13 13:15:23 +02:00
Jan-Hendrik Ewers	4a1137ba3a	Add np.ndarray as a recognized type for TB histograms. (#1635 ) * Add np.ndarray as a recognized type for TB histograms. Torch histograms allow th.Tensor, np.ndarray, and caffe2 formatted strings. This commits expands the TensorBoardOutputFormat's capabilities to log the two former types. * Update changelog to reflect bug fix * fix: try/catch for if either np or torch aren't at the required versions. See https://github.com/DLR-RM/stable-baselines3/pull/1635 for more details * fix: Add comment describing the test for when add_histogram should not have been called * Cleanup --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-08-02 11:55:27 +02:00
Chris Schindlbeck	6ad6fa55b6	Fix various typos (#1981 )	2024-07-29 10:44:23 +02:00
Antonin RAFFIN	bd3c0c6530	Fix loading of optimizer with older DQN models (#1978 )	2024-07-26 14:57:55 +02:00
Antonin RAFFIN	000544cc1f	Add support for pre and post linear modules in `create_mlp` (#1975 ) * Add support for pre and post linear modules in `create_mlp` * Disable mypy for python 3.8 * Reformat toml file * Update docstring Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Add some comments --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2024-07-22 13:42:33 +02:00
will-maclean	4efee92fba	Set CallbackList children's parent correctly (#1939 ) * Fixing #1791 * Update test and version * Add test for callback after eval * Fix mypy error * Remove tqdm warnings --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-06-07 14:07:28 +02:00
Joe Ksiazek	0b06d8ab20	Fix error when loading a model that has net_arch manually set to None (#1937 ) * Fix loading a model with net_arch=None * Remove redundant get * Dummy commit * Add to contributors * Update test and version --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-06-05 17:27:40 +02:00
Ole Petersen	6c00565778	Fix memory leak in base_class.py (#1908 ) * Fix memory leak in base_class.py Loading the data return value is not necessary since it is unused. Loading the data causes a memory leak through the ep_info_buffer variable. I found this while loading a PPO learner from storage on a multi-GPU system since the ep_info_buffer is loaded to the memory location it was on while it was saved to disk, instead of the target loading location, and is then not cleaned up. * Update changelog.rst * Update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-05-15 15:59:32 +02:00
Chris Schindlbeck	4317c62598	Fix various typos (#1926 ) * Fix various typos * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-05-15 15:19:39 +02:00
Andrew James	766b9e9f7d	Avoid torch type-error under torch.compile (#1922 ) * Avoid torch type-error under torch.compile * Update changelog and version * Update stable_baselines3/common/buffers.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-05-13 17:28:23 +02:00
Antonin RAFFIN	285e01f64a	Hotfix: revert loading with `weights_only=True` (#1913 )	2024-04-27 15:08:38 +02:00
Mark Smith	9a749389d3	Cast learning_rate to float lambda for pickle safety when doing model.load (#1901 ) * create failing test for unpickle error * Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types * Updated with feedback from araffin on PR#1901 * Update test and version * Update changelog and SBX doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-04-22 10:04:01 +02:00
Corentin	071226d3e8	Log success rate for on policy algorithms (#1870 ) * Add success rate in monitor for on policy algorithms * Update changelog * make commit-checks refactoring * Assert buffers are not none in _dump_logs * Automatic refactoring of the type hinting * Add success_rate logging test for on policy algorithms * Update changelog * Reformat * Fix tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-03-22 12:13:48 +01:00
Antonin RAFFIN	8b3723c6d8	Update ruff and documentation for hf sb3 (#1866 ) * Update ruff * Only load weights with `torch.load()` to avoid security issues * Update doc about HF integration and remote code execution * Fix doc build * Revert weight_only=True for policies	2024-03-11 13:53:06 +01:00
Rushit Shah	f375cc3939	Fix docstring for ``log_interval`` to differentiate between on-policy/off-policy logging frequency (#1855 ) * Fix docstring for log_interval inside the learn method in the base class. * Updated changelog. * Update docstring --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-03-04 11:42:16 +01:00
StagOverflow	56f20e40a2	Fix `sum_independent_dims` docstring to reflect output shape (#1851 ) Co-authored-by: Heinrick Lumini <heinrl@Heinricks-MacBook-Pro.local>	2024-02-27 14:49:42 +01:00
Antonin RAFFIN	a8e905977f	Update env checker for spaces with non-zero start (#1845 ) * Update ruff * Update env checker for non-zero start	2024-02-19 16:44:02 +01:00
Antonin RAFFIN	1cba1bbd2f	Update to black style v24 (#1834 )	2024-02-13 11:36:05 +01:00
Quentin Gallouédec	e3dea4b2e0	Release 2.2.1: Hotfix file closing (#1754 ) * new closing policy * revert #1742 * Add tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-11-17 23:50:23 +01:00
Antonin RAFFIN	23fbeb5975	Fix resource warning (#1742 ) * Fix resource warning * Add test and update changelog * Fix for new mypy version	2023-11-16 17:11:13 +01:00
Antonin RAFFIN	b413f4c285	Fix `VecEnv` type hints (#1736 ) * Fix VecNormalize type hints * Fix VecEnv utils type annotations * Apply suggestions from code review Co-authored-by: M. Ernestus <maximilian@ernestus.de> * Remove PyType --------- Co-authored-by: M. Ernestus <maximilian@ernestus.de>	2023-11-08 09:46:40 +01:00
Antonin RAFFIN	d671402c93	Fix policies type annotations (#1735 )	2023-11-06 18:35:28 +01:00
Antonin RAFFIN	a35c08c0d6	Fix offpolicy algo type hints (#1734 ) * Fix offpolicy algo type hints * Update PyTorch to have latest type hints * Fix pip argument * Try PyTorch 2.0.1 * Revert "Try PyTorch 2.0.1" This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69. * Update changelog	2023-11-06 11:17:36 +01:00
Antonin RAFFIN	018ea5ab67	Fix distributions type hints (#1733 ) * Fix distributions type hints * Add test for multi binary action space * Fix test	2023-11-06 10:09:01 +01:00
M. Ernestus	69afefc91d	Add rollout_buffer_class parameter to on-policy algorithms (#1720 ) * Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm * Add rollout_buffer_class and rollout_buffer_kwargs to PPO. * Add rollout_buffer_class and rollout_buffer_kwargs to A2C. * Make use of the rollout buffer kwargs. * Update version * Add test and update doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-10-27 17:36:24 +02:00
Antonin Raffin	d672008a32	Update dependencies (remove sphinx type hint plugin), protect type aliases	2023-10-23 20:14:15 +02:00
Hosseinkhan Rémy	aab545901f	Add support for setting `options` at reset with `VecEnv` (#1606 ) * Update signatures, and test with options * Update changelog and black formatting * Finish implementation (fixes, doc, tests) * Use deepcopy to avoid side effects (modif by reference) * Fix for mypy --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-10-23 13:38:48 +02:00
Jan-Hendrik Ewers	2ddf015cd9	fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`. (#1707 ) * fix: Follow PEP8 guidelines and evaluate falsy to truth with `not` rather than `is False`. https://docs.python.org/2/library/stdtypes.html#truth-value-testing * chore: Update changelog inline with intent of changes in PR #1707 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * fix: Change `is False` to `not` as per PEP8 * chore: Remove superfluous comment about `is False` * test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing * Update changelog * chore: Remove EvalCallback as it's not actually required * Update changelog.rst * Rm duplicated "others" section in changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-10-09 12:21:12 +02:00
Quentin Gallouédec	c6bf251d46	Add argument `features_extractor` to `ActorCriticPolicy.extract_features` (#1710 ) * add argument to extract_features * remove empty lines * changelog and version	2023-10-09 11:11:36 +02:00
Antonin RAFFIN	c6c660e51b	Fix type annotations of buffers (#1700 ) * Fix type annotation and replay buffer * Exclude pytype check * Remove some pytype specific annotation and update changelog * Fix HerReplayBuffer type hints * try remove # type: ignore[assignment] * revert change --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-09-28 18:52:46 +02:00
Kyle Sayers	fab6cb339d	BitFlippingEnv argument check and docs clarification (#1698 ) * made change, not tested yet * add back _obs_space with note on purpose * match formatting * update documentation	2023-09-27 10:18:30 +02:00
Antonin RAFFIN	2ca94cb73d	Add check for common mistake when mixing Gym/VecEnv API (#1696 )	2023-09-25 12:39:22 +02:00
Antonin RAFFIN	b85fa7533e	Fix allowed types for save util (#1693 )	2023-09-24 12:38:19 +02:00
Corentin	f4c5b1e5e2	Fix check_env for Sequence observation space (#1690 ) * Fix Sequence obs env_checker * Fix Sequence obs env_checker * Add test : env_checker for Sequence obs * Add test : env_checker for Sequence obs * Cleanup and improve env checker messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-24 12:36:52 +02:00
Nicholas Goldowsky-Dill	1cd6ae42d5	Fix reward of SimpleMultiObsEnv to always be float (#1676 ) * Fix reward of SimpleMultiObsEnv to always be float Previously the reward was sometimes returned as an int. * changelog * Update changelog.rst * Update version.txt * Fix type annotation * Fix import --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-16 08:56:04 +02:00
Antonin RAFFIN	99712760c8	Fix render_mode when loading VecNormalize (#1671 ) * Fix render_mode when loading VecNormalize * Switch from isort to ruff, and cap black version * Add test and update changelog	2023-09-12 11:28:32 +02:00
Patrick Helm	e071796549	Fixes replay buffer device after loading in OffPolicyAlgorithm (#1662 ) * sets replay buffer device after loading * update changelog * update changelog * correct changelog * add test for replay buffer device * Fix test to actually test the bug fix * [ci skip] Update version * [ci skip] Update docker images --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-09-03 12:50:02 +02:00
PatrickHelm	16c6a886db	Fix squash output unscaling when using gSDE (#1652 ) * prevents squash_output if not use_sde, see #1592 * update changelog * add unscaling of actions taken during training * add test regarding squashing and unsquashing * avoids try-except block * format Gymnasium code with black * makes mypy pass * makes pytype pass * sort imports * makes error message in assert statement clearer Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * improves code commenting * replaces full env with wrapper * Cleanup code * Reformat --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-09-01 17:58:15 +02:00
PatrickHelm	84163b468c	Fixes `update_locals()` in `collect_rollouts()` of `OnPolicyAlgorithm` (#1660 ) * calls update_locals() before on_rollout_end() * update changelog	2023-08-30 17:02:41 +02:00
PatrickHelm	c99d65c664	Fix `VectorizedActionNoise` in `OffPolicyAlgorithm` (#1657 ) * moves VectorizedActionNoise into _setup_learn() * update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-08-30 12:37:14 +02:00


303 commits