stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-06-29 03:31:08 +00:00

Author	SHA1	Message	Date
Antonin Raffin	955e202258	Try torch compile and other optimization	2024-07-07 15:20:21 +02:00
Corentin	d8148deeaa	Updated DQN optimizer input to only include q_network parameters as input (#1963 ) * Updated DQN optimizer input to only include q_network parameters * Update version --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-07-05 19:07:55 +02:00
Sahit Chintalapudi	0eebde7ca1	Fix typo in examples.rst (#1962 ) The variable `env` is not defined. The gym env we want to change is `vec_env`	2024-07-05 15:00:48 +02:00
Dominik Baron	24ebf1a1df	Remove unnecessary SDE resampling in PPO update (#1933 ) * Remove unnecessary SDE resampling in PPO update * Update changelog.rst * Update version * Update PyTorch version on CI * Update ruff * Limit NumPy version * Reformat --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-06-29 20:07:32 +02:00
will-maclean	4efee92fba	Set CallbackList children's parent correctly (#1939 ) * Fixing #1791 * Update test and version * Add test for callback after eval * Fix mypy error * Remove tqdm warnings --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-06-07 14:07:28 +02:00
Joe Ksiazek	0b06d8ab20	Fix error when loading a model that has net_arch manually set to None (#1937 ) * Fix loading a model with net_arch=None * Remove redundant get * Dummy commit * Add to contributors * Update test and version --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-06-05 17:27:40 +02:00
Ole Petersen	6c00565778	Fix memory leak in base_class.py (#1908 ) * Fix memory leak in base_class.py Loading the data return value is not necessary since it is unused. Loading the data causes a memory leak through the ep_info_buffer variable. I found this while loading a PPO learner from storage on a multi-GPU system since the ep_info_buffer is loaded to the memory location it was on while it was saved to disk, instead of the target loading location, and is then not cleaned up. * Update changelog.rst * Update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-05-15 15:59:32 +02:00
Chris Schindlbeck	4317c62598	Fix various typos (#1926 ) * Fix various typos * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-05-15 15:19:39 +02:00
Andrew James	766b9e9f7d	Avoid torch type-error under torch.compile (#1922 ) * Avoid torch type-error under torch.compile * Update changelog and version * Update stable_baselines3/common/buffers.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-05-13 17:28:23 +02:00
Antonin RAFFIN	285e01f64a	Hotfix: revert loading with `weights_only=True` (#1913 )	2024-04-27 15:08:38 +02:00
Nicolò Lucchesi	35eccaf04f	Fix tensorboad video slow numpy->torch conversion (#1910 ) * fixed tb video docs * updated changelog * add comment on expected render() output * Update changelog.rst --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-04-26 12:12:04 +02:00
Corentin	e93175084f	Adding ER-MRL to community project (#1904 ) * Add ER_MRL * Update changelog * Move ER-MRL at the end of the file * Improve project description * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-04-25 14:31:15 +02:00
Antonin Raffin	4af4a32d1b	Update RL Tips and Tricks section	2024-04-22 10:25:32 +02:00
Mark Smith	9a749389d3	Cast learning_rate to float lambda for pickle safety when doing model.load (#1901 ) * create failing test for unpickle error * Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types * Updated with feedback from araffin on PR#1901 * Update test and version * Update changelog and SBX doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-04-22 10:04:01 +02:00
Chaitanya Bisht	5623d98f9d	Fixed broken link in ppo.rst (#1884 )	2024-04-08 15:48:26 +02:00
Antonin RAFFIN	40ba50467c	Fix typo in changelog (#1882 )	2024-04-01 16:07:52 +02:00
Antonin RAFFIN	429be93c48	Release v2.3.0 (#1879 ) * Release v2.3.0 * Fix typos	2024-03-31 20:25:19 +02:00
Corentin	071226d3e8	Log success rate for on policy algorithms (#1870 ) * Add success rate in monitor for on policy algorithms * Update changelog * make commit-checks refactoring * Assert buffers are not none in _dump_logs * Automatic refactoring of the type hinting * Add success_rate logging test for on policy algorithms * Update changelog * Reformat * Fix tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-03-22 12:13:48 +01:00
Antonin RAFFIN	8b3723c6d8	Update ruff and documentation for hf sb3 (#1866 ) * Update ruff * Only load weights with `torch.load()` to avoid security issues * Update doc about HF integration and remote code execution * Fix doc build * Revert weight_only=True for policies	2024-03-11 13:53:06 +01:00
Rushit Shah	f375cc3939	Fix docstring for ``log_interval`` to differentiate between on-policy/off-policy logging frequency (#1855 ) * Fix docstring for log_interval inside the learn method in the base class. * Updated changelog. * Update docstring --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-03-04 11:42:16 +01:00
StagOverflow	56f20e40a2	Fix `sum_independent_dims` docstring to reflect output shape (#1851 ) Co-authored-by: Heinrick Lumini <heinrl@Heinricks-MacBook-Pro.local>	2024-02-27 14:49:42 +01:00
Antonin RAFFIN	a8e905977f	Update env checker for spaces with non-zero start (#1845 ) * Update ruff * Update env checker for non-zero start	2024-02-19 16:44:02 +01:00
Antonin RAFFIN	1cba1bbd2f	Update to black style v24 (#1834 )	2024-02-13 11:36:05 +01:00
Marek Michalik	beee4279eb	Fix example in README.md (#1830 ) * Fix example in README.md * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-02-13 10:47:05 +01:00
Antonin RAFFIN	620e58e61f	Update SB3 ONNX export documentation (#1816 )	2024-01-30 15:53:25 +01:00
Antonin RAFFIN	a9273f968e	Update TD3/DDPG/DQN defaults for consistency (#1785 ) * Update TD3/DDPG/DQN defaults for consistency * Update changelog	2024-01-12 16:05:14 +01:00
Francesco Capuano	a653aec10d	Docs: Env attributes should be modified using env setters (#1789 ) * add: paragraph on how to modify vec envs attributes via setters (solves DLR-RM#1573) * Update vec env doc * Update callback doc and SB3 version * Fix indentation --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2024-01-10 14:46:40 +01:00
Quentin Gallouédec	373166d6ac	Fix doc: Gym to Gymnasium Atari install command in `examples.rst` (#1773 ) * Update examples.rst * Update changelog.rst	2023-12-05 11:31:11 +01:00
Quentin Gallouédec	c8fda060d4	Adding PokemonRedExperiments project (#1762 ) * Adding pokemon red * update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-11-23 16:18:00 +01:00
Quentin Gallouédec	e3dea4b2e0	Release 2.2.1: Hotfix file closing (#1754 ) * new closing policy * revert #1742 * Add tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-11-17 23:50:23 +01:00
Antonin RAFFIN	e1eac844af	Release v2.2.0 (#1750 )	2023-11-16 17:42:10 +01:00
Antonin RAFFIN	23fbeb5975	Fix resource warning (#1742 ) * Fix resource warning * Add test and update changelog * Fix for new mypy version	2023-11-16 17:11:13 +01:00
Antonin RAFFIN	b413f4c285	Fix `VecEnv` type hints (#1736 ) * Fix VecNormalize type hints * Fix VecEnv utils type annotations * Apply suggestions from code review Co-authored-by: M. Ernestus <maximilian@ernestus.de> * Remove PyType --------- Co-authored-by: M. Ernestus <maximilian@ernestus.de>	2023-11-08 09:46:40 +01:00
Antonin RAFFIN	d671402c93	Fix policies type annotations (#1735 )	2023-11-06 18:35:28 +01:00
Antonin RAFFIN	a35c08c0d6	Fix offpolicy algo type hints (#1734 ) * Fix offpolicy algo type hints * Update PyTorch to have latest type hints * Fix pip argument * Try PyTorch 2.0.1 * Revert "Try PyTorch 2.0.1" This reverts commit 0e0ead442d524d26f1f7e1a0bb21e2bfc0245b69. * Update changelog	2023-11-06 11:17:36 +01:00
Antonin RAFFIN	018ea5ab67	Fix distributions type hints (#1733 ) * Fix distributions type hints * Add test for multim binary action space * Fix test	2023-11-06 10:09:01 +01:00
Antonin RAFFIN	294f2b4309	Documentation update (#1732 ) * Update RL Tips * Fix grammar * Update SBX doc * Fix various typos and grammar mistakes	2023-11-03 17:17:46 +01:00
M. Ernestus	69afefc91d	Add rollout_buffer_class parameter to on-policy algorithms (#1720 ) * Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm * Add rollout_buffer_class and rollout_buffer_kwargs to PPO. * Add rollout_buffer_class and rollout_buffer_kwargs to A2C. * Make use of the rollout buffer kwargs. * Update version * Add test and update doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-10-27 17:36:24 +02:00
M. Ernestus	f56ddeda10	Merge pull request #1724 from DLR-RM/chores/update-deps Update dependencies (shimmy, sphinx), remove `sphinx_autodoc_typehints`	2023-10-24 21:28:05 +02:00
Antonin Raffin	350931f441	Use mambaforge as tool	2023-10-23 20:38:48 +02:00
Antonin Raffin	75a5b68e8a	Remove tools key	2023-10-23 20:34:20 +02:00
Antonin Raffin	c1fc8e4c75	Update RTD config	2023-10-23 20:31:58 +02:00
Antonin Raffin	6c70993c8f	Remove sphinx-autodoc-typehints	2023-10-23 20:26:33 +02:00
Antonin Raffin	d672008a32	Update dependencies (remove sphinx type hint plugin), protect type aliases	2023-10-23 20:14:15 +02:00
Antonin Raffin	80245bccc8	Update dependencies (shimmy, sphinx)	2023-10-23 20:14:09 +02:00
Hosseinkhan Rémy	aab545901f	Add support for setting `options` at reset with `VecEnv` (#1606 ) * Update signatures, and test with options * Update changelog and black formatting * Finish implementation (fixes, doc, tests) * Use deepcopy to avoid side effects (modif by reference) * Fix for mypy --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-10-23 13:38:48 +02:00
Jan-Hendrik Ewers	2ddf015cd9	fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`. (#1707 ) * fix: Follow PEP8 guidelines and evaluate falsy to truth with `not` rather than `is False`. https://docs.python.org/2/library/stdtypes.html#truth-value-testing * chore: Update changelog inline with intent of changes in PR #1707 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * fix: Change `is False` to `not` as per PEP8 * chore: Remove superfluous comment about `is False` * test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing * Update changelog * chore: Remove EvalCallback as it's not actually required * Update changelog.rst * Rm duplicated "others" section in changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-10-09 12:21:12 +02:00
Quentin Gallouédec	c6bf251d46	Add argument `features_extractor` to `ActorCriticPolicy.extract_features` (#1710 ) * add argument to extract_features * remove empty lines * changelog and version	2023-10-09 11:11:36 +02:00
Antonin RAFFIN	c6c660e51b	Fix type annotations of buffers (#1700 ) * Fix type annotation and replay buffer * Exclude pytype check * Remove some pytype specific annotaiton and update changelog * Fix HerReplayBuffer type hints * try remove # type: ignore[assignment] * revert change --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-09-28 18:52:46 +02:00
Kyle Sayers	fab6cb339d	BitFlippingEnv argument check and docs clarification (#1698 ) * made change, not tested yet * add back _obs_space with note on purpose * match formatting * update documentation	2023-09-27 10:18:30 +02:00
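
Commit 2ddf015cd9 above replaces `is False` comparisons with `not`. The snippet below is a minimal sketch of the difference in plain Python. It is not taken from the SB3 code base, and the function names are hypothetical, chosen only for illustration.

```python
def check_with_is_false(flag):
    # Identity comparison: True only for the singleton object False.
    return flag is False


def check_with_not(flag):
    # Truth-value test: True for any falsy value (False, 0, None, "", [], ...).
    return not flag


if __name__ == "__main__":
    # Compare both checks on a few falsy values.
    for value in (False, 0, None, "", []):
        print(repr(value), check_with_is_false(value), check_with_not(value))
```

The two checks agree on `False` itself but differ on every other falsy value, so switching from one to the other can change behavior for inputs such as `0` or `None`.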

851 commits