stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-18 21:30:19 +00:00

Author	SHA1	Message	Date
Antonin RAFFIN	3c028f3d5c	Fix `load_from_tensor` (#1231 )	2022-12-22 17:28:18 +01:00
Quentin Gallouédec	5549b34231	Fix ``stable_baselines3/common/vec_env/vec_check_nan.py`` type hints (#1226 ) * super() init style * "async_step" arg to "event"; "news" to "dones"; improve docstring * Remove vec_check_nan from mypy exclude * Update changelog	2022-12-22 12:24:59 +01:00
Alex Pasquali	2cfcec4f50	Modified ActorCriticPolicy to support non-shared features extractor (#1148 ) * Modified ActorCriticPolicy to support non-shared features extractor * Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc Doc update: added 'warning' on custom policy docs that says that, if the features extractor is non-shared, it's not possible to have shared layers in the mlp_extractor * Moved attrib share_features_extractor in class * Updated custom policy doc for non-shared features extractor * Updated changelog * Made some if-statements more readable if policies.py The if-statements are related to the shared/non-shared features extractor in ActorCritic policies * Simplify implementation and add run test * Keep order in module gain to keep previous results consistents * Fix test * Improved docstring in policies.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Added some tests * feature extractor -> features extractor * Fix test * Fix env_id in test * Make features extractor parameter explicit * Remove duplicate Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-20 15:12:05 +01:00
Antonin RAFFIN	8452106734	Fix support of image like normalized inputs (#1214 ) * Fix support of image like normalized inputs * Improve docstring and warning message. * Don't check if obs is image when normalize_images is False (lil opt) * Comment fix * Fix normalize_images not passed to parent * Check for subclasses too * Remove useless multiline * Update version and add comment * Fix some typos Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-20 13:18:28 +01:00
Quentin Gallouédec	ca944fed2d	Update version (#1220 ) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py * Update version * Fix type checking Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-19 13:53:00 +01:00
Antonin Raffin	213b06b0c6	Monkey-patch `np.bool = bool`	2022-12-19 13:20:48 +01:00
Quentin Gallouédec	68a40e0940	Construct tensors directly on GPU (#1218 ) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py	2022-12-19 12:50:22 +01:00
Antonin RAFFIN	0c1bc0b1da	Fix `stable_baselines3/common/atari_wrappers.py` type hints (#1216 ) * Fix `stable_baselines3/common/atari_wrappers.py` type hints * Fix initialization Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-18 16:13:44 +01:00
Antonin RAFFIN	07094c3f2e	Fix `stable_baselines3/common/preprocessing.py` type hints (#1217 )	2022-12-18 15:53:17 +01:00
Quentin Gallouédec	e39bc3da00	Add support for multidimensional `spaces.MultiBinary` observations (#1179 ) * Fix `get_obs_shape` for multidimensional Multibinary space * Update changelog * more tests * fix multidiscrete one-hot encoding * refactor tests * Update changelog.rst * Update changelog.rst * batched obs and revert preprocess_obs changes * Add support for multidimensional ``spaces.MultiBinary`` observations Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-08 18:46:41 +01:00
Quentin Gallouédec	002850f8ac	Fix `stable_baselines3/common/torch_layers.py` type hint (#1191 ) * Remove torch layers from mypy exclude * Make torch layers mypy compliant * Extra type specification * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-29 23:46:32 +01:00
Zikang Xiong	852d635742	Exposed modules in __init__.py with __all__ (#1195 ) * Exposed modules in __init__.py with __all__ * Remove flake8 ignore and update root __all__ * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 23:33:46 +01:00
Quentin Gallouédec	b46396a664	Fix `stable_baselines3/common/env_util.py` type hint (#1192 ) * Remove env_util from mypy exclude * Fix make_atari_env type hint * Update changelog	2022-11-29 15:36:55 +01:00
Quentin Gallouédec	5cd891317e	Add `with_bias` parameter to `create_mlp` (#1188 ) * Add with_bias arg * Update changelog * move torch_layers to the last position * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 12:43:16 +01:00
Quentin Gallouédec	6902fac5e7	Fix `stable_baselines3/common/type_aliases.py` type hint (#1189 )	2022-11-29 12:26:16 +01:00
Quentin Gallouédec	0973b01b9d	Fix `tests/test_distributions.py` type hint (#1186 ) * Fixed test_distribution type hint * Impose list[int] for action dim	2022-11-29 11:27:59 +01:00
Quentin Gallouédec	aee0ba03c7	Update changelog for #1184 (#1185 )	2022-11-28 19:36:26 +01:00
Quentin Gallouédec	e3b24829a5	Drop `gym.GoalEnv` and other minor changes initially from #780 (#1184 ) * Various changes from #780 * Fix env_checker for goal_env detection	2022-11-28 18:22:31 +01:00
Antonin RAFFIN	cd630a3121	Fixes for flake8 6.0 (#1181 )	2022-11-25 15:14:55 +01:00
Juan Rocamonde	68b190b667	Raise error when same env object instance is passed in vectorized environment (#1154 ) * Raise error when same env object instance is passed in vectorized environment * At to changelog * Add raises to docstring * Add test * Also test make_vec_env * Fix test * Try to enable color for MyPy * Update version and ignore lint warnings Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-11-22 14:28:58 +01:00
Quentin Gallouédec	f3abda5cbc	Fix `Self` return type (#1167 ) * Fix Self annotation * Update changelog * Define type var on top * ClassSelf to SelfClass * annotate self * Revert Running meanstd change * Revert vecnormalize change (static method rejected) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-22 13:42:39 +01:00
Adam Gleave	4fb8aec215	Update evaluate_policy type annotation to support policies as well as RL algorithms (#1146 ) * Add PolicyPredictor protocol and use it in evaluate_policy * Update changelog * Move Protocol to type_aliases to avoid circular import * Add test for evaluate_policy on BasePolicy * Remove unused import * Use typing_extensions * Move typing_extensions to 3rd party * Add version range (typing_extensions uses SemVer) * Import Protocol from typing_extensions only on Python<3.8 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Install typing_extensions only on Python<3.8 * Add missing sys import * Fix import ordering * Fix observation type hint in predict Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-11-03 15:36:19 +01:00
Quentin Gallouédec	d5d1a02c15	Allow model trained with python3.7 to be loaded with python3.8+ without the `custom_objects` workaround (#1123 ) * Fix loading * Remove documentation note * Update changelog * Revert save_format change * Add test for errors while unpickling * Update version and cleanup Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-17 17:33:47 +02:00
Quentin Gallouédec	5ef10c8e69	Fix type annotation of ``policy`` in ``BaseAlgorithm`` and ``OffPolicyAlgorithm`` (#1120 )	2022-10-17 10:16:20 +02:00
Juan Rocamonde	cdcdd32c51	Fix return type of `evaluate_actions` (#1118 ) * Fix return type of ActorCriticPolicy.evaluate_actions to optional entropy tensor * Update changelog.rst	2022-10-14 17:45:28 +02:00
Antonin RAFFIN	508f8ffd59	Remove deprecated features and attributes (#1104 ) * Remove deprecated eval env * Remove deprecated ret attribute * Remove sde net arch * Remove unused code * Update test comment Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-10-11 10:55:16 +02:00
Antonin RAFFIN	e2f81bb70b	Release v1.6.2 (#1103 ) * Release v1.6.2 * Remove Gitlab CI, no more minutes	2022-10-10 16:37:11 +02:00
tobirohrer	d8a430e088	Deprecate `create_eval_env`, `eval_env` and `eval_freq` parameter (#1082 ) * Adds deprecation warning if `eval_env` or `eval_freq` parameters are used. See #925 * added changelog entry * added missing backtick * deprecating `create_eval_env` parameter as well and adding comments to explain the `stacklevel` parameter used * Updated tests to ignore DeprecationWarnings * Updated changelog entry * - Removed the `create_eval_env` parameter from the examples in the docs - Removed information about the `create_eval_env` parameter from the migration docs - Added information about deprecation of the `create_eval_env` parameter in the docs * Add alternative in docstring * Update docstrings * `eval_freq` warning in docstring * Add deprecation comments in tests Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-10-10 15:39:38 +02:00
Antonin RAFFIN	7c21b79188	Add progress bar callback and argument (#1095 ) * Add progress bar callback and argument * Update doc * Update changelog * Upgrade pytype in docker image * Use tqdm.write in the logger to have cleaner output * Fix logger test * Fix when doing multiple calls to learn() * Address comments from code-review	2022-10-06 18:17:31 +02:00
Alex Pasquali	6a8c9ddc8b	Updated type hint and extended docstring in make_vec_env and make_atari_env (#1085 ) * Updated type hint and extended docstring in make_vec_env The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature. Extended the description of the wrapper_class parameter with a link to a Github issue containing more details on the matter. * Updated type hint in make_atari_env The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature. * Updated docstring in make_atari_env When modifying the type hint of the parameter 'env_id' (in this commit: fda6872f73c11075901ba88f2520f6316f818d1d), I forgot to update its description in the docstrig. Doing it now. * Removed redundant type in env_id's type hint in make_vec_env and make_atari_env Callable[..., gym.Env] already includes Type[gym.Env], as pointed out here: https://github.com/DLR-RM/stable-baselines3/pull/1085#issuecomment-1269685218 Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-06 13:36:06 +02:00
Antonin RAFFIN	21300c9aaf	Release v1.6.1 (#1080 )	2022-09-29 12:15:55 +02:00
Akhil	def0574d03	Fixed typos (#1076 ) * Updated docstring from n_steps to n_rollout_steps This must be a typo * Fixed typo in a comment in ppo.py * Update changelog Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-09-28 14:57:46 +02:00
Juan Rocamonde	e22e372306	Fix duplicate key error in HumanOutputFormat (#1079 ) * Fix duplicate key error in HumanOutputFormat * Update changelog * Add test * Update changelog.rst Co-authored-by: Adam Gleave <adam@gleave.me> Co-authored-by: Adam Gleave <adam@gleave.me>	2022-09-28 12:06:07 +02:00
Juan Rocamonde	432b3f876d	Fix return type for load, learn in BaseAlgorithm (#1043 ) * Fix return type for load, learn in BaseAlgorithm * Update changelog * Add typing extensions to dependencies * Import directly from typing for python >3.11 * Reorder changelog to reflect merge order * Roll back to typevar solution * Updated changelog * Remove typing extensions requirement * Update base_class.py * Remove final point in changelog * Additional type fixes across project Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-26 12:13:56 +02:00
Dominic Kerr	899eee6bd4	Automatically create missing directories of ``filenames`` passed to ``ResultsWriter`` (#1072 ) * Create (if any) missing filename directories, passed into ResultsWriter * Fixed incorrect ``filename`` docstring (if ``filename`` where ``None``, the string method ``filename.endswith(Monitor.EXT)`` would raise an ``AttributeError``), and renamed ``reset_keywords`` docstring. * Added description of #1068 * Ignore pytype errors * Update changelog.rst Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-21 13:14:38 +02:00
Quentin Gallouédec	b7456392ac	Transfer `ABC` inheritance from `BaseModel` to `BasePolicy` (#1062 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-19 22:10:22 +02:00
Quentin Gallouédec	440735cbd0	Fix loading a model with different number of environments (#1058 ) * Fix loading with new `n_envs` * Update tests * Update changelog * Fix the fix * Remove `self._setup_model()` from `set_env()` * Raise `AssertionError` when setting env with a different `n_envs` * Update unitests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-17 11:10:03 +02:00
Juan Rocamonde	18b29a68e8	Remove forward() method from common.policies.BaseModel (#1061 ) * Remove forward() method. * Updated changelog	2022-09-11 18:39:13 +02:00
Quentin Gallouédec	98e786f744	Clarify and standardize verbosity documentation (#1056 ) * Standardize the use of verbosity: > to >= * Make verbose docstring more specific * Update changelog	2022-09-09 16:46:28 +02:00
Quentin Gallouédec	29f6687b98	Raise error when observation keys and observation space keys don't match (#1047 ) * Raise error when observation keys and observation space keys don't match * Print the difference in keys * Update changelog	2022-09-05 14:54:58 +02:00
Juan Rocamonde	fdca786f09	Fix replay_buffer_class type annotation (#1042 ) * Fix replay_buffer_class type annotation * Update changelog * Further replacement of same type annotation issue * Formatting * Rolled back formatting changes for consistency	2022-09-01 20:10:01 -07:00
Sidney Tio	304c17dc78	Add append mode to Monitor (#1037 ) * Added option to override or use existing CSVs * Updated changelog for Monitor override * Changed default value to override * Simplify code and add test * Update version * Fix for pytype Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-31 11:53:44 +02:00
Hugh Perkins	2cc1477fa2	Fix advantage normalization with mini-batchsize of 1 (#1028 ) * fix nan in advantages with batch size 1, for ppo * changelog * black * Simplify test * Bump version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-25 11:50:08 +02:00
Anand Balakrishnan	59af0c1b01	`CheckpointCallback` can now save replay buffer and `VecNormalize` (#1030 ) * CheckpointCallback now saves replay buffer (if present) * VecNormalize stats are saved at checkpoints * Make checkpointing replay buffer and VecNormalize opt-in * Edit changelog * Add documentation for new parameters * Update docs/misc/changelog.rst * Add documentation for new parameters * Implement suggested edits * Reformat code * Fix git conflict * Add .pkl suffix to VecNormalize checkpoints * Add tests for new CheckpointCallback params * Merge CheckpointCallback tests * Update test and add helper for checkpoint path Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-25 10:57:51 +02:00
Honglu Fan	29a481a288	Include `running_mean` and `running_val` when updating target networks (#1004 ) * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Fix `DictReplayBuffer.next_observations` type (#1013) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Fixed missing verbose parameter passing (#1011) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Support for `device=auto` buffers and set it as default value (#1009) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Update test * Add comments and update tests * Bump version * Remove one extra space to conform code style. * Update docstrings Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Burak Demirbilek <BurakDmb@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-23 10:20:43 +02:00
Timothé	01cc127d32	Support hparams logging to tensorboard (#984 ) * create Hparam class & support in all OutputFormats * add hparams documentation & example * add hparam tests * remove unnecessary test & fix name * format changes * support hyperparameters logging to tensorboard * fix HParams class docstring * use more explicit variable names * raise error instead of warning * Unpin protobuf * Add test for logging hparams Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-22 22:06:54 +02:00
Quentin Gallouédec	73822c34da	Support for `device=auto` buffers and set it as default value (#1009 ) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-16 17:54:55 +02:00
Burak Demirbilek	792e3bcc27	Fixed missing verbose parameter passing (#1011 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-08-16 13:32:32 +02:00
Quentin Gallouédec	a30d36002b	Fix `DictReplayBuffer.next_observations` type (#1013 ) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-16 10:53:22 +02:00
Quentin Gallouédec	c4f54fcf04	Handling multi-dimensional action spaces (#971 ) * Handle non 1D action shape * Revert changes of observation (out of the scope of this PR) * Apply changes to DictReplayBuffer * Update tests * Rollout buffer n-D actions space handling * Remove error when non 1D action space * ActorCriticPolicy return action with the proper shape * remove useless reshape * Update changelog * Add tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-06 14:19:20 +02:00

254 commits