stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-07-09 17:29:20 +00:00

Author	SHA1	Message	Date
Quentin Gallouédec	831f1ca586	Merge branch 'master' into fix-common-vec_env-__init__-type-hint	2022-11-29 15:44:40 +01:00
Quentin Gallouédec	ef07165bd4	Update changelog	2022-11-29 15:43:27 +01:00
Quentin Gallouédec	b46396a664	Fix `stable_baselines3/common/env_util.py` type hint (#1192 ) * Remove env_util from mypy exclude * Fix make_atari_env type hint * Update changelog	2022-11-29 15:36:55 +01:00
Quentin Gallouédec	5cd891317e	Add `with_bias` parameter to `create_mlp` (#1188 ) * Add with_bias arg * Update changelog * move torch_layers to the last position * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 12:43:16 +01:00
Quentin Gallouédec	6902fac5e7	Fix `stable_baselines3/common/type_aliases.py` type hint (#1189 )	2022-11-29 12:26:16 +01:00
Quentin Gallouédec	0973b01b9d	Fix `tests/test_distributions.py` type hint (#1186 ) * Fixed test_distribution type hint * Impose list[int] for action dim	2022-11-29 11:27:59 +01:00
Quentin Gallouédec	aee0ba03c7	Update changelog for #1184 (#1185 )	2022-11-28 19:36:26 +01:00
Quentin Gallouédec	e3b24829a5	Drop `gym.GoalEnv` and other minor changes initally from #780 (#1184 ) * Various changes from #780 * Fix env_checker for goal_env detection	2022-11-28 18:22:31 +01:00
Antonin RAFFIN	cd630a3121	Fixes for flake8 6.0 (#1181 )	2022-11-25 15:14:55 +01:00
Juan Rocamonde	68b190b667	Raise error when same env object instance is passed in vectorized environment (#1154 ) * Raise error when same env object instance is passed in vectorized environment * At to changelog * Add raises to docstring * Add test * Also test make_vec_env * Fix test * Try to enable color for MyPy * Update version and ignore lint warnings Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-11-22 14:28:58 +01:00
Quentin Gallouédec	f3abda5cbc	Fix `Self` return type (#1167 ) * Fix Self annotation * Update changelog * Define type var on top * ClassSelf to SelfClass * annotate self * Revert Running meanstd change * Revert vecnormalize change (static method rejected) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-22 13:42:39 +01:00
Quentin Gallouédec	abffa16198	Mypy type checking (#1143 ) * Install and configure mypy * Test if github CI uses setup.cfg for mypy * force color output * tab to space * Try to fix regex * follow_imports silent * use space as indentation * fix indentation setup.cfg * Show error code * Update doc * Udate changelog * Ignore mypy cache files from commit * Update gitlab CI * Add pytype and mypy entry in Makefile * Make mypy happy Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-16 13:22:57 +01:00
Taimur Shahzad Gill	7e1db1aaaa	Fixed errors in the documentation (#1159 ) * Fixed errors in the documentation Fixed grammatical and punctuation errors, and improved the sentence structure. * Added username in the contributors	2022-11-07 15:38:41 +01:00
Adam Gleave	4fb8aec215	Update evaluate_policy type annotation to support policies as well as RL algorithms (#1146 ) * Add PolicyPredictor protocol and use it in evaluate_policy * Update changelog * Move Protocol to type_aliases to avoid circular import * Add test for evaluate_policy on BasePolicy * Remove unused import * Use typing_extensions * Move typing_extensions to 3rd party * Add version range (typing_extensions uses SemVer) * Import Protocol from typing_extensions only on Python<3.8 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Install typing_extensions only on Python<3.8 * Add missing sys import * Fix import ordering * Fix observation type hint in predict Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-11-03 15:36:19 +01:00
Thomas Simonini	fc6c111cc3	Changelog Update	2022-10-24 11:03:20 +02:00
Quentin Gallouédec	d5d1a02c15	Allow model trained with python3.7 to be loaded with python3.8+ without the `custom_objects` workaround (#1123 ) * Fix loading * Remove documentation note * Update changelog * Revert save_format change * Add test for errors while unpickling * Update version and cleanup Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-17 17:33:47 +02:00
Quentin Gallouédec	5ef10c8e69	Fix type annotation of ``policy`` in ``BaseAlgorithm`` and ``OffPolicyAlgorithm`` (#1120 )	2022-10-17 10:16:20 +02:00
Juan Rocamonde	cdcdd32c51	Fix return type of `evaluate_actions` (#1118 ) * Fix return type of ActorCriticPolicy.evaluate_actions to optional entropy tensor * Update changelog.rst	2022-10-14 17:45:28 +02:00
Quentin Gallouédec	1bff6215b6	New Issue forms (#1111 ) * Update bug report template * .md -> .yml * System info section * Custom env issue form * documentation form * Question template * Feature request template * Rm old templates * Update changelog	2022-10-13 17:46:21 +02:00
Antonin RAFFIN	508f8ffd59	Remove deprecated features and attributes (#1104 ) * Remove deprecated eval env * Remove deprecated ret attribute * Remove sde net arch * Remove unused code * Update test comment Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-10-11 10:55:16 +02:00
Antonin RAFFIN	e2f81bb70b	Release v1.6.2 (#1103 ) * Release v1.6.2 * Remove Gitlab CI, no more minutes	2022-10-10 16:37:11 +02:00
tobirohrer	d8a430e088	Deprecate `create_eval_env`, `eval_env` and `eval_freq` parameter (#1082 ) * Adds deprecation warning if `eval_env` or `eval_freq` parameters are used. See #925 * added changelog entry * added missing backtick * deprecating `create_eval_env` parameter as well and adding comments to explain the `stacklevel` parameter used * Updated tests to ignore DeprecationWarnings * Updated changelog entry * - Removed the `create_eval_env` parameter from the examples in the docs - Removed information about the `create_eval_env` parameter from the migration docs - Added information about deprecation of the `create_eval_env` parameter in the docs * Add alternative in docstring * Update docstrings * `eval_freq` warning in docstring * Add deprecation comments in tests Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-10-10 15:39:38 +02:00
Antonin RAFFIN	7c21b79188	Add progress bar callback and argument (#1095 ) * Add progress bar callback and argument * Update doc * Update changelog * Upgrade pytype in docker image * Use tqdm.write in the logger to have cleaner output * Fix logger test * Fix when doing multiple calls to learn() * Address comments from code-review	2022-10-06 18:17:31 +02:00
Alex Pasquali	6a8c9ddc8b	Updated type hint and extended docstring in make_vec_env and make_atari_env (#1085 ) * Updated type hint and extended docstring in make_vec_env The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature. Extended the description of the wrapper_class parameter with a link to a Github issue containing more details on the matter. * Updated type hint in make_atari_env The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature. * Updated docstring in make_atari_env When modifying the type hint of the parameter 'env_id' (in this commit: fda6872f73c11075901ba88f2520f6316f818d1d), I forgot to update its description in the docstrig. Doing it now. * Removed redundant type in env_id's type hint in make_vec_env and make_atari_env Callable[..., gym.Env] already includes Type[gym.Env], as pointed out here: https://github.com/DLR-RM/stable-baselines3/pull/1085#issuecomment-1269685218 Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-06 13:36:06 +02:00
Quentin Gallouédec	a697401e03	Standardized the use of ``"`` for string representation (#1086 ) * Replace ``'`` by ``"`` in python code * Update changelog * Rm whitespace	2022-10-03 15:15:39 +02:00
Quentin Gallouédec	d3eb0e3ed6	Fix importlib dependency (#1088 ) * Set requirement ``importlib-metadata~=4.13`` * Update changelog	2022-10-03 12:03:51 +02:00
Antonin RAFFIN	537a82a7fd	Update export doc (fixes + add torch jit) (#1074 ) * Update export doc (fixes + add torch jit) * Fix conflicts * Update according to code review comments * fix torch -> th Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-09-30 14:30:40 +02:00
Antonin RAFFIN	21300c9aaf	Release v1.6.1 (#1080 )	2022-09-29 12:15:55 +02:00
Akhil	def0574d03	Fixed typos (#1076 ) * Updated docstring from n_steps to n_rollout_steps This must be a typo * Fixed typo in a comment in ppo.py * Update changelog Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-09-28 14:57:46 +02:00
Juan Rocamonde	e22e372306	Fix duplicate key error in HumanOutputFormat (#1079 ) * Fix duplicate key error in HumanOutputFormat * Update changelog * Add test * Update changelog.rst Co-authored-by: Adam Gleave <adam@gleave.me> Co-authored-by: Adam Gleave <adam@gleave.me>	2022-09-28 12:06:07 +02:00
Juan Rocamonde	432b3f876d	Fix return type for load, learn in BaseAlgorithm (#1043 ) * Fix return type for load, learn in BaseAlgorithm * Update changelog * Add typing extensions to dependencies * Import directly from typing for python >3.11 * Reorder changelog to reflect merge order * Roll back to typevar solution * Updated changelog * Remove typing extensions requirement * Update base_class.py * Remove final point in changelog * Additional type fixes across project Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-26 12:13:56 +02:00
Dominic Kerr	899eee6bd4	Automatically create missing directories of ``filenames`` passed to ``ResultsWriter`` (#1072 ) * Create (if any) missing filename directories, passed into ResultsWriter * Fixed incorrect ``filename`` docstring (if ``filename`` where ``None``, the string method ``filename.endswith(Monitor.EXT)`` would raise an ``AttributeError``), and renamed ``reset_keywords`` docstring. * Added description of #1068 * Ignore pytype errors * Update changelog.rst Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-21 13:14:38 +02:00
Alex Pasquali	d0b129ecc3	Updated custom policy docs (#1067 )	2022-09-18 09:17:57 +02:00
Quentin Gallouédec	440735cbd0	Fix loading a model with different number of environments (#1058 ) * Fix loading with new `n_envs` * Update tests * Update changelog * Fix the fix * Remove `self._setup_model()` from `set_env()` * Raise `AssertionError` when setting env with a different `n_envs` * Update unitests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-17 11:10:03 +02:00
Juan Rocamonde	18b29a68e8	Remove forward() method from common.policies.BaseModel (#1061 ) * Remove forward() method. * Updated changelog	2022-09-11 18:39:13 +02:00
Quentin Gallouédec	98e786f744	Clarify and standardize verbosity documentation (#1056 ) * Standardize the use of verbosity: > to >= * Make verbose docstring more specific * Update changelog	2022-09-09 16:46:28 +02:00
Quentin Gallouédec	29f6687b98	Raise error when observation keys and observation space keys don't match (#1047 ) * Raise error when observation keys and observation space keys don't match * Print the difference in keys * Update changelog	2022-09-05 14:54:58 +02:00
Juan Rocamonde	fdca786f09	Fix replay_buffer_class type annotation (#1042 ) * Fix replay_buffer_class type annotation * Update changelog * Further replacement of same type annotation issue * Formatting * Rolled back formatting changes for consistency	2022-09-01 20:10:01 -07:00
Sidney Tio	304c17dc78	Add append mode to Monitor (#1037 ) * Added option to override or use existing CSVs * Updated changelog for Monitor override * Changed default value to override * Simplify code and add test * Update version * Fix for pytype Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-31 11:53:44 +02:00
Hugh Perkins	2cc1477fa2	Fix advantage normalization with mini-batchsize of 1 (#1028 ) * fix nan in advnatages with batch size 1, for ppo * changelog * black * Simplify test * Bump version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-25 11:50:08 +02:00
Anand Balakrishnan	59af0c1b01	`CheckpointCallback` can now save replay buffer and `VecNormalize` (#1030 ) * CheckpointCallback now saves replay buffer (if present) * VecNormalize stats are saved at checkpoints * Make checkpointing replay buffer and VecNormalize opt-in * Edit changelog * Add documentation for new parameters * Update docs/misc/changelog.rst * Add documentation for new parameters * Implement suggested edits * Reformat code * Fix git conflict * Add .pkl suffix to VecNormalize checkpoints * Add tests for new CheckpointCallback params * Merge CheckpointCallback tests * Update test and add helper for checkpoint path Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-25 10:57:51 +02:00
Honglu Fan	29a481a288	Include `running_mean` and `running_val` when updating target networks (#1004 ) * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Fix `DictReplayBuffer.next_observations` type (#1013) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Fixed missing verbose parameter passing (#1011) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Support for `device=auto` buffers and set it as default value (#1009) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Update test * Add comments and update tests * Bump version * Remove one extra space to conform code style. * Update docstrings Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Burak Demirbilek <BurakDmb@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-23 10:20:43 +02:00
Timothé	01cc127d32	Support hparams logging to tensorboard (#984 ) * create Hparam class & support in all OutputFormats * add hparams documentation & example * add hparam tests * remove unnecessary test & fix name * format changes * support hyperparameters logging to tensorboard * fix HParams class docstring * use more explicit variable names * raise error instead of warning * Unpin protobuf * Add test for logging hparams Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-22 22:06:54 +02:00
Antonin RAFFIN	57e0054e62	Add Quentin to the list of maintainers (#1014 )	2022-08-17 09:55:40 +02:00
Quentin Gallouédec	73822c34da	Support for `device=auto` buffers and set it as default value (#1009 ) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-16 17:54:55 +02:00
Burak Demirbilek	792e3bcc27	Fixed missing verbose parameter passing (#1011 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-08-16 13:32:32 +02:00
Quentin Gallouédec	a30d36002b	Fix `DictReplayBuffer.next_observations` type (#1013 ) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-16 10:53:22 +02:00
Quentin Gallouédec	c4f54fcf04	Handling multi-dimensional action spaces (#971 ) * Handle non 1D action shape * Revert changes of observation (out of the scope of this PR) * Apply changes to DictReplayBuffer * Update tests * Rollout buffer n-D actions space handling * Remove error when non 1D action space * ActorCriticPolicy return action with the proper shape * remove useless reshape * Update changelog * Add tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-06 14:19:20 +02:00
jlp-ue	6ce33f5bd2	Fix url in docs (#1000 ) * fixed URL in docs * Update changelog.rst	2022-08-05 17:54:48 +02:00
Francesco Lucianò	646d6d38b6	Fixed typo in PPO doc (#983 ) * Fixed typo Fixed typo * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-30 12:52:35 +02:00

327 commits