stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-07-01 03:45:11 +00:00

Author	SHA1	Message	Date
Quentin Gallouédec	38a22e8ef4	Merge branch 'master' into doc/EnvPoolAdapter	2023-02-15 17:27:24 +01:00
Quentin Gallouédec	4c3bc98240	fix missing "use"	2023-02-15 17:26:04 +01:00
Quentin Gallouédec	12e9917c24	Fix image-based normalized env loading (#1321 ) * Fix * Add test * Update changelog * fix memory error avoidance * Update version * image env test * black * check_shape_equal * check shape equal in vecnormalize * Allow spaces not to be box or dict * rm `test_save_load_vecnormalized_image` in favor of `test_vec_env` * Remove unused imports --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 14:17:18 +01:00
harveybellini	7a1e429702	Remove Note from examples - Code works (#1330 ) * Remove Note Gif creation works with Atari Environments using the script provided below. * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 13:14:02 +01:00
Vikas Kumar	69b94dd6a8	Rename "timesteps" to "episodes" in `log_interval` documentation (#1325 ) * change timestamp to episode for logging * update changelog * minor format modif * minor format modif --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-02-10 21:15:09 +01:00
Quentin GALLOUÉDEC	7aabb64d24	Update doc	2023-02-10 14:17:44 +01:00
Sidney Tio	489b1fdaf2	Add the argument `dtype` (default to `float32`) to the noise (#1301 ) * Fixed noise to return float32 * Updated changelog * Fixed test to use numpy arrays instead of python floats * Sorted imports for tests * Added dtype to constructor * Removed dtype parameter for VectorizedActionNoise * __init__ -> None; Capitalize and period in docstring when needed; fix dtype type hint; dtype in docstring * fix dtype type hint * Update version * Clarify changelog [skip ci] * empty commit to run ci * Update docs/misc/changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-07 13:42:14 +01:00
Quentin Gallouédec	2e4a45020e	Refactor observation stacking (#1238 ) * refactor stacking obs * Improve docstring * remove all StackedDictObservations * Update tests and make stacked obs clearer * Fix type check * fix stacked_observation_space * undo init change, deprecate StackedDictObservations * deprecate stack_observation_space * type hints * ignore pytype errors * undo vecenv doc change * Deprecation warning in StackedDictObs doctstring * Fix vec_env.rst * Fix __all__ sorting * fix pytype ignore statement * Update docstring * stack * Remove n_stack * Update changelog * Simplify code * Rename test file * Re-use variable for shift * Fix doc build * Remove pytype comment * Disable pytype error --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-06 22:41:59 +01:00
adamfrly	411ff697dd	Ensure train/n_updates metric accounts for early stopping of training loop (#1311 ) * Correct _n_updates when target_kl stops loop early * Update changelog * Simplify code --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-02-06 15:48:41 +01:00
Marco Tröster	d0c1a87faf	Add scaling section to A2C documentation (#1250 ) * add scaling section to A2C documentation * add cross-reference to vectorized envs article * turn it as note * update changelog * add Bonifatius94 to the list of contributors * fix issue number --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-02 12:34:38 +01:00
Quentin Gallouédec	82bc63fca4	Upgrade black formatting (#1310 ) * apply black * Reformat tests --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-02-02 11:58:41 +01:00
Alex Pasquali	bea3c44ba5	Fixed typo in A2C's docstring (#1303 )	2023-01-28 12:04:07 +01:00
Quentin Gallouédec	5ee9009535	Add sticky actions for Atari games (#1286 ) * repeat_action_probability * Add test * Undo atari wrapper doc change since CI fails * remove action_repeat_probability from make_atari_env * Add sticky action wrapper and improve documentation * Update changelog * handle the case noop_max=0 * Update tests * Comply to ALE implementation * Reorder doc * Add doc warning and don't wrap with sticky action when not needed * fix docstring and reorder * Move `action_repeat_probability` args at the last position * Add ref * Update doc and wrap with frameskip only if needed * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-26 10:32:58 +01:00
Quentin Gallouédec	637988c9cc	Fix Atari wrapper bug: tried to step environment that needs reset (#1297 ) * fix 1060 * update changelog	2023-01-26 00:31:20 +01:00
Alex Pasquali	b702884c23	Removed shared layers in mlp_extractor (#1292 ) * Modified actor-critic policies & MlpExtractor class ActorCriticPolicy: - changed type hint of net_arch param: now it's a dict - removed check that if features extractor is not shared: no shared layers are allowed in the mlp_extractor regardless of the features extractor ActorCriticCnnPolicy: - changed type hint of net_arch param: now it's a dict MultiInputActorcriticPolicy: - changed type hint of net_arch param: now it's a dict MlpExtractor: - changed type hint of net_arch param: now it's a dict - adapted networks creation - adapted methods: forward, forward_actor & forward_critic * Removed shared layers in mlp_extractor * Updated docs and changelog + reformat * Updated custom policy tests * Removed test on deprecation warning for share layers in mlp_extractor Now shared layers are removed * Update version * Update RL Zoo doc * Fix linter warnings * Add ruff to Makefile (experimental) * Add backward compat code and minor updates * Update tests * Add backward compatibility * Fix test * Improve compat code Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-23 14:55:19 +01:00
Quentin Gallouédec	69fdf155e1	Downgrade `sphinx-autodoc-typehints` (#1291 ) * Update setup.py * black * hotfix pytype	2023-01-23 10:56:45 +01:00
Quentin Gallouédec	92f7a6f23b	Fix `test_vec_normalize.py`, `test_tensorboard.py` and `common/monitor.py` type hint (#1194 ) * Remove from mypy exclude * type hint for metadata * Union[float, int] -> float * Remove useless __init__ * Type hint for model and logger in BaseCallback * Type hint for metric_dict * Update changelog * fix test_tensorboard * ignore gamma type checking * Fix monitor type hint * Update logger type hints * Fix type annotation and bump version * Fix circular import Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-13 18:28:22 +01:00
Yu Zheng	9bb1538b78	Fix outdated `load_parameters` to `set_parameters` (#1270 ) * Update examples.rst * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-11 14:13:21 +01:00
Antonin RAFFIN	6b8905acdb	Release v1.7.0 (#1268 )	2023-01-10 17:32:57 +01:00
Dominic Kerr	5aa6e7d340	Fix `ProgressBarCallback` under-reporting (#1260 ) * Updated tqdm progress bar constructor to account for the effects of train_freq/n_steps/num_envs on total_timesteps. Ensure progress bar is "flushed" on training end. * Added description of PR #1260. Fixed formatting typo * Partial revert Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-01-10 15:17:52 +01:00
Alex Pasquali	30a19848ce	Deprecation of shared layers in `MlpExtractor` (#1252 ) * Deprecation warning for shared layers in Mlpextractor * Updated changelog * Updated custom policy doc * Update doc and deprecation * Fix doc build * Minor edits Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-05 09:59:36 +01:00
Quentin Gallouédec	4fa17dcf0f	Standardize the use of `from gym import spaces` (#1240 ) * generalize the use of `from gym import spaces` * command line get system info * Documentation line length for doc * update changelog * add space before os plateform to avoid ref to other issue * format * get_system_info update in changelog * fix type check error * fix get system info * add comment about regex * update version	2023-01-02 14:51:11 +01:00
Friedrich Yuan	2bb8ef5e63	Add RLeXplore to the project page (#1246 ) * Update project page Adding the repo "rl-exploration-baselines" to the project page. * Update changelog.rst * Update projects.rst * Update changelog.rst * Update docs/misc/projects.rst * Update changelog.rst Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-28 15:06:09 +01:00
Antonin RAFFIN	e78ba6ffa4	Hotfix to load policies saved with SB3 <= v1.6 (#1234 ) * Hotfix to load policies saved with SB3 <= v1.6 * Add warning and test * Update doc	2022-12-22 23:58:30 +01:00
Antonin RAFFIN	3c028f3d5c	Fix `load_from_tensor` (#1231 )	2022-12-22 17:28:18 +01:00
Quentin Gallouédec	5549b34231	Fix ``stable_baselines3/common/vec_env/vec_check_nan.py`` type hints (#1226 ) * super() init style * "async_step" arg to "event"; "news" to "dones"; improve docstring * Remove vec_check_nan from mypy exclude * Update changelog	2022-12-22 12:24:59 +01:00
Quentin Gallouédec	9aff1137a9	Add support for Python 3.10 (#1227 ) * Add python 3.10 and 3.11 * Update setup * Fix CI * Drop 3.11 (because of pytorch) * Update changelog * revert unwanted change in setup.cfg * Remove remark about pytorch	2022-12-21 15:52:48 +01:00
Antonin RAFFIN	7202ece85b	Update tensorboard callback doc (#1221 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-21 12:51:28 +01:00
Quentin Gallouédec	96b1a7cf01	`env_id` consistency in tests (#1224 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-20 16:01:26 +01:00
Quentin Gallouédec	7fb8336f40	Update PR template (#1225 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-20 15:13:42 +01:00
Alex Pasquali	2cfcec4f50	Modified ActorCriticPolicy to support non-shared features extractor (#1148 ) * Modified ActorCriticPolicy to support non-shared features extractor * Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc Doc update: added 'warning' on custom policy docs that says that, if the features extractor is non-shared, it's not possible to have shared layers in the mlp_extractor * Moved attrib share_features_extractor in class * Updated custom policy doc for non-shared features extractor * Updated changelog * Made some if-statements more readable if policies.py The if-statements are related to the shared/non-shared features extractor in ActorCritic policies * Simplify implementation and add run test * Keep order in module gain to keep previous results consistents * Fix test * Improved docstring in policies.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Added some tests * feature extractor -> features extractor * Fix test * Fix env_id in test * Make features extractor parameter explicit * Remove duplicate Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-20 15:12:05 +01:00
Antonin RAFFIN	8452106734	Fix support of image like normalized inputs (#1214 ) * Fix support of image like normalized inputs * Improve docstring and warning message. * Don't check if obs is image when normalize_images is False (lil opt) * Comment fix * Fix normalize_images not passed to parent * Check for subclasses too * Remove useless multiline * Update version and add comment * Fix some typos Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-20 13:18:28 +01:00
Quentin Gallouédec	ca944fed2d	Update version (#1220 ) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py * Update version * Fix type checking Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-19 13:53:00 +01:00
Antonin Raffin	9af2d11b6e	Update changelog	2022-12-19 13:21:10 +01:00
Antonin Raffin	213b06b0c6	Monkey-patch `np.bool = bool`	2022-12-19 13:20:48 +01:00
Quentin Gallouédec	68a40e0940	Construct tensors directly on GPU (#1218 ) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py	2022-12-19 12:50:22 +01:00
Antonin RAFFIN	0c1bc0b1da	Fix `stable_baselines3/common/atari_wrappers.py` type hints (#1216 ) * Fix `stable_baselines3/common/atari_wrappers.py` type hints * Fix initialization Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-18 16:13:44 +01:00
Antonin RAFFIN	07094c3f2e	Fix `stable_baselines3/common/preprocessing.py` type hints (#1217 )	2022-12-18 15:53:17 +01:00
Alex Pasquali	6d55a09f81	Updated custom policy docs to better explain the ``mlp_extractor``'s dimensions (#1196 ) * Updated custom policy docs Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch. * Improved custom policy doc Section: Custom Network Architecture. Explained with greater detail that an action net and a value net will be added on top of the net_arch. * Improved custom policy doc Section: Custom Network Architecture. Merged a comment into a note * Alignment Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-12-12 16:19:51 +01:00
Quentin Gallouédec	e39bc3da00	Add support for multidimensional `spaces.MultiBinary` observations (#1179 ) * Fix `get_obs_shape` for multidimensional Multibinary space * Update changelog * more tests * fix multidiscrete one-hot encoding * refactor tests * Update changelog.rst * Update changelog.rst * batched obs and revert preprocess_obs changes * Add support for multidimensional ``spaces.MultiBinary`` observations Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-08 18:46:41 +01:00
Quentin Gallouédec	6763a864c8	Upgrade CI/github-actions (#1204 ) * checkout v2 -> v3; setup-python v2 -> v4 * Update changelog.rst	2022-12-07 16:43:47 +01:00
Athanasios Theocharis	f7d7ed3fa7	Update custom_policy.rst (#1183 ) * Update custom_policy.rst * Update changelog Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-06 17:51:52 +01:00
Quentin Gallouédec	002850f8ac	Fix `stable_baselines3/common/torch_layers.py` type hint (#1191 ) * Remove torch layers from mypy exclude * Make torch layers mypy compliant * Extra type specification * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-29 23:46:32 +01:00
Zikang Xiong	852d635742	Exposed modules in __init__.py with __all__ (#1195 ) * Exposed modules in __init__.py with __all__ * Remove flake8 ignore and update root __all__ * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 23:33:46 +01:00
Quentin Gallouédec	b46396a664	Fix `stable_baselines3/common/env_util.py` type hint (#1192 ) * Remove env_util from mypy exclude * Fix make_atari_env type hint * Update changelog	2022-11-29 15:36:55 +01:00
Quentin Gallouédec	5cd891317e	Add `with_bias` parameter to `create_mlp` (#1188 ) * Add with_bias arg * Update changelog * move torch_layers to the last position * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 12:43:16 +01:00
Quentin Gallouédec	6902fac5e7	Fix `stable_baselines3/common/type_aliases.py` type hint (#1189 )	2022-11-29 12:26:16 +01:00
Quentin Gallouédec	0973b01b9d	Fix `tests/test_distributions.py` type hint (#1186 ) * Fixed test_distribution type hint * Impose list[int] for action dim	2022-11-29 11:27:59 +01:00
Quentin Gallouédec	aee0ba03c7	Update changelog for #1184 (#1185 )	2022-11-28 19:36:26 +01:00
Quentin Gallouédec	e3b24829a5	Drop `gym.GoalEnv` and other minor changes initially from #780 (#1184 ) * Various changes from #780 * Fix env_checker for goal_env detection	2022-11-28 18:22:31 +01:00

744 commits