stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-18 21:30:19 +00:00

Author	SHA1	Message	Date
npit	4232f9daa9	Rename the observations variable in the evaluation util to avoid shadowing (#1288) * Rename the observations variable in the evaluation util to avoid shadowing This enables a callback in evaluate_policy to have access to the observation vector that is fed to the environment step function, which is currently shadowed by the output observation. * Update changelog * Add test * Move assignment outside of the loop --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-11 18:00:33 +02:00
Antonin RAFFIN	84f5511e08	Update changelog and cleanup (#1434)	2023-04-08 15:36:55 +02:00
Jonas Reiher	12250eb761	Add stats window argument (#1424) * added stats_window_size argument * updated changelog * docstring info updated * added missing tensorboard log docstring * added stats_window_size argument for all models * fixed stats_window_size test * Update version --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-05 11:33:26 +02:00
Antonin RAFFIN	5a70af8abd	Fix type hints for DQN (#1354) * Fix type hints for DQN * [ci skip] Remove commented line * Refine types * Fix vectorized obs detection * Fix for pytype * Fix check at load time to create replay buffer * One config file to rule them all * Delete unused config --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-03-30 11:31:47 +02:00
Omar Younis	a60b0179e0	Fix: Reshape action in DictRolloutBuffer (#1395) * reshape action in DictRolloutBuffer * improve buffer test * update changelog * add comment * Update comments and version --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-03-29 16:25:05 +02:00
Fiete	b6aa507a22	Make check_env assertions in regards to observation_space more actionable (#1400) * add instructions for running single tests in the README, add assertions for observation_space * update changelog * address linting warnings * correct pytest command in the README * correct review comments, run make commit-checks * truncate lines that are too long * address make lint warning about checking module availability * fix tests * use f-strings for formatting assertion messages * fix type issue * Refactor tests, improve error messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-03-29 15:26:03 +02:00
Quentin Gallouédec	c5adad82b2	Multiprocessing support for HerReplayBuffer (#704) * IM compat. modif from old fork * mp her working, without offline sampling * update readme and doc * fix discrete action/obs space case * handle offline sampling * fix pos to be consistent with the old version * improve typing and docstring * fix discrete obs special case * new her, using episode uid * deal with full buffer * offline not implemented * info storage; compute_reward as arg; offline sampling error * offline sampling; timeout_termination; fix last_trans detection * rm max_episode_length from tests * fix loading and loading test * Fix episode sampling strategy * Episode interrupted not valid * Typo * Fix infos sampling, next_obs desired goals, offline sampling * update tests for multienvs * speed up code * handle timeout sampling when samping * give up ep_uid for ep_start and ep_lenght * speed up sampling * Improve docstring * Typos and renaming * Fix typing * Fix linter warnings * Renaming + add note * fix reward type * Fix future sampling strategy * Fix future goal selection strategy * env_fn as lambda * Re-fix linter warnings * Formatting * Fix offline sampling * restore the initial performance budget * Remove max_episode_length for HerReplayBuffer kwargs * SubprcVecEnv compat test * Dedicated SubrocVecEnv test rm n_envs from parametrization * Back to using the env arg instead of compute_reward * Up VecEnv import * fix lint warnings * fix docstring * Fix device issue * actor_loss_modifier in SAV and TD3 * Merge RewardModifier and ActorLossModifier into Surgeon * update surgeon for rnd * fix uninteded merge * fix uninteded merge * fix unintended merge * Rm unintended merge * Fix KeyError * Remove useless `all_inds` * Minor docstring format * Fix hint * speedup! * Speedup again * speedup * np.nonzero * fix env normalization * flat sampling for speedup * typo * drop online * format * remove observation from env_cheker (see #1335) * update changelog * default device to "auto" * add comment for info storage * add comment for ep_start and ep_length attributes * a[b][c] to a[b, c] * comment flatnonzero and unravel_index * update _sample_goals docstring * Fix future gaol sampling for split episode * add informative error message for learning_starts too small * use keyword arg for env * try fix pytye * Update stable_baselines3/common/off_policy_algorithm.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Add `copy_info_dict` option * Ignore pytype * Update changelog * Rename variables and improve documentation * Ignore new bug bear rule * Add note about future strategy * Add deprecation warning * Fix bug trying to pickle buffer kwargs --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-03-20 12:03:57 +01:00
Antonin RAFFIN	e5deeed16e	Update doc about Gymnasium support (#1382)	2023-03-14 12:43:19 +01:00
Antonin RAFFIN	470771b5c2	Fix Atari Roms download, enable RUF linting (#1379) * Add extra no Atari and fix CI for forks * Enable ruff rules * Change to no roms	2023-03-12 18:47:52 +01:00
Antonin RAFFIN	10e83865ec	Switch to `pyproject.toml` and `ruff` (#1361) * Switch to `pyproject.toml` and `ruff` * Fix for Atari ROMs and mypy * Switch order in CI, lint first	2023-03-11 22:15:26 +01:00
Antonin RAFFIN	f0382a25bd	Add documentation about default network architecture (#1353) * Add documentation about default network architecture * [ci skip] Rename custom policy section to Policy Networks	2023-03-02 14:14:57 +01:00
Antonin RAFFIN	ed8783cb73	Add support for dict/tuple obs space for VecCheckNaN (#1348) * Add support for dict/tuple obs space for VecCheckNaN * Handle list too * Address comments from code review * Ignore B028 (explicit stack level)	2023-02-27 13:45:17 +01:00
Antonin RAFFIN	085bdd5a68	Remove deprecated usage of feature extractor (#1296) * Remove deprecated usage of feature extractor * Update changelog and version * Update changelog.rst	2023-02-19 12:53:10 +01:00
Quentin Gallouédec	12e9917c24	Fix image-based normalized env loading (#1321) * Fix * Add test * Update changelog * fix memory error avoidance * Update version * image env test * black * check_shape_equal * check shape equal in vecnormalize * Allow spaces not to be box or dict * rm `test_save_load_vecnormalized_image` in favor of `test_vec_env` * Remove unused imports --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 14:17:18 +01:00
harveybellini	7a1e429702	Remove Note from examples - Code works (#1330) * Remove Note Gif creation works with Atari Environments using the script provided below. * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 13:14:02 +01:00
Vikas Kumar	69b94dd6a8	Rename "timesteps" to "episodes" in `log_interval` documentation (#1325) * change timestamp to episode for logging * update changelog * minor format modif * minor format modif --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-02-10 21:15:09 +01:00
Sidney Tio	489b1fdaf2	Add the argument `dtype` (default to `float32`) to the noise (#1301) * Fixed noise to return float32 * Updated changelog * Fixed test to use numpy arrays instead of python floats * Sorted imports for tests * Added dtype to constructor * Removed dtype parameter for VectorizedActionNoise * __init__ -> None; Capitalize and period in docstring when needed; fix dtype type hint; dtype in docstring * fix dtype type hint * Update version * Clarify changelog [skip ci] * empty commit to run ci * Update docs/misc/changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-07 13:42:14 +01:00
Quentin Gallouédec	2e4a45020e	Refactor observation stacking (#1238) * refactor stacking obs * Improve docstring * remove all StackedDictObservations * Update tests and make stacked obs clearer * Fix type check * fix stacked_observation_space * undo init change, deprecate StackedDictObservations * deprecate stack_observation_space * type hints * ignore pytype errors * undo vecenv doc change * Deprecation warning in StackedDictObs doctstring * Fix vec_env.rst * Fix __all__ sorting * fix pytype ignore statement * Update docstring * stack * Remove n_stack * Update changelog * Simplify code * Rename test file * Re-use variable for shift * Fix doc build * Remove pytype comment * Disable pytype error --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-06 22:41:59 +01:00
adamfrly	411ff697dd	Ensure train/n_updates metric accounts for early stopping of training loop (#1311) * Correct _n_updates when target_kl stops loop early * Update changelog * Simplify code --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-02-06 15:48:41 +01:00
Marco Tröster	d0c1a87faf	Add scaling section to A2C documentation (#1250) * add scaling section to A2C documentation * add cross-reference to vectorized envs article * turn it as note * update changelog * add Bonifatius94 to the list of contributors * fix issue number --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-02 12:34:38 +01:00
Alex Pasquali	bea3c44ba5	Fixed typo in A2C's docstring (#1303)	2023-01-28 12:04:07 +01:00
Quentin Gallouédec	5ee9009535	Add sticky actions for Atari games (#1286) * repeat_action_probability * Add test * Undo atari wrapper doc change since CI fails * remove action_repeat_probability from make_atari_env * Add sticky action wrapper and improve documentation * Update changelog * handle the case noop_max=0 * Update tests * Comply to ALE implementation * Reorder doc * Add doc warning and don't wrap with sticky action when not needed * fix docstring and reorder * Move `action_repeat_probability` args at the last position * Add ref * Update doc and wrap with frameskip only if needed * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-26 10:32:58 +01:00
Quentin Gallouédec	637988c9cc	Fix Atari wrapper bug: tried to step environment that needs reset (#1297) * fix 1060 * update changelog	2023-01-26 00:31:20 +01:00
Alex Pasquali	b702884c23	Removed shared layers in mlp_extractor (#1292) * Modified actor-critic policies & MlpExtractor class ActorCriticPolicy: - changed type hint of net_arch param: now it's a dict - removed check that if features extractor is not shared: no shared layers are allowed in the mlp_extractor regardless of the features extractor ActorCriticCnnPolicy: - changed type hint of net_arch param: now it's a dict MultiInputActorcriticPolicy: - changed type hint of net_arch param: now it's a dict MlpExtractor: - changed type hint of net_arch param: now it's a dict - adapted networks creation - adapted methods: forward, forward_actor & forward_critic * Removed shared layers in mlp_extractor * Updated docs and changelog + reformat * Updated custom policy tests * Removed test on deprecation warning for share layers in mlp_extractor Now shared layers are removed * Update version * Update RL Zoo doc * Fix linter warnings * Add ruff to Makefile (experimental) * Add backward compat code and minor updates * Update tests * Add backward compatibility * Fix test * Improve compat code Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-23 14:55:19 +01:00
Quentin Gallouédec	92f7a6f23b	Fix `test_vec_normalize.py`, `test_tensorboard.py` and `common/monitor.py` type hint (#1194) * Remove from mypy exclude * type hint for metadata * Union[float, int] -> float * Remove useless __init__ * Type hint for model and logger in BaseCallback * Type hint for metric_dict * Update changelog * fix test_tensorboard * ignore gamma type checking * Fix monitor type hint * Update logger type hints * Fix type annotation and bump version * Fix circular import Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-13 18:28:22 +01:00
Yu Zheng	9bb1538b78	Fix outdated `load_parameters` to `set_parameters` (#1270) * Update examples.rst * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-11 14:13:21 +01:00
Antonin RAFFIN	6b8905acdb	Release v1.7.0 (#1268)	2023-01-10 17:32:57 +01:00
Dominic Kerr	5aa6e7d340	Fix `ProgressBarCallback` under-reporting (#1260) * Updated tqdm progress bar constructor to account for the effects of train_freq/n_steps/num_envs on total_timesteps. Ensure progress bar is "flushed" on training end. * Added description of PR #1260. Fixed formatting typo * Partial revert Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-01-10 15:17:52 +01:00
Alex Pasquali	30a19848ce	Deprecation of shared layers in `MlpExtractor` (#1252) * Deprecation warning for shared layers in Mlpextractor * Updated changelog * Updated custom policy doc * Update doc and deprecation * Fix doc build * Minor edits Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-05 09:59:36 +01:00
Quentin Gallouédec	4fa17dcf0f	Standardize the use of `from gym import spaces` (#1240) * generalize the use of `from gym import spaces` * command line get system info * Documentation line length for doc * update changelog * add space before os plateform to avoid ref to other issue * format * get_system_info update in changelog * fix type check error * fix get system info * add comment about regex * update version	2023-01-02 14:51:11 +01:00
Friedrich Yuan	2bb8ef5e63	Add RLeXplore to the project page (#1246) * Update project page Adding the repo "rl-exploration-baselines" to the project page. * Update changelog.rst * Update projects.rst * Update changelog.rst * Update docs/misc/projects.rst * Update changelog.rst Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-28 15:06:09 +01:00
Antonin RAFFIN	e78ba6ffa4	Hotfix to load policies saved with SB3 <= v1.6 (#1234) * Hotfix to load policies saved with SB3 <= v1.6 * Add warning and test * Update doc	2022-12-22 23:58:30 +01:00
Antonin RAFFIN	3c028f3d5c	Fix `load_from_tensor` (#1231)	2022-12-22 17:28:18 +01:00
Quentin Gallouédec	5549b34231	Fix ``stable_baselines3/common/vec_env/vec_check_nan.py`` type hints (#1226) * super() init style * "async_step" arg to "event"; "news" to "dones"; improve docstring * Remove vec_check_nan from mypy exclude * Update changelog	2022-12-22 12:24:59 +01:00
Quentin Gallouédec	9aff1137a9	Add support for Python 3.10 (#1227) * Add python 3.10 and 3.11 * Update setup * Fix CI * Drop 3.11 (because of pytorch) * Update changelog * revert unwanted change in setup.cfg * Remove remark about pytorch	2022-12-21 15:52:48 +01:00
Antonin RAFFIN	7202ece85b	Update tensorboard callback doc (#1221) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-21 12:51:28 +01:00
Quentin Gallouédec	96b1a7cf01	`env_id` consistency in tests (#1224) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-20 16:01:26 +01:00
Quentin Gallouédec	7fb8336f40	Update PR template (#1225) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-20 15:13:42 +01:00
Alex Pasquali	2cfcec4f50	Modified ActorCriticPolicy to support non-shared features extractor (#1148) * Modified ActorCriticPolicy to support non-shared features extractor * Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc Doc update: added 'warning' on custom policy docs that says that, if the features extractor is non-shared, it's not possible to have shared layers in the mlp_extractor * Moved attrib share_features_extractor in class * Updated custom policy doc for non-shared features extractor * Updated changelog * Made some if-statements more readable if policies.py The if-statements are related to the shared/non-shared features extractor in ActorCritic policies * Simplify implementation and add run test * Keep order in module gain to keep previous results consistents * Fix test * Improved docstring in policies.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Added some tests * feature extractor -> features extractor * Fix test * Fix env_id in test * Make features extractor parameter explicit * Remove duplicate Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-20 15:12:05 +01:00
Antonin RAFFIN	8452106734	Fix support of image like normalized inputs (#1214) * Fix support of image like normalized inputs * Improve docstring and warning message. * Don't check if obs is image when normalize_images is False (lil opt) * Comment fix * Fix normalize_images not passed to parent * Check for subclasses too * Remove useless multiline * Update version and add comment * Fix some typos Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-20 13:18:28 +01:00
Quentin Gallouédec	ca944fed2d	Update version (#1220) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py * Update version * Fix type checking Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-19 13:53:00 +01:00
Antonin Raffin	9af2d11b6e	Update changelog	2022-12-19 13:21:10 +01:00
Quentin Gallouédec	68a40e0940	Construct tensors directly on GPU (#1218) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py	2022-12-19 12:50:22 +01:00
Antonin RAFFIN	0c1bc0b1da	Fix `stable_baselines3/common/atari_wrappers.py` type hints (#1216) * Fix `stable_baselines3/common/atari_wrappers.py` type hints * Fix initialization Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-18 16:13:44 +01:00
Antonin RAFFIN	07094c3f2e	Fix `stable_baselines3/common/preprocessing.py` type hints (#1217)	2022-12-18 15:53:17 +01:00
Alex Pasquali	6d55a09f81	Updated custom policy docs to better explain the ``mlp_extractor``'s dimensions (#1196) * Updated custom policy docs Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch. * Improved custom policy doc Section: Custom Network Architecture. Explained with greater detail that an action net and a value net will be added on top of the net_arch. * Improved custom policy doc Section: Custom Network Architecture. Merged a comment into a note * Alignment Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-12-12 16:19:51 +01:00
Quentin Gallouédec	e39bc3da00	Add support for multidimensional `spaces.MultiBinary` observations (#1179) * Fix `get_obs_shape` for multidimensional Multibinary space * Update changelog * more tests * fix multidiscrete one-hot encoding * refactor tests * Update changelog.rst * Update changelog.rst * batched obs and revert preprocess_obs changes * Add support for multidimensional ``spaces.MultiBinary`` observations Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-08 18:46:41 +01:00
Quentin Gallouédec	6763a864c8	Upgrade CI/github-actions (#1204) * checkout v2 -> v3; setup-python v2 -> v4 * Update changelog.rst	2022-12-07 16:43:47 +01:00
Athanasios Theocharis	f7d7ed3fa7	Update custom_policy.rst (#1183) * Update custom_policy.rst * Update changelog Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-06 17:51:52 +01:00
Quentin Gallouédec	002850f8ac	Fix `stable_baselines3/common/torch_layers.py` type hint (#1191) * Remove torch layers from mypy exclude * Make torch layers mypy compliant * Extra type specification * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-29 23:46:32 +01:00

410 commits