stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-06-06 00:03:28 +00:00

Author	SHA1	Message	Date
Kallinteris Andreas	9c338f917a	`vec_env`s fix `seed()` causing a reset (#1486 ) * `dummy_vec_env` fix `seed()` causing a reset * rename `seed` * fixes * bug fix * fix seed return type * Cleanup seeding, add test and remove compat wrapper * Update env checker and tests * Add deterministic test for make_vec_env --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-05-20 10:30:54 +02:00
Antonin RAFFIN	fd0cd82339	Update outdated custom env doc (#1490 ) * Update outdated custom env doc * fix render_mode and term/trunc/reset_info * gym -> gymnasium --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-05-08 13:48:26 +02:00
Quentin Gallouédec	9cebedc89f	Fix Colab logger error (#1484 ) * fix HumanOutputFormat * update version * update changelog * TextIO annotation, TextIOBase isinstance * update changelog * test for HumanOutputFormat with custom TextIO * rm extra test line * Update tests/test_logger.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-05-05 14:26:39 +02:00
Antonin RAFFIN	63a0bb9da1	Type annotation bundle (logger, vec env, custom envs) (#1479 ) * Switch from List to Sequence for `seed()` type hint * Fix logger type hints * Improve replay buffer type hints * Fix custom envs type annotations * Fix VecMonitor type hints * Fix RMSprop type hint * Fix vec extract dict obs type hints * Fix vec frame stack type annotations * Fix base vec env type hints * Fix dummy vec env type hints * Fix for mypy * Fixes for the tests * mypy doesn't like when we overwrite type * fix step of SimpleMultiObsEnv * remove useless type specification * Rm useless type hint * Improve logger type hint * format * rm useless type hint * Re-add variables in constructor, remove unused import --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-05-04 20:27:15 +02:00
Sidney Tio	d6ddee9366	Add evalcallback example (#1468 ) * Moved 'Monitoring Training' to subsubsection of 'Using callbacks' * Added EvalCallback example * Updated Changelogs * Edited the language * Moved subsection headers up one level * added make_vec_env into Evalcallback example * Added parameters to the top for readability * Added note on multiple training environments * Added more clarity to eval_freq note * Apply suggestions from code review --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-05-02 18:02:36 +02:00
Übertreiber	4f9805eeb8	Fix overly relaxed version requirement on NumPy (#1472 ) Since commit `489b1fda`, this package has been using `numpy.typing.DTypeLike`, which was only added in [NumPy 1.20][1]. [1]: https://numpy.org/doc/stable/release/1.20.0-notes.html#numpy-is-now-typed Co-authored-by: troiganto <troiganto@proton.me> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-04-27 19:07:53 +02:00
Tobias Rohrer	6cbb2c9303	Fix DQN target update interval for multi-env (#1463 ) * Calculating target update interval per environment in `_on_step()`. See GitHub issue #1373 * Added changelog entry and changed test comment * Added requested changes from code review * Update version --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-04-27 18:35:33 +02:00
Lei He	dc09d81f9c	Added UAV_Navigation_DRL_AirSim to the project page (#1462 ) * Update changelog.rst * Update projects.rst * Update grammar and fix doc build --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-04-20 23:12:57 +02:00
Antonin RAFFIN	96526ed08a	Update issue templates and env infos (#1451 ) * Update issue templates and env infos * Fix pytype	2023-04-14 13:50:14 +02:00
Antonin RAFFIN	40e0b9d2c8	Add Gymnasium support (#1327 ) * Fix failing set_env test * Fix test failiing due to deprectation of env.seed * Adjust mean reward threshold in failing test * Fix her test failing due to rng * Change seed and revert reward threshold to 90 * Pin gym version * Make VecEnv compatible with gym seeding change * Revert change to VecEnv reset signature * Change subprocenv seed cmd to call reset instead * Fix type check * Add backward compat * Add `compat_gym_seed` helper * Add goal env checks in env_checker * Add docs on HER requirements for envs * Capture user warning in test with inverted box space * Update ale-py version * Fix randint * Allow noop_max to be zero * Update changelog * Update docker image * Update doc conda env and dockerfile * Custom envs should not have any warnings * Fix test for numpy >= 1.21 * Add check for vectorized compute reward * Bump to gym 0.24 * Fix gym default step docstring * Test downgrading gym * Revert "Test downgrading gym" This reverts commit 0072b77156c006ada8a1d6e26ce347ed85a83eeb. * Fix protobuf error * Fix in dependencies * Fix protobuf dep * Use newest version of cartpole * Update gym * Fix warning * Loosen required scipy version * Scipy no longer needed * Try gym 0.25 * Silence warnings from gym * Filter warnings during tests * Update doc * Update requirements * Add gym 26 compat in vec env * Fixes in envs and tests for gym 0.26+ * Enforce gym 0.26 api * format * Fix formatting * Fix dependencies * Fix syntax * Cleanup doc and warnings * Faster tests * Higher budget for HER perf test (revert prev change) * Fixes and update doc * Fix doc build * Fix breaking change * Fixes for rendering * Rename variables in monitor * update render method for gym 0.26 API backwards compatible (mode argument is allowed) while using the gym 0.26 API (render mode is determined at environment creation) * update tests and docs to new gym render API * undo removal of render modes metatadata check * set rgb_array as default render mode for gym.make * undo changes & raise warning if not 'rgb_array' * Fix type check * Remove recursion and fix type checking * Remove hacks for protobuf and gym 0.24 * Fix type annotations * reuse existing render_mode attribute * return tiled images for 'human' render mode * Allow to use opencv for human render, fix typos * Add warning when using non-zero start with Discrete (fixes #1197) * Fix type checking * Bug fixes and handle more cases * Throw proper warnings * Update test * Fix new metadata name * Ignore numpy warnings * Fixes in vec recorder * Global ignore * Filter local warning too * Monkey patch not needed for gym 26 * Add doc of VecEnv vs Gym API * Add render test * Fix return type * Update VecEnv vs Gym API doc * Fix for custom render mode * Fix return type * Fix type checking * check test env test_buffer * skip render check * check env test_dict_env * test_env test_gae * check envs in remaining tests * Update tests * Add warning for Discrete action space with non-zero (#1295) * Fix atari annotation * ignore get_action_meanings [attr-defined] * Fix mypy issues * Add patch for gym/gymnasium transition * Switch to gymnasium * Rely on signature instead of version * More patches * Type ignore because of https://github.com/Farama-Foundation/Gymnasium/pull/39 * Fix doc build * Fix pytype errors * Fix atari requirement * Update env checker due to change in dtype for Discrete * Fix type hint * Convert spaces for saved models * Ignore pytype * Remove gitlab CI * Disable pytype for convert space * Fix undefined info * Fix undefined info * Upgrade shimmy * Fix wrappers type annotation (need PR from Gymnasium) * Fix gymnasium dependency * Fix dependency declaration * Cap pygame version for python 3.7 * Point to master branch (v0.28.0) * Fix: use main not master branch * Rename done to terminated * Fix pygame dependency for python 3.7 * Rename gym to gymnasium * Update Gymnasium * Fix test * Fix tests * Forks don't have access to private variables * Fix linter warnings * Update read the doc env * Fix env checker for GoalEnv * Fix import * Update env checker (more info) and fix dtype * Use micromamab for Docker * Update dependencies * Clarify VecEnv doc * Fix Gymnasium version * Copy file only after mamba install * [ci skip] Update docker doc * Polish code * Reformat * Remove deprecated features * Ignore warning * Update doc * Update examples and changelog * Fix type annotation bundle (SAC, TD3, A2C, PPO, base class) (#1436) * Fix SAC type hints, improve DQN ones * Fix A2C and TD3 type hints * Fix PPO type hints * Fix on-policy type hints * Fix base class type annotation, do not use defaults * Update version * Disable mypy for python 3.7 * Rename Gym26StepReturn * Update continuous critic type annotation * Fix pytype complain --------- Co-authored-by: Carlos Luis <carlos.luisgonc@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Thomas Lips <37955681+tlpss@users.noreply.github.com> Co-authored-by: tlips <thomas.lips@ugent.be> Co-authored-by: tlpss <thomas17.lips@gmail.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2023-04-14 13:13:59 +02:00
WeberSamuel	15c9daa2ba	Fix VecExtractDictObs does not handle terminal observation (#1443 ) * VecExtractDictObs handle terminal_observation * Added VecExtractDictObs handle terminal_output to changelog * Update changelog.rst * Update test_vec_extract_dict_obs.py Add random dones in env to test if terminal_observation is properly handled * Made test deterministic * Fixed bug in test * Improved test * Fix format in test * Update test * Fix type hint * Ignore pytype warning * Ignore pytype --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-04-12 15:20:04 +02:00
npit	4232f9daa9	Rename the observations variable in the evaluation util to avoid shadowing (#1288 ) * Rename the observations variable in the evaluation util to avoid shadowing This enables a callback in evaluate_policy to have access to the observation vector that is fed to the environment step function, which is currently shadowed by the output observation. * Update changelog * Add test * Move assignment outside of the loop --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-11 18:00:33 +02:00
Antonin RAFFIN	84f5511e08	Update changelog and cleanup (#1434 )	2023-04-08 15:36:55 +02:00
Jonas Reiher	12250eb761	Add stats window argument (#1424 ) * added stats_window_size argument * updated changelog * docstring info updated * added missing tensorboard log docstring * added stats_window_size argument for all models * fixed stats_window_size test * Update version --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-05 11:33:26 +02:00
Antonin RAFFIN	5a70af8abd	Fix type hints for DQN (#1354 ) * Fix type hints for DQN * [ci skip] Remove commented line * Refine types * Fix vectorized obs detection * Fix for pytype * Fix check at load time to create replay buffer * One config file to rule them all * Delete unused config --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-03-30 11:31:47 +02:00
Omar Younis	a60b0179e0	Fix: Reshape action in DictRolloutBuffer (#1395 ) * reshape action in DictRolloutBuffer * improve buffer test * update changelog * add comment * Update comments and version --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-03-29 16:25:05 +02:00
Fiete	b6aa507a22	Make check_env assertions in regards to observation_space more actionable (#1400 ) * add instructions for running single tests in the README, add assertions for observation_space * update changelog * address linting warnings * correct pytest command in the README * correct review comments, run make commit-checks * truncate lines that are too long * address make lint warning about checking module availability * fix tests * use f-strings for formatting assertion messages * fix type issue * Refactor tests, improve error messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-03-29 15:26:03 +02:00
Quentin Gallouédec	c5adad82b2	Multiprocessing support for HerReplayBuffer (#704 ) * IM compat. modif from old fork * mp her working, without offline sampling * update readme and doc * fix discrete action/obs space case * handle offline sampling * fix pos to be consistent with the old version * improve typing and docstring * fix discrete obs special case * new her, using episode uid * deal with full buffer * offline not implemented * info storage; compute_reward as arg; offline sampling error * offline sampling; timeout_termination; fix last_trans detection * rm max_episode_length from tests * fix loading and loading test * Fix episode sampling strategy * Episode interrupted not valid * Typo * Fix infos sampling, next_obs desired goals, offline sampling * update tests for multienvs * speed up code * handle timeout sampling when samping * give up ep_uid for ep_start and ep_lenght * speed up sampling * Improve docstring * Typos and renaming * Fix typing * Fix linter warnings * Renaming + add note * fix reward type * Fix future sampling strategy * Fix future goal selection strategy * env_fn as lambda * Re-fix linter warnings * Formatting * Fix offline sampling * restore the initial performance budget * Remove max_episode_length for HerReplayBuffer kwargs * SubprcVecEnv compat test * Dedicated SubrocVecEnv test rm n_envs from parametrization * Back to using the env arg instead of compute_reward * Up VecEnv import * fix lint warnings * fix docstring * Fix device issue * actor_loss_modifier in SAV and TD3 * Merge RewardModifier and ActorLossModifier into Surgeon * update surgeon for rnd * fix uninteded merge * fix uninteded merge * fix unintended merge * Rm unintended merge * Fix KeyError * Remove useless `all_inds` * Minor docstring format * Fix hint * speedup! * Speedup again * speedup * np.nonzero * fix env normalization * flat sampling for speedup * typo * drop online * format * remove observation from env_cheker (see #1335) * update changelog * default device to "auto" * add comment for info storage * add comment for ep_start and ep_length attributes * a[b][c] to a[b, c] * comment flatnonzero and unravel_index * update _sample_goals docstring * Fix future gaol sampling for split episode * add informative error message for learning_starts too small * use keyword arg for env * try fix pytye * Update stable_baselines3/common/off_policy_algorithm.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Add `copy_info_dict` option * Ignore pytype * Update changelog * Rename variables and improve documentation * Ignore new bug bear rule * Add note about future strategy * Add deprecation warning * Fix bug trying to pickle buffer kwargs --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-03-20 12:03:57 +01:00
Antonin RAFFIN	e5deeed16e	Update doc about Gymnasium support (#1382 )	2023-03-14 12:43:19 +01:00
Antonin RAFFIN	470771b5c2	Fix Atari Roms download, enable RUF linting (#1379 ) * Add extra no Atari and fix CI for forks * Enable ruff rules * Change to no roms	2023-03-12 18:47:52 +01:00
Antonin RAFFIN	10e83865ec	Switch to `pyproject.toml` and `ruff` (#1361 ) * Switch to `pyproject.toml` and `ruff` * Fix for Atari ROMs and mypy * Switch order in CI, lint first	2023-03-11 22:15:26 +01:00
Antonin RAFFIN	f0382a25bd	Add documentation about default network architecture (#1353 ) * Add documentation about default network architecture * [ci skip] Rename custom policy section to Policy Networks	2023-03-02 14:14:57 +01:00
Antonin RAFFIN	ed8783cb73	Add support for dict/tuple obs space for VecCheckNaN (#1348 ) * Add support for dict/tuple obs space for VecCheckNaN * Handle list too * Address comments from code review * Ignore B028 (explicit stack level)	2023-02-27 13:45:17 +01:00
Antonin RAFFIN	085bdd5a68	Remove deprecated usage of feature extractor (#1296 ) * Remove deprecated usage of feature extractor * Update changelog and version * Update changelog.rst	2023-02-19 12:53:10 +01:00
Quentin Gallouédec	12e9917c24	Fix image-based normalized env loading (#1321 ) * Fix * Add test * Update changelog * fix memory error avoidance * Update version * image env test * black * check_shape_equal * check shape equal in vecnormalize * Allow spaces not to be box or dict * rm `test_save_load_vecnormalized_image` in favor of `test_vec_env` * Remove unused imports --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 14:17:18 +01:00
harveybellini	7a1e429702	Remove Note from examples - Code works (#1330 ) * Remove Note Gif creation works with Atari Environments using the script provided below. * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 13:14:02 +01:00
Vikas Kumar	69b94dd6a8	Rename "timesteps" to "episodes" in `log_interval` documentation (#1325 ) * change timestamp to episode for logging * update changelog * minor format modif * minor format modif --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-02-10 21:15:09 +01:00
Sidney Tio	489b1fdaf2	Add the argument `dtype` (default to `float32`) to the noise (#1301 ) * Fixed noise to return float32 * Updated changelog * Fixed test to use numpy arrays instead of python floats * Sorted imports for tests * Added dtype to constructor * Removed dtype parameter for VectorizedActionNoise * __init__ -> None; Capitalize and period in docstring when needed; fix dtype type hint; dtype in docstring * fix dtype type hint * Update version * Clarify changelog [skip ci] * empty commit to run ci * Update docs/misc/changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-07 13:42:14 +01:00
Quentin Gallouédec	2e4a45020e	Refactor observation stacking (#1238 ) * refactor stacking obs * Improve docstring * remove all StackedDictObservations * Update tests and make stacked obs clearer * Fix type check * fix stacked_observation_space * undo init change, deprecate StackedDictObservations * deprecate stack_observation_space * type hints * ignore pytype errors * undo vecenv doc change * Deprecation warning in StackedDictObs doctstring * Fix vec_env.rst * Fix __all__ sorting * fix pytype ignore statement * Update docstring * stack * Remove n_stack * Update changelog * Simplify code * Rename test file * Re-use variable for shift * Fix doc build * Remove pytype comment * Disable pytype error --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-06 22:41:59 +01:00
adamfrly	411ff697dd	Ensure train/n_updates metric accounts for early stopping of training loop (#1311 ) * Correct _n_updates when target_kl stops loop early * Update changelog * Simplify code --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-02-06 15:48:41 +01:00
Marco Tröster	d0c1a87faf	Add scaling section to A2C documentation (#1250 ) * add scaling section to A2C documentation * add cross-reference to vectorized envs article * turn it as note * update changelog * add Bonifatius94 to the list of contributors * fix issue number --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-02-02 12:34:38 +01:00
Alex Pasquali	bea3c44ba5	Fixed typo in A2C's docstring (#1303 )	2023-01-28 12:04:07 +01:00
Quentin Gallouédec	5ee9009535	Add sticky actions for Atari games (#1286 ) * repeat_action_probability * Add test * Undo atari wrapper doc change since CI fails * remove action_repeat_probability from make_atari_env * Add sticky action wrapper and improve documentation * Update changelog * handle the case noop_max=0 * Update tests * Comply to ALE implementation * Reorder doc * Add doc warning and don't wrap with sticky action when not needed * fix docstring and reorder * Move `action_repeat_probability` args at the last position * Add ref * Update doc and wrap with frameskip only if needed * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-26 10:32:58 +01:00
Quentin Gallouédec	637988c9cc	Fix Atari wrapper bug: tried to step environment that needs reset (#1297 ) * fix 1060 * update changelog	2023-01-26 00:31:20 +01:00
Alex Pasquali	b702884c23	Removed shared layers in mlp_extractor (#1292 ) * Modified actor-critic policies & MlpExtractor class ActorCriticPolicy: - changed type hint of net_arch param: now it's a dict - removed check that if features extractor is not shared: no shared layers are allowed in the mlp_extractor regardless of the features extractor ActorCriticCnnPolicy: - changed type hint of net_arch param: now it's a dict MultiInputActorcriticPolicy: - changed type hint of net_arch param: now it's a dict MlpExtractor: - changed type hint of net_arch param: now it's a dict - adapted networks creation - adapted methods: forward, forward_actor & forward_critic * Removed shared layers in mlp_extractor * Updated docs and changelog + reformat * Updated custom policy tests * Removed test on deprecation warning for share layers in mlp_extractor Now shared layers are removed * Update version * Update RL Zoo doc * Fix linter warnings * Add ruff to Makefile (experimental) * Add backward compat code and minor updates * Update tests * Add backward compatibility * Fix test * Improve compat code Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-23 14:55:19 +01:00
Quentin Gallouédec	92f7a6f23b	Fix `test_vec_normalize.py`, `test_tensorboard.py` and `common/monitor.py` type hint (#1194 ) * Remove from mypy exclude * type hint for metadata * Union[float, int] -> float * Remove useless __init__ * Type hint for model and logger in BaseCallback * Type hint for metric_dict * Update changelog * fix test_tensorboard * ignore gamma type checking * Fix monitor type hint * Update logger type hints * Fix type annotation and bump version * Fix circular import Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-01-13 18:28:22 +01:00
Yu Zheng	9bb1538b78	Fix outdated `load_parameters` to `set_parameters` (#1270 ) * Update examples.rst * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-11 14:13:21 +01:00
Antonin RAFFIN	6b8905acdb	Release v1.7.0 (#1268 )	2023-01-10 17:32:57 +01:00
Dominic Kerr	5aa6e7d340	Fix `ProgressBarCallback` under-reporting (#1260 ) * Updated tqdm progress bar constructor to account for the effects of train_freq/n_steps/num_envs on total_timesteps. Ensure progress bar is "flushed" on training end. * Added description of PR #1260. Fixed formatting typo * Partial revert Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-01-10 15:17:52 +01:00
Alex Pasquali	30a19848ce	Deprecation of shared layers in `MlpExtractor` (#1252 ) * Deprecation warning for shared layers in Mlpextractor * Updated changelog * Updated custom policy doc * Update doc and deprecation * Fix doc build * Minor edits Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-01-05 09:59:36 +01:00
Quentin Gallouédec	4fa17dcf0f	Standardize the use of `from gym import spaces` (#1240 ) * generalize the use of `from gym import spaces` * command line get system info * Documentation line length for doc * update changelog * add space before os plateform to avoid ref to other issue * format * get_system_info update in changelog * fix type check error * fix get system info * add comment about regex * update version	2023-01-02 14:51:11 +01:00
Friedrich Yuan	2bb8ef5e63	Add RLeXplore to the project page (#1246 ) * Update project page Adding the repo "rl-exploration-baselines" to the project page. * Update changelog.rst * Update projects.rst * Update changelog.rst * Update docs/misc/projects.rst * Update changelog.rst Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-28 15:06:09 +01:00
Antonin RAFFIN	e78ba6ffa4	Hotfix to load policies saved with SB3 <= v1.6 (#1234 ) * Hotfix to load policies saved with SB3 <= v1.6 * Add warning and test * Update doc	2022-12-22 23:58:30 +01:00
Antonin RAFFIN	3c028f3d5c	Fix `load_from_tensor` (#1231 )	2022-12-22 17:28:18 +01:00
Quentin Gallouédec	5549b34231	Fix ``stable_baselines3/common/vec_env/vec_check_nan.py`` type hints (#1226 ) * super() init style * "async_step" arg to "event"; "news" to "dones"; improve docstring * Remove vec_check_nan from mypy exclude * Update changelog	2022-12-22 12:24:59 +01:00
Quentin Gallouédec	9aff1137a9	Add support for Python 3.10 (#1227 ) * Add python 3.10 and 3.11 * Update setup * Fix CI * Drop 3.11 (because of pytorch) * Update changelog * revert unwanted change in setup.cfg * Remove remark about pytorch	2022-12-21 15:52:48 +01:00
Antonin RAFFIN	7202ece85b	Update tensorboard callback doc (#1221 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-21 12:51:28 +01:00
Quentin Gallouédec	96b1a7cf01	`env_id` consistency in tests (#1224 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-20 16:01:26 +01:00
Quentin Gallouédec	7fb8336f40	Update PR template (#1225 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-20 15:13:42 +01:00
Alex Pasquali	2cfcec4f50	Modified ActorCriticPolicy to support non-shared features extractor (#1148 ) * Modified ActorCriticPolicy to support non-shared features extractor * Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc Doc update: added 'warning' on custom policy docs that says that, if the features extractor is non-shared, it's not possible to have shared layers in the mlp_extractor * Moved attrib share_features_extractor in class * Updated custom policy doc for non-shared features extractor * Updated changelog * Made some if-statements more readable if policies.py The if-statements are related to the shared/non-shared features extractor in ActorCritic policies * Simplify implementation and add run test * Keep order in module gain to keep previous results consistents * Fix test * Improved docstring in policies.py Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Added some tests * feature extractor -> features extractor * Fix test * Fix env_id in test * Make features extractor parameter explicit * Remove duplicate Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-20 15:12:05 +01:00
Antonin RAFFIN	8452106734	Fix support of image like normalized inputs (#1214 ) * Fix support of image like normalized inputs * Improve docstring and warning message. * Don't check if obs is image when normalize_images is False (lil opt) * Comment fix * Fix normalize_images not passed to parent * Check for subclasses too * Remove useless multiline * Update version and add comment * Fix some typos Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-20 13:18:28 +01:00
Quentin Gallouédec	ca944fed2d	Update version (#1220 ) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py * Update version * Fix type checking Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-12-19 13:53:00 +01:00
Antonin Raffin	9af2d11b6e	Update changelog	2022-12-19 13:21:10 +01:00
Quentin Gallouédec	68a40e0940	Construct tensors directly on GPU (#1218 ) * Replace .to(device) when possible * fix numpy dep * black * Add warning for device != cpu and copy=False * Update changelog * Remove warning * Update buffers.py	2022-12-19 12:50:22 +01:00
Antonin RAFFIN	0c1bc0b1da	Fix `stable_baselines3/common/atari_wrappers.py` type hints (#1216 ) * Fix `stable_baselines3/common/atari_wrappers.py` type hints * Fix initialization Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-12-18 16:13:44 +01:00
Antonin RAFFIN	07094c3f2e	Fix `stable_baselines3/common/preprocessing.py` type hints (#1217 )	2022-12-18 15:53:17 +01:00
Alex Pasquali	6d55a09f81	Updated custom policy docs to better explain the ``mlp_extractor``'s dimensions (#1196 ) * Updated custom policy docs Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch. * Improved custom policy doc Section: Custom Network Architecture. Explained with greater detail that an action net and a value net will be added on top of the net_arch. * Improved custom policy doc Section: Custom Network Architecture. Merged a comment into a note * Alignment Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-12-12 16:19:51 +01:00
Quentin Gallouédec	e39bc3da00	Add support for multidimensional `spaces.MultiBinary` observations (#1179 ) * Fix `get_obs_shape` for multidimensi onnal Multibinary space * Update changelog * more tests * fix multidiscrete one-hot encoding * refactor tests * Update changelog.rst * Update changelog.rst * batched obs and revert preprocess_obs changes * Add support for multidimensional ``spaces.MultiBinary`` observations Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-08 18:46:41 +01:00
Quentin Gallouédec	6763a864c8	Upgrade CI/github-actions (#1204 ) * checkout v2 -> v3; setup-python v2 -> v4 * Update changelog.rst	2022-12-07 16:43:47 +01:00
Athanasios Theocharis	f7d7ed3fa7	Update custom_policy.rst (#1183 ) * Update custom_policy.rst * Update changelog Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-12-06 17:51:52 +01:00
Quentin Gallouédec	002850f8ac	Fix `stable_baselines3/common/torch_layers.py` type hint (#1191 ) * Remove torch layers from mypy exclude * Make torch layers mypy compliant * Extra type specification * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-29 23:46:32 +01:00
Zikang Xiong	852d635742	Exposed modules in __init__.py with __all__ (#1195 ) * Exposed modules in __init__.py with __all__ * Remove flake8 ignore and update root __all__ * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 23:33:46 +01:00
Quentin Gallouédec	b46396a664	Fix `stable_baselines3/common/env_util.py` type hint (#1192 ) * Remove env_util from mypy exclude * Fix make_atari_env type hint * Update changelog	2022-11-29 15:36:55 +01:00
Quentin Gallouédec	5cd891317e	Add `with_bias` parameter to `create_mlp` (#1188 ) * Add with_bias arg * Update changelog * move torch_layers to the last position * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-11-29 12:43:16 +01:00
Quentin Gallouédec	6902fac5e7	Fix `stable_baselines3/common/type_aliases.py` type hint (#1189 )	2022-11-29 12:26:16 +01:00
Quentin Gallouédec	0973b01b9d	Fix `tests/test_distributions.py` type hint (#1186 ) * Fixed test_distribution type hint * Impose list[int] for action dim	2022-11-29 11:27:59 +01:00
Quentin Gallouédec	aee0ba03c7	Update changelog for #1184 (#1185 )	2022-11-28 19:36:26 +01:00
Quentin Gallouédec	e3b24829a5	Drop `gym.GoalEnv` and other minor changes initally from #780 (#1184 ) * Various changes from #780 * Fix env_checker for goal_env detection	2022-11-28 18:22:31 +01:00
Antonin RAFFIN	cd630a3121	Fixes for flake8 6.0 (#1181 )	2022-11-25 15:14:55 +01:00
Juan Rocamonde	68b190b667	Raise error when same env object instance is passed in vectorized environment (#1154 ) * Raise error when same env object instance is passed in vectorized environment * At to changelog * Add raises to docstring * Add test * Also test make_vec_env * Fix test * Try to enable color for MyPy * Update version and ignore lint warnings Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-11-22 14:28:58 +01:00
Quentin Gallouédec	f3abda5cbc	Fix `Self` return type (#1167 ) * Fix Self annotation * Update changelog * Define type var on top * ClassSelf to SelfClass * annotate self * Revert Running meanstd change * Revert vecnormalize change (static method rejected) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-22 13:42:39 +01:00
Quentin Gallouédec	abffa16198	Mypy type checking (#1143 ) * Install and configure mypy * Test if github CI uses setup.cfg for mypy * force color output * tab to space * Try to fix regex * follow_imports silent * use space as indentation * fix indentation setup.cfg * Show error code * Update doc * Udate changelog * Ignore mypy cache files from commit * Update gitlab CI * Add pytype and mypy entry in Makefile * Make mypy happy Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-11-16 13:22:57 +01:00
Franz Srambical	8641b05b09	Fix typo in documentation (#1177 )	2022-11-15 15:00:03 +01:00
Taimur Shahzad Gill	7e1db1aaaa	Fixed errors in the documentation (#1159 ) * Fixed errors in the documentation Fixed grammatical and punctuation errors, and improved the sentence structure. * Added username in the contributors	2022-11-07 15:38:41 +01:00
Adam Gleave	4fb8aec215	Update evaluate_policy type annotation to support policies as well as RL algorithms (#1146 ) * Add PolicyPredictor protocol and use it in evaluate_policy * Update changelog * Move Protocol to type_aliases to avoid circular import * Add test for evaluate_policy on BasePolicy * Remove unused import * Use typing_extensions * Move typing_extensions to 3rd party * Add version range (typing_extensions uses SemVer) * Import Protocol from typing_extensions only on Python<3.8 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Install typing_extensions only on Python<3.8 * Add missing sys import * Fix import ordering * Fix observation type hint in predict Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-11-03 15:36:19 +01:00
Antonin RAFFIN	0532a5719c	Fix integration documentation (#1135 )	2022-10-24 13:20:58 +02:00
Antonin Raffin	37a942c8f9	Fixes	2022-10-24 12:53:48 +02:00
Thomas Simonini	0274aaf056	Update docs/guide/integrations.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-24 11:22:33 +02:00
Thomas Simonini	fc6c111cc3	Changelog Update	2022-10-24 11:03:20 +02:00
Thomas Simonini	714737c986	Update Hugging Face Integration Documentation	2022-10-24 10:55:30 +02:00
Quentin Gallouédec	d5d1a02c15	Allow model trained with python3.7 to be loaded with python3.8+ without the `custom_objects` workaround (#1123 ) * Fix loading * Remove documentation note * Update changelog * Revert save_format change * Add test for errors while unpickling * Update version and cleanup Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-17 17:33:47 +02:00
Quentin Gallouédec	5ef10c8e69	Fix type annotation of ``policy`` in ``BaseAlgorithm`` and ``OffPolicyAlgorithm`` (#1120 )	2022-10-17 10:16:20 +02:00
Juan Rocamonde	cdcdd32c51	Fix return type of `evaluate_actions` (#1118 ) * Fix return type of ActorCriticPolicy.evaluate_actions to optional entropy tensor * Update changelog.rst	2022-10-14 17:45:28 +02:00
Quentin Gallouédec	1bff6215b6	New Issue forms (#1111 ) * Update bug report template * .md -> .yml * System info section * Custom env issue form * documentation form * Question template * Feature request template * Rm old templates * Update changelog	2022-10-13 17:46:21 +02:00
Antonin RAFFIN	508f8ffd59	Remove deprecated features and attributes (#1104 ) * Remove deprecated eval env * Remove deprecated ret attribute * Remove sde net arch * Remove unused code * Update test comment Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-10-11 10:55:16 +02:00
Sam Toyer	5e8f06b3cb	Link to full imitation docs (#1106 )	2022-10-10 21:36:30 -07:00
Antonin RAFFIN	e2f81bb70b	Release v1.6.2 (#1103 ) * Release v1.6.2 * Remove Gitlab CI, no more minutes	2022-10-10 16:37:11 +02:00
tobirohrer	d8a430e088	Deprecate `create_eval_env`, `eval_env` and `eval_freq` parameter (#1082 ) * Adds deprecation warning if `eval_env` or `eval_freq` parameters are used. See #925 * added changelog entry * added missing backtick * deprecating `create_eval_env` parameter as well and adding comments to explain the `stacklevel` parameter used * Updated tests to ignore DeprecationWarnings * Updated changelog entry * - Removed the `create_eval_env` parameter from the examples in the docs - Removed information about the `create_eval_env` parameter from the migration docs - Added information about deprecation of the `create_eval_env` parameter in the docs * Add alternative in docstring * Update docstrings * `eval_freq` warning in docstring * Add deprecation comments in tests Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-10-10 15:39:38 +02:00
Antonin RAFFIN	7c21b79188	Add progress bar callback and argument (#1095 ) * Add progress bar callback and argument * Update doc * Update changelog * Upgrade pytype in docker image * Use tqdm.write in the logger to have cleaner output * Fix logger test * Fix when doing multiple calls to learn() * Address comments from code-review	2022-10-06 18:17:31 +02:00
Alex Pasquali	6a8c9ddc8b	Updated type hint and extended docstring in make_vec_env and make_atari_env (#1085 ) * Updated type hint and extended docstring in make_vec_env The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature. Extended the description of the wrapper_class parameter with a link to a Github issue containing more details on the matter. * Updated type hint in make_atari_env The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature. * Updated docstring in make_atari_env When modifying the type hint of the parameter 'env_id' (in this commit: fda6872f73c11075901ba88f2520f6316f818d1d), I forgot to update its description in the docstrig. Doing it now. * Removed redundant type in env_id's type hint in make_vec_env and make_atari_env Callable[..., gym.Env] already includes Type[gym.Env], as pointed out here: https://github.com/DLR-RM/stable-baselines3/pull/1085#issuecomment-1269685218 Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-10-06 13:36:06 +02:00
Quentin Gallouédec	a697401e03	Standardized the use of ``"`` for string representation (#1086 ) * Replace ``'`` by ``" `` in python code * Update changelog * Rm whitespace	2022-10-03 15:15:39 +02:00
Quentin Gallouédec	d3eb0e3ed6	Fix importlib dependency (#1088 ) * Set requirement ``importlib-metadata~=4.13`` * Update changelog	2022-10-03 12:03:51 +02:00
Antonin RAFFIN	537a82a7fd	Update export doc (fixes + add torch jit) (#1074 ) * Update export doc (fixes + add torch jit) * Fix conflicts * Update according to code review comments * fix torch -> th Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-09-30 14:30:40 +02:00
Antonin RAFFIN	21300c9aaf	Release v1.6.1 (#1080 )	2022-09-29 12:15:55 +02:00
Akhil	def0574d03	Fixed typos (#1076 ) * Updated docstring from n_steps to n_rollout_steps This must be a typo * Fixed typo in a comment in ppo.py * Update changelog Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-09-28 14:57:46 +02:00
Juan Rocamonde	e22e372306	Fix duplicate key error in HumanOutputFormat (#1079 ) * Fix duplicate key error in HumanOutputFormat * Update changelog * Add test * Update changelog.rst Co-authored-by: Adam Gleave <adam@gleave.me> Co-authored-by: Adam Gleave <adam@gleave.me>	2022-09-28 12:06:07 +02:00
Juan Rocamonde	432b3f876d	Fix return type for load, learn in BaseAlgorithm (#1043 ) * Fix return type for load, learn in BaseAlgorithm * Update changelog * Add typing extensions to dependencies * Import directly from typing for python >3.11 * Reorder changelog to reflect merge order * Roll back to typevar solution * Updated changelog * Remove typing extensions requirement * Update base_class.py * Remove final point in changelog * Additional type fixes across project Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-26 12:13:56 +02:00
Dominic Kerr	899eee6bd4	Automatically create missing directories of ``filename``s passed to ``ResultsWriter`` (#1072 ) * Create (if any) missing filename directories, passed into ResultsWriter * Fixed incorrect ``filename`` docstring (if ``filename`` where ``None``, the string method ``filename.endswith(Monitor.EXT)`` would raise an ``AttributeError``), and renamed ``reset_keywords`` docstring. * Added description of #1068 * Ignore pytype errors * Update changelog.rst Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-21 13:14:38 +02:00
Alex Pasquali	d0b129ecc3	Updated custom policy docs (#1067 )	2022-09-18 09:17:57 +02:00
Quentin Gallouédec	440735cbd0	Fix loading a model with different number of environments (#1058 ) * Fix loading with new `n_envs` * Update tests * Update changelog * Fix the fix * Remove `self._setup_model()` from `set_env()` * Raise `AssertionError` when setting env with a different `n_envs` * Update unitests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-09-17 11:10:03 +02:00
Juan Rocamonde	18b29a68e8	Remove forward() method from common.policies.BaseModel (#1061 ) * Remove forward() method. * Updated changelog	2022-09-11 18:39:13 +02:00
Quentin Gallouédec	98e786f744	Clarify and standardize verbosity documentation (#1056 ) * Standardize the use of verbosity: > to >= * Make verbose docstring more specific * Update changelog	2022-09-09 16:46:28 +02:00
Quentin Gallouédec	29f6687b98	Raise error when observation keys and observation space keys don't match (#1047 ) * Raise error when observation keys and observation space keys don't match * Print the difference in keys * Update changelog	2022-09-05 14:54:58 +02:00
Juan Rocamonde	fdca786f09	Fix replay_buffer_class type annotation (#1042 ) * Fix replay_buffer_class type annotation * Update changelog * Further replacement of same type annotation issue * Formatting * Rolled back formatting changes for consistency	2022-09-01 20:10:01 -07:00
Luke Fisher	a7f30b04e3	Updated minor grammar error (#1041 ) "an history" -> "a history"	2022-08-31 18:04:15 +02:00
Sidney Tio	304c17dc78	Add append mode to Monitor (#1037 ) * Added option to override or use existing CSVs * Updated changelog for Monitor override * Changed default value to override * Simplify code and add test * Update version * Fix for pytype Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-31 11:53:44 +02:00
Hugh Perkins	2cc1477fa2	Fix advantage normalization with mini-batchsize of 1 (#1028 ) * fix nan in advnatages with batch size 1, for ppo * changelog * black * Simplify test * Bump version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-25 11:50:08 +02:00
Anand Balakrishnan	59af0c1b01	`CheckpointCallback` can now save replay buffer and `VecNormalize` (#1030 ) * CheckpointCallback now saves replay buffer (if present) * VecNormalize stats are saved at checkpoints * Make checkpointing replay buffer and VecNormalize opt-in * Edit changelog * Add documentation for new parameters * Update docs/misc/changelog.rst * Add documentation for new parameters * Implement suggested edits * Reformat code * Fix git conflict * Add .pkl suffix to VecNormalize checkpoints * Add tests for new CheckpointCallback params * Merge CheckpointCallback tests * Update test and add helper for checkpoint path Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-25 10:57:51 +02:00
Honglu Fan	29a481a288	Include `running_mean` and `running_val` when updating target networks (#1004 ) * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Fix `DictReplayBuffer.next_observations` type (#1013) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Fixed missing verbose parameter passing (#1011) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Support for `device=auto` buffers and set it as default value (#1009) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Update test * Add comments and update tests * Bump version * Remove one extra space to conform code style. * Update docstrings Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Burak Demirbilek <BurakDmb@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-23 10:20:43 +02:00
Timothé	01cc127d32	Support hparams logging to tensorboard (#984 ) * create Hparam class & support in all OutputFormats * add hparams documentation & example * add hparam tests * remove unnecessary test & fix name * format changes * support hyperparameters logging to tensorboard * fix HParams class docstring * use more explicit variable names * raise error instead of warning * Unpin protobuf * Add test for logging hparams Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-22 22:06:54 +02:00
Antonin RAFFIN	57e0054e62	Add Quentin to the list of maintainers (#1014 )	2022-08-17 09:55:40 +02:00
Quentin Gallouédec	73822c34da	Support for `device=auto` buffers and set it as default value (#1009 ) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-16 17:54:55 +02:00
Burak Demirbilek	792e3bcc27	Fixed missing verbose parameter passing (#1011 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-08-16 13:32:32 +02:00
Quentin Gallouédec	a30d36002b	Fix `DictReplayBuffer.next_observations` type (#1013 ) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-16 10:53:22 +02:00
Quentin Gallouédec	c4f54fcf04	Handling multi-dimensional action spaces (#971 ) * Handle non 1D action shape * Revert changes of observation (out of the scope of this PR) * Apply changes to DictReplayBuffer * Update tests * Rollout buffer n-D actions space handling * Remove error when non 1D action space * ActorCriticPolicy return action with the proper shape * remove useless reshape * Update changelog * Add tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-06 14:19:20 +02:00
jlp-ue	6ce33f5bd2	Fix url in docs (#1000 ) * fixed URL in docs * Update changelog.rst	2022-08-05 17:54:48 +02:00
Francesco Lucianò	646d6d38b6	Fixed typo in PPO doc (#983 ) * Fixed typo Fixed typo * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-30 12:52:35 +02:00
Marsel Khisamutdinov	d532362e94	Adds info on split tensorboard graphs (#989 ) * Add info on split tensorboard graphs. * Change wording to make it look better. * Update changelog.rst * Rephrase and add link to issue Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-30 12:44:25 +02:00
Adam Gleave	b1cc15970a	Use higher resolution time_ns() and avoid division by zero (#979 ) * Use higher resolution time and round up to eps * Update changelog * Add test case * Fix formatting, time()->time_ns * Bugfix: ns is integer not float * Move test to better place * Divide by 1e9 earlier	2022-07-25 23:02:53 +02:00
Quentin Gallouédec	fda3d4d748	Fix returned type in predict (#964 ) * `arr[0]` to `arr.squeeze(0)` * `squeeze(axis=0)` to `squeeze(0)` * Type testing * Add type test for unvectorized observation * `squeeze(0)` to `squeeze(axis=0)` * Treatment of the laziness symptoms * Update changelog * Udate changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-07-18 11:22:19 +02:00
Antonin RAFFIN	a18b91e01a	Replace "nature" with "Nature" (magazine) to reduce confusion (#965 ) * Replace "nature" with "Nature" (magazine) to reduce confusion * Replace "nature" with "Nature" (magazine) to reduce confusion * Update changelog Co-authored-by: mel <callmesolis@gmail.com>	2022-07-15 22:48:27 +02:00
Antonin Raffin	38706f12f3	Use ICRL url for PPO blog post	2022-07-12 23:48:05 +02:00
Antonin RAFFIN	c1f1c3d3d7	Release v1.6.0 (#958 ) * Release v1.6.0 + update doc + add copy button * Update read the doc conda env * Update year * Fix bug in kl divergence check * Rephrase requirement for envpool and isaac gym	2022-07-12 22:50:23 +02:00
Max Weltevrede	ef10189d80	Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination (#948 ) * Prohibit simultaneous use of optimize_memory_buffer and handle_timeout_termination * Modify test to avoid unsupported buffer configuration * Change from assertion to raising of ValueError * Update changelog * Update style for consistency * Use handle_timeout_termination when possible Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-04 15:08:54 +02:00
Ram Rachum	d64bcb401a	Fix exception cause in base_class.py (#940 )	2022-06-21 20:58:02 +01:00
Antonin RAFFIN	7ce7b6a8c2	Update defaults for offpolicy algos with features extractor (#935 )	2022-06-18 10:52:52 +02:00
Antonin RAFFIN	d68f0a2411	Update doc: SB3 Contrib RecurrentPPO (#927 ) * Update doc: contrib update * Update docs/misc/changelog.rst Co-authored-by: Anssi <kaneran21@hotmail.com> * Address Anssi comments Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-05-31 18:11:16 +02:00
Antonin RAFFIN	4b89fbf283	Fix issues due to newer version of protobuf and sphinx (#924 )	2022-05-29 21:09:50 +02:00
Antonin RAFFIN	49813d8c68	Update doc and add check for unbounded action space (#918 )	2022-05-25 16:24:21 +02:00
TibiGG	2fcf8f91c1	Removed redundant double-check of nested Dict (#908 ) * Removed redundant double-check of nested Dict observation space from BaseAlgorithm * Update changelog Co-authored-by: tibigg <tg4018@ic.ac.uk>	2022-05-09 14:36:15 +03:00
Antonin RAFFIN	0fadc94df3	Fix synchronization bug with EvalCallback (#907 )	2022-05-08 21:54:34 +03:00
Thomas Rudolf	c2518dc160	Add doc to use mlflow logger (#889 ) * ADD feature for mlflow logger via MLflowOutputFormat. * Move MLFlow integration to doc Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-05-08 15:28:31 +02:00
Marsel Khisamutdinov	e98ae129de	Fix a grammatical mistake (#899 ) Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-05-03 16:27:48 +02:00
Antonin RAFFIN	c5f0aa5de0	Update doc: PPO blog post and remark on timeouts (#896 )	2022-05-01 16:26:34 +02:00
Antonin RAFFIN	a6f5049a99	Upgrade code to Python 3.7+ syntax using `pyupgrade` (#887 ) * Upgrade code to Python 3.7+ syntax * Update changelog	2022-04-25 13:01:38 +03:00
Bryan Collazo	3c468ff558	Update ppo documentation (remove redundant and) (#874 ) * Update ppo documentation (remove redundant and) PTAL, thanks! * Update changelog * Pin ale-py version Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-04-19 14:15:51 +02:00
Paul Scheikl	ed308a71be	Fixed unchecked None value in SubprocVecEnv (#808 ) * Fixed unchecked None value in SubprocVecEnv * Fixed unchecked None value in DummyVecEnv * Fix formatting * Update test and changelog * Improve test Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-04-12 16:05:40 +02:00
Antonin RAFFIN	39a4f9379a	Escape tensorboard log name (#857 ) * escape tensorboard log name Otherwise utils does not recognize the log. * Added fix to changelog * Modifications made by: make commit-checks . * Revert "Modifications made by: make commit-checks ." This reverts commit 529a275d9475f85ef031038a8f3565f7301e5371. * Update changelog and add test Co-authored-by: James Hirschorn <James.Hirschorn@quantitative-technologies.com>	2022-04-11 21:49:18 +02:00
Antonin RAFFIN	248f082cdc	Bump min PyTorch version (#855 )	2022-04-11 18:34:15 +02:00
Quentin Gallouédec	16703b1314	Fix HER goal selection (#848 ) * Goal sampled from next_achieved_goal instead of achived_goal * No need to have special case for future anymore * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-11 17:50:02 +02:00
Grégoire Passault	254bb10c42	Replacing the policy registry with policy "aliases" (#842 ) * Replacing the policy registry with policy "aliases" * Fixing import order and SAC * Changing arg. order to be sure policy_aliases is a kwarg * Import orders * Removing pytype error check * Reformat * Fix alias import * Not using mutable {} as default for policy_aliases * Empty aliases initialization * Using static attributes for policy_aliases * Fixing isort * Fixing back bad merge * Running isort * Fixing aliases for A2C and PPO * Using f-string * Moving policy_aliases definition position * Adding change in the changelog * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-08 21:21:53 +02:00
Yifei Cheng	44e53ff811	Enable force_zip64 (#839 ) * Enable force_zip64 * mark tests as expensive * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-03-28 10:35:33 +02:00
Antonin RAFFIN	30772aa9f5	Release v1.5.0 (#835 ) * Release v1.5.0 * Fix link	2022-03-25 14:38:22 +01:00
Grégoire Passault	00ac43b0a9	Removing dead code for handling time limits (#831 ) * Removing dead code for handling time limits (see #829) * Mentionning remove_time_limit_termination in the changelog * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-03-23 13:33:55 +01:00
Yuan	009bb0549a	Update tensorboard.rst in SummaryWriterCallback (#822 ) * Update tensorboard.rst * update changelog.rst * update changelog.rst, add username	2022-03-15 21:48:52 +01:00
Antonin RAFFIN	e88eb1c9ca	Add explanation of logger output (#803 ) * Add explanation of logger output * Apply suggestions from code review Co-authored-by: Anssi <kaneran21@hotmail.com> * Add example output Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-03-07 12:20:43 +01:00
Julio César Alves	cdaa9ab418	Callback to early stop the training if there is no model improvement after consecutive evaluations (#741 ) * Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback * Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement * Update the docs related to new StopTrainingOnNoModelImprovement callback * Update doc Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-02-25 11:56:47 +01:00
Quentin Gallouédec	db5366fb51	`None` as default value for `env` in `HerReplayBuffer.sample` + `DQN` batch size typing fix (#790 ) * `env` to `None` by default in `HerReplayBuffer.sample` (#788) * Fix DQN batch_size typing * Fix changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-02-24 15:51:01 +01:00
Quentin Gallouédec	13fcb12471	Fix normalization for `DictReplayBuffer` (#744 ) * Normalize samples DictReplayBuffer (#743) * Fixed sample normalization in ``DictReplayBuffer`` (#743) * Test buffer normalization * Rename test replay buffer * Bump version Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-02-23 13:04:57 +01:00
Boyuan Chen	7a01637128	Fix VecNormalization bug for Dict obs (#768 ) * fix #724 VecNormalization bug for Dict obs * update test and changelog * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-02-23 12:33:41 +01:00

521 commits