stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-26 22:45:15 +00:00

Author	SHA1	Message	Date
Antonin RAFFIN	daaebd0a52	Drop python 3.8 and add python 3.12 support (#2041 ) * Drop python 3.8 support, add python 3.12 support * Upgrade to python 3.9 syntax * Fixes for Numpy v2 * Fix doc warning	2024-11-18 15:40:36 +01:00
Mark Towers	8f0b488bc5	Update Gymnasium to v1.0.0 (#1837 ) * Update Gymnasium to v1.0.0a1 * Comment out `gymnasium.wrappers.monitor` (todo update to VideoRecord) * Fix ruff warnings * Register Atari envs * Update `getattr` to `Env.get_wrapper_attr` * Reorder imports * Fix `seed` order * Fix collecting `max_steps` * Copy and paste video recorder to prevent the need to rewrite the vec vide recorder wrapper * Use `typing.List` rather than list * Fix env attribute forwarding * Separate out env attribute collection from its utilisation * Update for Gymnasium alpha 2 * Remove assert for OrderedDict * Update setup.py * Add type: ignore * Test with Gymnasium main * Remove `gymnasium.logger.debug/info` * Fix github CI yaml * Run gym 0.29.1 on python 3.10 * Update lower bounds * Integrate video recorder * Remove ordered dict * Update changelog --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-11-04 12:03:12 +01:00
Antonin RAFFIN	3d59b5c86b	Use uv on GitHub CI for faster download and update changelog (#2026 ) * Use uv on GitHub CI for faster download and update changelog * Fix new mypy issues	2024-10-24 15:20:05 +02:00
Devin White	56c153f048	Add warning when using PPO on GPU and update doc (#2017 ) * Update documentation Added comment to PPO documentation that CPU should primarily be used unless using CNN as well as sample code. Added warning to user for both PPO and A2C that CPU should be used if the user is running GPU without using a CNN, reference Issue #1245. * Add warning to base class and add test --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-10-07 11:24:47 +02:00
Antonin RAFFIN	512eea923a	Warn users when using multi-dim MultiDiscrete obs space (#2003 ) * Update env checker to warn users when using multi-dim MultiDiscrete obs space * Update changelog	2024-09-13 13:15:23 +02:00
Antonin RAFFIN	4a7631b71d	Fix test device for buffers (#1993 ) * Prevent test_device from being a noop * Update changelog --------- Co-authored-by: Adrià Garriga-Alonso <adria@far.ai>	2024-08-18 12:33:22 +02:00
Jan-Hendrik Ewers	4a1137ba3a	Add np.ndarray as a recognized type for TB histograms. (#1635 ) * Add np.ndarray as a recognized type for TB histograms. Torch histograms allow th.Tensor, np.ndarray, and caffe2 formatted strings. This commits expands the TensorBoardOutputFormat's capabilities to log the two former types. * Update changelog to reflect bug fix * fix: try/catch for if either np or torch aren't at the required versions. See https://github.com/DLR-RM/stable-baselines3/pull/1635 for more details * fix: Add comment describing the test for when add_histogram should not have been called * Cleanup --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-08-02 11:55:27 +02:00
Antonin RAFFIN	bd3c0c6530	Fix loading of optimizer with older DQN models (#1978 )	2024-07-26 14:57:55 +02:00
Antonin RAFFIN	000544cc1f	Add support for pre and post linear modules in `create_mlp` (#1975 ) * Add support for pre and post linear modules in `create_mlp` * Disable mypy for python 3.8 * Reformat toml file * Update docstring Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Add some comments --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2024-07-22 13:42:33 +02:00
Dominik Baron	24ebf1a1df	Remove unnecessary SDE resampling in PPO update (#1933 ) * Remove unnecessary SDE resampling in PPO update * Update changelog.rst * Update version * Update PyTorch version on CI * Update ruff * Limit NumPy version * Reformat --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-06-29 20:07:32 +02:00
will-maclean	4efee92fba	Set CallbackList children's parent correctly (#1939 ) * Fixing #1791 * Update test and version * Add test for callback after eval * Fix mypy error * Remove tqdm warnings --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2024-06-07 14:07:28 +02:00
Joe Ksiazek	0b06d8ab20	Fix error when loading a model that has net_arch manually set to None (#1937 ) * Fix loading a model with net_arch=None * Remove redundant get * Dummy commit * Add to contributors * Update test and version --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-06-05 17:27:40 +02:00
Chris Schindlbeck	4317c62598	Fix various typos (#1926 ) * Fix various typos * Update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-05-15 15:19:39 +02:00
Mark Smith	9a749389d3	Cast learning_rate to float lambda for pickle safety when doing model.load (#1901 ) * create failing test for unpickle error * Fix learning_rate argument causing failure in weights_only=True if passed a function with non-float types * Updated with feedback from araffin on PR#1901 * Update test and version * Update changelog and SBX doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-04-22 10:04:01 +02:00
Corentin	071226d3e8	Log success rate for on policy algorithms (#1870 ) * Add success rate in monitor for on policy algorithms * Update changelog * make commit-checks refactoring * Assert buffers are not none in _dump_logs * Automatic refactoring of the type hinting * Add success_rate logging test for on policy algorithms * Update changelog * Reformat * Fix tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2024-03-22 12:13:48 +01:00
Antonin RAFFIN	a8e905977f	Update env checker for spaces with non-zero start (#1845 ) * Update ruff * Update env checker for non-zero start	2024-02-19 16:44:02 +01:00
Quentin Gallouédec	e3dea4b2e0	Release 2.2.1: Hotfix file closing (#1754 ) * new closing policy * revert #1742 * Add tests and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-11-17 23:50:23 +01:00
Antonin RAFFIN	23fbeb5975	Fix resource warning (#1742 ) * Fix resource warning * Add test and update changelog * Fix for new mypy version	2023-11-16 17:11:13 +01:00
Antonin RAFFIN	018ea5ab67	Fix distributions type hints (#1733 ) * Fix distributions type hints * Add test for multim binary action space * Fix test	2023-11-06 10:09:01 +01:00
M. Ernestus	69afefc91d	Add rollout_buffer_class parameter to on-policy algorithms (#1720 ) * Add rollout_buffer_class and rollout_buffer_kwargs parameters to OnPolicyAlgorithm * Add rollout_buffer_class and rollout_buffer_kwargs to PPO. * Add rollout_buffer_class and rollout_buffer_kwargs to A2C. * Make use of the rollout buffer kwargs. * Update version * Add test and update doc --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-10-27 17:36:24 +02:00
Hosseinkhan Rémy	aab545901f	Add support for setting `options` at reset with `VecEnv` (#1606 ) * Update signatures, and test with options * Update changelog and black formatting * Finish implementation (fixes, doc, tests) * Use deepcopy to avoid side effects (modif by reference) * Fix for mypy --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-10-23 13:38:48 +02:00
Jan-Hendrik Ewers	2ddf015cd9	fix: Follow PEP8 guidelines and evaluate falsy to truthy with `not` rather than `is False`. (#1707 ) * fix: Follow PEP8 guidelines and evaluate falsy to truth with `not` rather than `is False`. https://docs.python.org/2/library/stdtypes.html#truth-value-testing * chore: Update changelog inline with intent of changes in PR #1707 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * fix: Change `is False` to `not` as per PEP8 * chore: Remove superfluous comment about `is False` * test: One On- and one Off-Policy algorithm (A2C and SAC respectively), with settings to speed up testing * Update changelog * chore: Remove EvalCallback as it's not actually required * Update changelog.rst * Rm duplicated "others" section in changelog.rst --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-10-09 12:21:12 +02:00
Antonin RAFFIN	2ca94cb73d	Add check for common mistake when mixing Gym/VecEnv API (#1696 )	2023-09-25 12:39:22 +02:00
Corentin	f4c5b1e5e2	Fix check_env for Sequence observation space (#1690 ) * Fix Sequence obs env_checker * Fix Sequence obs env_checker * Add test : env_checker for Sequence obs * Add test : env_checker for Sequence obs * Cleanup and improve env checker messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-09-24 12:36:52 +02:00
Antonin RAFFIN	99712760c8	Fix render_mode when loading VecNormalize (#1671 ) * Fix render_mode when loading VecNormalize * Switch from isort to ruff, and cap black version * Add test and update changelog	2023-09-12 11:28:32 +02:00
Patrick Helm	e071796549	Fixes replay buffer device after loading in OffPolicyAlgorithm (#1662 ) * sets replay buffer device after loading * update changelog * update changelog * correct changelog * add test for replay buffer device * Fix test to actually test the bug fix * [ci skip] Update version * [ci skip] Update docker images --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-09-03 12:50:02 +02:00
PatrickHelm	16c6a886db	Fix squash output unscaling when using gSDE (#1652 ) * prevents squash_output if not use_sde, see #1592 * update changelog * add unscaling of actions taken during training * add test regarding squashing and unquashing * avoids try-except block * format Gymnasium code with black * makes mypy pass * makes pytype pass * sort imports * makes error message in assert statement clearer Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * improves code commenting * replaces full env with wrapper * Cleanup code * Reformat --------- Co-authored-by: PatrickHelm <patrick.helm@gmx.net> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-09-01 17:58:15 +02:00
Antonin RAFFIN	e9f0f23ce4	Fix type hints for callbacks, utils and `VecTranspose` (#1648 ) * Fix type hints in `common/utils.py` * Fix `VecTranspose` type annotations * Fix types for callbacks * Update changelog * Fix video recorder type hints * Fix save utils type hints * Allow BytesIO * Improve error message * Make logger and training env properties * Clarify which open_path fn is called	2023-08-29 16:04:08 +02:00
Antonin RAFFIN	17f02a8ae1	Fix env checker bounds, expose all invalid indices at once (#1638 ) * Fix bug in env_checker.py bounds warning message * Fix bug where Gym Environment Checker does not output the correct warning message when dealing with observation spaces that have different upper and different lower bounds * Update test_env_checker.py with more comprehensive tests * Make naming consistent * Update version * Catch all invalid indices at once --------- Co-authored-by: gabo_tor <gabriel0torre@gmail.com>	2023-08-02 16:43:45 +02:00
Tobias Rohrer	ba77dd7c61	Fix to use float64 actions for off policy algorithms (#1572 ) * Added test cases where off policy algorithms fail with float64 actionspace * casting observations and actions to `np.float32` to unify behaviour between `ReplayBuffer` and `RolloutBuffer`. Fixing issue #1145 * reformatted using black * making test more restrictive by checking models action is float64 * added changelog entry * undo cast of observations as `preprocessing.preprocess_obs()` casts them to float32 anyways. * - Casting to float32 only, if action.dtype is float64 - Added cast to `DictReplayBuffer` as well * Added tests for multiple variations of continuous action types and observation spaces * applied reformatting by `make commit-checks` * Added typing and comment referring to description in merge request * Apply linter for single element slice * Rename helper and refactor tests * Update changelog and docstring --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-07-24 16:38:03 +02:00
Antonin RAFFIN	a730b9b66a	Relax logger check for Windows (#1615 ) * Relax logger check for Windows * Update tests	2023-07-21 07:02:38 +02:00
Mark Towers	61e1060525	Update Gymnasium to v0.29.0 (#1610 ) * Update setup.py to v0.29.0 * Remove invalid test * Loosen version and update changelog --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-07-18 14:22:22 +02:00
Antonin RAFFIN	ffe26ccf95	Fix render bug for vec env wrappers (#1525 ) * Fix render bug for vec env wrappers * Fix tests and update changelog * Better fix, backward compatible * remove render_mode from VecEnv init * Make DictObsVecEnv inherit from VecEnv * format * Fix env_is_wrapped * try/except getting render mode ( (https://github.com/DLR-RM/stable-baselines3/pull/1525#discussion_r1206888921) * update version * Fix env_is_wrapped in test_vec_extract_dict --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2023-06-07 16:20:40 +02:00
Lara Bergmann	32778ddc94	Fix wrong truncation in HER replay buffer (#1543 ) * fix episode start idx that leads to wrong episode length * add episode length test * Update changelog * Reformat files * Use replay_buffer.dones to test HER truncation warning * truncate_last_trajectory: sample truncated episode and handle infinite horizon tasks * make test_truncate_last_trajectory independent of learning * Add timeout comment HER truncate_last_trajectory Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Update version.txt * Update version --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-06-07 15:57:12 +02:00
lutogniew	e76316341d	Fix env checker single-step-env edge case (#1524 ) * Fix env checker single-step-env edge case Before this change, env checker failed to `reset()` the tested environment before calling `step()` when checking for `Inf` / `NaN`. This could cause environments which happened to have only one `step()` available before the episode was terminated to fail. This is now fixed. * Code review fixes #1 As suggested by Antonin Raffin <antonin.raffin@ensta.org>.	2023-05-25 17:12:32 +02:00
Kallinteris Andreas	9c338f917a	`vec_env`s fix `seed()` causing a reset (#1486 ) * `dummy_vec_env` fix `seed()` causing a reset * rename `seed` * fixes * bug fix * fix seed return type * Cleanup seeding, add test and remove compat wrapper * Update env checker and tests * Add deterministic test for make_vec_env --------- Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2023-05-20 10:30:54 +02:00
Quentin Gallouédec	9cebedc89f	Fix Colab logger error (#1484 ) * fix HumanOutputFormat * update version * update changelog * TextIO annotation, TextIOBase isinstance * update changelog * test for HumanOutputFormat with custom TextIO * rm extra test line * Update tests/test_logger.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-05-05 14:26:39 +02:00
Antonin RAFFIN	63a0bb9da1	Type annotation bundle (logger, vec env, custom envs) (#1479 ) * Switch from List to Sequence for `seed()` type hint * Fix logger type hints * Improve replay buffer type hints * Fix custom envs type annotations * Fix VecMonitor type hints * Fix RMSprop type hint * Fix vec extract dict obs type hints * Fix vec frame stack type annotations * Fix base vec env type hints * Fix dummy vec env type hints * Fix for mypy * Fixes for the tests * mypy doesn't like when we overwrite type * fix step of SimpleMultiObsEnv * remove useless type specification * Rm useless type hint * Improve logger type hint * format * rm useless type hint * Re-add variables in constructor, remove unused import --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-05-04 20:27:15 +02:00
Tobias Rohrer	6cbb2c9303	Fix DQN target update interval for multi-env (#1463 ) * Calculating target update interval per environment in `_on_step()`. See GitHub issue #1373 * Added changelog entry and changed test comment * Added requested changes from code review * Update version --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-04-27 18:35:33 +02:00
Antonin RAFFIN	40e0b9d2c8	Add Gymnasium support (#1327 ) * Fix failing set_env test * Fix test failiing due to deprectation of env.seed * Adjust mean reward threshold in failing test * Fix her test failing due to rng * Change seed and revert reward threshold to 90 * Pin gym version * Make VecEnv compatible with gym seeding change * Revert change to VecEnv reset signature * Change subprocenv seed cmd to call reset instead * Fix type check * Add backward compat * Add `compat_gym_seed` helper * Add goal env checks in env_checker * Add docs on HER requirements for envs * Capture user warning in test with inverted box space * Update ale-py version * Fix randint * Allow noop_max to be zero * Update changelog * Update docker image * Update doc conda env and dockerfile * Custom envs should not have any warnings * Fix test for numpy >= 1.21 * Add check for vectorized compute reward * Bump to gym 0.24 * Fix gym default step docstring * Test downgrading gym * Revert "Test downgrading gym" This reverts commit 0072b77156c006ada8a1d6e26ce347ed85a83eeb. * Fix protobuf error * Fix in dependencies * Fix protobuf dep * Use newest version of cartpole * Update gym * Fix warning * Loosen required scipy version * Scipy no longer needed * Try gym 0.25 * Silence warnings from gym * Filter warnings during tests * Update doc * Update requirements * Add gym 26 compat in vec env * Fixes in envs and tests for gym 0.26+ * Enforce gym 0.26 api * format * Fix formatting * Fix dependencies * Fix syntax * Cleanup doc and warnings * Faster tests * Higher budget for HER perf test (revert prev change) * Fixes and update doc * Fix doc build * Fix breaking change * Fixes for rendering * Rename variables in monitor * update render method for gym 0.26 API backwards compatible (mode argument is allowed) while using the gym 0.26 API (render mode is determined at environment creation) * update tests and docs to new gym render API * undo removal of render modes metatadata check * set rgb_array as default render mode for gym.make * undo changes & raise warning if not 'rgb_array' * Fix type check * Remove recursion and fix type checking * Remove hacks for protobuf and gym 0.24 * Fix type annotations * reuse existing render_mode attribute * return tiled images for 'human' render mode * Allow to use opencv for human render, fix typos * Add warning when using non-zero start with Discrete (fixes #1197) * Fix type checking * Bug fixes and handle more cases * Throw proper warnings * Update test * Fix new metadata name * Ignore numpy warnings * Fixes in vec recorder * Global ignore * Filter local warning too * Monkey patch not needed for gym 26 * Add doc of VecEnv vs Gym API * Add render test * Fix return type * Update VecEnv vs Gym API doc * Fix for custom render mode * Fix return type * Fix type checking * check test env test_buffer * skip render check * check env test_dict_env * test_env test_gae * check envs in remaining tests * Update tests * Add warning for Discrete action space with non-zero (#1295) * Fix atari annotation * ignore get_action_meanings [attr-defined] * Fix mypy issues * Add patch for gym/gymnasium transition * Switch to gymnasium * Rely on signature instead of version * More patches * Type ignore because of https://github.com/Farama-Foundation/Gymnasium/pull/39 * Fix doc build * Fix pytype errors * Fix atari requirement * Update env checker due to change in dtype for Discrete * Fix type hint * Convert spaces for saved models * Ignore pytype * Remove gitlab CI * Disable pytype for convert space * Fix undefined info * Fix undefined info * Upgrade shimmy * Fix wrappers type annotation (need PR from Gymnasium) * Fix gymnasium dependency * Fix dependency declaration * Cap pygame version for python 3.7 * Point to master branch (v0.28.0) * Fix: use main not master branch * Rename done to terminated * Fix pygame dependency for python 3.7 * Rename gym to gymnasium * Update Gymnasium * Fix test * Fix tests * Forks don't have access to private variables * Fix linter warnings * Update read the doc env * Fix env checker for GoalEnv * Fix import * Update env checker (more info) and fix dtype * Use micromamab for Docker * Update dependencies * Clarify VecEnv doc * Fix Gymnasium version * Copy file only after mamba install * [ci skip] Update docker doc * Polish code * Reformat * Remove deprecated features * Ignore warning * Update doc * Update examples and changelog * Fix type annotation bundle (SAC, TD3, A2C, PPO, base class) (#1436) * Fix SAC type hints, improve DQN ones * Fix A2C and TD3 type hints * Fix PPO type hints * Fix on-policy type hints * Fix base class type annotation, do not use defaults * Update version * Disable mypy for python 3.7 * Rename Gym26StepReturn * Update continuous critic type annotation * Fix pytype complain --------- Co-authored-by: Carlos Luis <carlos.luisgonc@gmail.com> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Thomas Lips <37955681+tlpss@users.noreply.github.com> Co-authored-by: tlips <thomas.lips@ugent.be> Co-authored-by: tlpss <thomas17.lips@gmail.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2023-04-14 13:13:59 +02:00
WeberSamuel	15c9daa2ba	Fix VecExtractDictObs does not handle terminal observation (#1443 ) * VecExtractDictObs handle terminal_observation * Added VecExtractDictObs handle terminal_output to changelog * Update changelog.rst * Update test_vec_extract_dict_obs.py Add random dones in env to test if terminal_observation is properly handled * Made test deterministic * Fixed bug in test * Improved test * Fix format in test * Update test * Fix type hint * Ignore pytype warning * Ignore pytype --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-04-12 15:20:04 +02:00
npit	4232f9daa9	Rename the observations variable in the evaluation util to avoid shadowing (#1288 ) * Rename the observations variable in the evaluation util to avoid shadowing This enables a callback in evaluate_policy to have access to the observation vector that is fed to the environment step function, which is currently shadowed by the output observation. * Update changelog * Add test * Move assignment outside of the loop --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-11 18:00:33 +02:00
Antonin RAFFIN	84f5511e08	Update changelog and cleanup (#1434 )	2023-04-08 15:36:55 +02:00
Jonas Reiher	12250eb761	Add stats window argument (#1424 ) * added stats_window_size argument * updated changelog * docstring info updated * added missing tensorboard log docstring * added stats_window_size argument for all models * fixed stats_window_size test * Update version --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-04-05 11:33:26 +02:00
Antonin RAFFIN	5a70af8abd	Fix type hints for DQN (#1354 ) * Fix type hints for DQN * [ci skip] Remove commented line * Refine types * Fix vectorized obs detection * Fix for pytype * Fix check at load time to create replay buffer * One config file to rule them all * Delete unused config --------- Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2023-03-30 11:31:47 +02:00
Omar Younis	a60b0179e0	Fix: Reshape action in DictRolloutBuffer (#1395 ) * reshape action in DictRolloutBuffer * improve buffer test * update changelog * add comment * Update comments and version --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-03-29 16:25:05 +02:00
Fiete	b6aa507a22	Make check_env assertions in regards to observation_space more actionable (#1400 ) * add instructions for running single tests in the README, add assertions for observation_space * update changelog * address linting warnings * correct pytest command in the README * correct review comments, run make commit-checks * truncate lines that are too long * address make lint warning about checking module availability * fix tests * use f-strings for formatting assertion messages * fix type issue * Refactor tests, improve error messages --------- Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-03-29 15:26:03 +02:00
Quentin Gallouédec	c5adad82b2	Multiprocessing support for HerReplayBuffer (#704 ) * IM compat. modif from old fork * mp her working, without offline sampling * update readme and doc * fix discrete action/obs space case * handle offline sampling * fix pos to be consistent with the old version * improve typing and docstring * fix discrete obs special case * new her, using episode uid * deal with full buffer * offline not implemented * info storage; compute_reward as arg; offline sampling error * offline sampling; timeout_termination; fix last_trans detection * rm max_episode_length from tests * fix loading and loading test * Fix episode sampling strategy * Episode interrupted not valid * Typo * Fix infos sampling, next_obs desired goals, offline sampling * update tests for multienvs * speed up code * handle timeout sampling when samping * give up ep_uid for ep_start and ep_lenght * speed up sampling * Improve docstring * Typos and renaming * Fix typing * Fix linter warnings * Renaming + add note * fix reward type * Fix future sampling strategy * Fix future goal selection strategy * env_fn as lambda * Re-fix linter warnings * Formatting * Fix offline sampling * restore the initial performance budget * Remove max_episode_length for HerReplayBuffer kwargs * SubprcVecEnv compat test * Dedicated SubrocVecEnv test rm n_envs from parametrization * Back to using the env arg instead of compute_reward * Up VecEnv import * fix lint warnings * fix docstring * Fix device issue * actor_loss_modifier in SAV and TD3 * Merge RewardModifier and ActorLossModifier into Surgeon * update surgeon for rnd * fix uninteded merge * fix uninteded merge * fix unintended merge * Rm unintended merge * Fix KeyError * Remove useless `all_inds` * Minor docstring format * Fix hint * speedup! * Speedup again * speedup * np.nonzero * fix env normalization * flat sampling for speedup * typo * drop online * format * remove observation from env_cheker (see #1335) * update changelog * default device to "auto" * add comment for info storage * add comment for ep_start and ep_length attributes * a[b][c] to a[b, c] * comment flatnonzero and unravel_index * update _sample_goals docstring * Fix future gaol sampling for split episode * add informative error message for learning_starts too small * use keyword arg for env * try fix pytye * Update stable_baselines3/common/off_policy_algorithm.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Add `copy_info_dict` option * Ignore pytype * Update changelog * Rename variables and improve documentation * Ignore new bug bear rule * Add note about future strategy * Add deprecation warning * Fix bug trying to pickle buffer kwargs --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2023-03-20 12:03:57 +01:00
Antonin RAFFIN	470771b5c2	Fix Atari Roms download, enable RUF linting (#1379 ) * Add extra no Atari and fix CI for forks * Enable ruff rules * Change to no roms	2023-03-12 18:47:52 +01:00
Quentin Gallouédec	12e9917c24	Fix image-based normalized env loading (#1321 ) * Fix * Add test * Update changelog * fix memory error avoidance * Update version * image env test * black * check_shape_equal * check shape equal in vecnormalize * Allow spaces not to be box or dict * rm `test_save_load_vecnormalized_image` in favor of `test_vec_env` * Remove unused imports --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2023-02-15 14:17:18 +01:00

295 commits