mirror of
https://github.com/saymrwulf/stable-baselines3.git
synced 2026-05-16 21:10:08 +00:00
* IM compat. modif from old fork * mp her working, without offline sampling * update readme and doc * fix discrete action/obs space case * handle offline sampling * fix pos to be consistent with the old version * improve typing and docstring * fix discrete obs special case * new her, using episode uid * deal with full buffer * offline not implemented * info storage; compute_reward as arg; offline sampling error * offline sampling; timeout_termination; fix last_trans detection * rm max_episode_length from tests * fix loading and loading test * Fix episode sampling strategy * Episode interrupted not valid * Typo * Fix infos sampling, next_obs desired goals, offline sampling * update tests for multienvs * speed up code * handle timeout sampling when samping * give up ep_uid for ep_start and ep_lenght * speed up sampling * Improve docstring * Typos and renaming * Fix typing * Fix linter warnings * Renaming + add note * fix reward type * Fix future sampling strategy * Fix future goal selection strategy * env_fn as lambda * Re-fix linter warnings * Formatting * Fix offline sampling * restore the initial performance budget * Remove max_episode_length for HerReplayBuffer kwargs * SubprcVecEnv compat test * Dedicated SubrocVecEnv test rm n_envs from parametrization * Back to using the env arg instead of compute_reward * Up VecEnv import * fix lint warnings * fix docstring * Fix device issue * actor_loss_modifier in SAV and TD3 * Merge RewardModifier and ActorLossModifier into Surgeon * update surgeon for rnd * fix uninteded merge * fix uninteded merge * fix unintended merge * Rm unintended merge * Fix KeyError * Remove useless `all_inds` * Minor docstring format * Fix hint * speedup! * Speedup again * speedup * np.nonzero * fix env normalization * flat sampling for speedup * typo * drop online * format * remove observation from env_cheker (see #1335) * update changelog * default device to "auto" * add comment for info storage * add comment for ep_start and ep_length attributes * a[b][c] to a[b, c] * comment flatnonzero and unravel_index * update _sample_goals docstring * Fix future gaol sampling for split episode * add informative error message for learning_starts too small * use keyword arg for env * try fix pytye * Update stable_baselines3/common/off_policy_algorithm.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Add `copy_info_dict` option * Ignore pytype * Update changelog * Rename variables and improve documentation * Ignore new bug bear rule * Add note about future strategy * Add deprecation warning * Fix bug trying to pickle buffer kwargs --------- Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> |
||
|---|---|---|
| .. | ||
| a2c.rst | ||
| base.rst | ||
| ddpg.rst | ||
| dqn.rst | ||
| her.rst | ||
| ppo.rst | ||
| sac.rst | ||
| td3.rst | ||