stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-26 22:45:15 +00:00

Author	SHA1	Message	Date
Adam Gleave	4fb8aec215	Update evaluate_policy type annotation to support policies as well as RL algorithms (#1146) * Add PolicyPredictor protocol and use it in evaluate_policy * Update changelog * Move Protocol to type_aliases to avoid circular import * Add test for evaluate_policy on BasePolicy * Remove unused import * Use typing_extensions * Move typing_extensions to 3rd party * Add version range (typing_extensions uses SemVer) * Import Protocol from typing_extensions only on Python<3.8 Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Install typing_extensions only on Python<3.8 * Add missing sys import * Fix import ordering * Fix observation type hint in predict Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>	2022-11-03 15:36:19 +01:00
Antonin RAFFIN	52c29dc497	Fix evaluation script for recurrent policies (#678) * Fix evaluation script for RNN * Add error message * Revert "Add error message" This reverts commit 8d69b6cf4de2cd13aecfb425bd3145fad6a6c49a. * Fix for pytype * Rename mask to `episode_start` * Fix type hint * Fix type hints * Remove confusing part of sentence Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-30 13:49:06 +01:00
Benjamin Black	a038044d11	Added support for vector envs in evaluation (#447) * added vector env support to evaluate_policy * fixed linting and documentation * updated changelog * fixed code style issue * added tests for vec env * fixed formatting * renamed observations * added comments for vector evaluation * fixed issues * Cleanup + bump version * Add comment * Fix wrong count of episodes Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2021-05-28 12:40:29 +02:00
Costa Huang	ddbe0e93f9	Support for `VecMonitor` for gym3-style environments (#311) * add vectorized monitor * auto format of the code * add documentation and VecExtractDictObs * refactor and add test cases * add test cases and format * avoid circular import and fix doc * fix type * fix type * oops * Update stable_baselines3/common/monitor.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Update stable_baselines3/common/monitor.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * add test cases * update changelog * fix mutable argument * quick fix * Apply suggestions from code review * fix terminal observation for gym3 envs * delete comment * Update doc and bump version * Add warning when already using `Monitor` wrapper * Update vecmonitor tests * Fixes Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-04-13 18:09:31 +02:00
Antonin RAFFIN	c62e9259db	Add custom objects support + bug fix (#336) * Add support for custom objects * Add python 3.8 to the CI * Bump version * PyType fixes * [ci skip] Fix typo * Add note about slow-down + fix typos * Minor edits to the doc * Bug fix for DQN * Update test * Add test for custom objects	2021-03-06 15:17:43 +02:00
Anssi	18d10dbf42	Use Monitor episode reward/length for `evaluate_policy` (#220) * Update evaluate_policy to use monitor data if available * Update documentation * Cleaning up * Remove unnecessary typing trickery * Update doc * Rename is_wrapped to clarify it is for vecenvs * Add is_wrapped for regular envs * Add is_wrapped call for subprocvecenv and update code for circular imports * Move new functions back to env_util and fix imports * Update changelog * Clarify evaluate_policy docs * Add tests for wrapped modifying episode lengths * Fix tests * Update changelog * Minor edits * Add warn switch to evaluate_policy and update tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2020-11-16 11:52:28 +01:00
M. Ernestus	c74509ae9d	Add callable signatures to type annotations. (#215) * Add callback signature to the learning rate type annotations. * Add callback signature to the learning rate schedule type annotations. * Add missing type annotations for learning rate callbacks. * Add signature to old-style learning and evaluation callbacks. * Add signature to env wrapper callback. * Add type annotation to closure function. * Use MaybeCallback more consistently. * Update changelog. * Remove now unused List import. * Fix import order. * Add type alias for learning rate schedules. * Optimize imports. * Fix messed up import. * Remove resolved TODO. Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2020-11-15 17:50:28 +01:00
Antonin RAFFIN	55912576ed	Cleanup docstring types (#169) * Cleanup docstring types * Update style * Test with js hack * Revert "Test with js hack" This reverts commit d091f438e8851ab8d01b66628e06a104f5e5ec69. * Fix types * Fix typo * Update CONTRIBUTING example	2020-10-02 20:05:55 +03:00
Antonin RAFFIN	2c924f52f5	Update docs (custom policy, type hints) (#167) * Change import * Update custom policy doc * Re-enable sphinx_autodoc_typehints * Update docker image * Attempt to fix read the doc build error * Add sphinx_autodoc_typehints to read the doc env * Fix pip version * Add full custom policy example * Fix	2020-09-29 20:41:14 +03:00
Antonin RAFFIN	21e9994ff9	Fix double reset and improve typing coverage (#136) * Fix double reset and improve typing coverage * Revert minor edit * Add doc about types	2020-08-05 13:12:02 +03:00
Antonin RAFFIN	23afedb254	Auto-formatting with black and isort (#97) * Add auto formatting with black and isort * Reformat code * Ignore typing errors * Add note about line length * Add minimum version for isort * Add commit-checks * Update docker image * Fixed lost import (during last merge) * Fix opencv dependency	2020-07-16 16:12:16 +02:00
Anssi	44f8218df0	Review of code (A2C, PPO and refactoring) (#35) * Split torch module code into torch_layers file * Updated reference to CNN * Change 'CxWxH' to 'CxHxW', as per common notion * Fix missing import in policies.py * Move PPOPolicy to OnlineActorCriticPolicy * Create OnPolicyRLModel from PPO, and make A2C and PPO inherit * Update A2C optimizer comment * Clean weight init scales for clarity * Fix A2C log_interval default parameter * Rename 'progress' to 'progress_remaining * Rename 'Models' to 'Algorithms' * Rename 'OnlineActorCriticPolicy' to 'ActorCriticPolicy' * Move static functions out from BaseAlgorithm * Move on/off_policy base algorithms to their own files * Add files for A2C/PPO * Fix docs * Fix pytype * Update documentation on OnPolicyAlgorithm * Add proper doctstring for on_policy rollout gathering * Add bit clarification on the mlppolicy/cnnpolicy naming * Move static function is_vectorized_policies to utils.py * Checking docstrings, pep8 fixes * Update changelog * Clean changelog * Remove policy warnings for sac/td3 * Add monitor_wrapper for OnPolicyAlgorithm. Clean tb logging variables. Add parameter keywords to OffPolicyAlgorithm super init Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2020-06-09 13:54:18 +02:00
Antonin RAFFIN	d542732c8d	Rename to stable-baselines3	2020-05-05 15:02:35 +02:00

13 commits