* Add warning about total_env_steps not dividing neatly into batch size
* Stylistic cleanup
* Black reformatting
* Add clearer documentation and update changelog
* Update changelog.rst
* Use specific RolloutBuffer terminology
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Change to minibatch language
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Cleaning up language describing rollout buffer requirements
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Switch to using env.num_envs
* Working tests
* Black and isort still fighting each other
* codestyle finally happy
* Basic test exists, possibly in the wrong file
* Update phrasing
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Fix for arguments order in explained_variance()
Fix for arguments order in explained_variance() in PPO
* Fix for arguments order in explained_variance()
Fix for arguments order in explained_variance() in a2c
* Fix for arguments order in explained_variance()
update changelog.rst
* Add callback signature to the learning rate type annotations.
* Add callback signature to the learning rate schedule type annotations.
* Add missing type annotations for learning rate callbacks.
* Add signature to old-style learning and evaluation callbacks.
* Add signature to env wrapper callback.
* Add type annotation to closure function.
* Use MaybeCallback more consistently.
* Update changelog.
* Remove now unused List import.
* Fix import order.
* Add type alias for learning rate schedules.
* Optimize imports.
* Fix messed up import.
* Remove resolved TODO.
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Small docstring improvements related to the notion of Rollout
* documented changes in changelog.rst, added myself to contributers
* Minor edits
Co-authored-by: Stefan Heid <stefan.heid@upb.de>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Add auto formatting with black and isort
* Reformat code
* Ignore typing errors
* Add note about line length
* Add minimum version for isort
* Add commit-checks
* Update docker image
* Fixed lost import (during last merge)
* Fix opencv dependency
* Split torch module code into torch_layers file
* Updated reference to CNN
* Change 'CxWxH' to 'CxHxW', as per common notion
* Fix missing import in policies.py
* Move PPOPolicy to OnlineActorCriticPolicy
* Create OnPolicyRLModel from PPO, and make A2C and PPO inherit
* Update A2C optimizer comment
* Clean weight init scales for clarity
* Fix A2C log_interval default parameter
* Rename 'progress' to 'progress_remaining
* Rename 'Models' to 'Algorithms'
* Rename 'OnlineActorCriticPolicy' to 'ActorCriticPolicy'
* Move static functions out from BaseAlgorithm
* Move on/off_policy base algorithms to their own files
* Add files for A2C/PPO
* Fix docs
* Fix pytype
* Update documentation on OnPolicyAlgorithm
* Add proper doctstring for on_policy rollout gathering
* Add bit clarification on the mlppolicy/cnnpolicy naming
* Move static function is_vectorized_policies to utils.py
* Checking docstrings, pep8 fixes
* Update changelog
* Clean changelog
* Remove policy warnings for sac/td3
* Add monitor_wrapper for OnPolicyAlgorithm. Clean tb logging variables. Add parameter keywords to OffPolicyAlgorithm super init
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* init commit tensorboard-integration
* Added tb logger to ppo (with output exclusions)
* fixed truncated stdout
* categorize stdout outputs by tag
* separated exclusions from values, added missing logs
* saving exclusions as dict instead of list
* reformatting, auto run indexing
* included renaming suggestions, fixed tests
* tb support for sac
* linting
* moved logging to base class
* tb support for td3
* removed histograms, non-verbose output working
* modifed changelog
* linting
* fixed type error
* moved logger config to utils
* removed episode_rewards log from ppo
* Enable tensorboard in tests
* Remove unused import
* Update logger sub titles
* Minor edit for PPO
* Update logger and tb log folder
* Pass correct logger to Callbacks
* updated docs
* added tb example image to docs
* add support for continuing training in tensorboard
* added tensorboard to docs index
* added tb test
* moved logger config to _setup_learn, updated tests
* accessing verbose from base class
* Update doc and tests
* Rename session -> time
* Update version
* Update logger truncate
* Update types
* Remove duplicated code
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>