* Change saving/loading normalization parameters to use a single pickle file
* Remove 'use_gae' from RolloutBuffer compute_returns function
* Add some missing tests for normalizer, nan-checker and PPO clip_value_fn argument
* Update changelog
* Fix typo
* Use proper pytest.raises for catching errors in tests
* Add comment on GAE and how to obtain non-GAE behaviour
* Remove save/load_running_average from VecNormalize in favor of load/save
* Update changelog
* Update docstring
* Add accidentally removed tests for VecNormalize
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
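
The GAE note above can be illustrated with a minimal sketch, written under stated assumptions rather than taken from the library code. The function names, the argument layout, and the convention that `dones[t]` marks an episode ending at step `t` are all hypothetical. It shows that setting `gae_lambda=1.0` reduces GAE to the plain advantage, i.e. the bootstrapped discounted return minus the value estimate.

```python
def compute_gae(rewards, values, last_value, dones, gamma=0.99, gae_lambda=0.95):
    """Illustrative GAE for one rollout of a single environment.

    Assumed convention (not necessarily the library's): dones[t] is True
    when the episode ended at step t, so nothing is bootstrapped from t + 1.
    """
    n = len(rewards)
    advantages = [0.0] * n
    last_gae = 0.0
    for t in reversed(range(n)):
        next_value = last_value if t == n - 1 else values[t + 1]
        non_terminal = 0.0 if dones[t] else 1.0
        # One-step TD error
        delta = rewards[t] + gamma * next_value * non_terminal - values[t]
        # Exponentially weighted sum of TD errors
        last_gae = delta + gamma * gae_lambda * non_terminal * last_gae
        advantages[t] = last_gae
    returns = [adv + val for adv, val in zip(advantages, values)]
    return advantages, returns


def discounted_returns(rewards, last_value, dones, gamma=0.99):
    """Discounted return with value bootstrap, for comparison with gae_lambda=1."""
    out = [0.0] * len(rewards)
    running = last_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * (0.0 if dones[t] else running)
        out[t] = running
    return out


# With gae_lambda=1.0 the returns match the bootstrapped discounted returns.
rewards, values, dones = [1.0, 1.0, 1.0], [0.5, 0.4, 0.3], [False, False, False]
adv, ret = compute_gae(rewards, values, 0.2, dones, gamma=0.9, gae_lambda=1.0)
mc = discounted_returns(rewards, 0.2, dones, gamma=0.9)
```

Here `ret` and `mc` agree up to floating-point error, which is the non-GAE behaviour the comment refers to.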
* Split torch module code into torch_layers file
* Updated reference to CNN
* Change 'CxWxH' to 'CxHxW', as per common convention
* Fix missing import in policies.py
* Move PPOPolicy to OnlineActorCriticPolicy
* Create OnPolicyRLModel from PPO, and make A2C and PPO inherit from it
* Update A2C optimizer comment
* Clean weight init scales for clarity
* Fix A2C log_interval default parameter
* Rename 'progress' to 'progress_remaining'
* Rename 'Models' to 'Algorithms'
* Rename 'OnlineActorCriticPolicy' to 'ActorCriticPolicy'
* Move static functions out from BaseAlgorithm
* Move on/off_policy base algorithms to their own files
* Add files for A2C/PPO
* Fix docs
* Fix pytype
* Update documentation on OnPolicyAlgorithm
* Add proper docstring for on_policy rollout gathering
* Add a bit of clarification on the mlppolicy/cnnpolicy naming
* Move static function is_vectorized_policies to utils.py
* Checking docstrings, pep8 fixes
* Update changelog
* Clean changelog
* Remove policy warnings for sac/td3
* Add monitor_wrapper for OnPolicyAlgorithm. Clean tb logging variables. Add parameter keywords to OffPolicyAlgorithm super init
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
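
The 'progress' to 'progress_remaining' rename above suggests a value that counts down from 1 (start of training) to 0 (end). The following is a small sketch under that assumption. Both helper names are hypothetical and only illustrate how a schedule could consume such a value.

```python
def progress_remaining(num_timesteps, total_timesteps):
    """Fraction of training left: 1.0 at the start, 0.0 at the end (assumed convention)."""
    return 1.0 - num_timesteps / total_timesteps


def linear_schedule(initial_value):
    """Return a schedule that decays linearly from initial_value to 0."""
    def schedule(progress):
        # progress is the remaining fraction, so the value shrinks over training
        return progress * initial_value
    return schedule


lr_schedule = linear_schedule(3e-4)
lr_at_quarter = lr_schedule(progress_remaining(250, 1000))
```

Naming the argument after what remains, rather than what has elapsed, makes the direction of the decay explicit at the call site.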
* init commit tensorboard-integration
* Added tb logger to ppo (with output exclusions)
* fixed truncated stdout
* categorize stdout outputs by tag
* separated exclusions from values, added missing logs
* saving exclusions as dict instead of list
* reformatting, auto run indexing
* included renaming suggestions, fixed tests
* tb support for sac
* linting
* moved logging to base class
* tb support for td3
* removed histograms, non-verbose output working
* modified changelog
* linting
* fixed type error
* moved logger config to utils
* removed episode_rewards log from ppo
* Enable tensorboard in tests
* Remove unused import
* Update logger sub titles
* Minor edit for PPO
* Update logger and tb log folder
* Pass correct logger to Callbacks
* updated docs
* added tb example image to docs
* add support for continuing training in tensorboard
* added tensorboard to docs index
* added tb test
* moved logger config to _setup_learn, updated tests
* accessing verbose from base class
* Update doc and tests
* Rename session -> time
* Update version
* Update logger truncate
* Update types
* Remove duplicated code
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Implemented Vectorized Action Noise
Vectorized Action Noise allows for multiple instances of
ActionNoiseProcesses to run in parallel. This makes it easier to
run TD3/SAC/DDPG with VecEnv.
* fixed linting issues
* make test function name consistent
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* sanity checks and more detailed test
* Update stable_baselines3/common/noise.py
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Added assertion error message in noises setter
* Corrected tests to reflect change to AssertionError from ValueError
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
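
The idea behind the vectorized action noise described above can be sketched as follows. This is an illustrative stand-in, not the code in stable_baselines3/common/noise.py. The class names, constructor arguments, and the simplified Ornstein-Uhlenbeck process are assumptions made for the example. One independent copy of a base noise process is kept per environment, and selected copies can be reset when their environment's episode ends.

```python
import copy
import math
import random


class OrnsteinUhlenbeckNoise:
    """Hypothetical stateful noise process for a single environment."""

    def __init__(self, action_dim, sigma=0.2, theta=0.15, dt=1e-2):
        self.action_dim = action_dim
        self.sigma = sigma
        self.theta = theta
        self.dt = dt
        self.reset()

    def reset(self):
        self.state = [0.0] * self.action_dim

    def __call__(self):
        # Mean-reverting update towards 0 plus Gaussian perturbation
        self.state = [
            x - self.theta * x * self.dt
            + self.sigma * math.sqrt(self.dt) * random.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


class VectorizedNoise:
    """Hypothetical wrapper running one copy of a noise process per environment."""

    def __init__(self, base_noise, n_envs):
        if n_envs <= 0:
            raise ValueError("n_envs must be positive")
        self.n_envs = n_envs
        # Deep copies so that each environment has its own internal state
        self.noises = [copy.deepcopy(base_noise) for _ in range(n_envs)]

    def __call__(self):
        # One noise vector per environment: n_envs rows of action_dim values
        return [noise() for noise in self.noises]

    def reset(self, indices=None):
        # Reset all copies by default, or only those whose episode just ended
        if indices is None:
            indices = range(self.n_envs)
        for i in indices:
            self.noises[i].reset()


vec_noise = VectorizedNoise(OrnsteinUhlenbeckNoise(action_dim=2), n_envs=3)
sample = vec_noise()          # 3 rows, one per environment
vec_noise.reset(indices=[1])  # only the second environment restarts its noise
```

Resetting by index matters for stateful processes: when one environment finishes an episode, only its noise state should restart while the others continue.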
* Test gitlab-ci
* Try different image
* Add pytest and doc build
* Fix command
* Fix image used for CI
* Separate pytest builds
* Fix weird seg fault in docker image due to FakeImageEnv
* Fix make command
* [ci skip] Add space in the badges
* Fix CI failures
* Re-install opencv
* Use opencv-headless
* Test with new docker image