* Prohibit simultaneous use of optimize_memory_buffer and handle_timeout_termination
* Modify test to avoid unsupported buffer configuration
* Change from assertion to raising of ValueError
* Update changelog
* Update style for consistency
* Use handle_timeout_termination when possible
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
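The switch from an assertion to a `ValueError` in the entries above can be sketched as follows. `BufferConfig` is a hypothetical stand-in, not the library's buffer class; only the two option names are taken from the entries. An explicit exception is still raised when Python runs with `-O`, whereas `assert` statements are stripped.

```python
class BufferConfig:
    """Hypothetical stand-in for a replay buffer's constructor arguments."""

    def __init__(
        self,
        optimize_memory_buffer: bool = False,
        handle_timeout_termination: bool = True,
    ):
        # Explicit exception instead of `assert`: asserts are removed under `python -O`.
        if optimize_memory_buffer and handle_timeout_termination:
            raise ValueError(
                "optimize_memory_buffer and handle_timeout_termination "
                "cannot be enabled at the same time"
            )
        self.optimize_memory_buffer = optimize_memory_buffer
        self.handle_timeout_termination = handle_timeout_termination
```

A caller that wants the memory-optimized variant would, under this sketch, have to pass `handle_timeout_termination=False` explicitly.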
* Fixed unchecked None value in SubprocVecEnv
* Fixed unchecked None value in DummyVecEnv
* Fix formatting
* Update test and changelog
* Improve test
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* escape tensorboard log name
Otherwise utils does not recognize the log.
* Added fix to changelog
* Modifications made by: make commit-checks .
* Revert "Modifications made by: make commit-checks ."
This reverts commit 529a275d9475f85ef031038a8f3565f7301e5371.
* Update changelog and add test
Co-authored-by: James Hirschorn <James.Hirschorn@quantitative-technologies.com>
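Why the log name needs escaping can be shown with the standard `glob` module. This is a minimal sketch, assuming run folders are looked up with a pattern of the form `<log_name>_<number>`; the helper `find_runs` and the folder name `run[1]_1` are made up for illustration.

```python
import glob
import os
import tempfile


def find_runs(log_path: str, log_name: str, escape: bool = True) -> list:
    """Return run directories named `<log_name>_<number>` under `log_path`."""
    # glob.escape neutralizes the special characters `*`, `?` and `[`.
    name = glob.escape(log_name) if escape else log_name
    return glob.glob(os.path.join(glob.escape(log_path), f"{name}_[0-9]*"))


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "run[1]_1"))
    # Without escaping, `[1]` is read as a character class and nothing matches.
    unescaped = find_runs(tmp, "run[1]", escape=False)
    escaped = find_runs(tmp, "run[1]", escape=True)
```

Here `unescaped` is empty, while `escaped` contains the `run[1]_1` folder.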
* Goal sampled from next_achieved_goal instead of achieved_goal
* No need to have special case for future anymore
* Update changelog
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* Replacing the policy registry with policy "aliases"
* Fixing import order and SAC
* Changing arg. order to be sure policy_aliases is a kwarg
* Import orders
* Removing pytype error check
* Reformat
* Fix alias import
* Not using mutable {} as default for policy_aliases
* Empty aliases initialization
* Using static attributes for policy_aliases
* Fixing isort
* Fixing back bad merge
* Running isort
* Fixing aliases for A2C and PPO
* Using f-string
* Moving policy_aliases definition position
* Adding change in the changelog
* Update version
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
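The alias approach described in the entries above can be sketched roughly as below. All class and method names are hypothetical placeholders; only the attribute name `policy_aliases` comes from the entries. The mapping is a class-level attribute that each algorithm overrides, so no mutable `{}` is passed as a default argument.

```python
from typing import Dict, Type


class BasePolicy:
    """Hypothetical policy base class."""


class MlpPolicy(BasePolicy):
    """Hypothetical concrete policy."""


class BaseAlgorithm:
    # Class-level (static) mapping; subclasses override it instead of
    # receiving a mutable `{}` default through `__init__`.
    policy_aliases: Dict[str, Type[BasePolicy]] = {}

    def __init__(self, policy):
        # Accept either a policy class or a string alias.
        if isinstance(policy, str):
            self.policy_class = self._get_policy_from_name(policy)
        else:
            self.policy_class = policy

    def _get_policy_from_name(self, policy_name: str) -> Type[BasePolicy]:
        if policy_name in self.policy_aliases:
            return self.policy_aliases[policy_name]
        raise ValueError(f"Policy {policy_name} unknown")


class MyAlgo(BaseAlgorithm):
    # Each algorithm declares its own aliases.
    policy_aliases = {"MlpPolicy": MlpPolicy}
```

With this layout, `MyAlgo("MlpPolicy")` and `MyAlgo(MlpPolicy)` resolve to the same class, and an unknown alias raises a `ValueError`.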
* Removing dead code for handling time limits (see #829)
* Mentioning remove_time_limit_termination in the changelog
* Update changelog.rst
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback
* Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement
* Update the docs related to new StopTrainingOnNoModelImprovement callback
* Update doc
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
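The idea behind stopping on no model improvement can be sketched without the library. `NoImprovementStopper` and its parameter names are hypothetical and do not reproduce the actual callback; the sketch only shows one plausible patience rule, called once after every evaluation.

```python
class NoImprovementStopper:
    """Hypothetical helper: stop after too many evaluations without a new best reward."""

    def __init__(self, max_no_improvement_evals: int, min_evals: int = 0):
        self.max_no_improvement_evals = max_no_improvement_evals
        self.min_evals = min_evals
        self.best_mean_reward = float("-inf")
        self.no_improvement_evals = 0
        self.n_evals = 0

    def after_eval(self, mean_reward: float) -> bool:
        """Return True to continue training, False to stop."""
        self.n_evals += 1
        if mean_reward > self.best_mean_reward:
            # New best: reset the patience counter.
            self.best_mean_reward = mean_reward
            self.no_improvement_evals = 0
        else:
            self.no_improvement_evals += 1
        # Never stop before the minimum number of evaluations.
        if self.n_evals <= self.min_evals:
            return True
        return self.no_improvement_evals <= self.max_no_improvement_evals


stopper = NoImprovementStopper(max_no_improvement_evals=2)
decisions = [stopper.after_eval(r) for r in [1.0, 2.0, 1.5, 1.9, 1.8]]
```

In this run the best reward is 2.0, and the third consecutive evaluation below it returns `False`.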
* Allow PPO to turn off advantage normalization
* update changelog
* Add a test case
* Update test and sanity check
* Fix tests
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Make HumanOutputFormat length configurable and bump to 36 by default
* Add test case
* Updated changelog
* Blacken
* Blacken code
* Fix GitLab CI: switch to Docker container with new black version
* Incorporate suggestion
* Add class docstring
* Dummy commit to retrigger GitLab
Co-authored-by: Anssi <kaneran21@hotmail.com>
* fix Atari in CI
* fix dtype and atari extra
* Update setup.py
* remove 3.6
* note about how to install Atari
* pendulum-v1
* atari v5
* black
* fix pendulum capitalization
* add minimum version
* moved things in changelog to breaking changes
* partial v5 fix
* env update to pass tests
* mismatch env version fixed
* Fix tests after merge
* Include autorom in setup.py
* Blacken code
* Fix dtype issue in more robust way
* Fix GitLab CI: switch to Docker container with new black version
* Remove workaround from GitLab. (May need to rebuild Docker for this though.)
* Revert to v4
* Update setup.py
* Apply suggestions from code review
* Remove unnecessary autorom
* Consistent gym versions
Co-authored-by: J K Terry <justinkterry@gmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: modanesh <mohamad4danesh@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
* Add Hugging Face to SB3 doc
* Update doc + fixes
* Use SB3 model from the hub
* Bump version
* Fixes
Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>
* Writing the additional info_keywords into the episode infos that are passed to the results writer. Directly taken from the non-vec version of monitor.
* Added test for monitoring info_keywords.
* Removed unnecessary step of registering the env. Not using make_vec_env, because it applies a monitor wrapper to the env.
* Reformat
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614)
* add a note to explain the difference between `.load` and `.set_parameters` to the examples
* fix typos
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Added ``newline="\n"`` when opening CSV monitor files so that each line ends with ``\r\n`` instead of ``\r\r\n`` on Windows while Linux environments are not affected
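The effect of ``newline="\n"`` can be checked with the standard `csv` module, whose writer terminates rows with ``\r\n`` by default. The sketch below is illustrative only: the file name and the field names `r`, `l`, `t` are assumptions, not taken from the entry above.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.monitor.csv")
    # newline="\n" disables newline translation on write, so the "\r\n"
    # emitted by the csv module is stored unchanged on every platform.
    with open(path, "wt", newline="\n") as file_handler:
        writer = csv.DictWriter(file_handler, fieldnames=("r", "l", "t"))
        writer.writeheader()
        writer.writerow({"r": 1.0, "l": 10, "t": 0.5})
    # Read the raw bytes back to inspect the actual line endings.
    with open(path, "rb") as file_handler:
        raw = file_handler.read()
```

With the default `newline=None`, Windows would translate the trailing ``\n`` of each row into ``\r\n`` and produce ``\r\r\n``; with ``newline="\n"`` the bytes in `raw` contain only ``\r\n`` line endings.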
* Add multi-env training support for SAC
* Fix for dict obs
* Pytype fixes
* Fix assert on number of envs
* Remove for loop
* Add support for Dict obs
* Start cleanup
* Update doc and bug fix
* Add support for vectorized action noise and add multi env example for off-policy
* Update version
* Bug fix with VecNormalize
* Update README table
* Update variable names
* Update changelog and version
* Update doc and fix for `gradient_steps=-1`
* Add test for `gradient_steps=-1`
* Disable pytype pyi errors
* Fix for DQN
* Update comment on deepcopy
* Remove episode_reward field
* Fix RolloutReturn
* Avoid modification by reference
* Fix error message
Co-authored-by: Anssi <kaneran21@hotmail.com>