* Allow PPO to turn off advantage normalization
* update changelog
* Add a test case
* Update test and sanity check
* Fix tests
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
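
For context on the PPO change above, here is a minimal sketch of what toggling advantage normalization means. The helper name `maybe_normalize`, the `eps` value, and the use of the population standard deviation are illustrative assumptions, not taken from the SB3 source.

```python
from statistics import fmean, pstdev


def maybe_normalize(advantages, normalize_advantage=True, eps=1e-8):
    """Rescale advantages to zero mean and unit std, unless disabled.

    Hypothetical helper for illustration only.
    """
    # With the flag off (or a single sample), return the values unchanged.
    if not normalize_advantage or len(advantages) <= 1:
        return list(advantages)
    mean = fmean(advantages)
    std = pstdev(advantages)  # population std; an assumption of this sketch
    # eps guards against division by zero when all advantages are equal.
    return [(a - mean) / (std + eps) for a in advantages]
```

Skipping the rescaling can be useful when a mini-batch is very small, since the batch statistics are then noisy.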
* Page modified. The following fixes address issue #755.
- Fixed the broken URL on creating a custom gym environment. Also added appropriate advice, citing the official OpenAI Gym documentation.
- Tweaked the SB3 text.
* modified page
- updated the in-line text hyperlinks to follow the Sphinx reStructuredText format.
* modified page
- updated the in-line text hyperlinks to follow the Sphinx reStructuredText format.
- updated text grammar
* Language
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Make HumanOutputFormat length configurable and bump to 36 by default
* Add test case
* Updated changelog
* Blacken
* Blacken code
* Fix GitLab CI: switch to Docker container with new black version
* Incorporate suggestion
* Add class docstring
* Dummy commit to retrigger GitLab
Co-authored-by: Anssi <kaneran21@hotmail.com>
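
As a rough illustration of the configurable length in the `HumanOutputFormat` change above, here is a sketch of truncating a logged key or value to a maximum width. The function name `truncate` and the `...` suffix are assumptions of this sketch. Only the default of 36 comes from the commit message.

```python
def truncate(text: str, max_length: int = 36) -> str:
    """Shorten text to at most max_length characters (hypothetical helper)."""
    if len(text) > max_length:
        # Reserve three characters for the ellipsis marker.
        return text[: max_length - 3] + "..."
    return text
```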
* fix Atari in CI
* fix dtype and atari extra
* Update setup.py
* remove 3.6
* note about how to install Atari
* pendulum-v1
* atari v5
* black
* fix pendulum capitalization
* add minimum version
* moved things in changelog to breaking changes
* partial v5 fix
* env update to pass tests
* mismatch env version fixed
* Fix tests after merge
* Include autorom in setup.py
* Blacken code
* Fix dtype issue in more robust way
* Fix GitLab CI: switch to Docker container with new black version
* Remove workaround from GitLab. (May need to rebuild Docker for this though.)
* Revert to v4
* Update setup.py
* Apply suggestions from code review
* Remove unnecessary autorom
* Consistent gym versions
Co-authored-by: J K Terry <justinkterry@gmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: modanesh <mohamad4danesh@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
* Add Hugging Face to SB3 doc
* Update doc + fixes
* Use SB3 model from the hub
* Bump version
* Fixes
Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>
* Write the additional info_keywords into the episode infos that are passed to the results writer. Taken directly from the non-vec version of the monitor.
* Added test for monitoring info_keywords.
* Removed unnecessary step of registering the env. Not using make_vec_env, because it applies a monitor wrapper to the env.
* Reformat
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
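
The `info_keywords` handling described above can be sketched as follows: at the end of an episode, the selected keys are copied from the environment's last `info` dict into the episode record before it is written. The function name `build_episode_info` and the example key `is_success` are hypothetical. The `r` and `l` fields are assumed here to hold the episode return and length.

```python
def build_episode_info(episode_return, episode_length, step_info, info_keywords=()):
    """Assemble one episode record for a results writer (illustrative sketch)."""
    episode_info = {"r": episode_return, "l": episode_length}
    for key in info_keywords:
        # A missing key raises KeyError, which surfaces misconfigured keywords early.
        episode_info[key] = step_info[key]
    return episode_info
```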
* more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614)
* add a note to explain the difference between `.load` and `.set_parameters` to the examples
* fix typos
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Added ``newline="\n"`` when opening CSV monitor files so that each line ends with ``\r\n`` instead of ``\r\r\n`` on Windows, while Linux environments are not affected
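
To see why the ``newline="\n"`` fix above matters: Python's `csv` writer already ends each row with ``\r\n``, and on Windows the default text mode would translate the trailing ``\n`` again. The sketch below writes a small CSV with translation disabled and reads the raw bytes back. The helper names and the column names are illustrative assumptions.

```python
import csv
import os
import tempfile


def write_monitor_rows(path, rows, fieldnames=("r", "l", "t")):
    """Write rows as CSV without platform newline translation."""
    # newline="\n" keeps "\n" as-is, so the csv writer's "\r\n" is not
    # expanded to "\r\r\n" on Windows.
    with open(path, "w", newline="\n") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_raw(path):
    """Return the file content as bytes to inspect the line endings."""
    with open(path, "rb") as handle:
        return handle.read()
```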
* Add multi-env training support for SAC
* Fix for dict obs
* Pytype fixes
* Fix assert on number of envs
* Remove for loop
* Add support for Dict obs
* Start cleanup
* Update doc and bug fix
* Add support for vectorized action noise and add a multi-env example for off-policy algorithms
* Update version
* Bug fix with VecNormalize
* Update README table
* Update variable names
* Update changelog and version
* Update doc and fix for `gradient_steps=-1`
* Add test for `gradient_steps=-1`
* Disable pytype pyi errors
* Fix for DQN
* Update comment on deepcopy
* Remove episode_reward field
* Fix RolloutReturn
* Avoid modification by reference
* Fix error message
Co-authored-by: Anssi <kaneran21@hotmail.com>
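
One way to read the `gradient_steps=-1` entries above, assuming that `-1` means "take as many gradient steps as transitions were collected during the rollout" (with several environments, steps per environment times the number of environments). The helper `resolve_gradient_steps` is hypothetical and only illustrates that assumption.

```python
def resolve_gradient_steps(gradient_steps: int, collected_transitions: int) -> int:
    """Return the number of gradient updates to run after a rollout (sketch)."""
    if gradient_steps >= 0:
        # An explicit non-negative value is used as given.
        return gradient_steps
    # Negative value: match the amount of data just collected.
    return collected_transitions
```

For example, 8 steps in each of 4 environments would give 32 updates under this assumption.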
* Fix evaluation script for RNN
* Add error message
* Revert "Add error message"
This reverts commit 8d69b6cf4de2cd13aecfb425bd3145fad6a6c49a.
* Fix for pytype
* Rename mask to `episode_start`
* Fix type hint
* Fix type hints
* Remove confusing part of sentence
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Add a section on exporting to TFLite/Coral with demonstration
* Changelog to reflect new export documentation
* Update docs/guide/export.rst
Fingers on autopilot make word wrong
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Update docs/guide/export.rst
Better wording clarity
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Update docs/guide/export.rst
Better wording clarity
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Clarify motivations and hardware
* Update docs/misc/changelog.rst
Make consistent with other changelog entries
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Sphinx wants the section underline to be at least this long
* Remove first-person voice
* Typos
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Update rl_tips.rst
Indent fix so that `if done` and its following statement work
* Fix indentation and update changelog
* Skip type check for python 3.9
Co-authored-by: paulg <cove9988@gmail.com>
* Store number of timesteps at the beginning of each learn cycle
* Update changelog
* Set default _num_timesteps_at_start in the constructor
* Test case for FPS logger
* Adjust test to cover both on-policy and off-policy algorithms
* Fix formatting
* Update test and add comment
* Fix test
Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
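
The reason for storing the timestep count at the start of each learn cycle can be sketched like this: when training resumes from an existing counter, dividing the total count by the time elapsed in the current cycle overstates the FPS. The function name and the `eps` guard are assumptions of this sketch.

```python
def compute_fps(num_timesteps, num_timesteps_at_start, elapsed_seconds, eps=1e-9):
    """Frames per second for the current learn cycle only (illustrative)."""
    steps_this_cycle = num_timesteps - num_timesteps_at_start
    # eps avoids a division by zero right after the timer starts.
    return int(steps_this_cycle / max(elapsed_seconds, eps))
```

With 10000 steps from an earlier cycle and 5000 new steps in 5 seconds, this gives 1000, whereas using the total count alone would report 3000.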
* Add highway-env to the list of projects using SB3
Many thanks for this fantastic library, keep up the good work!
* Update changelog with added documentation
* Add `system_env_info`
* Add `print_system_info` to load
and store system info at save time
* Remove TODO
* Rename to `get_system_info`
* Import as sb3 for consistency
* Update changelog
* Add warning for old SB3 versions
* Use underscore literal for more clarity
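
A standard-library sketch of the idea behind the system-info entries above: gather a few environment details so they can be stored at save time and compared at load time. The function name `collect_system_info`, the chosen keys, and the return shape are assumptions for illustration and do not reproduce the SB3 helper.

```python
import platform


def collect_system_info(print_info=False):
    """Return (dict, formatted string) describing the runtime (hypothetical)."""
    info = {
        "OS": platform.platform(),
        "Python": platform.python_version(),
    }
    # One "key: value" pair per line, in insertion order.
    text = "\n".join(f"{key}: {value}" for key, value in info.items())
    if print_info:
        print(text)
    return info, text
```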
* Use a consistent key to log the total timesteps
This changes the timestep logging key of on-policy algorithms from
`time/total_timesteps` to `time/total timesteps` (note the
underscore/space). The off-policy algorithms and the eval callback
already use the latter, so this behavior is more consistent.
* Use underscores instead of spaces in logging keys
Most keys already followed this policy and consistent behavior is
friendlier to new users.
* Minor edit and bump version
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
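
Scripts that read logs written before the key rename above may still see keys containing spaces. A small hypothetical helper, not part of SB3, that maps old-style keys to the underscore form:

```python
def normalize_log_key(key: str) -> str:
    """Replace spaces with underscores in a logging key (illustrative)."""
    return key.replace(" ", "_")


def normalize_log_record(record: dict) -> dict:
    """Apply the key normalization to every entry of one log record."""
    return {normalize_log_key(key): value for key, value in record.items()}
```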