* Modified ActorCriticPolicy to support non-shared features extractor
* Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc
Doc update: added 'warning' on custom policy docs that says that, if the features extractor is non-shared, it's not possible to have shared layers in the mlp_extractor
* Moved attrib share_features_extractor in class
* Updated custom policy doc for non-shared features extractor
* Updated changelog
* Made some if-statements more readable if policies.py
The if-statements are related to the shared/non-shared features extractor in ActorCritic policies
* Simplify implementation and add run test
* Keep order in module gain to keep previous results consistents
* Fix test
* Improved docstring in policies.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Added some tests
* feature extractor -> features extractor
* Fix test
* Fix env_id in test
* Make features extractor parameter explicit
* Remove duplicate
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Fix support of image like normalized inputs
* Improve docstring and warning message.
* Don't check if obs is image when normalize_images is False (lil opt)
* Comment fix
* Fix normalize_images not passed to parent
* Check for subclasses too
* Remove useless multiline
* Update version and add comment
* Fix some typos
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Updated custom policy docs
Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch.
* Improved custom policy doc
Section: Custom Network Architecture.
Explained with greater detail that an action net and a value net will be added on top of the net_arch.
* Improved custom policy doc
Section: Custom Network Architecture.
Merged a comment into a note
* Alignment
Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
* Adds deprecation warning if `eval_env` or `eval_freq` parameters are used. See #925
* added changelog entry
* added missing backtick
* deprecating `create_eval_env` parameter as well and adding comments to explain the `stacklevel` parameter used
* Updated tests to ignore DeprecationWarnings
* Updated changelog entry
* - Removed the `create_eval_env` parameter from the examples in the docs
- Removed information about the `create_eval_env` parameter from the migration docs
- Added information about deprecation of the `create_eval_env` parameter in the docs
* Add alternative in docstring
* Update docstrings
* `eval_freq` warning in docstring
* Add deprecation comments in tests
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
* Add progress bar callback and argument
* Update doc
* Update changelog
* Upgrade pytype in docker image
* Use tqdm.write in the logger to have cleaner output
* Fix logger test
* Fix when doing multiple calls to learn()
* Address comments from code-review
* create Hparam class & support in all OutputFormats
* add hparams documentation & example
* add hparam tests
* remove unnecessary test & fix name
* format changes
* support hyperparameters logging to tensorboard
* fix HParams class docstring
* use more explicit variable names
* raise error instead of warning
* Unpin protobuf
* Add test for logging hparams
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Add info on split tensorboard graphs.
* Change wording to make it look better.
* Update changelog.rst
* Rephrase and add link to issue
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback
* Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement
* Update the docs related to new StopTrainingOnNoModelImprovement callback
* Update doc
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Page modified. Following fixes are committed in response to issue#755.
- fixed the broken url on creating custom gym environment. Also added appropriate advice by citing official OpenAi gym documents.
- SB3 text tweaked.
* modified page
- updated the in-line text hyperlinks to follow Sphinx restructured text format.
* modified page
- updated the in-line text hyperlinks to follow Sphinx restructured text format.
- updated text grammar
* Language
Co-authored-by: Anssi <kaneran21@hotmail.com>
* fix Atari in CI
* fix dtype and atari extra
* Update setup.py
* remove 3.6
* note about how to install Atari
* pendulum-v1
* atari v5
* black
* fix pendulum capitalization
* add minimum version
* moved things in changelog to breaking changes
* partial v5 fix
* env update to pass tests
* mismatch env version fixed
* Fix tests after merge
* Include autorom in setup.py
* Blacken code
* Fix dtype issue in more robust way
* Fix GitLab CI: switch to Docker container with new black version
* Remove workaround from GitLab. (May need to rebuild Docker for this though.)
* Revert to v4
* Update setup.py
* Apply suggestions from code review
* Remove unnecessary autorom
* Consistent gym versions
Co-authored-by: J K Terry <justinkterry@gmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: modanesh <mohamad4danesh@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
* Add Hugging Face to SB3 doc
* Update doc + fixes
* Use SB3 model from the hub
* Bump version
* Fixes
Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>
* more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614)
* add a note to explain the difference between `.load` and `.set_parameters` to the examples
* fix typos
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Add multi-env training support for SAC
* Fix for dict obs
* Pytype fixes
* Fix assert on number of envs
* Remove for loop
* Add support for Dict obs
* Start cleanup
* Update doc and bug fix
* Add support for vectorized action noise
and add multi env example for off-policy
* Update version
* Bug fix with VecNormalize
* Update README table
* Update variable names
* Update changelog and version
* Update doc and fix for `gradient_steps=-1`
* Add test for `gradient_steps=-1`
* Disable pytype pyi errors
* Fix for DQN
* Update comment on deepcopy
* Remove episode_reward field
* Fix RolloutReturn
* Avoid modification by reference
* Fix error message
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Add a section on exporting to TFLite/Coral with demonstration
* Changelog to reflect new export documentation
* Update docs/guide/export.rst
Fingers on autopilot make word wrong
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Update docs/guide/export.rst
Better wording clarity
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Update docs/guide/export.rst
Better wording clarity
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Clarify motivations and hardware
* Update docs/misc/changelog.rst
Make consistent with other changelog entries
Co-authored-by: Anssi <kaneran21@hotmail.com>
* Sphinx wants the section underline to be at least this long
* Remove first-person voice
* Typos
Co-authored-by: Anssi <kaneran21@hotmail.com>