* Modified actor-critic policies & MlpExtractor class
ActorCriticPolicy:
- changed type hint of net_arch param: now it's a dict
- removed check that if features extractor is not shared: no shared layers are allowed in the mlp_extractor regardless of the features extractor
ActorCriticCnnPolicy:
- changed type hint of net_arch param: now it's a dict
MultiInputActorcriticPolicy:
- changed type hint of net_arch param: now it's a dict
MlpExtractor:
- changed type hint of net_arch param: now it's a dict
- adapted networks creation
- adapted methods: forward, forward_actor & forward_critic
* Removed shared layers in mlp_extractor
* Updated docs and changelog + reformat
* Updated custom policy tests
* Removed test on deprecation warning for share layers in mlp_extractor
Now shared layers are removed
* Update version
* Update RL Zoo doc
* Fix linter warnings
* Add ruff to Makefile (experimental)
* Add backward compat code and minor updates
* Update tests
* Add backward compatibility
* Fix test
* Improve compat code
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Remove from mypy exclude
* type hint for metadata
* Union[float, int] -> float
* Remove useless __init__
* Type hint for model and logger in BaseCallback
* Type hint for metric_dict
* Update changelog
* fix test_tensorboard
* ignore gamma type checking
* Fix monitor type hint
* Update logger type hints
* Fix type annotation and bump version
* Fix circular import
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* generalize the use of `from gym import spaces`
* command line get system info
* Documentation line length for doc
* update changelog
* add space before os plateform to avoid ref to other issue
* format
* get_system_info update in changelog
* fix type check error
* fix get system info
* add comment about regex
* update version
* Modified ActorCriticPolicy to support non-shared features extractor
* Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc
Doc update: added 'warning' on custom policy docs that says that, if the features extractor is non-shared, it's not possible to have shared layers in the mlp_extractor
* Moved attrib share_features_extractor in class
* Updated custom policy doc for non-shared features extractor
* Updated changelog
* Made some if-statements more readable if policies.py
The if-statements are related to the shared/non-shared features extractor in ActorCritic policies
* Simplify implementation and add run test
* Keep order in module gain to keep previous results consistents
* Fix test
* Improved docstring in policies.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Added some tests
* feature extractor -> features extractor
* Fix test
* Fix env_id in test
* Make features extractor parameter explicit
* Remove duplicate
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Fix support of image like normalized inputs
* Improve docstring and warning message.
* Don't check if obs is image when normalize_images is False (lil opt)
* Comment fix
* Fix normalize_images not passed to parent
* Check for subclasses too
* Remove useless multiline
* Update version and add comment
* Fix some typos
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Updated custom policy docs
Better explained how the dimensions of the mlp_extractor work, including the action net and the value net after the layers specified in net_arch.
* Improved custom policy doc
Section: Custom Network Architecture.
Explained with greater detail that an action net and a value net will be added on top of the net_arch.
* Improved custom policy doc
Section: Custom Network Architecture.
Merged a comment into a note
* Alignment
Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
* Adds deprecation warning if `eval_env` or `eval_freq` parameters are used. See #925
* added changelog entry
* added missing backtick
* deprecating `create_eval_env` parameter as well and adding comments to explain the `stacklevel` parameter used
* Updated tests to ignore DeprecationWarnings
* Updated changelog entry
* - Removed the `create_eval_env` parameter from the examples in the docs
- Removed information about the `create_eval_env` parameter from the migration docs
- Added information about deprecation of the `create_eval_env` parameter in the docs
* Add alternative in docstring
* Update docstrings
* `eval_freq` warning in docstring
* Add deprecation comments in tests
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
* Add progress bar callback and argument
* Update doc
* Update changelog
* Upgrade pytype in docker image
* Use tqdm.write in the logger to have cleaner output
* Fix logger test
* Fix when doing multiple calls to learn()
* Address comments from code-review
* create Hparam class & support in all OutputFormats
* add hparams documentation & example
* add hparam tests
* remove unnecessary test & fix name
* format changes
* support hyperparameters logging to tensorboard
* fix HParams class docstring
* use more explicit variable names
* raise error instead of warning
* Unpin protobuf
* Add test for logging hparams
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Add info on split tensorboard graphs.
* Change wording to make it look better.
* Update changelog.rst
* Rephrase and add link to issue
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback
* Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement
* Update the docs related to new StopTrainingOnNoModelImprovement callback
* Update doc
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Page modified. Following fixes are committed in response to issue#755.
- fixed the broken url on creating custom gym environment. Also added appropriate advice by citing official OpenAi gym documents.
- SB3 text tweaked.
* modified page
- updated the in-line text hyperlinks to follow Sphinx restructured text format.
* modified page
- updated the in-line text hyperlinks to follow Sphinx restructured text format.
- updated text grammar
* Language
Co-authored-by: Anssi <kaneran21@hotmail.com>
* fix Atari in CI
* fix dtype and atari extra
* Update setup.py
* remove 3.6
* note about how to install Atari
* pendulum-v1
* atari v5
* black
* fix pendulum capitalization
* add minimum version
* moved things in changelog to breaking changes
* partial v5 fix
* env update to pass tests
* mismatch env version fixed
* Fix tests after merge
* Include autorom in setup.py
* Blacken code
* Fix dtype issue in more robust way
* Fix GitLab CI: switch to Docker container with new black version
* Remove workaround from GitLab. (May need to rebuild Docker for this though.)
* Revert to v4
* Update setup.py
* Apply suggestions from code review
* Remove unnecessary autorom
* Consistent gym versions
Co-authored-by: J K Terry <justinkterry@gmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: modanesh <mohamad4danesh@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
* Add Hugging Face to SB3 doc
* Update doc + fixes
* Use SB3 model from the hub
* Bump version
* Fixes
Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>