* change timestamp to episode for logging
* update changelog
* minor format modification
* minor format modification
---------
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Fixed noise to return float32
* Updated changelog
* Fixed test to use numpy arrays instead of python floats
* Sorted imports for tests
* Added dtype to constructor
* Removed dtype parameter for VectorizedActionNoise
* __init__ -> None; Capitalize and period in docstring when needed; fix dtype type hint; dtype in docstring
* fix dtype type hint
* Update version
* Clarify changelog [skip ci]
* empty commit to run ci
* Update docs/misc/changelog.rst
---------
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* repeat_action_probability
* Add test
* Undo atari wrapper doc change since CI fails
* remove action_repeat_probability from make_atari_env
* Add sticky action wrapper and improve documentation
* Update changelog
* handle the case noop_max=0
* Update tests
* Comply with ALE implementation
* Reorder doc
* Add doc warning and don't wrap with sticky action when not needed
* fix docstring and reorder
* Move `action_repeat_probability` arg to the last position
* Add ref
* Update doc and wrap with frameskip only if needed
* Update changelog
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
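A minimal sketch of the sticky-action idea from the commits above, assuming that with probability `action_repeat_probability` the previously applied action is repeated instead of the new one, and that the action after a reset is NOOP (0). `CountingEnv` and `StickyActionSketch` are hypothetical stand-ins written for this note, not the wrapper added in the commit:

```python
import random


class CountingEnv:
    """Hypothetical stand-in environment that records the actions it receives."""

    def __init__(self):
        self.received = []

    def reset(self):
        self.received.clear()
        return 0

    def step(self, action):
        self.received.append(action)
        # (observation, reward, done, info) placeholders
        return 0, 0.0, False, {}


class StickyActionSketch:
    """Illustrative sticky-action wrapper (assumed behaviour, see note above)."""

    def __init__(self, env, action_repeat_probability, seed=None):
        self.env = env
        self.action_repeat_probability = action_repeat_probability
        self._rng = random.Random(seed)
        self._sticky_action = 0  # assumed NOOP action index

    def reset(self):
        self._sticky_action = 0
        return self.env.reset()

    def step(self, action):
        # Accept the new action only when the draw falls outside the repeat
        # probability; otherwise the previous action is applied again.
        if self._rng.random() >= self.action_repeat_probability:
            self._sticky_action = action
        return self.env.step(self._sticky_action)
```

With a probability of 0.0 every action passes through unchanged, which is why wrapping can be skipped in that case.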
* Modified actor-critic policies & MlpExtractor class
ActorCriticPolicy:
- changed type hint of net_arch param: now it's a dict
- removed the check for a non-shared features extractor: shared layers are now disallowed in the mlp_extractor regardless of the features extractor
ActorCriticCnnPolicy:
- changed type hint of net_arch param: now it's a dict
MultiInputActorCriticPolicy:
- changed type hint of net_arch param: now it's a dict
MlpExtractor:
- changed type hint of net_arch param: now it's a dict
- adapted networks creation
- adapted methods: forward, forward_actor & forward_critic
* Removed shared layers in mlp_extractor
* Updated docs and changelog + reformat
* Updated custom policy tests
* Removed test on deprecation warning for shared layers in mlp_extractor
Shared layers are now removed
* Update version
* Update RL Zoo doc
* Fix linter warnings
* Add ruff to Makefile (experimental)
* Add backward compat code and minor updates
* Update tests
* Add backward compatibility
* Fix test
* Improve compat code
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
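A rough sketch of how a dict-style `net_arch` can be split into actor and critic layer sizes once shared layers are gone. `split_net_arch` is a hypothetical helper, not the code from these commits; the handling of the older list-wrapped dict and of a plain list are assumptions about what the backward-compat path might look like:

```python
import warnings


def split_net_arch(net_arch):
    """Return (pi_layer_sizes, vf_layer_sizes) for a net_arch spec."""
    # Assumed backward-compat path: unwrap the older [dict(pi=..., vf=...)] form.
    if isinstance(net_arch, list) and len(net_arch) > 0 and isinstance(net_arch[0], dict):
        warnings.warn(
            "Pass net_arch as a dict instead of a list containing a dict.",
            DeprecationWarning,
        )
        net_arch = net_arch[0]

    if isinstance(net_arch, dict):
        # A missing key means no hidden layers for that network.
        return list(net_arch.get("pi", [])), list(net_arch.get("vf", []))

    # Assumption: a plain list gives the same (unshared) sizes to both networks.
    return list(net_arch), list(net_arch)
```

For example, `split_net_arch(dict(pi=[64, 64], vf=[32]))` yields separate size lists for the two networks, with no shared trunk.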
* Remove from mypy exclude
* type hint for metadata
* Union[float, int] -> float
* Remove useless __init__
* Type hint for model and logger in BaseCallback
* Type hint for metric_dict
* Update changelog
* fix test_tensorboard
* ignore gamma type checking
* Fix monitor type hint
* Update logger type hints
* Fix type annotation and bump version
* Fix circular import
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Updated tqdm progress bar constructor to account for the effects of train_freq/n_steps/num_envs on total_timesteps. Ensure progress bar is "flushed" on training end.
* Added description of PR #1260. Fixed formatting typo
* Partial revert
Co-authored-by: dominicgkerr <dominicgkerr1@gmail.co>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
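A small sketch of why the number of steps actually run can differ from the requested `total_timesteps`, assuming a training loop that always collects full rollouts of `n_steps * num_envs` transitions. `steps_actually_run` is a hypothetical helper for illustration, not the progress bar code from the PR:

```python
import math


def steps_actually_run(total_timesteps, n_steps, num_envs):
    """Round the requested budget up to a whole number of rollouts."""
    steps_per_rollout = n_steps * num_envs
    n_rollouts = math.ceil(total_timesteps / steps_per_rollout)
    return n_rollouts * steps_per_rollout
```

Under this assumption, a budget of 1000 steps with `n_steps=128` and `num_envs=4` runs two rollouts of 512 steps, so a bar sized to exactly 1000 would overshoot.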
* generalize the use of `from gym import spaces`
* command line get system info
* Documentation line length for doc
* update changelog
* add space before os platform to avoid ref to other issue
* format
* get_system_info update in changelog
* fix type check error
* fix get system info
* add comment about regex
* update version
* Modified ActorCriticPolicy to support non-shared features extractor
* Refactored features extraction with non-shared features extractor in ActorCriticPolicy and updated doc
Doc update: added a 'warning' to the custom policy docs stating that, if the features extractor is non-shared, the mlp_extractor cannot have shared layers
* Moved attrib share_features_extractor in class
* Updated custom policy doc for non-shared features extractor
* Updated changelog
* Made some if-statements more readable in policies.py
The if-statements are related to the shared/non-shared features extractor in ActorCritic policies
* Simplify implementation and add run test
* Keep order in module gain to keep previous results consistent
* Fix test
* Improved docstring in policies.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Added some tests
* feature extractor -> features extractor
* Fix test
* Fix env_id in test
* Make features extractor parameter explicit
* Remove duplicate
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Fix support of image like normalized inputs
* Improve docstring and warning message.
* Don't check if obs is image when normalize_images is False (small optimization)
* Comment fix
* Fix normalize_images not passed to parent
* Check for subclasses too
* Remove useless multiline
* Update version and add comment
* Fix some typos
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Replace .to(device) when possible
* fix numpy dep
* black
* Add warning for device != cpu and copy=False
* Update changelog
* Remove warning
* Update buffers.py
* Update version
* Fix type checking
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Replace .to(device) when possible
* fix numpy dep
* black
* Add warning for device != cpu and copy=False
* Update changelog
* Remove warning
* Update buffers.py
* Updated custom policy docs
Explained more clearly how the dimensions of the mlp_extractor work, including the action net and the value net added after the layers specified in net_arch.
* Improved custom policy doc
Section: Custom Network Architecture.
Explained in greater detail that an action net and a value net will be added on top of the net_arch.
* Improved custom policy doc
Section: Custom Network Architecture.
Merged a comment into a note
* Alignment
Co-authored-by: Quentin GALLOUÉDEC <gallouedec.quentin@gmail.com>
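To illustrate the dimension bookkeeping described in the doc commits above, here is a minimal sketch that lists the (input, output) sizes of each linear layer when an output layer is appended after the hidden layers from net_arch. `layer_shapes` is a hypothetical helper; the example sizes are made up, and the single-output value net is an assumption of this sketch:

```python
def layer_shapes(features_dim, hidden_sizes, output_dim):
    """List (in_features, out_features) pairs for a stack of linear layers.

    The last pair is the extra output layer (action net or value net)
    appended after the hidden layers taken from net_arch.
    """
    sizes = [features_dim, *hidden_sizes, output_dim]
    return list(zip(sizes[:-1], sizes[1:]))


# Actor: hidden layers from net_arch["pi"], then an action net of size action_dim.
actor_shapes = layer_shapes(features_dim=8, hidden_sizes=[64, 64], output_dim=2)
# Critic: hidden layers from net_arch["vf"], then a value net with one output.
critic_shapes = layer_shapes(features_dim=8, hidden_sizes=[32], output_dim=1)
```

With an empty hidden list, the only layer left is the output layer mapping the features directly to the action or value dimension.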
* Add with_bias arg
* Update changelog
* move torch_layers to the last position
* Update version
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>