* Add custom arch for off-policy actor/critic networks
* Fix type hints
* Address comments
* Make sure number of updated parameters match in polyak
* Add zip_strict for strict-length zipping
* Fix building docs
* Add test for zip strict
* Faster tests
Co-authored-by: Anssi "Miffyli" Kanervisto <kaneran21@hotmail.com>
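
For context on the `zip_strict` and polyak items above, here is a minimal sketch of strict-length zipping and of how it can guard a soft (polyak) update. It assumes the helper raises `ValueError` on a length mismatch, and the `polyak_update` shown here is a hypothetical stand-in that works on plain floats rather than tensors; the library's actual code may differ.

```python
from itertools import zip_longest
from typing import Any, Iterable, Iterator, List, Tuple


def zip_strict(*iterables: Iterable[Any]) -> Iterator[Tuple[Any, ...]]:
    """Like zip(), but raise ValueError if the iterables differ in length."""
    sentinel = object()  # unique marker that cannot appear in user data
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if any(item is sentinel for item in combo):
            raise ValueError("Iterables have different lengths")
        yield combo


def polyak_update(params: List[float], target_params: List[float], tau: float) -> None:
    """Illustrative soft update: target <- (1 - tau) * target + tau * param.

    Plain floats stand in for tensors (an assumption made for this sketch).
    """
    # Build the new values first so a length mismatch leaves the target untouched.
    updated = [(1.0 - tau) * t + tau * p for p, t in zip_strict(params, target_params)]
    target_params[:] = updated
```

With plain `zip()`, a mismatch in the number of parameters would be truncated silently; the strict variant turns it into an error.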
* Update comments and docstrings
* Rename get_torch_variables to private and update docs
* Clarify documentation on data, params and tensors
* Make excluded_save_params private and update docs
* Update get_torch_variable_names to get_torch_save_params for description
* Simplify saving code and update docs on params vs tensors
* Rename saved item tensors to pytorch_variables for clarity
* Reformat
* Fix a typo
* Add get/set_parameters, update tests accordingly
* Use f-strings for formatting
* Fix load docstring
* Reorganize functions in BaseClass
* Update changelog
* Add library version to the stored models
* Actually run isort this time
* Fix flake8 complaints and also fix testing code
* Fix isort
* ...and black
* Fix set_random_seed
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Fix type annotation in make_vec_env
The variable `vec_env_cls` is a class (a type), not an instance of DummyVecEnv or SubprocVecEnv
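
A short sketch of the distinction, assuming the annotation takes the form `Type[Union[...]]`. The two classes below are empty stand-ins and `build_vec_env` is a hypothetical helper, not the library's `make_vec_env`.

```python
from typing import Callable, List, Optional, Type, Union


class DummyVecEnv:
    """Stand-in for the real class (assumption for this sketch)."""

    def __init__(self, env_fns: List[Callable[[], object]]) -> None:
        self.env_fns = env_fns


class SubprocVecEnv(DummyVecEnv):
    """Stand-in for the real class (assumption for this sketch)."""


def build_vec_env(
    env_fns: List[Callable[[], object]],
    # The argument is the class itself, hence Type[...], not an instance.
    vec_env_cls: Optional[Type[Union[DummyVecEnv, SubprocVecEnv]]] = None,
) -> Union[DummyVecEnv, SubprocVecEnv]:
    if vec_env_cls is None:
        vec_env_cls = DummyVecEnv
    # The class is only instantiated here, inside the function.
    return vec_env_cls(env_fns)
```

Annotating the parameter as `Union[DummyVecEnv, SubprocVecEnv]` would instead tell a type checker that callers pass an already constructed environment.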
* Update changelog.rst
* Added a 'device' keyword argument to BaseAlgorithm.load().
Edited the save and load test to also test the load method with all possible devices.
Added the changes to the changelog.
* Improved the load test to ensure that the model loads to the correct device.
* Made the test more robust: if the get_device policy changes, the test will not break.
* Update tests/test_save_load.py
@araffin's suggestion during the PR process
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Update tests/test_save_load.py
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Bug fixes: when comparing devices, compare only the device type, since get_device() doesn't provide a device index.
The code now loads all model parameters from the saved state dict directly onto the requested device (fixed load_from_zip_file).
* PR fixes: an unrelated test failed when running on GPU; updated the assertion to compare only device types. Also fixed a related bug in the 'get_device()' method.
* Update changelog.rst
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
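
To illustrate why the tests compare device types only, here is a stdlib-only sketch. It assumes device strings of the form `"cpu"`, `"cuda"` or `"cuda:0"`; `device_type` and `same_device_type` are hypothetical helpers, not part of the library or of PyTorch.

```python
def device_type(device: str) -> str:
    """Return the part before the optional ':index' suffix, e.g. 'cuda:0' -> 'cuda'."""
    return device.split(":", 1)[0]


def same_device_type(a: str, b: str) -> bool:
    """Compare two device strings while ignoring the device index."""
    return device_type(a) == device_type(b)


# A requested device without an index and a resolved device with one
# differ as strings but share the same type.
requested, resolved = "cuda", "cuda:0"
print(requested == resolved)                  # strict comparison
print(same_device_type(requested, resolved))  # type-only comparison
```

A strict equality check fails as soon as one side carries an index and the other does not, which is the situation described in the bug-fix notes above.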
* Fix storing correct episode dones
* Fix number of filters in NatureCNN network
* Add TF-like RMSprop for matching performance with sb2
* Remove stuff that was accidentally included
* Reformat
* Clarify variable naming
* Update changelog
* Add comment on RMSprop implementations to A2C
* Add test for RMSpropTFLike
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
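
For readers unfamiliar with the two RMSprop flavours, a scalar sketch of the commonly cited differences follows. It assumes the TF-like variant adds epsilon inside the square root and initialises the squared-gradient average to one, while the PyTorch-style variant adds epsilon outside and starts from zero. Function names and default hyperparameters are illustrative only.

```python
import math
from typing import Tuple


def rmsprop_step_torch_style(
    param: float, grad: float, square_avg: float,
    lr: float = 0.01, alpha: float = 0.99, eps: float = 1e-8,
) -> Tuple[float, float]:
    """One step with epsilon added outside the square root."""
    square_avg = alpha * square_avg + (1.0 - alpha) * grad * grad
    param = param - lr * grad / (math.sqrt(square_avg) + eps)
    return param, square_avg


def rmsprop_step_tf_like(
    param: float, grad: float, square_avg: float,
    lr: float = 0.01, alpha: float = 0.99, eps: float = 1e-8,
) -> Tuple[float, float]:
    """One step with epsilon added inside the square root."""
    square_avg = alpha * square_avg + (1.0 - alpha) * grad * grad
    param = param - lr * grad / math.sqrt(square_avg + eps)
    return param, square_avg


# Assumed initial states: 0.0 for the PyTorch-style average, 1.0 for the TF-like one.
p_torch, _ = rmsprop_step_torch_style(0.0, 1.0, square_avg=0.0)
p_tf, _ = rmsprop_step_tf_like(0.0, 1.0, square_avg=1.0)
```

Under these assumptions, the first update with a unit gradient is roughly ten times larger in the PyTorch-style variant, which is one way the two optimizers can produce different early training behaviour.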
* Add auto formatting with black and isort
* Reformat code
* Ignore typing errors
* Add note about line length
* Add minimum version for isort
* Add commit-checks
* Update docker image
* Fixed lost import (during last merge)
* Fix opencv dependency
* Add DDPG + TD3 with any number of critics
* Allow any number of critics for SAC
* Update doc
* [ci skip] Update DDPG example
* Remove unused parameter
* Add DDPG to identity test
* Fix computation with n_critics=1,3
* Update doc
* Apply suggestions from code review
Co-authored-by: Adam Gleave <adam@gleave.me>
* Update docstrings for off-policy algos
* Add check for sde
Co-authored-by: Adam Gleave <adam@gleave.me>
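
As a rough illustration of what "any number of critics" implies for the target computation, the sketch below takes the element-wise minimum over all critics' next-state Q-values, which reduces to the usual clipped double-Q target when there are two critics. The function name, the plain-float inputs and the omission of any entropy term are assumptions made for brevity, not the library's code.

```python
from typing import List, Sequence


def min_q_target(
    next_q_per_critic: Sequence[Sequence[float]],
    rewards: Sequence[float],
    dones: Sequence[float],
    gamma: float = 0.99,
) -> List[float]:
    """TD target using the minimum over n critics: r + (1 - done) * gamma * min_k Q_k."""
    if not next_q_per_critic:
        raise ValueError("At least one critic is required")
    batch_size = len(rewards)
    if len(dones) != batch_size or any(len(q) != batch_size for q in next_q_per_critic):
        raise ValueError("All inputs must share the same batch size")
    targets = []
    for i in range(batch_size):
        # With a single critic the minimum is simply that critic's estimate.
        min_q = min(q[i] for q in next_q_per_critic)
        targets.append(rewards[i] + (1.0 - dones[i]) * gamma * min_q)
    return targets
```

Writing the reduction as a minimum over a sequence, rather than over a hard-coded pair, is what lets the same code path serve one, two or three critics.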