* Updated type hint and extended docstring in make_vec_env
The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature.
Extended the description of the wrapper_class parameter with a link to a Github issue containing more details on the matter.
* Updated type hint in make_atari_env
The function itself was already working with callables, but it wasn't considerent in the type hint of the function's signature.
* Updated docstring in make_atari_env
When modifying the type hint of the parameter 'env_id' (in this commit: fda6872f73c11075901ba88f2520f6316f818d1d), I forgot to update its description in the docstrig.
Doing it now.
* Removed redundant type in env_id's type hint in make_vec_env and make_atari_env
Callable[..., gym.Env] already includes Type[gym.Env], as pointed out here: https://github.com/DLR-RM/stable-baselines3/pull/1085#issuecomment-1269685218
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Updated docstring from n_steps to n_rollout_steps
This must be a typo
* Fixed typo in a comment in ppo.py
* Update changelog
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Fix return type for load, learn in BaseAlgorithm
* Update changelog
* Add typing extensions to dependencies
* Import directly from typing for python >3.11
* Reorder changelog to reflect merge order
* Roll back to typevar solution
* Updated changelog
* Remove typing extensions requirement
* Update base_class.py
* Remove final point in changelog
* Additional type fixes across project
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Fix loading with new `n_envs`
* Update tests
* Update changelog
* Fix the fix
* Remove `self._setup_model()` from `set_env()`
* Raise `AssertionError` when setting env with a different `n_envs`
* Update unitests
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Fix replay_buffer_class type annotation
* Update changelog
* Further replacement of same type annotation issue
* Formatting
* Rolled back formatting changes for consistency
* Added option to override or use existing CSVs
* Updated changelog for Monitor override
* Changed default value to override
* Simplify code and add test
* Update version
* Fix for pytype
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* fix nan in advnatages with batch size 1, for ppo
* changelog
* black
* Simplify test
* Bump version
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3.
* Update stable_baselines3/common/utils.py
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Precompute batch norm parameters in `_setup_model` and directly copy them in the target update.
* include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3.
* Update stable_baselines3/common/utils.py
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Precompute batch norm parameters in `_setup_model` and directly copy them in the target update.
* Fix `DictReplayBuffer.next_observations` type (#1013)
* Fix DictReplayBuffer.next_observations type
* Update changelog
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Fixed missing verbose parameter passing (#1011)
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Support for `device=auto` buffers and set it as default value (#1009)
* Default device is "auto" for buffer + auto device support in BufferBaseClass
* Update docstring
* Update tests
* Unify tests
* Update changelog
* Fix tests on CUDA device
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Precompute batch norm parameters in `_setup_model` and directly copy them in the target update.
* Update test
* Add comments and update tests
* Bump version
* Remove one extra space to conform code style.
* Update docstrings
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Burak Demirbilek <BurakDmb@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* create Hparam class & support in all OutputFormats
* add hparams documentation & example
* add hparam tests
* remove unnecessary test & fix name
* format changes
* support hyperparameters logging to tensorboard
* fix HParams class docstring
* use more explicit variable names
* raise error instead of warning
* Unpin protobuf
* Add test for logging hparams
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Handle non 1D action shape
* Revert changes of observation (out of the scope of this PR)
* Apply changes to DictReplayBuffer
* Update tests
* Rollout buffer n-D actions space handling
* Remove error when non 1D action space
* ActorCriticPolicy return action with the proper shape
* remove useless reshape
* Update changelog
* Add tests
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Add info on split tensorboard graphs.
* Change wording to make it look better.
* Update changelog.rst
* Rephrase and add link to issue
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* Use higher resolution time and round up to eps
* Update changelog
* Add test case
* Fix formatting, time()->time_ns
* Bugfix: ns is integer not float
* Move test to better place
* Divide by 1e9 earlier
* `arr[0]` to `arr.squeeze(0)`
* `squeeze(axis=0)` to `squeeze(0)`
* Type testing
* Add type test for unvectorized observation
* `squeeze(0)` to `squeeze(axis=0)`
* Treatment of the laziness symptoms
* Update changelog
* Udate changelog
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Prohibit simultaneous use of optimize_memory_buffer and handle_timeout_termination
* Modify test to avoid unsupported buffer configuration
* Change from assertion to raising of ValueError
* Update changelog
* Update style for consistency
* Use handle_timeout_termination when possible
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* Fixed unchecked None value in SubprocVecEnv
* Fixed unchecked None value in DummyVecEnv
* Fix formatting
* Update test and changelog
* Improve test
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* escape tensorboard log name
Otherwise utils does not recognize the log.
* Added fix to changelog
* Modifications made by: make commit-checks .
* Revert "Modifications made by: make commit-checks ."
This reverts commit 529a275d9475f85ef031038a8f3565f7301e5371.
* Update changelog and add test
Co-authored-by: James Hirschorn <James.Hirschorn@quantitative-technologies.com>
* Goal sampled from next_achieved_goal instead of achived_goal
* No need to have special case for future anymore
* Update changelog
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
* Replacing the policy registry with policy "aliases"
* Fixing import order and SAC
* Changing arg. order to be sure policy_aliases is a kwarg
* Import orders
* Removing pytype error check
* Reformat
* Fix alias import
* Not using mutable {} as default for policy_aliases
* Empty aliases initialization
* Using static attributes for policy_aliases
* Fixing isort
* Fixing back bad merge
* Running isort
* Fixing aliases for A2C and PPO
* Using f-string
* Moving policy_aliases definition position
* Adding change in the changelog
* Update version
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>