* include `running_mean` and `running_var` when updating target networks in DQN, SAC, TD3.
* Update stable_baselines3/common/utils.py
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Precompute batch norm parameters in `_setup_model` and directly copy them in the target update.
* Fix `DictReplayBuffer.next_observations` type (#1013)
* Fix DictReplayBuffer.next_observations type
* Update changelog
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
* Fixed missing verbose parameter passing (#1011)
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Support for `device=auto` buffers and set it as default value (#1009)
* Default device is "auto" for buffer + auto device support in BufferBaseClass
* Update docstring
* Update tests
* Unify tests
* Update changelog
* Fix tests on CUDA device
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
* Update test
* Add comments and update tests
* Bump version
* Remove one extra space to conform code style.
* Update docstrings
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Burak Demirbilek <BurakDmb@users.noreply.github.com>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
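
The batch norm change above can be illustrated with a minimal sketch. It is not the library's code: it uses plain dictionaries of floats instead of tensors, and the helper names `soft_update` and `copy_running_stats`, as well as the toy values, are hypothetical. It assumes that learnable parameters are blended with a coefficient `tau` while the running statistics are copied unchanged.

```python
def soft_update(params, target_params, tau):
    """Blend each target value toward the online value, in place:
    target <- tau * online + (1 - tau) * target."""
    for name, value in params.items():
        target_params[name] = tau * value + (1.0 - tau) * target_params[name]


def copy_running_stats(stats, target_stats):
    """Copy batch norm running statistics as-is (no blending)."""
    for name, value in stats.items():
        target_stats[name] = value


# Hypothetical toy "networks": learnable parameters and batch norm statistics
online_params = {"weight": 1.0, "bias": 0.5}
target_params = {"weight": 0.0, "bias": 0.0}
online_stats = {"running_mean": 2.0, "running_var": 4.0}
target_stats = {"running_mean": 0.0, "running_var": 1.0}

soft_update(online_params, target_params, tau=0.5)
copy_running_stats(online_stats, target_stats)

print(target_params)  # parameters moved halfway toward the online values
print(target_stats)   # statistics now equal the online statistics
```

Without the second call, the target network would keep its initial running statistics and normalize its inputs differently from the online network.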
* fix Atari in CI
* fix dtype and atari extra
* Update setup.py
* remove 3.6
* note about how to install Atari
* pendulum-v1
* atari v5
* black
* fix pendulum capitalization
* add minimum version
* moved things in changelog to breaking changes
* partial v5 fix
* env update to pass tests
* mismatch env version fixed
* Fix tests after merge
* Include autorom in setup.py
* Blacken code
* Fix dtype issue in more robust way
* Fix GitLab CI: switch to Docker container with new black version
* Remove workaround from GitLab. (May need to rebuild Docker for this though.)
* Revert to v4
* Update setup.py
* Apply suggestions from code review
* Remove unnecessary autorom
* Consistent gym versions
Co-authored-by: J K Terry <justinkterry@gmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: modanesh <mohamad4danesh@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
* make sure DQN policy is always in correct mode - train or eval
* make set_training_mode an abstract method of the base policy - safer
* update docstring of _build method to note that the target network is put into eval mode
* use set_training_mode to put the dqn target network into eval mode
* use set_training_mode to set the training mode of the q-network
* move set_training_mode abstract method from BasePolicy to BaseModel
* set train and eval mode for TD3
* make sure critic is always in correct mode during train
* set train and eval mode for SAC
* add comment re batch norm and dropout
* set train and eval mode for A2C and PPO
* add tests for collect rollouts with batch norm
* fix formatting
* update change log
* update version
* remove Optional typing for batch size - causing type check to fail
* Fix scipy dependency for toy text envs
* implement set_training_mode method in BaseModel
* move all tests of train/eval mode to test_train_eval_mode
* call learn with learning_starts = total_timesteps to test that collect_rollouts does not update batch norm
* remove extra calls to set_training_mode in train method of TD3 and SAC
* Allow gradient_steps=0
* Refactor tests
* Add comment + use aliases
* Typos
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
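
Why the train/eval switch matters for rollout collection can be shown with a small pure-Python sketch. `RunningMeanLayer` and `ToyPolicy` are hypothetical stand-ins, not library classes; only the method name `set_training_mode` comes from the commits above. The sketch assumes a layer that, like batch norm, updates a running statistic only in training mode.

```python
class RunningMeanLayer:
    """Toy stand-in for batch norm: tracks a running mean in training mode only."""

    def __init__(self, momentum=0.5):
        self.training = True
        self.momentum = momentum
        self.running_mean = 0.0

    def forward(self, batch):
        batch_mean = sum(batch) / len(batch)
        if self.training:
            # Training mode: update the running statistic and use batch statistics
            self.running_mean = (
                (1 - self.momentum) * self.running_mean + self.momentum * batch_mean
            )
            return [x - batch_mean for x in batch]
        # Eval mode: use the stored statistic and leave it untouched
        return [x - self.running_mean for x in batch]


class ToyPolicy:
    """Hypothetical policy exposing a single switch for train/eval mode."""

    def __init__(self):
        self.layer = RunningMeanLayer()

    def set_training_mode(self, mode):
        self.layer.training = mode


policy = ToyPolicy()

# Collecting rollouts: eval mode, so the running mean must not change
policy.set_training_mode(False)
policy.layer.forward([1.0, 3.0])
stat_after_rollout = policy.layer.running_mean

# Gradient step: training mode, so the running mean is updated
policy.set_training_mode(True)
policy.layer.forward([1.0, 3.0])
stat_after_train = policy.layer.running_mean

print(stat_after_rollout, stat_after_train)
```

The tests mentioned above check the same property on the real algorithms: collecting rollouts alone should leave the batch norm statistics unchanged.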