Commit graph

11 commits

Author SHA1 Message Date
Antonin RAFFIN
0fc0dd1b21
Fix off policy features extractor (#198)
* Faster tests

* Fix feature extractor bug + add check

* Add missing check

* Allow TD3 features extractor to be separate

* Add share features extractor option for SAC

* Bug fixes

* Apply suggestions from code review

Co-authored-by: Adam Gleave <adam@gleave.me>

Co-authored-by: Adam Gleave <adam@gleave.me>
2020-10-27 14:24:59 +01:00
Anssi
19c1a89a3a
Rename cmd_util to env_util (#197)
* Rename cmd_util to env_util

* Fix docs and add missing newline

* Address comments
2020-10-22 11:05:52 +02:00
Wilson
fe6ade3089
Allow env_kwargs in make_vec_env when env ID string supplied (#189)
* Allow env_kwargs in make_vec_env when env ID string supplied

Resolves #188

* Update docs/misc/changelog.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

* Add test for env kargs in make_vec_env

* remove unnecessary args in test_vec_env_kwargs function

* Fixes and reformat

* Doc fix

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2020-10-16 11:09:19 +02:00
Antonin RAFFIN
a1e055695c
Improve typing coverage (#175)
* Improve typing coverage

* Even more types

* Fixes

* Update changelog

* Unified docstrings

* Improve error messages for unsupported spaces
2020-10-07 10:51:49 +02:00
Antonin RAFFIN
55912576ed
Cleanup docstring types (#169)
* Cleanup docstring types

* Update style

* Test with js hack

* Revert "Test with js hack"

This reverts commit d091f438e8851ab8d01b66628e06a104f5e5ec69.

* Fix types

* Fix typo

* Update CONTRIBUTING example
2020-10-02 20:05:55 +03:00
Wilson
e908583e2a
Fix type annotation in make_vec_env (#162)
* Fix type annotation in make_vec_env

The variable `vec_env_cls` is a type and not an instance of either DummyVecEnv or SubprocVecEnv

* Update changelog.rst
2020-09-23 10:34:35 +02:00
Antonin RAFFIN
23afedb254
Auto-formatting with black and isort (#97)
* Add auto formatting with black and isort

* Reformat code

* Ignore typing errors

* Add note about line length

* Add minimum version for isort

* Add commit-checks

* Update docker image

* Fixed lost import (during last merge)

* Fix opencv dependency
2020-07-16 16:12:16 +02:00
Noah
96b771f24e
Implement DQN (#28)
* Created DQN template according to the paper.
Next steps:
- Create Policy
- Complete Training
- Debug

* Changed Base Class

* refactor save, to be consistence with overriding the excluded_save_params function. Do not try to exclude the parameters twice.

* Added simple DQN policy

* Finished learn and train function
- missing correct loss computation

* changed collect_rollouts to work with discrete space

* moved discrete space collect_rollouts to dqn

* basic dqn working

* deleted SDE related code

* added gradient clipping and moved greedy policy to policy

* changed policy to implement target network
and added soft update(in fact standart tau is 1 so hard update)

* fixed policy setup

* rebase target_update_intervall on _n_updates

* adapted all tests
all tests passing

* Move to stable-baseline3

* Fixes for DQN

* Fix tests + add CNNPolicy

* Allow any optimizer for DQN

* added some util functions to create a arbitrary linear schedule, fixed pickle problem with old exploration schedule

* more documentation

* changed buffer dtype

* refactor and document

* Added Sphinx Documentation
Updated changelog.rst

* removed custom collect_rollouts as it is no longer necessary

* Implemented suggestions to clean code and documentation.

* extracted some functions on tests to reduce duplicated code

* added support for exploration_fraction

* Fixed exploration_fraction

* Added documentation

* Fixed get_linear_fn -> proper progress scaling

* Merged master

* Added nature reference

* Changed default parameters to https://www.nature.com/articles/nature14236/tables/1

* Fixed n_updates to be incremented correctly

* Correct train_freq

* Doc update

* added special parameter for DQN in tests

* different fix for test_discrete

* Update docs/modules/dqn.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

* Update docs/modules/dqn.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

* Update docs/modules/dqn.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

* Added RMSProp in optimizer_kwargs, as described in nature paper

* Exploration fraction is inverse of 50.000.000 (total frames) / 1.000.000 (frames with linear schedule) according to nature paper

* Changelog update for buffer dtype

* standard exlude parameters should be always excluded to assure proper saving only if intentionally included by ``include`` parameter

* slightly more iterations on test_discrete to pass the test

* added param use_rms_prop instead of mutable default argument

* forgot alpha

* using huber loss, adam and learning rate 1e-4

* account for train_freq in update_target_network

* Added memory check for both buffers

* Doc updated for buffer allocation

* Added psutil Requirement

* Adapted test_identity.py

* Fixes with new SB3 version

* Fix for tensorboard name

* Convert assert to warning and fix tests

* Refactor off-policy algorithms

* Fixes

* test: remove next_obs in replay buffer

* Update changelog

* Fix tests and use tmp_path where possible

* Fix sampling bug in buffer

* Do not store next obs on episode termination

* Fix replay buffer sampling

* Update comment

* moved epsilon from policy to model

* Update predict method

* Update atari wrappers to match SB2

* Minor edit in the buffers

* Update changelog

* Merge branch 'master' into dqn

* Update DQN to new structure

* Fix tests and remove hardcoded path

* Fix for DQN

* Disable memory efficient replay buffer by default

* Fix docstring

* Add tests for memory efficient buffer

* Update changelog

* Split collect rollout

* Move target update outside `train()` for DQN

* Update changelog

* Update linear schedule doc

* Cleanup DQN code

* Minor edit

* Update version and docker images

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2020-06-29 11:16:54 +02:00
Antonin RAFFIN
54f6f5b6fb
Add flake8 linter and Github CI (#19)
* Cleanup code

* Add flake8 lint and github workflow

* Update build matrix

* Relax precision for python3.7
2020-05-12 17:55:01 +02:00
Antonin RAFFIN
8a61913a1d Update doc 2020-05-08 13:09:38 +02:00
Antonin RAFFIN
8046a24719 More doc + sync VecEnvs + atari 2020-05-07 16:08:23 +02:00