Commit graph

272 commits

Author SHA1 Message Date
Max Weltevrede
ef10189d80
Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination (#948)
* Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination

* Modify test to avoid unsupported buffer configuration

* Change from assertion to raising of ValueError

* Update changelog

* Update style for consistency

* Use handle_timeout_termination when possible

Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-07-04 15:08:54 +02:00
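A minimal sketch of the check described in the commit above, assuming a hypothetical `ReplayBufferSketch` class (the class name, default values, and error text are illustrative and not taken from the repository). It only shows the idea of rejecting the two options together with a `ValueError` rather than an assertion:

```python
class ReplayBufferSketch:
    """Hypothetical stand-in that only demonstrates the argument check."""

    def __init__(
        self,
        buffer_size,
        optimize_memory_usage=False,       # assumed default, for illustration
        handle_timeout_termination=True,   # assumed default, for illustration
    ):
        # The two options are treated as mutually exclusive, so fail early.
        # A ValueError is used instead of an assert because assertions are
        # skipped when Python runs with the -O flag.
        if optimize_memory_usage and handle_timeout_termination:
            raise ValueError(
                "optimize_memory_usage and handle_timeout_termination "
                "cannot both be True"
            )
        self.buffer_size = buffer_size
        self.optimize_memory_usage = optimize_memory_usage
        self.handle_timeout_termination = handle_timeout_termination


def is_rejected(**kwargs):
    """Return True if the sketch refuses the given configuration."""
    try:
        ReplayBufferSketch(100, **kwargs)
    except ValueError:
        return True
    return False
```

With this sketch, a caller who wants the memory-optimized variant has to pass `handle_timeout_termination=False` explicitly.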
Ram Rachum
d64bcb401a
Fix exception cause in base_class.py (#940) 2022-06-21 20:58:02 +01:00
Antonin RAFFIN
7ce7b6a8c2
Update defaults for offpolicy algos with features extractor (#935) 2022-06-18 10:52:52 +02:00
Antonin RAFFIN
d68f0a2411
Update doc: SB3 Contrib RecurrentPPO (#927)
* Update doc: contrib update

* Update docs/misc/changelog.rst

Co-authored-by: Anssi <kaneran21@hotmail.com>

* Address Anssi comments

Co-authored-by: Anssi <kaneran21@hotmail.com>
2022-05-31 18:11:16 +02:00
Antonin RAFFIN
4b89fbf283
Fix issues due to newer version of protobuf and sphinx (#924) 2022-05-29 21:09:50 +02:00
Antonin RAFFIN
49813d8c68
Update doc and add check for unbounded action space (#918) 2022-05-25 16:24:21 +02:00
TibiGG
2fcf8f91c1
Removed redundant double-check of nested Dict (#908)
* Removed redundant double-check of nested Dict observation space from BaseAlgorithm

* Update changelog

Co-authored-by: tibigg <tg4018@ic.ac.uk>
2022-05-09 14:36:15 +03:00
Antonin RAFFIN
0fadc94df3
Fix synchronization bug with EvalCallback (#907) 2022-05-08 21:54:34 +03:00
Thomas Rudolf
c2518dc160
Add doc to use mlflow logger (#889)
* ADD feature for mlflow logger via MLflowOutputFormat.

* Move MLFlow integration to doc

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-05-08 15:28:31 +02:00
Antonin RAFFIN
c5f0aa5de0
Update doc: PPO blog post and remark on timeouts (#896) 2022-05-01 16:26:34 +02:00
Antonin RAFFIN
a6f5049a99
Upgrade code to Python 3.7+ syntax using pyupgrade (#887)
* Upgrade code to Python 3.7+ syntax

* Update changelog
2022-04-25 13:01:38 +03:00
Bryan Collazo
3c468ff558
Update ppo documentation (remove redundant and) (#874)
* Update ppo documentation (remove redundant and)

PTAL, thanks!

* Update changelog

* Pin ale-py version

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2022-04-19 14:15:51 +02:00
Paul Scheikl
ed308a71be
Fixed unchecked None value in SubprocVecEnv (#808)
* Fixed unchecked None value in SubprocVecEnv

* Fixed unchecked None value in DummyVecEnv

* Fix formatting

* Update test and changelog

* Improve test

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-04-12 16:05:40 +02:00
Antonin RAFFIN
39a4f9379a
Escape tensorboard log name (#857)
* escape tensorboard log name

Otherwise utils does not recognize the log.

* Added fix to changelog

* Modifications made by: make commit-checks .

* Revert "Modifications made by: make commit-checks ."

This reverts commit 529a275d9475f85ef031038a8f3565f7301e5371.

* Update changelog and add test

Co-authored-by: James Hirschorn <James.Hirschorn@quantitative-technologies.com>
2022-04-11 21:49:18 +02:00
Antonin RAFFIN
248f082cdc
Bump min PyTorch version (#855) 2022-04-11 18:34:15 +02:00
Quentin Gallouédec
16703b1314
Fix HER goal selection (#848)
* Goal sampled from next_achieved_goal instead of achieved_goal

* No need to have special case for future anymore

* Update changelog

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-04-11 17:50:02 +02:00
Grégoire Passault
254bb10c42
Replacing the policy registry with policy "aliases" (#842)
* Replacing the policy registry with policy "aliases"

* Fixing import order and SAC

* Changing arg. order to be sure policy_aliases is a kwarg

* Import orders

* Removing pytype error check

* Reformat

* Fix alias import

* Not using mutable {} as default for policy_aliases

* Empty aliases initialization

* Using static attributes for policy_aliases

* Fixing isort

* Fixing back bad merge

* Running isort

* Fixing aliases for A2C and PPO

* Using f-string

* Moving policy_aliases definition position

* Adding change in the changelog

* Update version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-04-08 21:21:53 +02:00
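The alias idea from the commit above can be sketched roughly as follows. All class names ending in `Sketch` and the `_get_policy_from_name` helper are hypothetical and only illustrate the pattern the bullet points describe: a class-level (static) `policy_aliases` mapping, overridden per algorithm, instead of a global registry or a mutable `{}` default argument.

```python
class BasePolicySketch:
    """Hypothetical base policy class."""


class MlpPolicySketch(BasePolicySketch):
    """Hypothetical concrete policy class."""


class BaseAlgorithmSketch:
    # Static attribute: each subclass declares its own aliases, so no
    # mutable default argument has to be passed to __init__.
    policy_aliases = {}

    def __init__(self, policy):
        # A string is resolved through the aliases; a class is used as-is.
        if isinstance(policy, str):
            policy = self._get_policy_from_name(policy)
        self.policy_class = policy

    def _get_policy_from_name(self, policy_name):
        if policy_name in self.policy_aliases:
            return self.policy_aliases[policy_name]
        raise ValueError(f"Policy {policy_name} unknown")


class PPOSketch(BaseAlgorithmSketch):
    policy_aliases = {"MlpPolicy": MlpPolicySketch}


def resolves(algo_class, name):
    """Return True if the alias can be resolved by the given algorithm."""
    try:
        algo_class(name)
    except ValueError:
        return False
    return True
```

Because the mapping lives on the class, `PPOSketch("MlpPolicy")` and `PPOSketch(MlpPolicySketch)` end up with the same policy class, while an alias defined for one algorithm is not visible to another.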
Yifei Cheng
44e53ff811
Enable force_zip64 (#839)
* Enable force_zip64

* mark tests as expensive

* Update changelog

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-03-28 10:35:33 +02:00
Antonin RAFFIN
30772aa9f5
Release v1.5.0 (#835)
* Release v1.5.0

* Fix link
2022-03-25 14:38:22 +01:00
Grégoire Passault
00ac43b0a9
Removing dead code for handling time limits (#831)
* Removing dead code for handling time limits (see #829)

* Mentioning remove_time_limit_termination in the changelog

* Update changelog.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-03-23 13:33:55 +01:00
Yuan
009bb0549a
Update tensorboard.rst in SummaryWriterCallback (#822)
* Update tensorboard.rst

* update changelog.rst

* update changelog.rst, add username
2022-03-15 21:48:52 +01:00
Antonin RAFFIN
e88eb1c9ca
Add explanation of logger output (#803)
* Add explanation of logger output

* Apply suggestions from code review

Co-authored-by: Anssi <kaneran21@hotmail.com>

* Add example output

Co-authored-by: Anssi <kaneran21@hotmail.com>
2022-03-07 12:20:43 +01:00
Julio César Alves
cdaa9ab418
Callback to early stop the training if there is no model improvement after consecutive evaluations (#741)
* Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback

* Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement

* Update the docs related to new StopTrainingOnNoModelImprovement callback

* Update doc

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2022-02-25 11:56:47 +01:00
Quentin Gallouédec
db5366fb51
None as default value for env in HerReplayBuffer.sample + DQN batch size typing fix (#790)
* `env` to `None` by default in `HerReplayBuffer.sample` (#788)

* Fix DQN batch_size typing

* Fix changelog

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2022-02-24 15:51:01 +01:00
Quentin Gallouédec
13fcb12471
Fix normalization for DictReplayBuffer (#744)
* Normalize samples DictReplayBuffer (#743)

* Fixed sample normalization in ``DictReplayBuffer`` (#743)

* Test buffer normalization

* Rename test replay buffer

* Bump version

Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-02-23 13:04:57 +01:00
Boyuan Chen
7a01637128
Fix VecNormalization bug for Dict obs (#768)
* fix #724 VecNormalization bug for Dict obs

* update test and changelog

* Update changelog

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-02-23 12:33:41 +01:00
Costa Huang
d2ebd2eeaa
Allow PPO to turn off advantage normalization (#763)
* Allow PPO to turn off advantage normalization

* update changelog

* Add a test case

* Update test and sanity check

* Fix tests

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2022-02-22 15:29:21 +01:00
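As a rough illustration of what an advantage-normalization switch does, here is a plain-Python sketch. The function name, the `normalize_advantage` flag name, the `eps` value, and the "more than one sample" guard are assumptions made for this example; the commit message itself only states that normalization can be turned off and that a sanity check was added.

```python
import statistics


def maybe_normalize(advantages, normalize_advantage=True, eps=1e-8):
    """Return advantages rescaled to zero mean and unit std, if enabled."""
    # With a single sample the standard deviation is undefined,
    # so the values are returned unchanged in that case.
    if normalize_advantage and len(advantages) > 1:
        mean = sum(advantages) / len(advantages)
        std = statistics.stdev(advantages)  # sample standard deviation
        # eps avoids a division by zero when all advantages are equal
        return [(a - mean) / (std + eps) for a in advantages]
    return list(advantages)


normalized = maybe_normalize([1.0, 2.0, 3.0])
unchanged = maybe_normalize([1.0, 2.0, 3.0], normalize_advantage=False)
```

Turning the flag off leaves the scale of the advantages untouched, which matters when that scale carries information the user wants to keep.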
Antonin RAFFIN
7ce4bb8016
Pin gym version (#782)
* Pin gym version

* Cleanup warnings

* Reformat
2022-02-21 23:12:54 +01:00
Gianluca De Cola
58a98060f9
Update docstring on MlpExtractor. Resolves #736 (#774)
* Improve docstring on MlpExtractor.

* update changelog.
2022-02-16 01:50:17 +02:00
Gautam J
59bec30180
update docs fix indentation (#764)
* update docs fix indentation

Changed code block indentation from 2 spaces to 4 spaces for consistency.

* update changelog

* Update changelog.rst

Co-authored-by: Anssi <kaneran21@hotmail.com>
2022-02-07 21:00:53 +02:00
Manuel
40bda9a918
Remove explicit forward calls (#753)
* Remove explicit forward calls

* Changelog and commit checks.

* Reverted test forward removal for super call.

Co-authored-by: Anssi <kaneran21@hotmail.com>
2022-02-06 22:27:12 +02:00
Adam Gleave
78afcbd6d9
HumanOutputFormat: make length configurable, throw error if keys alias (#756)
* Make HumanOutputFormat length configurable and bump to 36 by default

* Add test case

* Updated changelog

* Blacken

* Blacken code

* Fix GitLab CI: switch to Docker container with new black version

* Incorporate suggestion

* Add class docstring

* Dummy commit to retrigger GitLab

Co-authored-by: Anssi <kaneran21@hotmail.com>
2022-02-05 12:57:35 +02:00
Adam Gleave
9ff26dafed
Fix changelog (#760) 2022-02-05 02:12:17 +02:00
Carlos Luis
5143cd19f7
Gym fixes - Follow up from #705 (#734)
* fix Atari in CI

* fix dtype and atari extra

* Update setup.py

* remove 3.6

* note about how to install Atari

* pendulum-v1

* atari v5

* black

* fix pendulum capitalization

* add minimum version

* moved things in changelog to breaking changes

* partial v5 fix

* env update to pass tests

* mismatch env version fixed

* Fix tests after merge

* Include autorom in setup.py

* Blacken code

* Fix dtype issue in more robust way

* Fix GitLab CI: switch to Docker container with new black version

* Remove workaround from GitLab. (May need to rebuild Docker for this though.)

* Revert to v4

* Update setup.py

* Apply suggestions from code review

* Remove unnecessary autorom

* Consistent gym versions

Co-authored-by: J K Terry <justinkterry@gmail.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: modanesh <mohamad4danesh@gmail.com>
Co-authored-by: Adam Gleave <adam@gleave.me>
2022-02-04 15:13:57 -08:00
Armand du Parc Locmaria
44dfedc061
Add furuta pendulum project to project list (#742)
* add furuta pendulum project

* Update changelog to reflect addition to docs

Co-authored-by: Anssi <kaneran21@hotmail.com>
2022-02-04 11:39:49 +02:00
Antonin RAFFIN
54bcfa4544
Add Hugging Face integration to SB3 doc (#733)
* Add Hugging Face to SB3 doc

* Update doc + fixes

* Use SB3 model from the hub

* Bump version

* Fixes

Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>
2022-01-20 10:04:12 +01:00
Paul Scheikl
fc41600225
Fixed logging info_keywords in the VecMonitor class. (#730)
* Writing the additional info_keywords into the episode infos that are passed to the results writer. Directly taken from the non-vec version of monitor.

* Added test for monitoring info_keywords.

* Removed unnecessary step of registering the env. Not using make_vec_env, because it applies a monitor wrapper to the env.

* Reformat

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-01-19 17:17:22 +01:00
Antonin RAFFIN
21f6a474a4
Release 1.4.0 (#729)
* Release 1.4.0

* Add integration section in the readme
2022-01-19 11:16:15 +01:00
Antonin RAFFIN
cd6e04705b
Update SB3 Contrib doc (ARS) and W&B integration (#726)
* Add ARS to SB3 contrib

* Add integration page
2022-01-18 15:10:25 +01:00
Antonin RAFFIN
e9a8979022
Add copy and combine method to running mean std (#716)
* Add copy and combine method to running mean std

* Update test

* Faster test

* Update test

* Update test

* Shift values in RMS test
2022-01-06 01:31:04 +02:00
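One common way to give running statistics a copy and a combine operation is the parallel mean/variance merge formula. The scalar class below is a hypothetical sketch of that technique (the name `RunningMeanVar`, the zero initial count, and the private `_merge` helper are assumptions for illustration, not the repository's implementation):

```python
class RunningMeanVar:
    """Scalar running mean and population variance (illustrative sketch)."""

    def __init__(self, mean=0.0, var=0.0, count=0):
        self.mean, self.var, self.count = mean, var, count

    def copy(self):
        # Independent object: later updates do not affect the original.
        return RunningMeanVar(self.mean, self.var, self.count)

    def update(self, batch):
        # Summarize the batch, then merge its moments into the running ones.
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        self._merge(batch_mean, batch_var, n)

    def combine(self, other):
        # Merge the statistics of another tracker into this one.
        self._merge(other.mean, other.var, other.count)

    def _merge(self, mean, var, count):
        # Parallel-variance merge; assumes the combined count is non-zero.
        total = self.count + count
        delta = mean - self.mean
        m2 = (
            self.var * self.count
            + var * count
            + delta**2 * self.count * count / total
        )
        self.mean = self.mean + delta * count / total
        self.var = m2 / total
        self.count = total


data = [float(i) for i in range(10)]
first, second = RunningMeanVar(), RunningMeanVar()
first.update(data[:4])
second.update(data[4:])
merged = first.copy()
merged.combine(second)
```

Here `merged` holds the statistics of all ten values, while `first` still describes only the first four, which is the point of copying before combining.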
IperGiove
d9e198e04f
Update custom_policy.rst (#711)
* Update custom_policy.rst

Added methods forward_actor and forward_critic in CustomNetwork class.

* Update doc

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-01-03 16:22:58 +01:00
Thomas Gubler
c895c1d46f
Doc fix: A2C - fix guidance on RMSpropTFLike (#708)
* doc: A2C/migration: fix guidance on RMSpropTFLike

* Update changelog.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2021-12-30 11:28:12 +01:00
Antonin RAFFIN
4a5dfaedfc
Update SB3 contrib doc (+ fix backward compat) (#707)
* Fix `VecNormalize` load for SB3<= 1.3.0

* Update SB3 contrib doc

* Bump version
2021-12-29 14:25:09 +01:00
Antonin RAFFIN
bb16645c4e
Add skip option for VecTransposeImage and bug fix in frame stack (#700)
* Update doc

* Add comment

* Add skip option to VecTransposeImage and fix bug in frame stack
2021-12-23 17:12:49 +02:00
Quentin Gallouédec
d496cd4d95
Consistent use of device as keyword argument (#702)
* consistent device as keyword arg

* Fixed ``device`` arg inconsistency in changelog
2021-12-22 11:43:59 +01:00
Demetrio92
798b16aaf7
more verbose documentation regarding .load vs .set_parameters (#696)
* more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614)

* add a note to explain the difference between `.load` and `.set_parameters` to the examples

* fix typos

Co-authored-by: Anssi <kaneran21@hotmail.com>
2021-12-18 17:28:37 +02:00
hsuehch
222a69ca49
Eliminate extra empty lines in CSV monitor files on Windows (DLR-RM#692) (#695)
* Added ``newline="\n"`` when opening CSV monitor files so that each line ends with ``\r\n`` instead of ``\r\r\n`` on Windows while Linux environments are not affected
2021-12-18 16:04:33 +02:00
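The newline behaviour described in the commit above can be reproduced with the standard library alone. In this sketch the function name, field names, and file name are made up for illustration; the relevant part is the `newline="\n"` argument, which disables newline translation on write so that the `\r\n` emitted by the `csv` module is stored as-is.

```python
import csv
import os
import tempfile


def write_monitor_rows(path, rows):
    """Write rows to a CSV file without platform newline translation."""
    # Without the newline argument, Windows would translate the "\n" inside
    # the csv module's "\r\n" terminator into "\r\n", producing "\r\r\n".
    with open(path, "w", newline="\n") as file_handler:
        writer = csv.DictWriter(file_handler, fieldnames=("r", "l", "t"))
        writer.writeheader()
        for row in rows:
            writer.writerow(row)


tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "example.monitor.csv")
write_monitor_rows(csv_path, [{"r": 1.0, "l": 10, "t": 0.5}])

# Read the raw bytes back to inspect the actual line endings.
with open(csv_path, "rb") as raw_file:
    raw_bytes = raw_file.read()
```

Reading the file in binary mode is what makes the line endings visible; text mode would normalize them again on read.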
Antonin RAFFIN
e24147390d
Improve tests and add check for float32 (#686)
* Add additional checks

* Improve tests and error message

* Update changelog

* Bump version

* Update doc

* Add tests for action space

* Improve test
2021-12-09 14:14:33 +02:00
Antonin RAFFIN
77f4f5021d
Drop Python 3.6 support (#685)
* Drop python 3.6 support

* Update doc

* Update gitlab CI

* Update doc env

* Fix gitlab CI
2021-12-06 12:54:43 +01:00
Antonin RAFFIN
507ed1762e
Multiprocessing support for off policy algorithms (#439)
* Add multi-env training support for SAC

* Fix for dict obs

* Pytype fixes

* Fix assert on number of envs

* Remove for loop

* Add support for Dict obs

* Start cleanup

* Update doc and bug fix

* Add support for vectorized action noise and add multi env example for off-policy

* Update version

* Bug fix with VecNormalize

* Update README table

* Update variable names

* Update changelog and version

* Update doc and fix for `gradient_steps=-1`

* Add test for `gradient_steps=-1`

* Disable pytype pyi errors

* Fix for DQN

* Update comment on deepcopy

* Remove episode_reward field

* Fix RolloutReturn

* Avoid modification by reference

* Fix error message

Co-authored-by: Anssi <kaneran21@hotmail.com>
2021-12-01 22:30:09 +01:00