stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-18 21:30:19 +00:00

Author	SHA1	Message	Date
Max Weltevrede	ef10189d80	Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination (#948 ) * Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination * Modify test to avoid unsupported buffer configuration * Change from assertion to raising of ValueError * Update changelog * Update style for consistency * Use handle_timeout_termination when possible Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-04 15:08:54 +02:00
Ram Rachum	d64bcb401a	Fix exception cause in base_class.py (#940 )	2022-06-21 20:58:02 +01:00
Antonin RAFFIN	7ce7b6a8c2	Update defaults for offpolicy algos with features extractor (#935 )	2022-06-18 10:52:52 +02:00
Antonin RAFFIN	d68f0a2411	Update doc: SB3 Contrib RecurrentPPO (#927 ) * Update doc: contrib update * Update docs/misc/changelog.rst Co-authored-by: Anssi <kaneran21@hotmail.com> * Address Anssi comments Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-05-31 18:11:16 +02:00
Antonin RAFFIN	4b89fbf283	Fix issues due to newer version of protobuf and sphinx (#924 )	2022-05-29 21:09:50 +02:00
Antonin RAFFIN	49813d8c68	Update doc and add check for unbounded action space (#918 )	2022-05-25 16:24:21 +02:00
TibiGG	2fcf8f91c1	Removed redundant double-check of nested Dict (#908 ) * Removed redundant double-check of nested Dict observation space from BaseAlgorithm * Update changelog Co-authored-by: tibigg <tg4018@ic.ac.uk>	2022-05-09 14:36:15 +03:00
Antonin RAFFIN	0fadc94df3	Fix synchronization bug with EvalCallback (#907 )	2022-05-08 21:54:34 +03:00
Thomas Rudolf	c2518dc160	Add doc to use mlflow logger (#889 ) * ADD feature for mlflow logger via MLflowOutputFormat. * Move MLFlow integration to doc Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-05-08 15:28:31 +02:00
Antonin RAFFIN	c5f0aa5de0	Update doc: PPO blog post and remark on timeouts (#896 )	2022-05-01 16:26:34 +02:00
Antonin RAFFIN	a6f5049a99	Upgrade code to Python 3.7+ syntax using `pyupgrade` (#887 ) * Upgrade code to Python 3.7+ syntax * Update changelog	2022-04-25 13:01:38 +03:00
Bryan Collazo	3c468ff558	Update ppo documentation (remove redundant and) (#874 ) * Update ppo documentation (remove redundant and) PTAL, thanks! * Update changelog * Pin ale-py version Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-04-19 14:15:51 +02:00
Paul Scheikl	ed308a71be	Fixed unchecked None value in SubprocVecEnv (#808 ) * Fixed unchecked None value in SubprocVecEnv * Fixed unchecked None value in DummyVecEnv * Fix formatting * Update test and changelog * Improve test Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-04-12 16:05:40 +02:00
Antonin RAFFIN	39a4f9379a	Escape tensorboard log name (#857 ) * escape tensorboard log name Otherwise utils does not recognize the log. * Added fix to changelog * Modifications made by: make commit-checks . * Revert "Modifications made by: make commit-checks ." This reverts commit 529a275d9475f85ef031038a8f3565f7301e5371. * Update changelog and add test Co-authored-by: James Hirschorn <James.Hirschorn@quantitative-technologies.com>	2022-04-11 21:49:18 +02:00
Antonin RAFFIN	248f082cdc	Bump min PyTorch version (#855 )	2022-04-11 18:34:15 +02:00
Quentin Gallouédec	16703b1314	Fix HER goal selection (#848 ) * Goal sampled from next_achieved_goal instead of achieved_goal * No need to have special case for future anymore * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-11 17:50:02 +02:00
Grégoire Passault	254bb10c42	Replacing the policy registry with policy "aliases" (#842 ) * Replacing the policy registry with policy "aliases" * Fixing import order and SAC * Changing arg. order to be sure policy_aliases is a kwarg * Import orders * Removing pytype error check * Reformat * Fix alias import * Not using mutable {} as default for policy_aliases * Empty aliases initialization * Using static attributes for policy_aliases * Fixing isort * Fixing back bad merge * Running isort * Fixing aliases for A2C and PPO * Using f-string * Moving policy_aliases definition position * Adding change in the changelog * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-08 21:21:53 +02:00
Yifei Cheng	44e53ff811	Enable force_zip64 (#839 ) * Enable force_zip64 * mark tests as expensive * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-03-28 10:35:33 +02:00
Antonin RAFFIN	30772aa9f5	Release v1.5.0 (#835 ) * Release v1.5.0 * Fix link	2022-03-25 14:38:22 +01:00
Grégoire Passault	00ac43b0a9	Removing dead code for handling time limits (#831 ) * Removing dead code for handling time limits (see #829) * Mentioning remove_time_limit_termination in the changelog * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-03-23 13:33:55 +01:00
Yuan	009bb0549a	Update tensorboard.rst in SummaryWriterCallback (#822 ) * Update tensorboard.rst * update changelog.rst * update changelog.rst, add username	2022-03-15 21:48:52 +01:00
Antonin RAFFIN	e88eb1c9ca	Add explanation of logger output (#803 ) * Add explanation of logger output * Apply suggestions from code review Co-authored-by: Anssi <kaneran21@hotmail.com> * Add example output Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-03-07 12:20:43 +01:00
Julio César Alves	cdaa9ab418	Callback to early stop the training if there is no model improvement after consecutive evaluations (#741 ) * Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback * Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement * Update the docs related to new StopTrainingOnNoModelImprovement callback * Update doc Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-02-25 11:56:47 +01:00
Quentin Gallouédec	db5366fb51	`None` as default value for `env` in `HerReplayBuffer.sample` + `DQN` batch size typing fix (#790 ) * `env` to `None` by default in `HerReplayBuffer.sample` (#788) * Fix DQN batch_size typing * Fix changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-02-24 15:51:01 +01:00
Quentin Gallouédec	13fcb12471	Fix normalization for `DictReplayBuffer` (#744 ) * Normalize samples DictReplayBuffer (#743) * Fixed sample normalization in ``DictReplayBuffer`` (#743) * Test buffer normalization * Rename test replay buffer * Bump version Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-02-23 13:04:57 +01:00
Boyuan Chen	7a01637128	Fix VecNormalization bug for Dict obs (#768 ) * fix #724 VecNormalization bug for Dict obs * update test and changelog * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-02-23 12:33:41 +01:00
Costa Huang	d2ebd2eeaa	Allow PPO to turn off advantage normalization (#763 ) * Allow PPO to turn off advantage normalization * update changelog * Add a test case * Update test and sanity check * Fix tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-02-22 15:29:21 +01:00
Antonin RAFFIN	7ce4bb8016	Pin gym version (#782 ) * Pin gym version * Cleanup warnings * Reformat	2022-02-21 23:12:54 +01:00
Gianluca De Cola	58a98060f9	Update docstring on MlpExtractor. Resolves #736 (#774 ) * Improve docstring on MlpExtractor. * update changelog.	2022-02-16 01:50:17 +02:00
Gautam J	59bec30180	update docs fix indentation (#764 ) * update docs fix indentation Changed code block indentation from 2 spaces to 4 spaces for consistency. * update changelog * Update changelog.rst Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-07 21:00:53 +02:00
Manuel	40bda9a918	Remove explicit forward calls (#753 ) * Remove explicit forward calls * Changelog and commit checks. * Reverted test forward removal for super call. Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-06 22:27:12 +02:00
Adam Gleave	78afcbd6d9	HumanOutputFormat: make length configurable, throw error if keys alias (#756 ) * Make HumanOutputFormat length configurable and bump to 36 by default * Add test case * Updated changelog * Blacken * Blacken code * Fix GitLab CI: switch to Docker container with new black version * Incorporate suggestion * Add class docstring * Dummy commit to retrigger GitLab Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-05 12:57:35 +02:00
Adam Gleave	9ff26dafed	Fix changelog (#760 )	2022-02-05 02:12:17 +02:00
Carlos Luis	5143cd19f7	Gym fixes - Follow up from #705 (#734 ) * fix Atari in CI * fix dtype and atari extra * Update setup.py * remove 3.6 * note about how to install Atari * pendulum-v1 * atari v5 * black * fix pendulum capitalization * add minimum version * moved things in changelog to breaking changes * partial v5 fix * env update to pass tests * mismatch env version fixed * Fix tests after merge * Include autorom in setup.py * Blacken code * Fix dtype issue in more robust way * Fix GitLab CI: switch to Docker container with new black version * Remove workaround from GitLab. (May need to rebuild Docker for this though.) * Revert to v4 * Update setup.py * Apply suggestions from code review * Remove unnecessary autorom * Consistent gym versions Co-authored-by: J K Terry <justinkterry@gmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: modanesh <mohamad4danesh@gmail.com> Co-authored-by: Adam Gleave <adam@gleave.me>	2022-02-04 15:13:57 -08:00
Armand du Parc Locmaria	44dfedc061	Add furuta pendulum project to project list (#742 ) * add furuta pendulum project * Update changelog to reflect addition to docs Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-04 11:39:49 +02:00
Antonin RAFFIN	54bcfa4544	Add Hugging Face integration to SB3 doc (#733 ) * Add Hugging Face to SB3 doc * Update doc + fixes * Use SB3 model from the hub * Bump version * Fixes Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>	2022-01-20 10:04:12 +01:00
Paul Scheikl	fc41600225	Fixed logging info_keywords in the VecMonitor class. (#730 ) * Writing the additional info_keywords into the episode infos that are passed to the results writer. Directly taken from the non-vec version of monitor. * Added test for monitoring info_keywords. * Removed unnecessary step of registering the env. Not using make_vec_env, because it applies a monitor wrapper to the env. * Reformat Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-01-19 17:17:22 +01:00
Antonin RAFFIN	21f6a474a4	Release 1.4.0 (#729 ) * Release 1.4.0 * Add integration section in the readme	2022-01-19 11:16:15 +01:00
Antonin RAFFIN	cd6e04705b	Update SB3 Contrib doc (ARS) and W&B integration (#726 ) * Add ARS to SB3 contrib * Add integration page	2022-01-18 15:10:25 +01:00
Antonin RAFFIN	e9a8979022	Add copy and combine method to running mean std (#716 ) * Add copy and combine method to running mean std * Update test * Faster test * Update test * Update test * Shift values in RMS test	2022-01-06 01:31:04 +02:00
IperGiove	d9e198e04f	Update custom_policy.rst (#711 ) * Update custom_policy.rst Added methods forward_actor and forward_critic in CustomNetwork class. * Update doc Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-01-03 16:22:58 +01:00
Thomas Gubler	c895c1d46f	Doc fix: A2C - fix guidance on RMSpropTFLike (#708 ) * doc: A2C/migration: fix guidance on RMSpropTFLike * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-12-30 11:28:12 +01:00
Antonin RAFFIN	4a5dfaedfc	Update SB3 contrib doc (+ fix backward compat) (#707 ) * Fix `VecNormalize` load for SB3<= 1.3.0 * Update SB3 contrib doc * Bump version	2021-12-29 14:25:09 +01:00
Antonin RAFFIN	bb16645c4e	Add `skip` option for `VecTransposeImage` and bug fix in frame stack (#700 ) * Update doc * Add comment * Add skip option to VecTransposeImage and fix bug in frame stack	2021-12-23 17:12:49 +02:00
Quentin Gallouédec	d496cd4d95	Consistent use of `device` as keyword argument (#702 ) * consistent device as keyword arg * Fixed ``device`` arg inconsistency in changelog	2021-12-22 11:43:59 +01:00
Demetrio92	798b16aaf7	more verbose documentation regarding `.load` vs `.set_parameters` (#696 ) * more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614) * add a note to explain the difference between `.load` and `.set_parameters` to the examples * fix typos Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-12-18 17:28:37 +02:00
hsuehch	222a69ca49	Eliminate extra empty lines in CSV monitor files on Windows (DLR-RM#692) (#695 ) * Added ``newline="\n"`` when opening CSV monitor files so that each line ends with ``\r\n`` instead of ``\r\r\n`` on Windows while Linux environments are not affected	2021-12-18 16:04:33 +02:00
Antonin RAFFIN	e24147390d	Improve tests and add check for float32 (#686 ) * Add additional checks * Improve tests and error message * Update changelog * Bump version * Update doc * Add tests for action space * Improve test	2021-12-09 14:14:33 +02:00
Antonin RAFFIN	77f4f5021d	Drop Python 3.6 support (#685 ) * Drop python 3.6 support * Update doc * Update gitlab CI * Update doc env * Fix gitlab CI	2021-12-06 12:54:43 +01:00
Antonin RAFFIN	507ed1762e	Multiprocessing support for off policy algorithms (#439 ) * Add multi-env training support for SAC * Fix for dict obs * Pytype fixes * Fix assert on number of envs * Remove for loop * Add support for Dict obs * Start cleanup * Update doc and bug fix * Add support for vectorized action noise and add multi env example for off-policy * Update version * Bug fix with VecNormalize * Update README table * Update variable names * Update changelog and version * Update doc and fix for `gradient_steps=-1` * Add test for `gradient_steps=-1` * Disable pytype pyi errors * Fix for DQN * Update comment on deepcopy * Remove episode_reward field * Fix RolloutReturn * Avoid modification by reference * Fix error message Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-12-01 22:30:09 +01:00
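
Commit 222a69ca49 above describes CSV monitor lines ending with ``\r\r\n`` instead of ``\r\n`` on Windows. The sketch below reproduces that effect with the Python standard library only; it is not the SB3 monitor code. The helper name `write_rows` and the column values are made up for illustration, and wrapping a `BytesIO` in `io.TextIOWrapper` with `newline="\r\n"` is an assumption used to emulate Windows default text mode on any platform.

```python
import csv
import io


def write_rows(newline):
    """Write two CSV rows through a text layer and return the raw bytes.

    ``newline`` is passed to the text wrapper the same way it would be
    passed to ``open(..., newline=...)``. Hypothetical helper for this demo.
    """
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    writer = csv.writer(text)  # csv.writer ends each row with "\r\n" by default
    writer.writerow(["r", "l", "t"])
    writer.writerow([1.0, 10, 0.5])
    text.flush()
    return raw.getvalue()


# Emulated Windows default: every "\n" written becomes "\r\n",
# so the writer's "\r\n" turns into "\r\r\n" (an extra empty line in viewers).
windows_default = write_rows("\r\n")

# With newline="\n" no translation happens on write,
# so the writer's "\r\n" reaches the file unchanged.
fixed = write_rows("\n")

print(windows_default)
print(fixed)
```

On Linux the default text mode does not translate ``\n`` on write, which is consistent with the commit's note that Linux environments are not affected.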

272 commits