stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-07-03 03:59:13 +00:00

Author	SHA1	Message	Date
Costa Huang	d2ebd2eeaa	Allow PPO to turn off advantage normalization (#763 ) * Allow PPO to turn off advantage normalization * update changelog * Add a test case * Update test and sanity check * Fix tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-02-22 15:29:21 +01:00
Antonin RAFFIN	7ce4bb8016	Pin gym version (#782 ) * Pin gym version * Cleanup warnings * Reformat	2022-02-21 23:12:54 +01:00
Gianluca De Cola	58a98060f9	Update docstring on MlpExtractor. Resolves #736 (#774 ) * Improve docstring on MlpExtractor. * update changelog.	2022-02-16 01:50:17 +02:00
Gautam J	59bec30180	update docs fix indentation (#764 ) * update docs fix indentation Changed code block indentation from 2 spaces to 4 spaces for consistency. * update changelog * Update changelog.rst Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-07 21:00:53 +02:00
Manuel	40bda9a918	Remove explicit forward calls (#753 ) * Remove explicit forward calls * Changelog and commit checks. * Reverted test forward removal for super call. Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-06 22:27:12 +02:00
Ashish Dutt	954daaac37	Custom environment page modified. Following fixes are committed in response to issue#755. (#758 ) * Page modified. Following fixes are committed in response to issue#755. - fixed the broken url on creating custom gym environment. Also added appropriate advice by citing official OpenAi gym documents. - SB3 text tweaked. * modified page - updated the in-line text hyperlinks to follow Sphinx restructured text format. * modified page - updated the in-line text hyperlinks to follow Sphinx restructured text format. - updated text grammar * Language Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-05 13:36:36 +02:00
Adam Gleave	78afcbd6d9	HumanOutputFormat: make length configurable, throw error if keys alias (#756 ) * Make HumanOutputFormat length configurable and bump to 36 by default * Add test case * Updated changelog * Blacken * Blacken code * Fix GitLab CI: switch to Docker container with new black version * Incorporate suggestion * Add class docstring * Dummy commit to retrigger GitLab Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-05 12:57:35 +02:00
Adam Gleave	9ff26dafed	Fix changelog (#760 )	2022-02-05 02:12:17 +02:00
Carlos Luis	5143cd19f7	Gym fixes - Follow up from #705 (#734 ) * fix Atari in CI * fix dtype and atari extra * Update setup.py * remove 3.6 * note about how to install Atari * pendulum-v1 * atari v5 * black * fix pendulum capitalization * add minimum version * moved things in changelog to breaking changes * partial v5 fix * env update to pass tests * mismatch env version fixed * Fix tests after merge * Include autorom in setup.py * Blacken code * Fix dtype issue in more robust way * Fix GitLab CI: switch to Docker container with new black version * Remove workaround from GitLab. (May need to rebuild Docker for this though.) * Revert to v4 * Update setup.py * Apply suggestions from code review * Remove unnecessary autorom * Consistent gym versions Co-authored-by: J K Terry <justinkterry@gmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: modanesh <mohamad4danesh@gmail.com> Co-authored-by: Adam Gleave <adam@gleave.me>	2022-02-04 15:13:57 -08:00
Armand du Parc Locmaria	44dfedc061	Add furuta pendulum project to project list (#742 ) * add furuta pendulum project * Update changelog to reflect addition to docs Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-04 11:39:49 +02:00
Adam Gleave	f488d0772a	Autoformat code with black (new version complains about new things) (#757 ) * Blacken code * Fix GitLab CI: switch to Docker container with new black version	2022-02-04 02:56:06 +02:00
Antonin RAFFIN	54bcfa4544	Add Hugging Face integration to SB3 doc (#733 ) * Add Hugging Face to SB3 doc * Update doc + fixes * Use SB3 model from the hub * Bump version * Fixes Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>	2022-01-20 10:04:12 +01:00
Paul Scheikl	fc41600225	Fixed logging info_keywords in the VecMonitor class. (#730 ) * Writing the additional info_keywords into the episode infos that are passed to the results writer. Directly taken from the non-vec version of monitor. * Added test for monitoring info_keywords. * Removed unnecessary step of registering the env. Not using make_vec_env, because it applies a monitor wrapper to the env. * Reformat Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-01-19 17:17:22 +01:00
Antonin RAFFIN	21f6a474a4	Release 1.4.0 (#729 ) * Release 1.4.0 * Add integration section in the readme	2022-01-19 11:16:15 +01:00
Antonin RAFFIN	cd6e04705b	Update SB3 Contrib doc (ARS) and W&B integration (#726 ) * Add ARS to SB3 contrib * Add integration page	2022-01-18 15:10:25 +01:00
Antonin RAFFIN	e9a8979022	Add copy and combine method to running mean std (#716 ) * Add copy and combine method to running mean std * Update test * Faster test * Update test * Update test * Shift values in RMS test	2022-01-06 01:31:04 +02:00
IperGiove	d9e198e04f	Update custom_policy.rst (#711 ) * Update custom_policy.rst Added methods forward_actor and forward_critic in CustomNetwork class. * Update doc Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-01-03 16:22:58 +01:00
Thomas Gubler	c895c1d46f	Doc fix: A2C - fix guidance on RMSpropTFLike (#708 ) * doc: A2C/migration: fix guidance on RMSpropTFLike * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-12-30 11:28:12 +01:00
Antonin RAFFIN	4a5dfaedfc	Update SB3 contrib doc (+ fix backward compat) (#707 ) * Fix `VecNormalize` load for SB3<= 1.3.0 * Update SB3 contrib doc * Bump version	2021-12-29 14:25:09 +01:00
Antonin RAFFIN	bb16645c4e	Add `skip` option for `VecTransposeImage` and bug fix in frame stack (#700 ) * Update doc * Add comment * Add skip option to VecTransposeImage and fix bug in frame stack	2021-12-23 17:12:49 +02:00
Quentin Gallouédec	d496cd4d95	Consistent use of `device` as keyword argument (#702 ) * consistent device as keyword arg * Fixed ``device`` arg inconsistency in changelog	2021-12-22 11:43:59 +01:00
Demetrio92	798b16aaf7	more verbose documentation regarding `.load` vs `.set_parameters` (#696 ) * more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614) * add a note to explain the difference between `.load` and `.set_parameters` to the examples * fix typos Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-12-18 17:28:37 +02:00
hsuehch	222a69ca49	Eliminate extra empty lines in CSV monitor files on Windows (DLR-RM#692) (#695 ) * Added ``newline="\n"`` when opening CSV monitor files so that each line ends with ``\r\n`` instead of ``\r\r\n`` on Windows while Linux environments are not affected	2021-12-18 16:04:33 +02:00
Antonin RAFFIN	e24147390d	Improve tests and add check for float32 (#686 ) * Add additional checks * Improve tests and error message * Update changelog * Bump version * Update doc * Add tests for action space * Improve test	2021-12-09 14:14:33 +02:00
Antonin RAFFIN	77f4f5021d	Drop Python 3.6 support (#685 ) * Drop python 3.6 support * Update doc * Update gitlab CI * Update doc env * Fix gitlab CI	2021-12-06 12:54:43 +01:00
Antonin RAFFIN	507ed1762e	Multiprocessing support for off policy algorithms (#439 ) * Add multi-env training support for SAC * Fix for dict obs * Pytype fixes * Fix assert on number of envs * Remove for loop * Add support for Dict obs * Start cleanup * Update doc and bug fix * Add support for vectorized action noise and add multi env example for off-policy * Update version * Bug fix with VecNormalize * Update README table * Update variable names * Update changelog and version * Update doc and fix for `gradient_steps=-1` * Add test for `gradient_steps=-1` * Disable pytype pyi errors * Fix for DQN * Update comment on deepcopy * Remove episode_reward field * Fix RolloutReturn * Avoid modification by reference * Fix error message Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-12-01 22:30:09 +01:00
Antonin RAFFIN	2ebb8aa22b	Update Citation (#684 ) * Update citation * Remove cff file	2021-12-01 18:55:21 +01:00
Antonin RAFFIN	52c29dc497	Fix evaluation script for recurrent policies (#678 ) * Fix evaluation script for RNN * Add error message * Revert "Add error message" This reverts commit 8d69b6cf4de2cd13aecfb425bd3145fad6a6c49a. * Fix for pytype * Rename mask to `episode_start` * Fix type hint * Fix type hints * Remove confusing part of sentence Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-30 13:49:06 +01:00
Gary Briggs	8e5ede783f	Add a section on exporting to TFLite/Coral with demonstration (#679 ) * Add a section on exporting to TFLite/Coral with demonstration * Changelog to reflect new export documentation * Update docs/guide/export.rst Fingers on autopilot make word wrong Co-authored-by: Anssi <kaneran21@hotmail.com> * Update docs/guide/export.rst Better wording clarity Co-authored-by: Anssi <kaneran21@hotmail.com> * Update docs/guide/export.rst Better wording clarity Co-authored-by: Anssi <kaneran21@hotmail.com> * Clarify motivations and hardware * Update docs/misc/changelog.rst Make consistent with other changelog entries Co-authored-by: Anssi <kaneran21@hotmail.com> * Sphinx wants the section underline to be at least this long * Remove first-person voice * Typos Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-28 10:54:50 +01:00
Shyamal H Anadkat	3b68dc7312	Update GAE computation docstring (#655 ) * Fix typo in buffers.py * Revert "Fix typo in buffers.py" This reverts commit ca643d5e3a509ae1b8a65bf0de98f4609ca9d8da. * Ignore pytype errors * Update GAE computation docstring Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-11-25 10:53:42 +01:00
Antonin Raffin	b37052cbf0	Pytest Color for GitHub actions	2021-11-18 10:38:50 +01:00
Parth Kothari	58e5506385	Edited Authors of DriverGym project (#669 )	2021-11-18 10:18:18 +01:00
Parth Kothari	1ac35eaef2	Add DriverGym project to SB3 project documentation (#665 ) * Added DriverGym project * Updated changelog * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-11-17 11:13:43 +01:00
Antonin RAFFIN	d228364ccf	Add timeout handling for on-policy algorithms (#658 ) * Add timeout handling for on-policy algorithms * Fixes * Fix infinite loop in eval * Skip type check for python 3.9 * Fix for discrete obs + add docstring * Fix A2C test * Removed unused helper * Add test for infinite horizon * typed ast should be fixed * Apply suggestions from code review Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-16 17:19:16 +01:00
Antonin RAFFIN	e75e1de4c1	Fix indentation in RL tips doc (#657 ) * Update rl_tips.rst indent fix to make if done and its following statement work * Fix indentation and update changelog * Skip type check for python 3.9 Co-authored-by: paulg <cove9988@gmail.com>	2021-11-10 16:54:20 +00:00
Antonin RAFFIN	2bb4500948	Fix `set_env` when using `VecNormalize` (#638 ) * Fix `set_env` when using `VecNormalize` * Update version	2021-11-02 13:52:26 +02:00
Antonin Raffin	6daf82bf74	Relax test	2021-10-31 19:03:28 +01:00
ac-93	98c1a637cf	add tactile-gym to the list of projects using SB3 (#640 ) * Update projects.rst * Update changelog.rst * Update projects.rst * Fix doc build Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-10-31 18:26:06 +01:00
Oleksii Kachaiev	0c17fedfac	Adjust FPS calculation to accommodate for reset_num_timesteps=False (#636 ) * Store number of timesteps at the beginning of each learn cycle * Update changelog * Set default _num_timesteps_at_start in the constructor * Test case for FPS logger * Adjust test to cover both on-policy and off-policy algorithms * Fix formatting * Update test and add comment * Fix test Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-10-31 18:19:03 +01:00
Edouard Leurent	a2e3001598	Add highway-env to the list of projects using SB3 (#639 ) * Add highway-env to the list of projects using SB3 Many thanks for this fantastic library, keep up the good work! * Update changelog with added documentation	2021-10-30 13:53:36 +02:00
Oleksii Kachaiev	0503e694b2	Introduce norm_obs_keys param for VecNormalize environment wrapper (#631 ) * Implement new norm_obs_keys param for VecNormalize environment wrapper * Simplified doc string to avoid issues with lint and doc * Updated changelog * Update changelog.rst * Update test_vec_normalize.py * Update sanity checks * Fix backward compat * Update doc * Update changelog * Fix lint warnings * Fix tests * Minor edit * observation_space sanity check was applied twice Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com> Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2021-10-28 19:18:39 +02:00
Antonin RAFFIN	7b977d7b03	Release 1.3.0 (#625 )	2021-10-23 17:07:00 +02:00
Antonin RAFFIN	e907eca18e	Fix `set_env` to keep the number of timesteps (#615 ) * Fix for `set_env` * Add test and update changelog * Use underscores and f-strings * Add PyPi info * Update comments	2021-10-23 16:36:40 +02:00
Antonin RAFFIN	1564a85081	System info helper (#613 ) * Add `system_env_info` * Add `print_system_info` to load and store system info at save time * Remove TODO * Rename to `get_system_info` * Import as sb3 for consistency * Update changelog * Add warning for old SB3 versions * Use underscore literal for more clarity	2021-10-18 10:43:56 +02:00
Timo Kaufmann	09e9fc42eb	Use consistent logging keys (#605 ) * Use a consistent key to log the total timesteps This changes the timestep logging key of on-policy algorithms from `time/total_timesteps` to `time/total timesteps` (note the underscore/space). The off-policy algorithms and the eval callback already use the latter, so this behavior is more consistent. * Use underscores instead of spaces in logging keys Most keys already followed this policy and consistent behavior is friendlier to new users. * Minor edit and bump version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-10-12 13:17:30 +02:00
Antonin RAFFIN	75aa31dcfb	Update SB3 contrib algorithms (#604 )	2021-10-10 15:41:39 +02:00
Antonin RAFFIN	1881d904a0	Doc fix and improve error messages (#598 ) * Fix custom env doc * Catch common mistake * Improve `EvalCallback` error message * Lint test * Update docs/guide/custom_env.rst Co-authored-by: Adam Gleave <adam@gleave.me> Co-authored-by: Adam Gleave <adam@gleave.me>	2021-10-08 18:08:31 +02:00
Ilja Avadiev	740d61ada3	Doc fix environment mixup (#588 )	2021-09-29 10:16:59 +02:00
Antonin RAFFIN	306e49fda6	Fixes in `is_vectorized_observation` (#587 ) * Fix is vectorized bug in DQN * Fix sub-classed obs	2021-09-28 21:57:49 +02:00
Antonin RAFFIN	201fbffa8c	Remove `sde_net_arch` + Simplify policy (#584 ) * Remove `sde_net_arch` + Simplify policy * Add warning at load time	2021-09-28 22:32:54 +03:00


604 commits