stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-07-11 17:48:55 +00:00

Author	SHA1	Message	Date
Juan Rocamonde	18b29a68e8	Remove forward() method from common.policies.BaseModel (#1061 ) * Remove forward() method. * Updated changelog	2022-09-11 18:39:13 +02:00
Quentin Gallouédec	98e786f744	Clarify and standardize verbosity documentation (#1056 ) * Standardize the use of verbosity: > to >= * Make verbose docstring more specific * Update changelog	2022-09-09 16:46:28 +02:00
Quentin Gallouédec	29f6687b98	Raise error when observation keys and observation space keys don't match (#1047 ) * Raise error when observation keys and observation space keys don't match * Print the difference in keys * Update changelog	2022-09-05 14:54:58 +02:00
Juan Rocamonde	fdca786f09	Fix replay_buffer_class type annotation (#1042 ) * Fix replay_buffer_class type annotation * Update changelog * Further replacement of same type annotation issue * Formatting * Rolled back formatting changes for consistency	2022-09-01 20:10:01 -07:00
Sidney Tio	304c17dc78	Add append mode to Monitor (#1037 ) * Added option to override or use existing CSVs * Updated changelog for Monitor override * Changed default value to override * Simplify code and add test * Update version * Fix for pytype Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-31 11:53:44 +02:00
Hugh Perkins	2cc1477fa2	Fix advantage normalization with mini-batchsize of 1 (#1028 ) * fix nan in advnatages with batch size 1, for ppo * changelog * black * Simplify test * Bump version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-08-25 11:50:08 +02:00
Anand Balakrishnan	59af0c1b01	`CheckpointCallback` can now save replay buffer and `VecNormalize` (#1030 ) * CheckpointCallback now saves replay buffer (if present) * VecNormalize stats are saved at checkpoints * Make checkpointing replay buffer and VecNormalize opt-in * Edit changelog * Add documentation for new parameters * Update docs/misc/changelog.rst * Add documentation for new parameters * Implement suggested edits * Reformat code * Fix git conflict * Add .pkl suffix to VecNormalize checkpoints * Add tests for new CheckpointCallback params * Merge CheckpointCallback tests * Update test and add helper for checkpoint path Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-25 10:57:51 +02:00
Honglu Fan	29a481a288	Include `running_mean` and `running_val` when updating target networks (#1004 ) * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * include `running_mean` and `running_val` when updating target networks in DQN, SAC, TD3. * Update stable_baselines3/common/utils.py Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Fix `DictReplayBuffer.next_observations` type (#1013) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> * Fixed missing verbose parameter passing (#1011) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> * Support for `device=auto` buffers and set it as default value (#1009) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de> * Precompute batch norm parameters in `_setup_model` and directly copy them in the target update. * Update test * Add comments and update tests * Bump version * Remove one extra space to conform code style. * Update docstrings Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com> Co-authored-by: Burak Demirbilek <BurakDmb@users.noreply.github.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-23 10:20:43 +02:00
Timothé	01cc127d32	Support hparams logging to tensorboard (#984 ) * create Hparam class & support in all OutputFormats * add hparams documentation & example * add hparam tests * remove unnecessary test & fix name * format changes * support hyperparameters logging to tensorboard * fix HParams class docstring * use more explicit variable names * raise error instead of warning * Unpin protobuf * Add test for logging hparams Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-22 22:06:54 +02:00
Antonin RAFFIN	57e0054e62	Add Quentin to the list of maintainers (#1014 )	2022-08-17 09:55:40 +02:00
Quentin Gallouédec	73822c34da	Support for `device=auto` buffers and set it as default value (#1009 ) * Default device is "auto" for buffer + auto device support in BufferBaseClass * Update docstring * Update tests * Unify tests * Update changelog * Fix tests on CUDA device Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-08-16 17:54:55 +02:00
Burak Demirbilek	792e3bcc27	Fixed missing verbose parameter passing (#1011 ) Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>	2022-08-16 13:32:32 +02:00
Quentin Gallouédec	a30d36002b	Fix `DictReplayBuffer.next_observations` type (#1013 ) * Fix DictReplayBuffer.next_observations type * Update changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-16 10:53:22 +02:00
Quentin Gallouédec	c4f54fcf04	Handling multi-dimensional action spaces (#971 ) * Handle non 1D action shape * Revert changes of observation (out of the scope of this PR) * Apply changes to DictReplayBuffer * Update tests * Rollout buffer n-D actions space handling * Remove error when non 1D action space * ActorCriticPolicy return action with the proper shape * remove useless reshape * Update changelog * Add tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-08-06 14:19:20 +02:00
jlp-ue	6ce33f5bd2	Fix url in docs (#1000 ) * fixed URL in docs * Update changelog.rst	2022-08-05 17:54:48 +02:00
Francesco Lucianò	646d6d38b6	Fixed typo in PPO doc (#983 ) * Fixed typo Fixed typo * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-30 12:52:35 +02:00
Marsel Khisamutdinov	d532362e94	Adds info on split tensorboard graphs (#989 ) * Add info on split tensorboard graphs. * Change wording to make it look better. * Update changelog.rst * Rephrase and add link to issue Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-30 12:44:25 +02:00
Adam Gleave	b1cc15970a	Use higher resolution time_ns() and avoid division by zero (#979 ) * Use higher resolution time and round up to eps * Update changelog * Add test case * Fix formatting, time()->time_ns * Bugfix: ns is integer not float * Move test to better place * Divide by 1e9 earlier	2022-07-25 23:02:53 +02:00
Quentin Gallouédec	fda3d4d748	Fix returned type in predict (#964 ) * `arr[0]` to `arr.squeeze(0)` * `squeeze(axis=0)` to `squeeze(0)` * Type testing * Add type test for unvectorized observation * `squeeze(0)` to `squeeze(axis=0)` * Treatment of the laziness symptoms * Update changelog * Udate changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-07-18 11:22:19 +02:00
Antonin RAFFIN	a18b91e01a	Replace "nature" with "Nature" (magazine) to reduce confusion (#965 ) * Replace "nature" with "Nature" (magazine) to reduce confusion * Replace "nature" with "Nature" (magazine) to reduce confusion * Update changelog Co-authored-by: mel <callmesolis@gmail.com>	2022-07-15 22:48:27 +02:00
Antonin RAFFIN	c1f1c3d3d7	Release v1.6.0 (#958 ) * Release v1.6.0 + update doc + add copy button * Update read the doc conda env * Update year * Fix bug in kl divergence check * Rephrase requirement for envpool and isaac gym	2022-07-12 22:50:23 +02:00
Max Weltevrede	ef10189d80	Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination (#948 ) * Prohibit simultaneous use of optimize_memory_buffer and handle_timeout_termination * Modify test to avoid unsupported buffer configuration * Change from assertion to raising of ValueError * Update changelog * Update style for consistency * Use handle_timeout_termination when possible Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-07-04 15:08:54 +02:00
Ram Rachum	d64bcb401a	Fix exception cause in base_class.py (#940 )	2022-06-21 20:58:02 +01:00
Antonin RAFFIN	7ce7b6a8c2	Update defaults for offpolicy algos with features extractor (#935 )	2022-06-18 10:52:52 +02:00
Antonin RAFFIN	d68f0a2411	Update doc: SB3 Contrib RecurrentPPO (#927 ) * Update doc: contrib update * Update docs/misc/changelog.rst Co-authored-by: Anssi <kaneran21@hotmail.com> * Address Anssi comments Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-05-31 18:11:16 +02:00
Antonin RAFFIN	4b89fbf283	Fix issues due to newer version of protobuf and sphinx (#924 )	2022-05-29 21:09:50 +02:00
Antonin RAFFIN	49813d8c68	Update doc and add check for unbounded action space (#918 )	2022-05-25 16:24:21 +02:00
TibiGG	2fcf8f91c1	Removed redundant double-check of nested Dict (#908 ) * Removed redundant double-check of nested Dict observation space from BaseAlgorithm * Update changelog Co-authored-by: tibigg <tg4018@ic.ac.uk>	2022-05-09 14:36:15 +03:00
Antonin RAFFIN	0fadc94df3	Fix synchronization bug with EvalCallback (#907 )	2022-05-08 21:54:34 +03:00
Thomas Rudolf	c2518dc160	Add doc to use mlflow logger (#889 ) * ADD feature for mlflow logger via MLflowOutputFormat. * Move MLFlow integration to doc Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-05-08 15:28:31 +02:00
Antonin RAFFIN	c5f0aa5de0	Update doc: PPO blog post and remark on timeouts (#896 )	2022-05-01 16:26:34 +02:00
Antonin RAFFIN	a6f5049a99	Upgrade code to Python 3.7+ syntax using `pyupgrade` (#887 ) * Upgrade code to Python 3.7+ syntax * Update changelog	2022-04-25 13:01:38 +03:00
Bryan Collazo	3c468ff558	Update ppo documentation (remove redundant and) (#874 ) * Update ppo documentation (remove redundant and) PTAL, thanks! * Update changelog * Pin ale-py version Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-04-19 14:15:51 +02:00
Paul Scheikl	ed308a71be	Fixed unchecked None value in SubprocVecEnv (#808 ) * Fixed unchecked None value in SubprocVecEnv * Fixed unchecked None value in DummyVecEnv * Fix formatting * Update test and changelog * Improve test Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-04-12 16:05:40 +02:00
Antonin RAFFIN	39a4f9379a	Escape tensorboard log name (#857 ) * escape tensorboard log name Otherwise utils does not recognize the log. * Added fix to changelog * Modifications made by: make commit-checks . * Revert "Modifications made by: make commit-checks ." This reverts commit 529a275d9475f85ef031038a8f3565f7301e5371. * Update changelog and add test Co-authored-by: James Hirschorn <James.Hirschorn@quantitative-technologies.com>	2022-04-11 21:49:18 +02:00
Antonin RAFFIN	248f082cdc	Bump min PyTorch version (#855 )	2022-04-11 18:34:15 +02:00
Quentin Gallouédec	16703b1314	Fix HER goal selection (#848 ) * Goal sampled from next_achieved_goal instead of achived_goal * No need to have special case for future anymore * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-11 17:50:02 +02:00
Grégoire Passault	254bb10c42	Replacing the policy registry with policy "aliases" (#842 ) * Replacing the policy registry with policy "aliases" * Fixing import order and SAC * Changing arg. order to be sure policy_aliases is a kwarg * Import orders * Removing pytype error check * Reformat * Fix alias import * Not using mutable {} as default for policy_aliases * Empty aliases initialization * Using static attributes for policy_aliases * Fixing isort * Fixing back bad merge * Running isort * Fixing aliases for A2C and PPO * Using f-string * Moving policy_aliases definition position * Adding change in the changelog * Update version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-04-08 21:21:53 +02:00
Yifei Cheng	44e53ff811	Enable force_zip64 (#839 ) * Enable force_zip64 * mark tests as expensive * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-03-28 10:35:33 +02:00
Antonin RAFFIN	30772aa9f5	Release v1.5.0 (#835 ) * Release v1.5.0 * Fix link	2022-03-25 14:38:22 +01:00
Grégoire Passault	00ac43b0a9	Removing dead code for handling time limits (#831 ) * Removing dead code for handling time limits (see #829) * Mentionning remove_time_limit_termination in the changelog * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-03-23 13:33:55 +01:00
Yuan	009bb0549a	Update tensorboard.rst in SummaryWriterCallback (#822 ) * Update tensorboard.rst * update changelog.rst * update changelog.rst, add username	2022-03-15 21:48:52 +01:00
Antonin RAFFIN	e88eb1c9ca	Add explanation of logger output (#803 ) * Add explanation of logger output * Apply suggestions from code review Co-authored-by: Anssi <kaneran21@hotmail.com> * Add example output Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-03-07 12:20:43 +01:00
Julio César Alves	cdaa9ab418	Callback to early stop the training if there is no model improvement after consecutive evaluations (#741 ) * Added StopTrainingOnNoModelImprovement callback and callback_after_eval parameter in EvalCallback * Correction in EvalCallback and tests for StopTrainingOnNoModelImprovement * Update the docs related to new StopTrainingOnNoModelImprovement callback * Update doc Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-02-25 11:56:47 +01:00
Quentin Gallouédec	db5366fb51	`None` as default value for `env` in `HerReplayBuffer.sample` + `DQN` batch size typing fix (#790 ) * `env` to `None` by default in `HerReplayBuffer.sample` (#788) * Fix DQN batch_size typing * Fix changelog Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2022-02-24 15:51:01 +01:00
Quentin Gallouédec	13fcb12471	Fix normalization for `DictReplayBuffer` (#744 ) * Normalize samples DictReplayBuffer (#743) * Fixed sample normalization in ``DictReplayBuffer`` (#743) * Test buffer normalization * Rename test replay buffer * Bump version Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-02-23 13:04:57 +01:00
Boyuan Chen	7a01637128	Fix VecNormalization bug for Dict obs (#768 ) * fix #724 VecNormalization bug for Dict obs * update test and changelog * Update changelog Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-02-23 12:33:41 +01:00
Costa Huang	d2ebd2eeaa	Allow PPO to turn off advantage normalization (#763 ) * Allow PPO to turn of advantage normalization * update changelog * Add a test case * Update test and sanity check * Fix tests Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2022-02-22 15:29:21 +01:00
Antonin RAFFIN	7ce4bb8016	Pin gym version (#782 ) * Pin gym version * Cleanup warnings * Reformat	2022-02-21 23:12:54 +01:00
Gianluca De Cola	58a98060f9	Update docstring on MlpExtractor. Resolves #736 (#774 ) * Improve docstring on MlpExtractor. * update changelog.	2022-02-16 01:50:17 +02:00
Gautam J	59bec30180	update docs fix indentation (#764 ) * update docs fix indentation Changed code block indentation from 2 spaces to 4 spaces for consistency. * update changelog * Update changelog.rst Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-07 21:00:53 +02:00
Manuel	40bda9a918	Remove explict forward calls (#753 ) * Remove explict forward calls * Changelog and commit checks. * Reverted test forward removal for super call. Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-06 22:27:12 +02:00
Adam Gleave	78afcbd6d9	HumanOutputFormat: make length configurable, throw error if keys alias (#756 ) * Make HumanOutputFormat length configurable and bump to 36 by default * Add test case * Updated changelog * Blacken * Blacken code * Fix GitLab CI: switch to Docker container with new black version * Incorporate suggestion * Add class docstring * Dummy commit to retrigger GitLab Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-05 12:57:35 +02:00
Adam Gleave	9ff26dafed	Fix changelog (#760 )	2022-02-05 02:12:17 +02:00
Carlos Luis	5143cd19f7	Gym fixes - Follow up from #705 (#734 ) * fix Atari in CI * fix dtype and atari extra * Update setup.py * remove 3.6 * note about how to install Atari * pendulum-v1 * atari v5 * black * fix pendulum capitalization * add minimum version * moved things in changelog to breaking changes * partial v5 fix * env update to pass tests * mismatch env version fixed * Fix tests after merge * Include autorom in setup.py * Blacken code * Fix dtype issue in more robust way * Fix GitLab CI: switch to Docker container with new black version * Remove workaround from GitLab. (May need to rebuild Docker for this though.) * Revert to v4 * Update setup.py * Apply suggestions from code review * Remove unnecessary autorom * Consistent gym versions Co-authored-by: J K Terry <justinkterry@gmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: modanesh <mohamad4danesh@gmail.com> Co-authored-by: Adam Gleave <adam@gleave.me>	2022-02-04 15:13:57 -08:00
Armand du Parc Locmaria	44dfedc061	Add furuta pendulum project to project list (#742 ) * add furuta pendulum project * Update changelog to reflect addition to docs Co-authored-by: Anssi <kaneran21@hotmail.com>	2022-02-04 11:39:49 +02:00
Antonin RAFFIN	54bcfa4544	Add Hugging Face integration to SB3 doc (#733 ) * Add Hugging Face to SB3 doc * Update doc + fixes * Use SB3 model from the hub * Bump version * Fixes Co-authored-by: simoninithomas <simonini_thomas@outlook.fr>	2022-01-20 10:04:12 +01:00
Paul Scheikl	fc41600225	Fixed logging info_keywords in the VecMonitor class. (#730 ) * Writing the additional info_keywords into the episode infos that are passed to the resulst writer. Directly taken from the non-vec version of monitor. * Added test for monitoring info_keywords. * Removed unnecessary step of registering the env. Not using make_vec_env, because it applies a monitor wrapper to the env. * Reformat Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-01-19 17:17:22 +01:00
Antonin RAFFIN	21f6a474a4	Release 1.4.0 (#729 ) * Release 1.4.0 * Add integration section in the readme	2022-01-19 11:16:15 +01:00
Antonin RAFFIN	cd6e04705b	Update SB3 Contrib doc (ARS) and W&B integration (#726 ) * Add ARS to SB3 contrib * Add integration page	2022-01-18 15:10:25 +01:00
Antonin RAFFIN	e9a8979022	Add copy and combine method to running mean std (#716 ) * Add copy and combine method to running mean std * Update test * Faster test * Update test * Update test * Shift values in RMS test	2022-01-06 01:31:04 +02:00
IperGiove	d9e198e04f	Update custom_policy.rst (#711 ) * Update custom_policy.rst Added methods forward_actor and forward_critic in CustomNetwork class. * Update doc Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2022-01-03 16:22:58 +01:00
Thomas Gubler	c895c1d46f	Doc fix: A2C - fix guidance on RMSpropTFLike (#708 ) * doc: A2C/migration: fix guidance on RMSpropTFLike * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-12-30 11:28:12 +01:00
Antonin RAFFIN	4a5dfaedfc	Update SB3 contrib doc (+ fix backward compat) (#707 ) * Fix `VecNormalize` load for SB3<= 1.3.0 * Update SB3 contrib doc * Bump version	2021-12-29 14:25:09 +01:00
Antonin RAFFIN	bb16645c4e	Add `skip` option for `VecTransposeImage` and bug fix in frame stack (#700 ) * Update doc * Add comment * Add skip option to VecTransposeImage and fix bug in frame stack	2021-12-23 17:12:49 +02:00
Quentin Gallouédec	d496cd4d95	Consistent use of `device` as keyword argument (#702 ) * consistent device as keyword arg * Fixed ``device`` arg inconsistency in changelog	2021-12-22 11:43:59 +01:00
Demetrio92	798b16aaf7	more verbose documentation regarding `.load` vs `.set_parameters` (#696 ) * more verbose documentation regarding `.load` vs `.set_parameters` (#683, #614) * add a note to explain the difference between `.load` and `.set_parameters` to the examples * fix typos Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-12-18 17:28:37 +02:00
hsuehch	222a69ca49	Eliminate extra empty lines in CSV monitor files on Windows (DLR-RM#692) (#695 ) * Added ``newline="\n"`` when opening CSV monitor files so that each line ends with ``\r\n`` instead of ``\r\r\n`` on Windows while Linux environments are not affected	2021-12-18 16:04:33 +02:00
Antonin RAFFIN	e24147390d	Improve tests and add check for float32 (#686 ) * Add additional checks * Improve tests and error message * Update changelog * Bump version * Update doc * Add tests for action space * Improve test	2021-12-09 14:14:33 +02:00
Antonin RAFFIN	77f4f5021d	Drop Python 3.6 support (#685 ) * Drop python 3.6 support * Update doc * Update gitlab CI * Update doc env * Fix gitlab CI	2021-12-06 12:54:43 +01:00
Antonin RAFFIN	507ed1762e	Multiprocessing support for off policy algorithms (#439 ) * Add multi-env training support for SAC * Fix for dict obs * Pytype fixes * Fix assert on number of envs * Remove for loop * Add support for Dict obs * Start cleanup * Update doc and bug fix * Add support for vectorized action noise and add multi env example for off-policy * Update version * Bug fix with VecNormalize * Update README table * Update variable names * Update changelog and version * Update doc and fix for `gradient_steps=-1` * Add test for `gradient_steps=-1` * Disable pytype pyi errors * Fix for DQN * Update comment on deepcopy * Remove episode_reward field * Fix RolloutReturn * Avoid modification by reference * Fix error message Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-12-01 22:30:09 +01:00
Antonin RAFFIN	2ebb8aa22b	Update Citation (#684 ) * Update citation * Remove cff file	2021-12-01 18:55:21 +01:00
Antonin RAFFIN	52c29dc497	Fix evaluation script for recurrent policies (#678 ) * Fix evaluation script for RNN * Add error message * Revert "Add error message" This reverts commit 8d69b6cf4de2cd13aecfb425bd3145fad6a6c49a. * Fix for pytype * Rename mask to `episode_start` * Fix type hint * Fix type hints * Remove confusing part of sentence Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-30 13:49:06 +01:00
Gary Briggs	8e5ede783f	Add a section on exporting to TFLite/Coral with demonstration (#679 ) * Add a section on exporting to TFLite/Coral with demonstration * Changelog to reflect new export documentation * Update docs/guide/export.rst Fingers on autopilot make word wrong Co-authored-by: Anssi <kaneran21@hotmail.com> * Update docs/guide/export.rst Better wording clarity Co-authored-by: Anssi <kaneran21@hotmail.com> * Update docs/guide/export.rst Better wording clarity Co-authored-by: Anssi <kaneran21@hotmail.com> * Clarify motivations and hardware * Update docs/misc/changelog.rst Make consistent with other changelog entries Co-authored-by: Anssi <kaneran21@hotmail.com> * Sphinx wants the section underline to be at least this long * Remove first-person voice * Typos Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-28 10:54:50 +01:00
Shyamal H Anadkat	3b68dc7312	Update GAE computation docstring (#655 ) * Fix typo in buffers.py * Revert "Fix typo in buffers.py" This reverts commit ca643d5e3a509ae1b8a65bf0de98f4609ca9d8da. * Ignore pytype errors * Update GAE computation docstring Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-11-25 10:53:42 +01:00
Parth Kothari	58e5506385	Editted Authors of DriverGym project (#669 )	2021-11-18 10:18:18 +01:00
Parth Kothari	1ac35eaef2	Add DriverGym project to SB3 project documentation (#665 ) * Added DriverGym project * Updated changelog * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-11-17 11:13:43 +01:00
Antonin RAFFIN	d228364ccf	Add timeout handling for on-policy algorithms (#658 ) * Add timeout handling for on-policy algorithms * Fixes * Fix infinite loop in eval * Skip type check for python 3.9 * Fix for discrete obs + add docstring * Fix A2C test * Removed unused helper * Add test for infinite horizon * typed ast should be fixed * Apply suggestions from code review Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Anssi <kaneran21@hotmail.com>	2021-11-16 17:19:16 +01:00
Antonin RAFFIN	e75e1de4c1	Fix indentation in RL tips doc (#657 ) * Update rl_tips.rst indent fix to make if done and its following statement work * Fix indentation and update changelog * Skip type check for python 3.9 Co-authored-by: paulg <cove9988@gmail.com>	2021-11-10 16:54:20 +00:00
Antonin RAFFIN	2bb4500948	Fix `set_env` when using `VecNormalize` (#638 ) * Fix `set_env` when using `VecNormalize` * Update version	2021-11-02 13:52:26 +02:00
ac-93	98c1a637cf	add tactile-gym to the list of projects using SB3 (#640 ) * Update projects.rst * Update changelog.rst * Update projects.rst * Fix doc build Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-10-31 18:26:06 +01:00
Oleksii Kachaiev	0c17fedfac	Adjust FPS calculation to accommodate for reset_num_timesteps=False (#636 ) * Store number of timesteps at the beginning of each learn cycle * Update changelog * Set default _num_timesteps_at_start in the contructor * Test case for FPS logger * Adjust test to cover both on-policy and off-policy algorithms * Fix formatting * Update test and add comment * Fix test Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com> Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-10-31 18:19:03 +01:00
Edouard Leurent	a2e3001598	Add highway-env to the list of projects using SB3 (#639 ) * Add highway-env to the list of projects using SB3 Many thanks for this fantastic library, keep up the good work! * Update changelog with added documentation	2021-10-30 13:53:36 +02:00
Oleksii Kachaiev	0503e694b2	Introduce norm_obs_keys param for VecNormalize environment wrapper (#631 ) * Implement new norm_obs_keys param for VecNormalize environment wrapper * Simplified doc string to avoid issues with lint and doc * Updated changelog * Update changelog.rst * Update test_vec_normalize.py * Update sanity checks * Fix backward compat * Update doc * Update changelog * Fix lint warnings * Fix tests * Minor edit * observation_space sanity check was applied twice Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com> Co-authored-by: Anssi <kaneran21@hotmail.com> Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>	2021-10-28 19:18:39 +02:00
Antonin RAFFIN	7b977d7b03	Release 1.3.0 (#625 )	2021-10-23 17:07:00 +02:00
Antonin RAFFIN	e907eca18e	Fix `set_env` to keep the number of timesteps (#615 ) * Fix for `set_env` * Add test and update changelog * Use underscores and f-strings * Add PyPi info * Update comments	2021-10-23 16:36:40 +02:00
Antonin RAFFIN	1564a85081	System info helper (#613 ) * Add `system_env_info` * Add `print_system_info` to load and store system info at save time * Remove TODO * Rename to `get_system_info` * Import as sb3 for consistency * Update changelog * Add warning for old SB3 versions * Use underscore litteral for more clarity	2021-10-18 10:43:56 +02:00
Timo Kaufmann	09e9fc42eb	Use consistent logging keys (#605 ) * Use a consistent key to log the total timesteps This changes the timestep logging key of on-policy algorithms from `time/total_timesteps` to `time/total timesteps` (note the underscore/space). The off-policy algorithms and the eval callback already use the latter, so this behavior is more consistent. * Use underscores instead of spaces in logging keys Most keys already followed this policy and consistent behavior is friendlier to new users. * Minor edit and bump version Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-10-12 13:17:30 +02:00
Antonin RAFFIN	75aa31dcfb	Update SB3 contrib algorithms (#604 )	2021-10-10 15:41:39 +02:00
Antonin RAFFIN	1881d904a0	Doc fix and improve error messages (#598 ) * Fix custom env doc * Catch common mistake * Improve `EvalCallback` error message * Lint test * Update docs/guide/custom_env.rst Co-authored-by: Adam Gleave <adam@gleave.me> Co-authored-by: Adam Gleave <adam@gleave.me>	2021-10-08 18:08:31 +02:00
Ilja Avadiev	740d61ada3	Doc fix environment mixup (#588 )	2021-09-29 10:16:59 +02:00
Antonin RAFFIN	306e49fda6	Fixes in `is_vectorized_observation` (#587 ) * Fix is vectorized bug in DQN * Fix sub-classed obs	2021-09-28 21:57:49 +02:00
Antonin RAFFIN	201fbffa8c	Remove `sde_net_arch` + Simplify policy (#584 ) * Remove `sde_net_arch` + Simplify policy * Add warning at load time	2021-09-28 22:32:54 +03:00
batu	89af49ca91	ONNX Documentation Update (#464 ) * Updated ONNX documentation First draft on the documentation explaining how to export SB3 models in the ONNX format * Updated changelog with ONNX documentation fix * Address comments * Update changelog.rst * Update rtd env * Fixes + add test example Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org> Co-authored-by: Anssi Kanervisto <anssk@Anssis-MacBook-Air.local> Co-authored-by: Anssi Kanervisto <kaneran21@hotmail.com>	2021-09-26 17:40:35 +02:00
Baek Junyeob	914bc10a0d	Add policy-distillation-baselines to project page (#578 ) * Update projects.rst * Update docs/misc/projects.rst * Apply suggestions from code review * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-09-20 16:30:16 +02:00
Adam Gleave	e825fbdd33	VecNormalize: allow non-continuous observations when norm_obs is False (#575 ) * VecNormalize: allow non-continuous observations when norm_obs is False * Update changelog, fix lint * Switch to environment present in new and old versions of Gym * Fix name Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-09-18 12:11:01 +02:00
Matthew Allen	76c212a854	Add RLGym to project page (#576 ) * Add RLGym to projects list. Per the request in this issue on our repo: https://github.com/lucas-emery/rocket-league-gym/issues/24 * Update changelog documentation section * Update changelog.rst * Update docs/misc/projects.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-09-18 11:47:22 +02:00
Wilhelm Kirchgässner	303df08a80	Add GEM project to project section of doc (#574 ) * add GEM project to project section of doc * Update docs/misc/projects.rst * Update changelog.rst Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>	2021-09-18 11:10:04 +02:00
Cyprien	f3a35aa786	Add method `predict_values` for ActorCriticPolicy (#569 ) * feat: add method predict_values for ActorCriticPolicy * Fixes for new gym version * Reformat Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>	2021-09-15 14:03:04 +02:00
Antonin RAFFIN	16f8b21d9b	Add `get_distribution` for on-policy algorithms (#566 ) * feat: get_distribution method for ActorCriticPolicy New method get_distribution for class ActorCriticPolicy returning current action distribution given observations * doc: updating changelog.rst - adding block for Release 1.2.1a0 - adding cyprienc to contributors * style: make format * fix: updating version.txt Changing version from 1.2.0 to 1.2.1a0 * Update changelog * Add test for get distribution Co-authored-by: Cyprien <courtot.c@gmail.com>	2021-09-13 10:25:42 +02:00

343 commits