stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-06-26 03:01:19 +00:00


M. Ernestus 0c50d75ecb TD3 Code review (#245)		2021-02-27 17:33:50 +01:00

* Removed unneeded overrides of feature_extractor and normalize_images in the TD3 Actor.
* Add learning rate schedule example (#248)
* Add learning rate schedule example
* Update docs/guide/examples.rst
Co-authored-by: Adam Gleave <adam@gleave.me>
* Address comments
Co-authored-by: Adam Gleave <adam@gleave.me>
* Add supported action spaces checks (#254)
* Add supported action spaces checks
* Address comment
* Use `pass` in an abstractmethod instead of deleting the arguments.
* Remove the "deterministic" keyword from the forward method of the TD3 Actor since it always is deterministic anyways.
* Rename _get_data to _get_data_to_reconstruct_model. _get_data was too generic and could have meant anything.
* Remove the n_episodes_rollout parameter and allow passing tuples as train_freq instead.
* Fix docstring of `train_freq` parameter.
* Black fixes.
* Fix TD3 delayed update + rename `_get_data()`
* Fix TD3 test
* Normalize `train_freq` to a tuple in the constructor and turn the warning into an assert.
* Make one step the default train frequency.
* Black fixes.
* Change np.bool to bool.
* Use the tuple format to specify an amount of steps in terms of steps or episodes in the collect_collouts of the off policy algorithm.
* Use the tuple format to specify an amount of steps in terms of steps or episodes in the collect_collouts of HER.
* Use named tuple for train freq
* Rename train_freq to train_every and TrainFreq to ExperienceDuration. Also add some type annotations and documentation.
* Black fixes.
* Revert to train_freq
* Fix terminal observation issues
* Typo
* Fix action noise bug in HER
* Add assert when loading HER models
* Update version

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
Co-authored-by: Adam Gleave <adam@gleave.me>
__init__.py	Use Monitor episode reward/length for `evaluate_policy` (#220)	2020-11-16 11:52:28 +01:00
base_vec_env.py	Use Monitor episode reward/length for `evaluate_policy` (#220)	2020-11-16 11:52:28 +01:00
dummy_vec_env.py	Fix numpy warning and update migration guide (#307)	2021-02-01 11:24:44 +01:00
obs_dict_wrapper.py	Implement HER (#120)	2020-10-22 11:56:43 +02:00
subproc_vec_env.py	Use Monitor episode reward/length for `evaluate_policy` (#220)	2020-11-16 11:52:28 +01:00
util.py	Improve typing coverage (#175)	2020-10-07 10:51:49 +02:00
vec_check_nan.py	Improve typing coverage (#175)	2020-10-07 10:51:49 +02:00
vec_frame_stack.py	Avoid transposing channel-first envs (#213)	2020-11-03 12:34:09 +01:00
vec_normalize.py	TD3 Code review (#245)	2021-02-27 17:33:50 +01:00
vec_transpose.py	TD3 Code review (#245)	2021-02-27 17:33:50 +01:00
vec_video_recorder.py	Improve typing coverage (#175)	2020-10-07 10:51:49 +02:00