stable-baselines3/stable_baselines3
Hugh Perkins 2cc1477fa2
Fix advantage normalization with mini-batchsize of 1 (#1028)
* fix nan in advantages with batch size 1, for ppo

* changelog

* black

* Simplify test

* Bump version

Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2022-08-25 11:50:08 +02:00
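
The failure mode named in the commit above can be sketched in plain Python. This is an illustration only, not the library's code: `normalize_advantages` and its `eps` parameter are hypothetical names, and the sketch assumes the NaN comes from standardizing with a sample standard deviation (n - 1 denominator), which is undefined for a single sample.

```python
import math


def normalize_advantages(advantages, eps=1e-8):
    """Standardize a mini-batch of advantages (hypothetical helper).

    Assumption: the standard deviation uses the n - 1 denominator, so a
    batch of one sample has no defined spread. In that case the batch is
    returned unchanged instead of dividing by an undefined value.
    """
    n = len(advantages)
    if n <= 1:
        # Nothing to normalize against: skip rather than produce NaN.
        return list(advantages)
    mean = sum(advantages) / n
    # Sample variance with Bessel's correction (divide by n - 1).
    var = sum((a - mean) ** 2 for a in advantages) / (n - 1)
    std = math.sqrt(var)
    return [(a - mean) / (std + eps) for a in advantages]
```

With a mini-batch of one, `normalize_advantages([0.5])` hands back `[0.5]` untouched; with two or more samples the result is centered on zero.
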
a2c Upgrade code to Python 3.7+ syntax using pyupgrade (#887) 2022-04-25 13:01:38 +03:00
common CheckpointCallback can now save replay buffer and VecNormalize (#1030) 2022-08-25 10:57:51 +02:00
ddpg Upgrade code to Python 3.7+ syntax using pyupgrade (#887) 2022-04-25 13:01:38 +03:00
dqn Include running_mean and running_val when updating target networks (#1004) 2022-08-23 10:20:43 +02:00
her Support for device=auto buffers and set it as default value (#1009) 2022-08-16 17:54:55 +02:00
ppo Fix advantage normalization with mini-batchsize of 1 (#1028) 2022-08-25 11:50:08 +02:00
sac Fix advantage normalization with mini-batchsize of 1 (#1028) 2022-08-25 11:50:08 +02:00
td3 Include running_mean and running_val when updating target networks (#1004) 2022-08-23 10:20:43 +02:00
__init__.py Upgrade code to Python 3.7+ syntax using pyupgrade (#887) 2022-04-25 13:01:38 +03:00
py.typed Rename to stable-baselines3 2020-05-05 15:02:35 +02:00
version.txt Fix advantage normalization with mini-batchsize of 1 (#1028) 2022-08-25 11:50:08 +02:00