saymrwulf/stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-05-17 21:20:11 +00:00

Antonin Raffin cdb62a93fe Bug fix for off-policy normalization

Now working properly

2019-11-15 11:00:31 +01:00

Torchy Baselines

PyTorch version of Stable Baselines, a set of improved implementations of reinforcement learning algorithms.

Implemented Algorithms

A2C
CEM-RL (with TD3)
PPO
SAC
TD3

Roadmap

TODO:

save/load
better predict
complete logger
SDE: reduce the number of parameters (only n_features instead of n_features x n_actions) for A2C (done for TD3)
SDE: learn the feature extractor?
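
To make the parameter reduction in the SDE item concrete, here is a minimal sketch. It assumes SDE refers to state-dependent exploration, where the exploration noise is the feature vector multiplied by a sampled noise matrix whose standard deviations are learned. The function names are hypothetical and are not part of this library.

```python
import random


def sde_param_count(n_features, n_actions, shared_across_actions):
    # One std per feature when shared, else one per (feature, action) pair.
    return n_features if shared_across_actions else n_features * n_actions


def sample_noise_matrix(std, n_actions, rng):
    # std has one entry per feature: a single float (shared across actions)
    # or a list of n_actions floats (one std per action).
    matrix = []
    for row in std:
        row_std = [row] * n_actions if isinstance(row, float) else row
        matrix.append([rng.gauss(0.0, s) for s in row_std])
    return matrix


def exploration_noise(features, matrix):
    # Noise per action: features (n_features) times matrix (n_features x n_actions).
    n_actions = len(matrix[0])
    return [
        sum(f * matrix[i][j] for i, f in enumerate(features))
        for j in range(n_actions)
    ]


n_features, n_actions = 64, 6
print(sde_param_count(n_features, n_actions, shared_across_actions=False))  # 384
print(sde_param_count(n_features, n_actions, shared_across_actions=True))   # 64
```

In this sketch the sampled matrix keeps the shape n_features x n_actions in both cases. Only the number of learned standard deviations changes.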

Later:

get_parameters / set_parameters
CNN policies + normalization
tensorboard support
DQN
TRPO
ACER
DDPG
HER -> use the stable-baselines version, since it does not depend on tf?