mirror of
https://github.com/saymrwulf/stable-baselines3.git
synced 2026-05-16 21:10:08 +00:00
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Torchy Baselines
PyTorch version of Stable Baselines, a set of improved implementations of reinforcement learning algorithms.
Implemented Algorithms
- A2C
- CEM-RL (with TD3)
- PPO
- SAC
- TD3
Roadmap
TODO:
- better predict
- complete logger
- Refactor: buffer with numpy array instead of pytorch
- Refactor: remove duplicated code for evaluation
- double check the shape of log prob
- try squashing both mean and output when using SAC + SDE
- plotting? -> zoo
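
As an illustration of the "double check the shape of log prob" item, here is a minimal sketch of the check in plain Python. It is not code from this repository: the function names `diag_gaussian_log_prob` and `tanh_squash_log_prob` and the `eps` parameter are hypothetical, and nested lists stand in for tensors. It assumes a diagonal Gaussian policy, where the per-dimension log-densities are summed over the action dimension so that each sample in the batch gets exactly one log-probability, and it assumes the usual change-of-variables correction when actions are squashed with tanh.

```python
import math


def diag_gaussian_log_prob(actions, means, log_stds):
    """Return one log-probability per batch row for a diagonal Gaussian.

    Each argument is a list of rows with shape (batch, action_dim).
    Hypothetical helper for illustration only.
    """
    log_probs = []
    for a_row, m_row, s_row in zip(actions, means, log_stds):
        total = 0.0
        for a, m, log_std in zip(a_row, m_row, s_row):
            var = math.exp(2.0 * log_std)
            # Log-density of a one-dimensional normal distribution.
            total += -0.5 * ((a - m) ** 2 / var + 2.0 * log_std + math.log(2.0 * math.pi))
        # Summing over the action dimension leaves one scalar per sample.
        log_probs.append(total)
    return log_probs


def tanh_squash_log_prob(pre_tanh_actions, gaussian_log_probs, eps=1e-6):
    """Correct Gaussian log-probabilities for the squashing a = tanh(u).

    eps is an assumed small constant that keeps the logarithm finite.
    """
    corrected = []
    for u_row, log_p in zip(pre_tanh_actions, gaussian_log_probs):
        # Change of variables: subtract log|da/du| summed over action dims.
        correction = sum(math.log(1.0 - math.tanh(u) ** 2 + eps) for u in u_row)
        corrected.append(log_p - correction)
    return corrected


# Batch of 3 samples with 2 action dimensions each.
batch = [[0.0, 0.0], [0.5, -0.5], [1.0, 2.0]]
zeros = [[0.0, 0.0] for _ in batch]
log_p = diag_gaussian_log_prob(batch, zeros, zeros)
squashed_log_p = tanh_squash_log_prob(batch, log_p)
```

The output has length 3, one value per sample, rather than a 3 x 2 grid. A result that still carries the action dimension usually means the sum over that dimension was skipped.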
Later:
- get_parameters / set_parameters
- SDE: use affine transform to scale the noise after a tanh transform?
- Use MultivariateNormal with full covariance matrix?
- CNN policies + normalization
- tensorboard support
- DQN
- TRPO
- ACER
- DDPG
- HER -> use stable-baselines because it does not depend on tf?