stable-baselines3/stable_baselines3
Rohan Tangri df6f9de8f4
KL Divergence Helper Function (#431)
* add kl divergence wrapper

* add test

* update changelog

* black lint

* remove unused import

* Fix ent coef loading for SAC (#429)

* Fix ent coef loading for SAC

* Better fix and add comment

* add 'distribution' to base Distribution class

* add sample test

* revert to plain pytorch implementation

* black reformat

* Update docs/misc/changelog.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>

* Doc update (custom policy + fix her example) (#436)

* isort and black reformat

* float -> bool tensor

* add sanity test

* more concise kl code

* remove outdated comment

* all -> allclose assertion

* Fix PyTorch warning

* Update gSDE entropy test

* Update entropy test

Co-authored-by: Antonin Raffin <antonin.raffin@dlr.de>
2021-05-20 19:01:07 +02:00
a2c          Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
common       KL Divergence Helper Function (#431)  2021-05-20 19:01:07 +02:00
ddpg         Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
dqn          Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
her          Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
ppo          Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
sac          Fix ent coef loading for SAC (#429)   2021-05-12 12:21:54 +03:00
td3          Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
__init__.py  Dictionary Observations (#243)        2021-05-11 12:29:30 +02:00
py.typed     Rename to stable-baselines3           2020-05-05 15:02:35 +02:00
version.txt  KL Divergence Helper Function (#431)  2021-05-20 19:01:07 +02:00