stable-baselines3/docs/modules
Rohan Tangri 2ada2dd0b2
Update PPO KL Divergence Estimator (#419)
* remove unused all_kl_divs memory

* new kl approximate equation

* move kl check before update step

* update changelog

* add continue_training flag update to kl check

* add verbose check

* update changelog

* lint with black

* r -> log_ratio

* Add link to PR

* invert ratio

* Fix for Sphinx v4.0

Co-authored-by: Anssi <kaneran21@hotmail.com>
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>
2021-05-10 13:21:00 +03:00
File      Last commit                                      Date
a2c.rst   Add custom objects support + bug fix (#336)      2021-03-06 15:17:43 +02:00
base.rst  Review of code (A2C, PPO and refactoring) (#35)  2020-06-09 13:54:18 +02:00
ddpg.rst  Update PPO KL Divergence Estimator (#419)        2021-05-10 13:21:00 +03:00
dqn.rst   Add code of conduct + update doc (#373)          2021-03-31 10:31:03 +02:00
her.rst   Add custom objects support + bug fix (#336)      2021-03-06 15:17:43 +02:00
ppo.rst   Add custom objects support + bug fix (#336)      2021-03-06 15:17:43 +02:00
sac.rst   Add custom objects support + bug fix (#336)      2021-03-06 15:17:43 +02:00
td3.rst   Add custom objects support + bug fix (#336)      2021-03-06 15:17:43 +02:00