Commit graph

40 commits

Author | SHA1 | Message | Date
Antonin Raffin | 5278a6f3f8 | Testing off policy normalization | 2019-11-14 14:35:00 +01:00
Antonin Raffin | 95c741c707 | Fix logger for discrete actions | 2019-11-07 17:01:02 +01:00
Antonin Raffin | c6f90b9c3c | Improve VecNormalize syncing for evaluation | 2019-11-07 11:17:26 +01:00
Antonin Raffin | 9644ae89cf | Log ppo std | 2019-10-31 16:17:08 +01:00
Antonin Raffin | 925afe784c | SDE on latent_pi | 2019-10-31 11:44:27 +01:00
Antonin Raffin | 0174ec269e | Clean up | 2019-10-29 18:43:16 +01:00
Antonin Raffin | c0cb9fc9c5 | Fix predict method | 2019-10-29 18:30:36 +01:00
Antonin Raffin | 0d41bc1356 | Add more logging | 2019-10-29 15:15:11 +01:00
Antonin Raffin | c15b4bda1e | Add first draft of SDE | 2019-10-28 18:24:13 +01:00
Antonin Raffin | df1e7aa000 | Add docstring | 2019-10-28 17:42:39 +01:00
Antonin Raffin | d67822718c | Add learning rate schedule | 2019-10-28 16:47:13 +01:00
Antonin Raffin | 799e30ff3d | Bug fixes for A2C and PPO | 2019-10-28 14:27:32 +01:00
Antonin Raffin | 584f549fa1 | Bug fix for discrete actions | 2019-10-25 12:00:37 +02:00
Antonin RAFFIN | 3bc746c6ee | Add logger for PPO | 2019-10-17 13:44:48 +02:00
Antonin RAFFIN | 53898f3d1a | Add flexible mlp | 2019-10-17 13:32:25 +02:00
Antonin Raffin | b5656531d1 | Enable logger for SAC/TD3 + refactor | 2019-10-10 13:47:13 +02:00
Antonin Raffin | ef50bb81e8 | Add support for categorical distribution | 2019-10-08 13:06:38 +02:00
Antonin Raffin | 440166fe26 | Add a parameter to disable ortho init | 2019-09-26 16:29:47 +02:00
Antonin Raffin | b4dc9d4e4d | Add doc | 2019-09-26 11:46:40 +02:00
Antonin Raffin | 322399e8fe | Update collect rollout | 2019-09-25 13:20:06 +02:00
Antonin Raffin | 6bfbb7198a | Rename seed | 2019-09-24 16:59:47 +02:00
Antonin Raffin | 32648d9029 | Add docstrings | 2019-09-24 15:30:58 +02:00
Antonin Raffin | f4fe1362f0 | Renaming | 2019-09-24 14:53:03 +02:00
Antonin RAFFIN | 8adb8f9931 | Change default dist to gaussian | 2019-09-22 12:56:27 +02:00
Antonin RAFFIN | ddaafcbc36 | Refactor: add distributions | 2019-09-22 12:52:49 +02:00
Antonin RAFFIN | 70e1d673a9 | Separate policy and value net | 2019-09-21 18:12:06 +02:00
Antonin RAFFIN | 2469ff3859 | Reformat | 2019-09-21 17:17:09 +02:00
Antonin RAFFIN | 3ececcd3a9 | Add tensorboard example | 2019-09-21 17:09:26 +02:00
Antonin RAFFIN | e8ddd1f901 | Improve initialization | 2019-09-21 16:48:51 +02:00
Antonin Raffin | 0e727a5f72 | Full compat for VecEnv + bug fixes for cuda | 2019-09-20 16:43:19 +02:00
Antonin Raffin | 255ff10bff | PPO VecEnv compat | 2019-09-20 15:19:04 +02:00
Antonin RAFFIN | cc4380eccd | Add eval env and clip vf | 2019-09-19 17:18:41 +02:00
Antonin RAFFIN | fe8b415cbf | First sign of life | 2019-09-19 16:21:28 +02:00
Antonin RAFFIN | ad089f5b19 | Add explained variance | 2019-09-19 11:43:27 +02:00
Antonin RAFFIN | 26f0c8d8e5 | Refactor buffer | 2019-09-19 11:43:15 +02:00
Antonin RAFFIN | 149148d4c7 | Bug fix actor forward | 2019-09-18 23:55:41 +02:00
Antonin RAFFIN | 525fe43552 | Bug fix rollout buffer | 2019-09-18 23:48:47 +02:00
Antonin RAFFIN | e1c1d5c4ab | Bug fixes (not working yet) | 2019-09-18 22:12:32 +02:00
Antonin RAFFIN | 6bb7e183d2 | Running PPO (not working yet) | 2019-09-18 15:35:17 +02:00
Antonin Raffin | 54dd7ea60d | Start PPO | 2019-09-18 13:10:27 +02:00