Commit graph

93 commits

Author SHA1 Message Date
Antonin Raffin
8aac10f3fa Fix device 2019-11-13 14:38:18 +01:00
Antonin Raffin
da325a0ba7 Solve NaN issue and reduce number of parameters 2019-11-13 13:02:37 +01:00
Antonin RAFFIN
d725d01186 Normalize returns 2019-11-12 22:32:21 +01:00
Antonin Raffin
623e3cf4f9 Add SDE lr 2019-11-12 18:38:42 +01:00
Antonin Raffin
a08382faab Add sde update for TD3 2019-11-12 18:37:13 +01:00
Antonin Raffin
f2a61949ae Bump version and change default noise clipping 2019-11-08 14:59:42 +01:00
Antonin Raffin
715865a0fe Add noise clipping 2019-11-08 13:17:38 +01:00
Antonin Raffin
f4546837c3 Add std to logger 2019-11-07 17:41:28 +01:00
Antonin Raffin
db87e0d36a Quick and dirty SDE version for TD3 2019-11-07 17:31:52 +01:00
Antonin Raffin
95c741c707 Fix logger for discrete actions 2019-11-07 17:01:02 +01:00
Antonin Raffin
c6f90b9c3c Improve VecNormalize syncing for evaluation 2019-11-07 11:17:26 +01:00
Antonin Raffin
6c7c8375a4 Update log interval 2019-11-07 11:16:59 +01:00
Antonin Raffin
9acff0f5b3 Remove plotting script 2019-10-31 17:01:27 +01:00
Antonin Raffin
0e092f7c52 Add plotting script 2019-10-31 16:59:35 +01:00
Antonin Raffin
9644ae89cf Log ppo std 2019-10-31 16:17:08 +01:00
Antonin Raffin
72a6f18e43 Add sde test + fix random seed 2019-10-31 14:14:30 +01:00
Antonin Raffin
925afe784c SDE on latent_pi 2019-10-31 11:44:27 +01:00
Antonin Raffin
862ae666b5 Try squashing the sde 2019-10-30 15:30:09 +01:00
Antonin Raffin
0174ec269e Clean up 2019-10-29 18:43:16 +01:00
Antonin Raffin
9e8f6e0020 Add default filename for monitor 2019-10-29 18:42:34 +01:00
Antonin Raffin
c0cb9fc9c5 Fix predict method 2019-10-29 18:30:36 +01:00
Antonin Raffin
42d50ed09b Add expln 2019-10-29 15:15:54 +01:00
Antonin Raffin
0d41bc1356 Add more logging 2019-10-29 15:15:11 +01:00
Antonin Raffin
69a348276e Add classic advantage computation 2019-10-29 12:36:40 +01:00
Antonin Raffin
c15b4bda1e Add first draft of SDE 2019-10-28 18:24:13 +01:00
Antonin Raffin
df1e7aa000 Add docstring 2019-10-28 17:42:39 +01:00
Antonin Raffin
d67822718c Add learning rate schedule 2019-10-28 16:47:13 +01:00
Antonin Raffin
799e30ff3d Bug fixes for A2C and PPO 2019-10-28 14:27:32 +01:00
Antonin Raffin
b150167bdd Update default hyperparams 2019-10-25 13:01:00 +02:00
Antonin Raffin
584f549fa1 Bug fix for discrete actions 2019-10-25 12:00:37 +02:00
Antonin Raffin
f8bcb8ee16 Update A2C params 2019-10-25 11:31:20 +02:00
Antonin Raffin
0ad743c85d Add A2C 2019-10-25 10:59:15 +02:00
Antonin RAFFIN
3bc746c6ee Add logger for PPO 2019-10-17 13:44:48 +02:00
Antonin RAFFIN
53898f3d1a Add flexible mlp 2019-10-17 13:32:25 +02:00
Antonin Raffin
64de9923d6 Bug fixes for python 2 2019-10-15 13:24:53 +02:00
Antonin Raffin
ab64ff464e Add tensorboard_log dummy arg 2019-10-14 11:09:22 +02:00
Antonin Raffin
b5656531d1 Enable logger for SAC/TD3 + refactor 2019-10-10 13:47:13 +02:00
Antonin Raffin
dbaa5daca6 Add logger and Monitor wrapper 2019-10-10 13:41:54 +02:00
Antonin Raffin
ef50bb81e8 Add support for categorical distribution 2019-10-08 13:06:38 +02:00
Antonin Raffin
4d0c033bf2 Bug fix when randomly sampling actions 2019-10-07 16:36:48 +02:00
Antonin Raffin
37ab9d10f1 Rescale actions and add action noise 2019-10-07 16:26:03 +02:00
Antonin RAFFIN
12f854e1aa Fix learning starts 2019-10-01 21:56:37 +02:00
Antonin Raffin
440166fe26 Add a parameter to disable ortho init 2019-09-26 16:29:47 +02:00
Antonin Raffin
b4dc9d4e4d Add doc 2019-09-26 11:46:40 +02:00
Antonin Raffin
70e5de1d1b Update SAC defaults 2019-09-25 17:07:54 +02:00
Antonin Raffin
0e4fc9c0ac Bug fix SAC 2019-09-25 13:30:08 +02:00
Antonin Raffin
322399e8fe Update collect rollout 2019-09-25 13:20:06 +02:00
Antonin Raffin
6bfbb7198a Rename seed 2019-09-24 16:59:47 +02:00
Antonin Raffin
32648d9029 Add docstrings 2019-09-24 15:30:58 +02:00
Antonin Raffin
f4fe1362f0 Renaming 2019-09-24 14:53:03 +02:00