Antonin Raffin
|
ad32aa60f3
|
Add sde scheduler
|
2019-11-18 16:03:08 +01:00 |
|
Antonin Raffin
|
d8a7556d84
|
Merge branch 'feat/sde' into feat/offpolicy-sde
|
2019-11-18 15:14:05 +01:00 |
|
Antonin Raffin
|
ef59a7e431
|
Update version + add docstring
|
2019-11-18 15:11:19 +01:00 |
|
Antonin Raffin
|
b9c20d443d
|
Update doc + add test for tanh bijector
|
2019-11-18 15:04:07 +01:00 |
|
Antonin Raffin
|
5d353d598c
|
Start cleanup + update docstrings
|
2019-11-18 14:09:31 +01:00 |
|
Antonin Raffin
|
fb64072859
|
Update sde test
|
2019-11-15 11:07:49 +01:00 |
|
Antonin Raffin
|
cdb62a93fe
|
Bug fix for off-policy normalization
Now working properly
|
2019-11-15 11:00:31 +01:00 |
|
Antonin Raffin
|
f719544386
|
Change default batch size for SAC
|
2019-11-14 14:35:47 +01:00 |
|
Antonin Raffin
|
5278a6f3f8
|
Testing off policy normalization
|
2019-11-14 14:35:00 +01:00 |
|
Antonin Raffin
|
8aac10f3fa
|
Fix device
|
2019-11-13 14:38:18 +01:00 |
|
Antonin Raffin
|
da325a0ba7
|
Solve NaN issue and reduce number of parameters
|
2019-11-13 13:02:37 +01:00 |
|
Antonin RAFFIN
|
d725d01186
|
Normalize returns
|
2019-11-12 22:32:21 +01:00 |
|
Antonin Raffin
|
623e3cf4f9
|
Add SDE lr
|
2019-11-12 18:38:42 +01:00 |
|
Antonin Raffin
|
a08382faab
|
Add sde update for TD3
|
2019-11-12 18:37:13 +01:00 |
|
Antonin Raffin
|
f2a61949ae
|
Bump version and change default noise clipping
|
2019-11-08 14:59:42 +01:00 |
|
Antonin Raffin
|
715865a0fe
|
Add noise clipping
|
2019-11-08 13:17:38 +01:00 |
|
Antonin Raffin
|
f4546837c3
|
Add std to logger
|
2019-11-07 17:41:28 +01:00 |
|
Antonin Raffin
|
db87e0d36a
|
Quick and dirty SDE version for TD3
|
2019-11-07 17:31:52 +01:00 |
|
Antonin Raffin
|
95c741c707
|
Fix logger for discrete actions
|
2019-11-07 17:01:02 +01:00 |
|
Antonin Raffin
|
c6f90b9c3c
|
Improve VecNormalize syncing for evaluation
|
2019-11-07 11:17:26 +01:00 |
|
Antonin Raffin
|
6c7c8375a4
|
Update log interval
|
2019-11-07 11:16:59 +01:00 |
|
Antonin Raffin
|
9acff0f5b3
|
Remove plotting script
|
2019-10-31 17:01:27 +01:00 |
|
Antonin Raffin
|
0e092f7c52
|
Add plotting script
|
2019-10-31 16:59:35 +01:00 |
|
Antonin Raffin
|
9644ae89cf
|
Log ppo std
|
2019-10-31 16:17:08 +01:00 |
|
Antonin Raffin
|
72a6f18e43
|
Add sde test + fix random seed
|
2019-10-31 14:14:30 +01:00 |
|
Antonin Raffin
|
925afe784c
|
SDE on latent_pi
|
2019-10-31 11:44:27 +01:00 |
|
Antonin Raffin
|
862ae666b5
|
Try squashing the sde
|
2019-10-30 15:30:09 +01:00 |
|
Antonin Raffin
|
0174ec269e
|
Clean up
|
2019-10-29 18:43:16 +01:00 |
|
Antonin Raffin
|
9e8f6e0020
|
Add default filename for monitor
|
2019-10-29 18:42:34 +01:00 |
|
Antonin Raffin
|
c0cb9fc9c5
|
Fix predict method
|
2019-10-29 18:30:36 +01:00 |
|
Antonin Raffin
|
42d50ed09b
|
Add expln
|
2019-10-29 15:15:54 +01:00 |
|
Antonin Raffin
|
0d41bc1356
|
Add more logging
|
2019-10-29 15:15:11 +01:00 |
|
Antonin Raffin
|
69a348276e
|
Add classic advantage computation
|
2019-10-29 12:36:40 +01:00 |
|
Antonin Raffin
|
c15b4bda1e
|
Add first draft of SDE
|
2019-10-28 18:24:13 +01:00 |
|
Antonin Raffin
|
df1e7aa000
|
Add docstring
|
2019-10-28 17:42:39 +01:00 |
|
Antonin Raffin
|
d67822718c
|
Add learning rate schedule
|
2019-10-28 16:47:13 +01:00 |
|
Antonin Raffin
|
799e30ff3d
|
Bug fixes for A2C and PPO
|
2019-10-28 14:27:32 +01:00 |
|
Antonin Raffin
|
b150167bdd
|
Update default hyperparams
|
2019-10-25 13:01:00 +02:00 |
|
Antonin Raffin
|
584f549fa1
|
Bug fix for discrete actions
|
2019-10-25 12:00:37 +02:00 |
|
Antonin Raffin
|
f8bcb8ee16
|
Update A2C params
|
2019-10-25 11:31:20 +02:00 |
|
Antonin Raffin
|
0ad743c85d
|
Add A2C
|
2019-10-25 10:59:15 +02:00 |
|
Antonin RAFFIN
|
3bc746c6ee
|
Add logger for PPO
|
2019-10-17 13:44:48 +02:00 |
|
Antonin RAFFIN
|
53898f3d1a
|
Add flexible mlp
|
2019-10-17 13:32:25 +02:00 |
|
Antonin Raffin
|
64de9923d6
|
Bug fixes for python 2
|
2019-10-15 13:24:53 +02:00 |
|
Antonin Raffin
|
ab64ff464e
|
Add tensorboard_log dummy arg
|
2019-10-14 11:09:22 +02:00 |
|
Antonin Raffin
|
b5656531d1
|
Enable logger for SAC/TD3 + refactor
|
2019-10-10 13:47:13 +02:00 |
|
Antonin Raffin
|
dbaa5daca6
|
Add logger and Monitor wrapper
|
2019-10-10 13:41:54 +02:00 |
|
Antonin Raffin
|
ef50bb81e8
|
Add support for categorical distribution
|
2019-10-08 13:06:38 +02:00 |
|
Antonin Raffin
|
4d0c033bf2
|
Bug fix when randomly sampling actions
|
2019-10-07 16:36:48 +02:00 |
|
Antonin Raffin
|
37ab9d10f1
|
Rescale actions and add action noise
|
2019-10-07 16:26:03 +02:00 |
|