Commit graph

128 commits

Author SHA1 Message Date
Antonin Raffin
3cdd5f20af Bug fix + add test for sde net arch 2019-12-02 14:14:48 +01:00
Antonin Raffin
8e9802784c Revert previous changes in SAC + SDE 2019-12-02 14:06:30 +01:00
Antonin Raffin
4e39a0627c Refactor: enable sde net arch for TD3 and SAC 2019-12-02 14:06:17 +01:00
Antonin Raffin
a2a8bbdf11 Sample n matrices for A2C/PPO when using SDE 2019-12-02 11:48:34 +01:00
Antonin Raffin
7a6a500398 Allow to use states directly as features for sde 2019-12-02 11:48:16 +01:00
Antonin Raffin
21e655ecbf Add test for SAC with different entropy temperature 2019-12-02 11:47:52 +01:00
Antonin RAFFIN
03a84f97ea Add monte-carlo test for SDE distribution 2019-12-01 16:46:39 +01:00
Antonin RAFFIN
879191b26a Bug fix in SAC with constant ent coeff + try batch sde matrices 2019-12-01 13:11:13 +01:00
Antonin Raffin
fbe29a7298 Track down autograd error: "Trying to backward through the graph a second time" -> added a comment 2019-11-27 17:29:47 +01:00
Antonin Raffin
fe67a98711 Log more values 2019-11-26 17:44:06 +01:00
Antonin Raffin
5483e02d1a Add SDE support for SAC 2019-11-26 15:26:12 +01:00
Antonin Raffin
d26fcf4566 Fix grad computation for sde test 2019-11-26 11:57:48 +01:00
Antonin Raffin
0885dbe74b Bug fix in choosing the distribution 2019-11-25 15:02:10 +01:00
Antonin Raffin
5d6649d92b Enable separate feature extraction for SDE 2019-11-25 14:54:13 +01:00
Antonin Raffin
d0003ee4ec Enable kwargs for proba dist 2019-11-25 14:00:21 +01:00
Antonin Raffin
5bbb14188d Unify A2C and TD3 SDE implementation 2019-11-25 13:19:33 +01:00
Antonin Raffin
c56865e10d Cleanup CEM, rename variables + add comments 2019-11-22 19:02:00 +01:00
Antonin Raffin
c47be0086e Add docstrings 2019-11-22 17:24:47 +01:00
Antonin Raffin
f0f2f10d1e Change default architecture 2019-11-22 17:24:14 +01:00
Antonin Raffin
03d2ab10f8 Fix clipped action when adapting noise with TD3 2019-11-22 15:04:34 +01:00
Antonin Raffin
604a19fbc3 Cleanup + update doc 2019-11-22 13:33:12 +01:00
Antonin Raffin
b84e5e9e27 Move flexible mlp to common 2019-11-22 13:06:41 +01:00
Antonin Raffin
ea3902cd32 Add doc for CEM-RL 2019-11-22 13:03:57 +01:00
Antonin Raffin
81a15414b0 Merge branch 'master' into feat/offpolicy-sde 2019-11-22 11:43:13 +01:00
Antonin Raffin
99ea0b3a54 Cleanup 2019-11-22 11:42:58 +01:00
Antonin Raffin
ad32aa60f3 Add sde scheduler 2019-11-18 16:03:08 +01:00
Antonin Raffin
d8a7556d84 Merge branch 'feat/sde' into feat/offpolicy-sde 2019-11-18 15:14:05 +01:00
Antonin Raffin
ef59a7e431 Update version + add docstring 2019-11-18 15:11:19 +01:00
Antonin Raffin
b9c20d443d Update doc + add test for tanh bijector 2019-11-18 15:04:07 +01:00
Antonin Raffin
5d353d598c Start cleanup + update docstrings 2019-11-18 14:09:31 +01:00
Antonin Raffin
fb64072859 Update sde test 2019-11-15 11:07:49 +01:00
Antonin Raffin
cdb62a93fe Bug fix for off-policy normalization (now working properly) 2019-11-15 11:00:31 +01:00
Antonin Raffin
f719544386 Change default batch size for SAC 2019-11-14 14:35:47 +01:00
Antonin Raffin
5278a6f3f8 Testing off policy normalization 2019-11-14 14:35:00 +01:00
Antonin Raffin
8aac10f3fa Fix device 2019-11-13 14:38:18 +01:00
Antonin Raffin
da325a0ba7 Solve NaN issue and reduce number of parameters 2019-11-13 13:02:37 +01:00
Antonin RAFFIN
d725d01186 Normalize returns 2019-11-12 22:32:21 +01:00
Antonin Raffin
623e3cf4f9 Add SDE lr 2019-11-12 18:38:42 +01:00
Antonin Raffin
a08382faab Add sde update for TD3 2019-11-12 18:37:13 +01:00
Antonin Raffin
f2a61949ae Bump version and change default noise clipping 2019-11-08 14:59:42 +01:00
Antonin Raffin
715865a0fe Add noise clipping 2019-11-08 13:17:38 +01:00
Antonin Raffin
f4546837c3 Add std to logger 2019-11-07 17:41:28 +01:00
Antonin Raffin
db87e0d36a Quick and dirty SDE version for TD3 2019-11-07 17:31:52 +01:00
Antonin Raffin
95c741c707 Fix logger for discrete actions 2019-11-07 17:01:02 +01:00
Antonin Raffin
c6f90b9c3c Improve VecNormalize syncing for evaluation 2019-11-07 11:17:26 +01:00
Antonin Raffin
6c7c8375a4 Update log interval 2019-11-07 11:16:59 +01:00
Antonin Raffin
9acff0f5b3 Remove plotting script 2019-10-31 17:01:27 +01:00
Antonin Raffin
0e092f7c52 Add plotting script 2019-10-31 16:59:35 +01:00
Antonin Raffin
9644ae89cf Log ppo std 2019-10-31 16:17:08 +01:00
Antonin Raffin
72a6f18e43 Add sde test + fix random seed 2019-10-31 14:14:30 +01:00