Antonin Raffin
|
3cdd5f20af
|
Bug fix + add test for sde net arch
|
2019-12-02 14:14:48 +01:00 |
|
Antonin Raffin
|
8e9802784c
|
Revert previous changes in SAC + SDE
|
2019-12-02 14:06:30 +01:00 |
|
Antonin Raffin
|
4e39a0627c
|
Refactor: enable sde net arch for TD3 and SAC
|
2019-12-02 14:06:17 +01:00 |
|
Antonin Raffin
|
a2a8bbdf11
|
Sample n matrices for A2C/PPO when using SDE
|
2019-12-02 11:48:34 +01:00 |
|
Antonin Raffin
|
7a6a500398
|
Allow using states directly as features for SDE
|
2019-12-02 11:48:16 +01:00 |
|
Antonin Raffin
|
21e655ecbf
|
Add test for SAC with different entropy temperature
|
2019-12-02 11:47:52 +01:00 |
|
Antonin RAFFIN
|
03a84f97ea
|
Add Monte Carlo test for SDE distribution
|
2019-12-01 16:46:39 +01:00 |
|
Antonin RAFFIN
|
879191b26a
|
Bug fix in SAC with constant ent coeff + try batch sde matrices
|
2019-12-01 13:11:13 +01:00 |
|
Antonin Raffin
|
fbe29a7298
|
Track down autograd error
"Trying to backward through the graph a second time" -> added a comment
|
2019-11-27 17:29:47 +01:00 |
|
Antonin Raffin
|
fe67a98711
|
Log more values
|
2019-11-26 17:44:06 +01:00 |
|
Antonin Raffin
|
5483e02d1a
|
Add SDE support for SAC
|
2019-11-26 15:26:12 +01:00 |
|
Antonin Raffin
|
d26fcf4566
|
Fix grad computation for sde test
|
2019-11-26 11:57:48 +01:00 |
|
Antonin Raffin
|
0885dbe74b
|
Bug fix in choosing the distribution
|
2019-11-25 15:02:10 +01:00 |
|
Antonin Raffin
|
5d6649d92b
|
Enable separate feature extraction for SDE
|
2019-11-25 14:54:13 +01:00 |
|
Antonin Raffin
|
d0003ee4ec
|
Enable kwargs for proba dist
|
2019-11-25 14:00:21 +01:00 |
|
Antonin Raffin
|
5bbb14188d
|
Unify A2C and TD3 SDE implementation
|
2019-11-25 13:19:33 +01:00 |
|
Antonin Raffin
|
c56865e10d
|
Cleanup CEM, rename variables + add comments
|
2019-11-22 19:02:00 +01:00 |
|
Antonin Raffin
|
c47be0086e
|
Add docstrings
|
2019-11-22 17:24:47 +01:00 |
|
Antonin Raffin
|
f0f2f10d1e
|
Change default architecture
|
2019-11-22 17:24:14 +01:00 |
|
Antonin Raffin
|
03d2ab10f8
|
Fix clipped action when adapting noise with TD3
|
2019-11-22 15:04:34 +01:00 |
|
Antonin Raffin
|
604a19fbc3
|
Cleanup + update doc
|
2019-11-22 13:33:12 +01:00 |
|
Antonin Raffin
|
b84e5e9e27
|
Move flexible mlp to common
|
2019-11-22 13:06:41 +01:00 |
|
Antonin Raffin
|
ea3902cd32
|
Add doc for CEM-RL
|
2019-11-22 13:03:57 +01:00 |
|
Antonin Raffin
|
81a15414b0
|
Merge branch 'master' into feat/offpolicy-sde
|
2019-11-22 11:43:13 +01:00 |
|
Antonin Raffin
|
99ea0b3a54
|
Cleanup
|
2019-11-22 11:42:58 +01:00 |
|
Antonin Raffin
|
ad32aa60f3
|
Add sde scheduler
|
2019-11-18 16:03:08 +01:00 |
|
Antonin Raffin
|
d8a7556d84
|
Merge branch 'feat/sde' into feat/offpolicy-sde
|
2019-11-18 15:14:05 +01:00 |
|
Antonin Raffin
|
ef59a7e431
|
Update version + add docstring
|
2019-11-18 15:11:19 +01:00 |
|
Antonin Raffin
|
b9c20d443d
|
Update doc + add test for tanh bijector
|
2019-11-18 15:04:07 +01:00 |
|
Antonin Raffin
|
5d353d598c
|
Start cleanup + update docstrings
|
2019-11-18 14:09:31 +01:00 |
|
Antonin Raffin
|
fb64072859
|
Update sde test
|
2019-11-15 11:07:49 +01:00 |
|
Antonin Raffin
|
cdb62a93fe
|
Bug fix for off-policy normalization
Now working properly
|
2019-11-15 11:00:31 +01:00 |
|
Antonin Raffin
|
f719544386
|
Change default batch size for SAC
|
2019-11-14 14:35:47 +01:00 |
|
Antonin Raffin
|
5278a6f3f8
|
Testing off policy normalization
|
2019-11-14 14:35:00 +01:00 |
|
Antonin Raffin
|
8aac10f3fa
|
Fix device
|
2019-11-13 14:38:18 +01:00 |
|
Antonin Raffin
|
da325a0ba7
|
Solve NaN issue and reduce number of parameters
|
2019-11-13 13:02:37 +01:00 |
|
Antonin RAFFIN
|
d725d01186
|
Normalize returns
|
2019-11-12 22:32:21 +01:00 |
|
Antonin Raffin
|
623e3cf4f9
|
Add SDE lr
|
2019-11-12 18:38:42 +01:00 |
|
Antonin Raffin
|
a08382faab
|
Add sde update for TD3
|
2019-11-12 18:37:13 +01:00 |
|
Antonin Raffin
|
f2a61949ae
|
Bump version and change default noise clipping
|
2019-11-08 14:59:42 +01:00 |
|
Antonin Raffin
|
715865a0fe
|
Add noise clipping
|
2019-11-08 13:17:38 +01:00 |
|
Antonin Raffin
|
f4546837c3
|
Add std to logger
|
2019-11-07 17:41:28 +01:00 |
|
Antonin Raffin
|
db87e0d36a
|
Quick and dirty SDE version for TD3
|
2019-11-07 17:31:52 +01:00 |
|
Antonin Raffin
|
95c741c707
|
Fix logger for discrete actions
|
2019-11-07 17:01:02 +01:00 |
|
Antonin Raffin
|
c6f90b9c3c
|
Improve VecNormalize syncing for evaluation
|
2019-11-07 11:17:26 +01:00 |
|
Antonin Raffin
|
6c7c8375a4
|
Update log interval
|
2019-11-07 11:16:59 +01:00 |
|
Antonin Raffin
|
9acff0f5b3
|
Remove plotting script
|
2019-10-31 17:01:27 +01:00 |
|
Antonin Raffin
|
0e092f7c52
|
Add plotting script
|
2019-10-31 16:59:35 +01:00 |
|
Antonin Raffin
|
9644ae89cf
|
Log ppo std
|
2019-10-31 16:17:08 +01:00 |
|
Antonin Raffin
|
72a6f18e43
|
Add sde test + fix random seed
|
2019-10-31 14:14:30 +01:00 |
|