Antonin Raffin | 5278a6f3f8 | Testing off policy normalization | 2019-11-14 14:35:00 +01:00
Antonin Raffin | 95c741c707 | Fix logger for discrete actions | 2019-11-07 17:01:02 +01:00
Antonin Raffin | c6f90b9c3c | Improve VecNormalize syncing for evaluation | 2019-11-07 11:17:26 +01:00
Antonin Raffin | 9644ae89cf | Log ppo std | 2019-10-31 16:17:08 +01:00
Antonin Raffin | 925afe784c | SDE on latent_pi | 2019-10-31 11:44:27 +01:00
Antonin Raffin | 0174ec269e | Clean up | 2019-10-29 18:43:16 +01:00
Antonin Raffin | c0cb9fc9c5 | Fix predict method | 2019-10-29 18:30:36 +01:00
Antonin Raffin | 0d41bc1356 | Add more logging | 2019-10-29 15:15:11 +01:00
Antonin Raffin | c15b4bda1e | Add first draft of SDE | 2019-10-28 18:24:13 +01:00
Antonin Raffin | df1e7aa000 | Add docstring | 2019-10-28 17:42:39 +01:00
Antonin Raffin | d67822718c | Add learning rate schedule | 2019-10-28 16:47:13 +01:00
Antonin Raffin | 799e30ff3d | Bug fixes for A2C and PPO | 2019-10-28 14:27:32 +01:00
Antonin Raffin | 584f549fa1 | Bug fix for discrete actions | 2019-10-25 12:00:37 +02:00
Antonin RAFFIN | 3bc746c6ee | Add logger for PPO | 2019-10-17 13:44:48 +02:00
Antonin RAFFIN | 53898f3d1a | Add flexible mlp | 2019-10-17 13:32:25 +02:00
Antonin Raffin | b5656531d1 | Enable logger for SAC/TD3 + refactor | 2019-10-10 13:47:13 +02:00
Antonin Raffin | ef50bb81e8 | Add support for categorical distribution | 2019-10-08 13:06:38 +02:00
Antonin Raffin | 440166fe26 | Add a parameter to disable ortho init | 2019-09-26 16:29:47 +02:00
Antonin Raffin | b4dc9d4e4d | Add doc | 2019-09-26 11:46:40 +02:00
Antonin Raffin | 322399e8fe | Update collect rollout | 2019-09-25 13:20:06 +02:00
Antonin Raffin | 6bfbb7198a | Rename seed | 2019-09-24 16:59:47 +02:00
Antonin Raffin | 32648d9029 | Add docstrings | 2019-09-24 15:30:58 +02:00
Antonin Raffin | f4fe1362f0 | Renaming | 2019-09-24 14:53:03 +02:00
Antonin RAFFIN | 8adb8f9931 | Change default dist to gaussian | 2019-09-22 12:56:27 +02:00
Antonin RAFFIN | ddaafcbc36 | Refactor: add distributions | 2019-09-22 12:52:49 +02:00
Antonin RAFFIN | 70e1d673a9 | Separate policy and value net | 2019-09-21 18:12:06 +02:00
Antonin RAFFIN | 2469ff3859 | Reformat | 2019-09-21 17:17:09 +02:00
Antonin RAFFIN | 3ececcd3a9 | Add tensorboard example | 2019-09-21 17:09:26 +02:00
Antonin RAFFIN | e8ddd1f901 | Improve initialization | 2019-09-21 16:48:51 +02:00
Antonin Raffin | 0e727a5f72 | Full compat for VecEnv + bug fixes for cuda | 2019-09-20 16:43:19 +02:00
Antonin Raffin | 255ff10bff | PPO VecEnv compat | 2019-09-20 15:19:04 +02:00
Antonin RAFFIN | cc4380eccd | Add eval env and clip vf | 2019-09-19 17:18:41 +02:00
Antonin RAFFIN | fe8b415cbf | First sign of life | 2019-09-19 16:21:28 +02:00
Antonin RAFFIN | ad089f5b19 | Add explained variance | 2019-09-19 11:43:27 +02:00
Antonin RAFFIN | 26f0c8d8e5 | Refactor buffer | 2019-09-19 11:43:15 +02:00
Antonin RAFFIN | 149148d4c7 | Bug fix actor forward | 2019-09-18 23:55:41 +02:00
Antonin RAFFIN | 525fe43552 | Bug fix rollout buffer | 2019-09-18 23:48:47 +02:00
Antonin RAFFIN | e1c1d5c4ab | Bug fixes (not working yet) | 2019-09-18 22:12:32 +02:00
Antonin RAFFIN | 6bb7e183d2 | Running PPO (not working yet) | 2019-09-18 15:35:17 +02:00
Antonin Raffin | 54dd7ea60d | Start PPO | 2019-09-18 13:10:27 +02:00