Commit graph

74 commits

Author SHA1 Message Date
Antonin Raffin
44fce7c02a Fix typing errors and typos 2020-01-22 17:17:12 +01:00
Antonin Raffin
c542009641 Clean up code + bug fixes 2020-01-20 11:17:55 +01:00
Antonin Raffin
03e853997a Add squash_output and expln as policy param for ppo and a2c 2020-01-15 13:21:20 +01:00
Antonin Raffin
8831eff163 Unify evaluation 2020-01-07 14:00:03 +01:00
Antonin Raffin
161c608f9c Re-sample noise matrix for PPO 2019-12-20 11:28:20 +01:00
Antonin Raffin
1d6f9bf100 Add sample freq for SDE 2019-12-17 11:47:21 +01:00
Antonin Raffin
233f346d53 Update todos 2019-12-06 17:46:56 +01:00
Antonin Raffin
0117cc37f4 Merge branch 'master' into feat/sde-features 2019-12-05 16:33:41 +01:00
Dormann, Noah
03bf513e5e comment refactoring 2019-12-05 08:08:20 +01:00
    Co-Authored-By: Raffin, Antonin <Antonin.Raffin@dlr.de>
Antonin Raffin
1ac1a7cad5 Reformat 2019-12-02 14:27:38 +01:00
Antonin Raffin
3cdd5f20af Bug fix + add test for sde net arch 2019-12-02 14:14:48 +01:00
Antonin Raffin
4e39a0627c Refactor: enable sde net arch for TD3 and SAC 2019-12-02 14:06:17 +01:00
Antonin Raffin
a2a8bbdf11 Sample n matrices for A2C/PPO when using SDE 2019-12-02 11:48:34 +01:00
Noah Dormann
6928879f5a Refactored doc-strings 2019-11-28 16:30:13 +01:00
Noah Dormann
e95858784a Formatted all files 2019-11-28 15:38:04 +01:00
Noah Dormann
9ff59eaf3d Added attribute self.policy_class to prevent errors when using self.policy as class 2019-11-28 15:25:01 +01:00
Noah Dormann
c75582dfbe resolving conflicts 2019-11-28 12:12:06 +01:00
    # Conflicts:
    #	torchy_baselines/a2c/a2c.py
    #	torchy_baselines/ppo/ppo.py

    Added optimizer params test
Noah Dormann
812cab84ac Changed PPO deterministic 2019-11-28 11:20:40 +01:00
Antonin Raffin
0885dbe74b Bug fix in choosing the distribution 2019-11-25 15:02:10 +01:00
Antonin Raffin
5d6649d92b Enable separate feature extraction for SDE 2019-11-25 14:54:13 +01:00
Antonin Raffin
d0003ee4ec Enable kwargs for proba dist 2019-11-25 14:00:21 +01:00
Antonin Raffin
5bbb14188d Unify A2C and TD3 SDE implementation 2019-11-25 13:19:33 +01:00
Antonin Raffin
c47be0086e Add docstrings 2019-11-22 17:24:47 +01:00
Antonin Raffin
f0f2f10d1e Change default architecture 2019-11-22 17:24:14 +01:00
Antonin Raffin
b84e5e9e27 Move flexible mlp to common 2019-11-22 13:06:41 +01:00
Antonin Raffin
ea3902cd32 Add doc for CEM-RL 2019-11-22 13:03:57 +01:00
Antonin Raffin
99ea0b3a54 Cleanup 2019-11-22 11:42:58 +01:00
Noah Dormann
924ba9aea6 cleaned comments on model specific get and load functions 2019-11-21 16:50:59 +01:00
Noah Dormann
2d72f6d1b5 Added SAC, TD3, A2C 2019-11-21 16:46:53 +01:00
    Missing CEMRL
Noah Dormann
775a50cc5c saving all variables now added a2c support 2019-11-21 16:24:18 +01:00
Noah Dormann
fb5f192fc4 Implemented Changes suggested from Antonin-Raffin 2019-11-21 14:39:44 +01:00
    Added Optimizer saving
Noah Dormann
a7655ca6e1 Reformated every file with PEP 8 errors 2019-11-21 13:01:03 +01:00
Noah Dormann
4b6234a1c8 finished test_save_load.py test 2019-11-21 11:39:47 +01:00
Antonin Raffin
5278a6f3f8 Testing off policy normalization 2019-11-14 14:35:00 +01:00
Noah Dormann
cc744a48b5 first save and load features 2019-11-12 17:03:57 +01:00
Antonin Raffin
95c741c707 Fix logger for discrete actions 2019-11-07 17:01:02 +01:00
Antonin Raffin
c6f90b9c3c Improve VecNormalize syncing for evaluation 2019-11-07 11:17:26 +01:00
Antonin Raffin
9644ae89cf Log ppo std 2019-10-31 16:17:08 +01:00
Antonin Raffin
925afe784c SDE on latent_pi 2019-10-31 11:44:27 +01:00
Antonin Raffin
0174ec269e Clean up 2019-10-29 18:43:16 +01:00
Antonin Raffin
c0cb9fc9c5 Fix predict method 2019-10-29 18:30:36 +01:00
Antonin Raffin
0d41bc1356 Add more logging 2019-10-29 15:15:11 +01:00
Antonin Raffin
c15b4bda1e Add first draft of SDE 2019-10-28 18:24:13 +01:00
Antonin Raffin
df1e7aa000 Add docstring 2019-10-28 17:42:39 +01:00
Antonin Raffin
d67822718c Add learning rate schedule 2019-10-28 16:47:13 +01:00
Antonin Raffin
799e30ff3d Bug fixes for A2C and PPO 2019-10-28 14:27:32 +01:00
Antonin Raffin
584f549fa1 Bug fix for discrete actions 2019-10-25 12:00:37 +02:00
Antonin RAFFIN
3bc746c6ee Add logger for PPO 2019-10-17 13:44:48 +02:00
Antonin RAFFIN
53898f3d1a Add flexible mlp 2019-10-17 13:32:25 +02:00
Antonin Raffin
b5656531d1 Enable logger for SAC/TD3 + refactor 2019-10-10 13:47:13 +02:00