Antonin Raffin
f1a4fa2d3f
Improve predict method
2020-02-12 15:25:05 +01:00
Antonin Raffin
7bafdb3a67
Add get_vec_normalize_env()
2020-02-12 11:34:29 +01:00
Antonin Raffin
2ce31c1e21
Fix entropy loss for squashed Gaussian and VecEnv seeding
2020-02-11 17:22:03 +01:00
Antonin Raffin
2afcf395b9
Update tests
2020-02-11 16:42:25 +01:00
Antonin Raffin
b7dcc8d58e
Add extend method
2020-02-11 16:40:44 +01:00
Antonin Raffin
75a86881b3
Add save/load for replay buffer
2020-02-05 13:10:02 +01:00
Antonin Raffin
d850a35311
Update tests
2020-02-03 15:57:37 +01:00
Antonin Raffin
ec657cc34e
Fix tests and change log_path behavior for EvalCallback
2020-01-31 13:42:04 +01:00
Antonin Raffin
6d59bfd4a0
Merge branch 'master' into feat/callbacks
2020-01-31 13:09:55 +01:00
Dormann, Noah
1f0dd60b97
Fix saving on GPU - Loading on CPU (#45)
...
* removed policy from save, changed th.loads to map to device
* found hack: catch the pickle exception and try th.load with mapping instead, otherwise raise an exception with more information -> loading cuda on cpu raises an exception -> leads to th.load with map being called
* deleted todo
* updated changelog
* start of saving refactor
* first working c
* all tests pass, save refactored
* - backwards compatibility not always
- make pytest all passing
- make typing all passing
* Fixes and simplify the save method
* Remove unused param
* Fix backward compat
* Fix docstring
2020-01-31 13:06:55 +01:00
Antonin Raffin
b66003cfb3
Add callback support
2020-01-27 14:32:31 +01:00
Antonin Raffin
e5c6601726
Update VecNormalize (pickling) and improve tests
2020-01-20 11:58:16 +01:00
Antonin Raffin
89db65b1fb
Improve logger testing + add readers
2020-01-20 11:58:00 +01:00
Antonin Raffin
c542009641
Clean up code + bug fixes
2020-01-20 11:17:55 +01:00
Antonin Raffin
07345e5e27
Test for differential entropy
2019-12-18 13:45:56 +01:00
Antonin Raffin
0117cc37f4
Merge branch 'master' into feat/sde-features
2019-12-05 16:33:41 +01:00
Noah Dormann
88d4f44d55
added set_env test and set_env wrapping
2019-12-05 13:59:07 +01:00
Noah Dormann
8062ed6036
fixed load to check if the environment is correct
2019-12-05 13:36:19 +01:00
Noah Dormann
c3b0398d56
Changed load so it still works when env not saved
...
improved save function
2019-12-05 08:40:28 +01:00
Dormann, Noah
362bba73ba
adapted common style
...
Co-Authored-By: Raffin, Antonin <Antonin.Raffin@dlr.de>
2019-12-05 08:07:43 +01:00
Antonin Raffin
3cdd5f20af
Bug fix + add test for sde net arch
2019-12-02 14:14:48 +01:00
Antonin Raffin
21e655ecbf
Add test for SAC with different entropy temperature
2019-12-02 11:47:52 +01:00
Antonin RAFFIN
03a84f97ea
Add Monte Carlo test for SDE distribution
2019-12-01 16:46:39 +01:00
Noah Dormann
c82025e673
Add Test for exclude/include feature of save
2019-11-28 16:07:15 +01:00
Noah Dormann
e95858784a
Formatted all files
2019-11-28 15:38:04 +01:00
Noah Dormann
9ff59eaf3d
Added attribute self.policy_class to prevent errors when using self.policy as class
2019-11-28 15:25:01 +01:00
Noah Dormann
e26564e0ec
Added function for setting up any attributes that weren't saved and thus not loaded
2019-11-28 13:35:16 +01:00
Noah Dormann
c75582dfbe
resolving conflicts
...
# Conflicts:
# torchy_baselines/a2c/a2c.py
# torchy_baselines/ppo/ppo.py
Added optimizer params test
2019-11-28 12:12:06 +01:00
Noah Dormann
812cab84ac
Changed PPO deterministic
2019-11-28 11:20:40 +01:00
Antonin Raffin
5483e02d1a
Add SDE support for SAC
2019-11-26 15:26:12 +01:00
Antonin Raffin
d26fcf4566
Fix grad computation for sde test
2019-11-26 11:57:48 +01:00
Antonin Raffin
0885dbe74b
Bug fix in choosing the distribution
2019-11-25 15:02:10 +01:00
Antonin Raffin
5d6649d92b
Enable separate feature extraction for SDE
2019-11-25 14:54:13 +01:00
Noah Dormann
cfb822aa91
Corrected test_run.py
2019-11-21 16:54:30 +01:00
Noah Dormann
2d72f6d1b5
Added SAC, TD3, A2C
...
Missing CEMRL
2019-11-21 16:46:53 +01:00
Noah Dormann
775a50cc5c
saving all variables now, added a2c support
2019-11-21 16:24:18 +01:00
Noah Dormann
526c37bf1f
refactored the assets in test_save_load
...
fixed base_class 'params.pth'
2019-11-21 15:44:57 +01:00
Noah Dormann
17f84053b3
save implementation for a2c needed before uncommenting save and load test in test_run.py::test_onpolicy
2019-11-21 14:44:02 +01:00
Noah Dormann
fb5f192fc4
Implemented changes suggested by Antonin Raffin
...
Added Optimizer saving
2019-11-21 14:39:44 +01:00
Noah Dormann
a7655ca6e1
Reformatted every file with PEP 8 errors
2019-11-21 13:01:03 +01:00
Noah Dormann
b20b70db48
Clean reformat
2019-11-21 11:51:47 +01:00
Noah Dormann
5bca52a87d
rearranged imports
2019-11-21 11:44:37 +01:00
Noah Dormann
4b6234a1c8
finished test_save_load.py test
2019-11-21 11:39:47 +01:00
Antonin Raffin
ad32aa60f3
Add sde scheduler
2019-11-18 16:03:08 +01:00
Antonin Raffin
d8a7556d84
Merge branch 'feat/sde' into feat/offpolicy-sde
2019-11-18 15:14:05 +01:00
Antonin Raffin
b9c20d443d
Update doc + add test for tanh bijector
2019-11-18 15:04:07 +01:00
Antonin Raffin
5d353d598c
Start cleanup + update docstrings
2019-11-18 14:09:31 +01:00
Antonin Raffin
fb64072859
Update sde test
2019-11-15 11:07:49 +01:00
Antonin Raffin
cdb62a93fe
Bug fix for off-policy normalization
...
Now working properly
2019-11-15 11:00:31 +01:00
Antonin Raffin
5278a6f3f8
Testing off policy normalization
2019-11-14 14:35:00 +01:00