Dormann, Noah
1f0dd60b97
Fix saving on GPU - Loading on CPU (#45)
* removed policy from save, changed th.loads to map to device
* found hack: catch the pickle exception and try th.load with device mapping instead, otherwise raise an exception with more information (loading a CUDA model on CPU raises the exception, which leads to th.load with mapping being called)
* deleted todo
* updated changelog
* start of saving refactor
* first working c
* all tests pass, save refactored
* - backwards compatibility not always
- make pytest all passing
- make typing all passing
* Fix and simplify the save method
* Remove unused param
* Fix backward compat
* Fix docstring
2020-01-31 13:06:55 +01:00
Raffin, Antonin
cc3b023533
Merge pull request #44 from Antonin-Raffin/typing
Add typing and update requirement to python 3.6
2020-01-27 11:57:48 +01:00
Antonin Raffin
fb57a6b80c
Update docstring
2020-01-23 11:20:12 +01:00
Antonin Raffin
7265d9e352
Fix multiline f-string
2020-01-23 10:56:53 +01:00
Antonin Raffin
ff0eddfb17
Partially type base class
2020-01-22 17:51:27 +01:00
Antonin Raffin
0328a39d1b
Update changelog
2020-01-22 17:25:08 +01:00
Antonin Raffin
9345b85cfc
Update changelog and README
2020-01-22 17:23:42 +01:00
Antonin Raffin
44fce7c02a
Fix typing errors and typos
2020-01-22 17:17:12 +01:00
Antonin Raffin
88f07bafb6
Convert format to f-strings
2020-01-22 16:39:25 +01:00
Antonin Raffin
37f9f13684
Revert all changes for python 2
...
+ Add makefile and pytype
2020-01-22 16:18:27 +01:00
Raffin, Antonin
8152b34aaa
Merge pull request #41 from Antonin-Raffin/docs/build
Build documentation
2020-01-20 16:21:10 +01:00
Antonin Raffin
9e250b6818
Build doc
2020-01-20 16:19:35 +01:00
Antonin Raffin
b8df12afe2
Release v0.1.0
2020-01-20 13:01:14 +01:00
Raffin, Antonin
358b27e9c9
Merge pull request #6 from Antonin-Raffin/feat/sde-features
Feature Extract for SDE
2020-01-20 13:00:18 +01:00
Antonin Raffin
0bed698ec5
Raise error for abstract methods
2020-01-20 12:57:40 +01:00
Antonin Raffin
e5c6601726
Update VecNormalize (pickling) and improve tests
2020-01-20 11:58:16 +01:00
Antonin Raffin
89db65b1fb
Improve logger testing + add readers
2020-01-20 11:58:00 +01:00
Antonin Raffin
c542009641
Clean up code + bug fixes
2020-01-20 11:17:55 +01:00
Antonin Raffin
ea20721632
Add TODO
2020-01-15 15:58:45 +01:00
Antonin Raffin
03e853997a
Add squash_output and expln as policy param for ppo and a2c
2020-01-15 13:21:20 +01:00
Antonin Raffin
60d5f4463d
Add use_expln option for td3
2020-01-08 17:04:28 +01:00
Antonin Raffin
d3a718b94e
Add extra dependency
2020-01-08 11:26:57 +01:00
Antonin Raffin
299ca007b5
Add comment about warmup phase
2020-01-07 17:36:26 +01:00
Antonin Raffin
8831eff163
Unify evaluation
2020-01-07 14:00:03 +01:00
Antonin RAFFIN
aa7b91333e
Add seeding for subproc vecenv
2019-12-30 12:01:37 +01:00
Antonin RAFFIN
4a79f7e5a7
Print std reward for evaluation
2019-12-24 13:12:04 +01:00
Antonin RAFFIN
57c890f3e9
LeakyClip not working yet
2019-12-22 14:38:30 +01:00
Antonin RAFFIN
3a7508ac16
Fix double clip
2019-12-22 13:56:30 +01:00
Antonin Raffin
f6c475a44b
Add use_expln as a policy argument
2019-12-20 18:10:24 +01:00
Antonin Raffin
7f34108ed6
Fix exp_ln computation
2019-12-20 18:02:01 +01:00
Antonin Raffin
9b3b34c9c4
Sample batch_size noise matrices for SAC
2019-12-20 11:28:44 +01:00
Antonin Raffin
161c608f9c
Re-sample noise matrix for PPO
2019-12-20 11:28:20 +01:00
Antonin Raffin
e894f1f11b
Add leakyclip
2019-12-19 18:20:02 +01:00
Antonin Raffin
69428346bd
Bump version
2019-12-19 15:32:03 +01:00
Antonin Raffin
84ebc3d7da
Relax the HardTanh limit
2019-12-19 15:28:51 +01:00
Antonin Raffin
a5c3418765
Update README (roadmap moved to github)
2019-12-19 15:28:36 +01:00
Antonin Raffin
bff0ca0ea8
Use HardTanh to relax the constraint
2019-12-19 11:59:00 +01:00
Antonin Raffin
c05c990285
Remove norm clipping
2019-12-18 16:56:51 +01:00
Antonin Raffin
07345e5e27
Test for differential entropy
2019-12-18 13:45:56 +01:00
Antonin Raffin
e49d97bf98
Fix infs in SAC by bounding the mean
2019-12-18 13:45:33 +01:00
Antonin Raffin
57708a628c
Add value function for SDE + TD3
2019-12-17 15:01:08 +01:00
Antonin Raffin
1d6f9bf100
Add sample freq for SDE
2019-12-17 11:47:21 +01:00
Antonin Raffin
4957f05810
Merge branch 'master' into feat/sde-features
2019-12-17 11:15:22 +01:00
Antonin Raffin
919dfee452
Try to clip grad norm
2019-12-17 11:14:44 +01:00
Raffin, Antonin
8874b9dd6b
Merge pull request #5 from Antonin-Raffin/feat/td3-sde
Off-Policy State Dependent Exploration
2019-12-17 11:09:01 +01:00
Antonin Raffin
d63cef7693
Add gradient clipping for SAC
2019-12-06 18:32:57 +01:00
Antonin Raffin
233f346d53
Update todos
2019-12-06 17:46:56 +01:00
Antonin Raffin
6c423add8d
Bump version
2019-12-05 16:44:27 +01:00
Antonin Raffin
1f2b047ab3
Merge branch 'master' into feat/td3-sde
2019-12-05 16:35:57 +01:00
Antonin Raffin
0117cc37f4
Merge branch 'master' into feat/sde-features
2019-12-05 16:33:41 +01:00