stable-baselines3/stable_baselines3
Tobias Rohrer ba77dd7c61
Fix to use float64 actions for off policy algorithms (#1572)
* Added test cases where off-policy algorithms fail with a float64 action space

* Cast observations and actions to `np.float32` to unify behaviour between `ReplayBuffer` and `RolloutBuffer`. Fixes issue #1145

* Reformatted using black

* Made the test more restrictive by checking that the model's action is float64

* Added changelog entry

* Undid the cast of observations, as `preprocessing.preprocess_obs()` casts them to float32 anyway.

* Cast to float32 only if `action.dtype` is float64; added the cast to `DictReplayBuffer` as well

* Added tests for multiple variations of continuous action types and observation spaces

* Applied reformatting by `make commit-checks`

* Added typing and comment referring to description in merge request

* Apply linter for single element slice

* Rename helper and refactor tests

* Update changelog and docstring

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
2023-07-24 16:38:03 +02:00
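
The conditional cast described in the commit message above (store actions as float32 only when the action dtype is float64, and leave every other dtype alone) can be sketched as follows. This is a minimal illustration using NumPy only; the names `maybe_cast_dtype` and `TinyActionBuffer` are hypothetical and are not taken from the stable-baselines3 source.

```python
import numpy as np


def maybe_cast_dtype(dtype):
    """Return np.float32 for a float64 dtype; leave every other dtype unchanged.

    Hypothetical helper for illustration, not the library's own function.
    """
    if np.dtype(dtype) == np.float64:
        return np.float32
    return dtype


class TinyActionBuffer:
    """Toy fixed-size action store (an assumption, not the SB3 ReplayBuffer)."""

    def __init__(self, size, action_dim, action_dtype):
        # The storage dtype is decided once, from the declared action dtype.
        self.actions = np.zeros((size, action_dim), dtype=maybe_cast_dtype(action_dtype))
        self.pos = 0

    def add(self, action):
        # Assignment converts the incoming array to the storage dtype.
        self.actions[self.pos] = np.asarray(action)
        self.pos = (self.pos + 1) % len(self.actions)


# A float64 action space is stored as float32 ...
buf = TinyActionBuffer(size=4, action_dim=2, action_dtype=np.float64)
buf.add(np.array([0.25, -0.5], dtype=np.float64))

# ... while a non-float64 dtype (here int64) is kept as declared.
int_buf = TinyActionBuffer(size=4, action_dim=1, action_dtype=np.int64)
```

Casting only the float64 case keeps integer (discrete) actions intact while giving sampled continuous actions the float32 dtype that the rest of the training code typically works in.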
a2c Release v2.0.0 (#1571) 2023-06-23 12:21:58 +02:00
common Fix to use float64 actions for off policy algorithms (#1572) 2023-07-24 16:38:03 +02:00
ddpg Upgrade black formatting (#1310) 2023-02-02 11:58:41 +01:00
dqn Fix to use float64 actions for off policy algorithms (#1572) 2023-07-24 16:38:03 +02:00
her Fixes HER mixed ordering of desired_goal and achieved_goal (#1570) 2023-06-21 16:27:06 +02:00
ppo Release v2.0.0 (#1571) 2023-06-23 12:21:58 +02:00
sac Release v2.0.0 (#1571) 2023-06-23 12:21:58 +02:00
td3 Release v2.0.0 (#1571) 2023-06-23 12:21:58 +02:00
__init__.py Add Gymnasium support (#1327) 2023-04-14 13:13:59 +02:00
py.typed Rename to stable-baselines3 2020-05-05 15:02:35 +02:00
version.txt Fix to use float64 actions for off policy algorithms (#1572) 2023-07-24 16:38:03 +02:00