stable-baselines3

mirror of https://github.com/saymrwulf/stable-baselines3.git synced 2026-06-29 03:31:08 +00:00

Tobias Rohrer ba77dd7c61 Fix to use float64 actions for off policy algorithms (#1572)	2023-07-24 16:38:03 +02:00
changelog.rst	Fix to use float64 actions for off policy algorithms (#1572)	2023-07-24 16:38:03 +02:00
projects.rst	Docs: Add mobile-env to community projects (#1617)	2023-07-21 16:33:01 +02:00

Fix to use float64 actions for off policy algorithms (#1572)

* Added test cases where off policy algorithms fail with float64 action space

* casting observations and actions to `np.float32` to unify behaviour between `ReplayBuffer` and `RolloutBuffer`. Fixing issue #1145

* reformatted using black

* making test more restrictive by checking that the model's action is float64

* added changelog entry

* undo cast of observations as `preprocessing.preprocess_obs()` casts them to float32 anyway.

* - Casting to float32 only if `action.dtype` is float64
  - Added cast to `DictReplayBuffer` as well

* Added tests for multiple variations of continuous action types and observation spaces

* applied reformatting by `make commit-checks`

* Added typing and comment referring to description in merge request

* Apply linter for single element slice

* Rename helper and refactor tests

* Update changelog and docstring

---------

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
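
The conditional cast described in the commit message ("Casting to float32 only, if action.dtype is float64") could be sketched roughly as follows. This is a minimal illustration using NumPy only, not the code from the repository: the helper name `cast_action_if_float64` and the sample arrays are hypothetical.

```python
import numpy as np


def cast_action_if_float64(action: np.ndarray) -> np.ndarray:
    """Hypothetical helper: downcast float64 actions to float32.

    Actions of any other dtype (e.g. integer actions for discrete
    action spaces) are returned unchanged.
    """
    if action.dtype == np.float64:
        return action.astype(np.float32)
    return action


# Sample actions (made up for illustration).
continuous = np.array([[0.1, -0.3]], dtype=np.float64)
discrete = np.array([[2]], dtype=np.int64)

stored_continuous = cast_action_if_float64(continuous)
stored_discrete = cast_action_if_float64(discrete)

print(stored_continuous.dtype, stored_discrete.dtype)
```

Restricting the cast to float64 inputs leaves integer actions untouched, which matches the intent stated in the commit message of only unifying the handling of continuous actions.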