* Use a consistent key to log the total timesteps
This changes the timestep logging key of on-policy algorithms from
`time/total_timesteps` to `time/total timesteps` (the underscore
becomes a space). The off-policy algorithms and the eval callback
already use the latter, so the key is now the same everywhere.
* Use underscores instead of spaces in logging keys
Most keys already followed this convention, and consistent naming is
friendlier to new users.
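A minimal sketch of the naming convention, for illustration only. The
helper `normalize_key` is hypothetical and not part of the library; it
only shows how a key containing a space maps to its underscore form,
using the two keys mentioned above.

```python
def normalize_key(key: str) -> str:
    """Hypothetical helper: replace spaces with underscores in a logging key."""
    return key.replace(" ", "_")


# The two spellings discussed in this change map to the same key.
for key in ("time/total timesteps", "time/total_timesteps"):
    print(f"{key!r} -> {normalize_key(key)!r}")
```

Keys that already use underscores pass through unchanged, so applying
the convention twice has no further effect.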
* Minor edit and bump version
Co-authored-by: Antonin Raffin <antonin.raffin@ensta.org>