stable-baselines3/stable_baselines3/common/vec_env/vec_transpose.py

import numpy as np
from gym import spaces

from stable_baselines3.common.preprocessing import is_image_space
from stable_baselines3.common.vec_env.base_vec_env import VecEnv, VecEnvStepReturn, VecEnvWrapper


class VecTransposeImage(VecEnvWrapper):
    """
    Re-order channels, from HxWxC to CxHxW.
    It is required for PyTorch convolution layers.

    :param venv: The vectorized environment to wrap (its observation space must be an image)
    """

    def __init__(self, venv: VecEnv):
        assert is_image_space(venv.observation_space), "The observation space must be an image"

        observation_space = self.transpose_space(venv.observation_space)
        super().__init__(venv, observation_space=observation_space)

    @staticmethod
    def transpose_space(observation_space: spaces.Box) -> spaces.Box:
        """
        Transpose an observation space (re-order channels).

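        :param observation_space: Image space with shape (height, width, channels)
        :return: Equivalent space with shape (channels, height, width)
        """
        assert is_image_space(observation_space), "The observation space must be an image"
        # Input layout is HxWxC (channels last), output layout is CxHxW (channels first)
        height, width, channels = observation_space.shape
        new_shape = (channels, height, width)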
        return spaces.Box(low=0, high=255, shape=new_shape, dtype=observation_space.dtype)

    @staticmethod
    def transpose_image(image: np.ndarray) -> np.ndarray:
        """
        Transpose an image or batch of images (re-order channels).

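        :param image: Single image of shape (H, W, C) or batch of images of shape (N, H, W, C)
        :return: The same data with channels first: (C, H, W) or (N, C, H, W)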
        """
        if len(image.shape) == 3:
            return np.transpose(image, (2, 0, 1))
        return np.transpose(image, (0, 3, 1, 2))

    def step_wait(self) -> VecEnvStepReturn:
        observations, rewards, dones, infos = self.venv.step_wait()

        # Transpose the terminal observations
        for idx, done in enumerate(dones):
            if not done:
                continue
            if "terminal_observation" in infos[idx]:
                infos[idx]["terminal_observation"] = self.transpose_image(infos[idx]["terminal_observation"])

        return self.transpose_image(observations), rewards, dones, infos

    def reset(self) -> np.ndarray:
        """
        Reset all environments and return the transposed (channels-first) observations.
        """
        return self.transpose_image(self.venv.reset())

    def close(self) -> None:
        self.venv.close()
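
The channel re-ordering done by `transpose_image` can be checked in isolation with NumPy alone. The snippet below is a minimal sketch, not part of the library: it copies the transpose logic into a free function so it runs without `gym` or `stable_baselines3`, and the 84x96 image size is an arbitrary assumption chosen so that height and width are easy to tell apart.

```python
import numpy as np


def transpose_image(image: np.ndarray) -> np.ndarray:
    """Standalone copy of the channel re-ordering logic, for illustration only."""
    if image.ndim == 3:
        # Single image: (H, W, C) -> (C, H, W)
        return np.transpose(image, (2, 0, 1))
    # Batch of images: (N, H, W, C) -> (N, C, H, W)
    return np.transpose(image, (0, 3, 1, 2))


# Hypothetical sizes: height=84, width=96, channels=3, batch of 4 environments
single = np.zeros((84, 96, 3), dtype=np.uint8)
batch = np.zeros((4, 84, 96, 3), dtype=np.uint8)

# Mark one pixel so we can confirm it lands at the re-ordered index
single[10, 20, 2] = 255

single_out = transpose_image(single)
batch_out = transpose_image(batch)

single_shape = single_out.shape
batch_shape = batch_out.shape
marked_pixel = int(single_out[2, 10, 20])

print(single_shape)  # (3, 84, 96)
print(batch_shape)  # (4, 3, 84, 96)
```

Note that `np.transpose` returns a view rather than a copy, so the result shares memory with the input and is generally not C-contiguous.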