Update examples.rst (#1969)

Quentin Gallouédec 2024-07-15 23:57:24 +02:00 committed by GitHub
parent d8148deeaa
commit 1a69fc8314

@@ -179,9 +179,9 @@ Multiprocessing with off-policy algorithms
 vec_env = make_vec_env("Pendulum-v0", n_envs=4, seed=0)
-# We collect 4 transitions per call to `ènv.step()`
-# and performs 2 gradient steps per call to `ènv.step()`
-# if gradient_steps=-1, then we would do 4 gradients steps per call to `ènv.step()`
+# We collect 4 transitions per call to `env.step()`
+# and performs 2 gradient steps per call to `env.step()`
+# if gradient_steps=-1, then we would do 4 gradients steps per call to `env.step()`
 model = SAC("MlpPolicy", vec_env, train_freq=1, gradient_steps=2, verbose=1)
 model.learn(total_timesteps=10_000)
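
The arithmetic in the comments above can be checked with a small standalone sketch. The helper `updates_per_rollout` is hypothetical and is not part of Stable-Baselines3; it only restates the rule described in the comments, assuming `train_freq` is counted in calls to `env.step()` and that `gradient_steps=-1` means one gradient step per collected transition.

```python
def updates_per_rollout(n_envs: int, train_freq: int, gradient_steps: int) -> tuple[int, int]:
    """Return (transitions collected, gradient steps performed) per rollout.

    Hypothetical helper for illustration only, not a Stable-Baselines3 function.
    Assumes train_freq is expressed in calls to env.step().
    """
    # Each call to env.step() on a vectorized env yields one transition per sub-environment.
    transitions = n_envs * train_freq
    # gradient_steps=-1 is read here as "as many gradient steps as transitions collected".
    updates = transitions if gradient_steps == -1 else gradient_steps
    return transitions, updates


# Settings from the example: 4 envs, train_freq=1, gradient_steps=2
print(updates_per_rollout(n_envs=4, train_freq=1, gradient_steps=2))   # (4, 2)
# Same settings with gradient_steps=-1
print(updates_per_rollout(n_envs=4, train_freq=1, gradient_steps=-1))  # (4, 4)
```

With 4 environments and `gradient_steps=2`, the model does fewer updates than it collects transitions; `gradient_steps=-1` would keep the two in a 1:1 ratio.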