.. _quickstart:

===============
Getting Started
===============

Most of the library tries to follow a sklearn-like syntax for the Reinforcement Learning algorithms.

Here is a quick example of how to train and run A2C on a CartPole environment:

.. code-block:: python

    import gym

    from stable_baselines3 import A2C

    # Create the environment
    env = gym.make("CartPole-v1")

    # Instantiate the agent with a multi-layer perceptron policy
    model = A2C("MlpPolicy", env, verbose=1)
    # Train the agent
    model.learn(total_timesteps=10_000)

    # Run the trained agent in the environment
    obs = env.reset()
    for i in range(1000):
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, done, info = env.step(action)
        env.render()
        if done:
            obs = env.reset()
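
To get a quantitative measure of how well the trained agent performs, you can also
evaluate it over several episodes. Below is a minimal sketch using the
``evaluate_policy`` helper; the number of evaluation episodes is an arbitrary
choice for illustration:

.. code-block:: python

    from stable_baselines3.common.evaluation import evaluate_policy

    # Evaluate the trained agent over 10 episodes (an arbitrary choice)
    mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
    print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")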

.. note::

    You can find explanations about the logger output and names in the :ref:`Logger <logger>` section.
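
If you want to redirect the training output, for instance to also record it as CSV,
a minimal sketch could look like the following (the log folder ``./sb3_log/`` is an
arbitrary example path):

.. code-block:: python

    from stable_baselines3.common.logger import configure

    # "./sb3_log/" is an arbitrary example path
    new_logger = configure("./sb3_log/", ["stdout", "csv"])

    # Attach the custom logger to the model before training
    model.set_logger(new_logger)
    model.learn(total_timesteps=10_000)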

Or just train a model with a one-liner if
`the environment is registered in Gym <https://github.com/openai/gym/wiki/Environments>`_ and if
the policy is registered:

.. code-block:: python

    from stable_baselines3 import A2C

    model = A2C("MlpPolicy", "CartPole-v1").learn(10_000)
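
Once trained, the agent can be saved to disk and loaded back later. Here is a
minimal sketch (the file name ``a2c_cartpole`` is an arbitrary choice):

.. code-block:: python

    from stable_baselines3 import A2C

    model = A2C("MlpPolicy", "CartPole-v1").learn(10_000)

    # "a2c_cartpole" is an arbitrary file name; SB3 saves it as a .zip archive
    model.save("a2c_cartpole")

    # Load the trained agent back from disk
    model = A2C.load("a2c_cartpole")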