.. _quickstart:

===============
Getting Started
===============

Most of the library tries to follow a sklearn-like syntax for the Reinforcement Learning algorithms.

Here is a quick example of how to train and run A2C on a CartPole environment:

.. code-block:: python

    import gym

    from stable_baselines3 import A2C
    from stable_baselines3.a2c import MlpPolicy

    # Create the environment
    env = gym.make('CartPole-v1')

    # Create the agent and train it
    model = A2C(MlpPolicy, env, verbose=1)
    model.learn(total_timesteps=10000)

    # Run the trained agent, resetting the environment at the end of each episode
    obs = env.reset()
    for i in range(1000):
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, done, info = env.step(action)
        env.render()
        if done:
            obs = env.reset()

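The run loop in the example resets the environment each time an episode ends. The sketch below shows the same control flow and also tallies the return of each finished episode. It is only an illustration: ``ToyEnv`` and ``ToyModel`` are hypothetical stand-ins (not part of Gym or stable-baselines3) that mimic the ``reset``/``step``/``predict`` call shapes used above, so the snippet runs without either library.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment with a 4-tuple step() result."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        # (observation, reward, done, info), as in the example above
        return float(self.t), 1.0, done, {}


class ToyModel:
    """Hypothetical stand-in whose predict() returns (action, state)."""

    def predict(self, obs, deterministic=True):
        return 0, None


def run(model, env, n_steps):
    """Step the environment n_steps times and collect finished-episode returns."""
    episode_returns = []
    current_return = 0.0
    obs = env.reset()
    for _ in range(n_steps):
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, done, info = env.step(action)
        current_return += reward
        if done:
            # Episode finished: record its return and start a new one
            episode_returns.append(current_return)
            current_return = 0.0
            obs = env.reset()
    return episode_returns


returns = run(ToyModel(), ToyEnv(episode_length=5), n_steps=12)
print(returns)
```

With an episode length of 5 and 12 steps, two episodes finish and the third is still in progress when the loop ends, so only two returns are recorded.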
Alternatively, you can train a model in a single line, provided that
`the environment is registered in Gym <https://github.com/openai/gym/wiki/Environments>`_ and
the policy is registered:

.. code-block:: python

    from stable_baselines3 import A2C

    model = A2C('MlpPolicy', 'CartPole-v1').learn(10000)
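The one-liner relies on two ideas: string names are looked up in a registry, and ``learn`` returns the model so that calls can be chained. The following is a minimal sketch of that pattern, not the library's actual implementation. ``ToyAlgo``, ``ToyPolicy``, ``POLICIES``, ``ENVS`` and the ``Toy-v0`` name are all hypothetical and exist only for illustration.

```python
class ToyPolicy:
    """Hypothetical policy class standing in for a real network."""


# Hypothetical registries mapping string names to constructors
POLICIES = {"MlpPolicy": ToyPolicy}
ENVS = {"Toy-v0": lambda: object()}


class ToyAlgo:
    """Hypothetical algorithm class illustrating the sklearn-like call pattern."""

    def __init__(self, policy, env):
        # Accept either a registered name or the object itself
        policy_cls = POLICIES[policy] if isinstance(policy, str) else policy
        self.policy = policy_cls()
        self.env = ENVS[env]() if isinstance(env, str) else env
        self.num_timesteps = 0

    def learn(self, total_timesteps):
        # Placeholder for real training: only count the requested steps
        self.num_timesteps += total_timesteps
        # Returning self is what makes ToyAlgo(...).learn(...) chainable
        return self


model = ToyAlgo("MlpPolicy", "Toy-v0").learn(10000)
print(type(model).__name__, model.num_timesteps)
```

In this sketch an unregistered name raises a ``KeyError``, which mirrors why the one-liner only works when both the environment and the policy are registered.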