diff --git a/docs/guide/migration.rst b/docs/guide/migration.rst
index c9c15cd..d3bb9d1 100644
--- a/docs/guide/migration.rst
+++ b/docs/guide/migration.rst
@@ -113,7 +113,7 @@ A2C
 	PyTorch implementation of RMSprop `differs from Tensorflow's <https://github.com/pytorch/pytorch/issues/23796>`_,
 	which leads to `different and potentially more unstable results <https://github.com/DLR-RM/stable-baselines3/pull/110#issuecomment-663255241>`_.
 	Use ``stable_baselines3.common.sb2_compat.rmsprop_tf_like.RMSpropTFLike`` optimizer to match the results
-	with TensorFlow's implementation. This can be done through ``policy_kwargs``: ``A2C(policy_kwargs=dict(optimizer_class=RMSpropTFLike, eps=1e-5))``
+	with TensorFlow's implementation. This can be done through ``policy_kwargs``: ``A2C(policy_kwargs=dict(optimizer_class=RMSpropTFLike, optimizer_kwargs=dict(eps=1e-5)))``
 
 
 PPO
diff --git a/docs/misc/changelog.rst b/docs/misc/changelog.rst
index 85598f7..09577fc 100644
--- a/docs/misc/changelog.rst
+++ b/docs/misc/changelog.rst
@@ -54,6 +54,7 @@ Documentation:
 - Updated ``BaseAlgorithm.load`` docstring (@Demetrio92)
 - Added a note on ``load`` behavior in the examples (@Demetrio92)
 - Updated SB3 Contrib doc
+- Fixed A2C and migration guide instructions on how to set epsilon with ``RMSpropTFLike`` (@thomasgubler)
 
 Release 1.3.0 (2021-10-23)
 ---------------------------
@@ -858,4 +859,4 @@ And all the contributors:
 @ShangqunYu @PierreExeter @JacopoPan @ltbd78 @tom-doerr @Atlis @liusida @09tangriro @amy12xx @juancroldan
 @benblack769 @bstee615 @c-rizz @skandermoalla @MihaiAnca13 @davidblom603 @ayeright @cyprienc
 @wkirgsn @AechPro @CUN-bjy @batu @IljaAvadiev @timokau @kachayev @cleversonahum
-@eleurent @ac-93 @cove9988 @theDebugger811 @hsuehch @Demetrio92
+@eleurent @ac-93 @cove9988 @theDebugger811 @hsuehch @Demetrio92 @thomasgubler
diff --git a/docs/modules/a2c.rst b/docs/modules/a2c.rst
index 3f672ee..e871424 100644
--- a/docs/modules/a2c.rst
+++ b/docs/modules/a2c.rst
@@ -14,7 +14,7 @@ It uses multiple workers to avoid the use of a replay buffer.
 
   If you find training unstable or want to match performance of stable-baselines A2C, consider using
   ``RMSpropTFLike`` optimizer from ``stable_baselines3.common.sb2_compat.rmsprop_tf_like``.
-  You can change optimizer with ``A2C(policy_kwargs=dict(optimizer_class=RMSpropTFLike, eps=1e-5))``.
+  You can change the optimizer with ``A2C(policy_kwargs=dict(optimizer_class=RMSpropTFLike, optimizer_kwargs=dict(eps=1e-5)))``.
   Read more `here <https://github.com/DLR-RM/stable-baselines3/pull/110#issuecomment-663255241>`_.