timeseries/Readme.md
2025-02-10 17:34:27 +01:00

RL_countdown_r1zero

Reinforcement learning of a countdown function. Target: R1-Zero base model.

Hint: in the file ./TinyZero/scripts/train_tiny_zero.sh, adjust the lines data.train_batch_size=256 and data.val_batch_size=1312 \ directly in the script. Passing these overrides on the command line does not work.
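
The in-file adjustment described in the hint can be scripted. The following is a minimal sketch, assuming the script contains the two settings literally as `data.train_batch_size=<number>` and `data.val_batch_size=<number>`. The function name `set_batch_sizes` and the example values are hypothetical and not part of TinyZero.

```shell
# Hypothetical helper: rewrite the two batch-size settings inside the
# training script itself, because command-line overrides are reported
# not to take effect.
set_batch_sizes() {
  local script="$1" train_bs="$2" val_bs="$3"
  # Keep a .bak copy, then replace only the number after each key.
  sed -i.bak -E \
    -e "s/(data\.train_batch_size=)[0-9]+/\1${train_bs}/" \
    -e "s/(data\.val_batch_size=)[0-9]+/\1${val_bs}/" \
    "$script"
}

# Example call (placeholder values, not recommendations):
# set_batch_sizes ./TinyZero/scripts/train_tiny_zero.sh 128 640
```

The original file is kept next to the script with a `.bak` suffix, so the change can be reverted by copying it back.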