Alex Havrilla is qualified to endorse.

Teaching Large Language Models to Reason with Reinforcement Learning

Alex Havrilla is registered as an author of this paper and can endorse for cs.LG.

Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, and Roberta Raileanu are not registered as owners of this paper.