Alex Havrilla is qualified to endorse.
Teaching Large Language Models to Reason with Reinforcement Learning
Alex Havrilla: Is registered as an author of this paper. Can endorse for cs.LG.
Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar and Roberta Raileanu are not registered as owners of this paper.