Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision

Wang, Zihan; Li, Yunxuan; Wu, Yuexin; Luo, Liangchen; Hou, Le; Yu, Hongkun; Shang, Jingbo


arXiv:2402.02658 (cs)

[Submitted on 5 Feb 2024]

Title: Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision

Authors: Zihan Wang, Yunxuan Li, Yuexin Wu, Liangchen Luo, Le Hou, Hongkun Yu, Jingbo Shang


Abstract: Process supervision, which uses a trained verifier to evaluate the intermediate steps generated by a reasoner, has demonstrated significant improvements in multi-step problem solving. In this paper, to avoid expensive human annotation of the verifier's training data, we introduce Model-induced Process Supervision (MiPS), a novel method for automating data curation. MiPS annotates an intermediate step by sampling completions of the partial solution through the reasoning model and taking as its accuracy the proportion of completions that are correct. Errors in the reasoner cause MiPS to underestimate the accuracy of intermediate steps; therefore, we suggest and empirically show that verification focusing on the verifier's high predicted scores should be preferred over verification focusing on its low predicted scores, contrary to prior work. Our approach significantly improves the performance of PaLM 2 on math and coding tasks (accuracy +0.67% on GSM8K, +4.16% on MATH, and +0.92% on MBPP compared with a verifier trained with output supervision). Additionally, our study demonstrates that the verifier generalizes strongly across different reasoning models.
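The annotation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mips_step_scores`, the callables `sample_completion` and `is_correct`, the sample count, and the toy reasoner are all hypothetical stand-ins chosen for illustration. The sketch only shows the stated idea: for each solution prefix, sample several completions from the reasoner and record the fraction that reach a correct final answer.

```python
import random
from typing import Callable, List


def mips_step_scores(
    question: str,
    steps: List[str],
    sample_completion: Callable[[str, List[str]], str],
    is_correct: Callable[[str], bool],
    num_samples: int = 8,
) -> List[float]:
    """Score each prefix steps[:i+1] by the fraction of sampled
    completions whose final answer is judged correct."""
    scores = []
    for i in range(len(steps)):
        prefix = steps[: i + 1]
        # Sample num_samples completions of this prefix and count the correct ones.
        correct = sum(
            is_correct(sample_completion(question, prefix))
            for _ in range(num_samples)
        )
        scores.append(correct / num_samples)
    return scores


# Hypothetical toy reasoner (an assumption, not a real model): any prefix
# containing a "BAD" step always ends in a wrong answer, and even a good
# prefix is completed incorrectly about a quarter of the time.
rng = random.Random(0)


def toy_reasoner(question: str, prefix: List[str]) -> str:
    if any("BAD" in step for step in prefix):
        return "wrong"
    return "42" if rng.random() < 0.75 else "wrong"


demo_scores = mips_step_scores(
    "toy question",
    ["good step 1", "good step 2", "BAD step 3"],
    toy_reasoner,
    lambda answer: answer == "42",
    num_samples=16,
)
print(demo_scores)
```

In this toy setup the good steps typically receive scores below 1.0 even though they are correct, because the imperfect reasoner sometimes fails to finish them. This is the underestimation effect the abstract attributes to reasoner errors.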

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2402.02658 [cs.AI]
	(or arXiv:2402.02658v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2402.02658

Submission history

From: Zihan Wang
[v1] Mon, 5 Feb 2024 00:57:51 UTC (777 KB)
