When Do Program-of-Thoughts Work for Reasoning?

Bi, Zhen; Zhang, Ningyu; Jiang, Yinuo; Deng, Shumin; Zheng, Guozhou; Chen, Huajun

arXiv:2308.15452 (cs)

[Submitted on 29 Aug 2023 (v1), last revised 18 Dec 2023 (this version, v6)]

Abstract: In the realm of embodied artificial intelligence, the reasoning capabilities of Large Language Models (LLMs) play a pivotal role. Although effective methods exist, such as program-of-thought prompting, which uses programming languages to tackle complex reasoning tasks, the specific impact of code data on the improvement of reasoning capabilities remains under-explored. To address this gap, we propose the complexity-impacted reasoning score (CIRS), which combines structural and logical attributes, to measure the correlation between code and reasoning abilities. Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity by considering the difficulty and the cyclomatic complexity. Through an empirical analysis, we find that not all code data, across levels of complexity, can be learned or understood by LLMs. An optimal level of complexity is critical to the improvement of reasoning abilities by program-aided prompting. We then design an auto-synthesizing and stratifying algorithm, and apply it to instruction generation for mathematical reasoning and to code data filtering for code generation tasks. Extensive results demonstrate the effectiveness of our proposed approach. Code will be integrated into the EasyInstruct framework at this https URL.
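To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch of structural features read from an abstract syntax tree and a McCabe-style cyclomatic complexity count. It is an illustration only, not the authors' CIRS implementation: the abstract does not give the formula, so the function names, the choice of features, and the way they are multiplied in `toy_complexity_score` are all assumptions, and the "difficulty" term is omitted because the abstract does not define it.

```python
import ast
import math

# Node types treated as decision points (an assumption for this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                 ast.ExceptHandler, ast.comprehension)


def ast_depth(node):
    """Depth of the AST rooted at `node`; a node without children has depth 1."""
    children = list(ast.iter_child_nodes(node))
    return 1 + max((ast_depth(child) for child in children), default=0)


def structural_features(source):
    """Simple structural attributes of a program, read from its AST."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    return {
        "node_count": len(nodes),
        "node_types": len({type(n).__name__ for n in nodes}),
        "depth": ast_depth(tree),
    }


def cyclomatic_complexity(source):
    """McCabe-style estimate: number of decision points plus one."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths (one per extra operand).
            decisions += len(node.values) - 1
    return decisions + 1


def toy_complexity_score(source):
    """Hypothetical combination of both aspects; NOT the paper's formula."""
    feats = structural_features(source)
    structural = sum(math.log1p(value) for value in feats.values())
    return structural * cyclomatic_complexity(source)


simple = "x = 1 + 2\n"
branchy = (
    "def f(n):\n"
    "    total = 0\n"
    "    for i in range(n):\n"
    "        if i % 2 == 0 and i > 2:\n"
    "            total += i\n"
    "    return total\n"
)

print(cyclomatic_complexity(simple))   # 1
print(cyclomatic_complexity(branchy))  # 4 (for, if, one `and`, plus one)
print(toy_complexity_score(simple) < toy_complexity_score(branchy))  # True
```

A score of this kind could be used to sort code samples into complexity bands before training or filtering, which is the role the abstract assigns to the stratifying step; the actual thresholds and algorithm are described in the paper, not here.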

Comments:	AAAI 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Software Engineering (cs.SE)
Cite as:	arXiv:2308.15452 [cs.CL]
	(or arXiv:2308.15452v6 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2308.15452

Submission history

From: Ningyu Zhang
[v1] Tue, 29 Aug 2023 17:22:39 UTC (5,939 KB)
[v2] Fri, 8 Sep 2023 02:31:35 UTC (5,939 KB)
[v3] Tue, 17 Oct 2023 12:17:41 UTC (5,939 KB)
[v4] Wed, 15 Nov 2023 14:06:30 UTC (5,938 KB)
[v5] Mon, 4 Dec 2023 07:01:15 UTC (5,938 KB)
[v6] Mon, 18 Dec 2023 16:15:33 UTC (5,939 KB)
