Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family

Tan, Yiming; Min, Dehai; Li, Yu; Li, Wenbo; Hu, Nan; Chen, Yongrui; Qi, Guilin

arXiv:2303.07992 (cs)

[Submitted on 14 Mar 2023 (v1), last revised 20 Sep 2023 (this version, v3)]

Abstract: ChatGPT is a powerful large language model (LLM) that covers knowledge resources such as Wikipedia and supports natural language question answering using its own knowledge. Therefore, there is growing interest in exploring whether ChatGPT can replace traditional knowledge-based question answering (KBQA) models. Although some works have analyzed the question answering performance of ChatGPT, there is still a lack of large-scale, comprehensive testing on various types of complex questions to analyze the limitations of the model. In this paper, we present a framework that follows the black-box testing specifications of CheckList proposed by Ribeiro et al. We evaluate ChatGPT and its family of LLMs on eight real-world KB-based complex question answering datasets, which include six English datasets and two multilingual datasets. The total number of test cases is approximately 190,000. In addition to the GPT family of LLMs, we also evaluate the well-known FLAN-T5 to identify commonalities between the GPT family and other LLMs. The dataset and code are available at this https URL.

Comments:	To be published in Proceedings of ISWC 2023, 22nd International Semantic Web Conference
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2303.07992 [cs.CL]
	(or arXiv:2303.07992v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2303.07992

Submission history

From: Dehai Min
[v1] Tue, 14 Mar 2023 15:46:28 UTC (909 KB)
[v2] Fri, 4 Aug 2023 10:25:35 UTC (15,617 KB)
[v3] Wed, 20 Sep 2023 05:25:22 UTC (15,617 KB)
