Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary

Deutsch, Daniel; Bedrax-Weiss, Tania; Roth, Dan

arXiv:2010.00490 (cs)

[Submitted on 1 Oct 2020 (v1), last revised 26 Jul 2021 (this version, v3)]

Abstract: A desirable property of a reference-based evaluation metric that measures the content quality of a summary is that it should estimate how much information that summary has in common with a reference. Traditional text-overlap-based metrics such as ROUGE fail to achieve this because they are limited to matching tokens, either lexically or via embeddings. In this work, we propose a metric to evaluate the content quality of a summary using question-answering (QA). QA-based methods directly measure a summary's information overlap with a reference, making them fundamentally different from text-overlap metrics. We demonstrate the experimental benefits of QA-based metrics through an analysis of our proposed metric, QAEval. QAEval outperforms current state-of-the-art metrics on most evaluations using benchmark datasets, while remaining competitive on the others, where limitations of state-of-the-art models hold it back. Through a careful analysis of each component of QAEval, we identify its performance bottlenecks and estimate that its potential upper-bound performance surpasses all other automatic metrics, approaching that of the gold-standard Pyramid Method.

Comments:	This is a pre-MIT Press publication version of the paper
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.00490 [cs.CL]
	(or arXiv:2010.00490v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.00490

Submission history

From: Daniel Deutsch
[v1] Thu, 1 Oct 2020 15:33:09 UTC (1,343 KB)
[v2] Thu, 22 Apr 2021 16:47:46 UTC (767 KB)
[v3] Mon, 26 Jul 2021 18:47:26 UTC (767 KB)
