Evaluating and Characterizing Human Rationales

Carton, Samuel; Rathore, Anirudh; Tan, Chenhao

arXiv:2010.04736 (cs)

[Submitted on 9 Oct 2020]

Abstract: Two main approaches for evaluating the quality of machine-generated rationales are: 1) using human rationales as a gold standard; and 2) automated metrics based on how rationales affect model behavior. An open question, however, is how human rationales fare with these automatic metrics. Analyzing a variety of datasets and models, we find that human rationales do not necessarily perform well on these metrics. To unpack this finding, we propose improved metrics to account for model-dependent baseline performance. We then propose two methods to further characterize rationale quality, one based on model retraining and one on using "fidelity curves" to reveal properties such as irrelevance and redundancy. Our work leads to actionable suggestions for evaluating and characterizing rationales.
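
The abstract gives no formulas, so the following is only a minimal sketch of what "automated metrics based on how rationales affect model behavior" and a correction for "model-dependent baseline performance" could look like. It is not the authors' implementation. It assumes two commonly used fidelity scores (sufficiency: how well the rationale alone preserves the prediction; comprehensiveness: how much the prediction drops when the rationale is removed) and assumes the baseline is the model's behavior on a fully masked input. The names `toy_predict`, `normalized_scores`, and the `[MASK]` token are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

MASK = "[MASK]"  # hypothetical placeholder for a hidden token

Predict = Callable[[Sequence[str]], float]  # returns P(label | tokens)


def keep_only(tokens: Sequence[str], rationale: Sequence[bool]) -> List[str]:
    """Hide every token that is NOT part of the rationale."""
    return [t if r else MASK for t, r in zip(tokens, rationale)]


def remove(tokens: Sequence[str], rationale: Sequence[bool]) -> List[str]:
    """Hide every token that IS part of the rationale."""
    return [MASK if r else t for t, r in zip(tokens, rationale)]


def sufficiency(predict: Predict, tokens, rationale) -> float:
    """1 minus the confidence lost when only the rationale is visible."""
    return 1.0 - max(0.0, predict(tokens) - predict(keep_only(tokens, rationale)))


def comprehensiveness(predict: Predict, tokens, rationale) -> float:
    """Confidence lost when the rationale is hidden."""
    return max(0.0, predict(tokens) - predict(remove(tokens, rationale)))


def normalized_scores(
    predict: Predict, tokens, rationale
) -> Optional[Tuple[float, float]]:
    """Rescale both scores against the empty-rationale baseline (an assumption).

    Returns None when the model is equally confident on a fully masked
    input, since there is then no range left to normalize over.
    """
    empty = [False] * len(tokens)
    null_suff = sufficiency(predict, tokens, empty)
    span = 1.0 - null_suff
    if span <= 0.0:
        return None
    suff = sufficiency(predict, tokens, rationale)
    comp = comprehensiveness(predict, tokens, rationale)
    return ((suff - null_suff) / span, comp / span)


def toy_predict(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in model: confidence grows with each 'good'."""
    hits = sum(t == "good" for t in tokens)
    return min(1.0, 0.5 + 0.25 * hits)


tokens = ["a", "good", "and", "good", "film"]
rationale = [False, True, False, True, False]
print(normalized_scores(toy_predict, tokens, rationale))
```

In this toy case the rationale covers exactly the tokens the stand-in model relies on, so both normalized scores reach their maximum. A rationale that included irrelevant tokens or omitted one of the two redundant cues would score lower on one of the two measures.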

Comments:	14 pages, 15 figures, to appear in EMNLP 2020. Code is available at this https URL
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Cite as:	arXiv:2010.04736 [cs.CL]
	(or arXiv:2010.04736v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.04736

Submission history

From: Chenhao Tan [view email]
[v1] Fri, 9 Oct 2020 18:00:04 UTC (4,070 KB)
