Extend and Explain: Interpreting Very Long Language Models

Stremmel, Joel; Hill, Brian L.; Hertzberg, Jeffrey; Murillo, Jaime; Allotey, Llewelyn; Halperin, Eran

arXiv:2209.01174 (cs)

[Submitted on 2 Sep 2022 (v1), last revised 28 Nov 2022 (this version, v3)]

Abstract: While Transformer language models (LMs) are state-of-the-art for information extraction, long text introduces computational challenges requiring suboptimal preprocessing steps or alternative model architectures. Sparse attention LMs can represent longer sequences, overcoming performance hurdles. However, it remains unclear how to explain predictions from these models, as not all tokens attend to each other in the self-attention layers, and long sequences pose computational challenges for explainability algorithms when runtime depends on document length. These challenges are severe in the medical context where documents can be very long, and machine learning (ML) models must be auditable and trustworthy. We introduce a novel Masked Sampling Procedure (MSP) to identify the text blocks that contribute to a prediction, apply MSP in the context of predicting diagnoses from medical text, and validate our approach with a blind review by two clinicians. Our method identifies about 1.7x more clinically informative text blocks than the previous state-of-the-art, runs up to 100x faster, and is tractable for generating important phrase pairs. MSP is particularly well-suited to long LMs but can be applied to any text classifier. We provide a general implementation of MSP.

Comments:	11 pages
Subjects:	Computation and Language (cs.CL)
MSC classes:	I.2.7
Cite as:	arXiv:2209.01174 [cs.CL]
	(or arXiv:2209.01174v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2209.01174
Journal reference:	Proceedings of the 2nd Machine Learning for Health symposium, PMLR 193:218-258, 2022

Submission history

From: Joel Stremmel
[v1] Fri, 2 Sep 2022 17:15:43 UTC (3,210 KB)
[v2] Wed, 7 Sep 2022 20:20:53 UTC (2,851 KB)
[v3] Mon, 28 Nov 2022 17:29:15 UTC (3,208 KB)
