Adapting Pretrained Language Models for Solving Tabular Prediction Problems in the Electronic Health Record

McMaster, Christopher; Liew, David FL; Pires, Douglas EV

arXiv:2303.14920 (cs)

[Submitted on 27 Mar 2023]

Abstract: We propose an approach for adapting the DeBERTa model to electronic health record (EHR) tasks using domain adaptation. We pretrain a small DeBERTa model on a dataset consisting of MIMIC-III discharge summaries, clinical notes, radiology reports, and PubMed abstracts. We compare this model's performance with that of a DeBERTa model pretrained on clinical texts from our institutional EHR (MeDeBERTa) and with an XGBoost model. We evaluate performance on three benchmark tasks for emergency department outcomes using the MIMIC-IV-ED dataset. We preprocess the data to convert it into text format and generate four versions of the original datasets to compare data processing and data inclusion. The results show that our proposed approach outperforms the alternative models on two of three tasks (p<0.001) and matches their performance on the third task; using descriptive column names improves performance over the original column names.
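
The abstract mentions converting tabular data into text and comparing original column names with descriptive ones. The sketch below illustrates one way such a conversion could look. It is not the authors' preprocessing code: the "name: value" template, the function name row_to_text, the column names, and the DESCRIPTIVE_NAMES mapping are all hypothetical and chosen only for illustration.

```python
from typing import Mapping, Optional

# Hypothetical mapping from terse column names to descriptive ones.
# These names are illustrative assumptions, not taken from the paper.
DESCRIPTIVE_NAMES = {
    "temp": "temperature",
    "hr": "heart rate",
    "sbp": "systolic blood pressure",
}


def row_to_text(
    row: Mapping[str, object],
    descriptive: Optional[Mapping[str, str]] = None,
) -> str:
    """Serialize one tabular row as a 'name: value' string.

    Missing values (None) are skipped. If a descriptive mapping is given,
    column names found in it are replaced by their descriptive form.
    """
    parts = []
    for column, value in row.items():
        if value is None:
            continue  # leave missing values out of the text
        name = descriptive.get(column, column) if descriptive else column
        parts.append(f"{name}: {value}")
    return "; ".join(parts)


row = {"temp": 37.2, "hr": 88, "sbp": None}
print(row_to_text(row))                     # original column names
print(row_to_text(row, DESCRIPTIVE_NAMES))  # descriptive column names
```

Under these assumptions, the same row yields two text variants that differ only in how the columns are named, which is the kind of contrast the abstract describes when it compares descriptive and original column names.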

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2303.14920 [cs.CL]
	(or arXiv:2303.14920v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2303.14920

Submission history

From: Christopher McMaster
[v1] Mon, 27 Mar 2023 05:34:19 UTC (149 KB)
