Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches
Authors:
Philip John Gorinski,
Honghan Wu,
Claire Grover,
Richard Tobin,
Conn Talbot,
Heather Whalley,
Cathie Sudlow,
William Whiteley,
Beatrice Alex
Abstract:
This work investigates multiple approaches to Named Entity Recognition (NER) for text in Electronic Health Record (EHR) data. In particular, we look into the application of (i) rule-based, (ii) deep learning and (iii) transfer learning systems for the task of NER on brain imaging reports, with a focus on records from patients with stroke. We explore the strengths and weaknesses of each approach, develop rules and train models on a common dataset, and evaluate each system's performance on common test sets of Scottish radiology reports from two sources: brain imaging reports from the Edinburgh Stroke Study (ESS), collected by NHS Lothian, and radiology reports created in NHS Tayside. Our comparison shows that a hand-crafted system is the most accurate way to automatically label EHR text, but machine learning approaches can provide a feasible alternative where resources for a manual system are not readily available.
Submitted 5 June, 2019; v1 submitted 10 March, 2019;
originally announced March 2019.