Launching into clinical space with medspaCy: a new clinical text processing toolkit in Python
Authors:
Hannah Eyre,
Alec B Chapman,
Kelly S Peterson,
Jianlin Shi,
Patrick R Alba,
Makoto M Jones,
Tamara L Box,
Scott L DuVall,
Olga V Patterson
Abstract:
Despite the impressive success of machine learning algorithms in clinical natural language processing (cNLP), rule-based approaches still have a prominent role. In this paper, we introduce medspaCy, an extensible, open-source cNLP library based on the spaCy framework that allows flexible integration of rule-based and machine learning-based algorithms adapted to clinical text. MedspaCy includes a variety of components that meet common cNLP needs, such as context analysis and mapping to standard terminologies. By utilizing spaCy's clear and easy-to-use conventions, medspaCy enables development of custom pipelines that integrate easily with other spaCy-based modules. Our toolkit includes several core components and facilitates rapid development of pipelines for clinical text.
Submitted 14 June, 2021; originally announced June 2021.