Showing 1–10 of 10 results for author: Kraljevic, Z

  1. arXiv:2310.04468  [pdf, other]

    cs.CL cs.AI

    Validating transformers for redaction of text from electronic health records in real-world healthcare

    Authors: Zeljko Kraljevic, Anthony Shek, Joshua Au Yeung, Ewart Jonathan Sheldon, Mohammad Al-Agil, Haris Shuaib, Xi Bai, Kawsar Noor, Anoop D. Shah, Richard Dobson, James Teo

    Abstract: Protecting patient privacy in healthcare records is a top priority, and redaction is a commonly used method for obscuring directly identifiable information in text. Rule-based methods have been widely used, but their precision is often low causing over-redaction of text and frequently not being adaptable enough for non-standardised or unconventional structures of personal health information. Deep…

    Submitted 5 October, 2023; originally announced October 2023.

  2. arXiv:2308.08234  [pdf, other]

    cs.CL cs.AI cs.LG

    Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey

    Authors: Lovre Torbarina, Tin Ferkovic, Lukasz Roguski, Velimir Mihelcic, Bruno Sarlija, Zeljko Kraljevic

    Abstract: The increasing adoption of natural language processing (NLP) models across industries has led to practitioners' need for machine learning systems to handle these models efficiently, from training to serving them in production. However, training, deploying, and updating multiple models can be complex, costly, and time-consuming, mainly when using transformer-based pre-trained language models. Multi…

    Submitted 16 August, 2023; originally announced August 2023.

  3. arXiv:2212.08072  [pdf]

    cs.CL cs.AI cs.LG

    Foresight -- Generative Pretrained Transformer (GPT) for Modelling of Patient Timelines using EHRs

    Authors: Zeljko Kraljevic, Dan Bean, Anthony Shek, Rebecca Bendayan, Harry Hemingway, Joshua Au Yeung, Alexander Deng, Alfie Baston, Jack Ross, Esther Idowu, James T Teo, Richard J Dobson

    Abstract: Background: Electronic Health Records hold detailed longitudinal information about each patient's health status and general clinical history, a large portion of which is stored within the unstructured text. Existing approaches focus mostly on structured data and a subset of single-domain outcomes. We explore how temporal modelling of patients from free text and structured data, using deep generati…

    Submitted 24 January, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

  4. arXiv:2107.03134  [pdf, other]

    cs.CL

    MedGPT: Medical Concept Prediction from Clinical Narratives

    Authors: Zeljko Kraljevic, Anthony Shek, Daniel Bean, Rebecca Bendayan, James Teo, Richard Dobson

    Abstract: The data available in Electronic Health Records (EHRs) provides the opportunity to transform care, and the best way to provide better care for one patient is through learning from the data available on all other patients. Temporal modelling of a patient's medical history, which takes into account the sequence of past events, can be used to predict future events such as a diagnosis of a new disorde…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: 6 pages, 2 figures, 3 tables

  5. arXiv:2011.09361  [pdf, other]

    cs.LG cs.CY

    A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data

    Authors: Zina M Ibrahim, Daniel Bean, Thomas Searle, Honghan Wu, Anthony Shek, Zeljko Kraljevic, James Galloway, Sam Norton, James T Teo, Richard JB Dobson

    Abstract: The ability to perform accurate prognosis of patients is crucial for proactive clinical decision making, informed resource management and personalised care. Existing outcome prediction models suffer from a low recall of infrequent positive outcomes. We present a highly-scalable and robust machine learning framework to automatically predict adversity represented by mortality and ICU admission from…

    Submitted 11 June, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: 14 pages

  6. arXiv:2010.01165  [pdf, other]

    cs.CL cs.AI cs.LG

    Multi-domain Clinical Natural Language Processing with MedCAT: the Medical Concept Annotation Toolkit

    Authors: Zeljko Kraljevic, Thomas Searle, Anthony Shek, Lukasz Roguski, Kawsar Noor, Daniel Bean, Aurelie Mascio, Leilei Zhu, Amos A Folarin, Angus Roberts, Rebecca Bendayan, Mark P Richardson, Robert Stewart, Anoop D Shah, Wai Keong Wong, Zina Ibrahim, James T Teo, Richard JB Dobson

    Abstract: Electronic health records (EHR) contain large volumes of unstructured text, requiring the application of Information Extraction (IE) technologies to enable clinical analysis. We present the open-source Medical Concept Annotation Toolkit (MedCAT) that provides: a) a novel self-supervised machine learning algorithm for extracting concepts using any concept vocabulary including UMLS/SNOMED-CT; b) a f…

    Submitted 25 March, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Comments: Preprint: 27 Pages, 3 Figures

  7. arXiv:2005.06624  [pdf, other]

    cs.CL cs.LG

    Comparative Analysis of Text Classification Approaches in Electronic Health Records

    Authors: Aurelie Mascio, Zeljko Kraljevic, Daniel Bean, Richard Dobson, Robert Stewart, Rebecca Bendayan, Angus Roberts

    Abstract: Text classification tasks which aim at harvesting and/or organizing information from electronic health records are pivotal to support clinical and translational research. However these present specific challenges compared to other classification tasks, notably due to the particular nature of the medical lexicon and language used in clinical records. Recent advances in embedding methods have shown…

    Submitted 8 May, 2020; originally announced May 2020.

  8. arXiv:2002.08901  [pdf]

    cs.CL cs.LG

    Identifying physical health comorbidities in a cohort of individuals with severe mental illness: An application of SemEHR

    Authors: Rebecca Bendayan, Honghan Wu, Zeljko Kraljevic, Robert Stewart, Tom Searle, Jaya Chaturvedi, Jayati Das-Munshi, Zina Ibrahim, Aurelie Mascio, Angus Roberts, Daniel Bean, Richard Dobson

    Abstract: Multimorbidity research in mental health services requires data from physical health conditions which is traditionally limited in mental health care electronic health records. In this study, we aimed to extract data from physical health conditions from clinical notes using SemEHR. Data was extracted from Clinical Record Interactive Search (CRIS) system at South London and Maudsley Biomedical Resea…

    Submitted 7 February, 2020; originally announced February 2020.

    Comments: 4 pages, 2 tables

  9. arXiv:1912.10166  [pdf]

    cs.CL cs.LG stat.ML

    MedCAT -- Medical Concept Annotation Tool

    Authors: Zeljko Kraljevic, Daniel Bean, Aurelie Mascio, Lukasz Roguski, Amos Folarin, Angus Roberts, Rebecca Bendayan, Richard Dobson

    Abstract: Biomedical documents such as Electronic Health Records (EHRs) contain a large amount of information in an unstructured format. The data in EHRs is a hugely valuable resource documenting clinical narratives and decisions, but whilst the text can be easily understood by human doctors it is challenging to use in research and clinical applications. To uncover the potential of biomedical documents we n…

    Submitted 18 December, 2019; originally announced December 2019.

    Comments: Preprint, 25 pages, 5 figures and 4 tables

  10. arXiv:1907.07322  [pdf, other]

    cs.HC cs.CL cs.LG

    MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation

    Authors: Thomas Searle, Zeljko Kraljevic, Rebecca Bendayan, Daniel Bean, Richard Dobson

    Abstract: We present MedCATTrainer an interface for building, improving and customising a given Named Entity Recognition and Linking (NER+L) model for biomedical domain text. NER+L is often used as a first step in deriving value from clinical text. Collecting labelled data for training models is difficult due to the need for specialist domain knowledge. MedCATTrainer offers an interactive web-interface to i…

    Submitted 16 July, 2019; originally announced July 2019.

    Journal ref: EMNLP/IJCNLP 2019