Evaluating Physician-AI Interaction for Cancer Management: Paving the Path towards Precision Oncology
Authors:
Zeshan Hussain,
Barbara D. Lam,
Fernando A. Acosta-Perez,
Irbaz Bin Riaz,
Maia Jacobs,
Andrew J. Yee,
David Sontag
Abstract:
We evaluated how clinicians approach clinical decision-making when given findings from both randomized controlled trials (RCTs) and machine learning (ML) models. To do so, we designed a clinical decision support system (CDSS) that displays survival curves and adverse event information from a synthetic RCT and ML model for 12 patients with multiple myeloma. We conducted an interventional study in a…
▽ More
We evaluated how clinicians approach clinical decision-making when given findings from both randomized controlled trials (RCTs) and machine learning (ML) models. To do so, we designed a clinical decision support system (CDSS) that displays survival curves and adverse event information from a synthetic RCT and ML model for 12 patients with multiple myeloma. We conducted an interventional study in a simulated setting to evaluate how clinicians synthesized the available data to make treatment decisions. Participants were invited to participate in a follow-up interview to discuss their choices in an open-ended format. When ML model results were concordant with RCT results, physicians had increased confidence in treatment choice compared to when they were given RCT results alone. When ML model results were discordant with RCT results, the majority of physicians followed the ML model recommendation in their treatment selection. Perceived reliability of the ML model was consistently higher after physicians were provided with data on how it was trained and validated. Follow-up interviews revealed four major themes: (1) variability in what variables participants used for decision-making, (2) perceived advantages to an ML model over RCT data, (3) uncertainty around decision-making when the ML model quality was poor, and (4) perception that this type of study is an important thought exercise for clinicians. Overall, ML-based CDSSs have the potential to change treatment decisions in cancer management. However, meticulous development and validation of these systems as well as clinician training are required before deployment.
△ Less
Submitted 23 April, 2024;
originally announced April 2024.
Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study
Authors:
Niklas Mannhardt,
Elizabeth Bondi-Kelly,
Barbara Lam,
Chloe O'Connell,
Mercy Asiedu,
Hussein Mozannar,
Monica Agrawal,
Alejandro Buendia,
Tatiana Urman,
Irbaz B. Riaz,
Catherine E. Ricciardi,
Marzyeh Ghassemi,
David Sontag
Abstract:
Patients derive numerous benefits from reading their clinical notes, including an increased sense of control over their health and improved understanding of their care plan. However, complex medical concepts and jargon within clinical notes hinder patient comprehension and may lead to anxiety. We developed a patient-facing tool to make clinical notes more readable, leveraging large language models…
▽ More
Patients derive numerous benefits from reading their clinical notes, including an increased sense of control over their health and improved understanding of their care plan. However, complex medical concepts and jargon within clinical notes hinder patient comprehension and may lead to anxiety. We developed a patient-facing tool to make clinical notes more readable, leveraging large language models (LLMs) to simplify, extract information from, and add context to notes. We prompt engineered GPT-4 to perform these augmentation tasks on real clinical notes donated by breast cancer survivors and synthetic notes generated by a clinician, a total of 12 notes with 3868 words. In June 2023, 200 female-identifying US-based participants were randomly assigned three clinical notes with varying levels of augmentations using our tool. Participants answered questions about each note, evaluating their understanding of follow-up actions and self-reported confidence. We found that augmentations were associated with a significant increase in action understanding score (0.63 $\pm$ 0.04 for select augmentations, compared to 0.54 $\pm$ 0.02 for the control) with p=0.002. In-depth interviews of self-identifying breast cancer patients (N=7) were also conducted via video conferencing. Augmentations, especially definitions, elicited positive responses among the seven participants, with some concerns about relying on LLMs. Augmentations were evaluated for errors by clinicians, and we found misleading errors occur, with errors more common in real donated notes than synthetic notes, illustrating the importance of carefully written clinical notes. Augmentations improve some but not all readability metrics. This work demonstrates the potential of LLMs to improve patients' experience with clinical notes at a lower burden to clinicians. However, having a human in the loop is important to correct potential model errors.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.