Search | arXiv e-print repository

Assessing Empathy in Large Language Models with Real-World Physician-Patient Interactions

Authors: Man Luo, Christopher J. Warren, Lu Cheng, Haidar M. Abdul-Muhsin, Imon Banerjee

Abstract: The integration of Large Language Models (LLMs) into the healthcare domain has the potential to significantly enhance patient care and support through the development of empathetic, patient-facing chatbots. This study investigates an intriguing question: can ChatGPT respond with a greater degree of empathy than the responses typically offered by physicians? To answer this question, we collect a de-identified dataset of patient messages and physician responses from Mayo Clinic and generate alternative replies using ChatGPT. Our analyses incorporate a novel empathy ranking evaluation (EMRank) involving both automated metrics and human assessments to gauge the empathy level of responses. Our findings indicate that LLM-powered chatbots have the potential to surpass human physicians in delivering empathetic communication, suggesting a promising avenue for enhancing patient care and reducing professional burnout. The study not only highlights the importance of empathy in patient interactions but also proposes a set of effective automatic empathy ranking metrics, paving the way for the broader adoption of LLMs in healthcare.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.00752 [pdf, other]

Clustering Running Titles to Understand the Printing of Early Modern Books

Authors: Nikolai Vogler, Kartik Goyal, Samuel V. Lemley, D. J. Schuldt, Christopher N. Warren, Max G'Sell, Taylor Berg-Kirkpatrick

Abstract: We propose a novel computational approach to automatically analyze the physical process behind the printing of early modern letterpress books by clustering the running titles found at the top of their pages. Specifically, we design and compare custom neural and feature-based kernels for computing pairwise visual similarity of a scanned document's running titles and cluster the titles in order to track any deviations from the expected pattern of a book's printing. Unlike body text, which must be reset for every page, the running titles are one of the static type elements in a skeleton forme, i.e., the frame used to print each side of a sheet of paper, and were often re-used during a book's printing. To evaluate the effectiveness of our approach, we manually annotate the running title clusters on about 1600 pages across 8 early modern books of varying sizes and formats. Our method can detect potential deviations from the expected patterns of such skeleton formes, which helps bibliographers understand the phenomena associated with a text's transmission, such as censorship. We also validate our results against a manual bibliographic analysis of a counterfeit early edition of Thomas Hobbes' Leviathan (1651).

Submitted 22 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

Comments: Accepted at ICDAR 2024; updated Acknowledgments in v2

arXiv:2311.15113 [pdf, other]

NCL-SM: A Fully Annotated Dataset of Images from Human Skeletal Muscle Biopsies

Authors: Atif Khan, Conor Lawless, Amy Vincent, Charlotte Warren, Valeria Di Leo, Tiago Gomes, A. Stephen McGough

Abstract: Single cell analysis of human skeletal muscle (SM) tissue cross-sections is a fundamental tool for understanding many neuromuscular disorders. For this analysis to be reliable and reproducible, identification of individual fibres within microscopy images (segmentation) of SM tissue should be automatic and precise. Biomedical scientists in this field currently rely on custom tools and general machine learning (ML) models, both followed by labour-intensive and subjective manual interventions to fine-tune segmentation. We believe that fully automated, precise, reproducible segmentation is possible by training ML models. However, in this important biomedical domain, there are currently no good-quality, publicly available annotated imaging datasets for ML model training. In this paper we release NCL-SM: a high-quality bioimaging dataset of 46 human SM tissue cross-sections from both healthy control subjects and patients with genetically diagnosed muscle pathology. These images include >50k manually segmented muscle fibres (myofibres). In addition, we curated high-quality myofibre segmentations, annotating reasons for rejecting low-quality myofibres and low-quality regions in SM tissue images, making these annotations completely ready for downstream analysis. This, we believe, will pave the way for development of a fully automatic pipeline that identifies individual myofibres within images of tissue sections and, in particular, also classifies individual myofibres that are fit for further analysis.

Submitted 25 November, 2023; originally announced November 2023.

Comments: Paper accepted at the Big Data Analytics for Health and Medicine (BDA4HM) workshop, IEEE BigData 2023, December 15th-18th, 2023, Sorrento, Italy, 07 pages. arXiv admin note: substantial text overlap with arXiv:2311.11099

arXiv:2311.11099 [pdf, other]

Introducing NCL-SM: A Fully Annotated Dataset of Images from Human Skeletal Muscle Biopsies

Authors: Atif Khan, Conor Lawless, Amy Vincent, Charlotte Warren, Valeria Di Leo, Tiago Gomes, A. Stephen McGough

Abstract: Single cell analysis of skeletal muscle (SM) tissue is a fundamental tool for understanding many neuromuscular disorders. For this analysis to be reliable and reproducible, identification of individual fibres within microscopy images (segmentation) of SM tissue should be precise. There is currently no tool or pipeline that makes automatic and precise segmentation and curation of images of SM tissue cross-sections possible. Biomedical scientists in this field rely on custom tools and general machine learning (ML) models, both followed by labour-intensive and subjective manual interventions to get the segmentation right. We believe that automated, precise, reproducible segmentation is possible by training ML models. However, there are currently no good-quality, publicly available annotated imaging datasets for ML model training. In this paper we release NCL-SM: a high-quality bioimaging dataset of 46 human tissue sections from healthy control subjects and from patients with genetically diagnosed muscle pathology. These images include >50k manually segmented muscle fibres (myofibres). In addition, we curated high-quality myofibres and annotated reasons for rejecting low-quality myofibres and regions in SM tissue images, making this data completely ready for downstream analysis. This, we believe, will pave the way for development of a fully automatic pipeline that identifies individual myofibres within images of tissue sections and, in particular, also classifies individual myofibres that are fit for further analysis.

Submitted 18 November, 2023; originally announced November 2023.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 09 pages. Full Paper presented at Big Data Analytics for Health and Medicine (BDA4HM) workshop, IEEE BigData 2023, December 15th-18th, 2023, Sorrento, Italy

arXiv:2306.07998 [pdf, other]

Contrastive Attention Networks for Attribution of Early Modern Print

Authors: Nikolai Vogler, Kartik Goyal, Kishore PV Reddy, Elizaveta Pertseva, Samuel V. Lemley, Christopher N. Warren, Max G'Sell, Taylor Berg-Kirkpatrick

Abstract: In this paper, we develop machine learning techniques to identify unknown printers in early modern (c. 1500–1800) English printed books. Specifically, we focus on matching uniquely damaged character type-imprints in anonymously printed books to works with known printers in order to provide evidence of their origins. Until now, this work has been limited to manual investigations by analytical bibliographers. We present a Contrastive Attention-based Metric Learning approach to identify similar damage across character image pairs, which is sensitive to very subtle differences in glyph shapes, yet robust to various confounding sources of noise associated with digitized historical books. To overcome the scarcity of supervised data, we design a random data synthesis procedure that aims to simulate bends, fractures, and inking variations induced by the early printing process. Our method successfully improves downstream damaged type-imprint matching among printed works from this period, as validated by in-domain human experts. The results of our approach on two important philosophical works from the Early Modern period demonstrate potential to extend the extant historical research about the origins and content of these books.

Submitted 12 June, 2023; originally announced June 2023.

Comments: Proceedings of AAAI 2023

arXiv:2005.01646 [pdf, other]

A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing

Authors: Kartik Goyal, Chris Dyer, Christopher Warren, Max G'Sell, Taylor Berg-Kirkpatrick

Abstract: We propose a deep and interpretable probabilistic generative model to analyze glyph shapes in printed Early Modern documents. We focus on clustering extracted glyph images into underlying templates in the presence of multiple confounding sources of variance. Our approach introduces a neural editor model that first generates well-understood printing phenomena like spatial perturbations from template parameters via interpretable latent variables, and then modifies the result by generating a non-interpretable latent vector responsible for inking variations, jitter, noise from the archiving process, and other unforeseen phenomena associated with Early Modern printing. Critically, by introducing an inference network whose input is restricted to the visual residual between the observation and the interpretably-modified template, we are able to control and isolate what the vector-valued latent variable captures. We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.

Submitted 4 May, 2020; originally announced May 2020.

Comments: To appear at ACL 2020

arXiv:1710.09598 [pdf]

Privacy Preserving Internet Browsers: Forensic Analysis of Browzar

Authors: Christopher Warren, Eman El-Sheikh, Nhien-An Le-Khac

Abstract: With the advance of technology, Criminal Justice agencies are being confronted with an increased need to investigate crimes perpetrated partially or entirely over the Internet. These types of crime are known as cybercrimes. In order to conceal illegal online activity, criminals often use private browsing features or browsers designed to provide total browsing privacy. The use of private browsing is a common challenge faced in, for example, child exploitation investigations, which usually originate on the Internet. Although private browsing features are not designed specifically for criminal activity, they have become a valuable tool for criminals looking to conceal their online activity. As such, Technological Crime units often focus their forensic analysis on thoroughly examining the web history on a computer. Private browsing features and browsers often require a more in-depth, post mortem analysis. This often requires the use of multiple tools, as well as different forensic approaches to uncover incriminating evidence. This evidence may be required in a court of law, where analysts are often challenged both on their findings and on the tools and approaches used to recover evidence. However, there is very little research evaluating private browsing in terms of privacy preservation, or on the forensic acquisition and analysis of privacy-preserving internet browsers. Therefore, in this chapter, we first review the private mode of popular internet browsers. Next, we describe the forensic acquisition and analysis of Browzar, a privacy-preserving internet browser, and compare it with other popular internet browsers.

Submitted 26 October, 2017; originally announced October 2017.

Showing 1–7 of 7 results for author: Warren, C