
Showing 1–4 of 4 results for author: Hegselmann, S

  1. arXiv:2402.15422

    cs.CL cs.AI cs.LG

    A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models

    Authors: Stefan Hegselmann, Shannon Zejiang Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang

    Abstract: Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide explanations. In this work, we investigate the potential of large language models to generate patient summaries based on doctors' notes and study the effect of training data on the faithfulness and quality of the generated summaries. To this end, we release (i) a rig…

    Submitted 25 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  2. arXiv:2312.00655

    cs.LG

    Machine Learning for Health symposium 2023 -- Findings track

    Authors: Stefan Hegselmann, Antonio Parziale, Divya Shanmugam, Shengpu Tang, Mercy Nyamewaa Asiedu, Serina Chang, Thomas Hartvigsen, Harvineet Singh

    Abstract: A collection of the accepted Findings papers that were presented at the 3rd Machine Learning for Health symposium (ML4H 2023), which was held on December 10, 2023, in New Orleans, Louisiana, USA. ML4H 2023 invited high-quality submissions on relevant problems in a variety of health-related disciplines including healthcare, biomedicine, and public health. Two submission tracks were offered: the arc…

    Submitted 15 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    MSC Class: 68Txx ACM Class: I.2; J.3; I.6; I.4

  3. arXiv:2210.10723

    cs.CL cs.AI

    TabLLM: Few-shot Classification of Tabular Data with Large Language Models

    Authors: Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, David Sontag

    Abstract: We study the application of large language models to zero-shot and few-shot classification of tabular data. We prompt the large language model with a serialization of the tabular data to a natural-language string, together with a short description of the classification problem. In the few-shot setting, we fine-tune the large language model using some labeled examples. We evaluate several serializa…

    Submitted 17 March, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

  4. arXiv:2205.12689

    cs.CL cs.AI

    Large Language Models are Few-Shot Clinical Information Extractors

    Authors: Monica Agrawal, Stefan Hegselmann, Hunter Lang, Yoon Kim, David Sontag

    Abstract: A long-running goal of the clinical NLP community is the extraction of important variables trapped in clinical notes. However, roadblocks have included dataset shift from the general domain and a lack of public clinical corpora and annotations. In this work, we show that large language models, such as InstructGPT, perform well at zero- and few-shot information extraction from clinical text despite…

    Submitted 30 November, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted as a long paper to The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)