Showing 1–23 of 23 results for author: Hartvigsen, T

Searching in archive cs.
  1. arXiv:2406.16964  [pdf, other]

    cs.LG cs.AI

    Are Language Models Actually Useful for Time Series Forecasting?

    Authors: Mingtian Tan, Mike A. Merrill, Vinayak Gupta, Tim Althoff, Thomas Hartvigsen

    Abstract: Large language models (LLMs) are being applied to time series tasks, particularly time series forecasting. However, are language models actually useful for time series? After a series of ablation studies on three recent and popular LLM-based time series forecasting methods, we find that removing the LLM component or replacing it with a basic attention layer does not degrade the forecasting results…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 25 pages, 8 figures and 20 tables

  2. arXiv:2406.12066  [pdf, other]

    cs.CL

    Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

    Authors: Jack Gallifant, Shan Chen, Pedro Moreira, Nikolaj Munch, Mingye Gao, Jackson Pond, Leo Anthony Celi, Hugo Aerts, Thomas Hartvigsen, Danielle Bitterman

    Abstract: Medical knowledge is context-dependent and requires consistent reasoning across various natural language expressions of semantically equivalent phrases. This is particularly crucial for drug names, where patients often use brand names like Advil or Tylenol instead of their generic equivalents. To study this, we create a new robustness dataset, RABBITS, to evaluate performance differences on medica…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: submitted for review, total 15 pages

  3. arXiv:2405.19567  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding

    Authors: Shenghuan Sun, Gregory M. Goldgof, Alexander Schubert, Zhiqing Sun, Thomas Hartvigsen, Atul J. Butte, Ahmed Alaa

    Abstract: Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions to assist in diagnostic and treatment tasks. However, VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information. This challenge is particularly pronounced in the medical domain, where we do not only require VL…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Code available at: https://github.com/AlaaLab/Dr-LLaVA

  4. arXiv:2405.09373  [pdf, other]

    cs.CL

    PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models

    Authors: Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap

    Abstract: Recent advances in large language models (LLMs) have led to their extensive global deployment, and ensuring their safety calls for comprehensive and multilingual toxicity evaluations. However, existing toxicity benchmarks are overwhelmingly focused on English, posing serious risks to deploying LLMs in other languages. We address this by introducing PolygloToxicityPrompts (PTP), the first large-sca…

    Submitted 20 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  5. arXiv:2404.15004  [pdf, other]

    cs.CL

    TAXI: Evaluating Categorical Knowledge Editing for Language Models

    Authors: Derek Powell, Walter Gerych, Thomas Hartvigsen

    Abstract: Humans rarely learn one fact in isolation. Instead, learning a new fact induces knowledge of other facts about the world. For example, in learning a korat is a type of cat, you also infer it is a mammal and has claws, ensuring your model of the world is consistent. Knowledge editing aims to inject new facts into language models to improve their factuality, but current benchmarks fail to evaluate c…

    Submitted 6 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted to ACL 2024 (Findings)

  6. arXiv:2404.11757  [pdf, other]

    cs.CL

    Language Models Still Struggle to Zero-shot Reason about Time Series

    Authors: Mike A. Merrill, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, Tim Althoff

    Abstract: Time series are critical for decision-making in fields like finance and healthcare. Their importance has driven a recent influx of works passing time series into language models, leading to non-trivial forecasting on some datasets. But it remains unknown whether non-trivial forecasting implies that language models can reason about time series. To address this gap, we generate a first-of-its-kind e…

    Submitted 17 April, 2024; originally announced April 2024.

  7. arXiv:2403.01628  [pdf, ps, other]

    cs.LG

    Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium

    Authors: Hyewon Jeong, Sarah Jabbour, Yuzhe Yang, Rahul Thapta, Hussein Mozannar, William Jongwon Han, Nikita Mehandru, Michael Wornow, Vladislav Lialin, Xin Liu, Alejandro Lozano, Jiacheng Zhu, Rafal Dariusz Kocielnik, Keith Harrigian, Haoran Zhang, Edward Lee, Milos Vukadinovic, Aparna Balagopalan, Vincent Jeanselme, Katherine Matton, Ilker Demirel, Jason Fries, Parisa Rashidi, Brett Beaulieu-Jones, Xuhai Orson Xu, et al. (18 additional authors not shown)

    Abstract: The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four vir…

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: ML4H 2023, Research Roundtables

  8. arXiv:2403.00131  [pdf, other]

    cs.LG cs.AI

    UNITS: A Unified Multi-Task Time Series Model

    Authors: Shanghua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik

    Abstract: Advances in time series models are driving a shift from conventional deep learning methods to pre-trained foundational models. While pre-trained transformers and reprogrammed text-based LLMs report state-of-the-art results, the best-performing architectures vary significantly across tasks, and models often have limited scope, such as focusing only on time series forecasting. Models that unify pred…

    Submitted 29 May, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

  9. arXiv:2402.15861  [pdf, other]

    cs.CL

    MATHWELL: Generating Age-Appropriate Educational Math Word Problems

    Authors: Bryan R Christ, Jonathan Kropko, Thomas Hartvigsen

    Abstract: Math word problems are critical K-8 educational tools, but writing them is time-consuming and requires domain expertise. We suggest that language models can support K-8 math education by automatically generating problems. To be educational, generated problems must be 1) solvable, 2) accurate, and 3) appropriate. Existing datasets are unlabeled for these criteria, making them ill-suited for trainin…

    Submitted 16 April, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: 26 pages, 9 figures

  10. arXiv:2402.08225  [pdf, other]

    cs.LG

    Improving Black-box Robustness with In-Context Rewriting

    Authors: Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen

    Abstract: Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API. Test-time augmentation (TTA) is a simple post-hoc technique…

    Submitted 15 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  11. arXiv:2402.04398  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning from Time Series under Temporal Label Noise

    Authors: Sujay Nagaraj, Walter Gerych, Sana Tonekaboni, Anna Goldenberg, Berk Ustun, Thomas Hartvigsen

    Abstract: Many sequential classification tasks are affected by label noise that varies over time. Such noise can cause label quality to improve, worsen, or periodically change over time. We first propose and formalize temporal label noise, an unstudied problem for sequential classification of time series. In this setting, multiple labels are recorded in sequence while being corrupted by a time-dependent noi…

    Submitted 6 February, 2024; originally announced February 2024.

  12. arXiv:2312.00655   

    cs.LG

    Machine Learning for Health symposium 2023 -- Findings track

    Authors: Stefan Hegselmann, Antonio Parziale, Divya Shanmugam, Shengpu Tang, Mercy Nyamewaa Asiedu, Serina Chang, Thomas Hartvigsen, Harvineet Singh

    Abstract: A collection of the accepted Findings papers that were presented at the 3rd Machine Learning for Health symposium (ML4H 2023), which was held on December 10, 2023, in New Orleans, Louisiana, USA. ML4H 2023 invited high-quality submissions on relevant problems in a variety of health-related disciplines including healthcare, biomedicine, and public health. Two submission tracks were offered: the arc…

    Submitted 15 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    MSC Class: 68Txx. ACM Class: I.2; J.3; I.6; I.4

  13. arXiv:2311.02466  [pdf, other]

    cs.LG cs.AI

    Multi-State Brain Network Discovery

    Authors: Hang Yin, Yao Su, Xinyue Liu, Thomas Hartvigsen, Yanhua Li, Xiangnan Kong

    Abstract: Brain network discovery aims to find nodes and edges from the spatio-temporal signals obtained by neuroimaging data, such as fMRI scans of human brains. Existing methods tend to derive representative or average brain networks, assuming observed signals are generated by only a single brain activity state. However, the human brain usually involves multiple activity states, which jointly determine th…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: Published as a regular paper at IEEE BigData 2023

  14. arXiv:2307.13503  [pdf, other]

    cs.LG stat.ML

    Continuous Time Evidential Distributions for Irregular Time Series

    Authors: Taylor W. Killian, Haoran Zhang, Thomas Hartvigsen, Ava P. Amini

    Abstract: Prevalent in many real-world settings such as healthcare, irregular time series are challenging to formulate predictions from. It is difficult to infer the value of a feature at any given time when observations are sporadic, as it could take on a range of values depending on when it was last observed. To characterize this uncertainty we present EDICT, a strategy that learns an evidential distribut…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop on Interpretable Machine Learning in Healthcare. Code is available at https://github.com/twkillian/EDICT

  15. arXiv:2306.02109  [pdf, other]

    cs.LG

    Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency

    Authors: Owen Queen, Thomas Hartvigsen, Teddy Koker, Huan He, Theodoros Tsiligkaridis, Marinka Zitnik

    Abstract: Interpreting time series models is uniquely challenging because it requires identifying both the location of time series signals that drive model predictions and their matching to an interpretable temporal pattern. While explainers from other modalities can be applied to time series, their inductive biases do not transfer well to the inherently challenging interpretation of time series. We present…

    Submitted 24 October, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted to NeurIPS 2023 (spotlight)

  16. arXiv:2304.03728  [pdf, other]

    cs.CL

    Interpretable Unified Language Checking

    Authors: Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen, Xixin Wu, Danny Fox, Helen Meng, James Glass

    Abstract: Despite recent concerns about undesirable behaviors generated by large language models (LLMs), including non-factual, biased, and hateful language, we find LLMs are inherent multi-task language checkers based on their latent representations of natural and social knowledge. We present an interpretable, unified, language checking (UniLC) method for both human and machine-generated language that aims…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: 10 + 5 pages

  17. arXiv:2302.04052  [pdf, other]

    cs.LG

    Finding Short Signals in Long Irregular Time Series with Continuous-Time Attention Policy Networks

    Authors: Thomas Hartvigsen, Jidapa Thadajarassiri, Xiangnan Kong, Elke Rundensteiner

    Abstract: Irregularly-sampled time series (ITS) are native to high-impact domains like healthcare, where measurements are collected over time at uneven intervals. However, for many classification problems, only small portions of long time series are often relevant to the class label. In this case, existing ITS models often fail to classify long series since they rely on careful imputation, which easily over…

    Submitted 8 February, 2023; originally announced February 2023.

  18. arXiv:2211.11031  [pdf, other]

    cs.LG

    Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors

    Authors: Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi

    Abstract: Deployed language models decay over time due to shifting inputs, changing user needs, or emergent world-knowledge gaps. When such problems are identified, we want to make targeted edits while avoiding expensive retraining. However, current model editors, which modify such behaviors of pre-trained models, degrade model performance quickly across multiple, sequential edits. We propose GRACE, a lifel…

    Submitted 17 October, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS 2023

  19. arXiv:2210.05411  [pdf, other]

    cs.LG

    Class-Specific Explainability for Deep Time Series Classifiers

    Authors: Ramesh Doddaiah, Prathyush Parvatharaju, Elke Rundensteiner, Thomas Hartvigsen

    Abstract: Explainability helps users trust deep learning solutions for time series classification. However, existing explainability methods for multi-class time series classifiers focus on one class at a time, ignoring relationships between the classes. Instead, when a classifier is choosing between many classes, an effective explanation must show what sets the chosen class apart from the rest. We now forma…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: This paper is accepted in ICDM 2022

  20. Stop&Hop: Early Classification of Irregular Time Series

    Authors: Thomas Hartvigsen, Walter Gerych, Jidapa Thadajarassiri, Xiangnan Kong, Elke Rundensteiner

    Abstract: Early classification algorithms help users react faster to their machine learning model's predictions. Early warning systems in hospitals, for example, let clinicians improve their patients' outcomes by accurately predicting infections. While early classification systems are advancing rapidly, a major gap remains: existing systems do not consider irregular time series, which have uneven and often-…

    Submitted 20 August, 2022; originally announced August 2022.

    Comments: This paper was accepted to CIKM'22. Code at https://github.com/thartvigsen/StopAndHop

  21. arXiv:2205.10726  [pdf, other]

    cs.CL cs.AI cs.LG

    TWEET-FID: An Annotated Dataset for Multiple Foodborne Illness Detection Tasks

    Authors: Ruofan Hu, Dongyu Zhang, Dandan Tao, Thomas Hartvigsen, Hao Feng, Elke Rundensteiner

    Abstract: Foodborne illness is a serious but preventable public health problem -- with delays in detecting the associated outbreaks resulting in productivity loss, expensive recalls, public safety hazards, and even loss of life. While social media is a promising source for identifying unreported foodborne illnesses, there is a dearth of labeled datasets for developing effective outbreak detection models. To…

    Submitted 13 September, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: LREC 2022

  22. arXiv:2205.03295  [pdf, other]

    cs.LG cs.AI cs.CY

    The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

    Authors: Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi

    Abstract: Machine learning models in safety-critical settings like healthcare are often blackboxes: they contain a large number of parameters which are not transparent to users. Post-hoc explainability methods where a simple, human-interpretable model imitates the behavior of these blackbox models are often proposed to help users trust model predictions. In this work, we audit the quality of such explanatio…

    Submitted 2 June, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: Published in FAccT 2022

  23. arXiv:2203.09509  [pdf, other]

    cs.CL

    ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

    Authors: Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, Ece Kamar

    Abstract: Toxic language detection systems often falsely flag text that contains minority group mentions as toxic, as those groups are often the targets of online hate. Such over-reliance on spurious correlations also causes systems to struggle with detecting implicitly toxic language. To help mitigate these issues, we create ToxiGen, a new large-scale and machine-generated dataset of 274k toxic and benign…

    Submitted 14 July, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Published as a long paper at ACL 2022. Code: https://github.com/microsoft/TOXIGEN