
Showing 1–50 of 99 results for author: Ghassemi, M

Searching in archive cs.
  1. arXiv:2406.18562  [pdf, other]

    cs.CV cs.LG

    Views Can Be Deceiving: Improved SSL Through Feature Space Augmentation

    Authors: Kimia Hamidieh, Haoran Zhang, Swami Sankaranarayanan, Marzyeh Ghassemi

    Abstract: Supervised learning methods have been found to exhibit inductive biases favoring simpler features. When such features are spuriously correlated with the label, this can result in suboptimal performance on minority subgroups. Despite the growing popularity of methods which learn from unlabeled data, the extent to which these representations rely on spurious features for prediction is unclear. In th…

    Submitted 28 May, 2024; originally announced June 2024.

  2. arXiv:2406.16846  [pdf, other]

    cs.LG cs.CY stat.ML

    Data Debiasing with Datamodels (D3M): Improving Subgroup Robustness via Data Selection

    Authors: Saachi Jain, Kimia Hamidieh, Kristian Georgiev, Andrew Ilyas, Marzyeh Ghassemi, Aleksander Madry

    Abstract: Machine learning models can fail on subgroups that are underrepresented during training. While techniques such as dataset balancing can improve performance on underperforming groups, they require access to training group annotations and can end up removing large portions of the dataset. In this paper, we introduce Data Debiasing with Datamodels (D3M), a debiasing approach which isolates and remove…

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2406.02745  [pdf, other]

    cs.LG

    Measuring Stochastic Data Complexity with Boltzmann Influence Functions

    Authors: Nathan Ng, Roger Grosse, Marzyeh Ghassemi

    Abstract: Estimating the uncertainty of a model's prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts. A minimum description length approach to this problem uses the predictive normalized maximum likelihood (pNML) distribution, which considers every possible label for a data point, and decreases confidence in a prediction if other labels are also co…

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2405.19356  [pdf, other]

    eess.SP cs.AI cs.LG cs.RO

    An LSTM Feature Imitation Network for Hand Movement Recognition from sEMG Signals

    Authors: Chuheng Wu, S. Farokh Atashzar, Mohammad M. Ghassemi, Tuka Alhanai

    Abstract: Surface Electromyography (sEMG) is a non-invasive signal that is used in the recognition of hand movement patterns, the diagnosis of diseases, and the robust control of prostheses. Despite the remarkable success of recent end-to-end Deep Learning approaches, they are still limited by the need for large amounts of labeled data. To alleviate the requirement for big data, researchers utilize Feature…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: This work has been submitted to RA-L and is under review

  5. arXiv:2405.13804  [pdf, other]

    cs.CR

    Guarding Multiple Secrets: Enhanced Summary Statistic Privacy for Data Sharing

    Authors: Shuaiqi Wang, Rongzhe Wei, Mohsen Ghassemi, Eleonora Kreacic, Vamsi K. Potluru

    Abstract: Data sharing enables critical advances in many research areas and business applications, but it may lead to inadvertent disclosure of sensitive summary statistics (e.g., means or quantiles). Existing literature only focuses on protecting a single confidential quantity, while in practice, data sharing involves multiple sensitive statistics. We propose a novel framework to define, analyze, and prote…

    Submitted 12 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  6. arXiv:2405.12021  [pdf, other]

    cs.CL

    Can AI Relate: Testing Large Language Model Response for Mental Health Support

    Authors: Saadia Gabriel, Isha Puri, Xuhai Xu, Matteo Malgaroli, Marzyeh Ghassemi

    Abstract: Large language models (LLMs) are already being piloted for clinical use in hospital systems like NYU Langone, Dana-Farber and the NHS. A proposed deployment use case is psychotherapy, where an LLM-powered chatbot can treat a patient undergoing a mental health crisis. Deployment of LLMs for mental health response could hypothetically broaden access to psychotherapy and provide new possibilities for…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Under review

  7. arXiv:2403.17381  [pdf, other]

    cs.LG cs.AI

    Application-Driven Innovation in Machine Learning

    Authors: David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Esther Rolf, Milind Tambe, Adam White

    Abstract: As applications of machine learning proliferate, innovative algorithms inspired by specific real-world challenges have become increasingly important. Such work offers the potential for significant impact not merely in domains of application but also in machine learning itself. In this paper, we describe the paradigm of application-driven research in machine learning, contrasting it with the more s…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures

  8. Time2Stop: Adaptive and Explainable Human-AI Loop for Smartphone Overuse Intervention

    Authors: Adiba Orzikulova, Han Xiao, Zhipeng Li, Yukang Yan, Yuntao Wang, Yuanchun Shi, Marzyeh Ghassemi, Sung-Ju Lee, Anind K Dey, Xuhai "Orson" Xu

    Abstract: Despite a rich history of investigating smartphone overuse intervention techniques, AI-based just-in-time adaptive intervention (JITAI) methods for overuse reduction are lacking. We develop Time2Stop, an intelligent, adaptive, and explainable JITAI system that leverages machine learning to identify optimal intervention timings, introduces interventions with transparent AI explanations, and collect…

    Submitted 3 March, 2024; originally announced March 2024.

  9. arXiv:2403.01628  [pdf, ps, other]

    cs.LG

    Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium

    Authors: Hyewon Jeong, Sarah Jabbour, Yuzhe Yang, Rahul Thapta, Hussein Mozannar, William Jongwon Han, Nikita Mehandru, Michael Wornow, Vladislav Lialin, Xin Liu, Alejandro Lozano, Jiacheng Zhu, Rafal Dariusz Kocielnik, Keith Harrigian, Haoran Zhang, Edward Lee, Milos Vukadinovic, Aparna Balagopalan, Vincent Jeanselme, Katherine Matton, Ilker Demirel, Jason Fries, Parisa Rashidi, Brett Beaulieu-Jones, Xuhai Orson Xu, et al. (18 additional authors not shown)

    Abstract: The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four vir…

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: ML4H 2023, Research Roundtables

  10. arXiv:2402.16842  [pdf, other]

    cs.LG

    Asymmetry in Low-Rank Adapters of Foundation Models

    Authors: Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon

    Abstract: Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. Specifically,…

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 17 pages, 2 figures, 9 tables

  11. arXiv:2402.08225  [pdf, other]

    cs.LG

    Improving Black-box Robustness with In-Context Rewriting

    Authors: Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen

    Abstract: Machine learning models often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API. Test-time augmentation (TTA) is a simple post-hoc technique…

    Submitted 15 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  12. arXiv:2402.04075  [pdf, other]

    cs.CL

    Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models

    Authors: Reza Khanmohammadi, Ahmed I Ghanem, Kyle Verdecchia, Ryan Hall, Mohamed Elshaikh, Benjamin Movsas, Hassan Bagher-Ebadian, Indrin Chetty, Mohammad M. Ghassemi, Kundan Thind

    Abstract: This study introduces a novel teacher-student architecture utilizing Large Language Models (LLMs) to improve prostate cancer radiotherapy symptom extraction from clinical notes. Mixtral, the student model, initially extracts symptoms, followed by GPT-4, the teacher model, which refines prompts based on Mixtral's performance. This iterative process involved 294 single symptom clinical notes across…

    Submitted 6 February, 2024; originally announced February 2024.

  13. arXiv:2401.09637  [pdf, other]

    cs.HC cs.AI cs.CL

    Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study

    Authors: Niklas Mannhardt, Elizabeth Bondi-Kelly, Barbara Lam, Chloe O'Connell, Mercy Asiedu, Hussein Mozannar, Monica Agrawal, Alejandro Buendia, Tatiana Urman, Irbaz B. Riaz, Catherine E. Ricciardi, Marzyeh Ghassemi, David Sontag

    Abstract: Patients derive numerous benefits from reading their clinical notes, including an increased sense of control over their health and improved understanding of their care plan. However, complex medical concepts and jargon within clinical notes hinder patient comprehension and may lead to anxiety. We developed a patient-facing tool to make clinical notes more readable, leveraging large language models…

    Submitted 17 January, 2024; originally announced January 2024.

  14. arXiv:2401.07516  [pdf, other]

    cs.LG cs.SI

    Temporal Link Prediction Using Graph Embedding Dynamics

    Authors: Sanaz Hasanzadeh Fard, Mohammad Ghassemi

    Abstract: Graphs are a powerful representation tool in machine learning applications, with link prediction being a key task in graph learning. Temporal link prediction in dynamic networks is of particular interest due to its potential for solving complex scientific and real-world problems. Traditional approaches to temporal link prediction have focused on finding the aggregation of dynamics of the network a…

    Submitted 15 January, 2024; originally announced January 2024.

  15. arXiv:2401.00942  [pdf, other]

    cs.CE

    The Influence of Biomedical Research on Future Business Funding: Analyzing Scientific Impact and Content in Industrial Investments

    Authors: Reza Khanmohammadi, Simerjot Kaur, Charese H. Smiley, Tuka Alhanai, Ivan Brugere, Armineh Nourbakhsh, Mohammad M. Ghassemi

    Abstract: This paper investigates the relationship between scientific innovation in biomedical sciences and its impact on industrial activities, focusing on how the historical impact and content of scientific papers influenced future funding and innovation grant application content for small businesses. The research incorporates bibliometric analyses along with SBIR (Small Business Innovation Research) data…

    Submitted 1 January, 2024; originally announced January 2024.

  16. arXiv:2401.00081  [pdf, other]

    cs.LG q-fin.GN

    Synthetic Data Applications in Finance

    Authors: Vamsi K. Potluru, Daniel Borrajo, Andrea Coletta, Niccolò Dalmasso, Yousef El-Laham, Elizabeth Fons, Mohsen Ghassemi, Sriram Gopalakrishnan, Vikesh Gosai, Eleonora Kreačić, Ganapathy Mani, Saheed Obitayo, Deepak Paramanand, Natraj Raman, Mikhail Solonin, Srijan Sood, Svitlana Vyetrenko, Haibei Zhu, Manuela Veloso, Tucker Balch

    Abstract: Synthetic data has made tremendous strides in various commercial settings including finance, healthcare, and virtual reality. We present a broad overview of prototypical applications of synthetic data in the financial sector and in particular provide richer details for a few select ones. These cover a wide variety of data modalities including tabular, time-series, event-series, and unstructured ar…

    Submitted 20 March, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

    Comments: 50 pages, journal submission; updated 6 privacy levels

  17. arXiv:2312.10308  [pdf, other]

    cs.LG

    Event-Based Contrastive Learning for Medical Time Series

    Authors: Hyewon Jeong, Nassim Oufattole, Matthew Mcdermott, Aparna Balagopalan, Bryan Jangeesingh, Marzyeh Ghassemi, Collin Stultz

    Abstract: In clinical practice, one often needs to identify whether a patient is at high risk of adverse outcomes after some key medical event. For example, quantifying the risk of adverse outcomes after an acute cardiovascular event helps healthcare providers identify those patients at the highest risk of poor outcomes; i.e., patients who benefit from invasive therapies that can lower their risk. Assessing…

    Submitted 19 April, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted at Unifying Representations in Neural Models Workshop in NeurIPS 2023

  18. arXiv:2312.10083  [pdf]

    cs.CY cs.AI cs.CV cs.LG

    The Limits of Fair Medical Imaging AI In The Wild

    Authors: Yuzhe Yang, Haoran Zhang, Judy W Gichoya, Dina Katabi, Marzyeh Ghassemi

    Abstract: As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate or propagate healthcare disparities. Prior research has established AI's capacity to infer demographic data from chest X-rays, leading to a key concern: do models using demographic shortcuts have unfair predictions across subpopulations? In this study, we conduct…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Code and data are available at https://github.com/YyzHarry/shortcut-ood-fairness

  19. arXiv:2311.02205  [pdf, other]

    cs.CL

    An Introduction to Natural Language Processing Techniques and Framework for Clinical Implementation in Radiation Oncology

    Authors: Reza Khanmohammadi, Mohammad M. Ghassemi, Kyle Verdecchia, Ahmed I. Ghanem, Luo Bing, Indrin J. Chetty, Hassan Bagher-Ebadian, Farzan Siddiqui, Mohamed Elshaikh, Benjamin Movsas, Kundan Thind

    Abstract: Natural Language Processing (NLP) is a key technique for developing Medical Artificial Intelligence (AI) systems that leverage Electronic Health Record (EHR) data to build diagnostic and prognostic models. NLP enables the conversion of unstructured clinical text into structured data that can be fed into AI algorithms. The emergence of the transformer architecture and large language models (LLMs) h…

    Submitted 8 November, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

  20. arXiv:2309.12325  [pdf, other]

    cs.CY cs.AI cs.CV cs.LG

    FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare

    Authors: Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, Anais Emelie, Andrea Lara, Antonio R Porras, An-Wen Chan, Arcadi Navarro, Ben Glocker, Benard O Botwe, Bishesh Khanal, Brigit Beger, Carol C Wu, Celia Cintas, Curtis P Langlotz, Daniel Rueckert, Deogratias Mzurikwao, Dimitrios I Fotiadis, Doszhan Zhussupov, Enzo Ferrante, Erik Meijering, Eva Weicken, Fabio A González, et al. (93 additional authors not shown)

    Abstract: Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted…

    Submitted 11 August, 2023; originally announced September 2023.

    ACM Class: I.2.0; I.4.0; I.5.0

  21. arXiv:2309.12279  [pdf, ps, other]

    cs.LG stat.ML

    The Broad Impact of Feature Imitation: Neural Enhancements Across Financial, Speech, and Physiological Domains

    Authors: Reza Khanmohammadi, Tuka Alhanai, Mohammad M. Ghassemi

    Abstract: Initialization of neural network weights plays a pivotal role in determining their performance. Feature Imitating Networks (FINs) offer a novel strategy by initializing weights to approximate specific closed-form statistical features, setting a promising foundation for deep learning architectures. While the applicability of FINs has been chiefly tested in biomedical domains, this study extends its…

    Submitted 21 September, 2023; originally announced September 2023.

  22. arXiv:2308.04650  [pdf, other]

    cs.LG eess.SP q-bio.QM

    Deep Metric Learning for the Hemodynamics Inference with Electrocardiogram Signals

    Authors: Hyewon Jeong, Collin M. Stultz, Marzyeh Ghassemi

    Abstract: Heart failure is a debilitating condition that affects millions of people worldwide and has a significant impact on their quality of life and mortality rates. An objective assessment of cardiac pressures remains an important method for the diagnosis and treatment prognostication for patients with heart failure. Although cardiac catheterization is the gold standard for estimating central hemodynami…

    Submitted 10 September, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

    Journal ref: MLHC 2023

  23. arXiv:2308.01525  [pdf, other]

    cs.CV

    VisAlign: Dataset for Measuring the Degree of Alignment between AI and Humans in Visual Perception

    Authors: Jiyoung Lee, Seungho Kim, Seunghyun Won, Joonseok Lee, Marzyeh Ghassemi, James Thorne, Jaeseok Choi, O-Kil Kwon, Edward Choi

    Abstract: AI alignment refers to models acting towards human-intended goals, preferences, or ethical principles. Given that most large-scale deep learning models act as black boxes and cannot be manually controlled, analyzing the similarity between models and humans can be a proxy measure for ensuring AI safety. In this paper, we focus on the models' visual perception alignment with humans, further referred…

    Submitted 20 October, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Published as a conference paper at NeurIPS 2023 (Track on Datasets and Benchmarks)

  24. Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

    Authors: Xuhai Xu, Bingsheng Yao, Yuanzhe Dong, Saadia Gabriel, Hong Yu, James Hendler, Marzyeh Ghassemi, Anind K. Dey, Dakuo Wang

    Abstract: Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant gap in research when it comes to understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present a comprehensive evaluation of multiple LLMs on various mental health prediction tasks via online text data, including Alpaca, Alpaca-LoRA…

    Submitted 28 January, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: Published at Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) 2024

    MSC Class: 68U35 ACM Class: H.5.2; I.2.m

  25. arXiv:2306.14572  [pdf, other]

    eess.IV cs.CV cs.LG

    Feature Imitating Networks Enhance The Performance, Reliability And Speed Of Deep Learning On Biomedical Image Processing Tasks

    Authors: Shangyang Min, Hassan B. Ebadian, Tuka Alhanai, Mohammad Mahdi Ghassemi

    Abstract: Feature-Imitating-Networks (FINs) are neural networks that are first trained to approximate closed-form statistical features (e.g. Entropy), and then embedded into other networks to enhance their performance. In this work, we perform the first evaluation of FINs for biomedical image processing tasks. We begin by training a set of FINs to imitate six common radiomics features, and then compare the…

    Submitted 22 April, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  26. Evaluating the Impact of Social Determinants on Health Prediction in the Intensive Care Unit

    Authors: Ming Ying Yang, Gloria Hyunjung Kwak, Tom Pollard, Leo Anthony Celi, Marzyeh Ghassemi

    Abstract: Social determinants of health (SDOH) -- the conditions in which people live, grow, and age -- play a crucial role in a person's health and well-being. There is a large, compelling body of evidence in population health studies showing that a wide range of SDOH is strongly correlated with health outcomes. Yet, a majority of the risk prediction models based on electronic health records (EHR) do not i…

    Submitted 14 August, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Journal ref: In AAAI/ACM Conference on AI, Ethics, and Society (AIES '23), August 8-10, 2023, Montreal, QC, Canada. ACM, New York, NY, USA, 18 pages

  27. arXiv:2305.11348  [pdf, other]

    cs.LG cs.CL cs.CR cs.CY

    In the Name of Fairness: Assessing the Bias in Clinical Record De-identification

    Authors: Yuxin Xiao, Shulammite Lim, Tom Joseph Pollard, Marzyeh Ghassemi

    Abstract: Data sharing is crucial for open science and reproducible research, but the legal sharing of clinical data requires the removal of protected health information from electronic health records. This process, known as de-identification, is often achieved through the use of machine learning algorithms by many commercial and open-source systems. While these systems have shown compelling results on aver…

    Submitted 2 January, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted by FAccT 2023; updated appendix with the de-identification performance of GPT-4

  28. arXiv:2303.13567  [pdf]

    cs.LG cs.CV eess.IV

    AI Models Close to your Chest: Robust Federated Learning Strategies for Multi-site CT

    Authors: Edward H. Lee, Brendan Kelly, Emre Altinmakas, Hakan Dogan, Maryam Mohammadzadeh, Errol Colak, Steve Fu, Olivia Choudhury, Ujjwal Ratan, Felipe Kitamura, Hernan Chaves, Jimmy Zheng, Mourad Said, Eduardo Reis, Jaekwang Lim, Patricia Yokoo, Courtney Mitchell, Golnaz Houshmand, Marzyeh Ghassemi, Ronan Killeen, Wendy Qiu, Joel Hayden, Farnaz Rafiee, Chad Klochko, Nicholas Bevins, et al. (5 additional authors not shown)

    Abstract: While it is well known that population differences from genetics, sex, race, and environmental factors contribute to disease, AI studies in medicine have largely focused on locoregional patient cohorts with less diverse data sources. Such limitation stems from barriers to large-scale data sharing and ethical concerns over data privacy. Federated learning (FL) is one potential pathway for AI developm…

    Submitted 13 April, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

  29. arXiv:2303.06992  [pdf, other]

    cs.LG stat.ML

    Improving Mutual Information Estimation with Annealed and Energy-Based Bounds

    Authors: Rob Brekelmans, Sicong Huang, Marzyeh Ghassemi, Greg Ver Steeg, Roger Grosse, Alireza Makhzani

    Abstract: Mutual information (MI) is a fundamental quantity in information theory and machine learning. However, direct estimation of MI is intractable, even if the true joint probability density for the variables of interest is known, as it involves estimating a potentially high-dimensional log partition function. In this work, we present a unifying view of existing MI bounds from the perspective of import…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: A shorter version appeared in the International Conference on Learning Representations (ICLR) 2022

    Journal ref: ICLR 2022 https://openreview.net/forum?id=T0B9AoM_bFg

  30. arXiv:2302.12254  [pdf, other]

    cs.LG cs.AI cs.CV

    Change is Hard: A Closer Look at Subpopulation Shift

    Authors: Yuzhe Yang, Haoran Zhang, Dina Katabi, Marzyeh Ghassemi

    Abstract: Machine learning models often perform poorly on subgroups that are underrepresented in the training data. Yet, little is understood on the variation in mechanisms that cause subpopulation shifts, and how algorithms generalize across such diverse shifts at scale. In this work, we provide a fine-grained analysis of subpopulation shift. We first propose a unified framework that dissects and explains…

    Submitted 17 August, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: ICML 2023

  31. arXiv:2301.05664  [pdf, other]

    cs.LG stat.ML

    Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning

    Authors: Taylor W. Killian, Sonali Parbhoo, Marzyeh Ghassemi

    Abstract: In safety-critical decision-making scenarios, being able to identify worst-case outcomes, or dead-ends, is crucial in order to develop safe and reliable policies in practice. These situations are typically rife with uncertainty due to unknown or stochastic characteristics of the environment as well as limited offline training data. As a result, the value of a decision at any time point should be bas…

    Submitted 30 January, 2023; v1 submitted 13 January, 2023; originally announced January 2023.

    Comments: To appear in TMLR (01/2023). The submission and reviews can be viewed at: https://openreview.net/forum?id=oKlEOT83gI

  32. arXiv:2212.06081  [pdf, other]

    cs.LG math.OC

    Fast Learning of Multidimensional Hawkes Processes via Frank-Wolfe

    Authors: Renbo Zhao, Niccolò Dalmasso, Mohsen Ghassemi, Vamsi K. Potluru, Tucker Balch, Manuela Veloso

    Abstract: Hawkes processes have recently risen to the forefront of tools when it comes to modeling and generating sequential events data. Multidimensional Hawkes processes model both the self and cross-excitation between different types of events and have been applied successfully in various domains such as finance, epidemiology and personalized recommendations, among others. In this work we present an adapt…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: Presented at the NeurIPS 2022 Workshop on Synthetic Data for Empowering ML Research. 9 pages, 3 figures, 4 tables

  33. arXiv:2211.11031  [pdf, other]

    cs.LG

    Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors

    Authors: Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi

    Abstract: Deployed language models decay over time due to shifting inputs, changing user needs, or emergent world-knowledge gaps. When such problems are identified, we want to make targeted edits while avoiding expensive retraining. However, current model editors, which modify such behaviors of pre-trained models, degrade model performance quickly across multiple, sequential edits. We propose GRACE, a lifel…

    Submitted 17 October, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS 2023

  34. arXiv:2210.17060  [pdf, other]

    cs.LG

    MambaNet: A Hybrid Neural Network for Predicting the NBA Playoffs

    Authors: Reza Khanmohammadi, Sari Saba-Sadiya, Sina Esfandiarpour, Tuka Alhanai, Mohammad M. Ghassemi

    Abstract: In this paper, we present MambaNet: a hybrid neural network for predicting the outcomes of basketball games. Contrary to other studies, which focus primarily on season games, this study investigates playoff games. MambaNet is a hybrid neural network architecture that processes a time series of teams' and players' game statistics and generates the probability of a team winning or losing an NBA play…

    Submitted 31 October, 2022; originally announced October 2022.

  35. arXiv:2210.10769  [pdf, other]

    cs.LG stat.ML

    "Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

    Authors: Haoran Zhang, Harvineet Singh, Marzyeh Ghassemi, Shalmali Joshi

    Abstract: Machine learning models frequently experience performance drops under distribution shifts. The underlying cause of such shifts may be multiple simultaneous factors such as changes in data quality, differences in specific covariate distributions, or changes in the relationship between label and features. When a model does fail during deployment, attributing performance change to these factors is cr…

    Submitted 6 June, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: Published in ICML 2023

  36. arXiv:2209.05364  [pdf, other]

    cs.LG stat.ML

    If Influence Functions are the Answer, Then What is the Question?

    Authors: Juhan Bae, Nathan Ng, Alston Lo, Marzyeh Ghassemi, Roger Grosse

    Abstract: Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: 28 pages, 6 figures

  37. arXiv:2208.07961  [pdf, other]

    stat.ML cs.LG cs.SI

    Online Learning for Mixture of Multivariate Hawkes Processes

    Authors: Mohsen Ghassemi, Niccolò Dalmasso, Simran Lamba, Vamsi K. Potluru, Sameena Shah, Tucker Balch, Manuela Veloso

    Abstract: Online learning of Hawkes processes has received increasing attention in the last couple of years especially for modeling a network of actors. However, these works typically either model the rich interaction between the events or the latent cluster of the actors or the network structure between the actors. We propose to model the latent structure of the network of actors as well as their rich inte…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 12 pages, 6 figures, 3 tables

    Journal ref: ICAIF 22: 3rd ACM International Conference on AI in Finance, November 2022, Pages 506-513

  38. arXiv:2207.13741  [pdf, other]

    stat.ML cs.LG

    Differentially Private Learning of Hawkes Processes

    Authors: Mohsen Ghassemi, Eleonora Kreačić, Niccolò Dalmasso, Vamsi K. Potluru, Tucker Balch, Manuela Veloso

    Abstract: Hawkes processes have recently gained increasing attention from the machine learning community for their versatility in modeling event sequence data. While they have a rich history going back decades, some of their properties, such as sample complexity for learning the parameters and releasing differentially private versions, are yet to be thoroughly analyzed. In this work, we study standard Hawke…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: 30 pages, 4 figures

  39. arXiv:2207.02093  [pdf, other]

    cs.LG stat.ML

    Predicting Out-of-Domain Generalization with Neighborhood Invariance

    Authors: Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi

    Abstract: Developing and deploying machine learning models safely depends on the ability to characterize and compare their abilities to generalize to new environments. Although recent work has proposed a variety of methods that can directly predict or theoretically bound the generalization capacity of a model, they rely on strong assumptions such as matching train/test distributions and access to model grad…

    Submitted 17 July, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: 38 pages, 5 figures, 28 tables

  40. arXiv:2206.02058  [pdf, other

    stat.ML cs.CY cs.LG

    When Personalization Harms: Reconsidering the Use of Group Attributes in Prediction

    Authors: Vinith M. Suriyakumar, Marzyeh Ghassemi, Berk Ustun

    Abstract: Machine learning models are often personalized with categorical attributes that are protected, sensitive, self-reported, or costly to acquire. In this work, we show models that are personalized with group attributes can reduce performance at a group level. We propose formal conditions to ensure the "fair use" of group attributes in prediction tasks by training one additional model -- i.e., collect…

    Submitted 23 July, 2023; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: ICML 2023 Oral

  41. arXiv:2205.04616  [pdf, other

    cs.LG stat.AP

    Nightly Automobile Claims Prediction from Telematics-Derived Features: A Multilevel Approach

    Authors: Allen R. Williams, Yoolim Jin, Anthony Duer, Tuka Alhanai, Mohammad Ghassemi

    Abstract: In recent years it has become possible to collect GPS data from drivers and to incorporate this data into automobile insurance pricing for the driver. This data is continuously collected and processed nightly into metadata consisting of mileage and time summaries of each discrete trip taken, and a set of behavioral scores describing attributes of the trip (e.g., driver fatigue or driver distraction…

    Submitted 9 May, 2022; originally announced May 2022.

  42. Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations

    Authors: Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi

    Abstract: Clinical notes are becoming an increasingly important data source for machine learning (ML) applications in healthcare. Prior research has shown that deploying ML models can perpetuate existing biases against racial minorities, as bias can be implicitly embedded in data. In this study, we investigate the level of implicit race information available to ML models and human experts and the implicatio…

    Submitted 1 November, 2022; v1 submitted 8 May, 2022; originally announced May 2022.

    Journal ref: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (AIES 2022)

  43. arXiv:2205.03295  [pdf, other

    cs.LG cs.AI cs.CY

    The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

    Authors: Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi

    Abstract: Machine learning models in safety-critical settings like healthcare are often blackboxes: they contain a large number of parameters which are not transparent to users. Post-hoc explainability methods where a simple, human-interpretable model imitates the behavior of these blackbox models are often proposed to help users trust model predictions. In this work, we audit the quality of such explanatio…

    Submitted 2 June, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: Published in FAccT 2022

  44. arXiv:2203.12748  [pdf, other

    cs.LG cs.AI cs.CY stat.ML

    Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning

    Authors: Natalie Dullerud, Karsten Roth, Kimia Hamidieh, Nicolas Papernot, Marzyeh Ghassemi

    Abstract: Deep metric learning (DML) enables learning with less supervision through its emphasis on the similarity structure of representations. There has been much work on improving generalization of DML in settings like zero-shot retrieval, but little is known about its implications for fairness. In this paper, we are the first to evaluate state-of-the-art DML methods trained on imbalanced data, and to sh…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Published as a conference paper at ICLR 2022

  45. arXiv:2203.12609  [pdf, other

    cs.LG cs.CV cs.CY eess.IV

    Improving the Fairness of Chest X-ray Classifiers

    Authors: Haoran Zhang, Natalie Dullerud, Karsten Roth, Lauren Oakden-Rayner, Stephen Robert Pfohl, Marzyeh Ghassemi

    Abstract: Deep learning models have reached or surpassed human-level performance in the field of medical imaging, especially in disease diagnosis using chest x-rays. However, prior work has found that such classifiers can exhibit biases in the form of gaps in predictive performance across protected groups. In this paper, we question whether striving to achieve zero disparities in predictive performance (i.e…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Published in CHIL 2022

  46. arXiv:2203.09365  [pdf, other

    cs.LG

    Semi-Markov Offline Reinforcement Learning for Healthcare

    Authors: Mehdi Fatemi, Mary Wu, Jeremy Petch, Walter Nelson, Stuart J. Connolly, Alexander Benz, Anthony Carnicelli, Marzyeh Ghassemi

    Abstract: Reinforcement learning (RL) tasks are typically framed as Markov Decision Processes (MDPs), assuming that decisions are made at fixed time intervals. However, many applications of great importance, including healthcare, do not satisfy this assumption, yet they are commonly modelled as MDPs after an artificial reshaping of the data. In addition, most healthcare (and similar) problems are offline by…

    Submitted 20 March, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Published at Conference on Health, Inference, and Learning (CHIL) 2022

  47. arXiv:2112.01020  [pdf, other

    cs.LG

    Learning Optimal Predictive Checklists

    Authors: Haoran Zhang, Quaid Morris, Berk Ustun, Marzyeh Ghassemi

    Abstract: Checklists are simple decision aids that are often used to promote safety and reliability in clinical applications. In this paper, we present a method to learn checklists for clinical decision support. We represent predictive checklists as discrete linear classifiers with binary features and unit weights. We then learn globally optimal predictive checklists from data by solving an integer programm…

    Submitted 14 January, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Published in NeurIPS 2021

  48. arXiv:2110.08931  [pdf, other

    cs.CL

    Quantifying the Task-Specific Information in Text-Based Classifications

    Authors: Zining Zhu, Aparna Balagopalan, Marzyeh Ghassemi, Frank Rudzicz

    Abstract: Recently, neural natural language models have attained state-of-the-art performance on a wide variety of tasks, but the high performance can result from superficial, surface-level cues (Bender and Koller, 2020; Niven and Kao, 2020). These surface cues, as the "shortcuts" inherent in the datasets, do not contribute to the *task-specific information* (TSI) of the classification tasks. While it is…

    Submitted 17 October, 2021; originally announced October 2021.

  49. arXiv:2110.06131  [pdf, other

    eess.SP cs.LG

    Fetal Gender Identification using Machine and Deep Learning Algorithms on Phonocardiogram Signals

    Authors: Reza Khanmohammadi, Mitra Sadat Mirshafiee, Mohammad Mahdi Ghassemi, Tuka Alhanai

    Abstract: Phonocardiogram (PCG) signal analysis is a critical, widely-studied technology to noninvasively analyze the heart's mechanical activity. Through evaluating heart sounds, this technology has been chiefly leveraged as a preliminary solution to automatically diagnose cardiovascular diseases among adults; however, prenatal tasks such as fetal gender identification have been relatively less studied usi…

    Submitted 10 October, 2021; originally announced October 2021.

  50. arXiv:2110.04831  [pdf, other

    cs.LG cs.AI

    Feature Imitating Networks

    Authors: Sari Saba-Sadiya, Tuka Alhanai, Mohammad M Ghassemi

    Abstract: In this paper, we introduce a novel approach to neural learning: the Feature-Imitating-Network (FIN). A FIN is a neural network with weights that are initialized to reliably approximate one or more closed-form statistical features, such as Shannon's entropy. In this paper, we demonstrate that FINs (and FIN ensembles) provide best-in-class performance for a variety of downstream signal processing a…

    Submitted 23 October, 2021; v1 submitted 10 October, 2021; originally announced October 2021.