Showing 1–22 of 22 results for author: Ustun, B

  1. arXiv:2403.01628  [pdf, ps, other]

    cs.LG

    Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium

    Authors: Hyewon Jeong, Sarah Jabbour, Yuzhe Yang, Rahul Thapa, Hussein Mozannar, William Jongwon Han, Nikita Mehandru, Michael Wornow, Vladislav Lialin, Xin Liu, Alejandro Lozano, Jiacheng Zhu, Rafal Dariusz Kocielnik, Keith Harrigian, Haoran Zhang, Edward Lee, Milos Vukadinovic, Aparna Balagopalan, Vincent Jeanselme, Katherine Matton, Ilker Demirel, Jason Fries, Parisa Rashidi, Brett Beaulieu-Jones, Xuhai Orson Xu, et al. (18 additional authors not shown)

    Abstract: The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four vir…

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: ML4H 2023, Research Roundtables

  2. arXiv:2402.07745  [pdf, other]

    cs.LG

    Predictive Churn with the Set of Good Models

    Authors: Jamelle Watson-Daniels, Flavio du Pin Calmon, Alexander D'Amour, Carol Long, David C. Parkes, Berk Ustun

    Abstract: Machine learning models in modern mass-market applications are often updated over time. One of the foremost challenges faced is that, despite increasing overall performance, these updates may flip specific model predictions in unpredictable ways. In practice, researchers quantify the number of unstable predictions between models pre and post update -- i.e., predictive churn. In this paper, we stud…

    Submitted 12 February, 2024; originally announced February 2024.

  3. arXiv:2402.04398  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning from Time Series under Temporal Label Noise

    Authors: Sujay Nagaraj, Walter Gerych, Sana Tonekaboni, Anna Goldenberg, Berk Ustun, Thomas Hartvigsen

    Abstract: Many sequential classification tasks are affected by label noise that varies over time. Such noise can cause label quality to improve, worsen, or periodically change over time. We first propose and formalize temporal label noise, an unstudied problem for sequential classification of time series. In this setting, multiple labels are recorded in sequence while being corrupted by a time-dependent noi…

    Submitted 6 February, 2024; originally announced February 2024.

  4. arXiv:2402.03481  [pdf, other]

    cs.IR cs.LG cs.SI

    FINEST: Stabilizing Recommendations by Rank-Preserving Fine-Tuning

    Authors: Sejoon Oh, Berk Ustun, Julian McAuley, Srijan Kumar

    Abstract: Modern recommender systems may output considerably different recommendations due to small perturbations in the training data. Changes in the data from a single user will alter the recommendations as well as the recommendations of other users. In applications like healthcare, housing, and finance, this sensitivity can have adverse effects on user experience. We propose a method to stabilize a given…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted at the 6th FAccTRec Workshop on Responsible Recommendation @ ACM RecSys 2023

  5. arXiv:2308.12820  [pdf, other]

    cs.LG cs.CY stat.ML

    Prediction without Preclusion: Recourse Verification with Reachable Sets

    Authors: Avni Kothari, Bogdan Kulynych, Tsui-Wei Weng, Berk Ustun

    Abstract: Machine learning models are often used to decide who receives a loan, a job interview, or a public benefit. Models in such settings use features without considering their actionability. As a result, they can assign predictions that are fixed -- meaning that individuals who are denied loans and interviews are, in fact, precluded from access to credit and employment. In this work, we introduce a pr…

    Submitted 1 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: ICLR 2024 Spotlight. The first two authors contributed equally

  6. arXiv:2305.09035  [pdf, other]

    cs.LG

    Algorithmic Censoring in Dynamic Learning Systems

    Authors: Jennifer Chien, Margaret Roberts, Berk Ustun

    Abstract: Dynamic learning systems subject to selective labeling exhibit censoring, i.e. persistent negative predictions assigned to one or more subgroups of points. In applications like consumer finance, this results in groups of applicants that are persistently denied and thus never enter into the training data. In this work, we formalize censoring, demonstrate how it can arise, and highlight difficulties…

    Submitted 29 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 28 pages, 9 figures

  7. arXiv:2304.07031  [pdf, other]

    cs.CV cs.LG

    Spectral Transfer Guided Active Domain Adaptation For Thermal Imagery

    Authors: Berkcan Ustun, Ahmet Kagan Kaya, Ezgi Cakir Ayerden, Fazil Altinel

    Abstract: The exploitation of visible spectrum datasets has led deep networks to show remarkable success. However, real-world tasks include low-lighting conditions which give rise to performance bottlenecks for models trained on large-scale RGB image datasets. Thermal IR cameras are more robust against such conditions. Therefore, the usage of thermal imagery in real-world applications can be useful. Unsupervised d…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023 Perception Beyond the Visible Spectrum (PBVS) workshop

  8. arXiv:2302.03874  [pdf, other]

    cs.LG cs.CY

    Participatory Personalization in Classification

    Authors: Hailey Joren, Chirag Nagpal, Katherine Heller, Berk Ustun

    Abstract: Machine learning models are often personalized with information that is protected, sensitive, self-reported, or costly to acquire. These models use information about people but do not facilitate nor inform their consent. Individuals cannot opt out of reporting personal information to a model, nor tell if they benefit from personalization in the first place. We introduce a family of classification…

    Submitted 11 October, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  9. arXiv:2206.02058  [pdf, other]

    stat.ML cs.CY cs.LG

    When Personalization Harms: Reconsidering the Use of Group Attributes in Prediction

    Authors: Vinith M. Suriyakumar, Marzyeh Ghassemi, Berk Ustun

    Abstract: Machine learning models are often personalized with categorical attributes that are protected, sensitive, self-reported, or costly to acquire. In this work, we show models that are personalized with group attributes can reduce performance at a group level. We propose formal conditions to ensure the "fair use" of group attributes in prediction tasks by training one additional model -- i.e., collect…

    Submitted 23 July, 2023; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: ICML 2023 Oral

  10. arXiv:2206.01131  [pdf, other]

    cs.LG

    Predictive Multiplicity in Probabilistic Classification

    Authors: Jamelle Watson-Daniels, David C. Parkes, Berk Ustun

    Abstract: Machine learning models are often used to inform real world risk assessment tasks: predicting consumer default risk, predicting whether a person suffers from a serious illness, or predicting a person's risk to appear in court. Given multiple models that perform almost equally well for a prediction task, to what extent do predictions vary across these models? If predictions are relatively consisten…

    Submitted 23 June, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: Published by AAAI Press, Palo Alto, California USA 2023, Association for the Advancement of Artificial Intelligence

  11. arXiv:2201.12686  [pdf, other]

    cs.IR cs.CR cs.LG cs.SI

    Rank List Sensitivity of Recommender Systems to Interaction Perturbations

    Authors: Sejoon Oh, Berk Ustun, Julian McAuley, Srijan Kumar

    Abstract: Prediction models can exhibit sensitivity with respect to training data: small changes in the training data can produce models that assign conflicting predictions to individual data points during test time. In this work, we study this sensitivity in recommender systems, where users' recommendations are drastically altered by minor perturbations in other unrelated users' interactions. We introduce…

    Submitted 16 August, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

    Comments: Accepted for publication at: 31st ACM International Conference on Information and Knowledge Management (CIKM 2022). Code and data at: https://github.com/srijankr/casper

  12. arXiv:2112.01020  [pdf, other]

    cs.LG

    Learning Optimal Predictive Checklists

    Authors: Haoran Zhang, Quaid Morris, Berk Ustun, Marzyeh Ghassemi

    Abstract: Checklists are simple decision aids that are often used to promote safety and reliability in clinical applications. In this paper, we present a method to learn checklists for clinical decision support. We represent predictive checklists as discrete linear classifiers with binary features and unit weights. We then learn globally optimal predictive checklists from data by solving an integer programm…

    Submitted 14 January, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Published in NeurIPS 2021

  13. arXiv:1909.06677  [pdf, other]

    cs.LG cs.CY stat.ML

    Predictive Multiplicity in Classification

    Authors: Charles T. Marx, Flavio du Pin Calmon, Berk Ustun

    Abstract: Prediction problems often admit competing models that perform almost equally well. This effect challenges key assumptions in machine learning when competing models assign conflicting predictions. In this paper, we define predictive multiplicity as the ability of a prediction problem to admit competing models with conflicting predictions. We introduce formal measures to evaluate the severity of pre…

    Submitted 16 September, 2020; v1 submitted 14 September, 2019; originally announced September 2019.

  14. arXiv:1901.10501  [pdf, other]

    cs.LG cs.CY cs.IT stat.ML

    Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions

    Authors: Hao Wang, Berk Ustun, Flavio P. Calmon

    Abstract: When the performance of a machine learning model varies over groups defined by sensitive attributes (e.g., gender or ethnicity), the performance disparity can be expressed in terms of the probability distributions of the input and output variables over each group. In this paper, we exploit this fact to reduce the disparate impact of a fixed classification model over a population of interest. Given…

    Submitted 17 May, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

  15. Actionable Recourse in Linear Classification

    Authors: Berk Ustun, Alexander Spangher, Yang Liu

    Abstract: Machine learning models are increasingly used to automate decisions that affect humans - deciding who should receive a loan, a job interview, or a social service. In such applications, a person should have the ability to change the decision of a model. When a person is denied a loan by a credit score, for example, they should be able to alter its input variables in a way that guarantees approval.…

    Submitted 8 November, 2019; v1 submitted 17 September, 2018; originally announced September 2018.

    Comments: Extended version. ACM Conference on Fairness, Accountability and Transparency [FAT2019]

  16. arXiv:1801.05398  [pdf, other]

    cs.IT cs.LG stat.ML

    On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning

    Authors: Hao Wang, Berk Ustun, Flavio P. Calmon

    Abstract: In the context of machine learning, disparate impact refers to a form of systematic discrimination whereby the output distribution of a model depends on the value of a sensitive attribute (e.g., race or gender). In this paper, we propose an information-theoretic framework to analyze the disparate impact of a binary classification model. We view the model as a fixed channel, and quantify disparate…

    Submitted 11 May, 2018; v1 submitted 16 January, 2018; originally announced January 2018.

  17. arXiv:1610.00168  [pdf, other]

    stat.ML math.OC stat.ME

    Learning Optimized Risk Scores

    Authors: Berk Ustun, Cynthia Rudin

    Abstract: Risk scores are simple classification models that let users make quick risk predictions by adding and subtracting a few small numbers. These models are widely used in medicine and criminal justice, but are difficult to learn from data because they need to be calibrated, sparse, use small integer coefficients, and obey application-specific operational constraints. In this paper, we present a new ma…

    Submitted 16 September, 2019; v1 submitted 1 October, 2016; originally announced October 2016.

    Journal ref: Journal of Machine Learning Research 2019. Volume 20. Issue 150. Pages 1-75

  18. arXiv:1503.07810  [pdf, other]

    stat.ML stat.AP

    Interpretable Classification Models for Recidivism Prediction

    Authors: Jiaming Zeng, Berk Ustun, Cynthia Rudin

    Abstract: We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent, and interpretable to use for decision-making. This question is complicated as these models are used to support different decisions, from sentencing, to determining release on probation, to allocating preventative social services. Each use case might have an ob…

    Submitted 7 July, 2016; v1 submitted 26 March, 2015; originally announced March 2015.

    Comments: 45 pages, 17 figures

    Journal ref: Journal of the Royal Statistical Society: Series A (2017)

  19. arXiv:1502.04269  [pdf, other]

    stat.ML cs.DM cs.LG stat.AP stat.ME

    Supersparse Linear Integer Models for Optimized Medical Scoring Systems

    Authors: Berk Ustun, Cynthia Rudin

    Abstract: Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data because they need to be accurate and sparse, have coprime integer coefficients, and satisfy multiple operational constraints. We present a new metho…

    Submitted 26 January, 2016; v1 submitted 14 February, 2015; originally announced February 2015.

    Comments: This version reflects our findings on SLIM as of January 2016 (arXiv:1306.5860 and arXiv:1405.4047 are out-of-date). The final published version of this article is available at http://www.springerlink.com

  20. arXiv:1405.4047  [pdf, other]

    stat.ME cs.LG stat.ML

    Methods and Models for Interpretable Linear Classification

    Authors: Berk Ustun, Cynthia Rudin

    Abstract: We present an integer programming framework to build accurate and interpretable discrete linear classification models. Unlike existing approaches, our framework is designed to provide practitioners with the control and flexibility they need to tailor accurate and interpretable models for a domain of choice. To this end, our framework can produce models that are fully optimized for accuracy, by min…

    Submitted 1 October, 2014; v1 submitted 15 May, 2014; originally announced May 2014.

  21. arXiv:1306.6677  [pdf, other]

    stat.ML stat.AP

    Supersparse Linear Integer Models for Interpretable Classification

    Authors: Berk Ustun, Stefano Tracà, Cynthia Rudin

    Abstract: Scoring systems are classification models that only require users to add, subtract and multiply a few meaningful numbers to make a prediction. These models are often used because they are practical and interpretable. In this paper, we introduce an off-the-shelf tool to create scoring systems that are both accurate and interpretable, known as a Supersparse Linear Integer Model (SLIM). SLIM is a discret…

    Submitted 10 April, 2014; v1 submitted 27 June, 2013; originally announced June 2013.

  22. arXiv:1306.5860  [pdf, ps, other]

    stat.ML

    Supersparse Linear Integer Models for Predictive Scoring Systems

    Authors: Berk Ustun, Stefano Traca, Cynthia Rudin

    Abstract: We introduce Supersparse Linear Integer Models (SLIM) as a tool to create scoring systems for binary classification. We derive theoretical bounds on the true risk of SLIM scoring systems, and present experimental results to show that SLIM scoring systems are accurate, sparse, and interpretable classification models.

    Submitted 25 June, 2013; originally announced June 2013.

    Comments: Short version