Showing 1–50 of 58 results for author: Yurochkin, M

  1. arXiv:2407.00066  [pdf, other]

    cs.DC cs.AI cs.CL cs.LG

    Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead

    Authors: Rickard Brüel-Gabrielsson, Jiacheng Zhu, Onkar Bhardwaj, Leshem Choshen, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon

    Abstract: Fine-tuning large language models (LLMs) with low-rank adapters (LoRAs) has become common practice, often yielding numerous copies of the same LLM differing only in their LoRA updates. This paradigm presents challenges for systems that serve real-time responses to queries that each involve a different LoRA. Prior works optimize the design of such systems but still require continuous loading and of…

    Submitted 17 June, 2024; originally announced July 2024.

  2. arXiv:2406.05882  [pdf, other]

    cs.LG stat.ML

    Distributional Preference Alignment of LLMs via Optimal Transport

    Authors: Igor Melnyk, Youssef Mroueh, Brian Belgodere, Mattia Rigotti, Apoorva Nitsure, Mikhail Yurochkin, Kristjan Greenewald, Jiri Navratil, Jerret Ross

    Abstract: Current LLM alignment techniques use pairwise human preferences at a sample level, and as such, they do not imply an alignment on the distributional level. We propose in this paper Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples stochastically…

    Submitted 9 June, 2024; originally announced June 2024.

  3. arXiv:2405.17202  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Efficient multi-prompt evaluation of LLMs

    Authors: Felipe Maia Polo, Ronald Xu, Lucas Weber, Mírian Silva, Onkar Bhardwaj, Leshem Choshen, Allysson Flavio Melo de Oliveira, Yuekai Sun, Mikhail Yurochkin

    Abstract: Most popular benchmarks for comparing LLMs rely on a limited set of prompt templates, which may not fully capture the LLMs' abilities and can affect the reproducibility of results on leaderboards. Many recent works empirically verify prompt sensitivity and advocate for changes in LLM evaluation. In this paper, we consider the problem of estimating the performance distribution across many prompt va…

    Submitted 7 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2405.16236  [pdf, ps, other]

    stat.ML cs.LG

    A statistical framework for weak-to-strong generalization

    Authors: Seamus Somerstep, Felipe Maia Polo, Moulinath Banerjee, Ya'acov Ritov, Mikhail Yurochkin, Yuekai Sun

    Abstract: Modern large language model (LLM) alignment techniques rely on human feedback, but it is unclear whether the techniques fundamentally limit the capabilities of aligned LLMs. In particular, it is unclear whether it is possible to align (stronger) LLMs with superhuman capabilities with (weaker) human feedback without degrading their capabilities. This is an instance of the weak-to-strong generalizat…

    Submitted 25 May, 2024; originally announced May 2024.

  5. arXiv:2405.11083  [pdf, other]

    cs.CL cs.LG

    Prompt Exploration with Prompt Regression

    Authors: Michael Feffer, Ronald Xu, Yuekai Sun, Mikhail Yurochkin

    Abstract: In the advent of democratized usage of large language models (LLMs), there is a growing desire to systematize LLM prompt creation and selection processes beyond iterative trial-and-error. Prior works majorly focus on searching the space of prompts without accounting for relations between prompt variations. Here we propose a framework, Prompt Exploration with Prompt Regression (PEPR), to predict…

    Submitted 17 May, 2024; originally announced May 2024.

  6. arXiv:2403.04224  [pdf, other]

    cs.CL cs.AI cs.LG

    Aligners: Decoupling LLMs and Alignment

    Authors: Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

    Abstract: Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications. Alignment is challenging, costly, and needs to be repeated for every LLM and alignment criterion. We propose to decouple LLMs and alignment by training aligner models that can be used to align any LLM for a given criteria on an as-needed basis, thus also reducing the pot…

    Submitted 16 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: Tiny Papers at the International Conference on Learning Representations (ICLR) 2024

  7. arXiv:2402.16842  [pdf, other]

    cs.LG

    Asymmetry in Low-Rank Adapters of Foundation Models

    Authors: Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon

    Abstract: Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. Specifically,…

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 17 pages, 2 figures, 9 tables

  8. arXiv:2402.14992  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    tinyBenchmarks: evaluating LLMs with fewer examples

    Authors: Felipe Maia Polo, Lucas Weber, Leshem Choshen, Yuekai Sun, Gongjun Xu, Mikhail Yurochkin

    Abstract: The versatility of large language models (LLMs) led to the creation of diverse benchmarks that thoroughly test a variety of language models' abilities. These benchmarks consist of tens of thousands of examples making evaluation of LLMs very expensive. In this paper, we investigate strategies to reduce the number of evaluations needed to assess the performance of an LLM on several key benchmarks. F…

    Submitted 26 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning (ICML)

  9. arXiv:2402.08324  [pdf, other]

    cs.LG cs.AI

    Uncertainty Quantification via Stable Distribution Propagation

    Authors: Felix Petersen, Aashwin Mishra, Hilde Kuehne, Christian Borgelt, Oliver Deussen, Mikhail Yurochkin

    Abstract: We propose a new approach for propagating stable probability distributions through neural networks. Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity. This allows propagating Gaussian and Cauchy input uncertainties through neural networks to quantify their output uncertainties. To demonstrate the…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Published at ICLR 2024, Code @ https://github.com/Felix-Petersen/distprop

  10. arXiv:2312.04601  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Estimating Fréchet bounds for validating programmatic weak supervision

    Authors: Felipe Maia Polo, Mikhail Yurochkin, Moulinath Banerjee, Subha Maity, Yuekai Sun

    Abstract: We develop methods for estimating Fréchet bounds on (possibly high-dimensional) distribution classes in which some variables are continuous-valued. We establish the statistical correctness of the computed bounds under uncertainty in the marginal constraints and demonstrate the usefulness of our algorithms by evaluating the performance of machine learning (ML) models trained with programmatic weak…

    Submitted 7 December, 2023; originally announced December 2023.

  11. arXiv:2310.07132  [pdf, other]

    cs.LG math.ST q-fin.RM stat.ML

    Risk Aware Benchmarking of Large Language Models

    Authors: Apoorva Nitsure, Youssef Mroueh, Mattia Rigotti, Kristjan Greenewald, Brian Belgodere, Mikhail Yurochkin, Jiri Navratil, Igor Melnyk, Jerret Ross

    Abstract: We propose a distributional framework for benchmarking socio-technical risks of foundation models with quantified statistical significance. Our approach hinges on a new statistical relative testing based on first and second order stochastic dominance of real random variables. We show that the second order statistics in this test are linked to mean-risk models commonly used in econometrics and math…

    Submitted 9 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: ICML 2024

  12. arXiv:2310.01583  [pdf, other]

    stat.ML cs.LG

    An Investigation of Representation and Allocation Harms in Contrastive Learning

    Authors: Subha Maity, Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun

    Abstract: The effect of underrepresentation on the performance of minority groups is known to be a serious problem in supervised learning settings; however, it has been underexplored so far in the context of self-supervised learning (SSL). In this paper, we demonstrate that contrastive learning (CL), a popular variant of SSL, tends to collapse representations of minority groups with certain majority groups.…

    Submitted 2 October, 2023; originally announced October 2023.

  13. arXiv:2310.01542  [pdf, other]

    cs.LG

    Fusing Models with Complementary Expertise

    Authors: Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

    Abstract: Training AI models that generalize across tasks and domains has long been among the open problems driving AI research. The emergence of Foundation Models made it easier to obtain expert models for a given task, but the heterogeneity of data that may be encountered at test time often means that any single expert is insufficient. We consider the Fusion of Experts (FoE) problem of fusing outputs of e…

    Submitted 9 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: This paper was published at ICLR 2024

  14. arXiv:2310.00672  [pdf, other]

    cs.LG cs.CL cs.CV

    GeRA: Label-Efficient Geometrically Regularized Alignment

    Authors: Dustin Klebe, Tal Shnitzer, Mikhail Yurochkin, Leonid Karlinsky, Justin Solomon

    Abstract: Pretrained unimodal encoders incorporate rich semantic information into embedding space structures. To be similarly informative, multi-modal encoders typically require massive amounts of paired data for alignment and training. We introduce a semi-supervised Geometrically Regularized Alignment (GeRA) method to align the embedding spaces of pretrained unimodal encoders in a label-efficient way. Our…

    Submitted 7 October, 2023; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: 9 pages

    ACM Class: I.2; I.2.7

  15. arXiv:2309.15789  [pdf, other]

    cs.CL cs.LG

    Large Language Model Routing with Benchmark Datasets

    Authors: Tal Shnitzer, Anthony Ou, Mírian Silva, Kate Soule, Yuekai Sun, Justin Solomon, Neil Thompson, Mikhail Yurochkin

    Abstract: There is a rapidly growing number of open-source Large Language Models (LLMs) and benchmark datasets to compare them. While some models dominate these benchmarks, no single model typically achieves the best accuracy in all tasks and use cases. In this work, we address the challenge of selecting the best LLM out of a collection of models for new tasks. We propose a new formulation for the problem,…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: 18 pages, 8 figures, 4 tables

    MSC Class: I.2.7; I.2.6

  16. arXiv:2303.00673  [pdf, other]

    cs.HC cs.CY cs.LG

    Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness

    Authors: Zahra Ashktorab, Benjamin Hoover, Mayank Agarwal, Casey Dugan, Werner Geyer, Hao Bang Yang, Mikhail Yurochkin

    Abstract: Mitigating algorithmic bias is a critical task in the development and deployment of machine learning models. While several toolkits exist to aid machine learning practitioners in addressing fairness issues, little is known about the strategies practitioners employ to evaluate model fairness and what factors influence their assessment, particularly in the context of text classification. Two common…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: To appear in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23)

  17. arXiv:2302.09795  [pdf, other]

    cs.LG cs.CV stat.ML

    Simple Disentanglement of Style and Content in Visual Representations

    Authors: Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

    Abstract: Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem. Existing methods demonstrate some success but are hard to apply to large-scale vision datasets like ImageNet. In this work, we propose a simple post-processing framework to disentangle content and style in learned representations from pre-trained vision models. We model t…

    Submitted 31 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: International Conference on Machine Learning (ICML) 2023

  18. arXiv:2301.06195  [pdf, other]

    stat.ML cs.LG

    Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees

    Authors: Songkai Xue, Yuekai Sun, Mikhail Yurochkin

    Abstract: We consider the task of training machine learning models with data-dependent constraints. Such constraints often arise as empirical versions of expected value constraints that enforce fairness or stability goals. We reformulate data-dependent constraints so that they are calibrated: enforcing the reformulated constraints guarantees that their expected value counterparts are satisfied with a user-p…

    Submitted 15 January, 2023; originally announced January 2023.

    Comments: In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS) 2022

  19. arXiv:2210.13400  [pdf, other]

    stat.ML cs.LG

    Sampling with Mollified Interaction Energy Descent

    Authors: Lingxiao Li, Qiang Liu, Anna Korba, Mikhail Yurochkin, Justin Solomon

    Abstract: Sampling from a target measure whose density is only known up to a normalization constant is a fundamental problem in computational statistics and machine learning. In this paper, we present a new optimization-based method for sampling called mollified interaction energy descent (MIED). MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs). The…

    Submitted 1 March, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  20. arXiv:2210.06759  [pdf, other]

    cs.LG

    Outlier-Robust Group Inference via Gradient Space Clustering

    Authors: Yuchen Zeng, Kristjan Greenewald, Kangwook Lee, Justin Solomon, Mikhail Yurochkin

    Abstract: Traditional machine learning models focus on achieving good performance on the overall training distribution, but they often underperform on minority groups. Existing methods can improve the worst-group performance, but they can have several limitations: (i) they require group annotations, which are often expensive and sometimes infeasible to obtain, and/or (ii) they are sensitive to outliers. Mos…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 17 pages, 6 tables, 8 figures

  21. arXiv:2209.06378  [pdf, other]

    cs.HC

    RMExplorer: A Visual Analytics Approach to Explore the Performance and the Fairness of Disease Risk Models on Population Subgroups

    Authors: Bum Chul Kwon, Uri Kartoun, Shaan Khurshid, Mikhail Yurochkin, Subha Maity, Deanna G Brockman, Amit V Khera, Patrick T Ellinor, Steven A Lubitz, Kenney Ng

    Abstract: Disease risk models can identify high-risk patients and help clinicians provide more personalized care. However, risk models developed on one dataset may not generalize across diverse subpopulations of patients in different datasets and may have unexpected performance. It is challenging for clinical researchers to inspect risk models across different subgroups without any tools. Therefore, we deve…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: IEEE VIS 2022 Short

  22. arXiv:2206.03515  [pdf, other]

    cs.LG math.ST

    How does overparametrization affect performance on minority groups?

    Authors: Subha Maity, Saptarshi Roy, Songkai Xue, Mikhail Yurochkin, Yuekai Sun

    Abstract: The benefits of overparameterization for the overall performance of modern machine learning (ML) models are well known. However, the effect of overparameterization at a more granular level of data subgroups is less understood. Recent empirical studies demonstrate encouraging results: (i) when groups are not known, overparameterized models trained with empirical risk minimization (ERM) perform bett…

    Submitted 7 June, 2022; originally announced June 2022.

  23. arXiv:2205.13577  [pdf, other]

    cs.LG stat.ME stat.ML

    Understanding new tasks through the lens of training data via exponential tilting

    Authors: Subha Maity, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun

    Abstract: Deploying machine learning models to new tasks is a major challenge despite the large size of the modern training datasets. However, it is conceivable that the training data can be reweighted to be more representative of the new (target) task. We consider the problem of reweighing the training samples to gain insights into the distribution of the target task. Specifically, we formulate a distribut…

    Submitted 21 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted in ICLR 2023

  24. arXiv:2205.00504  [pdf, ps, other]

    stat.ML cs.AI cs.CY cs.LG

    Domain Adaptation meets Individual Fairness. And they get along

    Authors: Debarghya Mukherjee, Felix Petersen, Mikhail Yurochkin, Yuekai Sun

    Abstract: Many instances of algorithmic bias are caused by distributional shifts. For example, machine learning (ML) models often perform worse on demographic groups that are underrepresented in the training data. In this paper, we leverage this connection between algorithmic fairness and distribution shifts to show that algorithmic fairness interventions can help ML models overcome distribution shifts, and…

    Submitted 15 October, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: Published at NeurIPS 2022

  25. arXiv:2202.01671  [pdf, other]

    stat.ML cs.LG

    Log-Euclidean Signatures for Intrinsic Distances Between Unaligned Datasets

    Authors: Tal Shnitzer, Mikhail Yurochkin, Kristjan Greenewald, Justin Solomon

    Abstract: The need for efficiently comparing and representing datasets with unknown alignment spans various fields, from model analysis and comparison in machine learning to trend discovery in collections of medical datasets. We use manifold learning to compare the intrinsic geometric structures of different datasets by comparing their diffusion operators, symmetric positive-definite (SPD) matrices that rel…

    Submitted 11 July, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

    Comments: 23 pages, 9 figures

  26. arXiv:2201.12674  [pdf, other]

    cs.LG

    Rewiring with Positional Encodings for Graph Neural Networks

    Authors: Rickard Brüel-Gabrielsson, Mikhail Yurochkin, Justin Solomon

    Abstract: Several recent works use positional encodings to extend the receptive fields of graph neural network (GNN) layers equipped with attention mechanisms. These techniques, however, extend receptive fields to the complete graph, at substantial computational cost and risking a change in the inductive biases of conventional GNNs, or require complex architecture adjustments. As a conservative alternative,…

    Submitted 13 December, 2023; v1 submitted 29 January, 2022; originally announced January 2022.

  27. arXiv:2201.11945  [pdf, other]

    cs.LG

    Learning Proximal Operators to Discover Multiple Optima

    Authors: Lingxiao Li, Noam Aigerman, Vladimir G. Kim, Jia** Li, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon

    Abstract: Finding multiple solutions of non-convex optimization problems is a ubiquitous yet challenging task. Most past algorithms either apply single-solution optimization methods from multiple random initial guesses or search in the vicinity of found solutions using ad hoc heuristics. We present an end-to-end method to learn the proximal operator of a family of training problems so that multiple local mi…

    Submitted 1 March, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

  28. arXiv:2110.13953  [pdf, other]

    cs.LG

    On sensitivity of meta-learning to support data

    Authors: Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun

    Abstract: Meta-learning algorithms are widely used for few-shot learning. For example, image recognition systems that readily adapt to unseen classes after seeing only a few labeled examples. Despite their success, we show that modern meta-learning algorithms are extremely sensitive to the data used for adaptation, i.e. support data. In particular, we demonstrate the existence of (unaltered, in-distribution…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021

  29. arXiv:2110.13796  [pdf, other]

    stat.ML cs.LG

    Post-processing for Individual Fairness

    Authors: Felix Petersen, Debarghya Mukherjee, Yuekai Sun, Mikhail Yurochkin

    Abstract: Post-processing in algorithmic fairness is a versatile approach for correcting bias in ML systems that are already used in production. The main appeal of post-processing is that it avoids expensive retraining. In this work, we propose general post-processing algorithms for individual fairness (IF). We consider a setting where the learner only has access to the predictions of the original model and…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Published at NeurIPS 2021, Code @ https://github.com/Felix-Petersen/fairness-post-processing, Video @ https://www.youtube.com/watch?v=9PyKODDewPA

  30. arXiv:2108.01250  [pdf, other]

    cs.CL cs.LG

    Your fairness may vary: Pretrained language model fairness in toxic text classification

    Authors: Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy, Mikhail Yurochkin, Moninder Singh

    Abstract: The popularity of pretrained language models in natural language processing systems calls for a careful evaluation of such models in down-stream tasks, which have a higher potential for societal impact. The evaluation of such systems usually focuses on accuracy measures. Our findings in this paper call for attention to be paid to fairness measures as well. Through the analysis of more than a dozen…

    Submitted 13 April, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Findings of ACL 2022

  31. arXiv:2106.06510  [pdf, other]

    stat.ML cs.LG stat.CO

    Measuring the robustness of Gaussian processes to kernel choice

    Authors: William T. Stephenson, Soumya Ghosh, Tin D. Nguyen, Mikhail Yurochkin, Sameer K. Deshpande, Tamara Broderick

    Abstract: Gaussian processes (GPs) are used to make medical and scientific decisions, including in cardiac care and monitoring of atmospheric carbon dioxide levels. Notably, the choice of GP kernel is often somewhat arbitrary. In particular, uncountably many kernels typically align with qualitative prior knowledge (e.g., function smoothness or stationarity). But in practice, data analysts choose among a han…

    Submitted 12 March, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: AISTATS 2022

  32. arXiv:2106.02933  [pdf, other]

    cs.LG

    k-Mixup Regularization for Deep Learning via Optimal Transport

    Authors: Kristjan Greenewald, Anming Gu, Mikhail Yurochkin, Justin Solomon, Edward Chien

    Abstract: Mixup is a popular regularization technique for training deep neural networks that improves generalization and increases robustness to certain distribution shifts. It perturbs input training data in the direction of other randomly-chosen instances in the training set. To better leverage the structure of the data, we extend mixup in a simple, broadly applicable way to $k$-mixup, which pertur…

    Submitted 7 October, 2023; v1 submitted 5 June, 2021; originally announced June 2021.

  33. arXiv:2103.16785  [pdf, other]

    cs.LG stat.ML

    Individually Fair Gradient Boosting

    Authors: Alexander Vargo, Fan Zhang, Mikhail Yurochkin, Yuekai Sun

    Abstract: We consider the task of enforcing individual fairness in gradient boosting. Gradient boosting is a popular method for machine learning from tabular data, which arise often in applications where algorithmic fairness is a concern. At a high level, our approach is a functional gradient descent on a (distributionally) robust loss function that encodes our intuition of algorithmic fairness for the ML t…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: ICLR Camera-Ready Version

  34. arXiv:2103.16714  [pdf, other]

    stat.ML cs.LG

    Statistical inference for individual fairness

    Authors: Subha Maity, Songkai Xue, Mikhail Yurochkin, Yuekai Sun

    Abstract: As we rely on machine learning (ML) models to make more consequential decisions, the issue of ML models perpetuating or even exacerbating undesirable historical biases (e.g., gender and racial biases) has come to the fore of the public's attention. In this paper, we focus on the problem of detecting violations of individual fairness in ML models. We formalize the problem as measuring the susceptib…

    Submitted 30 March, 2021; originally announced March 2021.

  35. arXiv:2103.11023  [pdf, other]

    stat.ML cs.LG

    Individually Fair Ranking

    Authors: Amanda Bower, Hamid Eftekhari, Mikhail Yurochkin, Yuekai Sun

    Abstract: We develop an algorithm to train individually fair learning-to-rank (LTR) models. The proposed approach ensures items from minority groups appear alongside similar items from majority groups. This notion of fair ranking is based on the definition of individual fairness from supervised learning and is more nuanced than prior fair LTR approaches that simply ensure the ranking model provides underrep…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: ICLR Camera-Ready Version

  36. arXiv:2012.07363  [pdf, other]

    stat.ME

    Outlier-Robust Optimal Transport

    Authors: Debarghya Mukherjee, Aritra Guha, Justin Solomon, Yuekai Sun, Mikhail Yurochkin

    Abstract: Optimal transport (OT) measures distances between distributions in a way that depends on the geometry of the sample space. In light of recent advances in computational OT, OT distances are widely used as loss functions in machine learning. Despite their prevalence and advantages, OT loss functions can be extremely sensitive to outliers. In fact, a single adversarially-picked outlier can increase t…

    Submitted 20 June, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: Accepted in Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  37. arXiv:2012.01193  [pdf, ps, other]

    cs.CY

    Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination

    Authors: Mark Weber, Mikhail Yurochkin, Sherif Botros, Vanio Markov

    Abstract: Algorithmic fairness in lending today relies on group fairness metrics for monitoring statistical parity across protected groups. This approach is vulnerable to subgroup discrimination by proxy, carrying significant risks of legal and reputational damage for lenders and blatantly unfair outcomes for borrowers. Practical challenges arise from the many possible combinations and subsets of protected…

    Submitted 27 November, 2020; originally announced December 2020.

    Comments: 8 pages, NeurIPS Fair AI in Finance Workshop

  38. arXiv:2011.03173  [pdf, other]

    stat.ML cs.LG

    Does enforcing fairness mitigate biases caused by subpopulation shift?

    Authors: Subha Maity, Debarghya Mukherjee, Mikhail Yurochkin, Yuekai Sun

    Abstract: Many instances of algorithmic bias are caused by subpopulation shifts. For example, ML models often perform worse on demographic groups that are underrepresented in the training data. In this paper, we study whether enforcing algorithmic fairness during training improves the performance of the trained model in the target domain. On one hand, we conceive scenarios in which enforcing fairness…

    Submitted 26 October, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

  39. arXiv:2010.12574  [pdf, other]

    cs.LG stat.ML

    Online Semi-Supervised Learning with Bandit Feedback

    Authors: Sohini Upadhyay, Mikhail Yurochkin, Mayank Agarwal, Yasaman Khazaeni, Djallel Bouneffouf

    Abstract: We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by several applications including clinical trials and ad recommendations. We demonstrate how Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adjusted to the new problem formulation. We also propose a variant of the linear contextual bandit with semi-supervised mis…

    Submitted 23 October, 2020; originally announced October 2020.

  40. arXiv:2008.12534  [pdf, other]

    cs.LG stat.ML

    Continuous Regularized Wasserstein Barycenters

    Authors: Lingxiao Li, Aude Genevay, Mikhail Yurochkin, Justin Solomon

    Abstract: Wasserstein barycenters provide a geometrically meaningful way to aggregate probability distributions, built on the theory of optimal transport. They are difficult to compute in practice, however, leading previous work to restrict their supports to finite sets of points. Leveraging a new dual formulation for the regularized Wasserstein barycenter problem, we introduce a stochastic algorithm that c…

    Submitted 24 October, 2020; v1 submitted 28 August, 2020; originally announced August 2020.

  41. arXiv:2007.10987  [pdf, other]

    cs.LG cs.CR cs.DC

    IBM Federated Learning: an Enterprise Framework White Paper V0.1

    Authors: Heiko Ludwig, Nathalie Baracaldo, Gegi Thomas, Yi Zhou, Ali Anwar, Shashank Rajamoni, Yuya Ong, Jayaram Radhakrishnan, Ashish Verma, Mathieu Sinn, Mark Purcell, Ambrish Rawat, Tran Minh, Naoise Holohan, Supriyo Chakraborty, Shalisha Whitherspoon, Dean Steuer, Laura Wynter, Hifaz Hassan, Sean Laguna, Mikhail Yurochkin, Mayank Agarwal, Ebube Chuba, Annie Abay

    Abstract: Federated Learning (FL) is an approach to conduct machine learning without centralizing training data in a single place, for reasons of privacy, confidentiality or data volume. However, solving federated machine learning problems raises issues above and beyond those of centralized machine learning. These issues include setting up communication infrastructure between parties, coordinating the learn…

    Submitted 22 July, 2020; originally announced July 2020.

    Comments: 17 pages

    ACM Class: I.2.6; I.2.11

  42. arXiv:2007.06168  [pdf, other]

    cs.LG stat.ML

    Model Fusion with Kullback–Leibler Divergence

    Authors: Sebastian Claici, Mikhail Yurochkin, Soumya Ghosh, Justin Solomon

    Abstract: We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors and proceeds using a simple assign-and-average approach. The components of the dataset posteriors are assigned to the proposed global model components by solving a regularized variant of the assignmen…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: ICML 2020

  43. arXiv:2006.14168  [pdf, other]

    cs.LG stat.ML

    SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness

    Authors: Mikhail Yurochkin, Yuekai Sun

    Abstract: In this paper, we cast fair machine learning as invariant machine learning. We first formulate a version of individual fairness that enforces invariance on certain sensitive sets. We then design a transport-based regularizer that enforces this version of individual fairness and develop an algorithm to minimize the regularizer efficiently. Our theoretical results guarantee the proposed approach tra…

    Submitted 31 March, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: ICLR 2021

  44. arXiv:2006.11439  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Two Simple Ways to Learn Individual Fairness Metrics from Data

    Authors: Debarghya Mukherjee, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun

    Abstract: Individual fairness is an intuitive definition of algorithmic fairness that addresses some of the drawbacks of group fairness. Despite its benefits, it depends on a task specific fair metric that encodes our intuition of what is fair and unfair for the ML task at hand, and the lack of a widely accepted fair metric for many ML tasks is the main barrier to broader adoption of individual fairness. In…

    Submitted 19 June, 2020; originally announced June 2020.

    Comments: To appear in ICML 2020

  45. arXiv:2003.05048  [pdf, other]

    stat.ML cs.LG

    Auditing ML Models for Individual Bias and Unfairness

    Authors: Songkai Xue, Mikhail Yurochkin, Yuekai Sun

    Abstract: We consider the task of auditing ML models for individual bias/unfairness. We formalize the task in an optimization problem and develop a suite of inferential tools for the optimal value. Our tools permit us to obtain asymptotic confidence intervals and hypothesis tests that cover the target/control the Type I error rate exactly. To demonstrate the utility of our tools, we use them to reveal the g…

    Submitted 10 March, 2020; originally announced March 2020.

    Comments: In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) 2020

  46. arXiv:2002.06440  [pdf, other]

    cs.LG stat.ML

    Federated Learning with Matched Averaging

    Authors: Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, Yasaman Khazaeni

    Abstract: Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud. We propose Federated matched averaging (FedMA) algorithm designed for federated learning of modern neural network architectures e.g. convolutional neural networks (CNNs) and LSTMs. FedMA c…

    Submitted 15 February, 2020; originally announced February 2020.

    Comments: Accepted by ICLR 2020

  47. arXiv:1911.02053  [pdf, other]

    cs.LG stat.ML

    Alleviating Label Switching with Optimal Transport

    Authors: Pierre Monteiller, Sebastian Claici, Edward Chien, Farzaneh Mirzazadeh, Justin Solomon, Mikhail Yurochkin

    Abstract: Label switching is a phenomenon arising in mixture model posterior inference that prevents one from meaningfully assessing posterior statistics using standard Monte Carlo procedures. This issue arises due to invariance of the posterior under actions of a group; for example, permuting the ordering of mixture components has no effect on the likelihood. We propose a resolution to label switching that…

    Submitted 10 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  48. arXiv:1911.00218  [pdf, other]

    stat.ML cs.LG

    Statistical Model Aggregation via Parameter Matching

    Authors: Mikhail Yurochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenewald, Trong Nghia Hoang

    Abstract: We consider the problem of aggregating models learned from sequestered, possibly heterogeneous datasets. Exploiting tools from Bayesian nonparametrics, we develop a general meta-modeling framework that learns shared global latent structures by identifying correspondences among local model parameterizations. Our proposed framework is model-independent and is applicable to a wide range of model type…

    Submitted 1 November, 2019; originally announced November 2019.

    Comments: NeurIPS 2019

  49. arXiv:1909.08787  [pdf, other]

    stat.ML cs.LG

    On Efficient Multilevel Clustering via Wasserstein Distances

    Authors: Viet Huynh, Nhat Ho, Nhan Dam, XuanLong Nguyen, Mikhail Yurochkin, Hung Bui, and Dinh Phung

    Abstract: We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a potentially large hierarchically structured corpus of data. Our method involves a joint optimization formulation over several spaces of discrete probability measures, which are endowed with Wasserstein distance metrics. We p…

    Submitted 24 May, 2021; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: 32 pages, 8 figures, JMLR submission. arXiv admin note: substantial text overlap with arXiv:1706.03883

  50. arXiv:1907.00020  [pdf, other]

    stat.ML cs.LG

    Training individually fair ML models with Sensitive Subspace Robustness

    Authors: Mikhail Yurochkin, Amanda Bower, Yuekai Sun

    Abstract: We consider training machine learning models that are fair in the sense that their performance is invariant under certain sensitive perturbations to the inputs. For example, the performance of a resume screening system should be invariant under changes to the gender and/or ethnicity of the applicant. We formalize this notion of algorithmic fairness as a variant of individual fairness and develop a…

    Submitted 13 March, 2020; v1 submitted 28 June, 2019; originally announced July 2019.

    Comments: ICLR 2020 (spotlight)