Showing 1–33 of 33 results for author: Alvarez-Melis, D

  1. arXiv:2406.10485  [pdf, other]

    cs.LG cs.CV

    A Label is Worth a Thousand Images in Dataset Distillation

    Authors: Tian Qin, Zhiwei Deng, David Alvarez-Melis

    Abstract: Data $\textit{quality}$ is a crucial factor in the performance of machine learning models, a principle that dataset distillation methods exploit by compressing training datasets into much smaller counterparts that maintain similar downstream performance. Understanding how and why data distillation methods work is vital not only for improving these methods but also for revealing fundamental charact…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2404.07117  [pdf, other]

    cs.CL cs.LG

    Continuous Language Model Interpolation for Dynamic and Controllable Text Generation

    Authors: Sara Kangaslahti, David Alvarez-Melis

    Abstract: As large language models (LLMs) have gained popularity for a variety of use cases, making them adaptable and controllable has become increasingly important, especially for user-facing applications. While the existing literature on LLM adaptation primarily focuses on finding a model (or models) that optimizes a single predefined objective, here we focus on the challenging case where the model must…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 20 pages, 22 figures

  3. arXiv:2403.00999  [pdf, other]

    cs.LG

    Distributional Dataset Distillation with Subtask Decomposition

    Authors: Tian Qin, Zhiwei Deng, David Alvarez-Melis

    Abstract: What does a neural network learn when training from a task-specific dataset? Synthesizing this knowledge is the central idea behind Dataset Distillation, which recent work has shown can be used to compress large datasets into a small set of input-label pairs ($\textit{prototypes}$) that capture essential aspects of the original dataset. In this paper, we make the key observation that existing meth…

    Submitted 1 March, 2024; originally announced March 2024.

  4. arXiv:2402.05140  [pdf, other]

    cs.LG cs.AI cs.CL

    Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains

    Authors: Junhong Shen, Neil Tenenholtz, James Brian Hall, David Alvarez-Melis, Nicolo Fusi

    Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in understanding and generating natural language. However, their capabilities wane in highly specialized domains underrepresented in the pretraining corpus, such as physical and biomedical sciences. This work explores how to repurpose general LLMs into effective task solvers for specialized domains. We introduce a novel, model-a…

    Submitted 30 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  5. arXiv:2306.06866  [pdf, other]

    cs.LG cs.AI

    Generating Synthetic Datasets by Interpolating along Generalized Geodesics

    Authors: Jiaojiao Fan, David Alvarez-Melis

    Abstract: Data for pretraining machine learning models often consists of collections of heterogeneous datasets. Although training on their union is reasonable in agnostic settings, it might be suboptimal when the target domain -- where the model will ultimately be used -- is known in advance. In that case, one would ideally pretrain only on the dataset(s) most similar to the target one. Instead of limiting…

    Submitted 12 June, 2023; originally announced June 2023.

    Journal ref: Conference on Uncertainty in Artificial Intelligence (UAI) 2023

  6. arXiv:2303.02241  [pdf, other]

    cs.CV cs.LG

    Domain adaptation using optimal transport for invariant learning using histopathology datasets

    Authors: Kianoush Falahkheirkhah, Alex Lu, David Alvarez-Melis, Grace Huynh

    Abstract: Histopathology is critical for the diagnosis of many diseases, including cancer. These protocols typically require pathologists to manually evaluate slides under a microscope, which is time-consuming and subjective, leading to interest in machine learning to automate analysis. However, computational techniques are limited by batch effects, where technical factors like differences in preparation pr…

    Submitted 3 March, 2023; originally announced March 2023.

  7. arXiv:2211.14469  [pdf, other]

    cs.LG cs.AI

    Transfer RL via the Undo Maps Formalism

    Authors: Abhi Gupta, Ted Moskovitz, David Alvarez-Melis, Aldo Pacchiano

    Abstract: Transferring knowledge across domains is one of the most fundamental problems in machine learning, but doing so effectively in the context of reinforcement learning remains largely an open problem. Current methods make strong assumptions on the specifics of the task, often lack principled objectives, and -- crucially -- modify individual policies, which might be sub-optimal when the domains differ…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 8 main pages, 3 appendix

  8. arXiv:2210.13630  [pdf, other]

    cs.LG cs.IT

    Budget-Constrained Bounds for Mini-Batch Estimation of Optimal Transport

    Authors: David Alvarez-Melis, Nicolò Fusi, Lester Mackey, Tal Wagner

    Abstract: Optimal Transport (OT) is a fundamental tool for comparing probability distributions, but its exact computation remains prohibitive for large datasets. In this work, we introduce novel families of upper and lower bounds for the OT problem constructed by aggregating solutions of mini-batch OT problems. The upper bound family contains traditional mini-batch averaging at one extreme and a tight bound…

    Submitted 24 October, 2022; originally announced October 2022.

  9. arXiv:2210.03164  [pdf, other]

    cs.LG stat.ML

    InfoOT: Information Maximizing Optimal Transport

    Authors: Ching-Yao Chuang, Stefanie Jegelka, David Alvarez-Melis

    Abstract: Optimal transport aligns samples across distributions by minimizing the transportation cost between them, e.g., the geometric distances. Yet, it ignores coherence structure in the data such as clusters, does not handle outliers well, and cannot integrate new data points. To address these drawbacks, we propose InfoOT, an information-theoretic extension of optimal transport that maximizes the mutual…

    Submitted 29 May, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Journal ref: ICML 2023

  10. arXiv:2209.15621  [pdf, other]

    cs.LG stat.AP

    Neural Unbalanced Optimal Transport via Cycle-Consistent Semi-Couplings

    Authors: Frederike Lübeck, Charlotte Bunne, Gabriele Gut, Jacobo Sarabia del Castillo, Lucas Pelkmans, David Alvarez-Melis

    Abstract: Comparing unpaired samples of a distribution or population taken at different points in time is a fundamental task in many application domains where measuring populations is destructive and cannot be done repeatedly on the same sample, such as in single-cell biology. Optimal transport (OT) can solve this challenge by learning an optimal coupling of samples across distributions from unpaired data.…

    Submitted 30 September, 2022; originally announced September 2022.

  11. arXiv:2208.02896  [pdf, other]

    cs.LG cs.AI

    Interpretable Distribution Shift Detection using Optimal Transport

    Authors: Neha Hulkund, Nicolo Fusi, Jennifer Wortman Vaughan, David Alvarez-Melis

    Abstract: We propose a method to identify and characterize distribution shifts in classification datasets based on optimal transport. It allows the user to identify the extent to which each class is affected by the shift, and retrieves corresponding pairs of samples to provide insights on its nature. We illustrate its use on synthetic and natural shift examples. While the results we present are preliminary,…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: Presented at ICML 2022 DataPerf Workshop

  12. arXiv:2205.09838  [pdf, ps, other]

    cs.LG stat.ML

    Why GANs are overkill for NLP

    Authors: David Alvarez-Melis, Vikas Garg, Adam Tauman Kalai

    Abstract: This work offers a novel theoretical perspective on why, despite numerous attempts, adversarial approaches to generative modeling (e.g., GANs) have not been as popular for certain generation tasks, particularly sequential tasks such as Natural Language Generation, as they have in others, such as Computer Vision. In particular, on sequential data such as text, maximum-likelihood approaches are sign…

    Submitted 19 May, 2022; originally announced May 2022.

  13. arXiv:2204.08324  [pdf, other]

    cs.CV cs.AI

    Hierarchical Optimal Transport for Comparing Histopathology Datasets

    Authors: Anna Yeaton, Rahul G. Krishnan, Rebecca Mieloszyk, David Alvarez-Melis, Grace Huynh

    Abstract: Scarcity of labeled histopathology data limits the applicability of deep learning methods to under-profiled cancer types and labels. Transfer learning allows researchers to overcome the limitations of small datasets by pre-training machine learning models on larger datasets similar to the small target dataset. However, similarity between datasets is often determined heuristically. In this paper, w…

    Submitted 20 April, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

  14. arXiv:2106.00774  [pdf, other]

    stat.ML cs.LG math.NA

    Optimizing Functionals on the Space of Probabilities with Input Convex Neural Networks

    Authors: David Alvarez-Melis, Yair Schiff, Youssef Mroueh

    Abstract: Gradient flows are a powerful tool for optimizing functionals in general metric spaces, including the space of probabilities endowed with the Wasserstein metric. A typical approach to solving this optimization problem relies on its connection to the dynamic formulation of optimal transport and the celebrated Jordan-Kinderlehrer-Otto (JKO) scheme. However, this formulation involves optimization ove…

    Submitted 30 November, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

  15. arXiv:2104.13299  [pdf, other]

    cs.AI cs.LG

    From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

    Authors: David Alvarez-Melis, Harmanpreet Kaur, Hal Daumé III, Hanna Wallach, Jennifer Wortman Vaughan

    Abstract: We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the social sciences, and propose a list of design principles for machine-generated explanations that are meaningful to humans. Using the concept of weight of evidence f…

    Submitted 20 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: HCOMP 2021

  16. arXiv:2010.12760  [pdf, ps, other]

    cs.LG stat.ML

    Dataset Dynamics via Gradient Flows in Probability Space

    Authors: David Alvarez-Melis, Nicolò Fusi

    Abstract: Various machine learning tasks, from generative modeling to domain adaptation, revolve around the concept of dataset transformation and manipulation. While various methods exist for transforming unlabeled datasets, principled methods to do so for labeled (e.g., classification) datasets are missing. In this work, we propose a novel framework for dataset transformation, which we cast as optimization…

    Submitted 16 June, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: ICML 2021

  17. arXiv:2002.02923  [pdf, other]

    cs.LG stat.ML

    Geometric Dataset Distances via Optimal Transport

    Authors: David Alvarez-Melis, Nicolò Fusi

    Abstract: The notion of task similarity is at the core of various machine learning paradigms, such as domain adaptation and meta-learning. Current methods to quantify it are often heuristic, make strong assumptions on the label sets across the tasks, and many are architecture-dependent, relying on task-specific optimal parameters (e.g., require training a model on each dataset). In this work we propose an a…

    Submitted 7 February, 2020; originally announced February 2020.

  18. arXiv:1911.02536  [pdf, other]

    cs.LG stat.ML

    Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces

    Authors: David Alvarez-Melis, Youssef Mroueh, Tommi S. Jaakkola

    Abstract: This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This is a problem that appears across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vec…

    Submitted 7 May, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: AISTATS 2020

  19. arXiv:1910.14497  [pdf, other]

    cs.CL cs.LG

    Probabilistic Bias Mitigation in Word Embeddings

    Authors: Hailey Joren, David Alvarez-Melis

    Abstract: It has been shown that word embeddings derived from large corpora tend to incorporate biases present in their training data. Various methods for mitigating these biases have been proposed, but recent work has demonstrated that these methods hide but fail to truly remove the biases, which can still be observed in word nearest-neighbor statistics. In this work we propose a probabilistic view of wo…

    Submitted 26 April, 2020; v1 submitted 31 October, 2019; originally announced October 2019.

    Comments: 4 pages, 4 figures, Workshop on Human-Centric Machine Learning at NeurIPS 2019

  20. arXiv:1910.13503  [pdf, other]

    cs.LG cs.AI stat.ML

    Weight of Evidence as a Basis for Human-Oriented Explanations

    Authors: David Alvarez-Melis, Hal Daumé III, Jennifer Wortman Vaughan, Hanna Wallach

    Abstract: Interpretability is an elusive but highly sought-after characteristic of modern machine learning methods. Recent work has focused on interpretability via $\textit{explanations}$, which justify individual model predictions. In this work, we take a step towards reconciling machine explanations with those that humans produce and prefer by taking inspiration from the study of explanation in philosophy…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: Human-Centric Machine Learning (HCML) Workshop @ NeurIPS 2019

  21. arXiv:1907.03207  [pdf, other]

    cs.LG stat.ML

    Towards Robust, Locally Linear Deep Networks

    Authors: Guang-He Lee, David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: Deep networks realize complex mappings that are often understood by their locally linear behavior at or around points of interest. For example, we use the derivative of the mapping with respect to its inputs for sensitivity analysis, or to explain (obtain coordinate relevance for) a prediction. One key challenge is that such derivatives are themselves inherently unstable. In this paper, we propose…

    Submitted 6 July, 2019; originally announced July 2019.

    Comments: Published in International Conference on Learning Representations (ICLR), 2019

  22. arXiv:1905.05461  [pdf, other]

    cs.LG stat.ML

    Learning Generative Models across Incomparable Spaces

    Authors: Charlotte Bunne, David Alvarez-Melis, Andreas Krause, Stefanie Jegelka

    Abstract: Generative Adversarial Networks have shown remarkable success in learning a distribution that faithfully recovers a reference distribution in its entirety. However, in some cases, we may want to only learn some aspects (e.g., cluster or manifold structure), while modifying others (e.g., style, orientation or dimension). In this work, we propose an approach to learn generative models across such in…

    Submitted 15 May, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: International Conference on Machine Learning (ICML)

    Journal ref: Proceedings of Machine Learning Research (PMLR), 97 (2019)

  23. arXiv:1902.09737  [pdf, other]

    cs.LG stat.ML

    Functional Transparency for Structured Data: a Game-Theoretic Approach

    Authors: Guang-He Lee, Wengong Jin, David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: We provide a new approach to training neural models to exhibit transparency in a well-defined, functional manner. Our approach naturally operates over structured data and tailors the predictor, functionally, towards a chosen family of (local) witnesses. The estimation problem is setup as a co-operative game between an unrestricted predictor such as a neural network, and a set of witnesses chosen f…

    Submitted 26 February, 2019; originally announced February 2019.

  24. arXiv:1809.00013  [pdf, other]

    cs.CL

    Gromov-Wasserstein Alignment of Word Embedding Spaces

    Authors: David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: Cross-lingual or cross-domain correspondences play key roles in tasks ranging from machine translation to transfer learning. Recently, purely unsupervised methods operating on monolingual embeddings have become effective alignment tools. Current state-of-the-art methods, however, involve multiple steps, including heuristic post-hoc refinement strategies. In this paper, we cast the correspondence p…

    Submitted 31 August, 2018; originally announced September 2018.

    Comments: EMNLP 2018

  25. arXiv:1807.00130  [pdf, other]

    cs.LG stat.ML

    Game-Theoretic Interpretability for Temporal Modeling

    Authors: Guang-He Lee, David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: Interpretability has arisen as a key desideratum of machine learning models alongside performance. Approaches so far have been primarily concerned with fixed dimensional inputs emphasizing feature relevance or selection. In contrast, we focus on temporal modeling and the problem of tailoring the predictor, functionally, towards an interpretable family. To this end, we propose a co-operative game b…

    Submitted 30 June, 2018; originally announced July 2018.

  26. arXiv:1806.09277  [pdf, other]

    stat.ML cs.LG

    Towards Optimal Transport with Global Invariances

    Authors: David Alvarez-Melis, Stefanie Jegelka, Tommi S. Jaakkola

    Abstract: Many problems in machine learning involve calculating correspondences between sets of objects, such as point clouds or images. Discrete optimal transport provides a natural and successful approach to such tasks whenever the two sets of objects can be represented in the same space, or at least distances between them can be directly evaluated. Unfortunately neither requirement is likely to hold when…

    Submitted 26 February, 2019; v1 submitted 24 June, 2018; originally announced June 2018.

    Comments: AISTATS 2019

  27. arXiv:1806.08049  [pdf, other]

    cs.LG stat.ML

    On the Robustness of Interpretability Methods

    Authors: David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: We argue that robustness of explanations---i.e., that similar inputs should give rise to similar explanations---is a key desideratum for interpretability. We introduce metrics to quantify robustness and demonstrate that current methods do not perform well according to these metrics. Finally, we propose ways that robustness can be enforced on existing interpretability approaches.

    Submitted 20 June, 2018; originally announced June 2018.

    Comments: presented at 2018 ICML Workshop on Human Interpretability in Machine Learning (WHI 2018), Stockholm, Sweden

  28. arXiv:1806.07538  [pdf, other]

    cs.LG stat.ML

    Towards Robust Interpretability with Self-Explaining Neural Networks

    Authors: David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: Most recent work on interpretability of complex machine learning models has focused on estimating $\textit{a posteriori}$ explanations for previously trained models around specific predictions. $\textit{Self-explaining}$ models where interpretability plays a key role already during learning have received much less attention. We propose three desiderata for explanations in general -- explicitness,…

    Submitted 3 December, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: NeurIPS 2018

  29. arXiv:1712.06199  [pdf, other]

    stat.ML cs.LG

    Structured Optimal Transport

    Authors: David Alvarez-Melis, Tommi S. Jaakkola, Stefanie Jegelka

    Abstract: Optimal Transport has recently gained interest in machine learning for applications ranging from domain adaptation, sentence similarities to deep learning. Yet, its ability to capture frequently occurring structure beyond the "ground metric" is limited. In this work, we develop a nonlinear generalization of (discrete) optimal transport that is able to reflect much additional structure. We demonstr…

    Submitted 17 December, 2017; originally announced December 2017.

  30. arXiv:1707.01943  [pdf, other]

    cs.LG

    A causal framework for explaining the predictions of black-box sequence-to-sequence models

    Authors: David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: We interpret the predictions of any black-box structured input-structured output model around a specific input-output pair. Our method returns an "explanation" consisting of groups of input-output tokens that are causally related. These dependencies are inferred by querying the black-box model with perturbed inputs, generating a graph over tokens from the responses, and solving a partitioning prob…

    Submitted 14 November, 2017; v1 submitted 6 July, 2017; originally announced July 2017.

    Comments: 12 Pages, EMNLP 2017

  31. arXiv:1706.09549  [pdf, other]

    cs.LG

    Distributional Adversarial Networks

    Authors: Chengtao Li, David Alvarez-Melis, Keyulu Xu, Stefanie Jegelka, Suvrit Sra

    Abstract: We propose a framework for adversarial training that relies on a sample rather than a single sample point as the fundamental unit of discrimination. Inspired by discrepancy measures and two-sample tests between probability distributions, we propose two such distributional adversaries that operate and predict on samples, and show how they can be easily implemented on top of existing models. Various…

    Submitted 9 July, 2017; v1 submitted 28 June, 2017; originally announced June 2017.

  32. arXiv:1512.01229  [pdf, other]

    math.ST

    A translation of "The characteristic function of a random phenomenon" by Bruno de Finetti

    Authors: David Alvarez-Melis, Tamara Broderick

    Abstract: This article is a translation of Bruno de Finetti's paper "Funzione Caratteristica di un fenomeno aleatorio" which appeared in Atti del Congresso Internazionale dei Matematici, Bologna 3-10 Settembre 1928, Tomo VI, pp. 179-190, originally published by Nicola Zanichelli Editore S.p.A. The translation was made as close as possible to the original in form and style, except for apparent mistakes found…

    Submitted 3 December, 2015; originally announced December 2015.

    Comments: 12 pages

    MSC Class: 60E10

  33. arXiv:1509.05808  [pdf, other]

    cs.CL cs.LG stat.ML

    Word, graph and manifold embedding from Markov processes

    Authors: Tatsunori B. Hashimoto, David Alvarez-Melis, Tommi S. Jaakkola

    Abstract: Continuous vector representations of words and objects appear to carry surprisingly rich semantic content. In this paper, we advance both the conceptual and theoretical understanding of word embeddings in three ways. First, we ground embeddings in semantic spaces studied in cognitive-psychometric literature and introduce new evaluation tasks. Second, in contrast to prior work, we take metric recov…

    Submitted 18 September, 2015; originally announced September 2015.