Showing 1–26 of 26 results for author: Khodak, M

  1. arXiv:2310.02246  [pdf, other]

    cs.LG cs.AI math.NA stat.ML

    Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances

    Authors: Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar

    Abstract: Solving a linear system $Ax=b$ is a fundamental scientific computing primitive for which numerous solvers and preconditioners have been developed. These come with parameters whose optimal values depend on the system being solved and are often impossible or too expensive to identify; thus in practice sub-optimal heuristics are used. We consider the common setting in which many related linear system…

    Submitted 2 May, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 Spotlight

  2. arXiv:2307.02295  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-Learning Adversarial Bandit Algorithms

    Authors: Mikhail Khodak, Ilya Osadchiy, Keegan Harris, Maria-Florina Balcan, Kfir Y. Levy, Ron Meir, Zhiwei Steven Wu

    Abstract: We study online meta-learning with bandit feedback, with the goal of improving performance across multiple tasks if they are similar according to some natural similarity measure. As the first to target the adversarial online-within-online partial-information setting, we design meta-algorithms that combine outer learners to simultaneously tune the initialization and other hyperparameters of an inne…

    Submitted 1 November, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: Merger of arXiv:2205.14128 and arXiv:2205.15921, with some additional improvements; to appear in NeurIPS 2023

  3. arXiv:2302.05738  [pdf, other]

    cs.LG

    Cross-Modal Fine-Tuning: Align then Refine

    Authors: Junhong Shen, Liam Li, Lucio M. Dery, Corey Staten, Mikhail Khodak, Graham Neubig, Ameet Talwalkar

    Abstract: Fine-tuning large-scale pretrained models has led to tremendous progress in well-studied modalities such as vision and NLP. However, similar gains have not been observed in many other modalities due to a lack of relevant pretrained models. In this work, we propose ORCA, a general cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse mo…

    Submitted 18 March, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

  4. arXiv:2212.08930  [pdf, other]

    cs.LG

    On Noisy Evaluation in Federated Hyperparameter Tuning

    Authors: Kevin Kuo, Pratiksha Thaker, Mikhail Khodak, John Nguyen, Daniel Jiang, Ameet Talwalkar, Virginia Smith

    Abstract: Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the eff…

    Submitted 15 May, 2023; v1 submitted 17 December, 2022; originally announced December 2022.

    Comments: v1: 19 pages, 15 figures, submitted to MLSys2023; v2: Fixed citation formatting; v3: Fixed typo, updated acks; v4: MLSys2023 camera-ready

  5. arXiv:2210.11222  [pdf, other]

    cs.CR cs.AI cs.DS cs.LG stat.ML

    Learning-Augmented Private Algorithms for Multiple Quantile Release

    Authors: Mikhail Khodak, Kareem Amin, Travis Dick, Sergei Vassilvitskii

    Abstract: When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors. We propose to use the learning-augmented algorithms (or algorithms with predictions) framework -- previously applied largely to improve time complexity or competitive ratios -- as a powerful way of designing and analyzing priv…

    Submitted 8 May, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: To appear in ICML 2023

  6. arXiv:2209.14110  [pdf, other]

    cs.GT

    Meta-Learning in Games

    Authors: Keegan Harris, Ioannis Anagnostides, Gabriele Farina, Mikhail Khodak, Zhiwei Steven Wu, Tuomas Sandholm

    Abstract: In the literature on game-theoretic equilibrium finding, focus has mainly been on solving a single game in isolation. In practice, however, strategic interactions -- ranging from routing problems to online advertising auctions -- evolve dynamically, thereby leading to many similar games to be solved. To address this gap, we introduce meta-learning for equilibrium finding and learning to play games…

    Submitted 1 March, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: In the Eleventh International Conference on Learning Representations (ICLR 2023)

  7. arXiv:2207.10199  [pdf, other]

    cs.LG stat.ML

    Provably tuning the ElasticNet across instances

    Authors: Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

    Abstract: An important unresolved challenge in the theory of regularization is to set the regularization coefficients of popular techniques like the ElasticNet with general provable guarantees. We consider the problem of tuning the regularization parameters of Ridge regression, LASSO, and the ElasticNet across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperp…

    Submitted 15 January, 2024; v1 submitted 20 July, 2022; originally announced July 2022.

  8. arXiv:2205.14128  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Meta-Learning Adversarial Bandits

    Authors: Maria-Florina Balcan, Keegan Harris, Mikhail Khodak, Zhiwei Steven Wu

    Abstract: We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure. As the first to target the adversarial setting, we design a unified meta-algorithm that yields setting-specific guarantees for two important cases: multi-armed bandits (MAB) and bandit linear optimiza…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: 19 pages

  9. arXiv:2205.14082  [pdf, other]

    cs.LG cs.AI

    AANG: Automating Auxiliary Learning

    Authors: Lucio M. Dery, Paul Michel, Mikhail Khodak, Graham Neubig, Ameet Talwalkar

    Abstract: Auxiliary objectives, supplementary learning signals that are introduced to help aid learning on data-starved or highly complex end-tasks, are commonplace in machine learning. Whilst much work has been done to formulate useful auxiliary objectives, their construction is still an art which proceeds by slow and tedious hand-design. Intuition for how and when these objectives improve end-task perform…

    Submitted 27 February, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: Accepted to ICLR 2023; 22 pages, 7 tables and 5 figures

  10. arXiv:2204.07554  [pdf, other]

    cs.LG cs.AI

    Efficient Architecture Search for Diverse Tasks

    Authors: Junhong Shen, Mikhail Khodak, Ameet Talwalkar

    Abstract: While neural architecture search (NAS) has enabled automated machine learning (AutoML) for well-researched areas, its application to tasks beyond computer vision is still under-explored. As less-studied domains are precisely those where we expect AutoML to have the greatest impact, in this work we study NAS for efficiently solving diverse problems. Seeking an approach that is fast, simple, and bro…

    Submitted 9 October, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: NeurIPS 2022 Camera-Ready; code available at https://github.com/sjunhongshen/DASH

  11. arXiv:2202.09312  [pdf, other]

    cs.LG cs.AI cs.DS stat.ML

    Learning Predictions for Algorithms with Predictions

    Authors: Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar, Sergei Vassilvitskii

    Abstract: A burgeoning paradigm in algorithm design is the field of algorithms with predictions, in which algorithms can take advantage of a possibly-imperfect prediction of some aspect of the problem. While much work has focused on using predictions to improve competitive ratios, running times, or other performance measures, less effort has been devoted to the question of how to obtain the predictions them…

    Submitted 17 October, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022 camera-ready

  12. arXiv:2110.05668  [pdf, other]

    cs.CV cs.LG

    NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks

    Authors: Renbo Tu, Nicholas Roberts, Mikhail Khodak, Junhong Shen, Frederic Sala, Ameet Talwalkar

    Abstract: Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks, e.g. image classification on CIFAR or ImageNet. This makes the performance of NAS approaches in more diverse areas poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite to evaluate methods on domains beyond those traditionally studied in architecture search, and use it to…

    Submitted 19 January, 2023; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track

  13. arXiv:2108.08770  [pdf, other]

    cs.LG

    Learning-to-learn non-convex piecewise-Lipschitz functions

    Authors: Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

    Abstract: We analyze the meta-learning of the initialization and step-size of learning algorithms for piecewise-Lipschitz functions, a non-convex setting with applications to both machine learning and algorithms. Starting from recent regret bounds for the exponential forecaster on losses with dispersed discontinuities, we generalize them to be initialization-dependent and then use this result to propose a p…

    Submitted 19 August, 2021; originally announced August 2021.

  14. arXiv:2106.04502  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

    Authors: Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan, Virginia Smith, Ameet Talwalkar

    Abstract: Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investig…

    Submitted 4 November, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  15. arXiv:2105.01029  [pdf, other]

    stat.ML cs.AI cs.CL cs.CV cs.LG

    Initialization and Regularization of Factorized Neural Layers

    Authors: Mikhail Khodak, Neil Tenenholtz, Lester Mackey, Nicolò Fusi

    Abstract: Factorized layers--operations parameterized by products of two or more matrices--occur in a variety of deep learning contexts, including compressed model training, certain types of knowledge distillation, and multi-head self-attention architectures. We study how to initialize and regularize deep nets containing such layers, examining two simple, understudied schemes, spectral initialization and Fr…

    Submitted 4 October, 2022; v1 submitted 3 May, 2021; originally announced May 2021.

    Comments: ICLR 2021 camera-ready, amended due to error pointed out in arXiv:2209.13569v1 (amendment shown in blue)

  16. arXiv:2103.15798  [pdf, other]

    cs.LG cs.AI cs.CV math.NA stat.ML

    Rethinking Neural Operations for Diverse Tasks

    Authors: Nicholas Roberts, Mikhail Khodak, Tri Dao, Liam Li, Christopher Ré, Ameet Talwalkar

    Abstract: An important goal of AutoML is to automate-away the design of neural networks on new tasks in under-explored domains. Motivated by this goal, we study the problem of enabling users to discover the right neural operations given data from their specific domain. We introduce a search space of operations called XD-Operations that mimic the inductive bias of standard multi-channel convolutions while be…

    Submitted 4 November, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: NeurIPS 2021

  17. arXiv:2004.07802  [pdf, other]

    cs.LG cs.CV cs.NE math.OC stat.ML

    Geometry-Aware Gradient Algorithms for Neural Architecture Search

    Authors: Liam Li, Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

    Abstract: Recent state-of-the-art methods for neural architecture search (NAS) exploit gradient-based optimization by relaxing the problem into continuous optimization over architectures and shared-weights, a noisy process that remains poorly understood. We argue for the study of single-level empirical risk minimization to understand NAS with weight-sharing, reducing the design of NAS methods to devising op…

    Submitted 18 March, 2021; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: ICLR 2021 Camera-Ready

  18. arXiv:2002.11172  [pdf, other]

    cs.LG math.OC stat.ML

    A Sample Complexity Separation between Non-Convex and Convex Meta-Learning

    Authors: Nikunj Saunshi, Yi Zhang, Mikhail Khodak, Sanjeev Arora

    Abstract: One popular trend in meta-learning is to learn from many training tasks a common initialization for a gradient-based method that can be used to solve a new task with few samples. The theory of meta-learning is still in its early stages, with several recent learning-theoretic analyses of methods such as Reptile [Nichol et al., 2018] being for convex models. This work shows that convex-case analysis…

    Submitted 25 February, 2020; originally announced February 2020.

    Comments: 34 pages

  19. arXiv:1912.04977  [pdf, other]

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re…

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  20. arXiv:1909.05830  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Differentially Private Meta-Learning

    Authors: Jeffrey Li, Mikhail Khodak, Sebastian Caldas, Ameet Talwalkar

    Abstract: Parameter-transfer is a well-known and versatile approach for meta-learning, with applications including few-shot learning, federated learning, and reinforcement learning. However, parameter-transfer algorithms often require sharing models that have been trained on the samples from specific tasks, thus leaving the task-owners susceptible to breaches of privacy. We conduct the first formal study of…

    Submitted 21 February, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

  21. arXiv:1906.02717  [pdf, other]

    cs.LG cs.AI stat.ML

    Adaptive Gradient-Based Meta-Learning Methods

    Authors: Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

    Abstract: We build a theoretical framework for designing and understanding practical meta-learning methods that integrates sophisticated formalizations of task-similarity with the extensive literature on online convex optimization and sequential prediction algorithms. Our approach enables the task-similarity to be learned adaptively, provides sharper transfer-risk bounds in the setting of statistical learni…

    Submitted 6 December, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019

  22. arXiv:1902.10644  [pdf, other]

    cs.LG cs.AI stat.ML

    Provable Guarantees for Gradient-Based Meta-Learning

    Authors: Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

    Abstract: We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods. Our method is the first to simultaneously satisfy good sample efficiency guarantees in the convex setting, with generalization bounds that improve with task-sim…

    Submitted 16 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: ICML 2019

  23. arXiv:1902.09229  [pdf, other]

    cs.LG cs.AI stat.ML

    A Theoretical Analysis of Contrastive Unsupervised Representation Learning

    Authors: Sanjeev Arora, Hrishikesh Khandeparkar, Mikhail Khodak, Orestis Plevrakis, Nikunj Saunshi

    Abstract: Recent empirical works have successfully used unlabeled data to learn feature representations that are broadly useful in downstream classification tasks. Several of these methods are reminiscent of the well-known word2vec embedding algorithm: leveraging availability of pairs of semantically "similar" data points and "negative samples," the learner forces the inner product of representations of sim…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: 19 pages, 5 figures

  24. arXiv:1805.05388  [pdf, other]

    cs.CL cs.AI

    A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

    Authors: Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora

    Abstract: Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-li…

    Submitted 14 May, 2018; originally announced May 2018.

    Comments: 11 pages, 2 figures, To appear in ACL 2018

  25. arXiv:1705.00217  [pdf, other]

    cs.CL cs.IR

    Extending and Improving Wordnet via Unsupervised Word Embeddings

    Authors: Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora

    Abstract: This work presents an unsupervised approach for improving WordNet that builds upon recent advances in document and sense representation via distributional semantics. We apply our methods to construct Wordnets in French and Russian, languages which both lack good manual constructions. These are evaluated on two new 600-word test sets for word-to-synset matching and found to improve greatly upon sy…

    Submitted 29 April, 2017; originally announced May 2017.

    Comments: 17 pages, 3 figures, In Submission

  26. arXiv:1704.05579  [pdf, other]

    cs.CL cs.AI cs.LG

    A Large Self-Annotated Corpus for Sarcasm

    Authors: Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli

    Abstract: We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for sarcasm research and for training and evaluating systems for sarcasm detection. The corpus has 1.3 million sarcastic statements -- 10 times more than any previous dataset -- and many times more instances of non-sarcastic statements, allowing for learning in both balanced and unbalanced label regimes. Each statement is further…

    Submitted 22 March, 2018; v1 submitted 18 April, 2017; originally announced April 2017.

    Comments: 6 pages, 4 Figures. To Appear in LREC 2018