Showing 51–74 of 74 results for author: Talwalkar, A

  1. arXiv:1902.10644  [pdf, other]

    cs.LG cs.AI stat.ML

    Provable Guarantees for Gradient-Based Meta-Learning

    Authors: Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

    Abstract: We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods. Our method is the first to simultaneously satisfy good sample efficiency guarantees in the convex setting, with generalization bounds that improve with task-sim…

    Submitted 16 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: ICML 2019

  2. arXiv:1902.07638  [pdf, other]

    cs.LG stat.ML

    Random Search and Reproducibility for Neural Architecture Search

    Authors: Liam Li, Ameet Talwalkar

    Abstract: Neural architecture search (NAS) is a promising research direction that has the potential to replace expert-designed networks with learned, task-specific architectures. In this work, in order to help ground the empirical results in this field, we propose new NAS baselines that build off the following observations: (i) NAS is a specialized hyperparameter optimization problem; and (ii) random search…

    Submitted 30 July, 2019; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: v2 changelog: modified footnote 2 for ENAS; expanded broad reproducibility study for random search with WS for CNN to 6 sets of random seeds. v3 changelog: added journal reference; updated acknowledgements.

    Journal ref: Conference on Uncertainty in Artificial Intelligence (UAI), 2019

  3. arXiv:1902.06787  [pdf, other]

    cs.LG stat.ML

    Regularizing Black-box Models for Improved Interpretability

    Authors: Gregory Plumb, Maruan Al-Shedivat, Angel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar

    Abstract: Most of the work on interpretable machine learning has focused on designing either inherently interpretable models, which typically trade-off accuracy for interpretability, or post-hoc explanation systems, whose explanation quality can be unpredictable. Our method, ExpO, is a hybridization of these approaches that regularizes a model for explanation quality at training time. Importantly, these reg…

    Submitted 8 November, 2020; v1 submitted 18 February, 2019; originally announced February 2019.

  4. arXiv:1812.07210  [pdf, other]

    cs.LG cs.DC stat.ML

    Expanding the Reach of Federated Learning by Reducing Client Resource Requirements

    Authors: Sebastian Caldas, Jakub Konečný, H. Brendan McMahan, Ameet Talwalkar

    Abstract: Communication on heterogeneous edge networks is a fundamental bottleneck in Federated Learning (FL), restricting both model capacity and user participation. To address this issue, we introduce two novel strategies to reduce communication costs: (1) the use of lossy compression on the global model sent server-to-client; and (2) Federated Dropout, which allows users to efficiently train locally on s…

    Submitted 8 January, 2019; v1 submitted 18 December, 2018; originally announced December 2018.

  5. arXiv:1812.06127  [pdf, other]

    cs.LG stat.ML

    Federated Optimization in Heterogeneous Networks

    Authors: Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

    Abstract: Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedPr…

    Submitted 21 April, 2020; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: MLSys 2020

  6. arXiv:1812.01097  [pdf, other]

    cs.LG stat.ML

    LEAF: A Benchmark for Federated Settings

    Authors: Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, Ameet Talwalkar

    Abstract: Modern federated networks, such as those comprised of wearable devices, mobile phones, or autonomous vehicles, generate massive amounts of data each day. This wealth of data can help to learn models that can improve the user experience on each device. However, the scale and heterogeneity of federated data presents new challenges in research areas such as federated learning, meta-learning, and mult…

    Submitted 9 December, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

  7. arXiv:1810.05934  [pdf, other]

    cs.LG stat.ML

    A System for Massively Parallel Hyperparameter Tuning

    Authors: Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

    Abstract: Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the need to develop mature hyperparameter optimization functionality in distributed computing settings. We address this challenge by first introducing a simple and…

    Submitted 15 March, 2020; v1 submitted 13 October, 2018; originally announced October 2018.

    Comments: v2: corrected typo in Algorithm 1. v3: added comparison to BOHB and parallel version of synchronous SHA; added PBT to experiment in Section 4.3.1. v4: added acknowledgements and slight edit to related work.

    Journal ref: Conference on Machine Learning and Systems 2020

  8. arXiv:1807.02910  [pdf, other]

    cs.LG stat.ML

    Model Agnostic Supervised Local Explanations

    Authors: Gregory Plumb, Denali Molitor, Ameet Talwalkar

    Abstract: Model interpretability is an increasingly important component of practical machine learning. Some of the most common forms of interpretability systems are example-based, local, and global explanations. One of the main challenges in interpretability is designing explanation systems that can capture aspects of each of these explanation types, in order to develop a more thorough understanding of the…

    Submitted 5 January, 2019; v1 submitted 8 July, 2018; originally announced July 2018.

  9. arXiv:1707.00424  [pdf, other]

    cs.LG cs.DC stat.ML

    Parle: parallelizing stochastic gradient descent

    Authors: Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar, Adam Oberman

    Abstract: We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4x faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters. We exploit the phenomenon of flat minima that has been sh…

    Submitted 10 September, 2017; v1 submitted 3 July, 2017; originally announced July 2017.

  10. arXiv:1705.10467  [pdf, other]

    cs.LG stat.ML

    Federated Multi-Task Learning

    Authors: Virginia Smith, Chao-Kai Chiang, Maziar Sanjabi, Ameet Talwalkar

    Abstract: Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory for the first…

    Submitted 27 February, 2018; v1 submitted 30 May, 2017; originally announced May 2017.

  11. arXiv:1603.06560  [pdf, other]

    cs.LG stat.ML

    Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization

    Authors: Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, Ameet Talwalkar

    Abstract: Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters. While recent approaches use Bayesian optimization to adaptively select configurations, we focus on speeding up random search through adaptive resource allocation and early-stopping. We formulate hyperparameter optimization as a pure-exploration non-stochastic infinite-armed bandit problem wh…

    Submitted 18 June, 2018; v1 submitted 21 March, 2016; originally announced March 2016.

    Comments: Changes: - Updated to JMLR version

    Journal ref: Journal of Machine Learning Research 18 (2018) 1-52

  12. arXiv:1505.06807  [pdf, other]

    cs.LG cs.DC cs.MS stat.ML

    MLlib: Machine Learning in Apache Spark

    Authors: Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar

    Abstract: Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shippe…

    Submitted 26 May, 2015; originally announced May 2015.

  13. arXiv:1502.07943  [pdf, other]

    cs.LG stat.ML

    Non-stochastic Best Arm Identification and Hyperparameter Optimization

    Authors: Kevin Jamieson, Ameet Talwalkar

    Abstract: Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem. Within the multi-armed bandit literature, the cumulative regret objective enjoys algorithms and analyses for both the non-stochastic and stochastic settings while to the best of our knowledge, the best-arm identification framework has only been considered in the stochastic setting…

    Submitted 27 February, 2015; originally announced February 2015.

  14. arXiv:1502.00068  [pdf, other]

    cs.DB cs.DC cs.LG

    TuPAQ: An Efficient Planner for Large-scale Predictive Analytic Queries

    Authors: Evan R. Sparks, Ameet Talwalkar, Michael J. Franklin, Michael I. Jordan, Tim Kraska

    Abstract: The proliferation of massive datasets combined with the development of sophisticated analytical techniques have enabled a wide variety of novel applications such as improved product recommendations, automatic image tagging, and improved speech-driven interfaces. These and many other applications can be supported by Predictive Analytic Queries (PAQs). A major obstacle to supporting PAQs is the chal…

    Submitted 8 March, 2015; v1 submitted 30 January, 2015; originally announced February 2015.

  15. arXiv:1408.2044  [pdf]

    cs.LG stat.ML

    Matrix Coherence and the Nystrom Method

    Authors: Ameet Talwalkar, Afshin Rostamizadeh

    Abstract: The Nystrom method is an efficient technique used to speed up large-scale learning applications by generating low-rank approximations. Crucial to the performance of this technique is the assumption that a matrix can be well approximated by working exclusively with a subset of its columns. In this work we relate this assumption to the concept of matrix coherence, connecting coherence to the perform…

    Submitted 9 August, 2014; originally announced August 2014.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-572-579

  16. arXiv:1310.8420  [pdf, other]

    q-bio.GN q-bio.QM

    SMaSH: A Benchmarking Toolkit for Human Genome Variant Calling

    Authors: Ameet Talwalkar, Jesse Liptrap, Julie Newcomb, Christopher Hartl, Jonathan Terhorst, Kristal Curtis, Ma'ayan Bresler, Yun S. Song, Michael I. Jordan, David Patterson

    Abstract: Motivation: Computational methods are essential to extract actionable information from raw sequencing data, and to thus fulfill the promise of next-generation sequencing technology. Unfortunately, computational tools developed to call variants from human sequencing data disagree on many of their predictions, and current methods to evaluate accuracy and computational performance are ad-hoc and inco…

    Submitted 5 January, 2014; v1 submitted 31 October, 2013; originally announced October 2013.

  17. arXiv:1310.5426  [pdf, other]

    cs.LG cs.DC stat.ML

    MLI: An API for Distributed Machine Learning

    Authors: Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska

    Abstract: MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implement…

    Submitted 25 October, 2013; v1 submitted 21 October, 2013; originally announced October 2013.

  18. arXiv:1304.5583  [pdf, ps, other]

    cs.CV cs.DC cs.LG stat.ML

    Distributed Low-rank Subspace Segmentation

    Authors: Ameet Talwalkar, Lester Mackey, Yadong Mu, Shih-Fu Chang, Michael I. Jordan

    Abstract: Vision problems ranging from image clustering to motion segmentation to semi-supervised learning can naturally be framed as subspace segmentation problems, in which one aims to recover multiple low-dimensional subspaces from noisy and corrupted input data. Low-Rank Representation (LRR), a convex formulation of the subspace segmentation problem, is provably and empirically accurate on small problem…

    Submitted 15 October, 2013; v1 submitted 19 April, 2013; originally announced April 2013.

  19. arXiv:1206.6415  [pdf]

    cs.LG stat.ML

    The Big Data Bootstrap

    Authors: Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar, Michael Jordan

    Abstract: The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets, the computation of bootstrap-based quantities can be prohibitively demanding. As an alternative, we present the Bag of Little Bootstraps (BLB), a new procedure which incorporates features of both the bootstrap and subsampling to obtain a robust, computationally…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012). arXiv admin note: text overlap with arXiv:1112.5016

  20. arXiv:1112.5016  [pdf, other]

    stat.ME stat.CO stat.ML

    A Scalable Bootstrap for Massive Data

    Authors: Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar, Michael I. Jordan

    Abstract: The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets---which are increasingly prevalent---the computation of bootstrap-based quantities can be prohibitively demanding computationally. While variants such as subsampling and the $m$ out of $n$ bootstrap can be used in principle to reduce the cost of bootstrap computa…

    Submitted 27 June, 2012; v1 submitted 21 December, 2011; originally announced December 2011.

  21. arXiv:1112.3265  [pdf, other]

    cs.SI physics.soc-ph

    Jointly Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN)

    Authors: Neil Zhenqiang Gong, Ameet Talwalkar, Lester Mackey, Ling Huang, Eui Chul Richard Shin, Emil Stefanov, Elaine Shi, Dawn Song

    Abstract: The effects of social influence and homophily suggest that both network structure and node attribute information should inform the tasks of link prediction and node attribute inference. Recently, Yin et al. proposed Social-Attribute Network (SAN), an attribute-augmented social network, to integrate network structure and node attributes to perform both link prediction and attribute inference. They…

    Submitted 22 June, 2012; v1 submitted 14 December, 2011; originally announced December 2011.

    Comments: 9 pages, 4 figures and 4 tables

  22. arXiv:1107.0789  [pdf, ps, other]

    cs.LG cs.DS math.NA stat.ML

    Distributed Matrix Completion and Robust Factorization

    Authors: Lester Mackey, Ameet Talwalkar, Michael I. Jordan

    Abstract: If learning methods are to scale to the massive sizes of modern datasets, it is essential for the field of machine learning to embrace parallel and distributed computing. Inspired by the recent development of matrix factorization methods with rich theory but poor computational complexity and by the relative ease of mapping matrices onto distributed architectures, we introduce a scalable divide-and…

    Submitted 28 October, 2013; v1 submitted 5 July, 2011; originally announced July 2011.

    Comments: 35 pages, 6 figures

  23. arXiv:1009.0861  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    On the Estimation of Coherence

    Authors: Mehryar Mohri, Ameet Talwalkar

    Abstract: Low-rank matrix approximations are often used to help scale standard machine learning algorithms to large-scale problems. Recently, matrix coherence has been used to characterize the ability to extract global information from a subset of matrix entries in the context of these low-rank approximations and other sampling-based algorithms, e.g., matrix completion, robust PCA. Since coherence is defi…

    Submitted 4 September, 2010; originally announced September 2010.

  24. arXiv:1004.2008  [pdf, ps, other]

    cs.AI

    Matrix Coherence and the Nystrom Method

    Authors: Ameet Talwalkar, Afshin Rostamizadeh

    Abstract: The Nystrom method is an efficient technique to speed up large-scale learning applications by generating low-rank approximations. Crucial to the performance of this technique is the assumption that a matrix can be well approximated by working exclusively with a subset of its columns. In this work we relate this assumption to the concept of matrix coherence and connect matrix coherence to the per…

    Submitted 12 April, 2010; originally announced April 2010.