Skip to main content

Showing 1–11 of 11 results for author: Mariet, Z

Searching in archive stat. Search in all archives.
.
  1. arXiv:2207.07411  [pdf, other

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek , et al. (1 additional authors not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per… ▽ More

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  2. arXiv:2207.03084  [pdf, other

    cs.LG cs.AI stat.ML

    Pre-training helps Bayesian optimization too

    Authors: Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

    Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs o… ▽ More

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: ICML2022 Workshop on Adaptive Experimental Design and Active Learning in the Real World. arXiv admin note: substantial text overlap with arXiv:2109.08215

  3. arXiv:2206.10566  [pdf, other

    stat.ML cs.LG

    Ensembling over Classifiers: a Bias-Variance Perspective

    Authors: Neha Gupta, Jamie Smith, Ben Adlam, Zelda Mariet

    Abstract: Ensembles are a straightforward, remarkably effective method for improving the accuracy,calibration, and robustness of models on classification tasks; yet, the reasons that underlie their success remain an active area of research. We build upon the extension to the bias-variance decomposition by Pfau (2013) in order to gain crucial insights into the behavior of ensembles of classifiers. Introducin… ▽ More

    Submitted 21 June, 2022; originally announced June 2022.

  4. arXiv:2202.04167  [pdf, other

    stat.ML cs.LG math.PR

    Understanding the bias-variance tradeoff of Bregman divergences

    Authors: Ben Adlam, Neha Gupta, Zelda Mariet, Jamie Smith

    Abstract: This paper builds upon the work of Pfau (2013), which generalized the bias variance tradeoff to any Bregman divergence loss function. Pfau (2013) showed that for Bregman divergences, the bias and variances are defined with respect to a central label, defined as the mean of the label variable, and a central prediction, of a more complex form. We show that, similarly to the label, the central predic… ▽ More

    Submitted 9 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  5. arXiv:2110.03360  [pdf, other

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio… ▽ More

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  6. arXiv:2109.08215  [pdf, other

    cs.LG stat.ML

    Pre-trained Gaussian processes for Bayesian optimization

    Authors: Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

    Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs o… ▽ More

    Submitted 6 July, 2022; v1 submitted 16 September, 2021; originally announced September 2021.

  7. Population-Based Black-Box Optimization for Biological Sequence Design

    Authors: Christof Angermueller, David Belanger, Andreea Gane, Zelda Mariet, David Dohan, Kevin Murphy, Lucy Colwell, D Sculley

    Abstract: The use of black-box optimization for the design of new biological sequences is an emerging research area with potentially revolutionary impact. The cost and latency of wet-lab experiments requires methods that find good sequences in few experimental rounds of large batches of sequences--a setting that off-the-shelf black-box optimization methods are ill-equipped to handle. We find that the perfor… ▽ More

    Submitted 10 July, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119, 2020

  8. arXiv:2002.09927  [pdf, other

    cs.LG stat.ML

    Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling

    Authors: Setareh Ariafar, Zelda Mariet, Ehsan Elhamifar, Dana Brooks, Jennifer Dy, Jasper Snoek

    Abstract: Many contemporary machine learning models require extensive tuning of hyperparameters to perform well. A variety of methods, such as Bayesian optimization, have been developed to automate and expedite this process. However, tuning remains extremely costly as it typically requires repeatedly fully training models. We propose to accelerate the Bayesian optimization approach to hyperparameter tuning… ▽ More

    Submitted 23 February, 2020; originally announced February 2020.

  9. arXiv:1901.02051  [pdf, other

    stat.ML cs.LG

    DPPNet: Approximating Determinantal Point Processes with Deep Networks

    Authors: Zelda Mariet, Yaniv Ovadia, Jasper Snoek

    Abstract: Determinantal Point Processes (DPPs) provide an elegant and versatile way to sample sets of items that balance the point-wise quality with the set-wise diversity of selected items. For this reason, they have gained prominence in many machine learning applications that rely on subset selection. However, sampling from a DPP over a ground set of size $N$ is a costly operation, requiring in general an… ▽ More

    Submitted 7 January, 2019; originally announced January 2019.

  10. arXiv:1805.03714  [pdf, other

    cs.LG cs.AI stat.ML

    Foundations of Sequence-to-Sequence Modeling for Time Series

    Authors: Vitaly Kuznetsov, Zelda Mariet

    Abstract: The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to cl… ▽ More

    Submitted 26 February, 2019; v1 submitted 9 May, 2018; originally announced May 2018.

    Comments: To appear at AISTATS 2019

  11. arXiv:1605.08374  [pdf, other

    cs.LG cs.AI stat.ML

    Kronecker Determinantal Point Processes

    Authors: Zelda Mariet, Suvrit Sra

    Abstract: Determinantal Point Processes (DPPs) are probabilistic models over all subsets a ground set of $N$ items. They have recently gained prominence in several applications that rely on "diverse" subsets. However, their applicability to large problems is still limited due to the $\mathcal O(N^3)$ complexity of core tasks such as sampling and learning. We enable efficient sampling and learning for DPPs b… ▽ More

    Submitted 26 May, 2016; originally announced May 2016.