Showing 1–50 of 146 results for author: Singh, S

  1. arXiv:2403.20313  [pdf, other]

    stat.ME stat.CO

    Towards a turnkey approach to unbiased Monte Carlo estimation of smooth functions of expectations

    Authors: Nicolas Chopin, Francesca R. Crucinio, Sumeetpal S. Singh

    Abstract: Given a smooth function $f$, we develop a general approach to turn Monte Carlo samples with expectation $m$ into an unbiased estimate of $f(m)$. Specifically, we develop estimators that are based on randomly truncating the Taylor series expansion of $f$ and estimating the coefficients of the truncated series. We derive their properties and propose a strategy to set their tuning parameters -- which…

    Submitted 12 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

  2. arXiv:2403.07379  [pdf, other]

    cs.LG cs.CL stat.ML

    Hallmarks of Optimization Trajectories in Neural Networks: Directional Exploration and Redundancy

    Authors: Sidak Pal Singh, Bobby He, Thomas Hofmann, Bernhard Schölkopf

    Abstract: We propose a fresh take on understanding the mechanisms of neural networks by analyzing the rich directional structure of optimization trajectories, represented by their pointwise parameters. Towards this end, we introduce some natural notions of the complexity of optimization trajectories, both qualitative and quantitative, which hallmark the directional nature of optimization in neural networks:…

    Submitted 24 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Preprint, 57 pages

  3. arXiv:2402.15345  [pdf, other]

    cs.LG stat.ML

    Fourier Basis Density Model

    Authors: Alfredo De la Fuente, Saurabh Singh, Johannes Ballé

    Abstract: We introduce a lightweight, flexible and end-to-end trainable probability density model parameterized by a constrained Fourier basis. We assess its performance at approximating a range of multi-modal 1D densities, which are generally difficult to fit. In comparison to the deep factorized model introduced in [1], our model achieves a lower cross entropy at a similar computational budget. In additio…

    Submitted 23 February, 2024; originally announced February 2024.

  4. arXiv:2312.17572  [pdf, other]

    stat.CO math.PR

    Mixing time of the conditional backward sampling particle filter

    Authors: Joona Karjalainen, Anthony Lee, Sumeetpal S. Singh, Matti Vihola

    Abstract: The conditional backward sampling particle filter (CBPF) is a powerful Markov chain Monte Carlo sampler for general state space hidden Markov model smoothing. It was proposed as an improvement over the conditional particle filter, which is known to have an $O(T^2)$ computational time complexity under a general `strong' mixing assumption, where $T$ is the time horizon. We provide the first proof th…

    Submitted 22 February, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

    Comments: 30 pages, 7 figures; revised before submission to a journal

    MSC Class: Primary 60J22; secondary 65C05; 65C40; 65C35; 62M05

  5. arXiv:2310.05719  [pdf, other]

    cs.LG stat.ML

    Transformer Fusion with Optimal Transport

    Authors: Moritz Imfeld, Jacopo Graldi, Marco Giordano, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh

    Abstract: Fusion is a technique for merging multiple independently-trained neural networks in order to combine their capabilities. Past attempts have been restricted to the case of fully-connected, convolutional, and residual networks. This paper presents a systematic approach for fusing two or more transformer-based networks exploiting Optimal Transport to (soft-)align the various architectural components.…

    Submitted 22 April, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Appears at International Conference on Learning Representations (ICLR), 2024. M. Imfeld, J. Graldi, and M. Giordano are the first authors and contributed equally to this work

  6. arXiv:2309.08517  [pdf, ps, other]

    math.PR stat.CO

    On the Forgetting of Particle Filters

    Authors: Joona Karjalainen, Anthony Lee, Sumeetpal S. Singh, Matti Vihola

    Abstract: We study the forgetting properties of the particle filter when its state - the collection of particles - is regarded as a Markov chain. Under a strong mixing assumption on the particle filter's underlying Feynman-Kac model, we find that the particle filter is exponentially mixing, and forgets its initial state in $O(\log N )$ `time', where $N$ is the number of particles and time refers to the numb…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 26 pages

  7. arXiv:2307.09933  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Spuriosity Didn't Kill the Classifier: Using Invariant Predictions to Harness Spurious Features

    Authors: Cian Eastwood, Shashank Singh, Andrei Liviu Nicolicioiu, Marin Vlastelica, Julius von Kügelgen, Bernhard Schölkopf

    Abstract: To avoid failures on out-of-distribution data, recent works have sought to extract features that have an invariant or stable relationship with the label across domains, discarding "spurious" or unstable features whose relationship with the label changes across domains. However, unstable features often carry complementary information that could boost performance if used correctly in the test domain…

    Submitted 8 November, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023 Camera-Ready

  8. arXiv:2307.01928  [pdf, other]

    cs.RO cs.AI stat.AP

    Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

    Authors: Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

    Abstract: Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions. In this work, we present KnowNo, which is a framework for measuring and aligning the uncertainty of LLM-based planners such that they know when they don't know and ask for…

    Submitted 4 September, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

    Comments: Conference on Robot Learning (CoRL) 2023, Oral Presentation

  9. arXiv:2306.15576  [pdf, other]

    stat.ML cs.LG

    PyBADS: Fast and robust black-box optimization in Python

    Authors: Gurjeet Sangra Singh, Luigi Acerbi

    Abstract: PyBADS is a Python implementation of the Bayesian Adaptive Direct Search (BADS) algorithm for fast and robust black-box optimization (Acerbi and Ma 2017). BADS is an optimization algorithm designed to efficiently solve difficult optimization problems where the objective function is rough (non-convex, non-smooth), mildly expensive (e.g., the function evaluation requires more than 0.1 seconds), poss…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: 7 pages, 1 figure. Documentation is available at https://acerbilab.github.io/pybads/ and source code is available at https://github.com/acerbilab/pybads

  10. arXiv:2306.01763  [pdf, other]

    stat.AP cs.AI

    Optimization for truss design using Bayesian optimization

    Authors: Bhawani Sandeep, Surjeet Singh, Sumit Kumar

    Abstract: In this work, geometry optimization of mechanical truss using computer-aided finite element analysis is presented. The shape of the truss is a dominant factor in determining the capacity of load it can bear. At a given parameter space, our goal is to find the parameters of a hull that maximize the load-bearing capacity and also don't yield to the induced stress. We rely on finite element analysis,…

    Submitted 1 July, 2023; v1 submitted 27 May, 2023; originally announced June 2023.

  11. arXiv:2305.09088  [pdf, other]

    cs.LG stat.ML

    The Hessian perspective into the Nature of Convolutional Neural Networks

    Authors: Sidak Pal Singh, Thomas Hofmann, Bernhard Schölkopf

    Abstract: While Convolutional Neural Networks (CNNs) have long been investigated and applied, as well as theorized, we aim to provide a slightly different perspective into their nature -- through the perspective of their Hessian maps. The reason is that the loss Hessian captures the pairwise interaction of parameters and therefore forms a natural ground to probe how the architectural aspects of CNN get mani…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: ICML 2023 conference proceedings

  12. arXiv:2302.10886  [pdf, other]

    cs.LG stat.ML

    Some Fundamental Aspects about Lipschitz Continuity of Neural Networks

    Authors: Grigory Khromov, Sidak Pal Singh

    Abstract: Lipschitz continuity is a crucial functional property of any predictive model, that naturally governs its robustness, generalisation, as well as adversarial vulnerability. Contrary to other works that focus on obtaining tighter bounds and developing different practical strategies to enforce certain Lipschitz properties, we aim to thoroughly examine and characterise the Lipschitz behaviour of Neura…

    Submitted 14 May, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

  13. arXiv:2211.12580  [pdf, other]

    stat.ME

    Quasi-Newton Sequential Monte Carlo

    Authors: Samuel Duffield, Sumeetpal S. Singh

    Abstract: Sequential Monte Carlo samplers represent a compelling approach to posterior inference in Bayesian models, due to being parallelisable and providing an unbiased estimate of the posterior normalising constant. In this work, we significantly accelerate sequential Monte Carlo samplers by adopting the L-BFGS Hessian approximation which represents the state-of-the-art in full-batch optimisation techniq…

    Submitted 22 November, 2022; originally announced November 2022.

  14. arXiv:2211.09981  [pdf, other]

    cs.LG cs.AI stat.ML

    Weighted Ensemble Self-Supervised Learning

    Authors: Yangjun Ruan, Saurabh Singh, Warren Morningstar, Alexander A. Alemi, Sergey Ioffe, Ian Fischer, Joshua V. Dillon

    Abstract: Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art few-shot and supervised learning performance. In this paper, we explore how ensemble methods can improve recent SSL techniques by developing a framewo…

    Submitted 9 April, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted by ICLR 2023

  15. arXiv:2210.16872  [pdf, ps, other]

    cs.LG stat.ML

    Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

    Authors: Dilip Arumugam, Satinder Singh

    Abstract: The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning. As the computation of exact solutions to Bayesian reinforcement-learning problems is intractable, much of the literature has focused on developing suitable approximation algorithms. In this work, before diving into algorithm design, we…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2022

  16. arXiv:2209.04035  [pdf, other]

    stat.AP q-bio.QM

    Shape-based Evaluation of Epidemic Forecasts

    Authors: Ajitesh Srivastava, Satwant Singh, Fiona Lee

    Abstract: Infectious disease forecasting for ongoing epidemics has been traditionally performed, communicated, and evaluated as numerical targets - 1, 2, 3, and 4 week ahead cases, deaths, and hospitalizations. While there is great value in predicting these numerical targets to assess the burden of the disease, we argue that there is also value in communicating the future trend (description of the shape) of…

    Submitted 11 November, 2022; v1 submitted 8 September, 2022; originally announced September 2022.

    Comments: Accepted at the IEEE International Conference on Big Data (IEEE BigData 2022)

  17. arXiv:2207.09944  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Probable Domain Generalization via Quantile Risk Minimization

    Authors: Cian Eastwood, Alexander Robey, Shashank Singh, Julius von Kügelgen, Hamed Hassani, George J. Pappas, Bernhard Schölkopf

    Abstract: Domain generalization (DG) seeks predictors which perform well on unseen test distributions by leveraging data drawn from multiple related training distributions or domains. To achieve this, DG is commonly formulated as an average- or worst-case problem over the set of possible domains. However, predictors that perform well on average lack robustness while predictors that perform well in the worst…

    Submitted 22 August, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022 camera-ready (+ minor corrections)

  18. arXiv:2206.10478  [pdf, other]

    stat.CO stat.AP stat.ME

    De-biasing particle filtering for a continuous time hidden Markov model with a Cox process observation model

    Authors: Ruiyang Jin, Sumeetpal S. Singh, Nicolas Chopin

    Abstract: We develop a (nearly) unbiased particle filtering algorithm for a specific class of continuous-time state-space models, such that (a) the latent process $X_t$ is a linear Gaussian diffusion; and (b) the observations arise from a Poisson process with intensity $λ(X_t)$. The likelihood of the posterior probability density function of the latent process includes an intractable path integral. Our algo…

    Submitted 30 June, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: 34 pages, 14 figures

  19. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  20. arXiv:2206.01454  [pdf, other]

    math.ST cs.IT cs.LG stat.ME stat.ML

    Indirect Active Learning

    Authors: Shashank Singh

    Abstract: Traditional models of active learning assume a learner can directly manipulate or query a covariate $X$ in order to study its relationship with a response $Y$. However, if $X$ is a feature of a complex system, it may be possible only to indirectly influence $X$ by manipulating a control variable $Z$, a scenario we refer to as Indirect Active Learning. Under a nonparametric model of Indirect Active…

    Submitted 21 January, 2023; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: To appear in proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS)

  21. arXiv:2205.13898  [pdf, other]

    stat.CO stat.ME

    Conditional particle filters with bridge backward sampling

    Authors: Santeri Karppinen, Sumeetpal S. Singh, Matti Vihola

    Abstract: Conditional particle filters (CPFs) with backward/ancestor sampling are powerful methods for sampling from the posterior distribution of the latent states of a dynamic model such as a hidden Markov model. However, the performance of these methods deteriorates with models involving weakly informative observations and/or slowly mixing dynamics. Both of these complications arise when sampling finely…

    Submitted 19 June, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

  22. arXiv:2203.12961  [pdf, other]

    stat.CO math.NA stat.ML

    Multilevel Bayesian Deep Neural Networks

    Authors: Neil K. Chada, Ajay Jasra, Kody J. H. Law, Sumeetpal S. Singh

    Abstract: In this article we consider Bayesian inference associated to deep neural networks (DNNs) and in particular, trace-class neural network (TNN) priors which were proposed by Sell et al. [39]. Such priors were developed as more robust alternatives to classical architectures in the context of inference problems. For this work we develop multilevel Monte Carlo (MLMC) methods for such models. MLMC is a p…

    Submitted 20 July, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

  23. arXiv:2203.10037  [pdf, other]

    stat.CO math.PR stat.ME

    On resampling schemes for particle filters with weakly informative observations

    Authors: Nicolas Chopin, Sumeetpal S. Singh, Tomás Soto, Matti Vihola

    Abstract: We consider particle filters with weakly informative observations (or `potentials') relative to the latent state dynamics. The particular focus of this work is on particle filters to approximate time-discretisations of continuous-time Feynman--Kac path integral models -- a scenario that naturally arises when addressing filtering and smoothing problems in continuous time -- but our findings are ind…

    Submitted 9 July, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: 36 pages, 9 figures

    MSC Class: Primary 65C35; secondary 65C05; 65C60; 60J25

  24. arXiv:2203.07337  [pdf, other]

    stat.ML cs.LG

    Phenomenology of Double Descent in Finite-Width Neural Networks

    Authors: Sidak Pal Singh, Aurelien Lucchi, Thomas Hofmann, Bernhard Schölkopf

    Abstract: `Double descent' delineates the generalization behaviour of models depending on the regime they belong to: under- or over-parameterized. The current theoretical understanding behind the occurrence of this phenomenon is primarily based on linear and kernel regression models -- with informal parallels to neural networks via the Neural Tangent Kernel. Therefore such analyses do not adequately capture…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: Published at ICLR 2022

  25. arXiv:2111.03267  [pdf, other]

    cs.LG stat.ML

    Interpretable Personalized Experimentation

    Authors: Han Wu, Sarah Tan, Weiwei Li, Mia Garrard, Adam Obeng, Drew Dimmery, Shaun Singh, Hanson Wang, Daniel Jiang, Eytan Bakshy

    Abstract: Black-box heterogeneous treatment effect (HTE) models are increasingly being used to create personalized policies that assign individuals to their optimal treatments. However, they are difficult to understand, and can be burdensome to maintain in a production environment. In this paper, we present a scalable, interpretable personalized experimentation system, implemented and deployed in production…

    Submitted 5 August, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

    Comments: Camera-ready version for KDD 2022. Previously titled "Distilling Heterogeneity: From Explanations of Heterogeneous Treatment Effect Models to Interpretable Policies". A short version was presented at MIT CODE 2021

  26. Ensemble Kalman Inversion for General Likelihoods

    Authors: Samuel Duffield, Sumeetpal S. Singh

    Abstract: In this letter we generalise Ensemble Kalman inversion techniques to general Bayesian models where previously they were restricted to additive Gaussian likelihoods - all in the difficult setting where the likelihood can be sampled from, but its density not necessarily evaluated.

    Submitted 7 June, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Journal ref: Statistics & Probability Letters 187 (2022)

  27. arXiv:2109.04504  [pdf, other]

    cs.LG cs.AI stat.ML

    Bootstrapped Meta-Learning

    Authors: Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh

    Abstract: Meta-learning empowers artificial intelligence to increase its efficiency by learning how to learn. Unlocking this potential involves overcoming a challenging meta-optimisation problem. We propose an algorithm that tackles this problem by letting the meta-learner teach itself. The algorithm first bootstraps a target from the meta-learner, then optimises the meta-learner by minimising the distance…

    Submitted 16 March, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: Published at ICLR 2022. 37 pages, 19 figures, 9 tables

  28. arXiv:2107.01777  [pdf, other]

    math.ST cs.LG stat.ML

    Optimal Binary Classification Beyond Accuracy

    Authors: Shashank Singh, Justin Khim

    Abstract: The vast majority of statistical theory on binary classification characterizes performance in terms of accuracy. However, accuracy is known in many cases to poorly reflect the practical consequences of classification error, most famously in imbalanced binary classification, where data are dominated by samples from one of two classes. The first part of this paper derives a novel generalization of t…

    Submitted 26 September, 2022; v1 submitted 4 July, 2021; originally announced July 2021.

    Comments: Parts of this paper have been revised from arXiv:2004.04715v2 [math.ST]

  29. arXiv:2106.16225  [pdf, other]

    cs.LG cs.NE math.ST stat.ML

    Analytic Insights into Structure and Rank of Neural Network Hessian Maps

    Authors: Sidak Pal Singh, Gregor Bachmann, Thomas Hofmann

    Abstract: The Hessian of a neural network captures parameter interactions through second-order derivatives of the loss. It is a fundamental object of study, closely tied to various problems in deep learning, including model design, optimization, and generalization. Most prior work has been empirical, typically focusing on low-rank approximations and heuristics that are blind to the network structure. In con…

    Submitted 1 July, 2021; v1 submitted 30 June, 2021; originally announced June 2021.

  30. arXiv:2106.01746  [pdf, ps, other]

    q-bio.QM stat.CO

    Limits of accuracy for parameter estimation and localisation in Single-Molecule Microscopy via sequential Monte Carlo methods

    Authors: A. Marie d'Avigneau, S. S. Singh, R. J. Ober

    Abstract: Assessing the quality of parameter estimates for models describing the motion of single molecules in cellular environments is an important problem in fluorescence microscopy. We consider the fundamental data model, where molecules emit photons at random times and the photons arrive at random locations on the detector according to complex point spread functions (PSFs). The random, non-Gaussian PSF…

    Submitted 14 September, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: 38 pages (inc. 7 pages appendix), 11 figures

    MSC Class: 65C05; 92C55

  31. arXiv:2106.00669  [pdf, other]

    cs.AI cs.LG stat.ML

    Discovering Diverse Nearly Optimal Policies with Successor Features

    Authors: Tom Zahavy, Brendan O'Donoghue, Andre Barreto, Volodymyr Mnih, Sebastian Flennerhag, Satinder Singh

    Abstract: Finding different solutions to the same problem is a key aspect of intelligence associated with creativity and adaptation to novel situations. In reinforcement learning, a set of diverse policies can be useful for exploration, transfer, hierarchy, and robustness. We propose Diverse Successive Policies, a method for discovering policies that are diverse in the space of Successor Features, while ass…

    Submitted 4 January, 2022; v1 submitted 1 June, 2021; originally announced June 2021.

  32. arXiv:2106.00661  [pdf, other]

    cs.AI cs.LG stat.ML

    Reward is enough for convex MDPs

    Authors: Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh

    Abstract: Maximising a cumulative reward function that is Markov and stationary, i.e., defined over state-action pairs and independent of time, is sufficient to capture many kinds of goals in a Markov decision process (MDP). However, not all goals can be captured in this manner. In this paper we study convex MDPs in which goals are expressed as convex functions of the stationary distribution and show that t…

    Submitted 2 June, 2023; v1 submitted 1 June, 2021; originally announced June 2021.

  33. arXiv:2103.09017  [pdf, other]

    stat.ME stat.CO

    Gradient-Based Markov Chain Monte Carlo for Bayesian Inference With Non-Differentiable Priors

    Authors: Jacob Vorstrup Goldman, Torben Sell, Sumeetpal Sidhu Singh

    Abstract: The use of non-differentiable priors in Bayesian statistics has become increasingly popular, in particular in Bayesian imaging analysis. Current state of the art methods are approximate in the sense that they replace the posterior with a smooth approximation via Moreau-Yosida envelopes, and apply gradient-based discretized diffusions to sample from the resulting distribution. We characterize the e…

    Submitted 16 March, 2021; originally announced March 2021.

    Comments: Accepted for publication by the Journal of the American Statistical Association

  34. arXiv:2101.03079  [pdf, other]

    stat.CO stat.ME

    Spatiotemporal blocking of the bouncy particle sampler for efficient inference in state space models

    Authors: Jacob Vorstrup Goldman, Sumeetpal Sidhu Singh

    Abstract: We propose a novel blocked version of the continuous-time bouncy particle sampler of [Bouchard-Côté et al., 2018] which is applicable to any differentiable probability density. This alternative implementation is motivated by blocked Gibbs sampling for state space models [Singh et al., 2017] and leads to significant improvement in terms of effective sample size per second, and furthermore, allows f…

    Submitted 9 July, 2021; v1 submitted 8 January, 2021; originally announced January 2021.

    Comments: 22 pages, 5 figures

  35. arXiv:2012.10943  [pdf, other]

    stat.ME stat.CO stat.ML

    Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC

    Authors: Torben Sell, Sumeetpal S. Singh

    Abstract: This paper introduces a new neural network based prior for real valued functions on $\mathbb R^d$ which, by construction, is more easily and cheaply scaled up in the domain dimension $d$ compared to the usual Karhunen-Loève function space prior. The new prior is a Gaussian neural network prior, where each weight and bias has an independent Gaussian prior, but with the key difference that the varia…

    Submitted 8 September, 2022; v1 submitted 20 December, 2020; originally announced December 2020.

    Comments: 24 pages, 21 figures

  36. Online Particle Smoothing with Application to Map-matching

    Authors: Samuel Duffield, Sumeetpal S. Singh

    Abstract: We introduce a novel method for online smoothing in state-space models that utilises a fixed-lag approximation to overcome the well known issue of path degeneracy. Unlike classical fixed-lag techniques that only approximate certain marginals, we introduce an online resampling algorithm, called particle stitching, that converts these marginal samples into a full posterior approximation. We demonstr…

    Submitted 2 August, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Journal ref: IEEE Transactions on Signal Processing 2022

  37. arXiv:2011.05721  [pdf, other]

    stat.AP

    A General Class of New Continuous Mixture Distribution and Application

    Authors: Brijesh P. Singh, Sandeep Singh, Utpal Dhar Das

    Abstract: A generalization of a distribution increases the flexibility particularly in studying of a phenomenon and its properties. Many generalizations of continuous univariate distributions are available in literature. In this study, an investigation is conducted on a distribution and its generalization. Several available generalizations of the distribution are reviewed and recent trends in the constructi…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: 14 pages, 14 figures and 2 tables

    MSC Class: 62Exx

    Journal ref: Journal of Mathematical and Computational Science (2021), Vol. 11, No. 1, 585-602

  38. arXiv:2010.08007  [pdf, other]

    stat.ML cs.IT cs.LG math.OC math.ST

    Continuum-Armed Bandits: A Function Space Perspective

    Authors: Shashank Singh

    Abstract: Continuum-armed bandits (a.k.a., black-box or $0^{th}$-order optimization) involves optimizing an unknown objective function given an oracle that evaluates the function at a query point, with the goal of using as few query points as possible. In the most well-studied case, the objective function is assumed to be Lipschitz continuous and minimax rates of simple and cumulative regrets are known in b…

    Submitted 21 March, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

  39. arXiv:2009.06389  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Neither Private Nor Fair: Impact of Data Imbalance on Utility and Fairness in Differential Privacy

    Authors: Tom Farrand, Fatemehsadat Mireshghallah, Sahib Singh, Andrew Trask

    Abstract: Deployment of deep learning in different fields and industries is growing day by day due to its performance, which relies on the availability of data and compute. Data is often crowd-sourced and contains sensitive information about its contributors, which leaks into models that are trained on it. To achieve rigorous privacy guarantees, differentially private training mechanisms are used. However,…

    Submitted 3 October, 2020; v1 submitted 10 September, 2020; originally announced September 2020.

    Comments: 5 pages, 5 figures

  40. arXiv:2008.05030  [pdf, other]

    cs.LG stat.ML

    Reliable Post hoc Explanations: Modeling Uncertainty in Explainability

    Authors: Dylan Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju

    Abstract: As black box explanations are increasingly being employed to establish model credibility in high-stakes settings, it is important to ensure that these explanations are accurate and reliable. However, prior work demonstrates that explanations generated by state-of-the-art techniques are inconsistent, unstable, and provide very little insight into their correctness and reliability. In addition, thes…

    Submitted 6 November, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

  41. arXiv:2008.02930  [pdf, other]

    cs.LG cs.IR stat.ML

    Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to Cold-Start Search Retrieval

    Authors: Tao Wu, Ellie Ka-In Chio, Heng-Tze Cheng, Yu Du, Steffen Rendle, Dima Kuzmin, Ritesh Agarwal, Li Zhang, John Anderson, Sarvjeet Singh, Tushar Chandra, Ed H. Chi, Wen Li, Ankit Kumar, Xiang Ma, Alex Soares, Nitin Jindal, Pei Cao

    Abstract: Many recent advances in neural information retrieval models, which predict top-K items given a query, learn directly from a large training set of (query, item) pairs. However, they are often insufficient when there are many previously unseen (query, item) combinations, often referred to as the cold start problem. Furthermore, the search system can be biased towards items that are frequently shown…

    Submitted 18 August, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted at CIKM 2020

  42. arXiv:2008.00646  [pdf, other]

    cs.LG stat.ML

    Interpretable Sequence Learning for COVID-19 Forecasting

    Authors: Sercan O. Arik, Chun-Liang Li, Jinsung Yoon, Rajarishi Sinha, Arkady Epshteyn, Long T. Le, Vikas Menon, Shashank Singh, Leyou Zhang, Nate Yoder, Martin Nikoltchev, Yash Sonthalia, Hootan Nakhost, Elli Kanal, Tomas Pfister

    Abstract: We propose a novel approach that integrates machine learning into compartmental disease modeling to predict the progression of COVID-19. Our model is explainable by design as it explicitly shows how different compartments evolve and it uses interpretable encoders to incorporate covariates and improve performance. Explainability is valuable to ensure that the model's forecasts are credible to epide…

    Submitted 13 January, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

  43. arXiv:2007.08433  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-Gradient Reinforcement Learning with an Objective Discovered Online

    Authors: Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

    Abstract: Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an objective, such as Q-learning or policy gradient, that defines its semantics. In this work, we propose an algorithm based on meta-gradient descent that discovers its o…

    Submitted 16 July, 2020; originally announced July 2020.

  44. Anytime Parallel Tempering

    Authors: A. Marie d'Avigneau, S. S. Singh, L. M. Murray

    Abstract: Developing efficient MCMC algorithms is indispensable in Bayesian inference. In parallel tempering, multiple interacting MCMC chains run to more efficiently explore the state space and improve performance. The multiple chains advance independently through local moves, and the performance enhancement steps are exchange moves, where the chains pause to exchange their current sample amongst each othe…

    Submitted 14 September, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: 34 Pages, 10 Figures

    MSC Class: 62-08; 62F15

  45. arXiv:2006.04635  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA stat.ML

    Learning to Play No-Press Diplomacy with Best Response Policy Iteration

    Authors: Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, Yoram Bachrach

    Abstract: Recent advances in deep reinforcement learning (RL) have led to considerable progress in many 2-player zero-sum games, such as Go, Poker and Starcraft. The purely adversarial nature of such games allows for conceptually simple and principled application of RL methods. However real-world settings are many-agent, and agent interactions are complex mixtures of common-interest and competitive aspects.…

    Submitted 4 January, 2022; v1 submitted 8 June, 2020; originally announced June 2020.

  46. arXiv:2006.02595  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Image Augmentations for GAN Training

    Authors: Zhengli Zhao, Zizhao Zhang, Ting Chen, Sameer Singh, Han Zhang

    Abstract: Data augmentations have been widely studied to improve the accuracy and robustness of classifiers. However, the potential of image augmentation in improving GAN models for image synthesis has not been thoroughly investigated in previous studies. In this work, we systematically study the effectiveness of various existing augmentation techniques for GAN training in a variety of settings. We provide…

    Submitted 3 June, 2020; originally announced June 2020.

  47. arXiv:2005.13099  [pdf, other]

    cs.LG cs.CR cs.CV eess.IV stat.ML

    Benchmarking Differentially Private Residual Networks for Medical Imagery

    Authors: Sahib Singh, Harshvardhan Sikka, Sasikanth Kotti, Andrew Trask

    Abstract: In this paper we measure the effectiveness of $ε$-Differential Privacy (DP) when applied to medical imaging. We compare two robust differential privacy mechanisms: Local-DP and DP-SGD and benchmark their performance when analyzing medical imagery records. We analyze the trade-off between the model's accuracy and the level of privacy it guarantees, and also take a closer look to evaluate how useful…

    Submitted 4 September, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: 5 Pages, 4 Figures

  48. arXiv:2004.14340  [pdf, other]

    cs.LG stat.ML

    WoodFisher: Efficient Second-Order Approximation for Neural Network Compression

    Authors: Sidak Pal Singh, Dan Alistarh

    Abstract: Second-order information, in the form of Hessian- or Inverse-Hessian-vector products, is a fundamental tool for solving optimization problems. Recently, there has been significant interest in utilizing this information in the context of deep neural networks; however, relatively little is known about the quality of existing approximations in this context. Our work examines this question, identifies…

    Submitted 25 November, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: NeurIPS 2020

  49. arXiv:2004.08597  [pdf, other]

    math.ST cs.LG stat.ML

    Robust Density Estimation under Besov IPM Losses

    Authors: Ananya Uppal, Shashank Singh, Barnabas Poczos

    Abstract: We study minimax convergence rates of nonparametric density estimation in the Huber contamination model, in which a proportion of the data comes from an unknown outlier distribution. We provide the first results for this problem under a large family of losses, called Besov integral probability metrics (IPMs), that includes $\mathcal{L}^p$, Wasserstein, Kolmogorov-Smirnov, and other common distance…

    Submitted 6 September, 2021; v1 submitted 18 April, 2020; originally announced April 2020.

  50. arXiv:2004.04715  [pdf, other]

    stat.ML cs.LG math.ST

    Multiclass Classification via Class-Weighted Nearest Neighbors

    Authors: Justin Khim, Ziyu Xu, Shashank Singh

    Abstract: We study statistical properties of the k-nearest neighbors algorithm for multiclass classification, with a focus on settings where the number of classes may be large and/or classes may be highly imbalanced. In particular, we consider a variant of the k-nearest neighbor classifier with non-uniform class-weightings, for which we derive upper and minimax lower bounds on accuracy, class-weighted risk,…

    Submitted 3 May, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: 62 pages, 4 figures, 2 tables