Showing 1–50 of 117 results for author: De, S

  1. arXiv:2404.07141  [pdf, other]

    stat.ME math.ST

    High-dimensional copula-based Wasserstein dependence

    Authors: Steven De Keyser, Irene Gijbels

    Abstract: We generalize 2-Wasserstein dependence coefficients to measure dependence between a finite number of random vectors. This generalization includes theoretical properties, and in particular focuses on an interpretation of maximal dependence and an asymptotic normality result for a proposed semi-parametric estimator under a Gaussian copula assumption. In addition, we discuss general axioms for depend… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

  2. arXiv:2306.12674  [pdf, other]

    stat.ME stat.AP

    Mapping poverty at multiple geographical scales

    Authors: Silvia De Nicolò, Enrico Fabrizi, Aldo Gardini

    Abstract: Poverty mapping is a powerful tool to study the geography of poverty. The choice of the spatial resolution is central as poverty measures defined at a coarser level may mask their heterogeneity at finer levels. We introduce a small area multi-scale approach integrating survey and remote sensing data that leverages information at different spatial resolutions and accounts for hierarchical dependenc… ▽ More

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 22 pages, 7 figures

  3. arXiv:2306.05315  [pdf, other]

    stat.ME math.ST

    Large-scale adaptive multiple testing for sequential data controlling false discovery and nondiscovery rates

    Authors: Rahul Roy, Shyamal K. De, Subir Kumar Bhandari

    Abstract: In modern scientific experiments, we frequently encounter data that have large dimensions, and in some experiments, such high dimensional data arrive sequentially rather than full data being available all at a time. We develop multiple testing procedures with simultaneous control of false discovery and nondiscovery rates when $m$-variate data vectors $\mathbf{X}_1, \mathbf{X}_2, \dots$ are observe… ▽ More

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 44 pages, 4 figures, 2 tables

  4. arXiv:2305.16530  [pdf, other]

    stat.ML cs.LG math.NA

    Bi-fidelity Variational Auto-encoder for Uncertainty Quantification

    Authors: Nuojin Cheng, Osman Asif Malik, Subhayan De, Stephen Becker, Alireza Doostan

    Abstract: Quantifying the uncertainty of quantities of interest (QoIs) from physical systems is a primary objective in model validation. However, achieving this goal entails balancing the need for computational efficiency with the requirement for numerical accuracy. To address this trade-off, we propose a novel bi-fidelity formulation of variational auto-encoders (BF-VAE) designed to estimate the uncertaint… ▽ More

    Submitted 17 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Journal ref: Computer Methods in Applied Mechanics and Engineering (CMAME), Volume 421, 1 March 2024, 116793

  5. arXiv:2304.12609  [pdf, other]

    stat.ML cs.LG

    A Bi-fidelity DeepONet Approach for Modeling Uncertain and Degrading Hysteretic Systems

    Authors: Subhayan De, Patrick T. Brewick

    Abstract: Nonlinear systems, such as those with degrading hysteretic behavior, are often encountered in engineering applications. In addition, due to the ubiquitous presence of uncertainty, the modeling of such systems becomes increasingly difficult. On the other hand, datasets from pristine models developed without knowing the nature of the degrading effects can be easily obtained. In this paper, we use datas… ▽ More

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: 23 pages, 15 figures

  6. arXiv:2303.16151  [pdf, other]

    q-fin.ST cs.LG econ.EM stat.ML

    Forecasting Large Realized Covariance Matrices: The Benefits of Factor Models and Shrinkage

    Authors: Rafael Alves, Diego S. de Brito, Marcelo C. Medeiros, Ruy M. Ribeiro

    Abstract: We propose a model to forecast large realized covariance matrices of returns, applying it to the constituents of the S&P 500 daily. To address the curse of dimensionality, we decompose the return covariance matrix using standard firm-level factors (e.g., size, value, and profitability) and use sectoral restrictions in the residual covariance matrix. This restricted model is then estimated using v… ▽ More

    Submitted 22 March, 2023; originally announced March 2023.

  7. arXiv:2302.13861  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Differentially Private Diffusion Models Generate Useful Synthetic Images

    Authors: Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L. Smith, Olivia Wiles, Borja Balle

    Abstract: The ability to generate privacy-preserving synthetic versions of sensitive image datasets could unlock numerous ML applications currently constrained by data availability. Due to their astonishing image generation quality, diffusion models are a prime candidate for generating high-quality synthetic data. However, recent studies have found that, by default, the outputs of some diffusion models do n… ▽ More

    Submitted 27 February, 2023; originally announced February 2023.

  8. arXiv:2302.13611  [pdf, other]

    math.ST stat.ME

    Parametric dependence between random vectors via copula-based divergence measures

    Authors: Steven De Keyser, Irène Gijbels

    Abstract: This article proposes copula-based dependence quantification between multiple groups of random variables of possibly different sizes via the family of $\Phi$-divergences. An axiomatic framework for this purpose is provided, after which we focus on the absolutely continuous setting assuming copula densities exist. We consider parametric and semi-parametric frameworks, discuss estimation procedures,… ▽ More

    Submitted 27 February, 2023; originally announced February 2023.

  9. arXiv:2302.10861  [pdf, other]

    stat.ME stat.AP

    Estimating the optimal time to perform a PET-PSMA exam in prostatectomized patients based on data from clinical practice

    Authors: Martina Amongero, Gianluca Mastrantonio, Stefano De Luca, Mauro Gasparini

    Abstract: Prostatectomized patients are at risk of resurgence: this is the reason why, during a follow-up period, they are monitored for PSA growth, an indicator of tumor progression. The presence of tumors can be evaluated with an expensive exam, called PET-PSMA (Positron Emission Tomography with Prostate-Specific Membrane Antigen). But, to optimize the benefit/risk ratio, patients should be referred to th… ▽ More

    Submitted 21 February, 2023; originally announced February 2023.

  10. arXiv:2211.04915  [pdf, other]

    stat.AP

    Inferring Mobility of Care Travel Behavior From Transit Origin-Destination Data

    Authors: Daniela Shuman, Awad Abdelhalim, Anson F Stewart, Kayleigh B Campbell, Mira Patel, Ines Sanchez de Madariaga, Jinhua Zhao

    Abstract: There are substantial differences in travel behavior by gender on public transit. Studies have concluded that these differences are largely attributable to household responsibilities typically falling disproportionately on women, leading to women being more likely to utilize transit for purposes referred to by the umbrella concept of "mobility of care". In contrast to past studies that have quanti… ▽ More

    Submitted 10 April, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

    Comments: Updated reference formatting and discussion points

  11. arXiv:2210.13028  [pdf, other]

    cs.CR cs.AI stat.AP

    Generalised Likelihood Ratio Testing Adversaries through the Differential Privacy Lens

    Authors: Georgios Kaissis, Alexander Ziller, Stefan Kolek Martinez de Azagra, Daniel Rueckert

    Abstract: Differential Privacy (DP) provides tight upper bounds on the capabilities of optimal adversaries, but such adversaries are rarely encountered in practice. Under the hypothesis testing/membership inference interpretation of DP, we examine the Gaussian mechanism and relax the usual assumption of a Neyman-Pearson-Optimal (NPO) adversary to a Generalized Likelihood Test (GLRT) adversary. This mild rel… ▽ More

    Submitted 24 October, 2022; originally announced October 2022.

  12. Small Area Estimation of Inequality Measures using Mixtures of Beta

    Authors: Silvia De Nicolò, Maria Rosaria Ferrante, Silvia Pacei

    Abstract: Economic inequalities referring to specific regions are crucial in deepening spatial heterogeneity. Income surveys are generally planned to produce reliable estimates at countries or macroregion levels, thus we implement a small area model for a set of inequality measures (Gini, Relative Theil and Atkinson indexes) to obtain microregion estimates. Considering that inequality estimators are unit-in… ▽ More

    Submitted 15 September, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

    Comments: 29 pages, 8 figures, 2 tables

    Journal ref: Journal of the Royal Statistical Society Series A: Statistics in Society (2024)

  13. arXiv:2207.07001  [pdf, other]

    astro-ph.CO stat.AP

    SCONCE: A cosmic web finder for spherical and conic geometries

    Authors: Yikun Zhang, Rafael S. de Souza, Yen-Chi Chen

    Abstract: The latticework structure known as the cosmic web provides a valuable insight into the assembly history of large-scale structures. Despite the variety of methods to identify the cosmic web structures, they mostly rely on the assumption that galaxies are embedded in a Euclidean geometric space. Here we present a novel cosmic web identifier called SCONCE (Spherical and CONic Cosmic wEb finder) that… ▽ More

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: 20 pages, 9 figures, 2 tables

  14. arXiv:2204.13650  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Unlocking High-Accuracy Differentially Private Image Classification through Scale

    Authors: Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

    Abstract: Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method for deep learning, realizes this protection by injecting noise during training. However previous works have found th… ▽ More

    Submitted 16 June, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

  15. arXiv:2204.00997  [pdf, other]

    stat.ML cs.LG

    Bi-fidelity Modeling of Uncertain and Partially Unknown Systems using DeepONets

    Authors: Subhayan De, Matthew Reynolds, Malik Hassanaly, Ryan N. King, Alireza Doostan

    Abstract: Recent advances in modeling large-scale complex physical systems have shifted research focuses towards data-driven techniques. However, generating datasets by simulating complex systems can require significant computational resources. Similarly, acquiring experimental datasets can prove difficult as well. For these systems, often computationally inexpensive, but in general inaccurate, models, know… ▽ More

    Submitted 18 August, 2022; v1 submitted 3 April, 2022; originally announced April 2022.

    Comments: 20 pages, 15 figures

  16. arXiv:2203.03304  [pdf, other]

    cs.LG stat.ML

    Regularising for invariance to data augmentation improves supervised learning

    Authors: Aleksander Botev, Matthias Bauer, Soham De

    Abstract: Data augmentation is used in machine learning to make the classifier invariant to label-preserving transformations. Usually this invariance is only encouraged implicitly by including a single augmented input during training. However, several works have recently shown that using multiple augmentations per input can improve generalisation or can be used to incorporate invariances more explicitly. In… ▽ More

    Submitted 7 March, 2022; originally announced March 2022.

  17. arXiv:2109.12953  [pdf, other]

    stat.AP

    Non-destructive methods for assessing tree fiber length distributions in standing trees

    Authors: Sara Sjöstedt de Luna, Konrad Abramowicz, Natalya Pya Arnqvist

    Abstract: One of the main concerns of silviculture and forest management focuses on finding fast, cost-efficient and non-destructive ways of measuring wood properties in standing trees. This paper presents an R package \verb+fiberLD+ that provides functions for estimating tree fiber length distributions in the standing tree based on increment core samples. The methods rely on increment core data measured by… ▽ More

    Submitted 27 September, 2021; originally announced September 2021.

  18. arXiv:2107.08950  [pdf, other]

    stat.ME econ.EM

    Mind the Income Gap: Bias Correction of Inequality Estimators in Small-Sized Samples

    Authors: Silvia De Nicolò, Maria Rosaria Ferrante, Silvia Pacei

    Abstract: Income inequality estimators are biased in small samples, leading generally to an underestimation. This aspect deserves particular attention when estimating inequality in small domains and performing small area estimation at the area level. We propose a bias correction framework for a large class of inequality measures comprising the Gini Index, the Generalized Entropy and the Atkinson index famil… ▽ More

    Submitted 10 May, 2023; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: 21 pages, 4 figures

  19. arXiv:2107.06223  [pdf]

    stat.AP stat.OT

    Impact of heat waves and cold spells on cause-specific mortality in the city of Sao Paulo, Brazil

    Authors: Sara Lopes de Moraes, Ricardo Almendra, Ligia Vizeu Barrozo

    Abstract: The impact of heat waves and cold spells on mortality has become a major public health problem worldwide, especially among older adults living in low- to middle-income countries. This study aimed to investigate the effects of heat waves and cold spells under different definitions on cause-specific mortality among people aged 65 years and over in Sao Paulo from 2006 to 2015. A quasi-Poisson general… ▽ More

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: 28 pages, 2 tables, and 4 figures

  20. Neural Network Training Using $\ell_1$-Regularization and Bi-fidelity Data

    Authors: Subhayan De, Alireza Doostan

    Abstract: With the capability of accurately representing a functional relationship between the inputs of a physical system's model and output quantities of interest, neural networks have become popular for surrogate modeling in scientific applications. However, as these networks are over-parameterized, their training often requires a large amount of data. To prevent overfitting and improve generalization er… ▽ More

    Submitted 1 June, 2021; v1 submitted 27 May, 2021; originally announced May 2021.

    Comments: 28 pages, 14 figures

  21. arXiv:2105.02083  [pdf, other]

    math.ST cs.IT stat.ML

    AdaBoost and robust one-bit compressed sensing

    Authors: Geoffrey Chinot, Felix Kuchelmeister, Matthias Löffler, Sara van de Geer

    Abstract: This paper studies binary classification in robust one-bit compressed sensing with adversarial errors. It is assumed that the model is overparameterized and that the parameter of interest is effectively sparse. AdaBoost is considered, and, through its relation to the max-$\ell_1$-margin-classifier, prediction error bounds are derived. The developed theory is general and allows for heavy-tailed fea… ▽ More

    Submitted 8 December, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

    Comments: 40 pages, 4 figures, code available at https://github.com/Felix-127/Adaboost-and-robust-one-bit-compressed-sensing, extended results to features that satisfy weak-moment and anti-concentration assumption

    MSC Class: 62H30 (Primary); 94A12 (Secondary)

  22. arXiv:2104.00777  [pdf]

    q-bio.PE stat.OT

    Impact of climate change on West Nile virus distribution in South America

    Authors: Camila Lorenz, Thiago Salomao de Azevedo, Francisco Chiaravalloti-Neto

    Abstract: West Nile virus (WNV) is a vector-borne pathogen of global relevance and is currently the most widely distributed flavivirus of encephalitis worldwide. This virus infects birds, humans, horses, and other mammals, and its transmission cycle occurs in urban and rural areas. Climate conditions have direct and indirect impacts on vector abundance and virus dynamics within the mosquito. The significanc… ▽ More

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: 25 pages, 5 figures and 1 table

  23. arXiv:2103.00594  [pdf]

    stat.AP

    Examining socioeconomic factors to understand the hospital case-fatality rates of COVID-19 in the city of Sao Paulo, Brazil

    Authors: Camila Lorenz, Patricia Marques Moralejo Bermudi, Marcelo Antunes Failla, Breno Souza de Aguiar, Tatiana Natasha Toporcov, Francisco Chiaravalloti Neto, Ligia Vizeu Barrozo

    Abstract: Understanding differences in hospital case-fatality rates (HCFRs) of coronavirus disease (COVID-19) may help evaluate its severity and the capacity of the healthcare system to reduce mortality. We examined the variability in HCFRs of COVID-19 in relation to spatial inequalities in socioeconomic factors across the city of Sao Paulo, Brazil. We found that HCFRs were higher for men and for individual… ▽ More

    Submitted 28 February, 2021; originally announced March 2021.

    Comments: 10 pages, 1 figure, 1 table

  24. arXiv:2102.06171  [pdf, other]

    cs.CV cs.LG stat.ML

    High-Performance Large-Scale Image Recognition Without Normalization

    Authors: Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan

    Abstract: Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for l… ▽ More

    Submitted 11 February, 2021; originally announced February 2021.

  25. arXiv:2101.12176  [pdf, other]

    cs.LG stat.ML

    On the Origin of Implicit Regularization in Stochastic Gradient Descent

    Authors: Samuel L. Smith, Benoit Dherin, David G. T. Barrett, Soham De

    Abstract: For infinitesimal learning rates, stochastic gradient descent (SGD) follows the path of gradient flow on the full batch loss function. However moderately large learning rates can achieve higher test accuracies, and this generalization benefit is not explained by convergence bounds, since the learning rate which maximizes test accuracy is often larger than the learning rate which minimizes training… ▽ More

    Submitted 28 January, 2021; originally announced January 2021.

    Comments: Accepted as a conference paper at ICLR 2021

  26. arXiv:2101.08692  [pdf, other]

    cs.LG cs.CV stat.ML

    Characterizing signal propagation to close the performance gap in unnormalized ResNets

    Authors: Andrew Brock, Soham De, Samuel L. Smith

    Abstract: Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs. Building on recent theoretical analyses of deep ResNets at initialization, we propose a simple set of analysis tools to… ▽ More

    Submitted 27 January, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: Published as a conference paper at ICLR 2021

  27. arXiv:2012.03854  [pdf, other]

    stat.AP cs.LG econ.EM stat.ML stat.OT

    Forecasting: theory and practice

    Authors: Fotios Petropoulos, Daniele Apiletti, Vassilios Assimakopoulos, Mohamed Zied Babai, Devon K. Barrow, Souhaib Ben Taieb, Christoph Bergmeir, Ricardo J. Bessa, Jakub Bijak, John E. Boylan, Jethro Browell, Claudio Carnevale, Jennifer L. Castle, Pasquale Cirillo, Michael P. Clements, Clara Cordeiro, Fernando Luiz Cyrino Oliveira, Shari De Baets, Alexander Dokumentov, Joanne Ellison, Piotr Fiszeder, Philip Hans Franses, David T. Frazier, Michael Gilliland, M. Sinan Gönül, et al. (55 additional authors not shown)

    Abstract: Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systemati… ▽ More

    Submitted 5 January, 2022; v1 submitted 4 December, 2020; originally announced December 2020.

  28. arXiv:2012.00807  [pdf, ps, other]

    math.ST cs.IT math.NA stat.ML

    On the robustness of minimum norm interpolators and regularized empirical risk minimizers

    Authors: Geoffrey Chinot, Matthias Löffler, Sara van de Geer

    Abstract: This article develops a general theory for minimum norm interpolating estimators and regularized empirical risk minimizers (RERM) in linear models in the presence of additive, potentially adversarial, errors. In particular, no conditions on the errors are imposed. A quantitative bound for the prediction error is given, relating it to the Rademacher complexity of the covariates, the norm of the min… ▽ More

    Submitted 7 October, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: 35 pages

    MSC Class: 62J05

  29. arXiv:2010.10241  [pdf, ps, other]

    stat.ML cs.CV cs.LG

    BYOL works even without batch statistics

    Authors: Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko

    Abstract: Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach for image representation. From an augmented view of an image, BYOL trains an online network to predict a target network representation of a different augmented view of the same image. Unlike contrastive methods, BYOL does not explicitly use a repulsion term built from negative pairs in its training objective. Yet, it avoids co… ▽ More

    Submitted 20 October, 2020; originally announced October 2020.

  30. arXiv:2008.09083  [pdf, other]

    stat.ME stat.ML

    Exact Tests for Offline Changepoint Detection in Multichannel Binary and Count Data with Application to Networks

    Authors: Shyamal K. De, Soumendu Sundar Mukherjee

    Abstract: We consider offline detection of a single changepoint in binary and count time-series. We compare exact tests based on the cumulative sum (CUSUM) and the likelihood ratio (LR) statistics, and a new proposal that combines exact two-sample conditional tests with multiplicity correction, against standard asymptotic tests based on the Brownian bridge approximation to the CUSUM statistic. We see empiri… ▽ More

    Submitted 20 August, 2020; originally announced August 2020.

    Comments: 31 pages, 9 figures, 8 tables

  31. arXiv:2008.04598  [pdf, other]

    physics.comp-ph stat.ML

    Uncertainty Quantification of Locally Nonlinear Dynamical Systems using Neural Networks

    Authors: Subhayan De

    Abstract: Models are often given in terms of differential equations to represent physical systems. In the presence of uncertainty, accurate prediction of the behavior of these systems using the models requires understanding the effect of uncertainty in the response. In uncertainty quantification, statistics such as mean and variance of the response of these physical systems are sought. To estimate these sta… ▽ More

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: 26 pages, 20 figures

  32. Spatiotemporal dynamic of COVID-19 mortality in the city of Sao Paulo, Brazil: shifting the high risk from the best to the worst socio-economic conditions

    Authors: Patricia Marques Moralejo Bermudi, Camila Lorenz, Breno Souza de Aguiar, Marcelo Antunes Failla, Ligia Vizeu Barrozo, Francisco Chiaravalloti-Neto

    Abstract: Currently, Brazil has one of the fastest increasing COVID-19 epidemics in the world, that has caused at least 94 thousand confirmed deaths until now. The city of Sao Paulo is particularly vulnerable because it is the most populous in the country. Analyzing the spatiotemporal dynamics of COVID-19 is important to help the urgent need to integrate better actions to face the pandemic. Thus, this study… ▽ More

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: 22 pages, 6 figures, 2 tables, 3 supplementary materials

  33. arXiv:2007.06847  [pdf, other]

    q-bio.BM cs.CE cs.LG stat.ML

    Sequence-guided protein structure determination using graph convolutional and recurrent networks

    Authors: Po-Nan Li, Saulo H. P. de Oliveira, Soichi Wakatsuki, Henry van den Bedem

    Abstract: Single particle, cryogenic electron microscopy (cryo-EM) experiments now routinely produce high-resolution data for large proteins and their complexes. Building an atomic model into a cryo-EM density map is challenging, particularly when no structure for the target protein is known a priori. Existing protocols for this type of task often rely on significant human intervention and can take hours to… ▽ More

    Submitted 2 September, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: 6 pages, 5 figures; accepted to IEEE BIBE 2020

  34. arXiv:2007.01073  [pdf, other]

    stat.ML cs.LG

    Accurate Characterization of Non-Uniformly Sampled Time Series using Stochastic Differential Equations

    Authors: Stijn de Waele

    Abstract: Non-uniform sampling arises when an experimenter does not have full control over the sampling characteristics of the process under investigation. Moreover, it is introduced intentionally in algorithms such as Bayesian optimization and compressive sensing. We argue that Stochastic Differential Equations (SDEs) are especially well-suited for characterizing second order moments of such time series. W… ▽ More

    Submitted 2 July, 2020; originally announced July 2020.

  35. arXiv:2007.00254  [pdf, other]

    stat.ML cs.LG q-fin.ST

    Construction of confidence interval for a univariate stock price signal predicted through Long Short Term Memory Network

    Authors: Shankhyajyoti De, Arabin Kumar Dey, Deepak Gauda

    Abstract: In this paper, we show an innovative way to construct bootstrap confidence interval of a signal estimated based on a univariate LSTM model. We take three different types of bootstrap methods for dependent set up. We prescribe some useful suggestions to select the optimal block length while performing the bootstrapping of the sample. We also propose a benchmark to compare the confidence interval me… ▽ More

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: 14 pages, 11 figures

  36. arXiv:2006.15081  [pdf, other]

    cs.LG stat.ML

    On the Generalization Benefit of Noise in Stochastic Gradient Descent

    Authors: Samuel L. Smith, Erich Elsen, Soham De

    Abstract: It has long been argued that minibatch stochastic gradient descent can generalize better than large batch gradient descent in deep neural networks. However recent papers have questioned this claim, arguing that this effect is simply a consequence of suboptimal hyperparameter tuning or insufficient compute budgets when the batch size is large. In this paper, we perform carefully designed experiment… ▽ More

    Submitted 26 June, 2020; originally announced June 2020.

    Comments: Camera-ready version of ICML 2020

  37. arXiv:2006.04295  [pdf, other]

    stat.ML cs.LG

    Efficient MCMC Sampling for Bayesian Matrix Factorization by Breaking Posterior Symmetries

    Authors: Saibal De, Hadi Salehi, Alex Gorodetsky

    Abstract: Bayesian low-rank matrix factorization techniques have become an essential tool for relational data analysis and matrix completion. A standard approach is to assign zero-mean Gaussian priors on the columns or rows of factor matrices to create a conjugate system. This choice of prior leads to simple implementations; however it also causes symmetries in the posterior distribution that can severely r… ▽ More

    Submitted 10 November, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

  38. arXiv:2006.04222  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

    Authors: Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha

    Abstract: Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities. Our method aims to leverage these commonalities by asking the question: ``What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?… ▽ More

    Submitted 11 June, 2021; v1 submitted 7 June, 2020; originally announced June 2020.

    Comments: ICML 2021 Camera Ready

  39. arXiv:2005.08583  [pdf, ps, other]

    astro-ph.CO astro-ph.IM stat.AP stat.CO

    Ridges in the Dark Energy Survey for cosmic trough identification

    Authors: Ben Moews, Morgan A. Schmitz, Andrew J. Lawler, Joe Zuntz, Alex I. Malz, Rafael S. de Souza, Ricardo Vilalta, Alberto Krone-Martins, Emille E. O. Ishida

    Abstract: Cosmic voids and their corresponding redshift-projected mass densities, known as troughs, play an important role in our attempt to model the large-scale structure of the Universe. Understanding these structures enables us to compare the standard model with alternative cosmologies, constrain the dark energy equation of state, and distinguish between different gravitational theories. In this paper,… ▽ More

    Submitted 14 November, 2022; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: 12 pages, 5 figures, accepted for publication in MNRAS

    MSC Class: 85A40; 62G07; 62P35; 85A35

  40. arXiv:2005.07062  [pdf, other]

    cs.LG stat.AP stat.ML

    Simulation-Based Inference for Global Health Decisions

    Authors: Christian Schroeder de Witt, Bradley Gram-Hansen, Nantas Nardelli, Andrew Gambardella, Rob Zinkov, Puneet Dokania, N. Siddharth, Ana Belen Espinosa-Gonzalez, Ara Darzi, Philip Torr, Atılım Güneş Baydin

    Abstract: The COVID-19 pandemic has highlighted the importance of in-silico epidemiological modelling in predicting the dynamics of infectious diseases to inform health policy and decision makers about suitable prevention and containment strategies. Work in this setting involves solving challenging inference and control problems in individual-based models of ever increasing complexity. Here we discuss recen… ▽ More

    Submitted 14 May, 2020; originally announced May 2020.

    Journal ref: ICML Workshop on Machine Learning for Global Health, Thirty-Seventh International Conference on Machine Learning (ICML 2020)

  41. arXiv:2004.06833  [pdf, ps, other]

    eess.AS cs.LG stat.ML

    Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSS Challenge

    Authors: Saturnino Luz, Fasih Haider, Sofia de la Fuente, Davida Fromm, Brian MacWhinney

    Abstract: The ADReSS Challenge at INTERSPEECH 2020 defines a shared task through which different approaches to the automated recognition of Alzheimer's dementia based on spontaneous speech can be compared. ADReSS provides researchers with a benchmark speech dataset which has been acoustically pre-processed and balanced in terms of age and gender, defining two cognitive assessment tasks, namely: the Alzheime… ▽ More

    Submitted 5 August, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

    Comments: To appear in the Proceedings of INTERSPEECH 2020, Oct 2020, Shanghai, China

  42. arXiv:2003.12011  [pdf]

    eess.SP cs.LG stat.ML

    Adaptive machine learning strategies for network calibration of IoT smart air quality monitoring devices

    Authors: Saverio De Vito, Girolamo Di Francia, Elena Esposito, Sergio Ferlito, Fabrizio Formisano, Ettore Massera

    Abstract: Air Quality Multi-sensor Systems (AQMS) are IoT devices based on low-cost chemical microsensor arrays that have recently been shown to be capable of providing relatively accurate quantitative estimates of air pollutants. Their availability permits the deployment of pervasive Air Quality Monitoring (AQM) networks that will solve the geographical sparseness issue that affects the current network of AQ Regulatory Monitori…

    Submitted 24 March, 2020; originally announced March 2020.

    Comments: Submitted to Pattern Recognition Letters

  43. arXiv:2003.08839  [pdf, other]

    cs.LG cs.MA stat.ML

    Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

    Authors: Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson

    Abstract: In many real-world settings, a team of agents must coordinate its behaviour while acting in a decentralised fashion. At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised l…

    Submitted 27 August, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: Extended version of the ICML 2018 conference paper (arXiv:1803.11485)

    Journal ref: Journal of Machine Learning Research 21(178):1-51, 2020

  44. arXiv:2003.06709  [pdf, other]

    cs.LG cs.AI stat.ML

    FACMAC: Factored Multi-Agent Centralised Policy Gradients

    Authors: Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson

    Abstract: We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces. Like MADDPG, a popular multi-agent actor-critic method, our approach uses deep deterministic policy gradients to learn policies. However, FACMAC learns a centralised but factored critic, which combines per-agent utilit…

    Submitted 7 May, 2021; v1 submitted 14 March, 2020; originally announced March 2020.

  45. arXiv:2002.10444  [pdf, other]

    cs.LG cs.CV stat.ML

    Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks

    Authors: Soham De, Samuel L. Smith

    Abstract: Batch normalization dramatically increases the largest trainable depth of residual networks, and this benefit has been crucial to the empirical success of deep residual networks on a wide range of benchmarks. We show that this key benefit arises because, at initialization, batch normalization downscales the residual branch relative to the skip connection, by a normalizing factor on the order of th…

    Submitted 9 December, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: Camera-ready version of NeurIPS 2020

  46. arXiv:2002.04495  [pdf, other]

    stat.ML cs.LG stat.CO

    On transfer learning of neural networks using bi-fidelity data for uncertainty propagation

    Authors: Subhayan De, Jolene Britton, Matthew Reynolds, Ryan Skinner, Kenneth Jansen, Alireza Doostan

    Abstract: Due to their high degree of expressiveness, neural networks have recently been used as surrogate models for mapping inputs of an engineering system to outputs of interest. Once trained, neural networks are computationally inexpensive to evaluate and remove the need for repeated evaluations of computationally expensive models in uncertainty quantification applications. However, given the highly par…

    Submitted 11 February, 2020; originally announced February 2020.

  47. arXiv:1912.09621  [pdf]

    cs.LG cs.AI cs.CV eess.IV stat.ML

    Understanding Deep Neural Network Predictions for Medical Imaging Applications

    Authors: Barath Narayanan Narayanan, Manawaduge Supun De Silva, Russell C. Hardie, Nathan K. Kueterman, Redha Ali

    Abstract: Computer-aided detection has been a research area attracting great interest in the past decade. Machine learning algorithms have been utilized extensively for this application as they provide a valuable second opinion to the doctors. Despite several machine learning models being available for medical imaging applications, not many have been implemented in the real world due to the uninterpretable…

    Submitted 19 December, 2019; originally announced December 2019.

    Comments: 20 pages, 28 Figures and 9 Tables

  48. arXiv:1911.12446  [pdf, other]

    cs.LG cs.NE stat.ML

    QubitHD: A Stochastic Acceleration Method for HD Computing-Based Machine Learning

    Authors: Samuel Bosch, Alexander Sanchez de la Cerda, Mohsen Imani, Tajana Simunic Rosing, Giovanni De Micheli

    Abstract: Machine learning algorithms based on brain-inspired hyperdimensional (HD) computing imitate cognition by exploiting statistical properties of high-dimensional vector spaces. It is a promising solution for achieving high energy efficiency in different machine learning tasks, such as classification, semi-supervised learning, and clustering. A weakness of existing HD computing-based ML algorithms is t…

    Submitted 10 October, 2022; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: 8 pages, 5 figures, 3 tables

  49. arXiv:1911.07231  [pdf, other]

    math.ST stat.ML

    Adaptive Rates for Total Variation Image Denoising

    Authors: Francesco Ortelli, Sara van de Geer

    Abstract: We study the theoretical properties of image denoising via total variation penalized least-squares. We define the total variation in terms of the two-dimensional total discrete derivative of the image and show that it gives rise to denoised images that are piecewise constant on rectangular sets. We prove that, if the true image is piecewise constant on just a few rectangular sets, the denoised ima…

    Submitted 26 January, 2021; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 38 pages, 6 figures

    Journal ref: Journal of Machine Learning Research, 21(247), 2020

  50. arXiv:1910.09056  [pdf, other]

    cs.LG cs.AI stat.ML

    Amortized Rejection Sampling in Universal Probabilistic Programming

    Authors: Saeid Naderiparizi, Adam Ścibior, Andreas Munk, Mehrdad Ghadiri, Atılım Güneş Baydin, Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Philip H. S. Torr, Tom Rainforth, Yee Whye Teh, Frank Wood

    Abstract: Naive approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. This is particularly true of importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove fini…

    Submitted 28 March, 2022; v1 submitted 20 October, 2019; originally announced October 2019.

    Comments: AISTATS 2022 camera ready