Showing 1–50 of 60 results for author: Smith, S

  1. arXiv:2406.07555  [pdf, other]

    stat.CO stat.ME

    Sequential Monte Carlo for Cut-Bayesian Posterior Computation

    Authors: Joseph Mathews, Giri Gopalan, James Gattiker, Sean Smith, Devin Francom

    Abstract: We propose a sequential Monte Carlo (SMC) method to efficiently and accurately compute cut-Bayesian posterior quantities of interest, variations of standard Bayesian approaches constructed primarily to account for model misspecification. We prove finite sample concentration bounds for estimators derived from the proposed method along with a linear tempering extension and apply these results to a r…

    Submitted 8 March, 2024; originally announced June 2024.

    Report number: LA-UR-23-31546

  2. arXiv:2405.15358  [pdf, ps, other]

    stat.ML cs.LG

    Coordinated Multi-Neighborhood Learning on a Directed Acyclic Graph

    Authors: Stephen Smith, Qing Zhou

    Abstract: Learning the structure of causal directed acyclic graphs (DAGs) is useful in many areas of machine learning and artificial intelligence, with wide applications. However, in the high-dimensional setting, it is challenging to obtain good empirical and theoretical results without strong and often restrictive assumptions. Additionally, it is questionable whether all of the variables purported to be in…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  3. arXiv:2404.01100  [pdf, other]

    eess.SY cs.LG math.OC stat.ML

    Finite Sample Frequency Domain Identification

    Authors: Anastasios Tsiamis, Mohamed Abdalmoaty, Roy S. Smith, John Lygeros

    Abstract: We study non-parametric frequency-domain system identification from a finite-sample perspective. We assume an open loop scenario where the excitation input is periodic and consider the Empirical Transfer Function Estimate (ETFE), where the goal is to estimate the frequency response at certain desired (evenly-spaced) frequencies, given input-output samples. We show that under sub-Gaussian colored n…

    Submitted 1 April, 2024; originally announced April 2024.

  4. arXiv:2403.05899  [pdf, other]

    stat.ME cs.LG eess.SP eess.SY

    Online Identification of Stochastic Continuous-Time Wiener Models Using Sampled Data

    Authors: Mohamed Abdalmoaty, Efe C. Balta, John Lygeros, Roy S. Smith

    Abstract: It is well known that ignoring the presence of stochastic disturbances in the identification of stochastic Wiener models leads to asymptotically biased estimators. On the other hand, optimal statistical identification, via likelihood-based methods, is sensitive to the assumptions on the data distribution and is usually based on relatively complex sequential Monte Carlo algorithms. We develop a sim…

    Submitted 9 March, 2024; originally announced March 2024.

  5. arXiv:2401.11804  [pdf, other]

    stat.ME

    Regression Copulas for Multivariate Responses

    Authors: Nadja Klein, Michael Stanley Smith, David Nott, Ryan Chisholm

    Abstract: We propose a novel distributional regression model for a multivariate response vector based on a copula process over the covariate space. It uses the implicit copula of a Gaussian multivariate regression, which we call a ``regression copula''. To allow for large covariate vectors their coefficients are regularized using a novel multivariate extension of the horseshoe prior. Bayesian inference and…

    Submitted 5 March, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

  6. arXiv:2310.03521  [pdf, other]

    stat.ME econ.EM math.ST

    Cutting Feedback in Misspecified Copula Models

    Authors: Michael Stanley Smith, Weichang Yu, David J. Nott, David Frazier

    Abstract: In copula models the marginal distributions and copula function are specified separately. We treat these as two modules in a modular Bayesian inference framework, and propose conducting modified Bayesian inference by "cutting feedback". Cutting feedback limits the influence of potentially misspecified modules in posterior inference. We consider two types of cuts. The first limits the influence of…

    Submitted 27 June, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  7. arXiv:2308.05564  [pdf, other]

    econ.EM cs.LG q-fin.ST stat.CO

    Large Skew-t Copula Models and Asymmetric Dependence in Intraday Equity Returns

    Authors: Lin Deng, Michael Stanley Smith, Worapree Maneesoonthorn

    Abstract: Skew-t copula models are attractive for the modeling of financial data because they allow for asymmetric and extreme tail dependence. We show that the copula implicit in the skew-t distribution of Azzalini and Capitanio (2003) allows for a higher level of pairwise asymmetric dependence than two popular alternative skew-t copulas. Estimation of this copula in high dimensions is challenging, and we…

    Submitted 2 July, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

  8. arXiv:2303.09842  [pdf, ps, other]

    eess.SY stat.ML

    Error Bounds for Kernel-Based Linear System Identification with Unknown Hyperparameters

    Authors: Mingzhou Yin, Roy S. Smith

    Abstract: The kernel-based method has been successfully applied in linear system identification using stable kernel designs. From a Gaussian process perspective, it automatically provides probabilistic error bounds for the identified models from the posterior covariance, which are useful in robust and stochastic control. However, the error bounds require knowledge of the true hyperparameters in the kernel d…

    Submitted 17 March, 2023; originally announced March 2023.

  9. arXiv:2302.13861  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Differentially Private Diffusion Models Generate Useful Synthetic Images

    Authors: Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L. Smith, Olivia Wiles, Borja Balle

    Abstract: The ability to generate privacy-preserving synthetic versions of sensitive image datasets could unlock numerous ML applications currently constrained by data availability. Due to their astonishing image generation quality, diffusion models are a prime candidate for generating high-quality synthetic data. However, recent studies have found that, by default, the outputs of some diffusion models do n…

    Submitted 27 February, 2023; originally announced February 2023.

  10. arXiv:2302.13536  [pdf, other]

    stat.ML cs.LG

    Natural Gradient Hybrid Variational Inference with Application to Deep Mixed Models

    Authors: Weiben Zhang, Michael Stanley Smith, Worapree Maneesoonthorn, Ruben Loaiza-Maya

    Abstract: Stochastic models with global parameters $\bm{\theta}$ and latent variables $\bm{z}$ are common, and variational inference (VI) is popular for their estimation. This paper uses a variational approximation (VA) that comprises a Gaussian with factor covariance matrix for the marginal of $\bm{\theta}$, and the exact conditional posterior of $\bm{z}|\bm{\theta}$. Stochastic optimization for learning the VA only requires g…

    Submitted 27 February, 2023; originally announced February 2023.

  11. arXiv:2302.10322  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

    Authors: Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh

    Abstract: Skip connections and normalisation layers form two standard architectural components that are ubiquitous for the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood. Recent approaches such as Deep Kernel Shaping have made progress towards reducing our reliance on them, using insights from wide NN kernel theory to improve signal propagation in vanilla DNNs (which…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: ICLR 2023

  12. arXiv:2204.13650  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Unlocking High-Accuracy Differentially Private Image Classification through Scale

    Authors: Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

    Abstract: Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent (DP-SGD), the most popular DP training method for deep learning, realizes this protection by injecting noise during training. However previous works have found th…

    Submitted 16 June, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

  13. Infinite-Dimensional Sparse Learning in Linear System Identification

    Authors: Mingzhou Yin, Mehmet Tolga Akan, Andrea Iannelli, Roy S. Smith

    Abstract: Regularized methods have been widely applied to system identification problems without known model structures. This paper proposes an infinite-dimensional sparse learning algorithm based on atomic norm regularization. Atomic norm regularization decomposes the transfer function into first-order atomic models and solves a group lasso problem that selects a sparse set of poles and identifies the corr…

    Submitted 31 August, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: Accepted for presentation at IEEE Conference on Decision and Control 2022

    Journal ref: 2022 IEEE 61st Conference on Decision and Control (CDC), Cancun, Mexico, 2022, pp. 850-855

  14. arXiv:2201.05985  [pdf, other]

    cs.SI cs.LG stat.AP

    Exposing the Obscured Influence of State-Controlled Media: A Causal Estimation of Influence Between Media Outlets Via Quotation Propagation

    Authors: Joseph Schlessinger, Richard Bennet, Jacob Coakwell, Steven T. Smith, Edward K. Kao

    Abstract: This study quantifies influence between media outlets by applying a novel methodology that uses causal effect estimation on networks and transformer language models. We demonstrate the obscured influence of state-controlled outlets over other outlets, regardless of orientation, by analyzing a large dataset of quotations from over 100 thousand articles published by the most prominent European and R…

    Submitted 16 January, 2022; originally announced January 2022.

  15. arXiv:2111.09511  [pdf, ps, other]

    stat.ME stat.CO

    Implicit copula variational inference

    Authors: Michael Stanley Smith, Rubén Loaiza-Maya

    Abstract: Key to effective generic, or "black-box", variational inference is the selection of an approximation to the target density that balances accuracy and speed. Copula models are promising options, but calibration of the approximation can be slow for some choices. Smith et al. (2020) suggest using tractable and scalable "implicit copula" models that are formed by element-wise transformation of the tar…

    Submitted 29 June, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: Abstract has been updated. The abstract of v2 is not up-to-date

  16. arXiv:2111.00782  [pdf]

    stat.AP

    Unpacking uncertainty in the modelling process for energy policy making

    Authors: Samuele Lo Piano, Máté János Lőrincz, Arnald Puy, Steve Pye, Andrea Saltelli, Stefán Thor Smith, Jeroen P. van der Sluijs

    Abstract: This paper explores how the modelling of energy systems may lead to undue closure of alternatives by generating an excess of certainty around some of the possible policy options. We exemplify the problem with two cases: first, the International Institute for Applied Systems Analysis (IIASA) global modelling in the 1980s; and second, the modelling activity undertaken in support of the construction…

    Submitted 1 November, 2021; originally announced November 2021.

    Comments: 39 pages, 2 tables, 3 figures

  17. arXiv:2109.04718  [pdf, ps, other]

    stat.ME econ.EM

    Implicit Copulas: An Overview

    Authors: Michael Stanley Smith

    Abstract: Implicit copulas are the most common copula choice for modeling dependence in high dimensions. This broad class of copulas is introduced and surveyed, including elliptical copulas, skew $t$ copulas, factor copulas, time series copulas and regression copulas. The common auxiliary representation of implicit copulas is outlined, and how this makes them both scalable and tractable for statistical mode…

    Submitted 10 September, 2021; originally announced September 2021.

  18. arXiv:2108.11066  [pdf, other]

    stat.ME stat.CO

    Variational inference for cutting feedback in misspecified models

    Authors: Xuejun Yu, David J. Nott, Michael Stanley Smith

    Abstract: Bayesian analyses combine information represented by different terms in a joint Bayesian model. When one or more of the terms is misspecified, it can be helpful to restrict the use of information from suspect model components to modify posterior inference. This is called "cutting feedback", and both the specification and computation of the posterior for such "cut models" is challenging. In this pa…

    Submitted 24 June, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

  19. arXiv:2102.06171  [pdf, other]

    cs.CV cs.LG stat.ML

    High-Performance Large-Scale Image Recognition Without Normalization

    Authors: Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan

    Abstract: Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for l…

    Submitted 11 February, 2021; originally announced February 2021.

  20. arXiv:2101.12176  [pdf, other]

    cs.LG stat.ML

    On the Origin of Implicit Regularization in Stochastic Gradient Descent

    Authors: Samuel L. Smith, Benoit Dherin, David G. T. Barrett, Soham De

    Abstract: For infinitesimal learning rates, stochastic gradient descent (SGD) follows the path of gradient flow on the full batch loss function. However moderately large learning rates can achieve higher test accuracies, and this generalization benefit is not explained by convergence bounds, since the learning rate which maximizes test accuracy is often larger than the learning rate which minimizes training…

    Submitted 28 January, 2021; originally announced January 2021.

    Comments: Accepted as a conference paper at ICLR 2021

  21. arXiv:2101.08692  [pdf, other]

    cs.LG cs.CV stat.ML

    Characterizing signal propagation to close the performance gap in unnormalized ResNets

    Authors: Andrew Brock, Soham De, Samuel L. Smith

    Abstract: Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs. Building on recent theoretical analyses of deep ResNets at initialization, we propose a simple set of analysis tools to…

    Submitted 27 January, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: Published as a conference paper at ICLR 2021

  22. arXiv:2010.10241  [pdf, ps, other]

    stat.ML cs.CV cs.LG

    BYOL works even without batch statistics

    Authors: Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko

    Abstract: Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach for image representation. From an augmented view of an image, BYOL trains an online network to predict a target network representation of a different augmented view of the same image. Unlike contrastive methods, BYOL does not explicitly use a repulsion term built from negative pairs in its training objective. Yet, it avoids co…

    Submitted 20 October, 2020; originally announced October 2020.

  23. arXiv:2010.01844  [pdf, ps, other]

    stat.ME econ.EM stat.AP stat.CO stat.ML

    Deep Distributional Time Series Models and the Probabilistic Forecasting of Intraday Electricity Prices

    Authors: Nadja Klein, Michael Stanley Smith, David J. Nott

    Abstract: Recurrent neural networks (RNNs) with rich feature vectors of past values can provide accurate point forecasts for series that exhibit complex serial dependence. We propose two approaches to constructing deep time series probabilistic models based on a variant of RNN called an echo state network (ESN). The first is where the output layer of the ESN has stochastic disturbances and a shrinkage prior…

    Submitted 27 May, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Journal ref: Journal of Applied Econometrics (2023), 38(4), 493-511

  24. arXiv:2008.00029  [pdf, other]

    stat.ML cs.LG

    Cold Posteriors and Aleatoric Uncertainty

    Authors: Ben Adlam, Jasper Snoek, Samuel L. Smith

    Abstract: Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set (the "cold posterior" effect). To help interpret this phenomenon, we argue that commonly used priors in Bayesian neural networks can significantly overestimate the aleatoric uncertainty in the labels on many classification datasets. This prob…

    Submitted 31 July, 2020; originally announced August 2020.

    Comments: 5 pages, 3 figures

    Journal ref: ICML workshop on Uncertainty and Robustness in Deep Learning (2020)

  25. arXiv:2006.15081  [pdf, other]

    cs.LG stat.ML

    On the Generalization Benefit of Noise in Stochastic Gradient Descent

    Authors: Samuel L. Smith, Erich Elsen, Soham De

    Abstract: It has long been argued that minibatch stochastic gradient descent can generalize better than large batch gradient descent in deep neural networks. However recent papers have questioned this claim, arguing that this effect is simply a consequence of suboptimal hyperparameter tuning or insufficient compute budgets when the batch size is large. In this paper, we perform carefully designed experiment…

    Submitted 26 June, 2020; originally announced June 2020.

    Comments: Camera-ready version of ICML 2020

  26. arXiv:2006.08287  [pdf, other]

    cs.LG eess.IV stat.ML

    ICAM: Interpretable Classification via Disentangled Representations and Feature Attribution Mapping

    Authors: Cher Bass, Mariana da Silva, Carole Sudre, Petru-Daniel Tudosiu, Stephen M. Smith, Emma C. Robinson

    Abstract: Feature attribution (FA), or the assignment of class-relevance to different locations in an image, is important for many classification problems but is particularly crucial within the neuroscience domain, where accurate mechanistic models of behaviours, or disease, require knowledge of all features discriminative of a trait. At the same time, predicting class relevance from brain images is challen…

    Submitted 16 June, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Submitted to NeurIPS 2020: Neural Information Processing Systems. Keywords: interpretable, classification, feature attribution, domain translation, variational autoencoder, generative adversarial network, neuroimaging

  27. arXiv:2006.05475  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci stat.ML

    Simple and efficient algorithms for training machine learning potentials to force data

    Authors: Justin S. Smith, Nicholas Lubbers, Aidan P. Thompson, Kipton Barros

    Abstract: Machine learning models, trained on data from ab initio quantum simulations, are yielding molecular dynamics potentials with unprecedented accuracy. One limiting factor is the quantity of available training data, which can be expensive to obtain. A quantum simulation often provides all atomic forces, in addition to the total energy of the system. These forces provide much more information…

    Submitted 9 June, 2020; originally announced June 2020.

  28. arXiv:2005.10879  [pdf, other]

    cs.SI cs.LG stat.AP stat.ML

    Automatic Detection of Influential Actors in Disinformation Networks

    Authors: Steven T. Smith, Edward K. Kao, Erika D. Mackin, Danelle C. Shah, Olga Simek, Donald B. Rubin

    Abstract: The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IOs). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing,…

    Submitted 7 January, 2021; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: Proc. Natl. Acad. Sciences U.S.A. Vol. 118, No. 4, e2011216118

  29. arXiv:2005.07430  [pdf, ps, other]

    stat.ME econ.EM

    Fast and Accurate Variational Inference for Models with Many Latent Variables

    Authors: Rubén Loaiza-Maya, Michael Stanley Smith, David J. Nott, Peter J. Danaher

    Abstract: Models with a large number of latent variables are often used to fully utilize the information in big or complex data. However, they can be difficult to estimate using standard approaches, and variational inference methods are a popular alternative. Key to the success of these is the selection of an approximation to the target density that is accurate, tractable and fast to calibrate using optimiz…

    Submitted 18 April, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: Macroeconomic example was replaced by the bigger and more challenging time varying parameter vector autoregression model with stochastic volatility. Microeconomic example was extended to 20,000 individuals and variational subsampling is also implemented for this example. Small microeconomics example now uses 1000 individuals

    MSC Class: 62P20; ACM Class: G.3

  30. arXiv:2002.10444  [pdf, other]

    cs.LG cs.CV stat.ML

    Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks

    Authors: Soham De, Samuel L. Smith

    Abstract: Batch normalization dramatically increases the largest trainable depth of residual networks, and this benefit has been crucial to the empirical success of deep residual networks on a wide range of benchmarks. We show that this key benefit arises because, at initialization, batch normalization downscales the residual branch relative to the skip connection, by a normalizing factor on the order of th…

    Submitted 9 December, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: Camera-ready version of NeurIPS 2020

  31. arXiv:2002.10046  [pdf, other]

    stat.ME math.ST stat.AP stat.CO stat.ML

    Permutation Inference for Canonical Correlation Analysis

    Authors: Anderson M. Winkler, Olivier Renaud, Stephen M. Smith, Thomas E. Nichols

    Abstract: Canonical correlation analysis (CCA) has become a key tool for population neuroimaging, allowing investigation of associations between many imaging and non-imaging measurements. As other variables are often a source of variability not of direct interest, previous work has used CCA on residuals from a model that removes these effects, then proceeded directly to permutation inference. We show that s…

    Submitted 17 June, 2020; v1 submitted 23 February, 2020; originally announced February 2020.

    Comments: 49 pages, 2 figures, 10 tables, 3 algorithms, 119 references

  32. arXiv:2001.01805  [pdf, other]

    stat.CO math.DG

    Geodesically parameterized covariance estimation

    Authors: Antoni Musolas, Steven T. Smith, Youssef Marzouk

    Abstract: Statistical modeling of spatiotemporal phenomena often requires selecting a covariance matrix from a covariance class. Yet standard parametric covariance families can be insufficiently flexible for practical applications, while non-parametric approaches may not easily allow certain kinds of prior knowledge to be incorporated. We propose instead to build covariance families out of geodesic curves.…

    Submitted 23 December, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

  33. arXiv:1909.06134  [pdf, other]

    cs.LG stat.ML

    Deep Adversarial Belief Networks

    Authors: Yuming Huang, Ashkan Panahi, Hamid Krim, Yiyi Yu, Spencer L. Smith

    Abstract: We present a novel adversarial framework for training deep belief networks (DBNs), which includes replacing the generator network in the methodology of generative adversarial networks (GANs) with a DBN and developing a highly parallelizable numerical algorithm for training the resulting architecture in a stochastic manner. Unlike the existing techniques, this framework can be applied to the most g…

    Submitted 25 September, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

  34. arXiv:1908.09482  [pdf, ps, other]

    stat.ME stat.CO stat.ML

    Marginally-calibrated deep distributional regression

    Authors: Nadja Klein, David J. Nott, Michael Stanley Smith

    Abstract: Deep neural network (DNN) regression models are widely used in applications requiring state-of-the-art predictive accuracy. However, until recently there has been little work on accurate uncertainty quantification for predictions from such models. We add to this literature by outlining an approach to constructing predictive distributions that are `marginally calibrated'. This is where the long run…

    Submitted 3 September, 2020; v1 submitted 26 August, 2019; originally announced August 2019.

    Journal ref: Journal of Computational and Graphical Statistics (2020)

  35. Bayesian Variable Selection for Non-Gaussian Responses: A Marginally Calibrated Copula Approach

    Authors: Nadja Klein, Michael Stanley Smith

    Abstract: We propose a new highly flexible and tractable Bayesian approach to undertake variable selection in non-Gaussian regression models. It uses a copula decomposition for the joint distribution of observations on the dependent variable. This allows the marginal distribution of the dependent variable to be calibrated accurately using a nonparametric or other estimator. The family of copulas employed ar…

    Submitted 3 September, 2020; v1 submitted 10 July, 2019; originally announced July 2019.

    Journal ref: Biometrics (2020)

  36. Bayesian Inference for Regression Copulas

    Authors: Michael Stanley Smith, Nadja Klein

    Abstract: We propose a new semi-parametric distributional regression smoother that is based on a copula decomposition of the joint distribution of the vector of response values. The copula is high-dimensional and constructed by inversion of a pseudo regression, where the conditional mean and variance are semi-parametric functions of covariates modeled using regularized basis functions. By integrating out th…

    Submitted 24 January, 2020; v1 submitted 10 July, 2019; originally announced July 2019.

    Comments: Journal of Business & Economic Statistics (2020)

  37. arXiv:1906.03318  [pdf, other]

    stat.ML cs.LG

    Efficient non-conjugate Gaussian process factor models for spike count data using polynomial approximations

    Authors: Stephen L. Keeley, David M. Zoltowski, Yiyi Yu, Jacob L. Yates, Spencer L. Smith, Jonathan W. Pillow

    Abstract: Gaussian Process Factor Analysis (GPFA) has been broadly applied to the problem of identifying smooth, low-dimensional temporal structure underlying large-scale neural recordings. However, spike trains are non-Gaussian, which motivates combining GPFA with discrete observation models for binned spike count data. The drawback to this approach is that GPFA priors are not conjugate to count model like…

    Submitted 5 October, 2020; v1 submitted 7 June, 2019; originally announced June 2019.

  38. arXiv:1905.03776  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study

    Authors: Daniel S. Park, Jascha Sohl-Dickstein, Quoc V. Le, Samuel L. Smith

    Abstract: We investigate how the final parameters found by stochastic gradient descent are influenced by over-parameterization. We generate families of models by increasing the number of channels in a base network, and then perform a large hyper-parameter search to study how the test error depends on learning rate, batch size, and network width. We find that the optimal SGD hyper-parameters are determined b…

    Submitted 9 May, 2019; originally announced May 2019.

    Comments: 17 pages, 3 tables, 17 figures; accepted to ICML 2019

  39. arXiv:1904.07495  [pdf, ps, other]

    stat.CO

    High-dimensional copula variational approximation through transformation

    Authors: Michael Stanley Smith, Ruben Loaiza-Maya, David J. Nott

    Abstract: Variational methods are attractive for computing Bayesian inference for highly parametrized models and large datasets where exact inference is impractical. They approximate a target distribution - either the posterior or an augmented posterior - using a simpler distribution that is selected to balance accuracy with computational feasibility. Here we approximate an element-wise parametric transform…

    Submitted 20 November, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

  40. Practical Considerations for Data Collection and Management in Mobile Health Micro-randomized Trials

    Authors: Nicholas J. Seewald, Shawna N. Smith, Andy Jinseok Lee, Predrag Klasnja, Susan A. Murphy

    Abstract: There is a growing interest in leveraging the prevalence of mobile technology to improve health by delivering momentary, contextualized interventions to individuals' smartphones. A just-in-time adaptive intervention (JITAI) adjusts to an individual's changing state and/or context to provide the right treatment, at the right time, in the right place. Micro-randomized trials (MRTs) allow for the col…

    Submitted 27 December, 2018; originally announced December 2018.

    Comments: Author accepted manuscript

  41. arXiv:1811.03578  [pdf, other]

    stat.OT

    The ASCCR Frame for Learning Essential Collaboration Skills

    Authors: Eric A. Vance, Heather S. Smith

    Abstract: Statistics and data science are especially collaborative disciplines that typically require practitioners to interact with many different people or groups. Consequently, interdisciplinary collaboration skills are part of the personal and professional skills essential for success as an applied statistician or data scientist. These skills are learnable and teachable, and learning and improving colla…

    Submitted 30 August, 2019; v1 submitted 8 November, 2018; originally announced November 2018.

    Comments: 12 pages, 1 figure. Updated to this Version 5 by adding a few more references, discussing how to teach ASCCR in the classroom, calling on others to add to research supporting the use of the ASCCR Frame, and adding discussion of ethics and reproducible research

  42. arXiv:1806.09597  [pdf, other]

    cs.LG cs.AI stat.ML

    Stochastic natural gradient descent draws posterior samples in function space

    Authors: Samuel L. Smith, Daniel Duckworth, Semon Rezchikov, Quoc V. Le, Jascha Sohl-Dickstein

    Abstract: Recent work has argued that stochastic gradient descent can approximate the Bayesian uncertainty in model parameters near local minima. In this work we develop a similar correspondence for minibatch natural gradient descent (NGD). We prove that for sufficiently small learning rates, if the model predictions on the training set approach the true conditional distribution of labels given inputs, the…

    Submitted 28 November, 2018; v1 submitted 25 June, 2018; originally announced June 2018.

    Comments: Workshop on Bayesian Deep Learning (NeurIPS 2018)

  43. Implicit Copulas from Bayesian Regularized Regression Smoothers

    Authors: Nadja Klein, Michael Stanley Smith

    Abstract: We show how to extract the implicit copula of a response vector from a Bayesian regularized regression smoother with Gaussian disturbances. The copula can be used to compare smoothers that employ different shrinkage priors and function bases. We illustrate with three popular choices of shrinkage priors --- a pairwise prior, the horseshoe prior and a g prior augmented with a point mass as employed…

    Submitted 14 May, 2018; v1 submitted 27 April, 2018; originally announced April 2018.

    Journal ref: Bayesian Anal. 14 (2019), no. 4, 1143--1171

  44. arXiv:1804.08218  [pdf, ps, other]

    econ.EM stat.AP

    Econometric Modeling of Regional Electricity Spot Prices in the Australian Market

    Authors: Michael Stanley Smith, Thomas S. Shively

    Abstract: Wholesale electricity markets are increasingly integrated via high voltage interconnectors, and inter-regional trade in electricity is growing. To model this, we consider a spatial equilibrium model of price formation, where constraints on inter-regional flows result in three distinct equilibria in prices. We use this to motivate an econometric model for the distribution of observed electricity sp…

    Submitted 22 April, 2018; originally announced April 2018.

    Comments: Key Words: Bayesian Monotonic Function Estimation, Intraday Electricity Prices, Copula Time Series Model. JEL: C11, C14, C32, C53

  45. arXiv:1801.09319  [pdf]

    physics.comp-ph cs.LG physics.chem-ph stat.ML

    Less is more: sampling chemical space with active learning

    Authors: Justin S. Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, Adrian E. Roitberg

    Abstract: The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is a task neither well understood nor researched in detail. In this work, we present a fully automated approach for the generation of datasets with the intent of training universal ML potentials. It is ba…

    Submitted 9 April, 2018; v1 submitted 28 January, 2018; originally announced January 2018.

    Comments: Accepted at J. Chem. Phys

    Journal ref: J. Chem. Phys. 148, 241733 (2018)

  46. arXiv:1712.09150  [pdf, ps, other]

    stat.ME econ.EM stat.ML

    Variational Bayes Estimation of Discrete-Margined Copula Models with Application to Time Series

    Authors: Ruben Loaiza-Maya, Michael Stanley Smith

    Abstract: We propose a new variational Bayes estimator for high-dimensional copulas with discrete, or a combination of discrete and continuous, margins. The method is based on a variational approximation to a tractable augmented posterior, and is faster than previous likelihood-based approaches. We use it to estimate drawable vine copulas for univariate and multivariate Markov ordinal and mixed time series.…

    Submitted 20 July, 2018; v1 submitted 25 December, 2017; originally announced December 2017.

  47. arXiv:1711.00489  [pdf, other]

    cs.LG cs.CV cs.DC stat.ML

    Don't Decay the Learning Rate, Increase the Batch Size

    Authors: Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le

    Abstract: It is common practice to decay the learning rate. Here we show one can usually obtain the same learning curve on both training and test sets by instead increasing the batch size during training. This procedure is successful for stochastic gradient descent (SGD), SGD with momentum, Nesterov momentum, and Adam. It reaches equivalent test accuracies after the same number of training epochs, but with…

    Submitted 23 February, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

    Comments: 11 pages, 8 figures. Published as a conference paper at ICLR 2018

  48. arXiv:1710.06451  [pdf, other]

    cs.LG cs.AI stat.ML

    A Bayesian Perspective on Generalization and Stochastic Gradient Descent

    Authors: Samuel L. Smith, Quoc V. Le

    Abstract: We consider two questions at the heart of machine learning; how can we predict if a minimum will generalize to the test set, and why does stochastic gradient descent find minima that generalize well? Our work responds to Zhang et al. (2016), who showed deep neural networks can easily memorize randomly labeled training data, despite generalizing well on real labels of the same inputs. We show that…

    Submitted 14 February, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

    Comments: 13 pages, 9 figures. Published as a conference paper at ICLR 2018

  49. arXiv:1710.00017  [pdf, other]

    stat.ML physics.chem-ph

    Hierarchical modeling of molecular energies using a deep neural network

    Authors: Nicholas Lubbers, Justin S. Smith, Kipton Barros

    Abstract: We introduce the Hierarchically Interacting Particle Neural Network (HIP-NN) to model molecular properties from datasets of quantum calculations. Inspired by a many-body expansion, HIP-NN decomposes properties, such as energy, as a sum over hierarchical terms. These terms are generated from a neural network--a composition of many nonlinear transformations--acting on a representation of the molecul…

    Submitted 29 September, 2017; originally announced October 2017.

  50. arXiv:1701.07152  [pdf, other]

    stat.AP q-fin.ST

    Time Series Copulas for Heteroskedastic Data

    Authors: Rubén Loaiza-Maya, Michael S. Smith, Worapree Maneesoonthorn

    Abstract: We propose parametric copulas that capture serial dependence in stationary heteroskedastic time series. We develop our copula for first order Markov series, and extend it to higher orders and multivariate series. We derive the copula of a volatility proxy, based on which we propose new measures of volatility dependence, including co-movement and spillover in multivariate series. In general, these…

    Submitted 24 January, 2017; originally announced January 2017.