Showing 1–46 of 46 results for author: Dunson, D

  1. arXiv:2406.00778  [pdf, other]

    stat.ML cs.AI cs.LG stat.CO stat.ME

    Bayesian Joint Additive Factor Models for Multiview Learning

    Authors: Niccolo Anceschi, Federico Ferrari, David B. Dunson, Himel Mallick

    Abstract: It is increasingly common in a wide variety of applied settings to collect data of multiple different types on the same set of samples. Our particular focus in this article is on studying relationships between such multiview features and responses. A motivating application arises in the context of precision medicine where multi-omics data are collected to correlate with clinical outcomes. It is of…

    Submitted 2 June, 2024; originally announced June 2024.

    MSC Class: 62F15

  2. arXiv:2402.00809  [pdf, other]

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  3. arXiv:2312.13484  [pdf, other]

    stat.ML cs.LG

    Bayesian Transfer Learning

    Authors: Piotr M. Suder, Jason Xu, David B. Dunson

    Abstract: Transfer learning is a burgeoning concept in statistical machine learning that seeks to improve inference and/or predictive accuracy on a domain of interest by leveraging data from related domains. While the term "transfer learning" has garnered much recent interest, its foundational principles have existed for years under various guises. Prior literature reviews in computer science and electrical…

    Submitted 20 December, 2023; originally announced December 2023.

  4. arXiv:2311.14829   

    cs.CE cs.CV

    Proximal Algorithms for Accelerated Langevin Dynamics

    Authors: Duy H. Thai, Alexander L. Young, David B. Dunson

    Abstract: We develop a novel class of MCMC algorithms based on a stochastized Nesterov scheme. With an appropriate addition of noise, the result is a time-inhomogeneous underdamped Langevin equation, which we prove emits a specified target distribution as its invariant measure. Convergence rates to stationarity under Wasserstein-2 distance are established as well. Metropolis-adjusted and stochastic gradient…

    Submitted 28 November, 2023; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: The technical proofs for the paper will be revised

  5. arXiv:2304.11251  [pdf, other]

    stat.ML cs.LG

    Machine Learning and the Future of Bayesian Computation

    Authors: Steven Winter, Trevor Campbell, Lizhen Lin, Sanvesh Srivastava, David B. Dunson

    Abstract: Bayesian models are a powerful tool for studying complex data, allowing the analyst to encode rich hierarchical dependencies and leverage prior information. Most importantly, they facilitate a complete characterization of uncertainty through the posterior distribution. Practical posterior computation is commonly performed via MCMC, which can be computationally infeasible for high dimensional model…

    Submitted 21 April, 2023; originally announced April 2023.

  6. arXiv:2304.10630  [pdf, other]

    stat.ML cs.LG stat.AP

    Ellipsoid fitting with the Cayley transform

    Authors: Omar Melikechi, David B. Dunson

    Abstract: We introduce Cayley transform ellipsoid fitting (CTEF), an algorithm that uses the Cayley transform to fit ellipsoids to noisy data in any dimension. Unlike many ellipsoid fitting methods, CTEF is ellipsoid specific, meaning it always returns elliptic solutions, and can fit arbitrary ellipsoids. It also significantly outperforms other fitting methods when data are not uniformly distributed over th…

    Submitted 27 September, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

  7. arXiv:2304.03096  [pdf, other]

    stat.ML cs.LG

    Spectral Gap Regularization of Neural Networks

    Authors: Edric Tam, David Dunson

    Abstract: We introduce Fiedler regularization, a novel approach for regularizing neural networks that utilizes spectral/graphical information. Existing regularization methods often focus on penalizing weights in a global/uniform manner that ignores the connectivity structure of the neural network. We propose to use the Fiedler value of the neural network's underlying graph as a tool for regularization. We p…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: This is a journal extension of the ICML conference paper by Tam and Dunson (2020), arXiv:2003.00992

  8. arXiv:2302.00755  [pdf, other]

    stat.ML cs.LG stat.ME

    Hierarchical shrinkage Gaussian processes: applications to computer code emulation and dynamical system recovery

    Authors: Tao Tang, Simon Mak, David Dunson

    Abstract: In many areas of science and engineering, computer simulations are widely used as proxies for physical experiments, which can be infeasible or unethical. Such simulations can often be computationally expensive, and an emulator can be trained to efficiently predict the desired response surface. A widely-used emulator is the Gaussian process (GP), which provides a flexible framework for efficient pr…

    Submitted 1 February, 2023; originally announced February 2023.

  9. arXiv:2201.12064  [pdf, other]

    stat.ML cs.LG

    Multiscale Graph Comparison via the Embedded Laplacian Discrepancy

    Authors: Edric Tam, David Dunson

    Abstract: Laplacian eigenvectors capture natural community structures on graphs and are widely used in spectral clustering and manifold learning. The use of Laplacian eigenvectors as embeddings for the purpose of multiscale graph comparison has however been limited. Here we propose the Embedded Laplacian Discrepancy (ELD) as a simple and fast approach to compare graphs (of potentially different sizes) based…

    Submitted 5 February, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

  10. arXiv:2110.07478  [pdf, other]

    stat.ML cs.LG

    Inferring Manifolds From Noisy Data Using Gaussian Processes

    Authors: David B Dunson, Nan Wu

    Abstract: In analyzing complex datasets, it is often of interest to infer lower dimensional structure underlying the higher dimensional observations. As a flexible class of nonlinear structures, it is common to focus on Riemannian manifolds. Most existing manifold learning algorithms replace the original data with lower dimensional coordinates without providing an estimate of the manifold in the observation…

    Submitted 24 May, 2024; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 51 pages, 20 figures

  11. arXiv:2010.14056  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Statistical Guarantees for Transformation Based Models with Applications to Implicit Variational Inference

    Authors: Sean Plummer, Shuang Zhou, Anirban Bhattacharya, David Dunson, Debdeep Pati

    Abstract: Transformation-based methods have been an attractive approach in non-parametric inference for problems such as unconditional and conditional density estimation due to their unique hierarchical structure that models the data as flexible transformation of a set of common latent variables. More recently, transformation-based models have been used in variational inference (VI) to construct flexible im…

    Submitted 4 November, 2020; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: First two authors contributed equally to this work. arXiv admin note: text overlap with arXiv:1701.07572

  12. arXiv:2010.08908  [pdf, other]

    stat.CO cs.LG math.OC

    Accelerated Algorithms for Convex and Non-Convex Optimization on Manifolds

    Authors: Lizhen Lin, Bayan Saparbayeva, Michael Minyi Zhang, David B. Dunson

    Abstract: We propose a general scheme for solving convex and non-convex optimization problems on manifolds. The central idea is that, by adding a multiple of the squared retraction distance to the objective function in question, we "convexify" the objective function and solve a series of convex sub-problems in the optimization procedure. One of the key challenges for optimization on manifolds is the difficu…

    Submitted 17 October, 2020; originally announced October 2020.

  13. arXiv:2008.08044  [pdf, other]

    stat.ML cs.LG stat.CO

    Bayesian neural networks and dimensionality reduction

    Authors: Deborshee Sen, Theodore Papamarkou, David Dunson

    Abstract: In conducting non-linear dimensionality reduction and feature learning, it is common to suppose that the data lie near a lower-dimensional manifold. A class of model-based approaches for such problems includes latent variables in an unknown non-linear regression function; this includes Gaussian process latent variable models and variational auto-encoders (VAEs) as special cases. VAEs are artificia…

    Submitted 19 August, 2020; v1 submitted 18 August, 2020; originally announced August 2020.

    Comments: 29 pages, 13 figures

  14. arXiv:2004.05209  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Estimating a Brain Network Predictive of Stress and Genotype with Supervised Autoencoders

    Authors: Austin Talbot, David Dunson, Kafui Dzirasa, David Carlson

    Abstract: Targeted stimulation of the brain has the potential to treat mental illnesses. We propose an approach to help design the stimulation protocol by identifying electrical dynamics across many brain regions that relate to illness states. We model multi-region electrical activity as a superposition of activity from latent networks, where the weights on the latent networks relate to an outcome of intere…

    Submitted 7 March, 2023; v1 submitted 10 April, 2020; originally announced April 2020.

    Comments: 43 pages, 9 figures

  15. arXiv:2003.00992  [pdf, ps, other]

    stat.ML cs.LG

    Fiedler Regularization: Learning Neural Networks with Graph Sparsity

    Authors: Edric Tam, David Dunson

    Abstract: We introduce a novel regularization approach for deep learning that incorporates and respects the underlying graphical structure of the neural network. Existing regularization methods often focus on dropping/penalizing weights in a global manner that ignores the connectivity structure of the neural network. We propose to use the Fiedler value of the neural network's underlying graph as a tool for…

    Submitted 15 August, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

  16. arXiv:2001.03988  [pdf, other]

    stat.ME cs.LG stat.ML

    Domain Adaptive Bootstrap Aggregating

    Authors: Meimei Liu, David B. Dunson

    Abstract: When there is a distributional shift between data used to train a predictive algorithm and current data, performance can suffer. This is known as the domain adaptation problem. Bootstrap aggregating, or bagging, is a popular method for improving stability of predictive algorithms, while reducing variance and protecting against over-fitting. This article proposes a domain adaptive bagging method co…

    Submitted 16 June, 2020; v1 submitted 12 January, 2020; originally announced January 2020.

  17. arXiv:1911.02728  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Auto-encoding brain networks with applications to analyzing large-scale brain imaging datasets

    Authors: Meimei Liu, Zhengwu Zhang, David B. Dunson

    Abstract: There has been huge interest in studying human brain connectomes inferred from different imaging modalities and exploring their relationship with human traits, such as cognition. Brain connectomes are usually represented as networks, with nodes corresponding to different regions of interest (ROIs) and edges to connection strengths between ROIs. Due to the high-dimensionality and non-Euclidean natu…

    Submitted 13 September, 2021; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: 31 pages, 12 figures, 5 tables

  18. arXiv:1904.05850  [pdf, other]

    math.ST cs.IT

    Consistent Entropy Estimation for Stationary Time Series

    Authors: Alexander L Young, David B Dunson

    Abstract: Entropy estimation, due in part to its connection with mutual information, has seen considerable use in the study of time series data including causality detection and information flow. In many cases, the entropy is estimated using $k$-nearest neighbor (Kozachenko-Leonenko) based methods. However, analytic results on this estimator are limited to independent data. In the article, we show rigorous…

    Submitted 3 August, 2019; v1 submitted 11 April, 2019; originally announced April 2019.

    Comments: 16 pages, 2 figures

    MSC Class: 62G05; 62G20

  19. Classification via local manifold approximation

    Authors: Didong Li, David B Dunson

    Abstract: Classifiers label data as belonging to one of a set of groups based on input features. It is challenging to obtain accurate classification performance when the feature distributions in the different classes are complex, with nonlinear, overlapping and intersecting supports. This is particularly true when training data are limited. To address this problem, this article proposes a new type of classi…

    Submitted 3 March, 2019; originally announced March 2019.

  20. arXiv:1901.00172  [pdf, other]

    cs.LG cs.SI stat.ML

    Supervised Multiscale Dimension Reduction for Spatial Interaction Networks

    Authors: Shaobo Han, David B. Dunson

    Abstract: We introduce a multiscale supervised dimension reduction method for SPatial Interaction Network (SPIN) data, which consist of a collection of spatially coordinated interactions. This type of predictor arises when the sampling unit of data is composed of a collection of primitive variables, each of them being essentially unique, so that it becomes necessary to group the variables in order to simpli…

    Submitted 8 June, 2019; v1 submitted 1 January, 2019; originally announced January 2019.

    Comments: 30 pages, 12 figures, revised for clarity and conciseness

  21. arXiv:1810.13431  [pdf, other]

    stat.ML cs.LG

    Targeted stochastic gradient Markov chain Monte Carlo for hidden Markov models with rare latent states

    Authors: Rihui Ou, Deborshee Sen, Alexander L Young, David B Dunson

    Abstract: Markov chain Monte Carlo (MCMC) algorithms for hidden Markov models often rely on the forward-backward sampler. This makes them computationally slow as the length of the time series increases, motivating the recent development of sub-sampling-based approaches. These approximate the full posterior by using small random subsequences of the data at each MCMC iteration within stochastic gradient MCMC.…

    Submitted 27 May, 2021; v1 submitted 31 October, 2018; originally announced October 2018.

  22. arXiv:1810.08537  [pdf, other]

    stat.ML cs.LG

    Bayesian Distance Clustering

    Authors: Leo L Duan, David B Dunson

    Abstract: Model-based clustering is widely-used in a variety of application areas. However, fundamental concerns remain about robustness. In particular, results can be sensitive to the choice of kernel representing the within-cluster data density. Leveraging on properties of pairwise differences between data points, we propose a class of Bayesian distance clustering methods, which rely on modeling the likel…

    Submitted 25 June, 2019; v1 submitted 19 October, 2018; originally announced October 2018.

  23. arXiv:1805.08102  [pdf, other]

    stat.ML cs.LG

    PiPs: a Kernel-based Optimization Scheme for Analyzing Non-Stationary 1D Signals

    Authors: Jieren Xu, Yitong Li, Haizhao Yang, David Dunson, Ingrid Daubechies

    Abstract: This paper proposes a novel kernel-based optimization scheme to handle tasks in the analysis, e.g., signal spectral estimation and single-channel source separation of 1D non-stationary oscillatory data. The key insight of our optimization scheme for reconstructing the time-frequency information is that when a nonparametric regression is applied on some input values, the output regressed points wou…

    Submitted 9 December, 2022; v1 submitted 21 May, 2018; originally announced May 2018.

  24. arXiv:1803.01203  [pdf, other]

    stat.AP cs.LG cs.SI stat.ML

    Multiresolution Tensor Decomposition for Multiple Spatial Passing Networks

    Authors: Shaobo Han, David B. Dunson

    Abstract: This article is motivated by soccer positional passing networks collected across multiple games. We refer to these data as replicated spatial passing networks---to accurately model such data it is necessary to take into account the spatial positions of the passer and receiver for each passing event. This spatial registration and replicates that occur across games represent key differences with usu…

    Submitted 3 March, 2018; originally announced March 2018.

    Comments: 34 pages, 15 figures

  25. arXiv:1802.05392  [pdf, other]

    cs.LG stat.ML

    Reducing over-clustering via the powered Chinese restaurant process

    Authors: Jun Lu, Meng Li, David Dunson

    Abstract: Dirichlet process mixture (DPM) models tend to produce many small clusters regardless of whether they are needed to accurately characterize the data - this is particularly true for large data sets. However, interpretability, parsimony, data storage and communication costs all are hampered by having overly many clusters. We propose a powered Chinese restaurant process to limit this kind of problem…

    Submitted 14 February, 2018; originally announced February 2018.

  26. arXiv:1801.01061  [pdf, other]

    stat.ML cs.LG

    Intrinsic Gaussian processes on complex constrained domains

    Authors: Mu Niu, Pokman Cheung, Lizhen Lin, Zhenwen Dai, Neil Lawrence, David Dunson

    Abstract: We propose a class of intrinsic Gaussian processes (in-GPs) for interpolation, regression and classification on manifolds with a primary focus on complex constrained domains or irregular shaped spaces arising as subsets or submanifolds of R, R2, R3 and beyond. For example, in-GPs can accommodate spatial domains arising as complex subsets of Euclidean space. in-GPs respect the potentially complex b…

    Submitted 3 January, 2018; originally announced January 2018.

  27. arXiv:1611.05559  [pdf, other]

    stat.ML cs.LG

    Boosting Variational Inference

    Authors: Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson

    Abstract: Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions. For practical reasons, the family of distributions in VI is usually constrained so that it does not include the exact posterior, even as a limit po…

    Submitted 1 March, 2017; v1 submitted 16 November, 2016; originally announced November 2016.

    Comments: 17 pages, 7 figures

  28. arXiv:1605.05798  [pdf, other]

    math.ST cs.CC stat.CO

    MCMC for Imbalanced Categorical Data

    Authors: James E. Johndrow, Aaron Smith, Natesh Pillai, David B. Dunson

    Abstract: Many modern applications collect highly imbalanced categorical data, with some categories relatively rare. Bayesian hierarchical models combat data sparsity by borrowing information, while also quantifying uncertainty. However, posterior computation presents a fundamental barrier to routine use; a single class of algorithms does not work well in all settings and practitioners waste time trying dif…

    Submitted 26 June, 2017; v1 submitted 18 May, 2016; originally announced May 2016.

    MSC Class: 62

  29. arXiv:1603.05324  [pdf, other]

    math.ST cs.LG stat.AP stat.ME

    Fast moment estimation for generalized latent Dirichlet models

    Authors: Shiwen Zhao, Barbara E. Engelhardt, Sayan Mukherjee, David B. Dunson

    Abstract: We develop a generalized method of moments (GMM) approach for fast parameter estimation in a new class of Dirichlet latent variable models with mixed data types. Parameter estimation via GMM has been demonstrated to have computational and statistical advantages over alternative methods, such as expectation maximization, variational inference, and Markov chain Monte Carlo. The key computational adv…

    Submitted 23 March, 2016; v1 submitted 16 March, 2016; originally announced March 2016.

    Comments: corrected a typo in figure

  30. arXiv:1602.02575  [pdf, other]

    stat.ME cs.DC stat.CO stat.ML

    DECOrrelated feature space partitioning for distributed sparse regression

    Authors: Xiangyu Wang, David Dunson, Chenlei Leng

    Abstract: Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space). While the majority of…

    Submitted 12 February, 2016; v1 submitted 8 February, 2016; originally announced February 2016.

    Comments: Correct legend errors in Figure 3

  31. arXiv:1506.05860  [pdf, ps, other]

    stat.ML cs.LG stat.CO

    Variational Gaussian Copula Inference

    Authors: Shaobo Han, Xuejun Liao, David B. Dunson, Lawrence Carin

    Abstract: We utilize copulas to constitute a unified framework for constructing and optimizing variational proposals in hierarchical Bayesian models. For models with continuous and non-Gaussian hidden variables, we propose a semiparametric and automated variational Gaussian copula approach, in which the parametric Gaussian copula family is able to preserve multivariate posterior dependence, and the nonparam…

    Submitted 18 May, 2016; v1 submitted 18 June, 2015; originally announced June 2015.

    Comments: Appearing in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume 51

  32. arXiv:1506.02222  [pdf, other]

    stat.ME cs.LG math.ST stat.ML

    No penalty no tears: Least squares in high-dimensional linear models

    Authors: Xiangyu Wang, David Dunson, Chenlei Leng

    Abstract: Ordinary least squares (OLS) is the default method for fitting linear models, but is not applicable for problems with dimensionality larger than the sample size. For these problems, we advocate the use of a generalized version of OLS motivated by ridge regression, and propose two novel three-step algorithms involving least squares fitting and hard thresholding. The algorithms are methodologically…

    Submitted 16 June, 2016; v1 submitted 7 June, 2015; originally announced June 2015.

    Comments: Added results for non-sparse models; Added results for elliptical distribution; Added simulations for adaptive lasso

  33. arXiv:1502.06895  [pdf, ps, other]

    math.ST cs.LG stat.ML

    On the consistency theory of high dimensional variable screening

    Authors: Xiangyu Wang, Chenlei Leng, David B. Dunson

    Abstract: Variable screening is a fast dimension reduction technique for assisting high dimensional feature selection. As a preselection method, it selects a moderate size subset of candidate variables for further refining via feature selection to produce the final model. The performance of variable screening depends on both computational efficiency and the ability to dramatically reduce the number of varia…

    Submitted 6 June, 2015; v1 submitted 24 February, 2015; originally announced February 2015.

    Comments: adding comments on REC

  34. arXiv:1410.6604  [pdf, ps, other]

    stat.ML cs.DC stat.CO stat.ME

    Median Selection Subset Aggregation for Parallel Inference

    Authors: Xiangyu Wang, Peichao Peng, David Dunson

    Abstract: For massive data sets, efficient computation commonly relies on distributed algorithms that store and process subsets of the data on different machines, minimizing communication costs. Our focus is on regression and classification problems involving many features. A variety of distributed algorithms have been proposed in this context, but challenges arise in defining an algorithm with low communic…

    Submitted 24 October, 2014; originally announced October 2014.

  35. arXiv:1410.0719  [pdf, other]

    math.NA cs.CV cs.IT cs.LG math.OC math.ST

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    Authors: L. Jacques, C. De Vleeschouwer, Y. Boursier, P. Sudhakar, C. De Mol, A. Pizurica, S. Anthoine, P. Vandergheynst, P. Frossard, C. Bilen, S. Kitic, N. Bertin, R. Gribonval, N. Boumal, B. Mishra, P. -A. Absil, R. Sepulchre, S. Bundervoet, C. Schretter, A. Dooms, P. Schelkens, O. Chabiron, F. Malgouyres, J. -Y. Tourneret, N. Dobigeon, et al. (42 additional authors not shown)

    Abstract: The implicit objective of the biennial "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in…

    Submitted 9 October, 2014; v1 submitted 2 October, 2014; originally announced October 2014.

    Comments: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist14

  36. arXiv:1403.2660  [pdf, other]

    math.ST cs.DC cs.LG

    Robust and Scalable Bayes via a Median of Subset Posterior Measures

    Authors: Stanislav Minsker, Sanvesh Srivastava, Lizhen Lin, David B. Dunson

    Abstract: We propose a novel approach to Bayesian analysis that is provably robust to outliers in the data and often has computational advantages over standard methods. Our technique is based on splitting the data into non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the resulting measures. The main novelty of our approach is the proposed a…

    Submitted 1 June, 2016; v1 submitted 11 March, 2014; originally announced March 2014.

    MSC Class: Primary 62F15; secondary 68W15; 62G35

  37. arXiv:1401.3632  [pdf, other]

    stat.ML cs.LG stat.CO

    Bayesian Conditional Density Filtering

    Authors: Shaan Qamar, Rajarshi Guhaniyogi, David B. Dunson

    Abstract: We propose a Conditional Density Filtering (C-DF) algorithm for efficient online Bayesian inference. C-DF adapts MCMC sampling to the online setting, sampling from approximations to conditional posterior distributions obtained by propagating surrogate conditional sufficient statistics (a function of data and parameter estimates) as new data arrive. These quantities eliminate the need to store or p…

    Submitted 22 September, 2015; v1 submitted 15 January, 2014; originally announced January 2014.

    Comments: 41 pages, 7 figures, 12 tables

  38. arXiv:1312.4605  [pdf, ps, other]

    stat.CO cs.DC stat.ML

    Parallelizing MCMC via Weierstrass Sampler

    Authors: Xiangyu Wang, David B. Dunson

    Abstract: With the rapidly growing scales of statistical problems, subset based communication-free parallel MCMC methods are a promising future for large scale Bayesian analysis. In this article, we propose a new Weierstrass sampler for parallel MCMC based on independent subsets. The new sampler approximates the full data posterior samples via combining the posterior draws from independent subset MCMC chain…

    Submitted 25 May, 2014; v1 submitted 16 December, 2013; originally announced December 2013.

    Comments: The original Algorithm 1 removed. Provided some theoretical justification for refinement sampling (Theorem 2). Added a new algorithm in addition to the rejection sampling for handling dimensionality curse. New simulations and graphs (with new colors and designs). A real data analysis is also provided

  39. arXiv:1312.1099  [pdf, other]

    stat.ML cs.LG

    Multiscale Dictionary Learning for Estimating Conditional Distributions

    Authors: Francesca Petralia, Joshua Vogelstein, David B. Dunson

    Abstract: Nonparametric estimation of the conditional distribution of a response given high-dimensional features is a challenging problem. It is important to allow not only the mean but also the variance and shape of the response density to change flexibly with features, which are massive-dimensional. We propose a multiscale dictionary learning model, which expresses the conditional response density as a co…

    Submitted 4 December, 2013; originally announced December 2013.

    Journal ref: Proceeding of Neural Information Processing Systems, Lake Tahoe, Nevada December 2013

  40. arXiv:1304.7230  [pdf, other]

    stat.ML cs.LG

    Learning Densities Conditional on Many Interacting Features

    Authors: David C. Kessler, Jack Taylor, David B. Dunson

    Abstract: Learning a distribution conditional on a set of discrete-valued features is a commonly encountered task. This becomes more challenging with a high-dimensional feature set when there is the possibility of interaction between the features. In addition, many frequently applied techniques consider only prediction of the mean, but the complete conditional density is needed to answer more complex questi…

    Submitted 29 April, 2013; v1 submitted 26 April, 2013; originally announced April 2013.

  41. arXiv:1304.5894  [pdf]

    cs.CV cs.LG

    Bayesian crack detection in ultra high resolution multimodal images of paintings

    Authors: Bruno Cornelis, Yun Yang, Joshua T. Vogelstein, Ann Dooms, Ingrid Daubechies, David Dunson

    Abstract: The preservation of our cultural heritage is of paramount importance. Thanks to recent developments in digital acquisition techniques, powerful image analysis algorithms are developed which can be useful non-invasive tools to assist in the restoration and preservation of art. In this paper we propose a semi-supervised crack detection method that can be used for high-dimensional acquisitions of pai…

    Submitted 23 April, 2013; v1 submitted 22 April, 2013; originally announced April 2013.

    Comments: 8 pages, double column

  42. arXiv:1303.0642  [pdf, other]

    stat.ML cs.LG

    Bayesian Compressed Regression

    Authors: Rajarshi Guhaniyogi, David B. Dunson

    Abstract: As an alternative to variable selection or shrinkage in high dimensional regression, we propose to randomly compress the predictors prior to analysis. This dramatically reduces storage and computational bottlenecks, performing well when the predictors can be projected to a low dimensional linear subspace with minimal loss of information about the response. As opposed to existing Bayesian dimension…

    Submitted 22 March, 2013; v1 submitted 4 March, 2013; originally announced March 2013.

    Comments: 29 pages, 4 figures

  43. Bayesian Consensus Clustering

    Authors: Eric F. Lock, David B. Dunson

    Abstract: The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These separate clusterings adhere loosely to an overall consensus clustering, and hence they are not independent. We describe a computationally scalable Bayesian framework…

    Submitted 28 February, 2013; originally announced February 2013.

    Comments: 32 pages, 13 figures

    Journal ref: Bioinformatics 29 (2013) 2610-2616

  44. arXiv:1206.6456  [pdf]

    stat.AP cs.LG stat.ME

    Lognormal and Gamma Mixed Negative Binomial Regression

    Authors: Mingyuan Zhou, Lingbo Li, David Dunson, Lawrence Carin

    Abstract: In regression analysis of counts, a lack of simple and efficient algorithms for posterior computation has made Bayesian approaches appear unattractive and thus underdeveloped. We propose a lognormal and gamma mixed negative binomial (NB) regression model for counts, and present efficient closed-form Bayesian inference; unlike conventional Poisson models, the proposed approach has two free paramete…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  45. arXiv:1206.4662  [pdf]

    cs.CR cs.LG cs.MM

    Bayesian Watermark Attacks

    Authors: Ivo Shterev, David Dunson

    Abstract: This paper presents an application of statistical machine learning to the field of watermarking. We propose a new attack model on additive spread-spectrum watermarking systems. The proposed attack is based on Bayesian statistics. We consider the scenario in which a watermark signal is repeatedly embedded in specific, possibly chosen based on a secret message bitstream, segments (signals) of the ho…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML2012

  46. arXiv:1206.4645  [pdf]

    cs.LG math.NA stat.ME stat.ML

    Ensemble Methods for Convex Regression with Applications to Geometric Programming Based Circuit Design

    Authors: Lauren Hannah, David Dunson

    Abstract: Convex regression is a promising area for bridging statistical estimation and deterministic convex optimization. New piecewise linear convex regression methods are fast and scalable, but can have instability when used to approximate constraints or objective functions for optimization. Ensemble methods, like bagging, smearing and random partitioning, can alleviate this problem and maintain the theo…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML2012