-
Low-rank Matrix Bandits with Heavy-tailed Rewards
Authors:
Yue Kang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
In the stochastic low-rank matrix bandit problem, the expected reward of an arm is equal to the inner product between its feature matrix and some unknown $d_1$ by $d_2$ low-rank parameter matrix $Θ^*$ with rank $r \ll d_1\wedge d_2$. While all prior studies assume the payoffs are mixed with sub-Gaussian noises, in this work we loosen this strict assumption and consider the new problem of low-rank matrix bandit with heavy-tailed rewards (LowHTR), where the rewards only have a finite $(1+δ)$-th moment for some $δ\in (0,1]$. By utilizing the truncation on observed payoffs and the dynamic exploration, we propose a novel algorithm called LOTUS attaining the regret bound of order $\tilde O(d^\frac{3}{2}r^\frac{1}{2}T^\frac{1}{1+δ}/\tilde{D}_{rr})$ without knowing $T$, which matches the state-of-the-art regret bound under sub-Gaussian noises~\citep{lu2021low,kang2022efficient} with $δ= 1$. Moreover, we establish a lower bound of the order $Ω(d^\frac{δ}{1+δ} r^\frac{δ}{1+δ} T^\frac{1}{1+δ}) = Ω(T^\frac{1}{1+δ})$ for LowHTR, which indicates our LOTUS is nearly optimal in the order of $T$. In addition, we improve LOTUS so that it does not require knowledge of the rank $r$ with $\tilde O(dr^\frac{3}{2}T^\frac{1+δ}{1+2δ})$ regret bound, and it is efficient under the high-dimensional scenario. We also conduct simulations to demonstrate the practical superiority of our algorithm.
Submitted 26 April, 2024;
originally announced April 2024.
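As a rough illustration of the truncation idea mentioned in the LowHTR abstract, the sketch below averages rewards after discarding any payoff whose magnitude exceeds a threshold. It is a generic truncated-mean estimator, not the LOTUS algorithm; the function name `truncated_mean` and the `threshold` parameter are hypothetical and chosen only for this example.

```python
import numpy as np


def truncated_mean(rewards, threshold):
    """Average the rewards after zeroing out any payoff with |reward| > threshold.

    Hypothetical helper for illustration only: the abstract does not specify
    how LOTUS truncates payoffs or how it chooses the threshold.
    """
    rewards = np.asarray(rewards, dtype=float)
    # Keep payoffs within the threshold; replace extreme ones with zero.
    kept = np.where(np.abs(rewards) <= threshold, rewards, 0.0)
    return kept.mean()


if __name__ == "__main__":
    # One extreme payoff dominates the plain mean but not the truncated one.
    sample = [1.0, 2.0, 100.0]
    print("plain mean:", np.mean(sample))
    print("truncated mean:", truncated_mean(sample, threshold=10.0))
```

Zeroing out extreme payoffs trades a small bias for much lower variance, which is the usual motivation for truncation when only low-order moments of the rewards are finite.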
-
AutoGFI: Streamlined Generalized Fiducial Inference for Modern Inference Problems
Authors:
Wei Du,
Jan Hannig,
Thomas C. M. Lee,
Yi Su,
Chunzhe Zhang
Abstract:
The origins of fiducial inference trace back to the 1930s when R. A. Fisher first introduced the concept as a response to what he perceived as a limitation of Bayesian inference - the requirement for a subjective prior distribution on model parameters in cases where no prior information was available. However, Fisher's initial fiducial approach fell out of favor as complications arose, particularly in multi-parameter problems. In the wake of 2000, amidst a renewed interest in contemporary adaptations of fiducial inference, generalized fiducial inference (GFI) emerged to extend Fisher's fiducial argument, providing a promising avenue for addressing numerous crucial and practical inference challenges. Nevertheless, the adoption of GFI has been limited due to its often demanding mathematical derivations and the necessity for implementing complex Markov Chain Monte Carlo algorithms. This complexity has impeded its widespread utilization and practical applicability. This paper presents a significant advancement by introducing an innovative variant of GFI designed to alleviate these challenges. Specifically, this paper proposes AutoGFI, an easily implementable algorithm that streamlines the application of GFI to a broad spectrum of inference problems involving additive noise. AutoGFI can be readily implemented as long as a fitting routine is available, making it accessible to a broader audience of researchers and practitioners. To demonstrate its effectiveness, AutoGFI is applied to three contemporary and challenging problems: tensor regression, matrix completion, and regression with network cohesion. These case studies highlight the immense potential of GFI and illustrate AutoGFI's promising performance when compared to specialized solutions for these problems. Overall, this research paves the way for a more accessible and powerful application of GFI in a range of practical domains.
Submitted 11 April, 2024;
originally announced April 2024.
-
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems
Authors:
Yue Kang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action is given by the inner product between the action's feature matrix and some fixed, but initially unknown $d_1$ by $d_2$ matrix $Θ^*$ with rank $r \ll \min\{d_1, d_2\}$, and an agent sequentially takes actions based on past experience to maximize the cumulative reward. In this paper, we study the generalized low-rank matrix bandit problem, which has been recently proposed in \cite{lu2021low} under the Generalized Linear Model (GLM) framework. To overcome the computational infeasibility and theoretical restrictions of existing algorithms on this problem, we first propose the G-ESTT framework that modifies the idea from \cite{jun2019bilinear} by using Stein's method on the subspace estimation and then leverages the estimated subspaces via a regularization idea. Furthermore, we remarkably improve the efficiency of G-ESTT by using a novel exclusion idea on the estimated subspace instead, and propose the G-ESTS framework. We also show that G-ESTT can achieve a regret bound of $\tilde{O}(\sqrt{(d_1+d_2)MrT})$ while G-ESTS can achieve a regret bound of $\tilde{O}(\sqrt{(d_1+d_2)^{3/2}Mr^{3/2}T})$ under mild assumptions up to logarithmic terms, where $M$ is some problem-dependent value. Under a reasonable assumption that $M = O((d_1+d_2)^2)$ in our problem setting, the regret of G-ESTT is consistent with the current best regret of $\tilde{O}((d_1+d_2)^{3/2} \sqrt{rT}/D_{rr})$~\citep{lu2021low} ($D_{rr}$ will be defined later). For completeness, we conduct experiments to illustrate that our proposed algorithms, especially G-ESTS, are also computationally tractable and consistently outperform other state-of-the-art (generalized) linear matrix bandit methods based on a suite of simulations.
Submitted 14 January, 2024;
originally announced January 2024.
-
Robust Lipschitz Bandits to Adversarial Corruptions
Authors:
Yue Kang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
Lipschitz bandit is a variant of stochastic bandits that deals with a continuous arm set defined on a metric space, where the reward function is subject to a Lipschitz constraint. In this paper, we introduce a new problem of Lipschitz bandits in the presence of adversarial corruptions where an adaptive adversary corrupts the stochastic rewards up to a total budget $C$. The budget is measured by the sum of corruption levels across the time horizon $T$. We consider both weak and strong adversaries, where the weak adversary is unaware of the current action before the attack, while the strong one can observe it. Our work presents the first line of robust Lipschitz bandit algorithms that can achieve sub-linear regret under both types of adversary, even when the total budget of corruption $C$ is unrevealed to the agent. We provide a lower bound under each type of adversary, and show that our algorithm is optimal under the strong case. Finally, we conduct experiments to illustrate the effectiveness of our algorithms against two classic kinds of attacks.
Submitted 8 October, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits
Authors:
Yue Kang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
In stochastic contextual bandits, an agent sequentially takes actions from a time-dependent action set based on past experience to minimize the cumulative regret. Like many other machine learning algorithms, the performance of bandits heavily depends on the values of hyperparameters, and theoretically derived parameter values may lead to unsatisfactory results in practice. Moreover, it is infeasible to use offline tuning methods like cross-validation to choose hyperparameters under the bandit environment, as the decisions should be made in real-time. To address this challenge, we propose the first online continuous hyperparameter tuning framework for contextual bandits to learn the optimal parameter configuration in practice within a search space on the fly. Specifically, we use a double-layer bandit framework named CDT (Continuous Dynamic Tuning) and formulate the hyperparameter optimization as a non-stationary continuum-armed bandit, where each arm represents a combination of hyperparameters, and the corresponding reward is the algorithmic result. For the top layer, we propose the Zooming TS algorithm that utilizes Thompson Sampling (TS) for exploration and a restart technique to get around the \textit{switching} environment. The proposed CDT framework can be easily utilized to tune contextual bandit algorithms without any pre-specified candidate set for multiple hyperparameters. We further show that it achieves sublinear regret in theory and performs consistently better than all existing methods on both synthetic and real datasets.
Submitted 8 April, 2024; v1 submitted 18 February, 2023;
originally announced February 2023.
-
Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms
Authors:
Qin Ding,
Yue Kang,
Yi-Wei Liu,
Thomas C. M. Lee,
Cho-Jui Hsieh,
James Sharpnack
Abstract:
The stochastic contextual bandit problem, which models the trade-off between exploration and exploitation, has many real applications, including recommender systems, online advertising and clinical trials. Like many other machine learning algorithms, contextual bandit algorithms often have one or more hyper-parameters. As an example, in most optimal stochastic contextual bandit algorithms, there is an unknown exploration parameter which controls the trade-off between exploration and exploitation. A proper choice of the hyper-parameters is essential for contextual bandit algorithms to perform well. However, it is infeasible to use offline tuning methods to select hyper-parameters in the contextual bandit environment since there is no pre-collected dataset and the decisions have to be made in real time. To tackle this problem, we first propose a two-layer bandit structure for auto tuning the exploration parameter and further generalize it to the Syndicated Bandits framework which can learn multiple hyper-parameters dynamically in the contextual bandit environment. We derive the regret bounds of our proposed Syndicated Bandits framework and show that its regret avoids an exponential dependence on the number of hyper-parameters to be tuned. Moreover, it achieves optimal regret bounds under certain scenarios. The Syndicated Bandits framework is general enough to handle the tuning tasks in many popular contextual bandit algorithms, such as LinUCB, LinTS, UCB-GLM, etc. Experiments on both synthetic and real datasets validate the effectiveness of our proposed framework.
Submitted 11 June, 2022; v1 submitted 5 June, 2021;
originally announced June 2021.
-
Adversarial Examples Detection with Bayesian Neural Network
Authors:
Yao Li,
Tongyi Tang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
In this paper, we propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate the output distribution of a deep neural network. With these observations, we propose a novel Bayesian adversarial example detector, BATer for short, to improve the performance of adversarial example detection. Specifically, we study the distributional difference of hidden layer output between natural and adversarial examples, and propose to use the randomness of the Bayesian neural network to simulate hidden layer output distribution and leverage the distribution dispersion to detect adversarial examples. The advantage of a Bayesian neural network is that the output is stochastic while a deep neural network without random components does not have such characteristics. Empirical results on several benchmark datasets against popular attacks show that the proposed BATer outperforms the state-of-the-art detectors in adversarial example detection.
Submitted 22 February, 2024; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Change point detection and image segmentation for time series of astrophysical images
Authors:
Cong Xu,
Hans Moritz Günther,
Vinay L. Kashyap,
Thomas C. M. Lee,
Andreas Zezas
Abstract:
Many astrophysical phenomena are time-varying, in the sense that their intensity, energy spectrum, and/or the spatial distribution of the emission suddenly change. This paper develops a method for modeling a time series of images. Under the assumption that the arrival times of the photons follow a Poisson process, the data are binned into 4D grids of voxels (time, energy band, and x-y coordinates), and viewed as a time series of non-homogeneous Poisson images. The method assumes that at each time point, the corresponding multi-band image stack is an unknown 3D piecewise constant function including Poisson noise. It also assumes that all image stacks between any two adjacent change points (in time domain) share the same unknown piecewise constant function. The proposed method is designed to estimate the number and the locations of all the change points (in time domain), as well as all the unknown piecewise constant functions between any pairs of the change points. The method applies the minimum description length (MDL) principle to perform this task. A practical algorithm is also developed to solve the corresponding complicated optimization problem. Simulation experiments and applications to real datasets show that the proposed method enjoys very promising empirical properties. Applications to two real datasets, the XMM observation of a flaring star and an emerging solar coronal loop, illustrate the usage of the proposed method and the scientific insight gained from it.
Submitted 26 January, 2021;
originally announced January 2021.
-
Estimating Fiber Orientation Distribution through Blockwise Adaptive Thresholding with Application to HCP Young Adults Data
Authors:
Seungyong Hwang,
Thomas C. M. Lee,
Debashis Paul,
Jie Peng
Abstract:
Due to recent technological advances, large brain imaging data sets can now be collected. Such data are highly complex, so extracting meaningful information from them remains challenging. Thus, there is an urgent need for statistical procedures that are computationally scalable and can provide accurate estimates that capture the neuronal structures and their functionalities. We propose a fast method for estimating the fiber orientation distribution (FOD) based on diffusion MRI data. This method models the observed dMRI signal at any voxel as a convolved and noisy version of the underlying FOD, and utilizes the spherical harmonics basis for representing the FOD, where the spherical harmonic coefficients are adaptively and nonlinearly shrunk by using a James-Stein type estimator. To further improve the estimation accuracy by enhancing the localized peaks of the FOD, a super-resolution sharpening process is then applied as a second step. The resulting estimated FODs can be fed to a fiber tracking algorithm to reconstruct the white matter fiber tracts. We illustrate the overall methodology using both synthetic data and data from the Human Connectome Project.
Submitted 28 June, 2021; v1 submitted 8 April, 2020;
originally announced April 2020.
-
Uncertainty Quantification in Ensembles of Honest Regression Trees using Generalized Fiducial Inference
Authors:
Suofei Wu,
Jan Hannig,
Thomas C. M. Lee
Abstract:
Due to their accuracy, methods based on ensembles of regression trees are a popular approach for making predictions. Some common examples include Bayesian additive regression trees, boosting and random forests. This paper focuses on honest random forests, which add honesty to the original form of random forests and have been proved to have better statistical properties. The main contribution is a new method that quantifies the uncertainties of the estimates and predictions produced by honest random forests. The proposed method is based on the generalized fiducial methodology, and provides a fiducial density function that measures how likely each single honest tree is to be the true model. With such a density function, estimates and predictions, as well as their confidence/prediction intervals, can be obtained. The promising empirical properties of the proposed method are demonstrated by numerical comparisons with several state-of-the-art methods, and by applications to a few real data sets. Lastly, the proposed method is theoretically backed up by a strong asymptotic guarantee.
Submitted 14 November, 2019;
originally announced November 2019.
-
Measuring the Algorithmic Convergence of Randomized Ensembles: The Regression Setting
Authors:
Miles E. Lopes,
Suofei Wu,
Thomas C. M. Lee
Abstract:
When randomized ensemble methods such as bagging and random forests are implemented, a basic question arises: Is the ensemble large enough? In particular, the practitioner desires a rigorous guarantee that a given ensemble will perform nearly as well as an ideal infinite ensemble (trained on the same data). The purpose of the current paper is to develop a bootstrap method for solving this problem in the context of regression --- which complements our companion paper in the context of classification (Lopes 2019). In contrast to the classification setting, the current paper shows that theoretical guarantees for the proposed bootstrap can be established under much weaker assumptions. In addition, we illustrate the flexibility of the method by showing how it can be adapted to measure algorithmic convergence for variable selection. Lastly, we provide numerical results demonstrating that the method works well in a range of situations.
Submitted 3 August, 2019;
originally announced August 2019.
-
Simultaneous Detection of Multiple Change Points and Community Structures in Time Series of Networks
Authors:
Rex C. Y. Cheung,
Alexander Aue,
Seungyong Hwang,
Thomas C. M. Lee
Abstract:
In many complex systems, networks and graphs arise in a natural manner. Often, time-evolving behavior can be easily found and modeled using time-series methodology. Amongst others, two common research problems in network analysis are community detection and change-point detection. Community detection aims at finding specific sub-structures within the networks, and change-point detection tries to find the time points at which sub-structures change. We propose a novel methodology to detect both community structures and change points simultaneously based on a model selection framework in which the Minimum Description Length (MDL) principle is used as the objective criterion to be minimized. The promising practical performance of the proposed method is illustrated via a series of numerical experiments and real data analysis.
Submitted 30 June, 2020; v1 submitted 29 November, 2018;
originally announced December 2018.
-
Block-wise Partitioning for Extreme Multi-label Classification
Authors:
Yuefeng Liang,
Cho-Jui Hsieh,
Thomas C. M. Lee
Abstract:
Extreme multi-label classification aims to learn a classifier that annotates an instance with a relevant subset of labels from an extremely large label set. Many existing solutions embed the label matrix to a low-dimensional linear subspace, or examine the relevance of a test instance to every label via a linear scan. In practice, however, those approaches can be computationally exorbitant. To alleviate this drawback, we propose a Block-wise Partitioning (BP) pretreatment that divides all instances into disjoint clusters, to each of which the most frequently tagged label subset is attached. One multi-label classifier is trained on one pair of instance and label clusters, and the label set of a test instance is predicted by first delivering it to the most appropriate instance cluster. Experiments on benchmark multi-label data sets reveal that BP pretreatment significantly reduces prediction time, and retains almost the same level of prediction accuracy.
Submitted 3 November, 2018;
originally announced November 2018.
-
Network estimation via graphon with node features
Authors:
Yi Su,
Raymond K. W. Wong,
Thomas C. M. Lee
Abstract:
Estimating the probabilities of linkages in a network has gained increasing interest in recent years. One popular model for network analysis is the exchangeable graph model (ExGM) characterized by a two-dimensional function known as a graphon. Estimating an underlying graphon becomes the key of such analysis. Several nonparametric estimation methods have been proposed, and some are provably consistent. However, if certain useful features of the nodes (e.g., age and schools in social network context) are available, none of these methods was designed to incorporate this source of information to help with the estimation. This paper develops a consistent graphon estimation method that integrates the information from both the adjacency matrix itself and node features. We show that properly leveraging the features can improve the estimation. A cross-validation method is proposed to automatically select the tuning parameter of the method.
Submitted 2 September, 2018;
originally announced September 2018.
-
Method G: Uncertainty Quantification for Distributed Data Problems using Generalized Fiducial Inference
Authors:
Randy C. S. Lai,
J. Hannig,
Thomas C. M. Lee
Abstract:
It is not unusual for a data analyst to encounter data sets distributed across several computers. This can happen for reasons such as privacy concerns, efficiency of likelihood evaluations, or just the sheer size of the whole data set. This presents new challenges to statisticians as even computing simple summary statistics such as the median becomes computationally challenging. Furthermore, if other advanced statistical methods are desired, novel computational strategies are needed. In this paper we propose a new approach for distributed analysis of massive data that is suitable for generalized fiducial inference and is based on a careful implementation of a "divide and conquer" strategy combined with importance sampling. The proposed approach requires only a small amount of communication between nodes, and is shown to be asymptotically equivalent to using the whole data set. Unlike most existing methods, the proposed approach produces uncertainty measures (such as confidence intervals) in addition to point estimates for parameters of interest. The proposed approach is also applied to the analysis of a large set of solar images.
Submitted 18 May, 2018;
originally announced May 2018.
-
A Multi-Resolution Model for Non-Gaussian Random Fields on a Sphere with Application to Ionospheric Electrostatic Potentials
Authors:
Minjie Fan,
Debashis Paul,
Thomas C. M. Lee,
Tomoko Matsuo
Abstract:
Gaussian random fields have been one of the most popular tools for analyzing spatial data. However, many geophysical and environmental processes often display non-Gaussian characteristics. In this paper, we propose a new class of spatial models for non-Gaussian random fields on a sphere based on a multi-resolution analysis. Using a special wavelet frame, named spherical needlets, as building blocks, the proposed model is constructed in the form of a sparse random effects model. The spatial localization of needlets, together with carefully chosen random coefficients, ensures that the model is non-Gaussian and isotropic. The model can also be expanded to include a spatially varying variance profile. The special formulation of the model enables us to develop efficient estimation and prediction procedures, in which an adaptive MCMC algorithm is used. We investigate the accuracy of parameter estimation of the proposed model, and compare its predictive performance with that of two Gaussian models by extensive numerical experiments. Practical utility of the proposed model is demonstrated through an application of the methodology to a data set of high-latitude ionospheric electrostatic potentials, generated from the LFM-MIX model of the magnetosphere-ionosphere system.
Submitted 30 September, 2017;
originally announced October 2017.
-
Uncertainty Quantification for High Dimensional Sparse Nonparametric Additive Models
Authors:
Qi Gao,
Randy C. S. Lai,
Thomas C. M. Lee,
Yao Li
Abstract:
Statistical inference in high dimensional settings has recently attracted enormous attention within the literature. However, most published work focuses on the parametric linear regression problem. This paper considers an important extension of this problem: statistical inference for high dimensional sparse nonparametric additive models. To be more precise, this paper develops a methodology for constructing a probability density function on the set of all candidate models. This methodology can also be applied to construct confidence intervals for various quantities of interest (such as noise variance) and confidence bands for the additive functions. This methodology is derived using a generalized fiducial inference framework. It is shown that results produced by the proposed methodology enjoy correct asymptotic frequentist properties. Empirical results obtained from numerical experimentation verify this theoretical claim. Lastly, the methodology is applied to a gene expression data set and yields new findings that most existing methods based on parametric linear modeling failed to detect.
Submitted 13 November, 2019; v1 submitted 23 September, 2017;
originally announced September 2017.
-
Covariance Estimation via Fiducial Inference
Authors:
W. Jenny Shi,
Jan Hannig,
Randy C. S. Lai,
Thomas C. M. Lee
Abstract:
As a classical problem, covariance estimation has drawn much attention from the statistical community for decades. Much work has been done under the frequentist and the Bayesian frameworks. Aiming to quantify the uncertainty of the estimators without having to choose a prior, we have developed a fiducial approach to the estimation of the covariance matrix. Built upon the Fiducial Bernstein-von Mises Theorem (Sonderegger and Hannig 2014), we show that the fiducial distribution of the covariance matrix is consistent under our framework. Consequently, the samples generated from this fiducial distribution are good estimators of the true covariance matrix, which enables us to define a meaningful confidence region for the covariance matrix. Lastly, we also show that the fiducial approach can be a powerful tool for identifying clique structures in covariance matrices.
Submitted 16 August, 2017;
originally announced August 2017.
-
Modeling Tangential Vector Fields on a Sphere
Authors:
Minjie Fan,
Debashis Paul,
Thomas C. M. Lee,
Tomoko Matsuo
Abstract:
Physical processes that manifest as tangential vector fields on a sphere are common in geophysical and environmental sciences. These naturally occurring vector fields are often subject to physical constraints, such as being curl-free or divergence-free. We construct a new class of parametric models for cross-covariance functions of curl-free and divergence-free vector fields that are tangential to the unit sphere. These models are constructed by applying the surface gradient or the surface curl operator to scalar random potential fields defined on the unit sphere. We propose a likelihood-based estimation procedure for the model parameters and show that fast computation is possible even for large data sets when the observations are on a regular latitude-longitude grid. Characteristics and utility of the proposed methodology are illustrated through simulation studies and by applying it to an ocean surface wind velocity data set collected through satellite-based scatterometry remote sensing. We also compare the performance of the proposed model with a class of bivariate Matérn models in terms of estimation and prediction, and demonstrate that the proposed model is superior in capturing certain physical characteristics of the wind fields.
Submitted 23 December, 2016;
originally announced December 2016.
-
Consistent Estimation for Partition-wise Regression and Classification Models
Authors:
Rex C. Y. Cheung,
Alexander Aue,
Thomas C. M. Lee
Abstract:
Partition-wise models offer a flexible approach to modeling complex and multidimensional data and are capable of producing interpretable results. They are based on partitioning the observed data into regions, each of which is modeled with a simple submodel. The success of this approach highly depends on the quality of the partition, as too large a region could lead to a non-simple submodel, while too small a region could inflate estimation variance. This paper proposes an automatic procedure for choosing the partition (i.e., the number of regions and the boundaries between regions) as well as the submodels for the regions. It is shown that, under the assumption of the existence of a true partition, the proposed partition estimator is statistically consistent. The methodology is demonstrated for both regression and classification problems.
Submitted 11 January, 2016;
originally announced January 2016.
-
Detecting Abrupt Changes in the Spectra of High-Energy Astrophysical Sources
Authors:
Raymond K. W. Wong,
Vinay L. Kashyap,
Thomas C. M. Lee,
David A. van Dyk
Abstract:
Variable-intensity astronomical sources are the result of complex and often extreme physical processes. Abrupt changes in source intensity are typically accompanied by equally sudden spectral shifts, i.e., sudden changes in the wavelength distribution of the emission. This article develops a method for modeling photon counts collected from observation of such sources. We embed change points into a marked Poisson process, where photon wavelengths are regarded as marks and both the Poisson intensity parameter and the distribution of the marks are allowed to change. To the best of our knowledge, this is the first effort to embed change points into a marked Poisson process. Between the change points, the spectrum is modeled non-parametrically using a mixture of a smooth radial basis expansion and a number of local deviations from the smooth term representing spectral emission lines. Because the model is overparameterized, we employ an $\ell_1$ penalty. The tuning parameter in the penalty and the number of change points are determined via the minimum description length principle. Our method is validated via a series of simulation studies and its practical utility is illustrated in the analysis of the ultra-fast rotating yellow giant star known as FK Com.
Submitted 10 December, 2015; v1 submitted 27 August, 2015;
originally announced August 2015.
-
Matrix Completion with Noisy Entries and Outliers
Authors:
Raymond K. W. Wong,
Thomas C. M. Lee
Abstract:
This paper considers the problem of matrix completion when the observed entries are noisy and contain outliers. It begins by introducing a new optimization criterion for which the recovered matrix is defined as its solution. This criterion uses the celebrated Huber function from the robust statistics literature to downweight the effects of outliers. A practical algorithm is developed to solve the optimization involved. This algorithm is fast, straightforward to implement, and monotonically convergent. Furthermore, the proposed methodology is theoretically shown to be stable in a well-defined sense. Its promising empirical performance is demonstrated via a sequence of simulation experiments, including image inpainting.
Submitted 27 December, 2017; v1 submitted 28 February, 2015;
originally announced March 2015.
-
A Frequentist Approach to Computer Model Calibration
Authors:
Raymond K. W. Wong,
Curtis B. Storlie,
Thomas C. M. Lee
Abstract:
This paper considers the computer model calibration problem and provides a general frequentist solution. Under the proposed framework, the data model is semi-parametric with a nonparametric discrepancy function which accounts for any discrepancy between the physical reality and the computer model. In an attempt to solve a fundamentally important (but often ignored) identifiability issue between the computer model parameters and the discrepancy function, this paper proposes a new and identifiable parametrization of the calibration problem. It also develops a two-step procedure for estimating all the relevant quantities under the new parametrization. This estimation procedure is shown to enjoy excellent rates of convergence and can be straightforwardly implemented with existing software. For uncertainty quantification, bootstrapping is adopted to construct confidence regions for the quantities of interest. The practical performance of the proposed methodology is illustrated through simulation examples and an application to a computational fluid dynamics model.
Submitted 10 September, 2015; v1 submitted 17 November, 2014;
originally announced November 2014.
-
Fiber Direction Estimation, Smoothing and Tracking in Diffusion MRI
Authors:
Raymond K. W. Wong,
Thomas C. M. Lee,
Debashis Paul,
Jie Peng,
the Alzheimer's Disease Neuroimaging Initiative
Abstract:
Diffusion magnetic resonance imaging is an imaging technology designed to probe anatomical architectures of biological samples in an in vivo and non-invasive manner through measuring water diffusion. The contribution of this paper is threefold. First, it proposes a new method to identify and estimate multiple diffusion directions within a voxel through a new and identifiable parametrization of the widely used multi-tensor model. Unlike many existing methods, this method focuses on the estimation of diffusion directions rather than the diffusion tensors. Second, this paper proposes a novel direction smoothing method which greatly improves direction estimation in regions with crossing fibers. This smoothing method is shown to have excellent theoretical and empirical properties. Lastly, this paper develops a fiber tracking algorithm that can handle multiple directions within a voxel. The overall methodology is illustrated with simulated data and a data set collected for the study of Alzheimer's disease by the Alzheimer's Disease Neuroimaging Initiative (ADNI).
Submitted 24 September, 2015; v1 submitted 3 June, 2014;
originally announced June 2014.
-
Automatic estimation of flux distributions of astrophysical source populations
Authors:
Raymond K. W. Wong,
Paul Baines,
Alexander Aue,
Thomas C. M. Lee,
Vinay L. Kashyap
Abstract:
In astrophysics a common goal is to infer the flux distribution of populations of scientifically interesting objects such as pulsars or supernovae. In practice, inference for the flux distribution is often conducted using the cumulative distribution of the number of sources detected at a given sensitivity. The resulting "$\log(N>S)$-$\log (S)$" relationship can be used to compare and evaluate theoretical models for source populations and their evolution. Under restrictive assumptions the relationship should be linear. In practice, however, when simple theoretical models fail, it is common for astrophysicists to use prespecified piecewise linear models. This paper proposes a methodology for estimating both the number and locations of "breakpoints" in astrophysical source populations that extends beyond existing work in this field. An important component of the proposed methodology is a new interwoven EM algorithm that computes parameter estimates. It is shown that in simple settings such estimates are asymptotically consistent despite the complex nature of the parameter space. Through simulation studies it is demonstrated that the proposed methodology is capable of accurately detecting structural breaks in a variety of parameter configurations. This paper concludes with an application of our methodology to the Chandra Deep Field North (CDFN) data set.
Submitted 24 November, 2014; v1 submitted 4 May, 2013;
originally announced May 2013.
-
Generalized Fiducial Inference for Ultrahigh Dimensional Regression
Authors:
Randy C. S. Lai,
Jan Hannig,
Thomas C. M. Lee
Abstract:
In recent years the ultrahigh dimensional linear regression problem has attracted enormous attention from the research community. Under the sparsity assumption, most of the published work is devoted to the selection and estimation of the significant predictor variables. This paper studies a different but fundamentally important aspect of this problem: uncertainty quantification for parameter estimates and model choices. To be more specific, this paper proposes methods for deriving a probability density function on the set of all possible models, and also for constructing confidence intervals for the corresponding parameters. These proposed methods are developed using the generalized fiducial methodology, which is a variant of Fisher's controversial fiducial idea. Theoretical properties of the proposed methods are studied, and in particular it is shown that statistical inference based on the proposed methods will have an exact asymptotic frequentist property. In terms of empirical performance, the proposed methods are tested by simulation experiments and an application to a real data set. Lastly, this work can also be seen as an interesting and successful application of Fisher's fiducial idea to an important and contemporary problem. To the best of the authors' knowledge, this is the first time that the fiducial idea has been applied to a so-called "large p small n" problem.
Submitted 29 April, 2013;
originally announced April 2013.
-
An MDL approach to the climate segmentation problem
Authors:
QiQi Lu,
Robert Lund,
Thomas C. M. Lee
Abstract:
This paper proposes an information theory approach to estimate the number of changepoints and their locations in a climatic time series. A model is introduced that has an unknown number of changepoints and allows for series autocorrelations, periodic dynamics, and a mean shift at each changepoint time. An objective function gauging the number of changepoints and their locations, based on a minimum description length (MDL) information criterion, is derived. A genetic algorithm is then developed to optimize the objective function. The methods are applied in the analysis of a century of monthly temperatures from Tuscaloosa, Alabama.
Submitted 7 October, 2010;
originally announced October 2010.
-
A Multiresolution Census Algorithm for Calculating Vortex Statistics in Turbulent Flows
Authors:
Brandon Whitcher,
Thomas C. M. Lee,
Jeffrey B. Weiss,
Timothy J. Hoar,
Douglas W. Nychka
Abstract:
The fundamental equations that model turbulent flow do not provide much insight into the size and shape of observed turbulent structures. We investigate the efficient and accurate representation of structures in two-dimensional turbulence by applying statistical models directly to the simulated vorticity field. Rather than extract the coherent portion of the image from the background variation, as in the classical signal-plus-noise model, we present a model for individual vortices using the non-decimated discrete wavelet transform. A template image, supplied by the user, provides the features to be extracted from the vorticity field. By transforming the vortex template into the wavelet domain, specific characteristics present in the template, such as size and symmetry, are broken down into components associated with spatial frequencies. Multivariate multiple linear regression is used to fit the vortex template to the vorticity field in the wavelet domain. Since all levels of the template decomposition may be used to model each level in the field decomposition, the resulting model need not be identical to the template. Application to a vortex census algorithm that records quantities of interest (such as size, peak amplitude, circulation, etc.) as the vorticity field evolves is given. The multiresolution census algorithm extracts coherent structures of all shapes and sizes in simulated vorticity fields and is able to reproduce known physical scaling laws when processing a set of vorticity fields that evolve over time.
Submitted 2 October, 2007;
originally announced October 2007.