Showing 1–28 of 28 results for author: Lee, T C M

  1. arXiv:2404.17709  [pdf, other]

    stat.ML cs.LG

    Low-rank Matrix Bandits with Heavy-tailed Rewards

    Authors: Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

    Abstract: In stochastic low-rank matrix bandit, the expected reward of an arm is equal to the inner product between its feature matrix and some unknown $d_1$ by $d_2$ low-rank parameter matrix $Θ^*$ with rank $r \ll d_1\wedge d_2$. While all prior studies assume the payoffs are mixed with sub-Gaussian noises, in this work we loosen this strict assumption and consider the new problem of \underline{low}-rank…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: The 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

  2. arXiv:2404.08169  [pdf, other]

    stat.ME

    AutoGFI: Streamlined Generalized Fiducial Inference for Modern Inference Problems

    Authors: Wei Du, Jan Hannig, Thomas C. M. Lee, Yi Su, Chunzhe Zhang

    Abstract: The origins of fiducial inference trace back to the 1930s when R. A. Fisher first introduced the concept as a response to what he perceived as a limitation of Bayesian inference - the requirement for a subjective prior distribution on model parameters in cases where no prior information was available. However, Fisher's initial fiducial approach fell out of favor as complications arose, particularl…

    Submitted 11 April, 2024; originally announced April 2024.

  3. arXiv:2401.07298  [pdf, other]

    stat.ML cs.LG

    Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems

    Authors: Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

    Abstract: In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action is given by the inner product between the action's feature matrix and some fixed, but initially unknown $d_1$ by $d_2$ matrix $Θ^*$ with rank $r \ll \{d_1, d_2\}$, and an agent sequentially takes actions based on past experience to maximize the cumulative reward. In this paper, we study the generalized lo…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: Revision of the paper accepted by NeurIPS 2022

  4. arXiv:2305.18543  [pdf, other]

    cs.LG stat.ML

    Robust Lipschitz Bandits to Adversarial Corruptions

    Authors: Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

    Abstract: Lipschitz bandit is a variant of stochastic bandits that deals with a continuous arm set defined on a metric space, where the reward function is subject to a Lipschitz constraint. In this paper, we introduce a new problem of Lipschitz bandits in the presence of adversarial corruptions where an adaptive adversary corrupts the stochastic rewards up to a total budget $C$. The budget is measured by th…

    Submitted 8 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  5. arXiv:2302.09440  [pdf, other]

    cs.LG stat.ML

    Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits

    Authors: Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

    Abstract: In stochastic contextual bandits, an agent sequentially makes actions from a time-dependent action set based on past experience to minimize the cumulative regret. Like many other machine learning algorithms, the performance of bandits heavily depends on the values of hyperparameters, and theoretically derived parameter values may lead to unsatisfactory results in practice. Moreover, it is infeasib…

    Submitted 8 April, 2024; v1 submitted 18 February, 2023; originally announced February 2023.

    Comments: Published in Transactions on Machine Learning Research (TMLR)

  6. arXiv:2106.02979  [pdf, other]

    stat.ML cs.LG

    Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms

    Authors: Qin Ding, Yue Kang, Yi-Wei Liu, Thomas C. M. Lee, Cho-Jui Hsieh, James Sharpnack

    Abstract: The stochastic contextual bandit problem, which models the trade-off between exploration and exploitation, has many real applications, including recommender systems, online advertising and clinical trials. Like many other machine learning algorithms, contextual bandit algorithms often have one or more hyper-parameters. As an example, in most optimal stochastic contextual bandit algorithms, there is…

    Submitted 11 June, 2022; v1 submitted 5 June, 2021; originally announced June 2021.

  7. arXiv:2105.08620  [pdf, other]

    stat.ML cs.CV cs.LG

    Adversarial Examples Detection with Bayesian Neural Network

    Authors: Yao Li, Tongyi Tang, Cho-Jui Hsieh, Thomas C. M. Lee

    Abstract: In this paper, we propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate the output distribution of a deep neural network. With these observations, we propose a novel Bayesian adversarial example detector, short for BATer, to improve the performance of adversarial example de…

    Submitted 22 February, 2024; v1 submitted 18 May, 2021; originally announced May 2021.

  8. arXiv:2101.11202  [pdf, other]

    astro-ph.IM stat.AP stat.ME

    Change point detection and image segmentation for time series of astrophysical images

    Authors: Cong Xu, Hans Moritz Günther, Vinay L. Kashyap, Thomas C. M. Lee, Andreas Zezas

    Abstract: Many astrophysical phenomena are time-varying, in the sense that their intensity, energy spectrum, and/or the spatial distribution of the emission suddenly change. This paper develops a method for modeling a time series of images. Under the assumption that the arrival times of the photons follow a Poisson process, the data are binned into 4D grids of voxels (time, energy band, and x-y coordinates)…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: 22 pages, 10 figures

  9. arXiv:2004.04258  [pdf, other]

    stat.AP q-bio.NC

    Estimating Fiber Orientation Distribution through Blockwise Adaptive Thresholding with Application to HCP Young Adults Data

    Authors: Seungyong Hwang, Thomas C. M. Lee, Debashis Paul, Jie Peng

    Abstract: Due to recent technological advances, large brain imaging data sets can now be collected. Such data are highly complex so extraction of meaningful information from them remains challenging. Thus, there is an urgent need for statistical procedures that are computationally scalable and can provide accurate estimates that capture the neuronal structures and their functionalities. We propose a fast me…

    Submitted 28 June, 2021; v1 submitted 8 April, 2020; originally announced April 2020.

  10. arXiv:1911.06177  [pdf, other]

    stat.ME math.ST stat.ML

    Uncertainty Quantification in Ensembles of Honest Regression Trees using Generalized Fiducial Inference

    Authors: Suofei Wu, Jan Hannig, Thomas C. M. Lee

    Abstract: Due to their accuracies, methods based on ensembles of regression trees are a popular approach for making predictions. Some common examples include Bayesian additive regression trees, boosting and random forests. This paper focuses on honest random forests, which add honesty to the original form of random forests and are proved to have better statistical properties. The main contribution is a new…

    Submitted 14 November, 2019; originally announced November 2019.

  11. arXiv:1908.01251  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Measuring the Algorithmic Convergence of Randomized Ensembles: The Regression Setting

    Authors: Miles E. Lopes, Suofei Wu, Thomas C. M. Lee

    Abstract: When randomized ensemble methods such as bagging and random forests are implemented, a basic question arises: Is the ensemble large enough? In particular, the practitioner desires a rigorous guarantee that a given ensemble will perform nearly as well as an ideal infinite ensemble (trained on the same data). The purpose of the current paper is to develop a bootstrap method for solving this problem…

    Submitted 3 August, 2019; originally announced August 2019.

    Comments: 36 pages

  12. arXiv:1812.00789  [pdf, other]

    cs.SI physics.soc-ph stat.ME

    Simultaneous Detection of Multiple Change Points and Community Structures in Time Series of Networks

    Authors: Rex C. Y. Cheung, Alexander Aue, Seungyong Hwang, Thomas C. M. Lee

    Abstract: In many complex systems, networks and graphs arise in a natural manner. Often, time evolving behavior can be easily found and modeled using time-series methodology. Amongst others, two common research problems in network analysis are community detection and change-point detection. Community detection aims at finding specific sub-structures within the networks, and change-point detection tries to f…

    Submitted 30 June, 2020; v1 submitted 29 November, 2018; originally announced December 2018.

  13. arXiv:1811.01305  [pdf, other]

    cs.LG stat.ML

    Block-wise Partitioning for Extreme Multi-label Classification

    Authors: Yuefeng Liang, Cho-Jui Hsieh, Thomas C. M. Lee

    Abstract: Extreme multi-label classification aims to learn a classifier that annotates an instance with a relevant subset of labels from an extremely large label set. Many existing solutions embed the label matrix to a low-dimensional linear subspace, or examine the relevance of a test instance to every label via a linear scan. In practice, however, those approaches can be computationally exorbitant. To all…

    Submitted 3 November, 2018; originally announced November 2018.

  14. arXiv:1809.00420  [pdf, other]

    stat.ME

    Network estimation via graphon with node features

    Authors: Yi Su, Raymond K. W. Wong, Thomas C. M. Lee

    Abstract: Estimating the probabilities of linkages in a network has gained increasing interest in recent years. One popular model for network analysis is the exchangeable graph model (ExGM) characterized by a two-dimensional function known as a graphon. Estimating an underlying graphon becomes the key of such analysis. Several nonparametric estimation methods have been proposed, and some are provably consis…

    Submitted 2 September, 2018; originally announced September 2018.

  15. arXiv:1805.07427  [pdf, other]

    stat.CO

    Method G: Uncertainty Quantification for Distributed Data Problems using Generalized Fiducial Inference

    Authors: Randy C. S. Lai, J. Hannig, Thomas C. M. Lee

    Abstract: It is not unusual for a data analyst to encounter data sets distributed across several computers. This can happen for reasons such as privacy concerns, efficiency of likelihood evaluations, or just the sheer size of the whole data set. This presents new challenges to statisticians as even computing simple summary statistics such as the median becomes computationally challenging. Furthermore, if ot…

    Submitted 18 May, 2018; originally announced May 2018.

  16. arXiv:1710.00153  [pdf, other]

    stat.AP stat.ME

    A Multi-Resolution Model for Non-Gaussian Random Fields on a Sphere with Application to Ionospheric Electrostatic Potentials

    Authors: Minjie Fan, Debashis Paul, Thomas C. M. Lee, Tomoko Matsuo

    Abstract: Gaussian random fields have been one of the most popular tools for analyzing spatial data. However, many geophysical and environmental processes often display non-Gaussian characteristics. In this paper, we propose a new class of spatial models for non-Gaussian random fields on a sphere based on a multi-resolution analysis. Using a special wavelet frame, named spherical needlets, as building block…

    Submitted 30 September, 2017; originally announced October 2017.

  17. Uncertainty Quantification for High Dimensional Sparse Nonparametric Additive Models

    Authors: Qi Gao, Randy C. S. Lai, Thomas C. M. Lee, Yao Li

    Abstract: Statistical inference in high dimensional settings has recently attracted enormous attention within the literature. However, most published work focuses on the parametric linear regression problem. This paper considers an important extension of this problem: statistical inference for high dimensional sparse nonparametric additive models. To be more precise, this paper develops a methodology for co…

    Submitted 13 November, 2019; v1 submitted 23 September, 2017; originally announced September 2017.

    Journal ref: 2019, Technometrics

  18. arXiv:1708.04929  [pdf, other]

    stat.ME

    Covariance Estimation via Fiducial Inference

    Authors: W. Jenny Shi, Jan Hannig, Randy C. S. Lai, Thomas C. M. Lee

    Abstract: As a classical problem, covariance estimation has drawn much attention from the statistical community for decades. Much work has been done under the frequentist and the Bayesian frameworks. Aiming to quantify the uncertainty of the estimators without having to choose a prior, we have developed a fiducial approach to the estimation of covariance matrix. Built upon the Fiducial Bernstein-von Mises Th…

    Submitted 16 August, 2017; originally announced August 2017.

    Comments: 31 pages with 5 figures, including appendix; 1 supplementary document with 5 figures

    MSC Class: 62J10; 62E20; 62F25; 62F12

  19. arXiv:1612.08062  [pdf, ps, other]

    stat.ME stat.AP

    Modeling Tangential Vector Fields on a Sphere

    Authors: Minjie Fan, Debashis Paul, Thomas C. M. Lee, Tomoko Matsuo

    Abstract: Physical processes that manifest as tangential vector fields on a sphere are common in geophysical and environmental sciences. These naturally occurring vector fields are often subject to physical constraints, such as being curl-free or divergence-free. We construct a new class of parametric models for cross-covariance functions of curl-free and divergence-free vector fields that are tangential to…

    Submitted 23 December, 2016; originally announced December 2016.

  20. Consistent Estimation for Partition-wise Regression and Classification Models

    Authors: Rex C. Y. Cheung, Alexander Aue, Thomas C. M. Lee

    Abstract: Partition-wise models offer a flexible approach for modeling complex and multidimensional data that are capable of producing interpretable results. They are based on partitioning the observed data into regions, each of which is modeled with a simple submodel. The success of this approach highly depends on the quality of the partition, as too large a region could lead to a non-simple submodel, whil…

    Submitted 11 January, 2016; originally announced January 2016.

    Comments: 29 pages, 2 figures

  21. arXiv:1508.07083  [pdf, other]

    stat.AP astro-ph.IM

    Detecting Abrupt Changes in the Spectra of High-Energy Astrophysical Sources

    Authors: Raymond K. W. Wong, Vinay L. Kashyap, Thomas C. M. Lee, David A. van Dyk

    Abstract: Variable-intensity astronomical sources are the result of complex and often extreme physical processes. Abrupt changes in source intensity are typically accompanied by equally sudden spectral shifts, i.e., sudden changes in the wavelength distribution of the emission. This article develops a method for modeling photon counts collected from observation of such sources. We embed change points into a…

    Submitted 10 December, 2015; v1 submitted 27 August, 2015; originally announced August 2015.

    Comments: 30 pages, 6 figures

  22. arXiv:1503.00214  [pdf, other]

    stat.ML

    Matrix Completion with Noisy Entries and Outliers

    Authors: Raymond K. W. Wong, Thomas C. M. Lee

    Abstract: This paper considers the problem of matrix completion when the observed entries are noisy and contain outliers. It begins with introducing a new optimization criterion for which the recovered matrix is defined as its solution. This criterion uses the celebrated Huber function from the robust statistics literature to downweigh the effects of outliers. A practical algorithm is developed to solve the…

    Submitted 27 December, 2017; v1 submitted 28 February, 2015; originally announced March 2015.

    Comments: 33 pages, 2 figures

  23. arXiv:1411.4723  [pdf, other]

    stat.ME

    A Frequentist Approach to Computer Model Calibration

    Authors: Raymond K. W. Wong, Curtis B. Storlie, Thomas C. M. Lee

    Abstract: This paper considers the computer model calibration problem and provides a general frequentist solution. Under the proposed framework, the data model is semi-parametric with a nonparametric discrepancy function which accounts for any discrepancy between the physical reality and the computer model. In an attempt to solve a fundamentally important (but often ignored) identifiability issue between th…

    Submitted 10 September, 2015; v1 submitted 17 November, 2014; originally announced November 2014.

    Comments: 21 pages, 2 figures

  24. arXiv:1406.0581  [pdf, other]

    stat.ME stat.AP

    Fiber Direction Estimation, Smoothing and Tracking in Diffusion MRI

    Authors: Raymond K. W. Wong, Thomas C. M. Lee, Debashis Paul, Jie Peng, the Alzheimer's Disease Neuroimaging Initiative

    Abstract: Diffusion magnetic resonance imaging is an imaging technology designed to probe anatomical architectures of biological samples in an in vivo and non-invasive manner through measuring water diffusion. The contribution of this paper is threefold. First it proposes a new method to identify and estimate multiple diffusion directions within a voxel through a new and identifiable parametrization of the…

    Submitted 24 September, 2015; v1 submitted 3 June, 2014; originally announced June 2014.

    Comments: 21 pages, 5 figures

  25. Automatic estimation of flux distributions of astrophysical source populations

    Authors: Raymond K. W. Wong, Paul Baines, Alexander Aue, Thomas C. M. Lee, Vinay L. Kashyap

    Abstract: In astrophysics a common goal is to infer the flux distribution of populations of scientifically interesting objects such as pulsars or supernovae. In practice, inference for the flux distribution is often conducted using the cumulative distribution of the number of sources detected at a given sensitivity. The resulting "$\log(N>S)$-$\log (S)$" relationship can be used to compare and evaluate theo…

    Submitted 24 November, 2014; v1 submitted 4 May, 2013; originally announced May 2013.

    Comments: Published at http://dx.doi.org/10.1214/14-AOAS750 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS750

    Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 3, 1690-1712

  26. arXiv:1304.7847  [pdf, ps, other]

    stat.ME

    Generalized Fiducial Inference for Ultrahigh Dimensional Regression

    Authors: Randy C. S. Lai, Jan Hannig, Thomas C. M. Lee

    Abstract: In recent years the ultrahigh dimensional linear regression problem has attracted enormous attention from the research community. Under the sparsity assumption most of the published work is devoted to the selection and estimation of the significant predictor variables. This paper studies a different but fundamentally important aspect of this problem: uncertainty quantification for parameter estim…

    Submitted 29 April, 2013; originally announced April 2013.

  27. An MDL approach to the climate segmentation problem

    Authors: QiQi Lu, Robert Lund, Thomas C. M. Lee

    Abstract: This paper proposes an information theory approach to estimate the number of changepoints and their locations in a climatic time series. A model is introduced that has an unknown number of changepoints and allows for series autocorrelations, periodic dynamics, and a mean shift at each changepoint time. An objective function gauging the number of changepoints and their locations, based on a minimum…

    Submitted 7 October, 2010; originally announced October 2010.

    Comments: Published at http://dx.doi.org/10.1214/09-AOAS289 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS289

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 1, 299-319

  28. A Multiresolution Census Algorithm for Calculating Vortex Statistics in Turbulent Flows

    Authors: Brandon Whitcher, Thomas C. M. Lee, Jeffrey B. Weiss, Timothy J. Hoar, Douglas W. Nychka

    Abstract: The fundamental equations that model turbulent flow do not provide much insight into the size and shape of observed turbulent structures. We investigate the efficient and accurate representation of structures in two-dimensional turbulence by applying statistical models directly to the simulated vorticity field. Rather than extract the coherent portion of the image from the background variation,…

    Submitted 2 October, 2007; originally announced October 2007.

    Journal ref: Journal of the Royal Statistical Society. Series C (Applied Statistics) Vol. 57, No. 3 (2008), pp. 293-312