
Showing 1–15 of 15 results for author: Hart, J

  1. Bagging cross-validated bandwidths with application to Big Data

    Authors: Daniel Barreiro-Ures, Ricardo Cao, Mario Francisco Fernández, Jeffrey D. Hart

    Abstract: Hall and Robinson (2009) proposed and analyzed the use of bagged cross-validation to choose the bandwidth of a kernel density estimator. They established that bagging greatly reduces the noise inherent in ordinary cross-validation, and hence leads to a more efficient bandwidth selector. The asymptotic theory of Hall and Robinson (2009) assumes that $N$, the number of bagged subsamples, is…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 37 pages, 9 figures

    MSC Class: 62G07 (Primary); 62G20 (Secondary)

    Journal ref: Bagging cross-validated bandwidths with application to Big Data. Biometrika (2021), 108(4), 981-988

  2. arXiv:2303.11379

    stat.ML cs.LG math.OC

    Solving High-Dimensional Inverse Problems with Auxiliary Uncertainty via Operator Learning with Limited Data

    Authors: Joseph Hart, Mamikon Gulian, Indu Manickam, Laura Swiler

    Abstract: In complex large-scale systems such as climate, important effects are caused by a combination of confounding processes that are not fully observable. The identification of sources from observations of system state is vital for attribution and prediction, which inform critical policy decisions. The difficulty of these types of inverse problems lies in the inability to isolate sources and the cost o…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 29 pages, 10 figures

  3. arXiv:2301.02283

    stat.ME

    Screening Methods for Classification Based on Non-parametric Bayesian Tests

    Authors: Naveed Merchant, Jeffrey D. Hart

    Abstract: Feature or variable selection is a problem inherent to large data sets. While many methods have been proposed to deal with this problem, some can scale poorly with the number of predictors in a data set. Screening methods scale linearly with the number of predictors by checking each predictor one at a time, and are a tool used to decrease the number of variables to consider before further analysis…

    Submitted 5 January, 2023; originally announced January 2023.

  4. arXiv:2212.12386

    stat.ME math.OC

    Hyper-differential sensitivity analysis in the context of Bayesian inference applied to ice-sheet problems

    Authors: William Reese, Joseph Hart, Bart van Bloemen Waanders, Mauro Perego, John Jakeman, Arvind Saibaba

    Abstract: Inverse problems constrained by partial differential equations (PDEs) play a critical role in model development and calibration. In many applications, there are multiple uncertain parameters in a model which must be estimated. Although the Bayesian formulation is attractive for such problems, computational cost and high dimensionality frequently prohibit a thorough exploration of the parametric un…

    Submitted 23 December, 2022; originally announced December 2022.

  5. arXiv:2007.13171

    cs.LG math.NA math.OC stat.ML

    Train Like a (Var)Pro: Efficient Training of Neural Networks with Variable Projection

    Authors: Elizabeth Newman, Lars Ruthotto, Joseph Hart, Bart van Bloemen Waanders

    Abstract: Deep neural networks (DNNs) have achieved state-of-the-art performance across a variety of traditional machine learning tasks, e.g., speech recognition, image classification, and segmentation. The ability of DNNs to efficiently approximate high-dimensional functions has also motivated their use in scientific applications, e.g., to solve partial differential equations (PDE) and to generate surrogat…

    Submitted 19 April, 2021; v1 submitted 26 July, 2020; originally announced July 2020.

    Comments: 33 pages, 14 figures, 3 tables

    MSC Class: 68T05; 49M15 ACM Class: I.2.6

  6. arXiv:2006.00589

    cs.LG cs.RO stat.ML

    Deep R-Learning for Continual Area Sweeping

    Authors: Rishi Shah, Yuqian Jiang, Justin Hart, Peter Stone

    Abstract: Coverage path planning is a well-studied problem in robotics in which a robot must plan a path that passes through every point in a given area repeatedly, usually with a uniform frequency. To address the scenario in which some points need to be visited more frequently than others, this problem has been extended to non-uniform coverage planning. This paper considers the variant of non-uniform cover…

    Submitted 31 May, 2020; originally announced June 2020.

  7. arXiv:2003.06368

    stat.ME math.ST stat.AP

    Use of Cross-validation Bayes Factors to Test Equality of Two Densities

    Authors: Jeffery Hart, Taeryon Choi, Naveed Merchant

    Abstract: We propose a non-parametric, two-sample Bayesian test for checking whether or not two data sets share a common distribution. The test makes use of data splitting ideas and does not require priors for high-dimensional parameter vectors as do other nonparametric Bayesian procedures. We provide evidence that the new procedure provides more stable Bayes factors than do methods based on Pólya trees. So…

    Submitted 13 March, 2020; originally announced March 2020.

  8. arXiv:1812.07042

    stat.CO

    Robustness of the Sobol' indices to marginal distribution uncertainty

    Authors: Joseph Hart, Pierre Gremaud

    Abstract: Global sensitivity analysis (GSA) quantifies the influence of uncertain variables in a mathematical model. The Sobol' indices, a commonly used tool in GSA, seek to do this by attributing to each variable its relative contribution to the variance of the model output. In order to compute Sobol' indices, the user must specify a probability distribution for the uncertain variables. This distribution i…

    Submitted 17 December, 2018; originally announced December 2018.

    Comments: 20 pages

  9. Estimating the Mean and Variance of a High-dimensional Normal Distribution Using a Mixture Prior

    Authors: Shyamalendu Sinha, Jeffrey D. Hart

    Abstract: This paper provides a framework for estimating the mean and variance of a high-dimensional normal density. The main setting considered is a fixed number of vectors following a high-dimensional normal distribution with unknown mean and diagonal covariance matrix. The diagonal covariance matrix can be known or unknown. If the covariance matrix is unknown, the sample size can be as small as $2$. The p…

    Submitted 15 November, 2018; originally announced November 2018.

    Journal ref: Computational Statistics and Data Analysis 138 (2019) 201-221

  10. arXiv:1708.07441

    stat.CO

    Global sensitivity analysis for statistical model parameters

    Authors: Joseph Hart, Julie Bessac, Emil Constantinescu

    Abstract: Global sensitivity analysis (GSA) is frequently used to analyze the influence of uncertain parameters in mathematical models and simulations. In principle, tools from GSA may be extended to analyze the influence of parameters in statistical models. Such analyses may enable reduced or parsimonious modeling and greater predictive capability. However, difficulties such as parameter correlation, model…

    Submitted 28 June, 2018; v1 submitted 24 August, 2017; originally announced August 2017.

    Comments: revisions

  11. arXiv:1609.00065

    stat.ME

    Partitioned Cross-Validation for Divide-and-Conquer Density Estimation

    Authors: Anirban Bhattacharya, Jeffrey D. Hart

    Abstract: We present an efficient method to estimate cross-validation bandwidth parameters for kernel density estimation in very large datasets where ordinary cross-validation is rendered highly inefficient, both statistically and computationally. Our approach relies on calculating multiple cross-validation bandwidths on partitions of the data, followed by suitable scaling and averaging to return a partitio…

    Submitted 31 August, 2016; originally announced September 2016.

  12. arXiv:1602.08521

    stat.ME

    Theoretical Properties and Practical Performance of Fully Robust One-Sided Cross-Validation

    Authors: Olga Y. Savchuk, Jeffrey D. Hart

    Abstract: Fully robust OSCV is a modification of the OSCV method that produces consistent bandwidth in the cases of smooth and nonsmooth regression functions. The current implementation of the method uses the kernel $H_I$ that is almost indistinguishable from the Gaussian kernel on the interval $[-4,4]$, but has negative tails. The theoretical properties and practical performances of the $H_I$- and $φ$-base…

    Submitted 26 February, 2016; originally announced February 2016.

    Comments: 9 figures, 2 tables

  13. arXiv:1602.06218

    stat.CO

    Efficient computation of Sobol' indices for stochastic models

    Authors: Joseph L. Hart, Alen Alexanderian, Pierre A. Gremaud

    Abstract: Stochastic models are necessary for the realistic description of an increasing number of applications. The ability to identify influential parameters and variables is critical to a thorough analysis and understanding of the underlying phenomena. We present a new global sensitivity analysis approach for stochastic models, i.e., models with both uncertain parameters and intrinsic stochasticity. Our…

    Submitted 28 November, 2016; v1 submitted 19 February, 2016; originally announced February 2016.

    Comments: Minor revisions

    MSC Class: 60G99; 65C05; 65C20; 62H99; 62J02

  14. arXiv:0812.0052

    stat.ME

    Empirical study of indirect cross-validation

    Authors: Olga Y. Savchuk, Jeffrey D. Hart, Simon J. Sheather

    Abstract: In this paper we provide insight into the empirical properties of indirect cross-validation (ICV), a new method of bandwidth selection for kernel density estimators. First, we describe the method and report on the theoretical results used to develop a practical-purpose model for certain ICV parameters. Next, we provide a detailed description of a numerical study which shows that the ICV method u…

    Submitted 29 November, 2008; originally announced December 2008.

    Comments: 22 pages, 21 figures

  15. arXiv:0812.0051

    stat.ME

    Indirect Cross-validation for Density Estimation

    Authors: Olga Y. Savchuk, Jeffrey D. Hart, Simon J. Sheather

    Abstract: A new method of bandwidth selection for kernel density estimators is proposed. The method, termed indirect cross-validation, or ICV, makes use of so-called selection kernels. Least squares cross-validation (LSCV) is used to select the bandwidth of a selection-kernel estimator, and this bandwidth is appropriately rescaled for use in a Gaussian kernel estimator. The proposed selection kernels are…

    Submitted 29 November, 2008; originally announced December 2008.

    Comments: 26 pages, 10 figures