-
High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization
Authors:
Wanrong Zhu,
Zhipeng Lou,
Ziyang Wei,
Wei Biao Wu
Abstract:
Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with efficient computation and fast convergence to the nominal level. Specifically, we propose to use a small number of independent multi-runs to acquire distribution information and construct a t-based confidence interval. Our method requires minimal additional computation and memory beyond the standard updating of estimates, making the inference process almost cost-free. We provide a rigorous theoretical guarantee for the confidence interval, demonstrating that the coverage is approximately exact with an explicit convergence rate and allowing for high confidence level inference. In particular, a new Gaussian approximation result is developed for the online estimators to characterize the coverage properties of our confidence intervals in terms of relative errors. Additionally, our method also allows for leveraging parallel computing to further accelerate calculations using multiple cores. It is easy to implement and can be integrated with existing stochastic algorithms without the need for complicated modifications.
Submitted 17 January, 2024;
originally announced January 2024.
-
Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Authors:
Ziyang Wei,
Wanrong Zhu,
Wei Biao Wu
Abstract:
Stochastic Gradient Descent (SGD) is one of the simplest and most popular algorithms in modern statistical and machine learning due to its computational and memory efficiency. Various averaging schemes have been proposed to accelerate the convergence of SGD in different settings. In this paper, we explore a general averaging scheme for SGD. Specifically, we establish the asymptotic normality of a broad range of weighted averaged SGD solutions and provide asymptotically valid online inference approaches. Furthermore, we propose an adaptive averaging scheme that exhibits both optimal statistical rate and favorable non-asymptotic convergence, drawing insights from the optimal weight for the linear model in terms of non-asymptotic mean squared error (MSE).
Submitted 18 July, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
High Dimensional Analysis of Variance in Multivariate Linear Regression
Authors:
Zhipeng Lou,
Xianyang Zhang,
Wei Biao Wu
Abstract:
In this paper, we develop a systematic theory for high dimensional analysis of variance in multivariate linear regression, where the dimension and the number of coefficients can both grow with the sample size. We propose a new U-type test statistic to test linear hypotheses and establish a high dimensional Gaussian approximation result under fairly mild moment assumptions. Our general framework and theory can be applied to deal with the classical one-way multivariate ANOVA and the nonparametric one-way MANOVA in high dimensions. To implement the test procedure in practice, we introduce a sample-splitting based estimator of the second moment of the error covariance and discuss its properties. A simulation study shows that our proposed test outperforms some existing tests in various settings.
Submitted 10 January, 2023;
originally announced January 2023.
-
$\ell^2$ Inference for Change Points in High-Dimensional Time Series via a Two-Way MOSUM
Authors:
Jiaqi Li,
Likai Chen,
Weining Wang,
Wei Biao Wu
Abstract:
We propose an inference method for detecting multiple change points in high-dimensional time series, targeting dense or spatially clustered signals. Our method aggregates moving sum (MOSUM) statistics cross-sectionally by an $\ell^2$-norm and maximizes them over time. We further introduce a novel Two-Way MOSUM, which utilizes spatial-temporal moving regions to search for breaks, with the added advantage of enhancing testing power when breaks occur in only a few groups. The limiting distribution of an $\ell^2$-aggregated statistic is established for testing break existence by extending a high-dimensional Gaussian approximation theorem to spatial-temporal non-stationary processes. Simulation studies exhibit promising performance of our test in detecting non-sparse weak signals. Two applications, analyzing equity returns and COVID-19 cases in the United States, showcase the real-world relevance of our proposed algorithms.
Submitted 3 July, 2023; v1 submitted 27 August, 2022;
originally announced August 2022.
-
Testing and estimation of clustered signals
Authors:
Hongyuan Cao,
Wei Biao Wu
Abstract:
We propose a change-point detection method for large scale multiple testing problems with data having clustered signals. Unlike the classic change-point setup, the signals can vary in size within a cluster. The clustering structure on the signals enables us to effectively delineate the boundaries between signal and non-signal segments. New test statistics are proposed for observations from one and/or multiple realizations. Their asymptotic distributions are derived. We also study the associated variance estimation problem. We allow the variances to be heteroscedastic in the multiple realization case, which substantially expands the applicability of the proposed method. Simulation studies demonstrate that the proposed approach has a favorable performance. Our procedure is applied to an array-based Comparative Genomic Hybridization (aCGH) dataset.
Submitted 29 April, 2021;
originally announced April 2021.
-
Long-term prediction intervals with many covariates
Authors:
Sayar Karmakar,
Marek Chudy,
Wei Biao Wu
Abstract:
Accurate forecasting is one of the fundamental focuses of the econometric time-series literature. Often practitioners and policy makers want to predict outcomes of an entire time horizon in the future instead of just a single $k$-step ahead prediction. These series, apart from their own possible non-linear dependence, are often also influenced by many external predictors. In this paper, we construct prediction intervals of time-aggregated forecasts in a high-dimensional regression setting. Our approach is based on quantiles of residuals obtained by the popular LASSO routine. We allow for general heavy-tailed, long-memory, and nonlinear stationary error processes and stochastic predictors. Through a series of systematically arranged consistency results, we provide theoretical guarantees of our proposed quantile-based method in all of these scenarios. After validating our approach using simulations, we also propose a novel bootstrap-based method that can boost the coverage of the theoretical intervals. Finally, analyzing the EPEX Spot data, we construct prediction intervals for hourly electricity prices over horizons spanning 17 weeks and contrast them to selected Bayesian and bootstrap interval forecasts.
Submitted 30 September, 2021; v1 submitted 15 December, 2020;
originally announced December 2020.
-
Explainable AI for a No-Teardown Vehicle Component Cost Estimation: A Top-Down Approach
Authors:
Ayman Moawad,
Ehsan Islam,
Namdoo Kim,
Ram Vijayagopal,
Aymeric Rousseau,
Wei Biao Wu
Abstract:
The broader ambition of this article is to popularize an approach for the fair distribution of the quantity of a system's output to its subsystems, while allowing for underlying complex subsystem level interactions. Particularly, we present a data-driven approach to vehicle price modeling and its component price estimation by leveraging a combination of concepts from machine learning and game theory. We show an alternative to common teardown methodologies and surveying approaches for component and vehicle price estimation at the manufacturer's suggested retail price (MSRP) level that has the advantage of bypassing the uncertainties involved in 1) the gathering of teardown data, 2) the need to perform expensive and biased surveying, and 3) the need to perform retail price equivalent (RPE) or indirect cost multiplier (ICM) adjustments to mark up direct manufacturing costs to MSRP. This novel exercise not only provides accurate pricing of the technologies at the customer level, but also shows the large gaps, known a priori, in pricing strategies between manufacturers, vehicle sizes, classes, market segments, and other factors. There is also clear synergism or interaction between the price of certain technologies and other specifications present in the same vehicle. Those (unsurprising) results are an indication that old methods of manufacturer-level component costing, aggregation, and the application of a flat and rigid RPE or ICM adjustment factor should be carefully examined. The findings are based on an extensive database, developed by Argonne National Laboratory, that includes more than 64,000 vehicles covering MY1990 to MY2020 over hundreds of vehicle specs.
Submitted 15 June, 2020;
originally announced June 2020.
-
Online Covariance Matrix Estimation in Stochastic Gradient Descent
Authors:
Wanrong Zhu,
Xi Chen,
Wei Biao Wu
Abstract:
The stochastic gradient descent (SGD) algorithm is widely used for parameter estimation, especially for huge data sets and online learning. While this recursive algorithm is popular for computation and memory efficiency, quantifying variability and randomness of the solutions has been rarely studied. This paper aims at conducting statistical inference of SGD-based estimates in an online setting. In particular, we propose a fully online estimator for the covariance matrix of averaged SGD iterates (ASGD) only using the iterates from SGD. We formally establish our online estimator's consistency and show that the convergence rate is comparable to offline counterparts. Based on the classic asymptotic normality results of ASGD, we construct asymptotically valid confidence intervals for model parameters. Upon receiving new observations, we can quickly update the covariance matrix estimate and the confidence intervals. This approach fits in an online setting and takes full advantage of SGD: efficiency in computation and memory.
Submitted 22 June, 2021; v1 submitted 10 February, 2020;
originally announced February 2020.
-
Uniform Convergence of Multivariate Spectral Density Estimates
Authors:
Wei Biao Wu,
Paolo Zaffaroni
Abstract:
We consider uniform moment convergence of lag-window spectral density estimates for univariate and multivariate stationary processes. Optimal rates of convergence are obtained under mild and easily verifiable conditions. Our theory complements earlier results which primarily concern weak or in-probability convergence.
Submitted 14 May, 2015;
originally announced May 2015.
-
Testing for Parallelism Between Trends in Multiple Time Series
Authors:
David Degras,
Zhiwei Xu,
Ting Zhang,
Wei Biao Wu
Abstract:
This paper considers the inference of trends in multiple, nonstationary time series. To test whether trends are parallel to each other, we use a parallelism index based on the L2-distances between nonparametric trend estimators and their average. A central limit theorem is obtained for the test statistic and the test's consistency is established. We propose a simulation-based approximation to the distribution of the test statistic, which significantly improves upon the normal approximation. The test is also applied to devise a clustering algorithm. Finally, the finite-sample properties of the test are assessed through simulations and the test methodology is illustrated with time series from Motorola cell phone activity in the United States.
Submitted 24 May, 2011; v1 submitted 10 October, 2010;
originally announced October 2010.