-
SGMM: Stochastic Approximation to Generalized Method of Moments
Authors:
Xiaohong Chen,
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin,
Myunghyun Song
Abstract:
We introduce a new class of algorithms, Stochastic Generalized Method of Moments (SGMM), for estimation and inference on (overidentified) moment restriction models. Our SGMM is a novel stochastic approximation alternative to the popular Hansen (1982) (offline) GMM, and offers fast and scalable implementation with the ability to handle streaming datasets in real time. We establish the almost sure convergence and the (functional) central limit theorem for the inefficient online 2SLS and the efficient SGMM. Moreover, we propose online versions of the Durbin-Wu-Hausman and Sargan-Hansen tests that can be seamlessly integrated within the SGMM framework. Extensive Monte Carlo simulations show that as the sample size increases, the SGMM matches the standard (offline) GMM in terms of estimation accuracy and gains in computational efficiency, indicating its practical value for both large-scale and online datasets. We demonstrate the efficacy of our approach by a proof of concept using two well-known empirical examples with large sample sizes.
Submitted 30 October, 2023; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Bootstraps for Dynamic Panel Threshold Models
Authors:
Woosik Gong,
Myung Hwan Seo
Abstract:
This paper develops valid bootstrap inference methods for the dynamic panel threshold regression. For the first-differenced generalized method of moments (GMM) estimation for the dynamic short panel, we show that the standard nonparametric bootstrap is inconsistent. The inconsistency is due to an $n^{1/4}$-consistent non-normal asymptotic distribution for the threshold estimate when the parameter resides within the continuity region of the parameter space, which stems from the rank deficiency of the approximate Jacobian of the sample moment conditions on the continuity region. We propose a grid bootstrap to construct confidence sets for the threshold, a residual bootstrap to construct confidence intervals for the coefficients, and a bootstrap for testing continuity. They are shown to be valid under uncertain continuity. A set of Monte Carlo experiments demonstrates that the proposed bootstraps perform well in finite samples and improve upon the asymptotic normal approximation even under a large jump at the threshold. An empirical application to a firms' investment model illustrates our methods.
Submitted 17 November, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
Fast Inference for Quantile Regression with Tens of Millions of Observations
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
Big data analytics has opened new avenues in economic research, but the challenge of analyzing datasets with tens of millions of observations is substantial. Conventional econometric methods based on extremum estimators require large amounts of computing resources and memory, which are often not readily available. In this paper, we focus on linear quantile regression applied to "ultra-large" datasets, such as U.S. decennial censuses. A fast inference framework is presented, utilizing stochastic subgradient descent (S-subGD) updates. The inference procedure handles cross-sectional data sequentially: (i) updating the parameter estimate with each incoming "new observation", (ii) aggregating it as a $\textit{Polyak-Ruppert}$ average, and (iii) computing a pivotal statistic for inference using only a solution path. The methodology draws from time-series regression to create an asymptotically pivotal statistic through random scaling. Our proposed test statistic is calculated in a fully online fashion and critical values are calculated without resampling. We conduct extensive numerical studies to showcase the computational merits of our proposed inference. For inference problems as large as $(n, d) \sim (10^7, 10^3)$, where $n$ is the sample size and $d$ is the number of regressors, our method generates new insights, surpassing current inference methods in computation. Our method specifically reveals trends in the gender gap in the U.S. college wage premium using millions of observations, while controlling for over $10^3$ covariates to mitigate confounding effects.
Submitted 31 October, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
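Steps (i) and (ii) of the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `quantile_subgd`, the learning-rate schedule `gamma0 * i**(-a)`, and the zero starting value are assumptions made for the example, and step (iii), the pivotal statistic, is left out.

```python
import numpy as np


def quantile_subgd(X, y, tau=0.5, gamma0=1.0, a=0.501):
    """One pass of stochastic subgradient descent for linear quantile
    regression, returning the Polyak-Ruppert average of the iterates.

    gamma0 and a are hypothetical tuning parameters for this sketch.
    """
    n, d = X.shape
    beta = np.zeros(d)       # current iterate (assumed start at zero)
    beta_bar = np.zeros(d)   # running average of the iterates
    for i in range(n):
        x_i, y_i = X[i], y[i]
        # Subgradient of the check loss rho_tau(y - x'b) with respect to b:
        # x * (1{y < x'b} - tau)
        grad = (float(y_i < x_i @ beta) - tau) * x_i
        # (i) update the estimate with the incoming observation
        beta = beta - gamma0 * (i + 1) ** (-a) * grad
        # (ii) fold the new iterate into the running average
        beta_bar += (beta - beta_bar) / (i + 1)
    return beta_bar
```

Each observation is touched once and only two length-`d` vectors are kept in memory, which is the property that makes this style of update attractive for very large samples.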
-
What Impulse Response Do Instrumental Variables Identify?
Authors:
Bonsoo Koo,
Seojeong Lee,
Myung Hwan Seo
Abstract:
Macro shocks are often composites, yet this is overlooked in impulse response analysis. When an instrumental variable (IV) is used to identify a composite shock, it violates the common IV exclusion restriction. We show that the Local Projection-IV estimand is represented as a weighted average of component-wise impulse responses but with possibly negative weights, which occur when the IV and shock components have opposite correlations. We further develop alternative (set-) identification strategies for the LP-IV based on sign restrictions or additional granular information. Our applications confirm the composite nature of monetary policy shocks and reveal a non-defense spending multiplier exceeding one.
Submitted 23 August, 2023; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Motion-selective coherent population trapping by Raman sideband cooling along two paths in a $Λ$ configuration
Authors:
Sooyoung Park,
Meung Ho Seo,
Ryun Ah Kim,
D. Cho
Abstract:
We report our experiment on sideband cooling with two Raman transitions in a $Λ$ configuration that allows selective coherent population trapping (CPT) of the motional ground state. The cooling method is applied to $^{87}$Rb atoms in a circularly-polarized one-dimensional optical lattice. Owing to the vector polarizability, the vibration frequency of a trapped atom depends on its Zeeman quantum number, and CPT resonance for a pair of bound states in the $Λ$ configuration depends on their vibrational quantum numbers. We call this scheme motion-selective coherent population trapping (MSCPT) and it is a trapped-atom analogue to the velocity-selective CPT developed for free He atoms. We observe a pronounced dip in temperature near a detuning for the Raman beams to satisfy the CPT resonance condition for the motional ground state. Although the lowest temperature we obtain is ten times the recoil limit owing to the large Lamb-Dicke parameter of 2.3 in our apparatus, the experiment demonstrates that MSCPT enhances the effectiveness of Raman sideband cooling and enlarges the range of its application. Discussions on design parameters optimized for MSCPT on $^{87}$Rb atoms and opportunities provided by diatomic polar molecules, whose Stark shift shows strong dependence on the rotational quantum number, are included.
Submitted 10 May, 2022;
originally announced May 2022.
-
Motion-selective coherent population trapping for subrecoil cooling of optically trapped atoms outside the Lamb-Dicke regime
Authors:
Hyun Gyung Lee,
Sooyoung Park,
Meung Ho Seo,
D. Cho
Abstract:
We propose a scheme that combines velocity-selective coherent population trapping (CPT) and Raman sideband cooling (RSC) for subrecoil cooling of optically trapped atoms outside the Lamb-Dicke regime. This scheme is based on an inverted $\mathsf{Y}$ configuration in an alkali-metal atom. It consists of a $Λ$ formed by two Raman transitions between the ground hyperfine levels and the $D$ transition, allowing RSC along two paths and formation of a CPT dark state. Using state-dependent difference in vibration frequency of the atom in a circularly polarized trap, we can tune the $Λ$ to make only the motional ground state a CPT dark state. We call this scheme motion-selective coherent population trapping (MSCPT). We write the master equations for RSC and MSCPT and solve them numerically for a $^{87}$Rb atom in a one-dimensional optical lattice when the Lamb-Dicke parameter is 1. Although MSCPT reaches the steady state slowly compared with RSC, the former consistently produces colder atoms than the latter. The numerical results also show that subrecoil cooling by MSCPT outside the Lamb-Dicke regime is possible under a favorable, yet experimentally feasible, condition. We explain this performance quantitatively by calculating the relative darkness of each motional state. Finally, we discuss application of the MSCPT scheme to an optically trapped diatomic polar molecule whose Stark shift and vibration frequency exhibit large variations depending on the rotational quantum number.
Submitted 26 May, 2022; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Minimax Risk in Estimating Kink Threshold and Testing Continuity
Authors:
Javier Hidalgo,
Heejun Lee,
Jungyoon Lee,
Myung Hwan Seo
Abstract:
We derive a risk lower bound in estimating the threshold parameter without knowing whether the threshold regression model is continuous or not. The bound goes to zero as the sample size $ n $ grows only at the cube root rate. Motivated by this finding, we develop a continuity test for the threshold regression model and a bootstrap to compute its \textit{p}-values. The validity of the bootstrap is established, and its finite sample property is explored through Monte Carlo simulations.
Submitted 1 March, 2022;
originally announced March 2022.
-
Regression Discontinuity Design with Potentially Many Covariates
Authors:
Yoichi Arai,
Taisuke Otsu,
Myung Hwan Seo
Abstract:
This paper studies the case of possibly high-dimensional covariates in the regression discontinuity design (RDD) analysis. In particular, we propose estimation and inference methods for the RDD models with covariate selection which perform stably regardless of the number of covariates. The proposed methods combine the local approach using kernel weights with $\ell_{1}$-penalization to handle high-dimensional covariates. We provide theoretical and numerical results which illustrate the usefulness of the proposed methods. Theoretically, we present risk and coverage properties for our point estimation and inference methods, respectively. In a certain special case, the proposed estimator becomes more efficient than the conventional covariate-adjusted estimator at the cost of an additional sparsity condition. Numerically, our simulation experiments and empirical example show that the proposed methods are robust to the number of covariates in terms of bias and variance for point estimation and coverage probability and interval length for inference.
Submitted 18 February, 2024; v1 submitted 17 September, 2021;
originally announced September 2021.
-
Fast and Robust Online Inference with Stochastic Gradient Descent via Random Scaling
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent (SGD) algorithms. We leverage insights from time series regression in econometrics and construct asymptotically pivotal statistics via random scaling. Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem. Our proposed inference method has a couple of key advantages over the existing methods. First, the test statistic is computed in an online fashion with only SGD iterates and the critical values can be obtained without any resampling methods, thereby allowing for efficient implementation suitable for massive online data. Second, there is no need to estimate the asymptotic variance and our inference method is shown to be robust to changes in the tuning parameters for SGD algorithms in simulation experiments with synthetic data.
Submitted 6 October, 2021; v1 submitted 6 June, 2021;
originally announced June 2021.
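As a rough illustration of how a random-scaling quantity can be built from the SGD iterates alone, the sketch below forms a matrix from the partial sums of the centered iterates. The partial-sum form used here is an assumption borrowed from the time-series (fixed-b) literature that the abstract alludes to; the paper's exact definition and its critical values should be checked before relying on it. The function name `random_scaling_matrix` is hypothetical.

```python
import numpy as np


def random_scaling_matrix(iterates):
    """Random-scaling matrix built from a path of SGD iterates.

    iterates: array of shape (n, d), one row per SGD iterate.
    Assumed form (not taken from the abstract):
        V = (1 / n^2) * sum_{s=1}^{n} S_s S_s',
        S_s = sum_{i<=s} (beta_i - beta_bar_n) = s * (beta_bar_s - beta_bar_n).
    """
    n, _ = iterates.shape
    s = np.arange(1, n + 1)[:, None]
    running_means = np.cumsum(iterates, axis=0) / s   # beta_bar_s for each s
    partial_sums = s * (running_means - running_means[-1])
    return partial_sums.T @ partial_sums / n**2
```

Under this assumed form, a t-type statistic for coordinate `j` would be `sqrt(n) * (beta_bar[j] - beta_null[j]) / sqrt(V[j, j])`, compared against a nonstandard critical value taken from the relevant tabulated distribution rather than from the normal table. No variance estimator or resampling step is involved.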
-
Inference for parameters identified by conditional moment restrictions using a generalized Bierens maximum statistic
Authors:
Xiaohong Chen,
Sokbae Lee,
Myung Hwan Seo,
Myunghyun Song
Abstract:
Many economic panel and dynamic models, such as rational behavior and Euler equations, imply that the parameters of interest are identified by conditional moment restrictions. We introduce a novel inference method without any prior information about which conditioning instruments are weak or irrelevant. Building on Bierens (1990), we propose penalized maximum statistics and combine bootstrap inference with model selection. Our method optimizes asymptotic power by solving a data-dependent max-min problem for tuning parameter selection. Extensive Monte Carlo experiments, based on an empirical example, demonstrate the extent to which our inference procedure is superior to those available in the literature.
Submitted 28 June, 2024; v1 submitted 25 August, 2020;
originally announced August 2020.
-
Frequent or Systematic Changes? discussion on "Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection."
Authors:
Myung Hwan Seo
Abstract:
We discuss Fryzlewicz (2020), which proposes the WBS2.SDLL approach to detect possibly frequent changes in the mean of a series. Our focus is on potential issues related to model misspecification. We present some numerical examples, such as the self-exciting threshold autoregression and the unit root process, that can be confused with a frequent change-points model.
Submitted 11 August, 2020;
originally announced August 2020.
-
Sparse HP Filter: Finding Kinks in the COVID-19 Contact Rate
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
In this paper, we estimate the time-varying COVID-19 contact rate of a Susceptible-Infected-Recovered (SIR) model. Our measurement of the contact rate is constructed using data on actively infected, recovered and deceased cases. We propose a new trend filtering method that is a variant of the Hodrick-Prescott (HP) filter, constrained by the number of possible kinks. We term it the $\textit{sparse HP filter}$ and apply it to daily data from five countries: Canada, China, South Korea, the UK and the US. Our new method yields kinks that are well aligned with actual events in each country. We find that the sparse HP filter provides fewer kinks than the $\ell_1$ trend filter, while both methods fit the data equally well. Theoretically, we establish risk consistency of both the sparse HP and $\ell_1$ trend filters. Ultimately, we propose to use time-varying $\textit{contact growth rates}$ to document and monitor outbreaks of COVID-19.
Submitted 29 July, 2020; v1 submitted 18 June, 2020;
originally announced June 2020.
-
Robust Inference on Infinite and Growing Dimensional Time Series Regression
Authors:
Abhimanyu Gupta,
Myung Hwan Seo
Abstract:
We develop a class of tests for time series models such as multiple regression with growing dimension, infinite-order autoregression and nonparametric sieve regression. Examples include the Chow test and general linear restriction tests of growing rank $p$. Employing such increasing $p$ asymptotics, we introduce a new scale correction to conventional test statistics which accounts for a high-order long-run variance (HLV) that emerges as $ p $ grows with sample size. We also propose a bias correction via a null-imposed bootstrap to alleviate finite sample bias without sacrificing power unduly. A simulation study shows the importance of robustifying testing procedures against the HLV even when $ p $ is moderate. The tests are illustrated with an application to the oil regressions in Hamilton (2003).
Submitted 1 April, 2023; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Desperate times call for desperate measures: government spending multipliers in hard times
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
We investigate state-dependent effects of fiscal multipliers and allow for endogenous sample splitting to determine whether the US economy is in a slack state. When the endogenized slack state is estimated as the period of the unemployment rate higher than about 12 percent, the estimated cumulative multipliers are significantly larger during slack periods than non-slack periods and are above unity. We also examine the possibility of time-varying regimes of slackness and find that our empirical results are robust under a more flexible framework. Our estimation results point out the importance of the heterogeneous effects of fiscal policy and shed light on the prospect of fiscal policy in response to economic shocks from the current COVID-19 pandemic.
Submitted 28 May, 2020; v1 submitted 21 September, 2019;
originally announced September 2019.
-
Estimation of Dynamic Panel Threshold Model using Stata
Authors:
Myung Hwan Seo,
Sueyoul Kim,
Young-Joo Kim
Abstract:
We develop a Stata command xthenreg to implement the first-differenced GMM estimation of the dynamic panel threshold model, which Seo and Shin (2016, Journal of Econometrics 195: 169-186) have proposed. Furthermore, we derive the asymptotic variance formula for a kink-constrained GMM estimator of the dynamic threshold model and include an estimation algorithm. We also propose a fast bootstrap algorithm to implement the bootstrap for the linearity test. The use of the command is illustrated through a Monte Carlo simulation and an economic application.
Submitted 26 February, 2019;
originally announced February 2019.
-
Factor-Driven Two-Regime Regression
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
We propose a novel two-regime regression model where regime switching is driven by a vector of possibly unobservable factors. When the factors are latent, we estimate them by the principal component analysis of a panel data set. We show that the optimization problem can be reformulated as mixed integer optimization, and we present two alternative computational algorithms. We derive the asymptotic distribution of the resulting estimator under the scheme that the threshold effect shrinks to zero. In particular, we establish a phase transition that describes the effect of first-stage factor estimation as the cross-sectional dimension of panel data increases relative to the time-series dimension. Moreover, we develop bootstrap inference and illustrate our methods via numerical studies.
Submitted 10 September, 2020; v1 submitted 25 October, 2018;
originally announced October 2018.
-
Robust inference for threshold regression models
Authors:
Javier Hidalgo,
Jungyoon Lee,
Myung Hwan Seo
Abstract:
This paper is concerned with inference in threshold regression models when practitioners do not know whether at the threshold point the true specification has a kink or a jump. We nest previous works that assume either continuity or discontinuity at the threshold point and develop robust inference methods on the parameters of the model, which are valid under both specifications. In particular, we find that the parameter values under the kink restriction are irregular points of the Hessian matrix of the expected Gaussian quasi-likelihood. This irregularity destroys the asymptotic normality and induces the non-standard cube root convergence rate for the threshold estimate. However, it also enables us to obtain the same asymptotic distribution as in Hansen (2000) for the quasi-likelihood ratio statistic for the unknown threshold up to an unknown scale parameter. We show that this scale parameter can be consistently estimated by a kernel method as long as no higher order kernel is used. Furthermore, we propose to construct confidence intervals for the unknown threshold by bootstrap test inversion, also known as grid bootstrap. Finite sample performances of the grid bootstrap confidence intervals are examined through Monte Carlo simulations. We also apply our procedure to an empirical application in economics.
Submitted 12 November, 2018; v1 submitted 2 February, 2017;
originally announced February 2017.
-
Local M-estimation with Discontinuous Criterion for Dependent and Limited Observations
Authors:
Myung Hwan Seo,
Taisuke Otsu
Abstract:
This paper examines asymptotic properties of local M-estimators under three sets of high-level conditions. These conditions are sufficiently general to cover the minimum volume predictive region, conditional maximum score estimator for a panel data discrete choice model, and many other widely used estimators in statistics and econometrics. Specifically, they allow for discontinuous criterion functions of weakly dependent observations, which may be localized by kernel smoothing and contain nuisance parameters whose dimension may grow to infinity. Furthermore, the localization can occur around parameter values rather than around a fixed point and the observation may take limited values, which leads to set estimators. Our theory produces three different nonparametric cube root rates and enables valid inference for the local M-estimators, building on novel maximal inequalities for weakly dependent data. Our results include the standard cube root asymptotics as a special case. To illustrate the usefulness of our results, we verify our conditions for various examples such as the Hough transform estimator with diminishing bandwidth, maximum score-type set estimator, and many others.
Submitted 9 October, 2016;
originally announced October 2016.
-
Oracle Estimation of a Change Point in High Dimensional Quantile Regression
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
In this paper, we consider a high-dimensional quantile regression model where the sparsity structure may differ between two sub-populations. We develop $\ell_1$-penalized estimators of both regression coefficients and the threshold parameter. Our penalized estimators not only select covariates but also discriminate between a model with homogeneous sparsity and a model with a change point. As a result, it is not necessary to know or pretest whether the change point is present, or where it occurs. Our estimator of the change point achieves an oracle property in the sense that its asymptotic distribution is the same as if the unknown active sets of regression coefficients were known. Importantly, we establish this oracle property without a perfect covariate selection, thereby avoiding the need for the minimum level condition on the signals of active covariates. Dealing with high-dimensional quantile regression with an unknown change point calls for a new proof technique since the quantile loss function is non-smooth and furthermore the corresponding objective function is non-convex with respect to the change point. The technique developed in this paper is applicable to a general M-estimation framework with a change point, which may be of independent interest. The proposed methods are then illustrated via Monte Carlo experiments and an application to tipping in the dynamics of racial segregation.
Submitted 16 December, 2016; v1 submitted 1 March, 2016;
originally announced March 2016.
-
Structural Change in Sparsity
Authors:
Sokbae Lee,
Yuan Liao,
Myung Hwan Seo,
Youngki Shin
Abstract:
In the high-dimensional sparse modeling literature, it has been crucially assumed that the sparsity structure of the model is homogeneous over the entire population. That is, the identities of important regressors are invariant across the population and across the individuals in the collected sample. In practice, however, the sparsity structure may not always be invariant in the population, due to heterogeneity across different sub-populations. We consider a general, possibly non-smooth M-estimation framework, allowing a possible structural change regarding the identities of important regressors in the population. Our penalized M-estimator not only selects covariates but also discriminates between a model with homogeneous sparsity and a model with a structural change in sparsity. As a result, it is not necessary to know or pretest whether the structural change is present, or where it occurs. We derive asymptotic bounds on the estimation loss of the penalized M-estimators, and achieve the oracle properties. We also show that when there is a structural change, the estimator of the threshold parameter is super-consistent. If the signal is relatively strong, the rates of convergence can be further improved and asymptotic distributional properties of the estimators including the threshold estimator can be established using an adaptive penalization. The proposed methods are then applied to quantile regression and logistic regression models and are illustrated via Monte Carlo experiments.
Submitted 19 November, 2014; v1 submitted 11 November, 2014;
originally announced November 2014.
-
The Lasso for High-Dimensional Regression with a Possible Change-Point
Authors:
Sokbae Lee,
Myung Hwan Seo,
Youngki Shin
Abstract:
We consider a high-dimensional regression model with a possible change-point due to a covariate threshold and develop the Lasso estimator of regression coefficients as well as the threshold parameter. Our Lasso estimator not only selects covariates but also selects a model between linear and threshold regression models. Under a sparsity assumption, we derive non-asymptotic oracle inequalities for both the prediction risk and the $\ell_1$ estimation loss for regression coefficients. Since the Lasso estimator selects variables simultaneously, we show that oracle inequalities can be established without pretesting the existence of the threshold effect. Furthermore, we establish conditions under which the estimation error of the unknown threshold parameter can be bounded by a nearly $n^{-1}$ factor even when the number of regressors can be much larger than the sample size ($n$). We illustrate the usefulness of our proposed estimation method via Monte Carlo simulations and an application to real data.
Submitted 19 April, 2014; v1 submitted 21 September, 2012;
originally announced September 2012.