-
Assessing the strength of many instruments with the first-stage F and Cragg-Donald statistics
Authors:
Zhenhong Huang,
Chen Wang,
Jianfeng Yao
Abstract:
This paper investigates the behavior of the first-stage F statistic of Stock and Yogo (2005) and the Cragg-Donald statistic (Cragg and Donald, 1993) when the number of instruments and the sample size go to infinity at comparable magnitudes. Our theory shows that the first-stage F test is oversized for detecting many weak instruments. We next propose an asymptotically valid correction of the F statistic for testing weakness of instruments. The theory is also used to construct confidence intervals for the strength of instruments. As for the Cragg-Donald statistic, we obtain an asymptotically valid correction in the case of two endogenous variables. Monte Carlo experiments demonstrate the satisfactory performance of the proposed methods with both a single and multiple endogenous variables. The usefulness of the proposed tests is illustrated by an analysis of the returns to education data in Angrist and Krueger (1991).
Submitted 28 February, 2023;
originally announced February 2023.
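For readers unfamiliar with the first-stage F statistic named in the abstract above, the following is a minimal sketch of the classical (uncorrected) statistic for a single endogenous regressor. It is not the corrected statistic proposed in the paper. The function name `first_stage_f`, the simulated design, and the assumption that there is no intercept or other exogenous regressor are illustrative choices, not taken from the source.

```python
import numpy as np

def first_stage_f(x, Z):
    """Classical first-stage F statistic for H0: all instrument coefficients are zero.

    Assumes (for illustration) a single endogenous regressor x of length n and an
    n x k instrument matrix Z, with no intercept or other exogenous regressors.
    """
    n, k = Z.shape
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)   # first-stage OLS of x on Z
    fitted = Z @ coef
    resid = x - fitted
    explained = fitted @ fitted                    # x' P_Z x
    unexplained = resid @ resid                    # x' M_Z x
    return (explained / k) / (unexplained / (n - k))

# Hypothetical simulated design with many instruments (k comparable to n).
rng = np.random.default_rng(0)
n, k = 500, 100
Z = rng.standard_normal((n, k))
v = rng.standard_normal(n)
x_weak = Z @ np.full(k, 0.01) + v     # instruments carry almost no signal
x_strong = Z @ np.full(k, 0.5) + v    # instruments carry a lot of signal

f_weak = first_stage_f(x_weak, Z)
f_strong = first_stage_f(x_strong, Z)
print(f"F (weak): {f_weak:.2f}, F (strong): {f_strong:.2f}")
```

The sketch only shows how the raw statistic is formed; the abstract's point is that comparing this raw value with conventional critical values can be misleading when k grows with n.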
-
A specification test for the strength of instrumental variables
Authors:
Zhenhong Huang,
Chen Wang,
Jianfeng Yao
Abstract:
This paper develops a new specification test for instrument weakness when the number of instruments $K_n$ is large, with a magnitude comparable to the sample size $n$. The test relies on the fact that the difference between the two-stage least squares (2SLS) estimator and the ordinary least squares (OLS) estimator asymptotically disappears when there are many weak instruments, but otherwise converges to a non-zero limit. We establish the limiting distribution of the difference under these two specifications, and introduce a delete-$d$ jackknife procedure to consistently estimate the asymptotic variance/covariance of the difference. Monte Carlo experiments demonstrate the good performance of the test procedure for both single and multiple endogenous variables. Additionally, we re-examine the analysis of returns to education data in Angrist and Krueger (1991) using our proposed test. Both the simulation results and the empirical analysis indicate the reliability of the test.
Submitted 28 February, 2023;
originally announced February 2023.
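The quantity behind the test described above, the gap between the 2SLS and OLS estimates, can be illustrated with a small simulation. This is a sketch under assumed settings (one endogenous regressor, Gaussian data, the hypothetical helpers `ols_and_2sls` and `simulate_gap`); it does not implement the paper's test statistic or its delete-$d$ jackknife variance estimator.

```python
import numpy as np

def ols_and_2sls(y, x, Z):
    """Return (OLS, 2SLS) slope estimates for y = beta * x + u with instruments Z.

    Illustrative setting: a single endogenous regressor and no intercept.
    """
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)  # first-stage coefficients
    x_hat = Z @ pi_hat                              # projection of x onto Z
    beta_ols = (x @ y) / (x @ x)
    beta_2sls = (x_hat @ y) / (x_hat @ x)
    return beta_ols, beta_2sls

def simulate_gap(pi_scale, n=2000, k=200, beta=1.0, rho=0.8, seed=0):
    """Simulate one data set and return beta_2sls - beta_ols (hypothetical design)."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, k))
    v = rng.standard_normal(n)
    # Structural error u is correlated with v, which makes x endogenous.
    u = rho * v + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    x = Z @ np.full(k, pi_scale) + v
    y = beta * x + u
    beta_ols, beta_2sls = ols_and_2sls(y, x, Z)
    return beta_2sls - beta_ols

gap_irrelevant = simulate_gap(pi_scale=0.0)        # instruments carry no signal
gap_relevant = simulate_gap(pi_scale=200 ** -0.5)  # first-stage signal variance of 1
print(f"gap (irrelevant): {gap_irrelevant:.3f}, gap (relevant): {gap_relevant:.3f}")
```

In this design the gap stays close to zero when the instruments are irrelevant and is clearly non-zero when they are informative, which is the contrast the abstract exploits.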
-
Unified and robust Lagrange multiplier type tests for cross-sectional independence in large panel data models
Authors:
Zhenhong Huang,
Zhaoyuan Li,
Jianfeng Yao
Abstract:
This paper revisits the Lagrange multiplier type test for the null hypothesis of no cross-sectional dependence in large panel data models. We propose a unified test procedure and its power enhancement version, which are robust across a wide class of panel models. Specifically, the two procedures are applicable to both heterogeneous and fixed effects panel data models in the presence of weakly exogenous as well as lagged dependent regressors, allowing for a general form of nonnormal error distribution. With tools from Random Matrix Theory, the asymptotic validity of the test procedures is established under the simultaneous limit scheme where the number of time periods and the number of cross-sectional units go to infinity proportionally. The derived theory is accompanied by detailed Monte Carlo experiments, which confirm the robustness of the two tests and also suggest the validity of the power enhancement technique.
Submitted 28 February, 2023;
originally announced February 2023.
-
Revealing Unobservables by Deep Learning: Generative Element Extraction Networks (GEEN)
Authors:
Yingyao Hu,
Yang Liu,
Jiaxiong Yao
Abstract:
Latent variable models are crucial in scientific research, where a key variable, such as effort, ability, or belief, is unobserved in the sample but needs to be identified. This paper proposes a novel method for estimating realizations of a latent variable $X^*$ in a random sample that contains multiple measurements of it. With the key assumption that the measurements are independent conditional on $X^*$, we provide sufficient conditions under which realizations of $X^*$ in the sample are locally unique in a class of deviations, which allows us to identify realizations of $X^*$. To the best of our knowledge, this paper is the first to provide such identification in observation. We then use the Kullback-Leibler distance between the two probability densities with and without the conditional independence as the loss function to train Generative Element Extraction Networks (GEEN) that map from the observed measurements to realizations of $X^*$ in the sample. The simulation results indicate that the proposed estimator works quite well and that the estimated values are highly correlated with realizations of $X^*$. Our estimator can be applied to a large class of latent variable models, and we expect it will change how people deal with latent variables.
Submitted 3 October, 2022;
originally announced October 2022.
-
Extension of the Lagrange multiplier test for error cross-section independence to large panels with non normal errors
Authors:
Zhaoyuan Li,
Jianfeng Yao
Abstract:
This paper reexamines the seminal Lagrange multiplier test for cross-section independence in a large panel model where both the number of cross-sectional units n and the number of time series observations T can be large. The first contribution of the paper is an enlargement of the test with two extensions: first, the new asymptotic normality is derived in a simultaneous limiting scheme where the two dimensions (n, T) tend to infinity with comparable magnitudes; second, the result is valid for a general error distribution (not necessarily normal). The second contribution of the paper is a new test statistic based on the sum of the fourth powers of cross-section correlations from OLS residuals, instead of their squares used in the Lagrange multiplier statistic. This new test is generally more powerful, and the improvement is particularly visible against alternatives with weak or sparse cross-section dependence. Both a simulation study and a real data analysis are presented to demonstrate the advantages of the enlarged Lagrange multiplier test and the power enhanced test in comparison with existing procedures.
Submitted 10 March, 2021;
originally announced March 2021.
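The two building blocks contrasted in the abstract above, the sum of squared and the sum of fourth-power pairwise residual correlations, can be sketched as follows. This computes only the raw sums (with the squared sum scaled by T, as in the classical Breusch-Pagan form); the centering and scaling that yield the paper's asymptotically normal statistics are not reproduced. The function name `residual_correlation_sums` and the simulated errors, which stand in for OLS residuals, are illustrative assumptions.

```python
import numpy as np

def residual_correlation_sums(resid):
    """Return (T * sum of squared, sum of fourth-power) pairwise correlations.

    resid: T x n array, one column of residuals per cross-sectional unit.
    Correlations are computed with np.corrcoef, which demeans each column
    (an assumption that is harmless when residuals already have mean zero).
    """
    T, n = resid.shape
    corr = np.corrcoef(resid, rowvar=False)     # n x n correlation matrix
    upper = corr[np.triu_indices(n, k=1)]       # each pair i < j once
    lm_raw = T * np.sum(upper ** 2)             # classical LM-type sum
    fourth_raw = np.sum(upper ** 4)             # fourth-power sum
    return lm_raw, fourth_raw

# Hypothetical residuals: independent units versus units sharing a common factor.
rng = np.random.default_rng(0)
T, n = 100, 20
indep = rng.standard_normal((T, n))
factor = rng.standard_normal((T, 1))
dep = factor + rng.standard_normal((T, n))      # pairwise correlation about 0.5

lm_indep, fourth_indep = residual_correlation_sums(indep)
lm_dep, fourth_dep = residual_correlation_sums(dep)
print(f"LM-type sum: independent {lm_indep:.1f}, dependent {lm_dep:.1f}")
print(f"fourth-power sum: independent {fourth_indep:.4f}, dependent {fourth_dep:.4f}")
```

Both sums grow under cross-section dependence; the abstract's claim concerns which properly normalized version has more power against weak or sparse alternatives, which this sketch does not assess.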