Showing 1–33 of 33 results for author: Miao, W

  1. arXiv:2405.16130  [pdf, ps, other]

    cs.LG stat.ME

    Automating the Selection of Proxy Variables of Unmeasured Confounders

    Authors: Feng Xie, Zhengming Chen, Shanshan Luo, Wang Miao, Ruichu Cai, Zhi Geng

    Abstract: Recently, interest has grown in the use of proxy variables of unobserved confounding for inferring the causal effect in the presence of unmeasured confounders from observational data. One difficulty inhibiting the practical use is finding valid proxy variables of unobserved confounding to a target causal effect of interest. These proxy variables are typically justified by background knowledge. In… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

  2. arXiv:2312.10596  [pdf, other]

    stat.ME

    A maximin optimal approach for sampling designs in two-phase studies

    Authors: Ruoyu Wang, Qihua Wang, Wang Miao

    Abstract: Data collection costs can vary widely across variables in data science tasks. Two-phase designs are often employed to save data collection costs. In two-phase studies, inexpensive variables are collected for all subjects in the first phase, and expensive variables are measured for a subset of subjects in the second phase based on a predetermined sampling rule. The estimation efficiency under two-p… ▽ More

    Submitted 25 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  3. arXiv:2311.08691  [pdf, ps, other]

    stat.ME

    On Doubly Robust Estimation with Nonignorable Missing Data Using Instrumental Variables

    Authors: Baoluo Sun, Wang Miao, Deshanee S. Wickramarachchi

    Abstract: Suppose we are interested in the mean of an outcome that is subject to nonignorable nonresponse. This paper develops new semiparametric estimation methods with instrumental variables which affect nonresponse, but not the outcome. The proposed estimators remain consistent and asymptotically normal even under partial model misspecifications for two variation independent nuisance components. We evalu… ▽ More

    Submitted 13 May, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 29 pages

  4. Proximal Causal Inference without Uniqueness Assumptions

    Authors: Jeffrey Zhang, Wei Li, Wang Miao, Eric Tchetgen Tchetgen

    Abstract: We consider identification and inference about a counterfactual outcome mean when there is unmeasured confounding using tools from proximal causal inference (Miao et al. [2018], Tchetgen Tchetgen et al. [2020]). Proximal causal inference requires existence of solutions to at least one of two integral equations. We motivate the existence of solutions to the integral equations from proximal causal i… ▽ More

    Submitted 1 October, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: Fixed some errors and added to acknowledgements

    Journal ref: Statistics & Probability Letters 198 (2023)

  5. arXiv:2301.02225  [pdf, ps, other]

    stat.ML math.ST q-bio.QM

    $l_{1-2}$ GLasso: $L_{1-2}$ Regularized Multi-task Graphical Lasso for Joint Estimation of eQTL Mapping and Gene Network

    Authors: Wei Miao, Lan Yao

    Abstract: A critical problem in genetics is to discover how gene expression is regulated within cells. Two major tasks of regulatory association learning are: (i) identifying SNP-gene relationships, known as eQTL mapping, and (ii) determining gene-gene relationships, known as gene network estimation. To share information between these two tasks, we focus on the unified model for joint estimation of eQTL ma… ▽ More

    Submitted 4 January, 2023; originally announced January 2023.

  6. arXiv:2210.02014  [pdf, other]

    stat.ME

    Doubly Robust Proximal Synthetic Controls

    Authors: Hongxiang Qiu, Xu Shi, Wang Miao, Edgar Dobriban, Eric Tchetgen Tchetgen

    Abstract: To infer the treatment effect for a single treated unit using panel data, synthetic control methods construct a linear combination of control units' outcomes that mimics the treated unit's pre-treatment outcome trajectory. This linear combination is subsequently used to impute the counterfactual outcomes of the treated unit had it not been treated in the post-treatment period, and used to estimate… ▽ More

    Submitted 6 May, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

  7. arXiv:2210.00200  [pdf, other]

    stat.ME math.ST

    Paradoxes and resolutions for semiparametric fusion of individual and summary data

    Authors: Wenjie Hu, Ruoyu Wang, Wei Li, Wang Miao

    Abstract: Suppose we have available individual data from an internal study and various types of summary statistics from relevant external studies. External summary statistics have been used as constraints on the internal data distribution, which promised to improve the statistical inference in the internal data; however, the additional use of external summary data may lead to paradoxical results: efficiency… ▽ More

    Submitted 17 July, 2023; v1 submitted 1 October, 2022; originally announced October 2022.

    Comments: 17 pages, 3 figures

  8. arXiv:2208.01237  [pdf, ps, other]

    stat.ME

    Doubly Robust Proximal Causal Inference under Confounded Outcome-Dependent Sampling

    Authors: Kendrick Qijun Li, Xu Shi, Wang Miao, Eric Tchetgen Tchetgen

    Abstract: Unmeasured confounding and selection bias are often of concern in observational studies and may invalidate a causal analysis if not appropriately accounted for. Under outcome-dependent sampling, a latent factor that has causal effects on the treatment, outcome, and sample selection process may cause both unmeasured confounding and selection bias, rendering standard causal parameters unidentifiable… ▽ More

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: 43 pages, 1 figure

  9. arXiv:2207.08535  [pdf, other]

    stat.ME

    A self-censoring model for multivariate nonignorable nonmonotone missing data

    Authors: Yilin Li, Wang Miao, Ilya Shpitser, Eric J. Tchetgen Tchetgen

    Abstract: We introduce a self-censoring model for multivariate nonignorable nonmonotone missing data, where the missingness process of each outcome is affected by its own value and is associated with missingness indicators of other outcomes, while conditionally independent of the other outcomes. The self-censoring model complements previous graphical approaches for the analysis of multivariate nonignorable… ▽ More

    Submitted 30 September, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: 28 pages, 6 figures

  10. arXiv:2206.08228  [pdf, other]

    stat.ME

    Identification and estimation of causal effects in the presence of confounded principal strata

    Authors: Shanshan Luo, Wei Li, Wang Miao, Yangbo He

    Abstract: The principal stratification has become a popular tool to address a broad class of causal inference questions, particularly in dealing with non-compliance and truncation-by-death problems. The causal effects within principal strata which are determined by joint potential values of the intermediate variable, also known as the principal causal effects, are often of interest in these studies. Analyse… ▽ More

    Submitted 17 June, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: Figure 1 updated

  11. arXiv:2203.12509  [pdf, other]

    stat.ME

    Double Negative Control Inference in Test-Negative Design Studies of Vaccine Effectiveness

    Authors: Kendrick Qijun Li, Xu Shi, Wang Miao, Eric Tchetgen Tchetgen

    Abstract: The test-negative design (TND) has become a standard approach to evaluate vaccine effectiveness against the risk of acquiring infectious diseases in real-world settings, such as Influenza, Rotavirus, Dengue fever, and more recently COVID-19. In a TND study, individuals who experience symptoms and seek care are recruited and tested for the infectious disease which defines cases and controls. Despit… ▽ More

    Submitted 8 March, 2023; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: 78 pages, 4 figures, 5 tables

  12. arXiv:2112.02822  [pdf, other]

    stat.ME

    A stableness of resistance model for nonresponse adjustment with callback data

    Authors: Wang Miao, Xinyu Li, Baoluo Sun

    Abstract: Nonresponse arises frequently in surveys and follow-ups are routinely made to increase the response rate. In order to monitor the follow-up process, callback data have been used in social sciences and survey studies for decades. In modern surveys, the availability of callback data is increasing because the response rate is decreasing and follow-ups are essential to collect maximum information. Alt… ▽ More

    Submitted 14 February, 2023; v1 submitted 6 December, 2021; originally announced December 2021.

  13. arXiv:2110.05776  [pdf, other]

    math.ST stat.ME

    Nonparametric inference about mean functionals of nonignorable nonresponse data without identifying the joint distribution

    Authors: Wei Li, Wang Miao, Eric Tchetgen Tchetgen

    Abstract: We consider identification and inference about mean functionals of observed covariates and an outcome variable subject to nonignorable missingness. By leveraging a shadow variable, we establish a necessary and sufficient condition for identification of the mean functional even if the full data distribution is not identified. We further characterize a necessary condition for $\sqrt{n}$-estimability… ▽ More

    Submitted 6 April, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: 23 pages, 1 figure and 3 tables

  14. arXiv:2110.01106  [pdf, ps, other]

    stat.ME

    Data Integration in Causal Inference

    Authors: Xu Shi, Ziyang Pan, Wang Miao

    Abstract: Integrating data from multiple heterogeneous sources has become increasingly popular to achieve a large sample size and diverse study population. This paper reviews developments in causal inference methods that combine multiple datasets collected by potentially different designs from potentially heterogeneous populations. We summarize recent advances on combining randomized clinical trial with ext… ▽ More

    Submitted 3 October, 2021; originally announced October 2021.

  15. arXiv:2109.07030  [pdf, other]

    stat.ME math.ST

    Proximal Causal Inference for Complex Longitudinal Studies

    Authors: Andrew Ying, Wang Miao, Xu Shi, Eric J. Tchetgen Tchetgen

    Abstract: A standard assumption for causal inference about the joint effects of time-varying treatment is that one has measured sufficient covariates to ensure that within covariate strata, subjects are exchangeable across observed treatment values, also known as "sequential randomization assumption (SRA)". SRA is often criticized as it requires one to accurately measure all confounders. Realistically, meas… ▽ More

    Submitted 3 August, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

  16. arXiv:2108.13935  [pdf, other]

    stat.ME

    Theory for identification and Inference with Synthetic Controls: A Proximal Causal Inference Framework

    Authors: Xu Shi, Kendrick Li, Wang Miao, Mengtong Hu, Eric Tchetgen Tchetgen

    Abstract: Synthetic control (SC) methods are commonly used to estimate the treatment effect on a single treated unit in panel data settings. An SC is a weighted average of control units built to match the treated unit, with weights typically estimated by regressing (summaries of) pre-treatment outcomes and measured covariates of the treated unit to those of the control units. However, it has been establishe… ▽ More

    Submitted 18 February, 2023; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: 37 pages, 3 figures. The Supplementary Materials are attached

  17. arXiv:2108.12600  [pdf, other]

    stat.ME

    A robust fusion-extraction procedure with summary statistics in the presence of biased sources

    Authors: Ruoyu Wang, Qihua Wang, Wang Miao

    Abstract: Information from various data sources is increasingly available nowadays. However, some of the data sources may produce biased estimation due to commonly encountered biased sampling, population heterogeneity, or model misspecification. This calls for statistical methods to combine information in the presence of biased sources. In this paper, a robust data fusion-extraction method is proposed. The… ▽ More

    Submitted 5 February, 2023; v1 submitted 28 August, 2021; originally announced August 2021.

  18. arXiv:2011.09829  [pdf, ps, other]

    math.ST stat.ME

    Sharp bounds for variance of treatment effect estimators in the finite population in the presence of covariates

    Authors: Ruoyu Wang, Qihua Wang, Wang Miao, Xiaohua Zhou

    Abstract: In a completely randomized experiment, the variances of treatment effect estimators in the finite population are usually not identifiable and hence not estimable. Although some estimable bounds of the variances have been established in the literature, few of them are derived in the presence of covariates. In this paper, the difference-in-means estimator and the Wald estimator are considered in t… ▽ More

    Submitted 19 September, 2022; v1 submitted 19 November, 2020; originally announced November 2020.

    Comments: Accepted by Statistica Sinica

  19. arXiv:2011.08411  [pdf, other]

    stat.ME math.ST

    Semiparametric proximal causal inference

    Authors: Yifan Cui, Hongming Pu, Xu Shi, Wang Miao, Eric Tchetgen Tchetgen

    Abstract: Skepticism about the assumption of no unmeasured confounding, also known as exchangeability, is often warranted in making causal inferences from observational data; because exchangeability hinges on an investigator's ability to accurately measure covariates that capture all potential sources of confounding. In practice, the most one can hope for is that covariate measurements are at best proxies o… ▽ More

    Submitted 21 February, 2023; v1 submitted 16 November, 2020; originally announced November 2020.

  20. Improving efficiency of inference in clinical trials with external control data

    Authors: Xinyu Li, Wang Miao, Fang Lu, Xiao-Hua Zhou

    Abstract: Suppose we are interested in the effect of a treatment in a clinical trial. The efficiency of inference may be limited due to small sample size. However, external control data are often available from historical studies. Motivated by an application to Helicobacter pylori infection, we show how to borrow strength from such data to improve efficiency of inference in the clinical trial. Under an exch… ▽ More

    Submitted 9 December, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

    Comments: Accepted for publication in Biometrics; 1 figure, 3 tables

  21. arXiv:2011.04504  [pdf, other]

    stat.ME

    Identifying effects of multiple treatments in the presence of unmeasured confounding

    Authors: Wang Miao, Wenjie Hu, Elizabeth L. Ogburn, Xiaohua Zhou

    Abstract: Identification of treatment effects in the presence of unmeasured confounding is a persistent problem in the social, biological, and medical sciences. The problem of unmeasured confounding in settings with multiple treatments is most common in statistical genetics and bioinformatics settings, where researchers have developed many successful statistical strategies without engaging deeply with the c… ▽ More

    Submitted 9 July, 2022; v1 submitted 9 November, 2020; originally announced November 2020.

  22. arXiv:2009.10982  [pdf, other]

    stat.ME

    An Introduction to Proximal Causal Learning

    Authors: Eric J Tchetgen Tchetgen, Andrew Ying, Yifan Cui, Xu Shi, Wang Miao

    Abstract: A standard assumption for causal inference from observational data is that one has measured a sufficiently rich set of covariates to ensure that within covariate strata, subjects are exchangeable across observed treatment values. Skepticism about the exchangeability assumption in observational studies is often warranted because it hinges on investigators' ability to accurately measure covariates c… ▽ More

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: This paper was originally presented by the first author at the 2020 Myrto Lefkopoulou Distinguished Lectureship at the Harvard T. H. Chan School of Public Health on September 17th 2020

    MSC Class: 62A01

  23. arXiv:2009.05641  [pdf, ps, other]

    stat.ME

    A Selective Review of Negative Control Methods in Epidemiology

    Authors: Xu Shi, Wang Miao, Eric Tchetgen Tchetgen

    Abstract: Purpose of Review: Negative controls are a powerful tool to detect and adjust for bias in epidemiological research. This paper introduces negative controls to a broader audience and provides guidance on principled design and causal analysis based on a formal negative control framework. Recent Findings: We review and summarize causal and statistical assumptions, practical strategies, and validati… ▽ More

    Submitted 19 July, 2022; v1 submitted 11 September, 2020; originally announced September 2020.

  24. On Semiparametric Instrumental Variable Estimation of Average Treatment Effects through Data Fusion

    Authors: BaoLuo Sun, Wang Miao

    Abstract: Suppose one is interested in estimating causal effects in the presence of potentially unmeasured confounding with the aid of a valid instrumental variable. This paper investigates the problem of making inferences about the average treatment effect when data are fused from two separate sources, one of which contains information on the treatment and the other contains information on the outcome, whi… ▽ More

    Submitted 13 February, 2020; v1 submitted 8 October, 2018; originally announced October 2018.

    Comments: 34 pages

  25. arXiv:1808.04945  [pdf, other]

    stat.ME

    A Confounding Bridge Approach for Double Negative Control Inference on Causal Effects

    Authors: Wang Miao, Xu Shi, Eric Tchetgen Tchetgen

    Abstract: Unmeasured confounding is a key challenge for causal inference. Negative control variables are widely available in observational studies. A negative control outcome is associated with the confounder but not causally affected by the exposure in view, and a negative control exposure is correlated with the primary exposure or the confounder but does not causally affect the outcome of interest. In thi… ▽ More

    Submitted 18 September, 2020; v1 submitted 14 August, 2018; originally announced August 2018.

    Comments: Supplement and Sample Codes are included

  26. arXiv:1808.04906  [pdf, ps, other]

    stat.ME

    Multiply Robust Causal Inference with Double Negative Control Adjustment for Categorical Unmeasured Confounding

    Authors: Xu Shi, Wang Miao, Jennifer C. Nelson, Eric J. Tchetgen Tchetgen

    Abstract: Unmeasured confounding is a threat to causal inference in observational studies. In recent years, use of negative controls to mitigate unmeasured confounding has gained increasing recognition and popularity. Negative controls have a longstanding tradition in laboratory sciences and epidemiology to rule out non-causal explanations, although they have been used primarily for bias detection. Recently… ▽ More

    Submitted 4 September, 2019; v1 submitted 14 August, 2018; originally announced August 2018.

  27. arXiv:1609.08816  [pdf, ps, other]

    stat.ME

    Identifying Causal Effects With Proxy Variables of an Unmeasured Confounder

    Authors: Wang Miao, Zhi Geng, Eric Tchetgen Tchetgen

    Abstract: We consider a causal effect that is confounded by an unobserved variable, but with observed proxy variables of the confounder. We show that, with at least two independent proxy variables satisfying a certain rank condition, the causal effect is nonparametrically identified, even if the measurement error mechanism, i.e., the conditional distribution of the proxies given the confounder, may not be… ▽ More

    Submitted 28 June, 2018; v1 submitted 28 September, 2016; originally announced September 2016.

  28. arXiv:1607.03197  [pdf, other]

    stat.ME

    Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable

    Authors: BaoLuo Sun, Lan Liu, Wang Miao, Kathleen Wirth, James Robins, Eric Tchetgen Tchetgen

    Abstract: Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on the unobserved outcome. In such settings, identification is generally not possible without imposing additional assump… ▽ More

    Submitted 17 January, 2017; v1 submitted 11 July, 2016; originally announced July 2016.

    Journal ref: Statistica Sinica 28 (2018), 1965-1983

  29. arXiv:1509.03860  [pdf, other]

    math.ST stat.AP

    Identifiability of Normal and Normal Mixture Models With Nonignorable Missing Data

    Authors: Wang Miao, Peng Ding, Zhi Geng

    Abstract: Missing data problems arise in many applied research studies. They may jeopardize statistical inference of the model of interest, if the missing mechanism is nonignorable, that is, the missing mechanism depends on the missing values themselves even conditional on the observed data. With a nonignorable missing mechanism, the model of interest is often not identifiable without imposing further assum… ▽ More

    Submitted 13 September, 2015; originally announced September 2015.

  30. arXiv:1509.02556  [pdf, other]

    stat.ME

    Identification, Doubly Robust Estimation, and Semiparametric Efficiency Theory of Nonignorable Missing Data With a Shadow Variable

    Authors: Wang Miao, Lan Liu, Eric Tchetgen Tchetgen, Zhi Geng

    Abstract: We consider identification and estimation with an outcome missing not at random (MNAR). We study an identification strategy based on a so-called shadow variable. A shadow variable is assumed to be correlated with the outcome, but independent of the missingness process conditional on the outcome and fully observed covariates. We describe a general condition for nonparametric identification of the f… ▽ More

    Submitted 9 September, 2019; v1 submitted 8 September, 2015; originally announced September 2015.

  31. arXiv:1506.08149  [pdf, other]

    stat.ME

    Identification and Inference for Marginal Average Treatment Effect on the Treated With an Instrumental Variable

    Authors: Lan Liu, Wang Miao, Baoluo Sun, James Robins, Eric Tchetgen Tchetgen

    Abstract: In observational studies, treatments are typically not randomized and therefore estimated treatment effects may be subject to confounding bias. The instrumental variable (IV) design plays the role of a quasi-experimental handle since the IV is associated with the treatment and only affects the outcome through the treatment. In this paper, we present a novel framework for identification and inferen… ▽ More

    Submitted 26 August, 2016; v1 submitted 26 June, 2015; originally announced June 2015.

  32. arXiv:1210.3709  [pdf, other]

    math.OC cs.IT math.NA stat.ML

    A Rank-Corrected Procedure for Matrix Completion with Fixed Basis Coefficients

    Authors: Weimin Miao, Shaohua Pan, Defeng Sun

    Abstract: For the problems of low-rank matrix completion, the efficiency of the widely-used nuclear norm technique may be challenged under many circumstances, especially when certain basis coefficients are fixed, for example, the low-rank correlation matrix completion in various fields such as the financial market and the low-rank density matrix completion from the quantum state tomography. To seek a soluti… ▽ More

    Submitted 22 June, 2015; v1 submitted 13 October, 2012; originally announced October 2012.

    Comments: 51 pages, 4 figures

  33. The Impact of Levene's Test of Equality of Variances on Statistical Theory and Practice

    Authors: Joseph L. Gastwirth, Yulia R. Gel, Weiwen Miao

    Abstract: In many applications, the underlying scientific question concerns whether the variances of $k$ samples are equal. There are a substantial number of tests for this problem. Many of them rely on the assumption of normality and are not robust to its violation. In 1960 Professor Howard Levene proposed a new approach to this problem by applying the $F$-test to the absolute deviations of the observation… ▽ More

    Submitted 2 October, 2010; originally announced October 2010.

    Comments: Published at http://dx.doi.org/10.1214/09-STS301 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-STS-STS301

    Journal ref: Statistical Science 2009, Vol. 24, No. 3, 343-360