-
Causality for Complex Continuous-time Functional Longitudinal Studies with Dynamic Treatment Regimes
Authors:
Andrew Ying
Abstract:
Causal inference in longitudinal studies is often hampered by treatment-confounder feedback. Existing methods typically assume discrete time steps or step-like data changes, which we term ``regular and irregular functional studies,'' limiting their applicability to studies with continuous monitoring data, like intensive care units or continuous glucose monitoring. These studies, which we formally term ``functional longitudinal studies,'' require new approaches. Moreover, existing methods tailored for ``functional longitudinal studies'' can only investigate static treatment regimes, which are independent of historical covariates or treatments, leading to either stringent parametric assumptions or strong positivity assumptions. This restriction has limited the range of causal questions these methods can answer and their practicality. We address these limitations by developing a nonparametric framework for functional longitudinal data, accommodating dynamic treatment regimes that depend on historical covariates or treatments, and may or may not depend on the actual treatment administered. To build intuition and explain our approach, we provide a comprehensive review of existing methods for regular and irregular longitudinal studies. We then formally define the potential outcomes and causal effects of interest, develop identification assumptions, and derive g-computation and inverse probability weighting formulas through novel applications of stochastic process and measure theory. Additionally, we compute the efficient influence curve using semiparametric theory. Our framework generalizes existing literature, and achieves double robustness under specific conditions. Finally, to aid interpretation, we provide sufficient and intuitive conditions for our identification assumptions, enhancing the applicability of our methodology to real-world scenarios.
Submitted 10 June, 2024;
originally announced June 2024.
-
A Geometric Perspective on Double Robustness by Semiparametric Theory and Information Geometry
Authors:
Andrew Ying
Abstract:
Double robustness (DR) is a widely-used property of estimators that provides protection against model misspecification and slow convergence of nuisance functions. While DR is a global property on the probability distribution manifold, it often coincides with influence curves, which only ensure orthogonality to nuisance directions locally. This apparent discrepancy raises fundamental questions about the theoretical underpinnings of DR.
In this short communication, we address two key questions: (1) Why do influence curves frequently imply DR "for free"? (2) Under what conditions do DR estimators exist for a given statistical model and parameterization? Using tools from semiparametric theory, we show that convexity is the crucial property that enables influence curves to imply DR. We then derive necessary and sufficient conditions for the existence of DR estimators under a mean squared differentiable path-connected parameterization.
Our main contribution also lies in the novel geometric interpretation of DR using information geometry. By leveraging concepts such as parallel transport, m-flatness, and m-curvature freeness, we characterize DR in terms of invariance along submanifolds. This geometric perspective deepens the understanding of when and why DR estimators exist.
The results not only resolve apparent mysteries surrounding DR but also have practical implications for the construction and analysis of DR estimators. The geometric insights open up new connections and directions for future research. Our findings aim to solidify the theoretical foundations of a fundamental concept and contribute to the broader understanding of robust estimation in statistics.
Submitted 22 April, 2024;
originally announced April 2024.
-
A resolvent-based prediction framework for incompressible turbulent channel flow with limited measurements
Authors:
Anjia Ying,
Tian Liang,
Zhigang Li,
Lin Fu
Abstract:
A new resolvent-based method is developed to predict the space-time properties of the flow field. To overcome the deterioration of the prediction accuracy with the increasing distance between the measurements and predictions in the Resolvent-Based Estimation (RBE), the newly proposed method utilizes the RBE to estimate the relative energy distribution near the wall rather than the absolute energy directly estimated from the measurements. Using this extra information from RBE, the new method modifies the energy distribution of the spatially uniform and uncorrelated forcing that drives the flow system by minimizing the norm of the cross-spectral density (CSD) tensor of the error matrix in the near-wall region in comparison with the RBE-estimated one, and therefore it is named as the Resolvent-informed White-noise-based Estimation (RWE) method. For validation, three time-resolved direct numerical simulation (DNS) datasets with the friction Reynolds numbers $Re_τ= 180$, 550, and 950 are generated, with various locations of measurements ranging from the near-wall region ($y^+ = 40$) to the upper bound of the logarithmic region ($y/h \approx 0.2$) for the predictions. Besides the RWE, three existing methods, i.e., the RBE, the $λ$-model, and the White-noise-Based Estimation (WBE), are also included for the validation. The performance of the RBE and $λ$-model in predicting the energy spectra shows a strong dependence on the measurement locations. The newly proposed RWE shows a low sensitivity on $Re_τ$ and the measurement locations, which may range from the near-wall region to the upper bound of the logarithmic region, and has a high accuracy in predicting the energy spectra.
Submitted 15 October, 2023;
originally announced October 2023.
-
On Defense of the Hazard Ratio
Authors:
Andrew Ying,
Ronghui Xu
Abstract:
In this short communication, we describe the recent debate on whether the hazard function should be used for causal inference in time-to-event studies and consider three different potential outcomes frameworks (by Rubin, Robins, and Pearl, respectively) as well as use the single-world intervention graph to show mathematically that the hazard function has causal interpretations under all three frameworks. In addition, we argue that the hazard ratio over time can provide a useful interpretation in practical settings.
Submitted 18 October, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Asymptotic Theory for Doubly Robust Estimators with Continuous-Time Nuisance Parameters
Authors:
Andrew Ying
Abstract:
Doubly robust estimators have gained widespread popularity in various fields due to their ability to provide unbiased estimates under model misspecification. However, the asymptotic theory for doubly robust estimators with continuous-time nuisance parameters remains largely unexplored. In this short communication, we address this gap by developing a general asymptotic theory for a class of doubly robust estimating equations involving stochastic processes and Riemann-Stieltjes integrals. We introduce generic assumptions on the nuisance parameter estimators that ensure the consistency and asymptotic normality of the resulting doubly robust estimator. Our results cover both the model doubly robust estimator, which relies on parametric or semiparametric models, and the rate doubly robust estimator, which allows for flexible machine learning methods. We discuss the implications of our findings and highlight the key differences between the continuous-time setting and the classical theory for doubly robust estimators. Our work provides a solid theoretical foundation for the use of doubly robust estimators in complex settings with continuous-time nuisance parameters, paving the way for future research and applications.
Submitted 22 April, 2024; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Proximal Survival Analysis to Handle Dependent Right Censoring
Authors:
Andrew Ying
Abstract:
Many epidemiological and clinical studies aim at analyzing a time-to-event endpoint. A common complication is right censoring. In some cases, it arises because subjects are still surviving after the study terminates or move out of the study area, in which case right censoring is typically treated as independent or non-informative. Such an assumption can be further relaxed to conditional independent censoring by leveraging possibly time-varying covariate information, if available, assuming censoring and failure time are independent among covariate strata. In yet other instances, events may be censored by other competing events like death and are associated with censoring possibly through prognoses. Realistically, measured covariates can rarely capture all such associations with certainty. For such dependent censoring, often covariate measurements are at best proxies of underlying prognoses. In this paper, we establish a nonparametric identification framework by formally admitting that conditional independent censoring may fail in practice and accounting for covariate measurements as imperfect proxies of underlying association. The framework suggests adaptive estimators which we give generic assumptions under which they are consistent, asymptotically normal, and doubly robust. We illustrate our framework with concrete settings, where we examine the finite-sample performance of our proposed estimators via a Monte-Carlo simulation and apply them to the SEER-Medicare dataset.
Submitted 8 January, 2024; v1 submitted 15 August, 2022;
originally announced August 2022.
-
Doubly Robust Estimation under Covariate-Induced Dependent Left Truncation
Authors:
Yuyao Wang,
Andrew Ying,
Ronghui Xu
Abstract:
In prevalent cohort studies with follow-up, the time-to-event outcome is subject to left truncation leading to selection bias. For estimation of the distribution of time-to-event, conventional methods adjusting for left truncation tend to rely on the (quasi-)independence assumption that the truncation time and the event time are "independent" on the observed region. This assumption is violated when there is dependence between the truncation time and the event time possibly induced by measured covariates. Inverse probability of truncation weighting leveraging covariate information can be used in this case, but it is sensitive to misspecification of the truncation model. In this work, we apply the semiparametric theory to find the efficient influence curve of an expected (arbitrarily transformed) survival time in the presence of covariate-induced dependent left truncation. We then use it to construct estimators that are shown to enjoy double-robustness properties. Our work represents the first attempt to construct doubly robust estimators in the presence of left truncation, which does not fall under the established framework of coarsened data where doubly robust approaches are developed. We provide technical conditions for the asymptotic properties that appear to not have been carefully examined in the literature for time-to-event data, and study the estimators via extensive simulation. We apply the estimators to two data sets from practice, with different right-censoring patterns.
Submitted 13 March, 2023; v1 submitted 14 August, 2022;
originally announced August 2022.
-
Causality for Complex Continuous-time Functional Longitudinal Studies
Authors:
Andrew Ying
Abstract:
The paramount obstacle in longitudinal studies for causal inference is the complex "treatment-confounder feedback." Traditional methodologies for elucidating causal effects in longitudinal analyses are primarily based on the assumption that time moves in specific intervals or that changes in treatment occur discretely. This conventional view confines treatment-confounder feedback to a limited, countable scope. The advent of real-time monitoring in modern medical research introduces functional longitudinal data with dynamically time-varying outcomes, treatments, and confounders, necessitating dealing with a potentially uncountably infinite treatment-confounder feedback. Thus, there is an urgent need for a more elaborate and refined theoretical framework to navigate these intricacies. Recently, Ying (2024) proposed a preliminary framework focusing on end-of-study outcomes and addressing the causality in functional longitudinal data. Our paper expands significantly upon that foundation in four ways: First, we conduct a comprehensive review of existing literature, which not only fosters a deeper understanding of the underlying concepts but also illuminates the genesis of both Ying (2024)'s framework and ours. Second, we extend Ying (2024) to fully embrace a functional time-varying outcome process, incorporating right censoring and truncation by death, which are both significant and practical concerns. Third, we formalize previously informal propositions in Ying (2024), demonstrating how this framework broadens the existing frameworks in a nonparametric manner. Lastly, we delve into a detailed discussion on the interpretability and feasibility of our assumptions, and outline a strategy for future numerical studies.
Submitted 16 March, 2024; v1 submitted 24 June, 2022;
originally announced June 2022.
-
A Robust Instrumental Variable Method Accounting for Treatment Switching in Open-Label Randomized Controlled Trials
Authors:
Andrew Ying
Abstract:
In a randomized controlled trial, treatment switching (also called contamination or crossover) occurs when a patient initially assigned to one treatment arm changes to another arm during the course of follow-up. Overlooking treatment switching might substantially bias the evaluation of treatment efficacy or safety. To account for treatment switching, instrumental variable (IV) methods by leveraging the initial randomized assignment as an IV serve as natural adjustment methods because they allow dependent treatment switching possibly due to underlying prognoses. However, the ``exclusion restriction'' assumption for IV methods, which requires the initial randomization to have no direct effect on the outcome, remains questionable, especially for open-label trials. We propose a robust instrumental variable estimator circumventing such a caveat. We derive large-sample properties of our proposed estimator, along with inferential tools. We conduct extensive simulations to examine the finite performance of our estimator and its associated inferential tools. An R package ``ivsacim'' implementing all proposed methods is freely available on R CRAN. We apply the estimator to evaluate the treatment effect of Nucleoside Reverse Transcriptase Inhibitors (NRTIs) on a safety outcome in the Optimized Treatment That Includes or Omits NRTIs trial.
Submitted 24 September, 2022; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Proximal Causal Inference for Marginal Counterfactual Survival Curves
Authors:
Andrew Ying,
Yifan Cui,
Eric J. Tchetgen Tchetgen
Abstract:
Contrasting marginal counterfactual survival curves across treatment arms is an effective and popular approach for inferring the causal effect of an intervention on a right-censored time-to-event outcome. A key challenge to drawing such inferences in observational settings is the possible existence of unmeasured confounding, which may invalidate most commonly used methods that assume no hidden confounding bias. In this paper, rather than making the standard no unmeasured confounding assumption, we extend the recently proposed proximal causal inference framework of Miao et al. (2018), Tchetgen et al. (2020), Cui et al. (2020) to obtain nonparametric identification of a causal survival contrast by leveraging observed covariates as imperfect proxies of unmeasured confounders. Specifically, we develop a proximal inverse probability-weighted (PIPW) estimator, the proximal analog of standard IPW, which allows the observed data distribution for the time-to-event outcome to remain completely unrestricted. PIPW estimation relies on a parametric model for a so-called treatment confounding bridge function relating the treatment process to confounding proxies. As a result, PIPW might be sensitive to model misspecification. To improve robustness and efficiency, we also propose a proximal doubly robust estimator and establish uniform consistency and asymptotic normality of both estimators. We conduct extensive simulations to examine the finite sample performance of our estimators, and proposed methods are applied to a study evaluating the effectiveness of right heart catheterization in the intensive care unit of critically ill patients.
Submitted 27 April, 2022;
originally announced April 2022.
-
Marginal Structural Illness-Death Models for Semi-Competing Risks Data
Authors:
Yiran Zhang,
Andrew Ying,
Steve Edland,
Lon White,
Ronghui Xu
Abstract:
The three state illness death model has been established as a general approach for regression analysis of semi competing risks data. For observational data the marginal structural models (MSM) are a useful tool, under the potential outcomes framework to define and estimate parameters with causal interpretations. In this paper we introduce a class of marginal structural illness death models for the analysis of observational semi competing risks data. We consider two specific such models, the Markov illness death MSM and the frailty based Markov illness death MSM. For interpretation purposes, risk contrasts under the MSMs are defined. Inference under the illness death MSM can be carried out using estimating equations with inverse probability weighting, while inference under the frailty based illness death MSM requires a weighted EM algorithm. We study the inference procedures under both MSMs using extensive simulations, and apply them to the analysis of mid life alcohol exposure on late life cognitive impairment as well as mortality using the Honolulu Asia Aging Study data set. The R codes developed in this work have been implemented in the R package semicmprskcoxmsm that is publicly available on CRAN.
Submitted 18 December, 2023; v1 submitted 21 April, 2022;
originally announced April 2022.
-
Proximal Causal Inference for Complex Longitudinal Studies
Authors:
Andrew Ying,
Wang Miao,
Xu Shi,
Eric J. Tchetgen Tchetgen
Abstract:
A standard assumption for causal inference about the joint effects of time-varying treatment is that one has measured sufficient covariates to ensure that within covariate strata, subjects are exchangeable across observed treatment values, also known as "sequential randomization assumption (SRA)". SRA is often criticized as it requires one to accurately measure all confounders. Realistically, measured covariates can rarely capture all confounders with certainty. Often covariate measurements are at best proxies of confounders, thus invalidating inferences under SRA. In this paper, we extend the proximal causal inference (PCI) framework of Miao et al. (2018) to the longitudinal setting under a semiparametric marginal structural mean model (MSMM). PCI offers an opportunity to learn about joint causal effects in settings where SRA based on measured time-varying covariates fails, by formally accounting for the covariate measurements as imperfect proxies of underlying confounding mechanisms. We establish nonparametric identification with a pair of time-varying proxies and provide a corresponding characterization of regular and asymptotically linear estimators of the parameter indexing the MSMM, including a rich class of doubly robust estimators, and establish the corresponding semiparametric efficiency bound for the MSMM. Extensive simulation studies and a data application illustrate the finite sample behavior of proposed methods.
Submitted 3 August, 2022; v1 submitted 14 September, 2021;
originally announced September 2021.
-
Minimax Kernel Machine Learning for a Class of Doubly Robust Functionals with Application to Proximal Causal Inference
Authors:
AmirEmad Ghassami,
Andrew Ying,
Ilya Shpitser,
Eric Tchetgen Tchetgen
Abstract:
Robins et al. (2008) introduced a class of influence functions (IFs) which could be used to obtain doubly robust moment functions for the corresponding parameters. However, that class does not include the IF of parameters for which the nuisance functions are solutions to integral equations. Such parameters are particularly important in the field of causal inference, specifically in the recently proposed proximal causal inference framework of Tchetgen Tchetgen et al. (2020), which allows for estimating the causal effect in the presence of latent confounders. In this paper, we first extend the class of Robins et al. to include doubly robust IFs in which the nuisance functions are solutions to integral equations. Then we demonstrate that the double robustness property of these IFs can be leveraged to construct estimating equations for the nuisance functions, which enables us to solve the integral equations without resorting to parametric models. We frame the estimation of the nuisance functions as a minimax optimization problem. We provide convergence rates for the nuisance functions and conditions required for asymptotic linearity of the estimator of the parameter of interest. The experiment results demonstrate that our proposed methodology leads to robust and high-performance estimators for average causal effect in the proximal causal inference framework.
Submitted 7 March, 2022; v1 submitted 7 April, 2021;
originally announced April 2021.
-
A New Causal Approach to Account for Treatment Switching in Randomized Experiments under a Structural Cumulative Survival Model
Authors:
Andrew Ying,
Eric J. Tchetgen Tchetgen
Abstract:
Treatment switching in a randomized controlled trial is said to occur when a patient randomized to one treatment arm switches to another treatment arm during follow-up. This can occur at the point of disease progression, whereby patients in the control arm may be offered the experimental treatment. It is widely known that failure to account for treatment switching can seriously dilute the estimated effect of treatment on overall survival. In this paper, we aim to account for the potential impact of treatment switching in a re-analysis evaluating the treatment effect of Nucleoside Reverse Transcriptase Inhibitors (NRTIs) on a safety outcome (time to first severe or worse sign or symptom) in participants receiving a new antiretroviral regimen that either included or omitted NRTIs in the Optimized Treatment That Includes or Omits NRTIs (OPTIONS) trial. We propose an estimator of a treatment causal effect under a structural cumulative survival model (SCSM) that leverages randomization as an instrumental variable to account for selective treatment switching. Unlike Robins' accelerated failure time model often used to address treatment switching, the proposed approach avoids the need for artificial censoring for estimation. We establish that the proposed estimator is uniformly consistent and asymptotically Gaussian under standard regularity conditions. A consistent variance estimator is also given and a simple resampling approach provides uniform confidence bands for the causal difference comparing treatment groups over time on the cumulative intensity scale. We develop an R package named "ivsacim" implementing all proposed methods, freely available to download from R CRAN. We examine the finite performance of the estimator via extensive simulations.
Submitted 22 March, 2021;
originally announced March 2021.
-
Magnetoresistance oscillation study of the half-quantum vortex in doubly connected mesoscopic superconducting cylinders of Sr2RuO4
Authors:
Xinxin Cai,
Brian M. Zakrzewski,
Yiqun A. Ying,
Hae-Young Kee,
Manfred Sigrist,
J. Elliott Ortmann,
Weifeng Sun,
Zhiqiang Mao,
Ying Liu
Abstract:
The observation of the highly unusual half-quantum vortex (HQV) in a single crystalline superconductor excludes unequivocally the spin-singlet symmetry of the superconducting order parameter. HQVs were observed previously in mesoscopic samples of Sr2RuO4 in cantilever torque magnetometry measurements, thus providing direct evidence for spin-triplet pairing in the material. In addition, it raised important questions on HQV, including its stability and dynamics. These issues have remained largely unexplored, in particular, experimentally. We report in this paper the detection of HQVs in mesoscopic, doubly connected cylinders of single-crystalline Sr2RuO4 of a mesoscopic size and the examination of the effect of the in-plane magnetic field needed for the observation of the HQV by magnetoresistance (MR) oscillations measurements. Several distinct features found in our data, especially a dip and secondary peaks in the MR oscillations seen only in the presence of a sufficiently large in-plane magnetic field as well as a large measurement current, are linked to the formation of the HQV fluxoid state in and crossing of an Abrikosov HQV through the sample. The conclusion is drawn from the analysis of our data using a model of thermally activated vortex crossing overcoming a free-energy barrier which is modulated by the applied magnetic flux enclosed in the cylinder as well as the measurement current. Evidence for the trapping of an HQV fluxoid state in the sample was also found. Our observation of the HQV in mesoscopic Sr2RuO4 provided not only additional evidence for spin-triplet superconductivity in Sr2RuO4 but also insights into the physics of HQV, including its spontaneous spin polarization, stability, and dynamics. Our study also revealed a possible effect of the measurement current on the magnitude of the spontaneous spin polarization associated with the HQV.
Submitted 29 October, 2020;
originally announced October 2020.
-
An Introduction to Proximal Causal Learning
Authors:
Eric J Tchetgen Tchetgen,
Andrew Ying,
Yifan Cui,
Xu Shi,
Wang Miao
Abstract:
A standard assumption for causal inference from observational data is that one has measured a sufficiently rich set of covariates to ensure that within covariate strata, subjects are exchangeable across observed treatment values. Skepticism about the exchangeability assumption in observational studies is often warranted because it hinges on investigators' ability to accurately measure covariates capturing all potential sources of confounding. Realistically, confounding mechanisms can rarely, if ever, be learned with certainty from measured covariates. One can therefore only ever hope that covariate measurements are at best proxies of true underlying confounding mechanisms operating in an observational study, thus invalidating causal claims made on the basis of standard exchangeability conditions. Causal learning from proxies is a challenging inverse problem which has to date remained unresolved. In this paper, we introduce a formal potential outcome framework for proximal causal learning, which, while explicitly acknowledging covariate measurements as imperfect proxies of confounding mechanisms, offers an opportunity to learn about causal effects in settings where exchangeability on the basis of measured covariates fails. Sufficient conditions for nonparametric identification are given, leading to the proximal g-formula and corresponding proximal g-computation algorithm for estimation. These may be viewed as generalizations of Robins' foundational g-formula and g-computation algorithm, which account explicitly for bias due to unmeasured confounding. Both point treatment and time-varying treatment settings are considered, and an application of proximal g-computation of causal effects is given for illustration.
Submitted 23 September, 2020;
originally announced September 2020.
-
Causal Effects of Prenatal Drug Exposure on Birth Defects with Missing by Terathanasia
Authors:
Andrew Ying,
Ronghui Xu,
Christina D. Chambers,
Kenneth Lyons Jones
Abstract:
A recent cohort study revealed a positive correlation between major structural birth defects in infants and a certain medication taken by pregnant women. To draw valid causal inference, an outstanding problem to overcome was the missing birth defect outcomes among pregnancy losses resulting from spontaneous abortion. This led to missing not at random since, according to the theory of "terathanasia", a fetus with defects is more likely to be spontaneously aborted. Other complications in the data included left truncation, right censoring, observational nature, and rare events. In addition, the previous analysis stratified on live birth against spontaneous abortion, which was itself a post-exposure variable and hence did not lead to a causal interpretation of the stratified results. In this paper we aim to estimate and provide inference for the causal parameters of scientific interest, including the principal effects, making use of the missing data mechanism informed by "terathanasia". The rare events with missing outcomes led to multiple sensitivity analyses where the causal parameters can be estimated with better confidence in each setting. Our findings should shed light on how studies on causal effects of medication or other exposures during pregnancy may be analyzed using state-of-the-art methodologies.
Submitted 7 June, 2022; v1 submitted 17 April, 2020;
originally announced April 2020.
-
On the Asymptotic Distribution of the Scan Statistic for Empirical Distributions
Authors:
Andrew Ying,
Wen-Xin Zhou
Abstract:
We investigate the asymptotic behavior of several variants of the scan statistic applied to empirical distributions, which can be applied to detect the presence of an anomalous interval of any length. Of particular interest is the Studentized scan statistic, which is preferable in practice. The main ingredients in the proof are Kolmogorov's theorem, a Poisson approximation, and recent technical results by Kabluchko et al. (2014).
Submitted 24 March, 2020; v1 submitted 4 October, 2019;
originally announced October 2019.
-
A Scan Procedure for Multiple Testing
Authors:
Shiyun Chen,
Andrew Ying,
Ery Arias-Castro
Abstract:
In a multiple testing framework, we propose a method that identifies the interval with the highest estimated false discovery rate of P-values and rejects the corresponding null hypotheses. Unlike the Benjamini-Hochberg method, which does the same but over intervals with an endpoint at the origin, the new procedure 'scans' all intervals. In parallel with Storey et al. (2004), we show that this scan procedure provides strong control of asymptotic false discovery rate. In addition, we investigate its asymptotic false non-discovery rate, deriving conditions under which it outperforms the Benjamini-Hochberg procedure. For example, the scan procedure is superior in power-law location models.
Submitted 1 August, 2018;
originally announced August 2018.
-
Two-Stage Residual Inclusion under the Additive Hazards Model - An Instrumental Variable Approach with Application to SEER-Medicare Linked Data
Authors:
Andrew Ying,
Ronghui Xu,
James Murphy
Abstract:
The instrumental variable is an essential tool for addressing unmeasured confounding in observational studies. The two stage predictor substitution (2SPS) estimator and the two stage residual inclusion (2SRI) estimator are two commonly used approaches in applying instrumental variables. Recently 2SPS was studied under the additive hazards model in the presence of competing risks of time-to-event data, where linearity was assumed for the relationship between the treatment and the instrumental variable. This assumption may not be the most appropriate when we have binary treatments. In this paper, we consider the 2SRI estimator under the additive hazards model for general survival data and in the presence of competing risks, which allows generalized linear models for the relation between the treatment and the instrumental variable. We derive the asymptotic properties including a closed-form asymptotic variance estimate for the 2SRI estimator. We carry out numerical studies in finite samples, and apply our methodology to the linked Surveillance, Epidemiology and End Results (SEER) - Medicare database comparing radical prostatectomy versus conservative treatment in early-stage prostate cancer patients.
Submitted 18 July, 2018; v1 submitted 20 May, 2018;
originally announced May 2018.
-
Detection of Sparse Mixtures: Higher Criticism and Scan Statistic
Authors:
Ery Arias-Castro,
Andrew Ying
Abstract:
We consider the problem of detecting a sparse mixture as studied by Ingster (1997) and Donoho and Jin (2004). We consider a wide array of base distributions. In particular, we study the situation when the base distribution has polynomial tails, a situation that has not received much attention in the literature. Perhaps surprisingly, we find that in the context of such a power-law distribution, the higher criticism does not achieve the detection boundary. However, the scan statistic does.
Submitted 23 February, 2018;
originally announced February 2018.
-
Automatically Extracting Web API Specifications from HTML Documentation
Authors:
Jinqiu Yang,
Erik Wittern,
Annie T. T. Ying,
Julian Dolby,
Lin Tan
Abstract:
Web API specifications are machine-readable descriptions of APIs. These specifications, in combination with related tooling, simplify and support the consumption of APIs. However, despite the increased distribution of web APIs, specifications are rare and their creation and maintenance rely heavily on manual efforts by third parties. In this paper, we propose an automatic approach and an associated tool called D2Spec for extracting specifications from web API documentation pages. Given a seed online documentation page on an API, D2Spec first crawls all documentation pages on the API, and then uses a set of machine learning techniques to extract the base URL, path templates, and HTTP methods, which collectively describe the endpoints of an API. We evaluated whether D2Spec can accurately extract endpoints from documentation on 120 web APIs. The results showed that D2Spec achieved a precision of 87.5% in identifying base URLs, a precision of 81.3% and a recall of 80.6% in generating path templates, and a precision of 84.4% and a recall of 76.2% in extracting HTTP methods. In addition, we found that D2Spec was useful when applied to APIs with pre-existing API specifications: D2Spec revealed many inconsistencies between web API documentation and their corresponding publicly available specifications. Thus, D2Spec can be used by web API providers to keep documentation and specifications in synchronization.
Submitted 26 January, 2018;
originally announced January 2018.
-
Opportunities in Software Engineering Research for Web API Consumption
Authors:
Erik Wittern,
Annie Ying,
Yunhui Zheng,
Jim A. Laredo,
Julian Dolby,
Christopher C. Young,
Aleksander A. Slominski
Abstract:
Nowadays, invoking third party code increasingly involves calling web services via their web APIs, as opposed to the more traditional scenario of downloading a library and invoking the library's API. However, there are also new challenges for developers calling these web APIs. In this paper, we highlight a broad set of these challenges and argue for resulting opportunities for software engineering research to support developers in consuming web APIs. We outline two specific research threads in this context: (1) web API specification curation, which enables us to know the signatures of web APIs, and (2) static analysis that is capable of extracting URLs, HTTP methods etc. of web API calls. Furthermore, we present new work on how we combine (1) and (2) to provide IDE support for application developers consuming web APIs. As web APIs are used broadly, research in supporting the consumption of web APIs offers exciting opportunities.
Submitted 18 May, 2017;
originally announced May 2017.
-
Statically Checking Web API Requests in JavaScript
Authors:
Erik Wittern,
Annie T. T. Ying,
Yunhui Zheng,
Julian Dolby,
Jim A. Laredo
Abstract:
Many JavaScript applications perform HTTP requests to web APIs, relying on the request URL, HTTP method, and request data to be constructed correctly by string operations. Traditional compile-time error checks, such as detecting a call to a non-existent method in Java, are not available for checking whether such requests comply with the requirements of a web API. In this paper, we propose an approach to statically check web API requests in JavaScript. Our approach first extracts a request's URL string, HTTP method, and the corresponding request data using an inter-procedural string analysis, and then checks whether the request conforms to given web API specifications. We evaluated our approach by checking whether web API requests in JavaScript files mined from GitHub are consistent or inconsistent with publicly available API specifications. From the 6575 requests in scope, our approach determined whether the request's URL and HTTP method were consistent or inconsistent with web API specifications with a precision of 96.0%. Our approach also correctly determined whether extracted request data was consistent or inconsistent with the data requirements with a precision of 87.9% for payload data and 99.9% for query data. In a systematic analysis of the inconsistent cases, we found that many of them were due to errors in the client code. The proposed checker can be integrated with code editors or with continuous integration tools to warn programmers about code containing potentially erroneous requests.
Submitted 15 February, 2017; v1 submitted 13 February, 2017;
originally announced February 2017.
-
Magnetoresistance oscillations and the half-flux-quantum state in spin-triplet superconductor Sr2RuO4
Authors:
X. Cai,
Y. A. Ying,
J. E. Ortmann,
W. -F. Sun,
Z. -Q. Mao,
Y. Liu
Abstract:
We report results of our low-temperature magneto electric transport measurements on micron-sized short cylinders of odd-parity, spin-triplet superconductor Sr$_2$RuO$_4$ with the cylinder axis along the $c$ axis. The in-plane magnetic field and measurement current dependent magnetoresistance oscillations were found to feature an amplitude much larger than that expected from the conventional Little-Parks effect, suggesting that the magnetoresistance oscillations originate from vortex crossing. The free-energy barrier that controls the vortex crossing was modulated by the magnetic flux enclosed in the cylinder, the in-plane field, measurement current, and structural factors. Distinct features on magnetoresistance peaks were found, which we argue to be related to the emergence of half-flux quantum states, but only in samples for which the vortex crossing is confined at specific parts of the sample.
Submitted 11 July, 2016; v1 submitted 1 July, 2015;
originally announced July 2015.
-
Dislocations and the enhancement of superconductivity in odd-parity superconductor Sr$_2$RuO$_4$
Authors:
Y. A. Ying,
N. E. Staley,
Y. Xin,
K. Sun,
X. Cai,
D. Fobes,
T. Liu,
Z. Q. Mao,
Y. Liu
Abstract:
We report observation of the enhancement of superconductivity near lattice dislocations and the absence of the strengthening of vortex pinning in odd-parity superconductor Sr$_2$RuO$_4$, both surprising results in direct contrast to the well known sensitivity of superconductivity in Sr$_2$RuO$_4$ to disorder. The enhanced superconductivity appears to be related fundamentally to the two-component nature of the superconducting order parameter, as revealed in our phenomenological theory taking into account the effect of symmetry reduction near a dislocation.
Submitted 14 May, 2012;
originally announced May 2012.
-
Unconventional quantum oscillations in mesoscopic rings of spin-triplet superconductor Sr2RuO4
Authors:
X. Cai,
Y. A. Ying,
N. E. Staley,
Y. Xin,
D. Fobes,
T. J. Liu,
Z. Q. Mao,
Y. Liu
Abstract:
Odd-parity, spin-triplet superconductor Sr2RuO4 has been found to feature exotic vortex physics including half-flux quanta trapped in a doubly connected sample and the formation of vortex lattices at low fields. The consequences of these vortex states on the low-temperature magnetoresistive behavior of mesoscopic samples of Sr2RuO4 were investigated in this work using ring devices fabricated on mechanically exfoliated single crystals of Sr2RuO4 by photolithography and focused ion beam. With the magnetic field applied perpendicular to the in-plane direction, thin-wall rings of Sr2RuO4 were found to exhibit pronounced quantum oscillations with a conventional period of the full-flux quantum even though the unexpectedly large amplitude and the number of oscillations suggest the observation of vortex-flow-dominated magnetoresistance oscillations rather than a conventional Little-Parks effect. For rings with a thick wall, two distinct periods of quantum oscillations were found in high and low field regimes, respectively, which we argue to be associated with the "lock-in" of a vortex lattice in these thick-wall rings. No evidence for half-flux-quantum resistance oscillations was identified in any sample measured so far without the presence of an in-plane field.
Submitted 29 October, 2012; v1 submitted 14 February, 2012;
originally announced February 2012.
-
Magnetotransport properties of BaRuO$_3$: Observation of two scattering rates
Authors:
Y. A. Ying,
Y. Liu,
T. He,
R. J. Cava
Abstract:
We report results of low-temperature magnetotransport and Hall measurements on single crystals of four-layered hexagonal (4H) and nine-layered rhombohedral (9R) BaRuO$_3$ that provide insight into the structure-property relationships of BaRuO$_3$ polymorphs. We found that 4H BaRuO$_3$ possesses Fermi-liquid behavior down to the lowest temperature ($T$) of our measurements, 1.8 K. On the other hand, 9R BaRuO$_3$ was found to show a crossover in the temperature dependence of resistivity around 150 K, and the existence of two separate scattering rates at low temperatures. The magnetoresistance in the 9R BaRuO$_3$ was found to be negative while that in the 4H BaRuO$_3$ is positive. We propose that local moments may be present in 9R but not in 4H BaRuO$_3$, which leads to distinctly different behavior in the two forms.
Submitted 20 December, 2011; v1 submitted 17 October, 2011;
originally announced October 2011.