Search | arXiv e-print repository

Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities

Authors: Matthew T. C. Li, Tiangang Cui, Fengyi Li, Youssef Marzouk, Olivier Zahm

Abstract: Identifying low-dimensional structure in high-dimensional probability measures is an essential pre-processing step for efficient sampling. We introduce a method for identifying and approximating a target measure $π$ as a perturbation of a given reference measure $μ$ along a few significant directions of $\mathbb{R}^{d}$. The reference measure can be a Gaussian or a nonlinear transformation of a Gaussian, as commonly arising in generative modeling. Our method extends prior work on minimizing majorizations of the Kullback--Leibler divergence to identify optimal approximations within this class of measures. Our main contribution unveils a connection between the \emph{dimensional} logarithmic Sobolev inequality (LSI) and approximations with this ansatz. Specifically, when the target and reference are both Gaussian, we show that minimizing the dimensional LSI is equivalent to minimizing the KL divergence restricted to this ansatz. For general non-Gaussian measures, the dimensional LSI produces majorants that uniformly improve on previous majorants for gradient-based dimension reduction. We further demonstrate the applicability of this analysis to the squared Hellinger distance, where analogous reasoning shows that the dimensional Poincaré inequality offers improved bounds.

Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.02834 [pdf, ps, other]

Asymptotic inference with flexible covariate adjustment under rerandomization and stratified rerandomization

Authors: Bingkai Wang, Fan Li

Abstract: Rerandomization is an effective treatment allocation procedure to control for baseline covariate imbalance. For estimating the average treatment effect, rerandomization has been previously shown to improve the precision of the unadjusted and the linearly-adjusted estimators over simple randomization without compromising consistency. However, it remains unclear whether such results apply more generally to the class of M-estimators, including the g-computation formula with generalized linear regression and doubly-robust methods, and more broadly, to efficient estimators with data-adaptive machine learners. In this paper, using a super-population framework, we develop the asymptotic theory for a more general class of covariate-adjusted estimators under rerandomization and its stratified extension. We prove that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. We further explain, drawing examples from several common M-estimators, that asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. Finally, we study the asymptotic theory for efficient estimators based on data-adaptive machine learners, and prove their efficiency optimality under rerandomization and stratified rerandomization. Our results are demonstrated via simulations and re-analyses of a cluster-randomized experiment that used stratified rerandomization.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02028 [pdf]

How should parallel cluster randomized trials with a baseline period be analyzed? A survey of estimands and common estimators

Authors: Kenneth Menglin Lee, Fan Li

Abstract: The parallel cluster randomized trial with baseline (PB-CRT) is a common variant of the standard parallel cluster randomized trial (P-CRT) that maintains parallel randomization but additionally allows for both within and between-cluster comparisons. We define two estimands of interest in the context of PB-CRTs, the participant-average treatment effect (pATE) and cluster-average treatment effect (cATE), to address participant and cluster-level hypotheses. Previous work has indicated that under informative cluster sizes, commonly used mixed-effects models may yield inconsistent estimators for the estimands of interest. In this work, we theoretically derive the convergence of the unweighted and inverse cluster-period size weighted (i.) independence estimating equation, (ii.) fixed-effects model, (iii.) exchangeable mixed-effects model, and (iv.) nested-exchangeable mixed-effects model treatment effect estimators in a PB-CRT with continuous outcomes. We report a simulation study to evaluate the bias and inference with these different treatment effect estimators and their corresponding model-based or jackknife variance estimators. We then re-analyze a PB-CRT examining the effects of community youth teams on improving mental health among adolescent girls in rural eastern India. We demonstrate that the unweighted and weighted independence estimating equation and fixed-effects model regularly yield consistent estimators for the pATE and cATE estimands, whereas the mixed-effects models yield inconsistent estimators under informative cluster sizes. However, we demonstrate that unlike the nested-exchangeable mixed-effects model and corresponding analyses in P-CRTs, the exchangeable mixed-effects model is surprisingly robust to bias in many PB-CRT scenarios.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 77 pages, 16 figures

arXiv:2404.18256 [pdf, other]

Semiparametric causal mediation analysis in cluster-randomized experiments

Authors: Chao Cheng, Fan Li

Abstract: In cluster-randomized experiments, there is emerging interest in exploring the causal mechanism in which a cluster-level treatment affects the outcome through an intermediate outcome. Despite an extensive development of causal mediation methods in the past decade, only a few exceptions have been considered in assessing causal mediation in cluster-randomized studies, all of which depend on parametric model-based estimators. In this article, we develop the formal semiparametric efficiency theory to motivate several doubly-robust methods for addressing several mediation effect estimands corresponding to both the cluster-average and the individual-level treatment effects in cluster-randomized experiments--the natural indirect effect, natural direct effect, and spillover mediation effect. We derive the efficient influence function for each mediation effect, and carefully parameterize each efficient influence function to motivate practical strategies for operationalizing each estimator. We consider both parametric working models and data-adaptive machine learners to estimate the nuisance functions, and obtain semiparametric efficient causal mediation estimators in the latter case. Our methods are illustrated via extensive simulations and two completed cluster-randomized experiments.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.14840 [pdf, other]

Analysis of cohort stepped wedge cluster-randomized trials with non-ignorable dropout via joint modeling

Authors: Alessandro Gasparini, Michael J. Crowther, Emiel O. Hoogendijk, Fan Li, Michael O. Harhay

Abstract: Stepped wedge cluster-randomized trial (CRT) designs randomize clusters of individuals to intervention sequences, ensuring that every cluster eventually transitions from a control period to receive the intervention under study by the end of the study period. The analysis of stepped wedge CRTs is usually more complex than parallel-arm CRTs due to potential secular trends that result in changing intra-cluster and period-cluster correlations over time. A further challenge in the analysis of closed-cohort stepped wedge CRTs, which follow groups of individuals enrolled in each period longitudinally, is the occurrence of dropout. This is particularly problematic in studies of individuals at high risk for mortality, which causes non-ignorable missing outcomes. If not appropriately addressed, missing outcomes from death will erode statistical power, at best, and bias treatment effect estimates, at worst. Joint longitudinal-survival models can accommodate informative dropout and missingness patterns in longitudinal studies. Specifically, within this framework one directly models the dropout process via a time-to-event submodel together with the longitudinal outcome of interest. The two submodels are then linked using a variety of possible association structures. This work extends linear mixed-effects models by jointly modeling the dropout process to accommodate informative missing outcome data in closed-cohort stepped wedge CRTs. We focus on constant intervention and general time-on-treatment effect parametrizations for the longitudinal submodel and study the performance of the proposed methodology using Monte Carlo simulation under several data-generating scenarios. We illustrate the joint modeling methodology in practice by reanalyzing the 'Frail Older Adults: Care in Transition' (ACT) trial, a stepped wedge CRT of a multifaceted geriatric care model versus usual care in the Netherlands.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.10629 [pdf, other]

Weighting methods for truncation by death in cluster-randomized trials

Authors: Dane Isenberg, Michael Harhay, Nandita Mitra, Fan Li

Abstract: Patient-centered outcomes, such as quality of life and length of hospital stay, are the focus in a wide array of clinical studies. However, participants in randomized trials for elderly or critically and severely ill patient populations may have truncated or undefined non-mortality outcomes if they do not survive through the measurement time point. To address truncation by death, the survivor average causal effect (SACE) has been proposed as a causally interpretable subgroup treatment effect defined under the principal stratification framework. However, the majority of methods for estimating SACE have been developed in the context of individually-randomized trials. Only limited discussions have been centered around cluster-randomized trials (CRTs), where methods typically involve strong distributional assumptions for outcome modeling. In this paper, we propose two weighting methods to estimate SACE in CRTs that obviate the need for potentially complicated outcome distribution modeling. We establish the requisite assumptions that address latent clustering effects to enable point identification of SACE, and we provide computationally-efficient asymptotic variance estimators for each weighting estimator. In simulations, we evaluate our weighting estimators, demonstrating their finite-sample operating characteristics and robustness to certain departures from the identification assumptions. We illustrate our methods using data from a CRT to assess the impact of a sedation protocol on mechanical ventilation among children with acute respiratory failure.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Code for simulations and R package is available on https://github.com/abcdane1/PtSaceCrts

arXiv:2403.08927 [pdf, other]

Principal stratification with U-statistics under principal ignorability

Authors: Xinyuan Chen, Fan Li

Abstract: Principal stratification is a popular framework for causal inference in the presence of an intermediate outcome. While the principal average treatment effects have traditionally been the default target of inference, they may not be sufficient when the interest lies in the relative favorability of one potential outcome over the other within the principal stratum. We thus introduce the principal generalized causal effect estimands, which extend the principal average causal effects to accommodate nonlinear contrast functions. Under principal ignorability, we expand the theoretical results in Jiang et al. (2022) to a much wider class of causal estimands in the presence of a binary intermediate variable. We develop identification formulas and derive the efficient influence functions of the generalized estimands for principal stratification analyses. These efficient influence functions motivate a set of multiply robust estimators and lay the ground for obtaining efficient debiased machine learning estimators via cross-fitting based on U-statistics. The proposed methods are illustrated through simulations and the analysis of a data example.

Submitted 2 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

arXiv:2402.17096 [pdf, other]

Simple rejection Monte Carlo algorithm and its application to multivariate statistical inference

Authors: Fengyu Li, Huijiao Yu, Jun Yan, Xianyong Meng

Abstract: The Monte Carlo algorithm is increasingly utilized, with its central step involving computer-based random sampling from stochastic models. While both Markov Chain Monte Carlo (MCMC) and Reject Monte Carlo serve as sampling methods, the latter finds fewer applications compared to the former. Hence, this paper initially provides a concise introduction to the theory of the Reject Monte Carlo algorithm and its implementation techniques, aiming to enhance conceptual understanding and program implementation. Subsequently, a simplified rejection Monte Carlo algorithm is formulated. Furthermore, by considering multivariate distribution sampling and multivariate integration as examples, this study explores the specific application of the algorithm in statistical inference.

Submitted 26 February, 2024; originally announced February 2024.
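
For readers unfamiliar with the rejection Monte Carlo idea named in the abstract above, here is a minimal sketch of the textbook accept-reject sampler. It is not the simplified algorithm the paper formulates, which the abstract does not describe. The function name `rejection_sample`, the Beta(2, 2) target, the uniform proposal, and the envelope constant 1.5 are all illustrative assumptions.

```python
import random


def rejection_sample(target_pdf, propose, proposal_pdf, bound, n, rng):
    """Draw n samples from target_pdf by textbook accept-reject.

    Assumes target_pdf(x) <= bound * proposal_pdf(x) for every x.
    All names here are hypothetical, chosen for illustration only.
    """
    samples = []
    while len(samples) < n:
        x = propose(rng)   # candidate drawn from the proposal
        u = rng.random()   # uniform draw on [0, 1)
        # Accept x with probability target_pdf(x) / (bound * proposal_pdf(x)).
        if u * bound * proposal_pdf(x) <= target_pdf(x):
            samples.append(x)
    return samples


def beta22_pdf(x):
    """Density of Beta(2, 2) on [0, 1]; its maximum is 1.5 at x = 0.5."""
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


rng = random.Random(0)
draws = rejection_sample(
    target_pdf=beta22_pdf,
    propose=lambda r: r.random(),   # Uniform(0, 1) proposal
    proposal_pdf=lambda x: 1.0,     # its density on [0, 1]
    bound=1.5,                      # envelope constant: max of beta22_pdf
    n=5000,
    rng=rng,
)
sample_mean = sum(draws) / len(draws)
```

With this envelope the expected acceptance rate is 1 / 1.5, about two thirds. A tighter bound wastes fewer proposals, which is the usual design consideration when choosing the proposal density.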

arXiv:2402.15053 [pdf, ps, other]

Nonlinear Bayesian optimal experimental design using logarithmic Sobolev inequalities

Authors: Fengyi Li, Ayoub Belhadji, Youssef Marzouk

Abstract: We study the problem of selecting $k$ experiments from a larger candidate pool, where the goal is to maximize mutual information (MI) between the selected subset and the underlying parameters. Finding the exact solution to this combinatorial optimization problem is computationally costly, not only due to the complexity of the combinatorial search but also the difficulty of evaluating MI in nonlinear/non-Gaussian settings. We propose greedy approaches based on new computationally inexpensive lower bounds for MI, constructed via log-Sobolev inequalities. We demonstrate that our method outperforms random selection strategies, Gaussian approximations, and nested Monte Carlo (NMC) estimators of MI in various settings, including optimal design for nonlinear models with non-additive noise.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.14840 [pdf, other]

RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning

Authors: Congyun Jin, Ming Zhang, Xiaowei Ma, Li Yujiao, Yingbo Wang, Yabo Jia, Yuliang Du, Tao Sun, Haowen Wang, Cong Fan, Jinjie Gu, Chenfei Chi, Xiangguo Lv, Fangzhou Li, Wei Xue, Yiran Huang

Abstract: Recent advancements in Large Language Models (LLMs) and Large Multi-modal Models (LMMs) have shown potential in various medical applications, such as Intelligent Medical Diagnosis. Although impressive results have been achieved, we find that existing benchmarks do not reflect the complexity of real medical reports and specialized in-depth reasoning capabilities. In this work, we introduce RJUA-MedDQA, a comprehensive benchmark in the field of medical specialization, which poses several challenges: comprehensively interpreting image content across diverse challenging layouts, possessing numerical reasoning ability to identify abnormal indicators and demonstrating clinical reasoning ability to provide statements of disease diagnosis, status and advice based on medical contexts. We carefully design the data generation pipeline and propose the Efficient Structural Restoration Annotation (ESRA) Method, aimed at restoring textual and tabular content in medical report images. This method substantially enhances annotation efficiency, doubling the productivity of each annotator, and yields a 26.8% improvement in accuracy. We conduct extensive evaluations, including few-shot assessments of 5 LMMs which are capable of solving Chinese medical QA tasks. To further investigate the limitations and potential of current LMMs, we conduct comparative experiments on a set of strong LLMs by using image-text generated by the ESRA method. We report the performance of baselines and offer several observations: (1) The overall performance of existing LMMs is still limited; however, LMMs are more robust to low-quality and diverse-structured images compared to LLMs. (3) Reasoning across context and image content presents significant challenges. We hope this benchmark helps the community make progress on these challenging tasks in multi-modal medical document understanding and facilitates its application in healthcare.

Submitted 19 February, 2024; originally announced February 2024.

Comments: 15 pages, 13 figures

arXiv:2402.02306 [pdf, other]

A flexible Bayesian g-formula for causal survival analyses with time-dependent confounding

Authors: Xinyuan Chen, Liangyuan Hu, Fan Li

Abstract: In longitudinal observational studies with a time-to-event outcome, a common objective in causal analysis is to estimate the causal survival curve under hypothetical intervention scenarios within the study cohort. The g-formula is a particularly useful tool for this analysis. To enhance the traditional parametric g-formula approach, we developed a more adaptable Bayesian g-formula estimator, which incorporates the Bayesian additive regression trees (BART) in the modeling of the time-evolving generative components, aiming to mitigate bias due to model misspecification. Specifically, we introduce a more general class of g-formulas for discrete survival data that can incorporate the longitudinal balancing scores, which serve as an effective method for dimension reduction and are vital when dealing with an expanding array of time-varying confounders. The minimum sufficient formulation of these longitudinal balancing scores is linked to the nature of treatment regimes, whether static or dynamic. For each type of treatment regime, we provide posterior sampling algorithms grounded in the BART framework. We have conducted simulation studies to illustrate the empirical performance of the proposed method and further demonstrate its practical utility using data from the Yale New Haven Health System's (YNHHS) electronic health records.

Submitted 28 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

arXiv:2401.15680 [pdf, other]

How to achieve model-robust inference in stepped wedge trials with model-based methods?

Authors: Bingkai Wang, Xueqi Wang, Fan Li

Abstract: A stepped wedge design is a unidirectional crossover design where clusters are randomized to distinct treatment sequences. While model-based analysis of stepped wedge designs -- via linear mixed models or generalized estimating equations -- is standard practice to evaluate treatment effects accounting for clustering and adjusting for baseline covariates, their properties under misspecification have not been systematically explored. In this article, we study when a potentially misspecified multilevel model can offer consistent estimation for treatment effect estimands that are functions of calendar time and/or exposure time. We define nonparametric treatment effect estimands using potential outcomes, and adapt model-based methods via g-computation to achieve estimand-aligned inference. We prove a central result that, as long as the working model includes a correctly specified treatment effect structure, the g-computation is guaranteed to be consistent even if all remaining model components are arbitrarily misspecified. Furthermore, valid inference is obtained via the sandwich variance estimator. The theoretical results are illustrated via several simulation experiments and re-analysis of a completed stepped wedge trial.

Submitted 27 March, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

arXiv:2401.11278 [pdf, other]

Handling incomplete outcomes and covariates in cluster-randomized trials: doubly-robust estimation, efficiency considerations, and sensitivity analysis

Authors: Bingkai Wang, Fan Li, Rui Wang

Abstract: In cluster-randomized trials (CRTs), missing data can occur in various ways, including missing values in outcomes and baseline covariates at the individual or cluster level, or completely missing information for non-participants. Among the various types of missing data in CRTs, missing outcomes have attracted the most attention. However, no existing methods can simultaneously address all aforementioned types of missing data in CRTs. To fill in this gap, we propose a new doubly-robust estimator for the average treatment effect on a variety of scales. The proposed estimator simultaneously handles missing outcomes under missingness at random, missing covariates without constraining the missingness mechanism, and missing cluster-population sizes via a uniform sampling mechanism. Furthermore, we detail key considerations to improve precision by specifying the optimal weights, leveraging machine learning, and modeling the treatment assignment mechanism. Finally, to evaluate the impact of violating missing data assumptions, we contribute a new sensitivity analysis framework tailored to CRTs. Simulation studies and a real data application both demonstrate that our proposed methods are effective in handling missing data in CRTs and superior to the existing methods.

Submitted 24 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

arXiv:2401.04372 [pdf, ps, other]

Stable generative modeling using diffusion maps

Authors: Georg Gottwald, Fengyi Li, Youssef Marzouk, Sebastian Reich

Abstract: We consider the problem of sampling from an unknown distribution for which only a sufficiently large number of training samples are available. Such settings have recently drawn considerable interest in the context of generative modelling. In this paper, we propose a generative model combining diffusion maps and Langevin dynamics. Diffusion maps are used to approximate the drift term from the available training samples, which is then implemented in a discrete-time Langevin sampler to generate new samples. By setting the kernel bandwidth to match the time step size used in the unadjusted Langevin algorithm, our method effectively circumvents any stability issues typically associated with time-stepping stiff stochastic differential equations. More precisely, we introduce a novel split-step scheme, ensuring that the generated samples remain within the convex hull of the training samples. Our framework can be naturally extended to generate conditional samples. We demonstrate the performance of our proposed scheme through experiments on synthetic datasets with increasing dimensions and on a stochastic subgrid-scale parametrization conditional sampling problem.

Submitted 9 January, 2024; originally announced January 2024.

Comments: 23 pages, 25 figures

arXiv:2401.01977 [pdf, other]

Conformal causal inference for cluster randomized trials: model-robust inference without asymptotic approximations

Authors: Bingkai Wang, Fan Li, Mengxin Yu

Abstract: In the analysis of cluster randomized trials, two typical features are that individuals within a cluster are correlated and that the total number of clusters can sometimes be limited. While model-robust treatment effect estimators have been recently developed, their asymptotic theory requires the number of clusters to approach infinity, and one often has to empirically assess the applicability of those methods in finite samples. To address this challenge, we propose a conformal causal inference framework that achieves the target coverage probability of treatment effects in finite samples without the need for asymptotic approximations. Meanwhile, we prove that this framework is compatible with arbitrary working models, including machine learning algorithms leveraging baseline covariates, possesses robustness against arbitrary misspecification of working models, and accommodates a variety of within-cluster correlations. Under this framework, we offer efficient algorithms to make inferences on treatment effects at both the cluster and individual levels, applicable to user-specified covariate subgroups and two types of test data. Finally, we demonstrate our methods via simulations and a real data application based on a cluster randomized trial for treating chronic pain.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2401.00987 [pdf, ps, other]

Inverting estimating equations for causal inference on quantiles

Authors: Chao Cheng, Fan Li

Abstract: The causal inference literature frequently focuses on estimating the mean of the potential outcome, whereas the quantiles of the potential outcome may carry important additional information. We propose a universal approach, based on the inverse estimating equations, to generalize a wide class of causal inference solutions from estimating the mean of the potential outcome to its quantiles. We assume that an identifying moment function is available to identify the mean of the threshold-transformed potential outcome, based on which a convenient construction of the estimating equation of quantiles of the potential outcome is proposed. In addition, we give a general construction of the efficient influence functions of the mean and quantiles of potential outcomes, and identify their connection. We motivate estimators for the quantile estimands with the efficient influence function, and develop their asymptotic properties when either parametric models or data-adaptive machine learners are used to estimate the nuisance functions. A broad implication of our results is that one can rework the existing results for mean causal estimands to facilitate causal inference on quantiles, rather than starting from scratch. Our results are illustrated by several examples.

Submitted 1 January, 2024; originally announced January 2024.

arXiv:2312.13097 [pdf, other]

Power calculation for cross-sectional stepped wedge cluster randomized trials with a time-to-event endpoint

Authors: Mary M. Ryan, Denise Esserman, Monica Taljaard, Fan Li

Abstract: A popular design choice in public health and implementation science research, stepped wedge cluster randomized trials (SW-CRTs) are a form of randomized trial whereby clusters are progressively transitioned from control to intervention, and the timing of transition is randomized for each cluster. An important task at the design stage is to ensure that the planned trial has sufficient power to observe a clinically meaningful effect size. While methods for determining study power have been well-developed for SW-CRTs with continuous and binary outcomes, limited methods for power calculation are available for SW-CRTs with censored time-to-event outcomes. In this article, we propose a stratified marginal Cox model to account for secular trend in cross-sectional SW-CRTs, and derive an explicit expression of the robust sandwich variance to facilitate power calculations without the need for computationally intensive simulations. Power formulas based on both the Wald and robust score tests are developed and compared via simulation, generally demonstrating superiority of robust score procedures in different finite-sample scenarios. Finally, we illustrate our methods using a SW-CRT testing the effect of a new electronic reminder system on time to catheter removal in hospital settings. We also offer an R Shiny application to facilitate sample size and power calculations using our proposed methods.

Submitted 20 December, 2023; originally announced December 2023.

Comments: Manuscript under review; 45 pages total (main text 22 pages, supporting information 23 pages); 18 figures total (main text 4 figures, supporting information 14 figures); 2 tables total (main text 2 tables, supporting information 0 tables); 5 appendices

arXiv:2311.12379 [pdf, other]

Infinite forecast combinations based on Dirichlet process

Authors: Yinuo Ren, Feng Li, Yanfei Kang, Jue Wang

Abstract: Forecast combination integrates information from various sources by consolidating multiple forecast results from the target time series. Instead of the need to select a single optimal forecasting model, this paper introduces a deep learning ensemble forecasting model based on the Dirichlet process. Initially, the learning rate is sampled with three basis distributions as hyperparameters to convert the infinite mixture into a finite one. All checkpoints are collected to establish a deep learning sub-model pool, and weight adjustment and diversity strategies are developed during the combination process. The main advantage of this method is its ability to generate the required base learners through a single training process, utilizing the decaying strategy to tackle the challenge posed by the stochastic nature of gradient descent in determining the optimal learning rate. To ensure the method's generalizability and competitiveness, this paper conducts an empirical analysis using the weekly dataset from the M4 competition and explores sensitivity to the number of models to be combined. The results demonstrate that the ensemble model proposed offers substantial improvements in prediction accuracy and stability compared to a single benchmark model.

Submitted 24 November, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

arXiv:2311.10877 [pdf, other]

Covariate adjustment in randomized experiments with missing outcomes and covariates

Authors: Anqi Zhao, Peng Ding, Fan Li

Abstract: Covariate adjustment can improve precision in analyzing randomized experiments. With fully observed data, regression adjustment and propensity score weighting are asymptotically equivalent in improving efficiency over unadjusted analysis. When some outcomes are missing, we consider combining these two adjustment methods with inverse probability of observation weighting for handling missing outcomes, and show that the equivalence between the two methods breaks down. Regression adjustment no longer ensures efficiency gain over unadjusted analysis unless the true outcome model is linear in covariates or the outcomes are missing completely at random. Propensity score weighting, in contrast, still guarantees efficiency over unadjusted analysis, and including more covariates in adjustment never harms asymptotic efficiency. Moreover, we establish the value of using partially observed covariates to secure additional efficiency by the missingness indicator method, which imputes all missing covariates by zero and uses the union of the completed covariates and corresponding missingness indicators as the new, fully observed covariates. Based on these findings, we recommend using regression adjustment in combination with the missingness indicator method if the linear outcome model or missing complete at random assumption is plausible and using propensity score weighting with the missingness indicator method otherwise.

Submitted 4 March, 2024; v1 submitted 17 November, 2023; originally announced November 2023.
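The missingness indicator method named in the abstract above (impute missing covariates by zero, then append the corresponding missingness indicators) can be illustrated with a minimal sketch. The function name `missingness_indicator_design`, the use of NaN to encode missing values, and the choice to keep indicators only for columns that actually contain missing entries are assumptions made for this illustration; they are not taken from the paper.

```python
import numpy as np

def missingness_indicator_design(X):
    """Build a fully observed covariate matrix from one with NaNs.

    Hypothetical helper: missing entries (encoded as NaN, an assumption)
    are imputed by zero, and one 0/1 indicator column is appended for
    every covariate that has at least one missing entry.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)                    # True where a covariate is missing
    X_filled = np.where(missing, 0.0, X)     # zero imputation
    has_missing = missing.any(axis=0)        # skip all-zero indicator columns
    indicators = missing[:, has_missing].astype(float)
    return np.hstack([X_filled, indicators])

# Toy data: 3 units, 2 covariates, each covariate missing once.
X = np.array([[1.0, np.nan],
              [2.0, 3.0],
              [np.nan, 4.0]])
Z = missingness_indicator_design(X)
```

The resulting matrix `Z` has no missing entries, so it can be passed to any regression adjustment or propensity score routine that expects complete covariates.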

arXiv:2310.11603 [pdf, other]

doi 10.1002/sim.9962

Group sequential two-stage preference designs

Authors: Ruyi Liu, Fan Li, Denise Esserman, Mary M. Ryan

Abstract: The two-stage preference design (TSPD) enables the inference for treatment efficacy while allowing for incorporation of patient preference to treatment. It can provide unbiased estimates for selection and preference effects, where a selection effect occurs when patients who prefer one treatment respond differently than those who prefer another, and a preference effect is the difference in response caused by an interaction between the patient's preference and the actual treatment they receive. One potential barrier to adopting TSPD in practice, however, is the relatively large sample size required to estimate selection and preference effects with sufficient power. To address this concern, we propose a group sequential two-stage preference design (GS-TSPD), which combines TSPD with sequential monitoring for early stopping. In the GS-TSPD, pre-planned sequential monitoring allows investigators to conduct repeated hypothesis tests on accumulated data prior to full enrollment to assess study eligibility for early trial termination without inflating type I error rates. Thus, the procedure allows investigators to terminate the study when there is sufficient evidence of treatment, selection, or preference effects during an interim analysis, thereby reducing the design resource in expectation. To formalize such a procedure, we verify the independent increments assumption for testing the selection and preference effects and apply group sequential stopping boundaries from the approximate sequential density functions. Simulations are then conducted to investigate the operating characteristics of our proposed GS-TSPD compared to the traditional TSPD. We demonstrate the applicability of the design using a study of Hepatitis C treatment modality.

Submitted 17 October, 2023; originally announced October 2023.

Comments: 27 pages, 7 tables, 5 figures, 4 appendices; under review at Statistics in Medicine

Journal ref: Statistics in Medicine. (2023) 1-27

arXiv:2309.15316 [pdf, other]

Leveraging Neural Networks to Profile Health Care Providers with Application to Medicare Claims

Authors: Wenbo Wu, Fan Li, Richard Liu, Yiting Li, Mara McAdams-DeMarco, Krzysztof J. Geras, Douglas E. Schaubel, Iván Díaz

Abstract: Encompassing numerous nationwide, statewide, and institutional initiatives in the United States, provider profiling has evolved into a major health care undertaking with ubiquitous applications, profound implications, and high-stakes consequences. In line with such a significant profile, the literature has accumulated a number of developments dedicated to enhancing the statistical paradigm of provider profiling. Tackling wide-ranging profiling issues, these methods typically adjust for risk factors using linear predictors. While this approach is simple, it can be too restrictive to characterize complex and dynamic factor-outcome associations in certain contexts. One such example arises from evaluating dialysis facilities treating Medicare beneficiaries with end-stage renal disease. It is of primary interest to consider how the coronavirus disease (COVID-19) affected 30-day unplanned readmissions in 2020. The impact of COVID-19 on the risk of readmission varied dramatically across pandemic phases. To efficiently capture the variation while profiling facilities, we develop a generalized partially linear model (GPLM) that incorporates a neural network. Considering provider-level clustering, we implement the GPLM as a stratified sampling-based stochastic optimization algorithm that features accelerated convergence. Furthermore, an exact test is designed to identify under- and over-performing facilities, with an accompanying funnel plot to visualize profiles. The advantages of the proposed methods are demonstrated through simulation experiments and profiling dialysis facilities using 2020 Medicare claims from the United States Renal Data System.

Submitted 20 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: 8 figures, 6 tables

arXiv:2309.13677 [pdf, other]

Bayesian pathway analysis over brain network mediators for survival data

Authors: Xinyuan Tian, Fan Li, Li Shen, Denise Esserman, Yize Zhao

Abstract: Technological advancements in noninvasive imaging facilitate the construction of whole brain interconnected networks, known as brain connectivity. Existing approaches to analyze brain connectivity frequently disaggregate the entire network into a vector of unique edges or summary measures, leading to a substantial loss of information. Motivated by the need to explore the effect mechanism among genetic exposure, brain connectivity and time to disease onset, we propose an integrative Bayesian framework to model the effect pathway between each of these components while quantifying the mediating role of brain networks. To accommodate the biological architectures of brain connectivity constructed along white matter fiber tracts, we develop a structural modeling framework that includes a symmetric matrix-variate accelerated failure time model and a symmetric matrix response regression to characterize the effect paths. We further impose within-graph sparsity and between-graph shrinkage to identify informative network configurations and eliminate the interference of noisy components. Extensive simulations confirm the superiority of our method compared with existing alternatives. By applying the proposed method to the landmark Alzheimer's Disease Neuroimaging Initiative study, we obtain neurobiologically plausible insights that may inform future intervention strategies.

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.07365 [pdf, other]

Addressing selection bias in cluster randomized experiments via weighting

Authors: Georgia Papadogeorgou, Bo Liu, Fan Li, Fan Li

Abstract: In cluster randomized experiments, units are often recruited after the random cluster assignment, and data are only available for the recruited sample. Post-randomization recruitment can lead to selection bias, inducing systematic differences between the overall and the recruited populations, and between the recruited intervention and control arms. In this setting, we define causal estimands for the overall and the recruited populations. We first show that if units select their cluster independently of the treatment assignment, cluster randomization implies individual randomization in the overall population. We then prove that under the assumption of ignorable recruitment, the average treatment effect on the recruited population can be consistently estimated from the recruited sample using inverse probability weighting. Generally we cannot identify the average treatment effect on the overall population. Nonetheless, we show, via a principal stratification formulation, that one can use weighting of the recruited sample to identify treatment effects on two meaningful subpopulations of the overall population: units who would be recruited into the study regardless of the assignment, and units who would be recruited in the study under treatment but not under control. We develop a corresponding estimation strategy and a sensitivity analysis method for checking the ignorable recruitment assumption.

Submitted 13 September, 2023; originally announced September 2023.

arXiv:2308.07248 [pdf]

Maintaining the validity of inference from linear mixed models in stepped-wedge cluster randomized trials under misspecified random-effects structures

Authors: Yongdong Ouyang, Monica Taljaard, Andrew B Forbes, Fan Li

Abstract: Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials (SW-CRTs). A key consideration for analyzing a SW-CRT is accounting for the potentially complex correlation structure, which can be achieved by specifying a random effects structure. Common random effects structures for a SW-CRT include random intercept, random cluster-by-period, and discrete-time decay. Recently, more complex structures, such as the random intervention structure, have been proposed. In practice, specifying appropriate random effects can be challenging. Robust variance estimators (RVE) may be applied to linear mixed models to provide consistent estimators of standard errors of fixed effect parameters in the presence of random-effects misspecification. However, there has been no empirical investigation of RVE for SW-CRT. In this paper, we first review five RVEs (both standard and small-sample bias-corrected RVEs) that are available for linear mixed models. We then describe a comprehensive simulation study to examine the performance of these RVEs for SW-CRTs with a continuous outcome under different data generators. For each data generator, we investigate whether the use of a RVE with either the random intercept model or the random cluster-by-period model is sufficient to provide valid statistical inference for fixed effect parameters, when these working models are subject to misspecification. Our results indicate that the random intercept and random cluster-by-period models with RVEs performed similarly. The CR3 RVE estimator, coupled with the number of clusters minus two degrees of freedom correction, consistently gave the best coverage results, but could be slightly anti-conservative when the number of clusters was below 16. We summarize the implications of our results for linear mixed model analysis of SW-CRTs in practice.

Submitted 14 February, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

arXiv:2308.05451 [pdf, ps, other]

A Forecaster's Review of Judea Pearl's Causality: Models, Reasoning and Inference, Second Edition, 2009

Authors: Feng Li

Abstract: With the big popularity and success of Judea Pearl's original causality book, this review covers the main topics updated in the second edition in 2009 and illustrates an easy-to-follow causal inference strategy in a forecast scenario. It further discusses some potential benefits and challenges for causal inference with time series forecasting when modeling the counterfactuals, estimating the uncertainty and incorporating prior knowledge to estimate causal effects in different forecasting scenarios.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2306.11267 [pdf, other]

Model-assisted analysis of covariance estimators for stepped wedge cluster randomized experiments

Authors: Xinyuan Chen, Fan Li

Abstract: Stepped wedge cluster randomized experiments (SW-CREs) represent a class of unidirectional crossover designs. Although SW-CREs have become popular, definitions of estimands and robust methods to target estimands under the potential outcomes framework remain insufficient. To address this gap, we describe a class of estimands that explicitly acknowledge the multilevel data structure in SW-CREs and highlight three typical members of the estimand class that are interpretable. We then introduce four analysis of covariance (ANCOVA) working models to achieve estimand-aligned analyses with covariate adjustment. Each ANCOVA estimator is model-assisted, as its point estimator is consistent even when the working model is misspecified. Under the stepped wedge randomization scheme, we establish the finite population Central Limit Theorem for each estimator. We study the finite-sample operating characteristics of the ANCOVA estimators in simulations and illustrate their application by analyzing the Washington State Expedited Partner Therapy study.

Submitted 25 June, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

arXiv:2306.03266 [pdf, other]

Extending the Design Space of Graph Neural Networks by Rethinking Folklore Weisfeiler-Lehman

Authors: Jiarui Feng, Lecheng Kong, Hao Liu, Dacheng Tao, Fuhai Li, Muhan Zhang, Yixin Chen

Abstract: Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years. However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Some works are inspired by $k$-WL/FWL (Folklore WL) and design the corresponding neural versions. Despite the high expressive power, there are serious limitations in this line of research. In particular, (1) $k$-WL/FWL requires at least $O(n^k)$ space complexity, which is impractical for large graphs even when $k=3$; (2) The design space of $k$-WL/FWL is rigid, with the only adjustable hyper-parameter being $k$. To tackle the first limitation, we propose an extension, $(k,t)$-FWL. We theoretically prove that even if we fix the space complexity to $O(n^k)$ (for any $k\geq 2$) in $(k,t)$-FWL, we can construct an expressiveness hierarchy up to solving the graph isomorphism problem. To tackle the second problem, we propose $k$-FWL+, which considers any equivariant set as neighbors instead of all nodes, thereby greatly expanding the design space of $k$-FWL. Combining these two modifications results in a flexible and powerful framework $(k,t)$-FWL+. We demonstrate $(k,t)$-FWL+ can implement most existing models with matching expressiveness. We then introduce an instance of $(k,t)$-FWL+ called Neighborhood$^2$-FWL (N$^2$-FWL), which is practically and theoretically sound. We prove that N$^2$-FWL is no less powerful than 3-WL, and can encode many substructures while only requiring $O(n^2)$ space. Finally, we design its neural version named N$^2$-GNN and evaluate its performance on various tasks. N$^2$-GNN achieves record-breaking results on ZINC-Subset (0.059), outperforming previous SOTA results by 10.6%. Moreover, N$^2$-GNN achieves new SOTA results on the BREC dataset (71.8%) among all existing high-expressive GNN methods.

Submitted 14 January, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: Accepted to NeurIPS 2023
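As background for the 1-WL limitation mentioned in the abstract above, the following is a minimal sketch of the standard 1-WL color refinement test, not of the paper's $(k,t)$-FWL+ or N$^2$-FWL. The function name `wl_indistinguishable` and the adjacency-dict graph encoding are assumptions chosen for this illustration. Both graphs are refined jointly so that colors are comparable across them.

```python
from collections import Counter

def wl_indistinguishable(adj_a, adj_b):
    """Return True if 1-WL color refinement cannot tell the two graphs apart.

    Graphs are given as dicts mapping each node to a list of neighbors
    (an encoding assumed for this sketch).
    """
    # Disjoint union with tagged node names, so both graphs share one palette.
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in adj_a.items()}
    union.update({("b", v): [("b", u) for u in nbrs] for v, nbrs in adj_b.items()})

    colors = {v: 0 for v in union}  # uniform initial coloring
    for _ in range(len(union)):     # refinement stabilizes within n rounds
        # Signature: own color plus the sorted multiset of neighbor colors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in union[v])))
                for v in union}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new_colors = {v: palette[sigs[v]] for v in union}
        stable = len(set(new_colors.values())) == len(set(colors.values()))
        colors = new_colors
        if stable:
            break

    hist_a = Counter(c for (g, _), c in colors.items() if g == "a")
    hist_b = Counter(c for (g, _), c in colors.items() if g == "b")
    return hist_a == hist_b

# A 6-cycle and two disjoint triangles: non-isomorphic, both 2-regular.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
# A 3-node path and a triangle: distinguishable by degree alone.
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

same_regular = wl_indistinguishable(cycle6, two_triangles)
same_small = wl_indistinguishable(path3, triangle)
```

The 6-cycle and the pair of triangles are a classic example of non-isomorphic graphs that 1-WL cannot separate, which is the kind of limitation that motivates higher-order variants.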

arXiv:2305.18412 [pdf, other]

Short-term Temporal Dependency Detection under Heterogeneous Event Dynamic with Hawkes Processes

Authors: Yu Chen, Fengpei Li, Anderson Schneider, Yuriy Nevmyvaka, Asohan Amarasingham, Henry Lam

Abstract: Many event sequence data exhibit mutually exciting or inhibiting patterns. Reliable detection of such temporal dependency is crucial for scientific investigation. The de facto model is the Multivariate Hawkes Process (MHP), whose impact function naturally encodes a causal structure in Granger causality. However, the vast majority of existing methods use direct or nonlinear transform of standard MHP intensity with constant baseline, inconsistent with real-world data. Under irregular and unknown heterogeneous intensity, capturing temporal dependency is hard as one struggles to distinguish the effect of mutual interaction from that of intensity fluctuation. In this paper, we address the short-term temporal dependency detection issue. We show the maximum likelihood estimation (MLE) for cross-impact from MHP has an error that can not be eliminated but may be reduced by order of magnitude, using heterogeneous intensity not of the target HP but of the interacting HP. Then we proposed a robust and computationally-efficient method modified from MLE that does not rely on the prior estimation of the heterogeneous intensity and is thus applicable in a data-limited regime (e.g., few-shot, no repeated observations). Extensive experiments on various datasets show that our method outperforms existing ones by notable margins, with highlighted novel applications in neuroscience.

Submitted 28 May, 2023; originally announced May 2023.

Comments: Conference on Uncertainty in Artificial Intelligence 2023

arXiv:2305.13443 [pdf, other]

Multiply robust estimation for causal survival analysis with treatment noncompliance

Authors: Chao Cheng, Yueqi Guo, Bo Liu, Lisa Wruck, Fan Li, Fan Li

Abstract: Comparative effectiveness research frequently addresses a time-to-event outcome and can require unique considerations in the presence of treatment noncompliance. Motivated by the challenges in addressing noncompliance in the ADAPTABLE pragmatic trial, we develop a multiply robust estimator to estimate the principal survival causal effects under the principal ignorability and monotonicity assumption. The multiply robust estimator involves several working models including that for the treatment assignment, the compliance strata, censoring, and time-to-event of interest. The proposed estimator is consistent even if one, and sometimes two, of the working models are misspecified. We apply the multiply robust method in the ADAPTABLE trial to evaluate the effect of low- versus high-dose aspirin assignment on patients' death and hospitalization from cardiovascular diseases. We find that, comparing to low-dose assignment, assignment to the high-dose leads to differential effects among always high-dose takers, compliers, and always low-dose takers. Such treatment effect heterogeneity contributes to the null intention-to-treatment effect, and suggests that policy makers should design personalized strategies based on potential compliance patterns to maximize treatment benefits to the entire study population. We further perform a formal sensitivity analysis for investigating the robustness of our causal conclusions under violation of two identification assumptions specific to noncompliance.

Submitted 27 July, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2304.10025 [pdf, other]

Identification and multiply robust estimation in causal mediation analysis across principal strata

Authors: Chao Cheng, Fan Li

Abstract: We consider assessing causal mediation in the presence of a post-treatment event (examples include noncompliance, a clinical event, or a terminal event). We identify natural mediation effects for the entire study population and for each principal stratum characterized by the joint potential values of the post-treatment event. We derive efficient influence functions for each mediation estimand, which motivate a set of multiply robust estimators for inference. The multiply robust estimators are consistent under four types of misspecifications and are efficient when all nuisance models are correctly specified. We illustrate our methods via simulations and two real data examples.

Submitted 25 March, 2024; v1 submitted 19 April, 2023; originally announced April 2023.

arXiv:2304.04868 [pdf, other]

Correcting for bias due to mismeasured exposure in mediation analysis with a survival outcome

Authors: Chao Cheng, Donna Spiegelman, Fan Li

Abstract: Mediation analysis is widely used in health science research to evaluate the extent to which an intermediate variable explains an observed exposure-outcome relationship. However, the validity of analysis can be compromised when the exposure is measured with error. Motivated by the Health Professionals Follow-up Study (HPFS), we investigate the impact of exposure measurement error on assessing mediation with a survival outcome, based on the Cox proportional hazards outcome model. When the outcome is rare and there is no exposure-mediator interaction, we show that the uncorrected estimators of the natural indirect and direct effects can be biased into either direction, but the uncorrected estimator of the mediation proportion is approximately unbiased as long as the measurement error is not large or the mediator-exposure association is not strong. We develop ordinary regression calibration and risk set regression calibration approaches to correct the exposure measurement error-induced bias when estimating mediation effects and allowing for an exposure-mediator interaction in the Cox outcome model. The proposed approaches require a validation study to characterize the measurement error process. We apply the proposed approaches to the HPFS (1986-2016) to evaluate extent to which reduced body mass index mediates the protective effect of vigorous physical activity on the risk of cardiovascular diseases, and compare the finite-sample properties of the proposed estimators via simulations.

Submitted 15 September, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

arXiv:2304.03928 [pdf]

doi 10.1039/D3NR02322B

Interpretable machine learning-accelerated seed treatment by nanomaterials for environmental stress alleviation

Authors: Hengjie Yu, Dan Luo, Sam F. Y. Li, Maozhen Qu, Da Liu, Yingchao He, Fang Cheng

Abstract: Crops are constantly challenged by different environmental conditions. Seed treatment by nanomaterials is a cost-effective and environmentally-friendly solution for environmental stress mitigation in crop plants. Here, 56 seed nanopriming treatments are used to alleviate environmental stresses in maize. Seven selected nanopriming treatments significantly increase the stress resistance index (SRI) by 13.9% and 12.6% under salinity stress and combined heat-drought stress, respectively. Metabolomics data reveals that ZnO nanopriming treatment, with the highest SRI value, mainly regulates the pathways of amino acid metabolism, secondary metabolite synthesis, carbohydrate metabolism, and translation. Understanding the mechanism of seed nanopriming is still difficult due to the variety of nanomaterials and the complexity of interactions between nanomaterials and plants. Using the nanopriming data, we present an interpretable structure-activity relationship (ISAR) approach based on interpretable machine learning for predicting and understanding its stress mitigation effects. The post hoc and model-based interpretation approaches of machine learning are combined to provide complementary benefits and give researchers or policymakers more illuminating or trustworthy results. The concentration, size, and zeta potential of nanoparticles are identified as dominant factors for correlating root dry weight under salinity stress, and their effects and interactions are explained. Additionally, a web-based interactive tool is developed for offering prediction-level interpretation and gathering more details about specific nanopriming treatments. This work offers a promising framework for accelerating the agricultural applications of nanomaterials and may profoundly contribute to nanosafety assessment.

Submitted 8 April, 2023; originally announced April 2023.

Comments: 30 pages, 6 figures

arXiv:2304.02740 [pdf, other]

PStrata: An R Package for Principal Stratification

Authors: Bo Liu, Fan Li

Abstract: Post-treatment confounding is a common problem in causal inference, including special cases of noncompliance, truncation by death, surrogate endpoint, etc. Principal stratification (Frangakis and Rubin 2002) is a general framework for defining and estimating causal effects in the presence of post-treatment confounding. A prominent special case is the instrumental variable approach to noncompliance in randomized experiments (Angrist, Imbens, and Rubin 1996). Despite its versatility, principal stratification is not accessible to the vast majority of applied researchers because its inherent latent mixture structure requires complex inference tools and highly customized programming. We develop the R package PStrata to automatize statistical analysis of principal stratification for several common scenarios. PStrata supports both Bayesian and frequentist paradigms. For the Bayesian paradigm, the computing architecture combines R, C++, Stan, where R provides user-interface, Stan automatizes posterior sampling, and C++ bridges the two by automatically generating Stan code. For the Frequentist paradigm, PStrata implements a triply-robust weighting estimator. PStrata accommodates regular outcomes and time-to-event outcomes with both unstructured and clustered data.

Submitted 5 April, 2023; originally announced April 2023.

arXiv:2304.01506 [pdf, other]

doi 10.14778/3583140.3583155

OneShotSTL: One-Shot Seasonal-Trend Decomposition For Online Time Series Anomaly Detection And Forecasting

Authors: Xiao He, Ye Li, Jian Tan, Bin Wu, Feifei Li

Abstract: Seasonal-trend decomposition is one of the most fundamental concepts in time series analysis that supports various downstream tasks, including time series anomaly detection and forecasting. However, existing decomposition methods rely on batch processing with a time complexity of O(W), where W is the number of data points within a time window. Therefore, they cannot always efficiently support real-time analysis that demands low processing delay. To address this challenge, we propose OneShotSTL, an efficient and accurate algorithm that can decompose time series online with an update time complexity of O(1). OneShotSTL is more than $1,000$ times faster than the batch methods, with accuracy comparable to the best counterparts. Extensive experiments on real-world benchmark datasets for downstream time series anomaly detection and forecasting tasks demonstrate that OneShotSTL is from 10 to over 1,000 times faster than the state-of-the-art methods, while still providing comparable or even better accuracy.

Submitted 3 April, 2023; originally announced April 2023.

Comments: PVLDB 2023

Report number: 1399-1412

arXiv:2304.00231 [pdf]

Using Overlap Weights to Address Extreme Propensity Scores in Estimating Restricted Mean Counterfactual Survival Times

Authors: Zhiqiang Cao, Lama Ghazi, Claudia Mastrogiacomo, Laura Forastiere, F. Perry Wilson, Fan Li

Abstract: While the inverse probability of treatment weighting (IPTW) is a commonly used approach for treatment comparisons in observational data, the resulting estimates may be subject to bias and excessively large variance when there is lack of overlap in the propensity score distributions. By smoothly down-weighting the units with extreme propensity scores, overlap weighting (OW) can help mitigate the bias and variance issues associated with IPTW. Although theoretical and simulation results have supported the use of OW with continuous and binary outcomes, its performance with right-censored survival outcomes remains to be further investigated, especially when the target estimand is defined based on the restricted mean survival time (RMST)-a clinically meaningful summary measure free of the proportional hazards assumption. In this article, we combine propensity score weighting and inverse probability of censoring weighting to estimate the restricted mean counterfactual survival times, and propose computationally-efficient variance estimators. We conduct simulations to compare the performance of IPTW, trimming, and OW in terms of bias, variance, and 95% confidence interval coverage, under various degrees of covariate overlap. Regardless of overlap, we demonstrate the advantage of OW over IPTW and trimming methods in bias, variance, and coverage when the estimand is defined based on RMST.

Submitted 10 February, 2024; v1 submitted 1 April, 2023; originally announced April 2023.

arXiv:2304.00200 [pdf, ps, other]

Diffusion map particle systems for generative modeling

Authors: Fengyi Li, Youssef Marzouk

Abstract: We propose a novel diffusion map particle system (DMPS) for generative modeling, based on diffusion maps and Laplacian-adjusted Wasserstein gradient descent (LAWGD). Diffusion maps are used to approximate the generator of the corresponding Langevin diffusion process from samples, and hence to learn the underlying data-generating manifold. On the other hand, LAWGD enables efficient sampling from the target distribution given a suitable choice of kernel, which we construct here via a spectral approximation of the generator, computed with diffusion maps. Our method requires no offline training and minimal tuning, and can outperform other approaches on data sets of moderate dimension.

Submitted 30 October, 2023; v1 submitted 31 March, 2023; originally announced April 2023.

arXiv:2303.13960 [pdf]

Demystifying estimands in cluster-randomised trials

Authors: Brennan C Kahan, Bryan Blette, Michael Harhay, Scott Halpern, Vipul Jairath, Andrew Copas, Fan Li

Abstract: Estimands can help clarify the interpretation of treatment effects and ensure that estimators are aligned to the study's objectives. Cluster randomised trials require additional attributes to be defined within the estimand compared to individually randomised trials, including whether treatment effects are marginal or cluster specific, and whether they are participant or cluster average. In this paper, we provide formal definitions of estimands encompassing both these attributes using potential outcomes notation and describe differences between them. We then provide an overview of estimators for each estimand, describe their assumptions, and show consistency (i.e. asymptotically unbiased estimation) for a series of analyses based on cluster level summaries. Then, through a reanalysis of a published cluster randomised trial, we demonstrate that the choice of both estimand and estimator can affect interpretation. For instance, the estimated odds ratio ranged from 1.38 (p=0.17) to 1.83 (p=0.03) depending on the target estimand, and for some estimands, the choice of estimator affected the conclusions by leading to smaller treatment effect estimates. We conclude that careful specification of the estimand, along with an appropriate choice of estimator, are essential to ensuring that cluster randomised trials address the right question.

Submitted 22 February, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

arXiv:2302.09138 [pdf, other]

doi 10.1002/sim.9830

Maximin optimal cluster randomized designs for assessing treatment effect heterogeneity

Authors: Mary M. Ryan, Denise Esserman, Fan Li

Abstract: Cluster randomized trials (CRTs) are studies where treatment is randomized at the cluster level but outcomes are typically collected at the individual level. When CRTs are employed in pragmatic settings, baseline population characteristics may moderate treatment effects, leading to what is known as heterogeneous treatment effects (HTEs). Pre-specified, hypothesis-driven HTE analyses in CRTs can enable an understanding of how interventions may impact subpopulation outcomes. While closed-form sample size formulas have recently been proposed, assuming known intracluster correlation coefficients (ICCs) for both the covariate and outcome, guidance on optimal cluster randomized designs to ensure maximum power with pre-specified HTE analyses has not yet been developed. We derive new design formulas to determine the cluster size and number of clusters to achieve the locally optimal design (LOD) that minimizes variance for estimating the HTE parameter given a budget constraint. Given the LODs are based on covariate and outcome-ICC values that are usually unknown, we further develop the maximin design for assessing HTE, identifying the combination of design resources that maximize the relative efficiency of the HTE analysis in the worst case scenario. In addition, given the analysis of the average treatment effect is often of primary interest, we also establish optimal designs to accommodate multiple objectives by combining considerations for studying both the average and heterogeneous treatment effects. We illustrate our methods using the context of the Kerala Diabetes Prevention Program CRT, and provide an R Shiny app to facilitate calculation of optimal designs under a wide range of design parameters.

Submitted 30 May, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: 25 pages, 6 figures, 5 tables, 3 appendices; clarified phrasing, typos corrected

Journal ref: Statistics in Medicine. (2023) 1-22

arXiv:2301.07672 [pdf, other]

Principal Stratification with Time-to-Event Outcomes

Authors: Bo Liu, Lisa Wruck, Fan Li

Abstract: Post-randomization events, also known as intercurrent events, such as treatment noncompliance and censoring due to a terminal event, are common in clinical trials. Principal stratification is a framework for causal inference in the presence of intercurrent events. Despite the extensive existing literature, there lacks generally applicable and accessible methods for principal stratification analysis with time-to-event outcomes. In this paper, we specify two causal estimands for time-to-event outcomes in principal stratification. For estimation, we adopt the general strategy of latent mixture modeling and derive the corresponding likelihood function. For computational convenience, we illustrate the general strategy with a mixture of Bayesian parametric Weibull-Cox proportional model for the outcome. We utilize the Stan programming language to obtain automatic posterior sampling of the model parameters via the Hamiltonian Monte Carlo. We provide the analytical forms of the causal estimands as functions of the model parameters and an alternative numerical method when analytical forms are not available. We apply the proposed method to the ADAPTABLE trial to evaluate the causal effect of taking 81 mg versus 325 mg aspirin on the risk of major adverse cardiovascular events.

Submitted 18 January, 2023; originally announced January 2023.

arXiv:2212.13892 [pdf, other]

Cross-Dataset Propensity Estimation for Debiasing Recommender Systems

Authors: Fengyu Li, Sarah Dean

Abstract: Datasets for training recommender systems are often subject to distribution shift induced by users' and recommenders' selection biases. In this paper, we study the impact of selection bias on datasets with different quantization. We then leverage two differently quantized datasets from different source distributions to mitigate distribution shift by applying the inverse probability scoring method from causal inference. Empirically, our approach gains significant performance improvement over single-dataset methods and alternative ways of combining two datasets.

Submitted 21 December, 2022; originally announced December 2022.

Comments: In Workshop on Distribution Shifts, 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2210.07324 [pdf, ps, other]

doi 10.1080/01621459.2023.2289693

Model-robust and efficient covariate adjustment for cluster-randomized experiments

Authors: Bingkai Wang, Chan Park, Dylan S. Small, Fan Li

Abstract: Cluster-randomized experiments are increasingly used to evaluate interventions in routine practice conditions, and researchers often adopt model-based methods with covariate adjustment in the statistical analyses. However, the validity of model-based covariate adjustment is unclear when the working models are misspecified, leading to ambiguity of estimands and risk of bias. In this article, we first adapt two conventional model-based methods, generalized estimating equations and linear mixed models, with weighted g-computation to achieve robust inference for cluster-average and individual-average treatment effects. To further overcome the limitations of model-based covariate adjustment methods, we propose an efficient estimator for each estimand that allows for flexible covariate adjustment and additionally addresses cluster size variation dependent on treatment assignment and other cluster characteristics. Such cluster size variations often occur post-randomization and, if ignored, can lead to bias of model-based estimators. For our proposed efficient covariate-adjusted estimator, we prove that when the nuisance functions are consistently estimated by machine learning algorithms, the estimator is consistent, asymptotically normal, and efficient. When the nuisance functions are estimated via parametric working models, the estimator is triply-robust. Simulation studies and analyses of three real-world cluster-randomized experiments demonstrate that the proposed methods are superior to existing alternatives.

Submitted 18 July, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

arXiv:2210.04100 [pdf, other]

Doubly robust estimation and sensitivity analysis for marginal structural quantile models

Authors: Chao Cheng, Liangyuan Hu, Fan Li

Abstract: The marginal structure quantile model (MSQM) provides a unique lens to understand the causal effect of a time-varying treatment on the full distribution of potential outcomes. Under the semiparametric framework, we derive the efficiency influence function for the MSQM, from which a new doubly robust estimator is proposed for point estimation and inference. We show that the doubly robust estimator is consistent if either of the models associated with treatment assignment or the potential outcome distributions is correctly specified, and is semiparametric efficient if both models are correct. To implement the doubly robust MSQM estimator, we propose to solve a smoothed estimating equation to facilitate efficient computation of the point and variance estimates. In addition, we develop a confounding function approach to investigate the sensitivity of several MSQM estimators when the sequential ignorability assumption is violated. Extensive simulations are conducted to examine the finite-sample performance characteristics of the proposed methods. We apply the proposed methods to the Yale New Haven Health System Electronic Health Record data to study the effect of antihypertensive medications to patients with severe hypertension and assess the robustness of findings to unmeasured baseline and time-varying confounding.

Submitted 10 February, 2024; v1 submitted 8 October, 2022; originally announced October 2022.

arXiv:2209.03533 [pdf, other]

Using propensity scores for racial disparities analysis

Authors: Fan Li, Fan Li

Abstract: Propensity score plays a central role in causal inference, but its use is not limited to causal comparisons. As a covariate balancing tool, propensity score can be used for controlled descriptive comparisons between groups whose memberships are not manipulable. A prominent example is racial disparities in health care. However, conceptual confusion and hesitation persists for using propensity score in racial disparities studies. In this commentary, we argue that propensity score, possibly combined with other methods, is an effective tool for racial disparities analysis. We describe relevant estimands, target population, and assumptions. In particular, we clarify that a controlled descriptive comparisons require weaker assumptions than a causal comparison. We discuss three common propensity score weighting strategies: overlap weighting, inverse probability weighting and average treatment effect for treated weighting. We further describe how to combine weighting with the rank-and-replace adjustment method to produce racial disparity estimates concordant to the Institute of Medicine's definition. The method is illustrated by a re-analysis of the Medical Expenditure Panel Survey data.

Submitted 7 September, 2022; originally announced September 2022.

Comments: This is an invited commentary. This commentary includes 10 pages, 2 figures and 1 table
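For readers unfamiliar with the three propensity score weighting strategies named in the abstract above (overlap weighting, inverse probability weighting, and average treatment effect for the treated weighting), the following is a minimal sketch of their textbook weight formulas. It is not taken from the commentary itself: the function name `balancing_weights`, the argument names, and the assumption that propensity scores `e` lie strictly between 0 and 1 are all illustrative choices.

```python
import numpy as np

def balancing_weights(e, z, scheme):
    """Per-unit weights from propensity scores (hypothetical helper).

    e      : propensity scores P(Z = 1 | X), assumed strictly inside (0, 1)
    z      : group indicator, 1 for the group of interest and 0 otherwise
    scheme : "overlap", "ipw", or "att"
    """
    e = np.asarray(e, dtype=float)
    z = np.asarray(z, dtype=int)
    if scheme == "overlap":
        # Overlap weights: each unit is weighted by the probability
        # of belonging to the opposite group.
        return np.where(z == 1, 1.0 - e, e)
    if scheme == "ipw":
        # Inverse probability weights: 1/e for z = 1, 1/(1 - e) for z = 0.
        return np.where(z == 1, 1.0 / e, 1.0 / (1.0 - e))
    if scheme == "att":
        # ATT weights: z = 1 units keep weight 1, z = 0 units get odds e/(1 - e).
        return np.where(z == 1, 1.0, e / (1.0 - e))
    raise ValueError(f"unknown scheme: {scheme!r}")

# Toy inputs, chosen only for illustration.
e = [0.2, 0.5, 0.9]
z = [1, 0, 1]
w_overlap = balancing_weights(e, z, "overlap")
w_ipw = balancing_weights(e, z, "ipw")
w_att = balancing_weights(e, z, "att")
```

Overlap weights are bounded between 0 and 1 by construction, whereas inverse probability weights grow without bound as `e` approaches 0 or 1, which is one common motivation for preferring the former when scores are extreme.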

arXiv:2209.01297 [pdf, other]

Assessing treatment effect heterogeneity in the presence of missing effect modifier data in cluster-randomized trials

Authors: Bryan S. Blette, Scott D. Halpern, Fan Li, Michael O. Harhay

Abstract: Understanding whether and how treatment effects vary across subgroups is crucial to inform clinical practice and recommendations. Accordingly, the assessment of heterogeneous treatment effects (HTE) based on pre-specified potential effect modifiers has become a common goal in modern randomized trials. However, when one or more potential effect modifiers are missing, complete-case analysis may lead to bias and under-coverage. While statistical methods for handling missing data have been proposed and compared for individually randomized trials with missing effect modifier data, few guidelines exist for the cluster-randomized setting, where intracluster correlations in the effect modifiers, outcomes, or even missingness mechanisms may introduce further threats to accurate assessment of HTE. In this article, the performance of several missing data methods are compared through a simulation study of cluster-randomized trials with continuous outcome and missing binary effect modifier data, and further illustrated using real data from the Work, Family, and Health Study. Our results suggest that multilevel multiple imputation (MMI) and Bayesian MMI have better performance than other available methods, and that Bayesian MMI has lower bias and closer to nominal coverage than standard MMI when there are model specification or compatibility issues.

Submitted 1 December, 2023; v1 submitted 2 September, 2022; originally announced September 2022.

arXiv:2209.00170 [pdf, other]

CPS Attack Detection under Limited Local Information in Cyber Security: A Multi-node Multi-class Classification Ensemble Approach

Authors: Junyi Liu, Yifu Tang, Haimeng Zhao, Xieheng Wang, Fangyu Li, **gyi Zhang

Abstract: Cybersecurity breaches are the common anomalies for distributed cyber-physical systems (CPS). However, the cyber security breach classification is still a difficult problem, even using cutting-edge artificial intelligence (AI) approaches. In this paper, we study the multi-class classification problem in cyber security for attack detection. A challenging multi-node data-censoring case is considered. In such a case, data within each data center/node cannot be shared while the local data is incomplete. Particularly, local nodes contain only a part of the multiple classes. In order to train a global multi-class classifier without sharing the raw data across all nodes, the main result of our study is designing a multi-node multi-class classification ensemble approach. By gathering the estimated parameters of the binary classifiers and data densities from each local node, the missing information for each local node is completed to build the global multi-class classifier. Numerical experiments are given to validate the effectiveness of the proposed approach under the multi-node data-censoring case. Under such a case, we even show the out-performance of the proposed approach over the full-data approach.

Submitted 31 August, 2022; originally announced September 2022.

Comments: 22 pages. Submitted to ACM Transactions on Sensor Networks (TOSN)

arXiv:2208.00139 [pdf, other]

Another look at forecast trimming for combinations: robustness, accuracy and diversity

Authors: Xiaoqian Wang, Yanfei Kang, Feng Li

Abstract: Forecast combination is widely recognized as a preferred strategy over forecast selection due to its ability to mitigate the uncertainty associated with identifying a single "best" forecast. Nonetheless, sophisticated combinations are often empirically dominated by simple averaging, which is commonly attributed to the weight estimation error. The issue becomes more problematic when dealing with a forecast pool containing a large number of individual forecasts. In this paper, we propose a new forecast trimming algorithm to identify an optimal subset from the original forecast pool for forecast combination tasks. In contrast to existing approaches, our proposed algorithm simultaneously takes into account the robustness, accuracy and diversity issues of the forecast pool, rather than isolating each one of these issues. We also develop five forecast trimming algorithms as benchmarks, including one trimming-free algorithm and several trimming algorithms that isolate each one of the three key issues. Experimental results show that our algorithm achieves superior forecasting performance in general in terms of both point forecasts and prediction intervals. Nevertheless, we argue that diversity does not always have to be addressed in forecast trimming. Based on the results, we offer some practical guidelines on the selection of forecast trimming algorithms for a target series.

Submitted 14 June, 2024; v1 submitted 30 July, 2022; originally announced August 2022.

arXiv:2207.07890 [pdf, other]

doi 10.1002/sim.9840

Covariate Adjustment in Randomized Clinical Trials with Missing Covariate and Outcome Data

Authors: Chia-Rui Chang, Yue Song, Fan Li, Rui Wang

Abstract: When analyzing data from randomized clinical trials, covariate adjustment can be used to account for chance imbalance in baseline covariates and to increase precision of the treatment effect estimate. A practical barrier to covariate adjustment is the presence of missing data. In this paper, in the light of recent theoretical advancement, we first review several covariate adjustment methods with incomplete covariate data. We investigate the implications of the missing data mechanism on estimating the average treatment effect in randomized clinical trials with continuous or binary outcomes. In parallel, we consider settings where the outcome data are fully observed or are missing at random; in the latter setting, we propose a full weighting approach that combines inverse probability weighting for adjusting missing outcomes and overlap weighting for covariate adjustment. We highlight the importance of including the interaction terms between the missingness indicators and covariates as predictors in the models. We conduct comprehensive simulation studies to examine the finite-sample performance of the proposed methods and compare with a range of common alternatives. We find that conducting the proposed adjustment methods generally improves the precision of treatment effect estimates regardless of the imputation methods when the adjusted covariate is associated with the outcome. We apply the methods to the Childhood Adenotonsillectomy Trial to assess the effect of adenotonsillectomy on neurocognitive functioning scores.

Submitted 16 May, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

arXiv:2206.15460 [pdf, other]

Bayesian Causal Inference: A Critical Review

Authors: Fan Li, Peng Ding, Fabrizia Mealli

Abstract: This paper provides a critical review of the Bayesian perspective of causal inference based on the potential outcomes framework. We review the causal estimands, identification assumptions, the general structure of Bayesian inference of causal effects, and sensitivity analysis. We highlight issues that are unique to Bayesian causal inference, including the role of the propensity score, definition of identifiability, the choice of priors in both low and high dimensional regimes. We point out the central role of covariate overlap and more generally the design stage in Bayesian causal inference. We extend the discussion to two complex assignment mechanisms: instrumental variable and time-varying treatments. Throughout, we illustrate the key concepts via examples.

Submitted 23 October, 2022; v1 submitted 30 June, 2022; originally announced June 2022.

arXiv:2206.11978 [pdf, other]

Power analyses for stepped wedge designs with multivariate continuous outcomes

Authors: Kendra Davis-Plourde, Monica Taljaard, Fan Li

Abstract: Multivariate outcomes are common in pragmatic cluster randomized trials. While sample size calculation procedures for multivariate outcomes exist under parallel assignment, none have been developed for a stepped wedge design. In this article, we present computationally efficient power and sample size procedures for stepped wedge cluster randomized trials (SW-CRTs) with multivariate outcomes that differentiate the within-period and between-period intracluster correlation coefficients (ICCs). Under a multivariate linear mixed model, we derive the joint distribution of the intervention test statistics which can be used for determining power under different hypotheses and provide an example using the commonly utilized intersection-union test for co-primary outcomes. Simplifications under a common treatment effect and common ICCs across endpoints and an extension to closed cohort designs are also provided. Finally, under the common ICC across endpoints assumption, we formally prove that the multivariate linear mixed model leads to a more efficient treatment effect estimator compared to the univariate linear mixed model, providing a rigorous justification on the use of the former with multivariate outcomes. We illustrate application of the proposed methods using data from an existing SW-CRT and present extensive simulations to validate the methods.

Submitted 2 December, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

arXiv:2206.11343 [pdf, other]

Bayesian model calibration for block copolymer self-assembly: Likelihood-free inference and expected information gain computation via measure transport

Authors: Ricardo Baptista, Lianghao Cao, Joshua Chen, Omar Ghattas, Fengyi Li, Youssef M. Marzouk, J. Tinsley Oden

Abstract: We consider the Bayesian calibration of models describing the phenomenon of block copolymer (BCP) self-assembly using image data produced by microscopy or X-ray scattering techniques. To account for the random long-range disorder in BCP equilibrium structures, we introduce auxiliary variables to represent this aleatory uncertainty. These variables, however, result in an integrated likelihood for high-dimensional image data that is generally intractable to evaluate. We tackle this challenging Bayesian inference problem using a likelihood-free approach based on measure transport together with the construction of summary statistics for the image data. We also show that expected information gains (EIGs) from the observed data about the model parameters can be computed with no significant additional cost. Lastly, we present a numerical case study based on the Ohta--Kawasaki model for diblock copolymer thin film self-assembly and top-down microscopy characterization. For calibration, we introduce several domain-specific energy- and Fourier-based summary statistics, and quantify their informativeness using EIG. We demonstrate the power of the proposed approach to study the effect of data corruptions and experimental designs on the calibration results.

Submitted 22 June, 2022; originally announced June 2022.

Showing 1–50 of 147 results for author: Li, F