-
Estimating conditional hazard functions and densities with the highly-adaptive lasso
Authors:
Anders Munch,
Thomas A. Gerds,
Mark J. van der Laan,
Helene C. W. Rytgaard
Abstract:
We consider estimation of conditional hazard functions and densities over the class of multivariate càdlàg functions with uniformly bounded sectional variation norm when data are either fully observed or subject to right-censoring. We demonstrate that the empirical risk minimizer is either not well-defined or not consistent for estimation of conditional hazard functions and densities. Under a smoothness assumption about the data-generating distribution, a highly-adaptive lasso estimator based on a particular data-adaptive sieve achieves the same convergence rate as has been shown to hold for the empirical risk minimizer in settings where the latter is well-defined. We use this result to study a highly-adaptive lasso estimator of a conditional hazard function based on right-censored data. We also propose a new conditional density estimator and derive its convergence rate. Finally, we show that the result is of interest also for settings where the empirical risk minimizer is well-defined, because the highly-adaptive lasso depends on a much smaller number of basis functions than the empirical risk minimizer.
Submitted 17 April, 2024;
originally announced April 2024.
-
Nonparametric estimation of a covariate-adjusted counterfactual treatment regimen response curve
Authors:
Ashkan Ertefaie,
Luke Duttweiler,
Brent A. Johnson,
Mark J. van der Laan
Abstract:
Flexible estimation of the mean outcome under a treatment regimen (i.e., value function) is the key step toward personalized medicine. We define our target parameter as a conditional value function given a set of baseline covariates, which we refer to as a stratum-based value function. We focus on a semiparametric class of decision rules and propose a sieve-based nonparametric covariate-adjusted regimen-response curve estimator within that class. Our work contributes in several ways. First, we propose an inverse probability weighted nonparametrically efficient estimator of the smoothed regimen-response curve function. We show that asymptotic linearity is achieved when the nuisance functions are undersmoothed sufficiently. Asymptotic and finite sample criteria for undersmoothing are proposed. Second, using Gaussian process theory, we propose simultaneous confidence intervals for the smoothed regimen-response curve function. Third, we provide consistency and convergence rate for the optimizer of the regimen-response curve estimator; this enables us to estimate an optimal semiparametric rule. The latter is important as the optimizer corresponds with the optimal dynamic treatment regimen. Some finite-sample properties are explored with simulations.
Submitted 27 September, 2023;
originally announced September 2023.
-
Why Machine Learning Cannot Ignore Maximum Likelihood Estimation
Authors:
Mark J. van der Laan,
Sherri Rose
Abstract:
The growth of machine learning as a field has been accelerating with increasing interest and publications across fields, including statistics, but predominantly in computer science. How can we parse this vast literature for developments that exemplify the necessary rigor? How many of these manuscripts incorporate foundational theory to allow for statistical inference? Which advances have the greatest potential for impact in practice? One could posit many answers to these queries. Here, we assert that one essential idea is for machine learning to integrate maximum likelihood for estimation of functional parameters, such as prediction functions and conditional densities.
Submitted 22 October, 2021;
originally announced October 2021.
-
Continuous-time targeted minimum loss-based estimation of intervention-specific mean outcomes
Authors:
Helene C. Rytgaard,
Thomas A. Gerds,
Mark J. van der Laan
Abstract:
This paper studies the generalization of the targeted minimum loss-based estimation (TMLE) framework to estimation of effects of time-varying interventions in settings where interventions, covariates, and outcome can all happen at subject-specific time-points on an arbitrarily fine time-scale. TMLE is a general template for constructing asymptotically linear substitution estimators for smooth low-dimensional parameters in infinite-dimensional models. Existing longitudinal TMLE methods are developed for data where observations are made on a discrete time-grid.
We consider a continuous-time counting process model where intensity measures track the monitoring of subjects, and focus on a low-dimensional target parameter defined as the intervention-specific mean outcome at the end of follow-up. To construct our TMLE algorithm for the given statistical estimation problem we derive an expression for the efficient influence curve and represent the target parameter as a functional of intensities and conditional expectations. The high-dimensional nuisance parameters of our model are estimated and updated in an iterative manner according to separate targeting steps for the involved intensities and conditional expectations.
The resulting estimator solves the efficient influence curve equation. We state a general efficiency theorem and describe a highly adaptive lasso estimator for nuisance parameters that allows us to establish asymptotic linearity and efficiency of our estimator under minimal conditions on the underlying statistical model.
Submitted 5 May, 2021;
originally announced May 2021.
-
Adaptive Sequential Design for a Single Time-Series
Authors:
Ivana Malenica,
Aurelien Bibaut,
Mark J. van der Laan
Abstract:
The current work is motivated by the need for robust statistical methods for precision medicine; as such, we address the need for statistical methods that provide actionable inference for a single unit at any point in time. We aim to learn an optimal, unknown choice of the controlled components of the design in order to optimize the expected outcome; with that, we adapt the randomization mechanism for future time-point experiments based on the data collected on the individual over time. Our results demonstrate that one can learn the optimal rule based on a single sample, and thereby adjust the design at any point t with valid inference for the mean target parameter. This work provides several contributions to the field of statistical precision medicine. First, we define a general class of averages of conditional causal parameters defined by the current context for the single unit time-series data. We define a nonparametric model for the probability distribution of the time-series under few assumptions, and aim to fully utilize the sequential randomization in the estimation procedure via the double robust structure of the efficient influence curve of the proposed target parameter. We present multiple exploration-exploitation strategies for assigning treatment, and methods for estimating the optimal rule. Lastly, we present the study of the data-adaptive inference on the mean under the optimal treatment rule, where the target parameter adapts over time in response to the observed context of the individual. Our target parameter is pathwise differentiable with an efficient influence function that is doubly robust - which makes it easier to estimate than previously proposed variations. We characterize the limit distribution of our estimator under a Donsker condition expressed in terms of a notion of bracketing entropy adapted to martingale settings.
Submitted 1 July, 2021; v1 submitted 29 January, 2021;
originally announced February 2021.
-
Nonparametric causal mediation analysis for stochastic interventional (in)direct effects
Authors:
Nima S. Hejazi,
Kara E. Rudolph,
Mark J. van der Laan,
Iván Díaz
Abstract:
Causal mediation analysis has historically been limited in two important ways: (i) a focus has traditionally been placed on binary treatments and static interventions, and (ii) direct and indirect effect decompositions have been pursued that are only identifiable in the absence of intermediate confounders affected by treatment. We present a theoretical study of an (in)direct effect decomposition of the population intervention effect, defined by stochastic interventions jointly applied to the treatment and mediators. In contrast to existing proposals, our causal effects can be evaluated regardless of whether a treatment is categorical or continuous and remain well-defined even in the presence of intermediate confounders affected by treatment. Our (in)direct effects are identifiable without a restrictive assumption on cross-world counterfactual independencies, allowing for substantive conclusions drawn from them to be validated in randomized controlled trials. Beyond the novel effects introduced, we provide a careful study of nonparametric efficiency theory relevant for the construction of flexible, multiply robust estimators of our (in)direct effects, while avoiding undue restrictions induced by assuming parametric models of nuisance parameter functionals. To complement our nonparametric estimation strategy, we introduce inferential techniques for constructing confidence intervals and hypothesis tests, and discuss open source software implementing the proposed methodology.
Submitted 11 January, 2022; v1 submitted 14 September, 2020;
originally announced September 2020.
-
Sufficient and insufficient conditions for the stochastic convergence of Cesàro means
Authors:
Aurélien F. Bibaut,
Alex Luedtke,
Mark J. van der Laan
Abstract:
We study the stochastic convergence of the Cesàro mean of a sequence of random variables. These arise naturally in statistical problems that have a sequential component, where the sequence of random variables is typically derived from a sequence of estimators computed on data. We show that establishing a rate of convergence in probability for a sequence is not sufficient in general to establish a rate in probability for its Cesàro mean. We also present several sets of conditions on the sequence of random variables that are sufficient to guarantee a rate of convergence for its Cesàro mean. We identify common settings in which these sets of conditions hold.
Submitted 13 September, 2020;
originally announced September 2020.
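To make the object in the Cesàro-mean abstract above concrete, here is a minimal sketch that computes running Cesàro means. The toy sequence `x_i = 1/i` is an assumption chosen for illustration: it is deterministic, so it is only an analogy for the in-probability statements in the abstract and is not the authors' counterexample. The helper name `cesaro_means` is hypothetical.

```python
from itertools import accumulate
from math import log


def cesaro_means(xs):
    """Return the running Cesàro means (x_1 + ... + x_n) / n for n = 1..len(xs)."""
    return [total / n for n, total in enumerate(accumulate(xs), start=1)]


# Deterministic toy sequence x_i = 1/i, which tends to 0 at rate 1/i.
n = 10_000
xs = [1.0 / i for i in range(1, n + 1)]
means = cesaro_means(xs)

# The Cesàro mean of this sequence is H_n / n (harmonic number over n),
# which behaves like log(n) / n: slower than the 1/n rate of x_n itself.
print("x_n relative to 1/n:           ", xs[-1] * n)
print("Cesàro mean relative to log n/n:", means[-1] * n / log(n))
```

Even in this simple case the averaged sequence loses a logarithmic factor relative to the original one, which gives some intuition for why a rate for a sequence does not automatically transfer to its Cesàro mean.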
-
Nonparametric inverse probability weighted estimators based on the highly adaptive lasso
Authors:
Ashkan Ertefaie,
Nima S. Hejazi,
Mark J. van der Laan
Abstract:
Inverse probability weighted estimators are the oldest and potentially most commonly used class of procedures for the estimation of causal effects. By adjusting for selection biases via a weighting mechanism, these procedures estimate an effect of interest by constructing a pseudo-population in which selection biases are eliminated. Despite their ease of use, these estimators require the correct specification of a model for the weighting mechanism, are known to be inefficient, and suffer from the curse of dimensionality. We propose a class of nonparametric inverse probability weighted estimators in which the weighting mechanism is estimated via undersmoothing of the highly adaptive lasso, a nonparametric regression function proven to converge at $n^{-1/3}$-rate to the true weighting mechanism. We demonstrate that our estimators are asymptotically linear with variance converging to the nonparametric efficiency bound. Unlike doubly robust estimators, our procedures require neither derivation of the efficient influence function nor specification of the conditional outcome model. Our theoretical developments have broad implications for the construction of efficient inverse probability weighted estimators in large statistical models and a variety of problem settings. We assess the practical performance of our estimators in simulation studies and demonstrate use of our proposed methodology with data from a large-scale epidemiologic study.
Submitted 3 July, 2021; v1 submitted 22 May, 2020;
originally announced May 2020.
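The weighting idea described in the abstract above can be sketched in a few lines. This is a generic Horvitz-Thompson style inverse probability weighted estimator for a binary treatment, not the authors' procedure: the propensity scores are taken as given, whereas the paper estimates the weighting mechanism with an undersmoothed highly adaptive lasso, which is not implemented here. The function name `ipw_mean` and the toy data are hypothetical.

```python
def ipw_mean(outcomes, treatments, propensities, arm=1):
    """Inverse probability weighted estimate of the mean outcome under `arm`.

    propensities[i] is an estimate of P(A = 1 | W_i); it is assumed to be
    supplied by some upstream estimator and to lie strictly between 0 and 1.
    """
    if not (len(outcomes) == len(treatments) == len(propensities)):
        raise ValueError("inputs must have equal length")
    total = 0.0
    for y, a, g1 in zip(outcomes, treatments, propensities):
        g = g1 if arm == 1 else 1.0 - g1  # probability of the arm of interest
        if a == arm:
            total += y / g  # units in the arm are up-weighted by 1 / g
    return total / len(outcomes)


# Toy data (made up for illustration only).
y = [1.0, 0.0, 1.0, 1.0]
a = [1, 0, 1, 0]
g = [0.5, 0.5, 0.8, 0.2]

# Difference of the two weighted means estimates the average treatment effect.
ate = ipw_mean(y, a, g, arm=1) - ipw_mean(y, a, g, arm=0)
print(ate)
```

Each unit observed in the arm of interest is weighted by the inverse of its probability of receiving that arm, which is how the pseudo-population mentioned in the abstract is formed.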
-
Efficient Estimation of Pathwise Differentiable Target Parameters with the Undersmoothed Highly Adaptive Lasso
Authors:
Mark J. van der Laan,
David Benkeser,
Weixin Cai
Abstract:
We consider estimation of a functional parameter of a realistically modeled data distribution based on observing independent and identically distributed observations. We define an $m$-th order Spline Highly Adaptive Lasso Minimum Loss Estimator (Spline HAL-MLE) of a functional parameter that is defined by minimizing the empirical risk function over an $m$-th order smoothness class of functions. We show that this $m$-th order smoothness class consists of all functions that can be represented as an infinitesimal linear combination of tensor products of $\leq m$-th order spline-basis functions, and involves assuming $m$-derivatives in each coordinate. By selecting $m$ with cross-validation we obtain a Spline-HAL-MLE that is able to adapt to the underlying unknown smoothness of the true function, while guaranteeing a rate of convergence faster than $n^{-1/4}$, as long as the true function is cadlag (right-continuous with left-hand limits) and has finite sectional variation norm. The $m=0$-smoothness class consists of all cadlag functions with finite sectional variation norm and corresponds with the original HAL-MLE defined in van der Laan (2015).
In this article we establish that this Spline-HAL-MLE yields an asymptotically efficient estimator of any smooth feature of the functional parameter under an easily verifiable global undersmoothing condition. A sufficient condition for the latter condition is that the minimum of the empirical mean of the selected basis functions is smaller than a constant times $n^{-1/2}$, which is not parameter specific and enforces the selection of the $L_1$-norm in the lasso to be large enough to include sparsely supported basis functions. We demonstrate our general result for the $m=0$-HAL-MLE of the average treatment effect and of the integral of the square of the data density. We also present simulations for these two examples confirming the theory.
Submitted 2 July, 2021; v1 submitted 14 August, 2019;
originally announced August 2019.
-
Fast rates for empirical risk minimization over càdlàg functions with bounded sectional variation norm
Authors:
Aurélien F. Bibaut,
Mark J. van der Laan
Abstract:
Empirical risk minimization over classes of functions that are bounded for some version of the variation norm has a long history, starting with Total Variation Denoising (Rudin et al., 1992), and has been considered by several recent articles, in particular Fang et al., 2019 and van der Laan, 2015. In this article, we consider empirical risk minimization over the class $\mathcal{F}_d$ of càdlàg functions over $[0,1]^d$ with bounded sectional variation norm (also called Hardy-Krause variation).
We show how a certain representation of functions in $\mathcal{F}_d$ allows us to bound the bracketing entropy of sieves of $\mathcal{F}_d$, and therefore derive rates of convergence in nonparametric function estimation. Specifically, for sieves whose growth is controlled by some rate $a_n$, we show that the empirical risk minimizer has rate of convergence $O_P(n^{-1/3} (\log n)^{2(d-1)/3} a_n)$. Remarkably, the dimension only affects the rate in $n$ through the logarithmic factor, making this method especially appropriate for high dimensional problems.
In particular, we show that in the case of nonparametric regression over sieves of càdlàg functions with bounded sectional variation norm, this upper bound on the rate of convergence holds for least-squares estimators, under the random design, sub-exponential errors setting.
Submitted 23 August, 2019; v1 submitted 22 July, 2019;
originally announced July 2019.
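The rate $n^{-1/3} (\log n)^{2(d-1)/3} a_n$ quoted in the abstract above can be evaluated numerically to see how the dimension $d$ enters. The sketch below ignores the unspecified constant in the $O_P$ bound and sets $a_n = 1$ by default; both are simplifying assumptions, and the function name `erm_rate` is hypothetical.

```python
from math import log


def erm_rate(n, d, a_n=1.0):
    """Evaluate n^(-1/3) * (log n)^(2(d-1)/3) * a_n, ignoring constants."""
    return n ** (-1.0 / 3.0) * log(n) ** (2.0 * (d - 1) / 3.0) * a_n


# The polynomial part n^(-1/3) is the same for every d; only the power of
# log(n) grows with the dimension.
for d in (1, 2, 5, 10):
    print(d, erm_rate(10**6, d))
```

For fixed $d$ the logarithmic factor is eventually dominated by $n^{-1/3}$, but at moderate sample sizes it can be large, so the numbers printed here are best read as a comparison across $d$ rather than as finite-sample guarantees.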
-
Robust variance estimation and inference for causal effect estimation
Authors:
Linh Tran,
Maya Petersen,
Joshua Schwab,
Mark J van der Laan
Abstract:
We consider a longitudinal data structure consisting of baseline covariates, time-varying treatment variables, intermediate time-dependent covariates, and a possibly time dependent outcome. Previous studies have shown that estimating the variance of asymptotically linear estimators using empirical influence functions in this setting results in anti-conservative estimates with increasing magnitudes of positivity violations, leading to poor coverage and uncontrolled Type I errors. In this paper, we present two alternative approaches to estimating the variance of these estimators: (i) a robust approach which directly targets the variance of the influence function as a counterfactual mean outcome, and (ii) a non-parametric bootstrap based approach that is theoretically valid and lowers the computational cost, thereby increasing the feasibility in non-parametric settings using complex machine learning algorithms. The performance of these approaches is compared to that of the empirical influence function in simulations across different levels of positivity violations and treatment effect sizes.
Submitted 6 October, 2018;
originally announced October 2018.
-
Robust Estimation of Data-Dependent Causal Effects based on Observing a Single Time-Series
Authors:
Mark J. van der Laan,
Ivana Malenica
Abstract:
Consider the case that one observes a single time-series, where at each time t one observes a data record O(t) involving treatment nodes A(t), possible covariates L(t) and an outcome node Y(t). The data record at time t carries information for a (potentially causal) effect of the treatment A(t) on the outcome Y(t), in the context defined by a fixed dimensional summary measure Co(t). We are concerned with defining causal effects that can be consistently estimated, with valid inference, for sequentially randomized experiments without further assumptions. More generally, we consider the case when the (possibly causal) effects can be estimated in a double robust manner, analogous to double robust estimation of effects in the i.i.d. causal inference literature. We propose a general class of averages of conditional (context-specific) causal parameters that can be estimated in a double robust manner, therefore fully utilizing the sequential randomization. We propose a targeted maximum likelihood estimator (TMLE) of these causal parameters, and present a general theorem establishing the asymptotic consistency and normality of the TMLE. We extend our general framework to a number of typically studied causal target parameters, including a sequentially adaptive design within a single unit that learns the optimal treatment rule for the unit over time. Our work opens up robust statistical inference for causal questions based on observing a single time-series on a particular unit.
Submitted 3 September, 2018;
originally announced September 2018.
-
Collaborative targeted inference from continuously indexed nuisance parameter estimators
Authors:
Cheng Ju,
Antoine Chambaz,
Mark J. van der Laan
Abstract:
We wish to infer the value of a parameter at a law from which we sample independent observations. The parameter is smooth and we can define two variation-independent features of the law, its $Q$- and $G$-components, such that estimating them consistently at a fast enough product of rates allows us to build a confidence interval (CI) with a given asymptotic level from a plain targeted minimum loss estimator (TMLE). Say that the above product is not fast enough and the algorithm for the $G$-component is fine-tuned by a real-valued $h$. A plain TMLE with an $h$ chosen by cross-validation would typically not yield a CI. We construct a collaborative TMLE (C-TMLE) and show under mild conditions that, if there exists an oracle $h$ that makes a bulky remainder term asymptotically Gaussian, then the C-TMLE yields a CI. We illustrate our findings with the inference of the average treatment effect. We conduct a simulation study where the $G$-component is estimated by the LASSO and $h$ is the bound on the coefficients' norms. It sheds light on small sample properties, in the face of low- to high-dimensional baseline covariates, and possibly positivity violation.
Submitted 5 April, 2018; v1 submitted 30 March, 2018;
originally announced April 2018.
-
Uniform Consistency of the Highly Adaptive Lasso Estimator of Infinite Dimensional Parameters
Authors:
Mark J. van der Laan,
Aurélien F. Bibaut
Abstract:
Consider the case that we observe $n$ independent and identically distributed copies of a random variable with a probability distribution known to be an element of a specified statistical model. We are interested in estimating an infinite dimensional target parameter that minimizes the expectation of a specified loss function. In \cite{generally_efficient_TMLE} we defined an estimator that minimizes the empirical risk over all multivariate real valued cadlag functions with variation norm bounded by some constant $M$ in the parameter space, and selects $M$ with cross-validation. We referred to this estimator as the Highly-Adaptive-Lasso estimator due to the fact that the constraint can be formulated as a bound $M$ on the sum of the coefficients of a linear combination of a very large number of basis functions. Specifically, in the case that the target parameter is a conditional mean, it can be implemented with the standard LASSO regression estimator. In \cite{generally_efficient_TMLE} we proved that the HAL-estimator is consistent w.r.t. the (quadratic) loss-based dissimilarity at a rate faster than $n^{-1/2}$ (i.e., faster than $n^{-1/4}$ w.r.t. a norm), even when the parameter space is completely nonparametric. The only assumption required for this rate is that the true parameter function has a finite variation norm. The loss-based dissimilarity is often equivalent with the square of an $L^2(P_0)$-type norm. In this article, we establish that under some weak continuity condition, the HAL-estimator is also uniformly consistent.
Submitted 19 September, 2017;
originally announced September 2017.
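The abstract above describes the Highly-Adaptive-Lasso constraint as a bound on the coefficients of a linear combination of a very large number of basis functions. The sketch below builds one such basis under an assumption not stated in the abstract: the zero-order indicator basis commonly used in the HAL literature, with knots placed at the observed points. Only the design matrix is constructed; the lasso fit itself (any standard L1-penalized regression on these columns) is omitted. The function name `hal_basis` and the sample points are hypothetical.

```python
from itertools import combinations


def hal_basis(points):
    """Zero-order indicator design matrix for a list of d-dimensional points.

    One column per (nonempty coordinate subset s, knot point k):
        phi_{s,k}(x) = prod_{j in s} 1{x_j >= k_j}
    with knots placed at the observed points themselves (an assumption).
    """
    d = len(points[0])
    # All nonempty subsets of the coordinates {0, ..., d-1}.
    subsets = [s for r in range(1, d + 1) for s in combinations(range(d), r)]
    columns = [(s, knot) for s in subsets for knot in points]
    design = [
        [int(all(x[j] >= knot[j] for j in s)) for s, knot in columns]
        for x in points
    ]
    return design, columns


points = [(0.2, 0.7), (0.5, 0.1), (0.9, 0.4)]
design, columns = hal_basis(points)

# n * (2**d - 1) columns before any duplicate columns are removed.
print(len(columns))
for row in design:
    print(row)
```

The number of columns grows like $n(2^d - 1)$, which is the sense in which the number of basis functions is "very large"; in practice duplicate columns are usually dropped before fitting.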
-
Data-adaptive smoothing for optimal-rate estimation of possibly non-regular parameters
Authors:
Aurelien F. Bibaut,
Mark J. van der Laan
Abstract:
We consider nonparametric inference of finite dimensional, potentially non-pathwise differentiable target parameters. In a nonparametric model, some examples of such parameters that are always non-pathwise differentiable target parameters include probability density functions at a point, or regression functions at a point. In causal inference, under appropriate causal assumptions, mean counterfactual outcomes can be pathwise differentiable or not, depending on the degree to which the positivity assumption holds.
In this paper, given a potentially non-pathwise differentiable target parameter, we introduce a family of approximating parameters, that are pathwise differentiable. This family is indexed by a scalar. In kernel regression or density estimation for instance, a natural choice for such a family is obtained by kernel smoothing and is indexed by the smoothing level. For the counterfactual mean outcome, a possible approximating family is obtained through truncation of the propensity score, and the truncation level then plays the role of the index.
We propose a method to data-adaptively select the index in the family, so as to optimize mean squared error. We prove an asymptotic normality result, which allows us to derive confidence intervals. Under some conditions, our estimator achieves an optimal mean squared error convergence rate. Confidence intervals are data-adaptive and have almost optimal width.
A simulation study demonstrates the practical performance of our estimators for the inference of a causal dose-response curve at a given treatment dose.
Submitted 12 July, 2017; v1 submitted 22 June, 2017;
originally announced June 2017.
-
Causal inference for social network data
Authors:
Elizabeth L. Ogburn,
Oleg Sofrygin,
Ivan Diaz,
Mark J. van der Laan
Abstract:
We describe semiparametric estimation and inference for causal effects using observational data from a single social network. Our asymptotic results are the first to allow for dependence of each observation on a growing number of other units as sample size increases. In addition, while previous methods have implicitly permitted only one of two possible sources of dependence among social network observations, we allow for both dependence due to transmission of information across network ties and for dependence due to latent similarities among nodes sharing ties. We propose new causal effects that are specifically of interest in social network settings, such as interventions on network ties and network structure. We use our methods to reanalyze an influential and controversial study that estimated causal peer effects of obesity using social network data from the Framingham Heart Study; after accounting for network structure we find no evidence for causal peer effects.
Submitted 1 June, 2022; v1 submitted 23 May, 2017;
originally announced May 2017.
-
Toward computerized efficient estimation in infinite-dimensional models
Authors:
Marco Carone,
Alexander R. Luedtke,
Mark J. van der Laan
Abstract:
Despite the risk of misspecification they are tied to, parametric models continue to be used in statistical practice because they are accessible to all. In particular, efficient estimation procedures in parametric models are simple to describe and implement. Unfortunately, the same cannot be said of semiparametric and nonparametric models. While the latter often reflect the level of available scientific knowledge more appropriately, performing efficient inference in these models is generally challenging. The efficient influence function is a key analytic object from which the construction of asymptotically efficient estimators can potentially be streamlined. However, the theoretical derivation of the efficient influence function requires specialized knowledge and is often a difficult task, even for experts. In this paper, we propose and discuss a numerical procedure for approximating the efficient influence function. The approach generalizes the simple nonparametric procedures described recently by Frangakis et al. (2015) and Luedtke et al. (2015) to arbitrary models. We present theoretical results to support our proposal, and also illustrate the method in the context of two examples. The proposed approach is an important step toward automating efficient estimation in general statistical models, thereby rendering the use of realistic models in statistical analyses much more accessible.
Submitted 30 August, 2016;
originally announced August 2016.
-
Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
Authors:
Alexander R. Luedtke,
Mark J. van der Laan
Abstract:
We consider challenges that arise in the estimation of the mean outcome under an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular and asymptotically linear (RAL) estimator of the optimal value. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-$n$ rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We also outline an extension of our approach to a multiple time point problem. All of our results are supported by simulations.
Submitted 24 March, 2016;
originally announced March 2016.
-
Second-Order Inference for the Mean of a Variable Missing at Random
Authors:
Iván Díaz,
Marco Carone,
Mark J. van der Laan
Abstract:
We present a second-order estimator of the mean of a variable subject to missingness, under the missing at random assumption. The estimator improves upon existing methods by using an approximate second-order expansion of the parameter functional, in addition to the first-order expansion employed by standard doubly robust methods. This results in weaker assumptions about the convergence rates necessary to establish consistency, local efficiency, and asymptotic linearity. The general estimation strategy is developed under the targeted minimum loss-based estimation (TMLE) framework. We present a simulation comparing the sensitivity of the first- and second-order estimators to the convergence rate of the initial estimators of the outcome regression and missingness score. In our simulation, the second-order TMLE improved the coverage probability of a confidence interval by up to 85%. In addition, we present a first-order estimator inspired by a second-order expansion of the parameter functional. This estimator only requires one-dimensional smoothing, whereas implementation of the second-order TMLE generally requires kernel smoothing on the covariate space. The proposed first-order estimator is expected to have improved finite-sample performance compared to existing first-order estimators. In our simulations, the proposed first-order estimator improved the coverage probability by up to 90%. We provide an illustration of our methods using a publicly available dataset to determine the effect of an anticoagulant on health outcomes of patients undergoing percutaneous coronary intervention. We provide R code implementing the proposed estimator.
Submitted 26 November, 2015;
originally announced November 2015.
-
An Omnibus Nonparametric Test of Equality in Distribution for Unknown Functions
Authors:
Alexander R. Luedtke,
Marco Carone,
Mark J. van der Laan
Abstract:
We present a novel family of nonparametric omnibus tests of the hypothesis that two unknown but estimable functions are equal in distribution when applied to the observed data structure. We developed these tests, which represent a generalization of the maximum mean discrepancy tests described in Gretton et al. [2006], using recent developments from the higher-order pathwise differentiability literature. Despite their complex derivation, the associated test statistics can be expressed rather simply as U-statistics. We study the asymptotic behavior of the proposed tests under the null hypothesis and under both fixed and local alternatives. We provide examples to which our tests can be applied and show that they perform well in a simulation study. As an important special case, our proposed tests can be used to determine whether an unknown function, such as the conditional average treatment effect, is equal to zero almost surely.
Submitted 13 June, 2017; v1 submitted 14 October, 2015;
originally announced October 2015.
-
Causal inference in longitudinal studies with history-restricted marginal structural models
Authors:
Romain Neugebauer,
Mark J. van der Laan,
Marshall M. Joffe,
Ira B. Tager
Abstract:
A new class of Marginal Structural Models (MSMs), History-Restricted MSMs (HRMSMs), was recently introduced for longitudinal data for the purpose of defining causal parameters which may often be better suited for public health research or at least more practicable than MSMs \cite{joffe,feldman}. HRMSMs allow investigators to analyze the causal effect of a treatment on an outcome based on a fixed, shorter and user-specified history of exposure compared to MSMs. By default, the latter represent the treatment causal effect of interest based on a treatment history defined by the treatments assigned between the study's start and outcome collection. We lay out in this article the formal statistical framework behind HRMSMs. Beyond allowing a more flexible causal analysis, HRMSMs improve computational tractability and mitigate statistical power concerns when designing longitudinal studies. We also develop three consistent estimators of HRMSM parameters under sufficient model assumptions: the Inverse Probability of Treatment Weighted (IPTW), G-computation and Double Robust (DR) estimators. In addition, we show that the assumptions commonly adopted for identification and consistent estimation of MSM parameters (existence of counterfactuals, consistency, time-ordering and sequential randomization assumptions) also lead to identification and consistent estimation of HRMSM parameters.
Submitted 9 May, 2007;
originally announced May 2007.