-
Nonsense associations in Markov random fields with pairwise dependence
Authors:
Sohom Bhattacharya,
Rajarshi Mukherjee,
Elizabeth Ogburn
Abstract:
Yule (1926) identified the issue of "nonsense correlations" in time series data, where dependence within each of two random vectors causes overdispersion -- i.e. variance inflation -- for measures of dependence between the two. During the near century since then, much has been written about nonsense correlations -- but nearly all of it confined to the time series literature. In this paper we provide the first, to our knowledge, rigorous study of this phenomenon for more general forms of (positive) dependence, specifically for Markov random fields on lattices and graphs. We consider both binary and continuous random vectors and three different measures of association: correlation, covariance, and the ordinary least squares coefficient that results from projecting one random vector onto the other. In some settings we find variance inflation consistent with Yule's nonsense correlation. However, surprisingly, we also find variance deflation in some settings, and in others the variance is unchanged under dependence. Perhaps most notably, we find general conditions under which OLS inference that ignores dependence is valid despite positive dependence in the regression errors, contradicting the presentation of OLS in countless textbooks and courses.
Submitted 5 February, 2024;
originally announced February 2024.
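The variance inflation that Yule described can be illustrated with a small simulation. The sketch below is not taken from the paper: it uses Yule's original time series setting (two independent AR(1) series) rather than the Markov random fields the paper studies. The function names, series length, autocorrelation value, and replication count are all illustrative assumptions.

```python
import random
import statistics


def ar1_series(n, rho, rng):
    """Simulate a stationary AR(1) series with unit marginal variance."""
    x = [rng.gauss(0.0, 1.0)]
    innovation_sd = (1.0 - rho ** 2) ** 0.5
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, innovation_sd))
    return x


def sample_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_variance(n, rho, reps, seed):
    """Variance, across replications, of the sample correlation between
    two series that are generated independently of each other."""
    rng = random.Random(seed)
    corrs = [
        sample_correlation(ar1_series(n, rho, rng), ar1_series(n, rho, rng))
        for _ in range(reps)
    ]
    return statistics.pvariance(corrs)


# rho = 0.0 gives i.i.d. observations; rho = 0.9 gives strong positive
# dependence within each series (assumed values, chosen for illustration).
var_iid = correlation_variance(n=50, rho=0.0, reps=500, seed=0)
var_dep = correlation_variance(n=50, rho=0.9, reps=500, seed=0)
print(f"variance of sample correlation, independent observations: {var_iid:.4f}")
print(f"variance of sample correlation, AR(1) with rho=0.9:       {var_dep:.4f}")
```

The two series are independent of each other in both runs, so the true correlation is zero throughout; only the dependence within each series differs. In this setting the spread of the sample correlation should be visibly larger under dependence. The abstract reports that for Markov random fields the variance may instead deflate or stay unchanged.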
-
Causal inference for social network data
Authors:
Elizabeth L. Ogburn,
Oleg Sofrygin,
Ivan Diaz,
Mark J. van der Laan
Abstract:
We describe semiparametric estimation and inference for causal effects using observational data from a single social network. Our asymptotic results are the first to allow for dependence of each observation on a growing number of other units as sample size increases. In addition, while previous methods have implicitly permitted only one of two possible sources of dependence among social network observations, we allow for both dependence due to transmission of information across network ties and for dependence due to latent similarities among nodes sharing ties. We propose new causal effects that are specifically of interest in social network settings, such as interventions on network ties and network structure. We use our methods to reanalyze an influential and controversial study that estimated causal peer effects of obesity using social network data from the Framingham Heart Study; after accounting for network structure we find no evidence for causal peer effects.
Submitted 1 June, 2022; v1 submitted 23 May, 2017;
originally announced May 2017.
-
The magnitude and direction of collider bias for binary variables
Authors:
Trang Quynh Nguyen,
Allan Dafoe,
Elizabeth L. Ogburn
Abstract:
Suppose we are interested in the effect of variable $X$ on variable $Y$. If $X$ and $Y$ both influence, or are associated with variables that influence, a common outcome, called a collider, then conditioning on the collider (or on a variable influenced by the collider -- its "child") induces a spurious association between $X$ and $Y$, which is known as collider bias. Characterizing the magnitude and direction of collider bias is crucial for understanding the implications of selection bias and for adjudicating decisions about whether to control for variables that are known to be associated with both exposure and outcome but could be either confounders or colliders. Considering a class of situations where all variables are binary, and where $X$ and $Y$ either are, or are respectively influenced by, two marginally independent causes of a collider, we derive collider bias that results from (i) conditioning on specific levels of, or (ii) linear regression adjustment for, the collider (or its child). We also derive simple conditions that determine the sign of such bias.
Submitted 14 January, 2019; v1 submitted 2 September, 2016;
originally announced September 2016.
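Collider bias in the all-binary case can be checked by exact enumeration. The sketch below is not the paper's derivation. It assumes the simplest situation the abstract mentions, in which $X$ and $Y$ are themselves the two marginally independent causes of the collider $C$. The probabilities and the additive form of $P(C=1 \mid X, Y)$ are hypothetical values chosen for illustration.

```python
from itertools import product

# Assumed marginal probabilities for the two independent binary causes.
P_X1 = 0.5
P_Y1 = 0.5


def p_collider(x, y):
    """Hypothetical P(C = 1 | X = x, Y = y): both causes raise C additively."""
    return 0.1 + 0.4 * x + 0.4 * y


def joint_probability(x, y, c):
    """P(X = x, Y = y, C = c) with X and Y marginally independent."""
    px = P_X1 if x == 1 else 1.0 - P_X1
    py = P_Y1 if y == 1 else 1.0 - P_Y1
    pc = p_collider(x, y) if c == 1 else 1.0 - p_collider(x, y)
    return px * py * pc


def covariance_xy(c=None):
    """Cov(X, Y), marginally when c is None, otherwise given C = c."""
    cells = [
        (x, y, k)
        for x, y, k in product((0, 1), repeat=3)
        if c is None or k == c
    ]
    total = sum(joint_probability(*cell) for cell in cells)
    e_x = sum(x * joint_probability(x, y, k) for x, y, k in cells) / total
    e_y = sum(y * joint_probability(x, y, k) for x, y, k in cells) / total
    e_xy = sum(x * y * joint_probability(x, y, k) for x, y, k in cells) / total
    return e_xy - e_x * e_y


cov_marginal = covariance_xy()
cov_given_c1 = covariance_xy(c=1)
cov_given_c0 = covariance_xy(c=0)
print(f"Cov(X, Y)         = {cov_marginal:+.4f}")
print(f"Cov(X, Y | C = 1) = {cov_given_c1:+.4f}")
print(f"Cov(X, Y | C = 0) = {cov_given_c0:+.4f}")
```

Under these assumed numbers $X$ and $Y$ are uncorrelated marginally but negatively associated within each level of $C$, which is the spurious association the abstract calls collider bias. Other choices of `p_collider` can change the size and sign of the induced association, which is the question the paper addresses.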