-
Risk Set Matched Difference-in-Differences for the Analysis of Effect Modification in an Observational Study on the Impact of Gun Violence on Health Outcomes
Authors:
Eric R. Cohn,
Zirui Song,
Jose R. Zubizarreta
Abstract:
Gun violence is a major source of injury and death in the United States. However, relatively little is known about the effects of firearm injuries on survivors and their family members and how these effects vary across subpopulations. To study these questions and, more generally, to address a gap in the causal inference literature, we present a framework for the study of effect modification or heterogeneous treatment effects in difference-in-differences designs. We implement a new matching technique, which combines profile matching and risk set matching, to (i) preserve the time alignment of covariates, exposure, and outcomes, avoiding pitfalls of other common approaches for difference-in-differences, and (ii) explicitly control biases due to imbalances in observed covariates in subgroups discovered from the data. Our case study shows significant and persistent effects of nonfatal firearm injuries on several health outcomes for those injured and on the mental health of their family members. Sensitivity analyses reveal that these results are moderately robust to unmeasured confounding bias. Finally, while the effects for those injured vary largely by the severity of the injury and its documented intent, for families, effects are strongest for those whose relative's injury is documented as resulting from an assault, self-harm, or law enforcement intervention.
Submitted 31 May, 2024; v1 submitted 6 May, 2023;
originally announced May 2023.
-
One-Step weighting to generalize and transport treatment effect estimates to a target population
Authors:
Ambarish Chattopadhyay,
Eric R. Cohn,
Jose R. Zubizarreta
Abstract:
The problem of generalization and transportation of treatment effect estimates from a study sample to a target population is central to empirical research and statistical methodology. In both randomized experiments and observational studies, weighting methods are often used with this objective. Traditional methods construct the weights by separately modeling the treatment assignment and study selection probabilities and then multiplying functions (e.g., inverses) of their estimates. In this work, we provide a justification and an implementation for weighting in a single step. We show a formal connection between this one-step method and inverse probability and inverse odds weighting. We demonstrate that the resulting estimator for the target average treatment effect is consistent, asymptotically Normal, multiply robust, and semiparametrically efficient. We evaluate the performance of the one-step estimator in a simulation study. We illustrate its use in a case study on the effects of physician racial diversity on preventive healthcare utilization among Black men in California. We provide R code implementing the methodology.
Submitted 15 June, 2023; v1 submitted 16 March, 2022;
originally announced March 2022.
-
Continuum covariance propagation for understanding variance loss in advective systems
Authors:
Shay Gilpin,
Tomoko Matsuo,
Stephen E. Cohn
Abstract:
Motivated by the spurious variance loss encountered during covariance propagation in atmospheric and other large-scale data assimilation systems, we consider the problem for state dynamics governed by the continuity and related hyperbolic partial differential equations. This loss of variance is often attributed to reduced-rank representations of the covariance matrix, as in ensemble methods for example, or else to the use of dissipative numerical methods. Through a combination of analytical work and numerical experiments, we demonstrate that significant variance loss, as well as gain, typically occurs during covariance propagation, even at full rank. The cause of this unusual behavior is a discontinuous change in the continuum covariance dynamics as correlation lengths become small, for instance in the vicinity of sharp gradients in the velocity field. This discontinuity in the covariance dynamics arises from hyperbolicity: the diagonal of the kernel of the covariance operator is a characteristic surface for advective dynamics. Our numerical experiments demonstrate that standard numerical methods for evolving the state are not adequate for propagating the covariance, because they do not capture the discontinuity in the continuum covariance dynamics as correlation lengths tend to zero. Our analytical and numerical results demonstrate in the context of mass conservation that this leads to significant, spurious variance loss in regions of mass convergence and gain in regions of mass divergence. The results suggest that developing local covariance propagation methods designed specifically to capture covariance evolution near the diagonal may prove a useful alternative to current methods of covariance propagation.
Submitted 6 September, 2021;
originally announced September 2021.
-
Profile Matching for the Generalization and Personalization of Causal Inferences
Authors:
Eric R. Cohn,
Jose R. Zubizarreta
Abstract:
We introduce profile matching, a multivariate matching method for randomized experiments and observational studies that finds the largest possible unweighted samples across multiple treatment groups that are balanced relative to a covariate profile. This covariate profile can represent a specific population or a target individual, facilitating the generalization and personalization of causal inferences. For generalization, because the profile often amounts to summary statistics for a target population, profile matching does not always require accessing individual-level data, which may be unavailable for confidentiality reasons. For personalization, the profile comprises the characteristics of a single individual. Profile matching achieves covariate balance by construction, but unlike existing approaches to matching, it does not require specifying a matching ratio, as this is implicitly optimized for the data. The method can also be used for the selection of units for study follow-up, and it readily applies to multi-valued treatments with many treatment categories. We evaluate the performance of profile matching in a simulation study of the generalization of a randomized trial to a target population. We further illustrate this method in an exploratory observational study of the relationship between opioid use and mental health outcomes. We analyze these relationships for three covariate profiles representing: (i) sexual minorities, (ii) the Appalachian United States, and (iii) the characteristics of a hypothetical vulnerable patient. The method can be implemented via the new function profmatch in the designmatch package for R, for which we provide a step-by-step tutorial.
Submitted 6 July, 2022; v1 submitted 20 May, 2021;
originally announced May 2021.
-
Storage-Based Frequency Shaping Control
Authors:
Yan Jiang,
Eliza Cohn,
Petr Vorobev,
Enrique Mallada
Abstract:
With the decrease in system inertia, frequency security becomes an issue for power systems around the world. Energy storage systems (ESS), due to their excellent ramping capabilities, are considered a natural choice for the improvement of frequency response following major contingencies. In this manuscript, we propose a new strategy for energy storage -- frequency shaping control -- that makes it possible to completely eliminate the frequency Nadir, one of the main issues in frequency security, and at the same time tune the rate of change of frequency (RoCoF) to a desired value. With Nadir eliminated, the frequency security assessment can be performed via simple algebraic calculations, as opposed to dynamic simulations for conventional control strategies. Moreover, our proposed control is also very efficient in terms of the requirements on storage peak power, requiring up to 40% less power than the conventional virtual inertia approach for the same performance.
Submitted 25 May, 2020;
originally announced May 2020.
-
Dynamic Droop Approach for Storage-based Frequency Control
Authors:
Yan Jiang,
Eliza Cohn,
Petr Vorobev,
Enrique Mallada
Abstract:
Transient frequency dips that follow sudden power imbalances --frequency Nadir-- represent a big challenge for frequency stability of low-inertia power systems. Since low inertia is identified as one of the causes for deep frequency Nadir, virtual inertia, which is provided by energy storage units, is said to be one of the solutions to the problem. In the present paper, we propose a new method for frequency control with energy storage systems (ESS), called dynamic droop control (iDroop), that can completely eliminate frequency Nadir during transients. Nadir elimination allows us to perform frequency stability assessment without the need for direct numerical simulations of system dynamics. We make a direct comparison of our developed strategy with the usual control approaches --virtual inertia (VI) and droop control (DC)-- and show that iDroop is more effective than both in eliminating the Nadir. More precisely, iDroop achieves the Nadir elimination under significantly lower gains than virtual inertia and requires almost $40\%$ less storage power capacity to implement the control. Moreover, we show that rather unrealistic control gains are required for virtual inertia in order to achieve Nadir elimination.
Submitted 14 October, 2019; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources
Authors:
Saurav Ghosh,
Prithwish Chakraborty,
Bryan L. Lewis,
Maimuna S. Majumder,
Emily Cohn,
John S. Brownstein,
Madhav V. Marathe,
Naren Ramakrishnan
Abstract:
Real-time monitoring and responses to emerging public health threats rely on the availability of timely surveillance data. During the early stages of an epidemic, the ready availability of line lists with detailed tabular information about laboratory-confirmed cases can assist epidemiologists in making reliable inferences and forecasts. Such inferences are crucial to understand the epidemiology of a specific disease early enough to stop or control the outbreak. However, construction of such line lists requires considerable human supervision and is therefore difficult to carry out in real time. In this paper, we motivate Guided Deep List, the first tool for building automated line lists (in near real-time) from open source reports of emerging disease outbreaks. Specifically, we focus on deriving epidemiological characteristics of an emerging disease and the affected population from reports of illness. Guided Deep List uses distributed vector representations (à la word2vec) to discover a set of indicators for each line list feature. This discovery of indicators is followed by the use of dependency-parsing-based techniques for final extraction in tabular form. We evaluate the performance of Guided Deep List against a human annotated line list provided by HealthMap corresponding to MERS outbreaks in Saudi Arabia. We demonstrate that Guided Deep List extracts line list features with increased accuracy compared to a baseline method. We further show how these automatically extracted line list features can be used for making epidemiological inferences, such as inferring demographics and symptoms-to-hospitalization period of affected individuals.
Submitted 21 February, 2017;
originally announced February 2017.
-
Temporal Topic Modeling to Assess Associations between News Trends and Infectious Disease Outbreaks
Authors:
Saurav Ghosh,
Prithwish Chakraborty,
Elaine O. Nsoesie,
Emily Cohn,
Sumiko R. Mekaru,
John S. Brownstein,
Naren Ramakrishnan
Abstract:
In retrospective assessments, internet news reports have been shown to capture early reports of unknown infectious disease transmission prior to official laboratory confirmation. In general, media interest and reporting peak and wane during the course of an outbreak. In this study, we quantify the extent to which media interest during infectious disease outbreaks is indicative of trends of reported incidence. We introduce an approach that uses supervised temporal topic models to transform large corpora of news articles into temporal topic trends. The key advantages of this approach include applicability to a wide range of diseases and the ability to capture disease dynamics, including seasonality and abrupt peaks and troughs. We evaluated the method using data from multiple infectious disease outbreaks reported in the United States of America (U.S.), China and India. We noted that temporal topic trends extracted from disease-related news reports successfully captured the dynamics of multiple outbreaks such as whooping cough in the U.S. (2012), dengue outbreaks in India (2013) and China (2014). Our observations also suggest that efficient modeling of temporal topic trends using time-series regression techniques can estimate disease case counts with increased precision before official reports by health organizations.
Submitted 1 June, 2016;
originally announced June 2016.
-
Characterizing Diseases from Unstructured Text: A Vocabulary Driven Word2vec Approach
Authors:
Saurav Ghosh,
Prithwish Chakraborty,
Emily Cohn,
John S. Brownstein,
Naren Ramakrishnan
Abstract:
Traditional disease surveillance can be augmented with a wide variety of real-time sources such as news and social media. However, these sources are in general unstructured, and construction of surveillance tools such as taxonomical correlations and trace mapping involves considerable human supervision. In this paper, we motivate a disease vocabulary driven word2vec model (Dis2Vec) to model diseases and constituent attributes as word embeddings from the HealthMap news corpus. We use these word embeddings to automatically create disease taxonomies and evaluate our model against corresponding human annotated taxonomies. We compare our model accuracies against several state-of-the-art word2vec methods. Our results demonstrate that Dis2Vec outperforms traditional distributed vector representations in its ability to faithfully capture taxonomical attributes across different classes of diseases such as endemic, emerging and rare.
Submitted 3 June, 2016; v1 submitted 29 February, 2016;
originally announced March 2016.