-
Impact of COVID-19 Policies and Misinformation on Social Unrest
Authors:
Martha Barnard,
Radhika Iyer,
Sara Y. Del Valle,
Ashlynn R. Daughton
Abstract:
The novel coronavirus disease (COVID-19) pandemic has impacted every corner of the earth, disrupting governments and leading to socioeconomic instability. This crisis has prompted questions surrounding how different sectors of society interact and influence each other during times of change and stress. Given the unprecedented economic and societal impacts of this pandemic, many new data sources have become available, allowing us to quantitatively explore these associations. Understanding these relationships can help us better prepare for future disasters and mitigate the impacts. Here, we focus on the interplay between social unrest (protests), health outcomes, public health orders, and misinformation in eight countries of Western Europe and four regions of the United States. We created 1-3 week forecasts of both a binary protest metric for identifying times of high protest activity and the overall protest counts over time. We found that for all regions, except Belgium, at least one feature from our various data streams was predictive of protests. However, the accuracy of the protest forecasts varied by country: for roughly half of the countries analyzed, our forecasts outperformed a naïve model. These mixed results demonstrate the potential of diverse data streams to predict a topic as volatile as protests as well as the difficulties of predicting a situation that is as rapidly evolving as a pandemic.
Submitted 7 October, 2021;
originally announced October 2021.
-
"Thought I'd Share First" and Other Conspiracy Theory Tweets from the COVID-19 Infodemic: Exploratory Study
Authors:
Dax Gerts,
Courtney D. Shelley,
Nidhi Parikh,
Travis Pitts,
Chrysm Watson Ross,
Geoffrey Fairchild,
Nidia Yadria Vaquera Chavez,
Ashlynn R. Daughton
Abstract:
Background: The COVID-19 outbreak has left many people isolated within their homes; these people are turning to social media for news and social connection, which leaves them vulnerable to believing and sharing misinformation. Health-related misinformation threatens adherence to public health messaging, and monitoring its spread on social media is critical to understanding the evolution of ideas that have potentially negative public health impacts. Results: Analysis using model-labeled data was beneficial for increasing the proportion of data matching misinformation indicators. Random forest classifier metrics varied across the four conspiracy theories considered (F1 scores between 0.347 and 0.857); this performance increased as the given conspiracy theory was more narrowly defined. We showed that misinformation tweets demonstrate more negative sentiment when compared to nonmisinformation tweets and that theories evolve over time, incorporating details from unrelated conspiracy theories as well as real-world events. Conclusions: Although we focus here on health-related misinformation, this combination of approaches is not specific to public health and is valuable for characterizing misinformation in general, which is an important first step in creating targeted messaging to counteract its spread. Initial messaging should aim to preempt generalized misinformation before it becomes widespread, while later messaging will need to target evolving conspiracy theories and the new facets of each as they become incorporated.
Submitted 15 April, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
Estimating influenza incidence using search query deceptiveness and generalized ridge regression
Authors:
Reid Priedhorsky,
Ashlynn R. Daughton,
Martha Barnard,
Fiona O'Connell,
Dave Osthus
Abstract:
Seasonal influenza is a sometimes surprisingly impactful disease, causing thousands of deaths per year along with much additional morbidity. Timely knowledge of the outbreak state is valuable for managing an effective response. The current state of the art is to gather this knowledge using in-person patient contact. While accurate, this is time-consuming and expensive. This has motivated inquiry into new approaches using internet activity traces, based on the theory that lay observations of health status lead to informative features in internet data.
These approaches risk being deceived by activity traces having a coincidental, rather than informative, relationship to disease incidence; to our knowledge, this risk has not yet been quantitatively explored. We evaluated both simulated and real activity traces of varying deceptiveness for influenza incidence estimation using linear regression.
We found that deceptiveness knowledge does reduce error in such estimates, that it may help automatically-selected features perform as well or better than features that require human curation, and that a semantic distance measure derived from the Wikipedia article category tree serves as a useful proxy for deceptiveness. This suggests that disease incidence estimation models should incorporate not only data about how internet features map to incidence but also additional data to estimate feature deceptiveness. By doing so, we may gain one more step along the path to accurate, reliable disease incidence estimation using internet data. This capability would improve public health by decreasing the cost and increasing the timeliness of such estimates.
Submitted 11 January, 2019;
originally announced January 2019.
-
Epidemiological data challenges: planning for a more robust future through data standards
Authors:
Geoffrey Fairchild,
Byron Tasseff,
Hari Khalsa,
Nicholas Generous,
Ashlynn R. Daughton,
Nileena Velappan,
Reid Priedhorsky,
Alina Deshpande
Abstract:
Accessible epidemiological data are of great value for emergency preparedness and response, understanding disease progression through a population, and building statistical and mechanistic disease models that enable forecasting. The status quo, however, renders acquiring and using such data difficult in practice. In many cases, a primary way of obtaining epidemiological data is through the internet, but the methods by which the data are presented to the public often differ drastically among institutions. As a result, there is a strong need for better data sharing practices. This paper identifies, in detail and with examples, the three key challenges one encounters when attempting to acquire and use epidemiological data: 1) interfaces, 2) data formatting, and 3) reporting. These challenges are used to provide suggestions and guidance for improvement as these systems evolve in the future. If these suggested data and interface recommendations were adhered to, epidemiological and public health analysis, modeling, and informatics work would be significantly streamlined, which can in turn yield better public health decision-making capabilities.
Submitted 24 November, 2018; v1 submitted 20 April, 2018;
originally announced May 2018.
-
Deceptiveness of internet data for disease surveillance
Authors:
Reid Priedhorsky,
Dave Osthus,
Ashlynn R. Daughton,
Kelly R. Moran,
Aron Culotta
Abstract:
Quantifying how many people are or will be sick, and where, is a critical ingredient in reducing the burden of disease because it helps the public health system plan and implement effective outbreak response. This process of disease surveillance is currently based on data gathering using clinical and laboratory methods; this distributed human contact and resulting bureaucratic data aggregation yield expensive procedures that lag real time by weeks or months. The promise of new surveillance approaches using internet data, such as web event logs or social media messages, is to achieve the same goal but faster and cheaper. However, prior work in this area lacks a rigorous model of information flow, making it difficult to assess the reliability of both specific approaches and the body of work as a whole.
We model disease surveillance as a Shannon communication. This new framework lets any two disease surveillance approaches be compared using a unified vocabulary and conceptual model. Using it, we describe and compare the deficiencies suffered by traditional and internet-based surveillance, introduce a new risk metric called deceptiveness, and offer mitigations for some of these deficiencies. This framework also makes the rich tools of information theory applicable to disease surveillance. This better understanding will improve the decision-making of public health practitioners by helping to leverage internet-based surveillance in a way complementary to the strengths of traditional surveillance.
Submitted 31 July, 2018; v1 submitted 16 November, 2017;
originally announced November 2017.
-
A globally-applicable disease ontology for biosurveillance; Anthology of Biosurveillance Diseases (ABD)
Authors:
A. R. Daughton,
R. Priedhorsky,
G. Fairchild,
N. Generous,
A. Hengartner,
E. Abeyta,
N. Velappan,
A. Lillo,
K. Stark,
A. Deshpande
Abstract:
Biosurveillance, a relatively young field, has recently increased in importance because of its relevance to national security and global health. Databases and tools describing particular subsets of disease are becoming increasingly common in the field. However, a common method to describe those diseases is lacking. Here, we present the Anthology of Biosurveillance Diseases (ABD), an ontology of infectious diseases of biosurveillance relevance.
Submitted 25 August, 2016;
originally announced September 2016.