Search | arXiv e-print repository

doi 10.3847/1538-4365/ac0ceb

Fermi Large Area Telescope Performance After 10 Years Of Operation

Authors: The Fermi LAT Collaboration, M. Ajello, W. B. Atwood, M. Axelsson, R. Bagagli, M. Bagni, L. Baldini, D. Bastieri, F. Bellardi, R. Bellazzini, E. Bissaldi, E. D. Bloom, R. Bonino, J. Bregeon, A. Brez, P. Bruel, R. Buehler, S. Buson, R. A. Cameron, P. A. Caraveo, E. Cavazzuti, M. Ceccanti, S. Chen, C. C. Cheung, S. Ciprini, et al. (104 additional authors not shown)

Abstract: The Large Area Telescope (LAT), the primary instrument for the Fermi Gamma-ray Space Telescope (Fermi) mission, is an imaging, wide field-of-view, high-energy gamma-ray telescope, covering the energy range from 30 MeV to more than 300 GeV. We describe the performance of the instrument at the 10-year milestone. LAT performance remains well within the specifications defined during the planning phase, validating the design choices and supporting the compelling case to extend the duration of the Fermi mission. The details provided here will be useful when designing the next generation of high-energy gamma-ray observatories.

Submitted 6 September, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

Comments: 60 pages, 28 figures. Published in ApJS

Journal ref: ApJS 256 (2021) 12

arXiv:2105.05134 [pdf, other]

COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter Dataset of Anti-vaccine Content, Vaccine Misinformation and Conspiracies

Authors: Goran Muric, Yusong Wu, Emilio Ferrara

Abstract: False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns, thus posing a threat to global public health. Misinformation originating from various sources has been spreading online since the beginning of the COVID-19 pandemic. In this paper, we present a dataset of Twitter posts that exhibit a strong anti-vaccine stance. The dataset consists of two parts: a) a streaming keyword-centered data collection with more than 1.8 million tweets, and b) a historical account-level collection with more than 135 million tweets. The former leverages the Twitter streaming API to follow a set of specific vaccine-related keywords starting from mid-October 2020. The latter consists of all historical tweets of 70K accounts that were engaged in the active spreading of anti-vaccine narratives. We present descriptive analyses showing the volume of activity over time, geographical distributions, topics, news sources, and inferred account political leaning. This dataset can be used in studying anti-vaccine misinformation on social media and enable a better understanding of vaccine hesitancy. In compliance with Twitter's Terms of Service, our anonymized dataset is publicly available at: https://github.com/gmuric/avax-tweets-dataset

Submitted 14 May, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

arXiv:2104.13930 [pdf, other]

doi 10.1103/PhysRevLett.127.251302

Searching For Gravitational Waves From Cosmological Phase Transitions With The NANOGrav 12.5-year dataset

Authors: Zaven Arzoumanian, Paul T. Baker, Harsha Blumer, Bence Bécsy, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Maria Charisi, Shami Chatterjee, Siyuan Chen, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Justin A. Ellis, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Nathan Garver-Daniels, Peter A. Gentile, Deborah C. Good, Jeffrey S. Hazboun, et al. (40 additional authors not shown)

Abstract: We search for a first-order phase transition gravitational wave signal in 45 pulsars from the NANOGrav 12.5-year dataset. We find that the data can be modeled in terms of a strong first-order phase transition taking place at temperatures below the electroweak scale. However, we do not observe any strong preference for a phase-transition interpretation of the signal over the standard astrophysical interpretation in terms of supermassive black hole mergers; but we expect to gain additional discriminating power with future datasets, improving the signal-to-noise ratio and extending the sensitivity window to lower frequencies. An interesting open question is how well gravitational wave observatories could separate such signals.

Submitted 11 January, 2022; v1 submitted 28 April, 2021; originally announced April 2021.

Comments: 13 pages, 4 figures. v2: updated to match published version

Journal ref: Phys.Rev.Lett. 127 (2021) 25, 251302

arXiv:2104.09656 [pdf, other]

"Don't quote me on that": Finding Mixtures of Sources in News Articles

Authors: Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Abstract: Journalists publish statements provided by people, or sources, to contextualize current events, help voters make informed decisions, and hold powerful individuals accountable. In this work, we construct an ontological labeling system for sources based on each source's affiliation and role. We build a probabilistic model to infer these attributes for named sources and to describe news articles as mixtures of these sources. Our model outperforms existing mixture modeling and co-clustering approaches and correctly infers source-type in 80% of expert-evaluated trials. Such work can facilitate research in downstream tasks like opinion and argumentation mining, representing a first step towards machine-in-the-loop computational journalism systems.

Submitted 19 April, 2021; originally announced April 2021.

arXiv:2104.09653 [pdf, other]

Modeling "Newsworthiness" for Lead-Generation Across Corpora

Authors: Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Abstract: Journalists obtain "leads", or story ideas, by reading large corpora of government records: court cases, proposed bills, etc. However, only a small percentage of such records are interesting documents. We propose a model of "newsworthiness" aimed at surfacing interesting documents. We train models on automatically labeled corpora -- published newspaper articles -- to predict whether each article was a front-page article (i.e., newsworthy) or not (i.e., less newsworthy). We transfer these models to unlabeled corpora -- court cases, bills, city-council meeting minutes -- to rank documents in these corpora on "newsworthiness". A fine-tuned RoBERTa model achieves 0.93 AUC on held-out labeled documents and 0.88 AUC on expert-validated unlabeled corpora. We provide interpretation and visualization for our models.

Submitted 19 April, 2021; originally announced April 2021.

arXiv:2104.05723 [pdf, ps, other]

doi 10.3847/1538-4357/ac4045

The NANOGrav 12.5-Year Data Set: Polarimetry and Faraday Rotation Measures from Observations of Millisecond Pulsars with the Green Bank Telescope

Authors: Haley M. Wahl, Maura McLaughlin, Peter A. Gentile, Megan L. Jones, Renée Spiewak, Zaven Arzoumanian, Kathryn Crowter, Paul Demorest, Megan E. DeCesar, Timothy Dolch, Justin A. Ellis, Robert D. Ferdman, Elizabeth C. Ferrara, Emmanuel Fonseca, Nate Garver-Daniels, Glenn Jones, Michael T. Lam, Lina Levin, Natalia Lewandowska, Duncan Lorimer, Ryan S. Lynch, Dustin R. Madison, Cherry Ng, David J. Nice, Timothy T. Pennucci, et al. (6 additional authors not shown)

Abstract: In this work, we present polarization profiles for 23 millisecond pulsars observed at 820 MHz and 1500 MHz with the Green Bank Telescope as part of the NANOGrav pulsar timing array. We calibrate the data using Mueller matrix solutions calculated from observations of PSRs B1929+10 and J1022+1001. We discuss the polarization profiles, which can be used to constrain pulsar emission geometry, and present both the first published radio polarization profiles for nine pulsars and the discovery of very low intensity average profile components ("microcomponents") in four pulsars. We measure the Faraday rotation measure for each pulsar and use it to calculate the Galactic magnetic field parallel to the line of sight for different lines of sight through the interstellar medium. We fit for linear and sinusoidal trends in time in the dispersion measure and Galactic magnetic field and detect magnetic field variations with a period of one year in some pulsars, but overall find that the variations in these parameters are more consistent with a stochastic origin.

Submitted 6 December, 2022; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: 35 pages, 21 figures. Accepted to ApJ

Journal ref: ApJ 926 168 (2022)

arXiv:2104.00880 [pdf, other]

doi 10.3847/2041-8213/ac03b8

Refined Mass and Geometric Measurements of the High-Mass PSR J0740+6620

Authors: Emmanuel Fonseca, H. Thankful Cromartie, Timothy T. Pennucci, Paul S. Ray, Aida Yu. Kirichenko, Scott M. Ransom, Paul B. Demorest, Ingrid H. Stairs, Zaven Arzoumanian, Lucas Guillemot, Aditya Parthasarathy, Matthew Kerr, Ismael Cognard, Paul T. Baker, Harsha Blumer, Paul R. Brook, Megan DeCesar, Timothy Dolch, F. Adam Dong, Elizabeth C. Ferrara, William Fiore, Nathaniel Garver-Daniels, Deborah C. Good, Ross Jennings, Megan L. Jones, et al. (20 additional authors not shown)

Abstract: We report results from continued timing observations of PSR J0740+6620, a high-mass, 2.8-ms radio pulsar in orbit with a likely ultra-cool white dwarf companion. Our data set consists of combined pulse arrival-time measurements made with the 100-m Green Bank Telescope and the Canadian Hydrogen Intensity Mapping Experiment telescope. We explore the significance of timing-based phenomena arising from general-relativistic dynamics and variations in pulse dispersion. When using various statistical methods, we find that combining $\sim 1.5$ years of additional, high-cadence timing data with previous measurements confirms and improves upon previous estimates of relativistic effects within the PSR J0740+6620 system, with the pulsar mass $m_{\rm p} = 2.08^{+0.07}_{-0.07}$ M$_\odot$ (68.3% credibility) determined by the relativistic Shapiro time delay. For the first time, we measure secular variation in the orbital period and argue that this effect arises from apparent acceleration due to significant transverse motion. After incorporating contributions from Galactic differential rotation and off-plane acceleration in the Galactic potential, we obtain a model-dependent distance of $d = 1.14^{+0.17}_{-0.15}$ kpc (68.3% credibility). This improved distance confirms the ultra-cool nature of the white dwarf companion determined from recent optical observations. We discuss the prospects for future observations with next-generation facilities, which will likely improve the precision on $m_{\rm p}$ for J0740+6620 by an order of magnitude within the next few years.

Submitted 6 July, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

Comments: Final version after minor corrections during referee process. Published in the Astrophysical Journal Letters on 1 July 2021

arXiv:2103.10979 [pdf]

doi 10.2196/29570

Social Media Polarization and Echo Chambers in the Context of COVID-19: Case Study

Authors: Julie Jiang, Xiang Ren, Emilio Ferrara

Abstract: Background: Social media chatter in 2020 has been largely dominated by the COVID-19 pandemic. Existing research shows that COVID-19 discourse is highly politicized, with political preferences linked to beliefs and disbeliefs about the virus. As it happens with topics that become politicized, people may fall into echo chambers, which is the idea that one is only presented with information they already agree with, thereby reinforcing one's confirmation bias. Understanding the relationship between information dissemination and political preference is crucial for effective public health communication. Objective: We aimed to study the extent of polarization and examine the structure of echo chambers related to COVID-19 discourse on Twitter in the United States. Methods: First, we presented Retweet-BERT, a scalable and highly accurate model for estimating user polarity by leveraging language features and network structures. Then, by analyzing the user polarity predicted by Retweet-BERT, we provided new insights into the characterization of partisan users. Results: We observed that right-leaning users were noticeably more vocal and active in the production and consumption of COVID-19 information. We also found that most of the highly influential users were partisan, which may contribute to further polarization. Importantly, while echo chambers exist in both the right- and left-leaning communities, the right-leaning community was by far more densely connected within their echo chamber and isolated from the rest. Conclusions: We provided empirical evidence that political echo chambers are prevalent, especially in the right-leaning community, which can exacerbate the exposure to information in line with pre-existing users' views. Our findings have broader implications in developing effective public health campaigns and promoting the circulation of factual information online.

Submitted 10 August, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

Comments: Published in JMIRx Med 2 (3), e29570, 2021

Journal ref: JMIRx Med 2021;2(3):e29570

arXiv:2102.11352 [pdf, other]

doi 10.1109/ICDMW51313.2020.00048

Individualized Context-Aware Tensor Factorization for Online Games Predictions

Authors: Julie Jiang, Kristina Lerman, Emilio Ferrara

Abstract: Individual behavior and decisions are substantially influenced by their contexts, such as location, environment, and time. Changes along these dimensions can be readily observed in Multiplayer Online Battle Arena games (MOBA), where players face different in-game settings for each match and are subject to frequent game patches. Existing methods utilizing contextual information generalize the effect of a context over the entire population, but contextual information tailored to each individual can be more effective. To achieve this, we present the Neural Individualized Context-aware Embeddings (NICE) model for predicting user performance and game outcomes. Our proposed method identifies individual behavioral differences in different contexts by learning latent representations of users and contexts through non-negative tensor factorization. Using a dataset from the MOBA game League of Legends, we demonstrate that our model substantially improves the prediction of winning outcome, individual user performance, and user engagement.

Submitted 22 February, 2021; originally announced February 2021.

Journal ref: 2020 International Conference on Data Mining Workshops (ICDMW)

arXiv:2102.08436 [pdf, other]

Social Bots and Social Media Manipulation in 2020: The Year in Review

Authors: Ho-Chun Herbert Chang, Emily Chen, Meiqing Zhang, Goran Muric, Emilio Ferrara

Abstract: The year 2020 will be remembered for two events of global significance: the COVID-19 pandemic and the 2020 U.S. Presidential Election. In this chapter, we summarize recent studies using large public Twitter data sets on these issues. We have three primary objectives. First, we delineate epistemological and practical considerations when combining the traditions of computational research and social science research. A sensible balance should be struck when the stakes are high between advancing social theory and concrete, timely reporting of ongoing events. We additionally comment on the computational challenges of gleaning insight from large amounts of social media data. Second, we characterize the role of social bots in social media manipulation around the discourse on the COVID-19 pandemic and the 2020 U.S. Presidential Election. Third, we compare results from 2020 to prior years to note that, although bot accounts still contribute to the emergence of echo-chambers, there is a transition from state-sponsored campaigns to domestically emergent sources of distortion. Furthermore, issues of public health can be confounded by political orientation, especially from localized communities of actors who spread misinformation. We conclude that automation and social media manipulation pose issues to a healthy and democratic discourse, precisely because they distort representation of pluralism within the public sphere.

Submitted 16 February, 2021; originally announced February 2021.

Comments: Book Chapter submitted for the Handbook of Computational Social Science

arXiv:2102.04568 [pdf, other]

Tracking e-cigarette warning label compliance on Instagram with deep learning

Authors: Chris J. Kennedy, Julia Vassey, Ho-Chun Herbert Chang, Jennifer B. Unger, Emilio Ferrara

Abstract: The U.S. Food & Drug Administration (FDA) requires that e-cigarette advertisements include a prominent warning label that reminds consumers that nicotine is addictive. However, the high volume of vaping-related posts on social media makes compliance auditing expensive and time-consuming, suggesting that an automated, scalable method is needed. We sought to develop and evaluate a deep learning system designed to automatically determine if an Instagram post promotes vaping, and if so, if an FDA-compliant warning label was included or if a non-compliant warning label was visible in the image. We compiled and labeled a dataset of 4,363 Instagram images, of which 44% were vaping-related, 3% contained FDA-compliant warning labels, and 4% contained non-compliant labels. Using a 20% test set for evaluation, we tested multiple neural network variations: image processing backbone model (Inceptionv3, ResNet50, EfficientNet), data augmentation, progressive layer unfreezing, output bias initialization designed for class imbalance, and multitask learning. Our final model achieved an area under the curve (AUC) and [accuracy] of 0.97 [92%] on vaping classification, 0.99 [99%] on FDA-compliant warning labels, and 0.94 [97%] on non-compliant warning labels. We conclude that deep learning models can effectively identify vaping posts on Instagram and track compliance with FDA warning label requirements.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 9 pages, 3 figures

arXiv:2102.04026 [pdf, other]

doi 10.3847/1538-4357/abe4d5

Discovery and timing of three millisecond pulsars in radio and gamma-rays with the GMRT and Fermi-LAT

Authors: B. Bhattacharyya, J. Roy, T. J. Johnson, P. S. Ray, P. C. C. Freire, Y. Gupta, D. Bhattacharya, A. Kaninghat, B. W. Stappers, E. C. Ferrara, S. Sengupta, R. S. Rathour, M. Kerr, D. A. Smith, P. M. Saz Parkinson, S. M. Ransom, P. F. Michelson

Abstract: We performed deep observations to search for radio pulsations in the directions of 375 unassociated Fermi Large Area Telescope (LAT) gamma-ray sources using the Giant Metrewave Radio Telescope (GMRT) at 322 and 607 MHz. In this paper we report the discovery of three millisecond pulsars (MSPs), PSR J0248+4230, PSR J1207$-$5050 and PSR J1536$-$4948. We conducted follow-up timing observations for around 5 years with the GMRT and derived phase-coherent timing models for these MSPs. PSR J0248+4230 and J1207$-$5050 are isolated MSPs having periodicities of 2.60 ms and 4.84 ms. PSR J1536$-$4948 is a 3.07 ms pulsar in a binary system with an orbital period of around 62 days about a companion of minimum mass 0.32 solar mass. We also present multi-frequency pulse profiles of these MSPs from the GMRT observations. PSR J1536$-$4948 is an MSP with an extremely wide pulse profile having multiple components. Using the radio timing ephemeris we subsequently detected gamma-ray pulsations from these three MSPs, confirming them as the sources powering the gamma-ray emission. For PSR J1536$-$4948 we performed combined radio-gamma-ray timing using around 11.6 years of gamma-ray pulse times of arrival (TOAs) along with the radio TOAs. PSR J1536$-$4948 also shows evidence for pulsed gamma-ray emission out to above 25 GeV, confirming earlier associations of this MSP with a >10 GeV point source. The multi-wavelength pulse profiles of all three MSPs offer challenges to models of radio and gamma-ray emission in pulsar magnetospheres.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 35 pages, 8 Figures, 4 Tables, Accepted for publication in the Astrophysical Journal

arXiv:2101.04128 [pdf, other]

doi 10.3847/1538-3881/abda53

X-ray Spectra and Multiwavelength Machine Learning Classification for Likely Counterparts to Fermi 3FGL Unassociated Sources

Authors: Stephen Kerby, Amanpreet Kaur, Abraham D. Falcone, Michael C. Stroh, Elizabeth C. Ferrara, Jamie A. Kennea, Joseph Colosimo

Abstract: We conduct X-ray spectral fits on 184 likely counterparts to Fermi-LAT 3FGL unassociated sources. Characterization and classification of these sources allow for more complete population studies of the high-energy sky. Most of these X-ray spectra are well fit by an absorbed power law model, as expected for a population dominated by blazars and pulsars. A small subset of 7 X-ray sources have spectra unlike the power law expected from a blazar or pulsar and may be linked to coincident stars or background emission. We develop a multiwavelength machine learning classifier to categorize unassociated sources into pulsars and blazars using gamma- and X-ray observations. Training a random forest procedure with known pulsars and blazars, we achieve a cross-validated classification accuracy of 98.6%. Applying the random forest routine to the unassociated sources returned 126 likely blazar candidates (defined as $ P_{bzr} > 90 \% $) and 5 likely pulsar candidates ($ P_{bzr} < 10 \% $). Our new X-ray spectral analysis does not drastically alter the random forest classifications of these sources compared to previous works, but it builds a more robust classification scheme and highlights the importance of X-ray spectral fitting. Our procedure can be further expanded with UV, visual, or radio spectral parameters or by measuring flux variability.

Submitted 11 January, 2021; originally announced January 2021.

Comments: 11 pages text, 4 figures, 4 tables (2 in-text, 2 in 11-page appendix), accepted for publication in the Astronomical Journal

arXiv:2101.02716 [pdf, other]

doi 10.3847/1538-4357/abfcd3

The NANOGrav 11yr Data Set: Limits on Supermassive Black Hole Binaries in Galaxies within 500Mpc

Authors: Zaven Arzoumanian, Paul T. Baker, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Bence Becsy, Maria Charisi, Shami Chatterjee, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Rodney D. Elliott, Justin A. Ellis, Elizabeth C. Ferrara, Emmanuel Fonseca, Nathan Garver-Daniels, Peter A. Gentile, Deborah C. Good, Jeffrey S. Hazboun, Kristina Islo, Ross J. Jennings, et al. (32 additional authors not shown)

Abstract: Supermassive black hole binaries (SMBHBs) should form frequently in galactic nuclei as a result of galaxy mergers. At sub-parsec separations, binaries become strong sources of low-frequency gravitational waves (GWs), targeted by Pulsar Timing Arrays (PTAs). We used recent upper limits on continuous GWs from the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) 11yr dataset to place constraints on putative SMBHBs in nearby massive galaxies. We compiled a comprehensive catalog of ~44,000 galaxies in the local universe (up to redshift ~0.05) and populated them with hypothetical binaries, assuming that the total mass of the binary is equal to the SMBH mass derived from global scaling relations. Assuming circular equal-mass binaries emitting at NANOGrav's most sensitive frequency of 8nHz, we found that 216 galaxies are within NANOGrav's sensitivity volume. We ranked the potential SMBHBs based on GW detectability by calculating the total signal-to-noise ratio (S/N) such binaries would induce within the NANOGrav array. We placed constraints on the chirp mass and mass ratio of the 216 hypothetical binaries. For 19 galaxies, only very unequal-mass binaries are allowed, with the mass of the secondary less than 10 percent that of the primary, roughly comparable to constraints on a SMBHB in the Milky Way. Additionally, we were able to exclude binaries delivered by major mergers (mass ratio of at least 1/4) for several of these galaxies. We also derived the first limit on the density of binaries delivered by major mergers purely based on GW data.

Submitted 7 January, 2021; originally announced January 2021.

Comments: Submitted to ApJ. Send comments to Maria Charisi ([email protected])

arXiv:2012.15185 [pdf, ps, other]

doi 10.3847/1538-4357/abd7a1

Timing of Eight Binary Millisecond Pulsars Found with Arecibo in Fermi-LAT Unidentified Sources

Authors: J. S. Deneva, P. S. Ray, F. Camilo, P. C. C. Freire, H. T. Cromartie, S. M. Ransom, E. Ferrara, M. Kerr, T. H. Burnett, P. M. Saz Parkinson

Abstract: We present timing solutions for eight binary millisecond pulsars (MSPs) discovered by searching unidentified Fermi-LAT source positions with the 327 MHz receiver of the Arecibo 305-m radio telescope. Five of the pulsars are "spiders" with orbital periods shorter than 8.1 h. Three of these are in "black widow" systems (with degenerate companions of 0.02-0.03 solar masses), one is in a "redback" system (with a non-degenerate companion of $\gtrsim 0.3$ solar masses), and one (J1908+2105) is an apparent middle-ground case between the two observational classes. The remaining three pulsars have white dwarf companions and longer orbital periods. With the initially derived radio timing solutions, we detected gamma-ray pulsations from all MSPs and extended the timing solutions using photons from the full Fermi mission, thus confirming the identification of these MSPs with the Fermi-LAT sources. The radio emission of the redback is eclipsed during 50% of its orbital period, which is typical for this kind of system. Two of the black widows exhibit radio eclipses lasting for 10-20% of the orbit, while J1908+2105 eclipses for 40% of the orbit. We investigate an apparent link between gamma-ray emission and a short orbital period among known binary MSPs in the Galactic disk, and conclude that selection effects cannot be ruled out as the cause. Based on this analysis we outline how the likelihood of new MSP discoveries can be improved in ongoing and future pulsar searches.

Submitted 30 December, 2020; originally announced December 2020.

Comments: 23 pages, 8 figures; accepted for publication in the Astrophysical Journal

arXiv:2012.09884 [pdf, other]

doi 10.3847/1538-4357/abfafe

The NANOGrav 12.5-Year Data Set: Monitoring Interstellar Scattering Delays

Authors: Jacob E. Turner, Maura A. McLaughlin, James M. Cordes, Michael T. Lam, Brent J. Shapiro-Albert, Daniel R. Stinebring, Zaven Arzoumanian, Harsha Blumer, Paul R. Brook, Shami Chatterjee, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Justin A. Ellis, Robert D. Ferdman, Elizabeth C. Ferrara, Emmanuel Fonseca, Nathan Garver-Daniels, Peter A. Gentile, Deborah C. Good, Megan L. Jones, T. Joseph W. Lazio, Duncan R. Lorimer, Jing Luo, et al. (11 additional authors not shown)

Abstract: We extract interstellar scintillation parameters for pulsars observed by the NANOGrav radio pulsar timing program. Dynamic spectra for the observing epochs of each pulsar were used to obtain estimates of scintillation timescales, scintillation bandwidths, and the corresponding scattering delays using a stretching algorithm to account for frequency-dependent scaling. We were able to measure scintillation bandwidths for 28 pulsars at 1500 MHz and 15 pulsars at 820 MHz. We examine scaling behavior for 17 pulsars and find power-law indices ranging from $-0.7$ to $-3.6$, though these may be biased shallow due to insufficient frequency resolution at lower frequencies. We were also able to measure scintillation timescales for six pulsars at 1500 MHz and seven pulsars at 820 MHz. There is fair agreement between our scattering delay measurements and electron-density model predictions for most pulsars. We derive interstellar scattering-based transverse velocities assuming isotropic scattering and a scattering screen halfway between the pulsar and earth. We also estimate the location of the scattering screens assuming proper motion and interstellar scattering-derived transverse velocities are equal. We find no correlations between variations in scattering delay and either variations in dispersion measure or flux density.
For most pulsars for which scattering delays were measurable, we find that time of arrival uncertainties for a given epoch are larger than our scattering delay measurements, indicating that variable scattering delays are currently subdominant in our overall noise budget but are important for achieving precisions of tens of ns or less.

Submitted 30 April, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: Accepted to ApJ

arXiv:2011.10549 [pdf, other]

Graph Signal Recovery Using Restricted Boltzmann Machines

Authors: Ankith Mohan, Aiichiro Nakano, Emilio Ferrara

Abstract: We propose a model-agnostic pipeline to recover graph signals from an expert system by exploiting the content addressable memory property of restricted Boltzmann machine and the representational ability of a neural network. The proposed pipeline requires the deep neural network that is trained on a downward machine learning task with clean data, data which is free from any form of corruption or incompletion. We show that denoising the representations learned by the deep neural networks is usually more effective than denoising the data itself. Although this pipeline can deal with noise in any dataset, it is particularly effective for graph-structured datasets.

Submitted 20 November, 2020; originally announced November 2020.

Comments: Paper: 27 pages, 9 figures. Appendix: 5 pages, 12 figures. Submitted to Expert Systems with Applications

arXiv:2011.08498 [pdf, other]

Political Partisanship and Anti-Science Attitudes in Online Discussions about Covid-19

Authors: Ashwin Rao, Fred Morstatter, Minda Hu, Emily Chen, Keith Burghardt, Emilio Ferrara, Kristina Lerman

Abstract: The novel coronavirus pandemic continues to ravage communities across the US. Opinion surveys identified importance of political ideology in shaping perceptions of the pandemic and compliance with preventive measures. Here, we use social media data to study complexity of polarization. We analyze a large dataset of tweets related to the pandemic collected between January and May of 2020, and develop methods to classify the ideological alignment of users along the moderacy (hardline vs moderate), political (liberal vs conservative) and science (anti-science vs pro-science) dimensions. While polarization along the science and political dimensions are correlated, politically moderate users are more likely to be aligned with the pro-science views, and politically hardline users with anti-science views. Contrary to expectations, we do not find that polarization grows over time; instead, we see increasing activity by moderate pro-science users. We also show that anti-science conservatives tend to tweet from the Southern US, while anti-science moderates from the Western states. Our findings shed light on the multi-dimensional nature of polarization, and the feasibility of tracking polarized opinions about the pandemic across time and space through social media data.

Submitted 17 November, 2020; originally announced November 2020.

Comments: 10 pages, 5 figures

arXiv:2011.05367 [pdf, other]

doi 10.1145/3543873.3587615

Detecting Social Media Manipulation in Low-Resource Languages

Authors: Samar Haider, Luca Luceri, Ashok Deb, Adam Badawy, Nanyun Peng, Emilio Ferrara

Abstract: Social media have been deliberately used for malicious purposes, including political manipulation and disinformation. Most research focuses on high-resource languages. However, malicious actors share content across countries and languages, including low-resource ones. Here, we investigate whether and to what extent malicious actors can be detected in low-resource language settings. We discovered that a high number of accounts posting in Tagalog were suspended as part of Twitter's crackdown on interference operations after the 2016 US Presidential election. By combining text embedding and transfer learning, our framework can detect, with promising accuracy, malicious users posting in Tagalog without any prior knowledge or training on malicious content in that language. We first learn an embedding model for each language, namely a high-resource language (English) and a low-resource one (Tagalog), independently. Then, we learn a mapping between the two latent spaces to transfer the detection model. We demonstrate that the proposed approach significantly outperforms state-of-the-art models, including BERT, and yields marked advantages in settings with very limited training data -- the norm when dealing with detecting malicious activity in online platforms.

Submitted 19 February, 2023; v1 submitted 10 November, 2020; originally announced November 2020.

Journal ref: WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023, April 2023, Pages 1358-1364

arXiv:2010.11950 [pdf, other]

doi 10.3847/2041-8213/abf2c9

Astrophysics Milestones For Pulsar Timing Array Gravitational Wave Detection

Authors: Nihan S. Pol, Stephen R. Taylor, Luke Zoltan Kelley, Sarah J. Vigeland, Joseph Simon, Siyuan Chen, Zaven Arzoumanian, Paul T. Baker, Bence Bécsy, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Shami Chatterjee, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Nathan Garver-Daniels, Deborah C. Good, et al. (27 additional authors not shown)

Abstract: The NANOGrav Collaboration reported strong Bayesian evidence for a common-spectrum stochastic process in its 12.5-yr pulsar timing array dataset, with median characteristic strain amplitude at periods of a year of $A_{\rm yr} = 1.92^{+0.75}_{-0.55} \times 10^{-15}$. However, evidence for the quadrupolar Hellings & Downs interpulsar correlations, which are characteristic of gravitational wave signals, was not yet significant. We emulate and extend the NANOGrav dataset, injecting a wide range of stochastic gravitational wave background (GWB) signals that encompass a variety of amplitudes and spectral shapes, and quantify three key milestones: (I) Given the amplitude measured in the 12.5 yr analysis and assuming this signal is a GWB, we expect to accumulate robust evidence of an interpulsar-correlated GWB signal with 15--17 yrs of data, i.e., an additional 2--5 yrs from the 12.5 yr dataset; (II) At the initial detection, we expect a fractional uncertainty of $40\%$ on the power-law strain spectrum slope, which is sufficient to distinguish a GWB of supermassive black-hole binary origin from some models predicting more exotic origins; (III) Similarly, the measured GWB amplitude will have an uncertainty of $44\%$ upon initial detection, allowing us to arbitrate between some population models of supermassive black-hole binaries. In addition, power-law models are distinguishable from those having low-frequency spectral turnovers once 20 yrs of data are reached.
Even though our study is based on the NANOGrav data, we also derive relations that allow for a generalization to other pulsar-timing array datasets. Most notably, by combining the data of individual arrays into the International Pulsar Timing Array, all of these milestones can be reached significantly earlier.

Submitted 24 March, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

Comments: 15 pages, 7 figures

arXiv:2010.00600 [pdf, ps, other]

doi 10.1007/s42001-021-00117-9

#Election2020: The First Public Twitter Dataset on the 2020 US Presidential Election

Authors: Emily Chen, Ashok Deb, Emilio Ferrara

Abstract: The integrity of democratic political discourse is at the core of guaranteeing free and fair elections. With social media often dictating the tones and trends of politics-related discussion, it is of paramount importance to be able to study online chatter, especially in the run up to important voting events, like in the case of the upcoming November 3, 2020 U.S. Presidential Election. Limited access to social media data is often the first barrier to impede, hinder, or slow down progress, and ultimately our understanding of online political discourse. To mitigate this issue and try to empower the Computational Social Science research community, we decided to publicly release a massive-scale, longitudinal dataset of U.S. politics- and election-related tweets. This multilingual dataset that we have been collecting for over one year encompasses hundreds of millions of tweets and tracks all salient U.S. politics trends, actors, and events between 2019 and 2020. It predates and spans the whole period of Republican and Democratic primaries, with real-time tracking of all presidential contenders on both sides of the aisle. After that, it focuses on presidential and vice-presidential candidates. Our dataset release is curated, documented and will be constantly updated on a weekly basis, until the November 3, 2020 election and beyond.
We hope that the academic community, computational journalists, and research practitioners alike will all take advantage of our dataset to study relevant scientific and social issues, including problems like misinformation, information manipulation, interference, and distortion of online political discourse that have been prevalent in the context of recent election events in the United States and worldwide. Our dataset is available at: https://github.com/echen102/us-pres-elections-2020

Submitted 1 October, 2020; originally announced October 2020.

Comments: Our dataset is available at: https://github.com/echen102/us-pres-elections-2020

arXiv:2009.04496 [pdf, other]

doi 10.3847/2041-8213/abd401

The NANOGrav 12.5-year Data Set: Search For An Isotropic Stochastic Gravitational-Wave Background

Authors: Zaven Arzoumanian, Paul T. Baker, Harsha Blumer, Bence Becsy, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Shami Chatterjee, Siyuan Chen, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Justin A. Ellis, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Nathan Garver-Daniels, Peter A. Gentile, Deborah C. Good, Jeffrey S. Hazboun, A. Miguel Holgado, et al. (36 additional authors not shown)

Abstract: We search for an isotropic stochastic gravitational-wave background (GWB) in the $12.5$-year pulsar timing data set collected by the North American Nanohertz Observatory for Gravitational Waves. Our analysis finds strong evidence of a stochastic process, modeled as a power-law, with common amplitude and spectral slope across pulsars. The Bayesian posterior of the amplitude for an $f^{-2/3}$ power-law spectrum, expressed as the characteristic GW strain, has median $1.92 \times 10^{-15}$ and $5\%$--$95\%$ quantiles of $1.37$--$2.67 \times 10^{-15}$ at a reference frequency of $f_\mathrm{yr} = 1 ~\mathrm{yr}^{-1}$. The Bayes factor in favor of the common-spectrum process versus independent red-noise processes in each pulsar exceeds $10,000$. However, we find no statistically significant evidence that this process has quadrupolar spatial correlations, which we would consider necessary to claim a GWB detection consistent with general relativity. We find that the process has neither monopolar nor dipolar correlations, which may arise from, for example, reference clock or solar system ephemeris systematics, respectively. The amplitude posterior has significant support above previously reported upper limits; we explain this in terms of the Bayesian priors assumed for intrinsic pulsar red noise. We examine potential implications for the supermassive black hole binary population under the hypothesis that the signal is indeed astrophysical in nature.

Submitted 8 January, 2021; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: 25 pages, 14 figures, 5 tables, 3 appendices. Published in The Astrophysical Journal Letters. Please send any comments/questions to Joseph Simon ([email protected]). Jupyter notebook tutorials and some MCMC chain files are available at https://github.com/nanograv/12p5yr_stochastic_analysis

Journal ref: The Astrophysical Journal Letters, Volume 905, Number 2 (2020)

arXiv:2009.01966 [pdf, other]

Leveraging Clickstream Trajectories to Reveal Low-Quality Workers in Crowdsourced Forecasting Platforms

Authors: Akira Matsui, Emilio Ferrara, Fred Morstatter, Andres Abeliuk, Aram Galstyan

Abstract: Crowdwork often entails tackling cognitively-demanding and time-consuming tasks. Crowdsourcing can be used for complex annotation tasks, from medical imaging to geospatial data, and such data powers sensitive applications, such as health diagnostics or autonomous driving. However, the existence and prevalence of underperforming crowdworkers is well-recognized, and can pose a threat to the validity of crowdsourcing. In this study, we propose the use of a computational framework to identify clusters of underperforming workers using clickstream trajectories. We focus on crowdsourced geopolitical forecasting. The framework can reveal different types of underperformers, such as workers with forecasts whose accuracy is far from the consensus of the crowd, those who provide low-quality explanations for their forecasts, and those who simply copy-paste their forecasts from other users. Our study suggests that clickstream clustering and analysis are fundamental tools to diagnose the performance of crowdworkers in platforms leveraging the wisdom of crowds.

Submitted 3 September, 2020; originally announced September 2020.

Comments: 12 pages, 8 figures

arXiv:2009.01513 [pdf, other]

doi 10.3847/2041-8213/abbc02

Discovery of a Gamma-ray Black Widow Pulsar by GPU-accelerated Einstein@Home

Authors: L. Nieder, C. J. Clark, D. Kandel, R. W. Romani, C. G. Bassa, B. Allen, A. Ashok, I. Cognard, H. Fehrmann, P. Freire, R. Karuppusamy, M. Kramer, D. Li, B. Machenschalk, Z. Pan, M. A. Papa, S. M. Ransom, P. S. Ray, J. Roy, P. Wang, J. Wu, C. Aulbert, E. D. Barr, B. Beheshtipour, O. Behnke, et al. (17 additional authors not shown)

Abstract: We report the discovery of 1.97 ms period gamma-ray pulsations from the 75 minute orbital-period binary pulsar now named PSR J1653-0158. The associated Fermi Large Area Telescope gamma-ray source 4FGL J1653.6-0158 has long been expected to harbor a binary millisecond pulsar. Despite the pulsar-like gamma-ray spectrum and candidate optical/X-ray associations -- whose periodic brightness modulations suggested an orbit -- no radio pulsations had been found in many searches. The pulsar was discovered by directly searching the gamma-ray data using the GPU-accelerated Einstein@Home distributed volunteer computing system. The multi-dimensional parameter space was bounded by positional and orbital constraints obtained from the optical counterpart. More sensitive analyses of archival and new radio data using knowledge of the pulsar timing solution yield very stringent upper limits on radio emission. Any radio emission is thus either exceptionally weak, or eclipsed for a large fraction of the time. The pulsar has one of the three lowest inferred surface magnetic-field strengths of any known pulsar with $B_{\rm surf} \approx 4 \times 10^{7}\,$G. The resulting mass function, combined with models of the companion star's optical light curve and spectra, suggests a pulsar mass $\gtrsim 2\,M_{\odot}$. The companion is light-weight with mass $\sim 0.01\,M_{\odot}$, and the orbital period is the shortest known for any rotation-powered binary pulsar.
This discovery demonstrates the Fermi Large Area Telescope's potential to discover extreme pulsars that would otherwise remain undetected.

Submitted 22 October, 2020; v1 submitted 3 September, 2020; originally announced September 2020.

Comments: 12 pages, 3 figures, published in ApJL

arXiv:2008.11308 [pdf, other]

Identifying Coordinated Accounts on Social Media through Hidden Influence and Group Behaviours

Authors: Karishma Sharma, Yizhou Zhang, Emilio Ferrara, Yan Liu

Abstract: Disinformation campaigns on social media, involving coordinated activities from malicious accounts towards manipulating public opinion, have become increasingly prevalent. Existing approaches to detect coordinated accounts either make very strict assumptions about coordinated behaviours, or require part of the malicious accounts in the coordinated group to be revealed in order to detect the rest. To address these drawbacks, we propose a generative model, AMDN-HAGE (Attentive Mixture Density Network with Hidden Account Group Estimation), which jointly models account activities and hidden group behaviours based on Temporal Point Processes (TPP) and Gaussian Mixture Model (GMM), to capture inherent characteristics of coordination, namely that accounts that coordinate must strongly influence each other's activities and collectively appear anomalous compared to normal accounts. To address the challenges of optimizing the proposed model, we provide a bilevel optimization algorithm with a theoretical guarantee on convergence. We verified the effectiveness of the proposed method and training algorithm on real-world social network data collected from Twitter related to coordinated campaigns from Russia's Internet Research Agency targeting the 2016 U.S. Presidential Elections, and to identify coordinated campaigns related to the COVID-19 pandemic.
Leveraging the learned model, we find that the average influence between coordinated account pairs is the highest. On COVID-19, we found a coordinated group spreading anti-vaccination and anti-mask conspiracies that suggest the pandemic is a hoax and political scam.

Submitted 18 May, 2021; v1 submitted 25 August, 2020; originally announced August 2020.

Comments: KDD'2021 (Accepted)

Journal ref: ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2021

arXiv:2008.05131 [pdf, other]

Learning to Reason in Round-based Games: Multi-task Sequence Generation for Purchasing Decision Making in First-person Shooters

Authors: Yilei Zeng, Deren Lei, Beichen Li, Gangrong Jiang, Emilio Ferrara, Michael Zyda

Abstract: Sequential reasoning is a complex human ability. While extensive previous research has focused on gaming AI in a single continuous game, round-based decision making extending to a sequence of games remains less explored. Counter-Strike: Global Offensive (CS:GO), as a round-based game with abundant expert demonstrations, provides an excellent environment for multi-player round-based sequential reasoning. In this work, we propose a Sequence Reasoner with Round Attribute Encoder and Multi-Task Decoder to interpret the strategies behind the round-based purchasing decisions. We adopt few-shot learning to sample multiple rounds in a match, and a modified version of the model-agnostic meta-learning algorithm Reptile for the meta-learning loop. We formulate each round as a multi-task sequence generation problem. Our state representations combine action encoder, team encoder, player features, round attribute encoder, and economy encoders to help our agent learn to reason under this specific multi-player round-based scenario. A complete ablation study and comparison with the greedy approach certify the effectiveness of our model. Our research will open doors for interpretable AI for understanding episodic and long-term purchasing strategies beyond the gaming community.

Submitted 12 August, 2020; originally announced August 2020.

Comments: 16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-20)

arXiv:2008.01723 [pdf, other]

doi 10.1007/978-3-030-80387-2_25

Having a Bad Day? Detecting the Impact of Atypical Life Events Using Wearable Sensors

Authors: Keith Burghardt, Nazgol Tavabi, Emilio Ferrara, Shrikanth Narayanan, Kristina Lerman

Abstract: Life events can dramatically affect our psychological state and work performance. Stress, for example, has been linked to professional dissatisfaction, increased anxiety, and workplace burnout. We explore the impact of positive and negative life events on a number of psychological constructs through a multi-month longitudinal study of hospital and aerospace workers. Through causal inference, we demonstrate that positive life events increase positive affect, while negative events increase stress, anxiety and negative affect. While most events have a transient effect on psychological states, major negative events, like illness or attending a funeral, can reduce positive affect for multiple days. Next, we assess whether these events can be detected through wearable sensors, which can cheaply and unobtrusively monitor health-related factors. We show that these sensors paired with embedding-based learning models can be used "in the wild" to capture atypical life events in hundreds of workers across both datasets. Overall our results suggest that automated interventions based on physiological sensing may be feasible to help workers regulate the negative effects of life events.

Submitted 4 August, 2020; originally announced August 2020.

Comments: 10 pages, 4 figures, and 3 tables

arXiv:2006.06142 [pdf]

doi 10.2196/25379

Gender disparity in the authorship of biomedical research publications during the COVID-19 pandemic

Authors: Goran Muric, Kristina Lerman, Emilio Ferrara

Abstract: Preliminary evidence suggests that women, including female researchers, are disproportionately affected by the COVID-19 pandemic in terms of unequal distribution of childcare, elderly care and other kinds of domestic and emotional labor. Sudden lockdowns and abrupt shifts in daily routines have disproportionate consequences on their productivity, which is reflected by a sudden drop in research output in biomedical research, consequently affecting the number of female authors of scientific publications. We investigate the proportion of male and female researchers who published scientific papers during the COVID-19 pandemic, using bibliometric data from biomedical preprint servers and selected Springer-Nature journals. Our findings document a decrease in the number of publications by female authors in the biomedical field during the global pandemic. This effect is particularly pronounced for papers related to COVID-19, indicating that women are producing fewer publications related to COVID-19 research. This sudden increase in the gender gap is persistent across the ten countries with the highest number of researchers. These results should be used to inform the scientific community of the worrying trend in COVID-19 research and the disproportionate effect that the pandemic has on female academics.

Submitted 24 March, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

arXiv:2006.05557 [pdf, other]

doi 10.1145/3340531.3412880

ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research

Authors: Xinyi Zhou, Apurva Mulay, Emilio Ferrara, Reza Zafarani

Abstract: First identified in Wuhan, China, in December 2019, the outbreak of COVID-19 has been declared as a global emergency in January, and a pandemic in March 2020 by the World Health Organization (WHO). Along with this pandemic, we are also experiencing an "infodemic" of information with low credibility such as fake news and conspiracies. In this work, we present ReCOVery, a repository designed and constructed to facilitate research on combating such information regarding COVID-19. We first broadly search and investigate ~2,000 news publishers, from which 60 are identified with extreme [high or low] levels of credibility. By inheriting the credibility of the media on which they were published, a total of 2,029 news articles on coronavirus, published from January to May 2020, are collected in the repository, along with 140,820 tweets that reveal how these news articles have spread on the Twitter social network. The repository provides multimodal information of news articles on coronavirus, including textual, visual, temporal, and network information. The way that news credibility is obtained allows a trade-off between dataset scalability and label accuracy. Extensive experiments are conducted to present data statistics and distributions, as well as to provide baseline performances for predicting news credibility so that future methods can be compared. Our repository is available at http://coronavirus-fakenews.com.

Submitted 17 August, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM '20)

arXiv:2005.07123 [pdf, other]

Multi-Messenger Gravitational Wave Searches with Pulsar Timing Arrays: Application to 3C66B Using the NANOGrav 11-year Data Set

Authors: Zaven Arzoumanian, Paul T. Baker, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Bence Becsy, Maria Charisi, Shami Chatterjee, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Kathryn Crowter, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Rodney D. Elliott, Justin A. Ellis, Robert D. Ferdman, Elizabeth C. Ferrara, Emmanuel Fonseca, Nathan Garver-Daniels, Peter A. Gentile, Deborah C. Good, Jeffrey S. Hazboun , et al. (34 additional authors not shown)

Abstract: When galaxies merge, the supermassive black holes in their centers may form binaries and, during the process of merger, emit low-frequency gravitational radiation. In this paper we consider the galaxy 3C66B, which was used as the target of the first multi-messenger search for gravitational waves. Due to the observed periodicities present in the photometric and astrometric data of the source, it has been theorized to contain a supermassive black hole binary. Its apparent 1.05-year orbital period would place the gravitational wave emission directly in the pulsar timing band. Since the first pulsar timing array study of 3C66B, revised models of the source have been published, and timing array sensitivities and techniques have improved dramatically. With these advances, we further constrain the chirp mass of the potential supermassive black hole binary in 3C66B to less than $(1.65\pm0.02) \times 10^9~{M_\odot}$ using data from the NANOGrav 11-year data set. This upper limit provides a factor of 1.6 improvement over previous limits, and a factor of 4.3 over the first search done. Nevertheless, the most recent orbital model for the source is still consistent with our limit from pulsar timing array data. In addition, we are able to quantify the improvement made by the inclusion of source properties gleaned from electromagnetic data to 'blind' pulsar timing array searches. With these methods, it is apparent that it is not necessary to obtain exact a priori knowledge of the period of a binary to gain meaningful astrophysical inferences.

Submitted 12 August, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

Comments: 14 pages, 6 figures. Accepted by ApJ

arXiv:2005.06495 [pdf, other]

doi 10.3847/1538-4365/abc6a1

The NANOGrav 12.5-year Data Set: Wideband Timing of 47 Millisecond Pulsars

Authors: Md F. Alam, Zaven Arzoumanian, Paul T. Baker, Harsha Blumer, Keith E. Bohler, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Keeisi Caballero, Richard S. Camuccio, Rachel L. Chamberlain, Shami Chatterjee, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Justin A. Ellis, Robert D. Ferdman, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Yhamil Garcia , et al. (45 additional authors not shown)

Abstract: We present a new analysis of the profile data from the 47 millisecond pulsars comprising the 12.5-year data set of the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), which is presented in a parallel paper (Alam et al. 2021a; NG12.5). Our reprocessing is performed using "wideband" timing methods, which use frequency-dependent template profiles, simultaneous time-of-arrival (TOA) and dispersion measure (DM) measurements from broadband observations, and novel analysis techniques. In particular, the wideband DM measurements are used to constrain the DM portion of the timing model. We compare the ensemble timing results to NG12.5 by examining the timing residuals, timing models, and noise model components. There is a remarkable level of agreement across all metrics considered. Our best-timed pulsars produce encouragingly similar results to those from NG12.5. In certain cases, such as high-DM pulsars with profile broadening, or sources that are weak and scintillating, wideband timing techniques prove to be beneficial, leading to more precise timing model parameters by 10-15%. The high-precision, multi-band measurements of several pulsars indicate frequency-dependent DMs. Compared to the narrowband analysis in NG12.5, the TOA volume is reduced by a factor of 33, which may ultimately facilitate computational speed-ups for complex pulsar timing array analyses. This first wideband pulsar timing data set is a stepping stone, and its consistent results with NG12.5 assure us that such data sets are appropriate for gravitational wave analyses.

Submitted 18 December, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

Comments: 62 pages, 55 figures, 5 tables, 3 appendices. Data available at http://nanograv.org/data/ and via DOI 10.5281/zenodo.4312887

Journal ref: The Astrophysical Journal Supplement Series, 252, 5 (2021)

arXiv:2005.06490 [pdf, other]

doi 10.3847/1538-4365/abc6a0

The NANOGrav 12.5 yr Data Set: Observations and Narrowband Timing of 47 Millisecond Pulsars

Authors: Md F. Alam, Zaven Arzoumanian, Paul T. Baker, Harsha Blumer, Keith E. Bohler, Adam Brazier, Paul R. Brook, Sarah Burke-Spolaor, Keeisi Caballero, Richard S. Camuccio, Rachel L. Chamberlain, Shami Chatterjee, James M. Cordes, Neil J. Cornish, Fronefield Crawford, H. Thankful Cromartie, Megan E. DeCesar, Paul B. Demorest, Timothy Dolch, Justin A. Ellis, Robert D. Ferdman, Elizabeth C. Ferrara, William Fiore, Emmanuel Fonseca, Yhamil Garcia , et al. (45 additional authors not shown)

Abstract: We present time-of-arrival (TOA) measurements and timing models of 47 millisecond pulsars (MSPs) observed from 2004 to 2017 at the Arecibo Observatory and the Green Bank Telescope by the North American Nanohertz Observatory for Gravitational Waves (NANOGrav). The observing cadence was three to four weeks for most pulsars over most of this time span, with weekly observations of six sources. These data were collected for use in low-frequency gravitational wave searches and for other astrophysical purposes. We detail our observational methods and present a set of TOA measurements, based on "narrowband" analysis, in which many TOAs are calculated within narrow radio-frequency bands for data collected simultaneously across a wide bandwidth. A separate set of "wideband" TOAs will be presented in a companion paper. We detail a number of methodological changes compared to our previous work which yield a cleaner and more uniformly processed data set. Our timing models include several new astrometric and binary pulsar measurements, including previously unpublished values for the parallaxes of PSRs J1832-0836 and J2322+2057, the secular derivatives of the projected semi-major orbital axes of PSRs J0613-0200 and J2229+2643, and the first detection of the Shapiro delay in PSR J2145-0750. We report detectable levels of red noise in the time series for 14 pulsars. As a check on timing model reliability, we investigate the stability of astrometric parameters across data sets of different lengths. We report flux density measurements for all pulsars observed. Searches for stochastic and continuous gravitational waves using these data will be subjects of forthcoming publications.

Submitted 23 December, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

Comments: 54 pages, 52 figures, 7 tables, 1 appendix. Data are available at http://nanograv.org/data/ and via the DOI 10.5281/zenodo.4312297

Journal ref: The Astrophysical Journal Supplement Series, 252, 4 (2021)

arXiv:2004.13277 [pdf, other]

doi 10.1007/s42001-020-00078-5

Detecting multi-timescale consumption patterns from receipt data: A non-negative tensor factorization approach

Authors: Akira Matsui, Teruyoshi Kobayashi, Daisuke Moriwaki, Emilio Ferrara

Abstract: Understanding consumer behavior is an important task, not only for developing marketing strategies but also for the management of economic policies. Detecting consumption patterns, however, is a high-dimensional problem in which various factors that would affect consumers' behavior need to be considered, such as consumers' demographics, circadian rhythm, seasonal cycles, etc. Here, we develop a method to extract multi-timescale expenditure patterns of consumers from a large dataset of scanned receipts. We use a non-negative tensor factorization (NTF) to detect intra- and inter-week consumption patterns at one time. The proposed method allows us to characterize consumers based on their consumption patterns that are correlated over different timescales.

Submitted 2 August, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

Comments: 16 pages, 10 figures

Journal ref: Journal of Computational Social Science (2020)

arXiv:2004.09531 [pdf]

doi 10.5210/fm.v25i6.10633

What Types of COVID-19 Conspiracies are Populated by Twitter Bots?

Authors: Emilio Ferrara

Abstract: With people moving out of physical public spaces due to containment measures to tackle the novel coronavirus (COVID-19) pandemic, online platforms become even more prominent tools to understand social discussion. Studying social media can be informative to assess how we are collectively coping with this unprecedented global crisis. However, social media platforms are also populated by bots, automated accounts that can amplify certain topics of discussion at the expense of others. In this paper, we study 43.3M English tweets about COVID-19 and provide early evidence of the use of bots to promote political conspiracies in the United States, in stark contrast with humans who focus on public health concerns.

Submitted 2 June, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: Published in: First Monday, 25(6), 2020; https://firstmonday.org/ojs/index.php/fm/article/view/10633

Journal ref: First Monday, 25(6), 2020

arXiv:2003.08474 [pdf, other]

doi 10.1038/s41597-020-00655-3

TILES-2018, a longitudinal physiologic and behavioral data set of hospital workers

Authors: Karel Mundnich, Brandon M. Booth, Michelle L'Hommedieu, Tiantian Feng, Benjamin Girault, Justin L'Hommedieu, Mackenzie Wildman, Sophia Skaaden, Amrutha Nadarajan, Jennifer L. Villatte, Tiago H. Falk, Kristina Lerman, Emilio Ferrara, Shrikanth Narayanan

Abstract: We present a novel longitudinal multimodal corpus of physiological and behavioral data collected from direct clinical providers in a hospital workplace. We designed the study to investigate the use of off-the-shelf wearable and environmental sensors to understand individual-specific constructs such as job performance, interpersonal interaction, and well-being of hospital workers over time in their natural day-to-day job settings. We collected behavioral and physiological data from $n = 212$ participants through Internet-of-Things Bluetooth data hubs, wearable sensors (including a wristband, a biometrics-tracking garment, a smartphone, and an audio-feature recorder), together with a battery of surveys to assess personality traits, behavioral states, job performance, and well-being over time. Besides the default use of the data set, we envision several novel research opportunities and potential applications, including multi-modal and multi-task behavioral modeling, authentication through biometrics, and privacy-aware and privacy-preserving machine learning.

Submitted 18 December, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

Comments: 57 pages, 9 figures, journal paper

Journal ref: Sci Data 7, 354 (2020)

arXiv:2003.07372 [pdf]

doi 10.2196/19273

Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set

Authors: Emily Chen, Kristina Lerman, Emilio Ferrara

Abstract: At the time of this writing, the novel coronavirus (COVID-19) pandemic outbreak has already put tremendous strain on many countries' citizens, resources and economies around the world. Social distancing measures, travel bans, self-quarantines, and business closures are changing the very fabric of societies worldwide. With people forced out of public spaces, much conversation about these phenomena now occurs online, e.g., on social media platforms like Twitter. In this paper, we describe a multilingual coronavirus (COVID-19) Twitter dataset that we have been continuously collecting since January 22, 2020. We are making our dataset available to the research community (https://github.com/echen102/COVID-19-TweetIDs). It is our hope that our contribution will enable the study of online conversation dynamics in the context of a planetary-scale epidemic outbreak of unprecedented proportions and implications. This dataset could also help track scientific coronavirus misinformation and unverified rumors, or enable the understanding of fear and panic -- and undoubtedly more. Ultimately, this dataset may contribute towards enabling informed solutions and prescribing targeted policy interventions to fight this global crisis.

Submitted 2 June, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

Journal ref: JMIR Public Health Surveill 2020;6(2):e19273

arXiv:2001.10570 [pdf, other]

Detecting Troll Behavior via Inverse Reinforcement Learning: A Case Study of Russian Trolls in the 2016 US Election

Authors: Luca Luceri, Silvia Giordano, Emilio Ferrara

Abstract: Since the 2016 US Presidential election, social media abuse has been eliciting massive concern in the academic community and beyond. Preventing and limiting the malicious activity of users, such as trolls and bots, in their manipulation campaigns is of paramount importance for the integrity of democracy, public health, and more. However, the automated detection of troll accounts is an open challenge. In this work, we propose an approach based on Inverse Reinforcement Learning (IRL) to capture troll behavior and identify troll accounts. We employ IRL to infer a set of online incentives that may steer user behavior, which in turn highlights behavioral differences between troll and non-troll accounts, enabling their accurate classification. As a study case, we consider the troll accounts identified by the US Congress during the investigation of Russian meddling in the 2016 US Presidential election. We report promising results: the IRL-based approach is able to accurately detect troll accounts (AUC=89.1%). The differences in the predictive features between the two classes of accounts enable a principled understanding of the distinctive behaviors reflecting the incentives trolls and non-trolls respond to.

Submitted 5 June, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

arXiv:2001.10289 [pdf, other]

doi 10.1109/ACCESS.2020.3003370

Charting the Landscape of Online Cryptocurrency Manipulation

Authors: Leonardo Nizzoli, Serena Tardelli, Marco Avvenuti, Stefano Cresci, Maurizio Tesconi, Emilio Ferrara

Abstract: Cryptocurrencies represent one of the most attractive markets for financial speculation. As a consequence, they have attracted unprecedented attention on social media. Besides genuine discussions and legitimate investment initiatives, several deceptive activities have flourished. In this work, we chart the online cryptocurrency landscape across multiple platforms. To reach our goal, we collected a large dataset, composed of more than 50M messages published by almost 7M users on Twitter, Telegram and Discord, over three months. We performed bot detection on Twitter accounts sharing invite links to Telegram and Discord channels, and we discovered that more than 56% of them were bots or suspended accounts. Then, we applied topic modeling techniques to Telegram and Discord messages, unveiling two different deception schemes - "pump-and-dump" and "Ponzi" - and identifying the channels involved in these frauds. Whereas on Discord we found a negligible level of deception, on Telegram we retrieved 296 channels involved in pump-and-dump and 432 involved in Ponzi schemes, accounting for a striking 20% of the total. Moreover, we observed that 93% of the invite links shared by Twitter bots point to Telegram pump-and-dump channels, shedding light on a little-known social bot activity. Charting the landscape of online cryptocurrency manipulation can inform actionable policies to fight such abuse.

Submitted 28 January, 2020; originally announced January 2020.

Journal ref: IEEE Access 8, 2020

arXiv:2001.06547 [pdf, other]

Predictability limit of partially observed systems

Authors: Andrés Abeliuk, Zhishen Huang, Emilio Ferrara, Kristina Lerman

Abstract: Applications from finance to epidemiology and cyber-security require accurate forecasts of dynamic phenomena, which are often only partially observed. We demonstrate that a system's predictability degrades as a function of temporal sampling, regardless of the adopted forecasting model. We quantify the loss of predictability due to sampling, and show that it cannot be recovered by using external signals. We validate the generality of our theoretical findings in real-world partially observed systems representing infectious disease outbreaks, online discussions, and software development projects. On a variety of prediction tasks---forecasting new infections, the popularity of topics in online discussions, or interest in cryptocurrency projects---predictability irrecoverably decays as a function of sampling, unveiling fundamental predictability limits in partially observed systems.

Submitted 17 January, 2020; originally announced January 2020.

arXiv:2001.00595 [pdf, other]

doi 10.3847/1538-4357/ab7b67

Modeling the uncertainties of solar-system ephemerides for robust gravitational-wave searches with pulsar timing arrays

Authors: M. Vallisneri, S. R. Taylor, J. Simon, W. M. Folkner, R. S. Park, C. Cutler, J. A. Ellis, T. J. W. Lazio, S. J. Vigeland, K. Aggarwal, Z. Arzoumanian, P. T. Baker, A. Brazier, P. R. Brook, S. Burke-Spolaor, S. Chatterjee, J. M. Cordes, N. J. Cornish, F. Crawford, H. T. Cromartie, K. Crowter, M. DeCesar, P. B. Demorest, T. Dolch, R. D. Ferdman , et al. (39 additional authors not shown)

Abstract: The regularity of pulsar emissions becomes apparent once we reference the pulses' times of arrivals to the inertial rest frame of the solar system. It follows that errors in the determination of Earth's position with respect to the solar-system barycenter can appear as a time-correlated bias in pulsar-timing residual time series, affecting the searches for low-frequency gravitational waves performed with pulsar timing arrays. Indeed, recent array datasets yield different gravitational-wave background upper limits and detection statistics when analyzed with different solar-system ephemerides. Crucially, the ephemerides do not generally provide usable error representations. In this article we describe the motivation, construction, and application of a physical model of solar-system ephemeris uncertainties, which focuses on the degrees of freedom (Jupiter's orbital elements) most relevant to gravitational-wave searches with pulsar timing arrays. This model, BayesEphem, was used to derive ephemeris-robust results in NANOGrav's 11-yr stochastic-background search, and it provides a foundation for future searches by NANOGrav and other consortia. The analysis and simulations reported here suggest that ephemeris modeling reduces the gravitational-wave sensitivity of the 11-yr dataset; and that this degeneracy will vanish with improved ephemerides and with the longer pulsar timing datasets that will become available in the near future.

Submitted 6 January, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

Comments: Fixed typo in author list. Code that supports all calculations and figures is available at github.com/nanograv/11yr_stochastic_analysis/tree/master/bayesephem

arXiv:1912.07642 [pdf, other]

Building A Field: The Future of Astronomy with Gravitational Waves, A State of The Profession Consideration for Astro2020

Authors: Kelly Holley-Bockelmann, Joey Shapiro Key, Brittany Kamai, Robert Caldwell, Warren Brown, Bill Gabella, Karan Jani, Quentin Baghi, John Baker, Jillian Bellovary, Pete Bender, Emanuele Berti, T. J. Brandt, Curt Cutler, John W. Conklin, Michael Eracleous, Elizabeth C. Ferrara, Bernard J. Kelly, Shane L. Larson, Jeff Livas, Maura McLaughlin, Sean T. McWilliams, Guido Mueller, Priyamvada Natarajan, Norman Rioux , et al. (6 additional authors not shown)

Abstract: Harnessing the sheer discovery potential of gravitational wave astronomy will require bold, deliberate, and sustained efforts to train and develop the requisite workforce. The next decade requires a strategic plan to build -- from the ground up -- a robust, open, and well-connected gravitational wave astronomy community with deep participation from traditional astronomers, physicists, data scientists, and instrumentalists. This basic infrastructure is sorely needed as an enabling foundation for research. We outline a set of recommendations for funding agencies, universities, and professional societies to help build a thriving, diverse, and inclusive new field.

Submitted 16 December, 2019; originally announced December 2019.

arXiv:1912.00482 [pdf, other]

doi 10.3847/2041-8213/ab8121

The NANOGrav 11-year Data Set: Constraints on Planetary Masses Around 45 Millisecond Pulsars

Authors: E. A. Behrens, S. M. Ransom, D. R. Madison, Z. Arzoumanian, K. Crowter, M. E. DeCesar, P. B. Demorest, T. Dolch, J. A. Ellis, R. D. Ferdman, E. C. Ferrara, E. Fonseca, P. A. Gentile, G. Jones, M. L. Jones, M. T. Lam, L. Levin, D. R. Lorimer, R. S. Lynch, M. A. McLaughlin, C. Ng, D. J. Nice, T. T. Pennucci, B. B. P. Perera, P. S. Ray , et al. (5 additional authors not shown)

Abstract: We search for extrasolar planets around millisecond pulsars using pulsar timing data and seek to determine the minimum detectable planetary masses as a function of orbital period. Using the 11-year data set from the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), we look for variations from our models of pulse arrival times due to the presence of exoplanets. No planets are detected around the millisecond pulsars in the NANOGrav 11-year data set, but taking into consideration the noise levels of each pulsar and the sampling rate of our observations, we develop limits that show we are sensitive to planetary masses as low as that of the moon. We analyzed potential planet periods, P, in the range 7 days < P < 2000 days, with somewhat smaller ranges for some binary pulsars. The planetary mass limit for our median-sensitivity pulsar within this period range is 1 M_moon (P / 100 days)^(-2/3).

Submitted 24 March, 2020; v1 submitted 1 December, 2019; originally announced December 2019.

Comments: Revised and accepted by ApJ Letters

arXiv:1911.11787 [pdf, other]

doi 10.1145/3359176

Collaboration Drives Individual Productivity

Authors: Goran Muric, Andres Abeliuk, Kristina Lerman, Emilio Ferrara

Abstract: How does the number of collaborators affect individual productivity? Results of prior research have been conflicting, with some studies reporting an increase in individual productivity as the number of collaborators grows, while other studies show that the free-rider effect skews the effort invested by individuals, making larger groups less productive. The difference between these schools of thought is substantial: if a super-scaling effect exists, as suggested by former studies, then as groups grow, their productivity will increase even faster than their size, super-linearly improving their efficiency. We address this question by studying two planetary-scale collaborative systems: GitHub and Wikipedia. By analyzing the activity of over 2 million users on these platforms, we discover that the interplay between group size and productivity exhibits complex, previously-unobserved dynamics: the productivity of smaller groups scales super-linearly with group size, but saturates at larger sizes. This effect is not an artifact of the heterogeneity of productivity: the relation between group size and productivity holds at the individual level. People tend to do more when collaborating with more people. We propose a generative model of individual productivity that captures the non-linearity in collaboration effort. The proposed model is able to explain and predict group work dynamics in GitHub and Wikipedia by capturing their maximally informative behavioral features, and it paves the way for a principled, data-driven science of collaboration.

Submitted 26 November, 2019; originally announced November 2019.

Comments: Proc. ACM Hum.-Comput. Interact., Vol. 3, No. CSCW

arXiv:1911.08488 [pdf, other]

doi 10.3847/1538-4357/ab6083

The NANOGrav 11-Year Data Set: Limits on Gravitational Wave Memory

Authors: K. Aggarwal, Z. Arzoumanian, P. T. Baker, A. Brazier, P. R. Brook, S. Burke-Spolaor, S. Chatterjee, J. M. Cordes, N. J. Cornish, F. Crawford, H. T. Cromartie, K. Crowter, M. Decesar, P. B. Demorest, T. Dolch, J. A. Ellis, R. D. Ferdman, E. C. Ferrara, E. Fonseca, N. Garver-Daniels, P. Gentile, D. Good, J. S. Hazboun, A. M. Holgado, E. A. Huerta , et al. (36 additional authors not shown)

Abstract: The mergers of supermassive black hole binaries (SMBHBs) promise to be incredible sources of gravitational waves (GWs). While the oscillatory part of the merger gravitational waveform will be outside the frequency sensitivity range of pulsar timing arrays (PTAs), the non-oscillatory GW memory effect is detectable. Further, any burst of gravitational waves will produce GW memory, making memory a useful probe of unmodeled exotic sources and new physics. We searched the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) 11-year data set for GW memory. This dataset is sensitive to very low frequency GWs of $\sim3$ to $400$ nHz (periods of $\sim11$ yr $-$ $1$ mon). Finding no evidence for GWs, we placed limits on the strain amplitude of GW memory events during the observation period. We then used the strain upper limits to place limits on the rate of GW memory causing events. At a strain of $2.5\times10^{-14}$, corresponding to the median upper limit as a function of source sky position, we set a limit on the rate of GW memory events at $<0.4$ yr$^{-1}$. That strain corresponds to a SMBHB merger with reduced mass of $ηM \sim 2\times10^{10}M_\odot$ and inclination of $ι=π/3$ at a distance of 1 Gpc. As a test of our analysis, we analyzed the NANOGrav 9-year data set as well. This analysis found an anomalous signal, which does not appear in the 11-year data set. This signal is not a GW, and its origin remains unknown.

Submitted 6 December, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: 10 pages, 6 figures, submitted to ApJ

arXiv:1911.06959 [pdf, other]

Learning Behavioral Representations from Wearable Sensors

Authors: Nazgol Tavabi, Homa Hosseinmardi, Jennifer L. Villatte, Andrés Abeliuk, Shrikanth Narayanan, Emilio Ferrara, Kristina Lerman

Abstract: Continuous collection of physiological data from wearable sensors enables temporal characterization of individual behaviors. Understanding the relation between an individual's behavioral patterns and psychological states can help identify strategies to improve quality of life. One challenge in analyzing physiological data is extracting the underlying behavioral states from the temporal sensor signals and interpreting them. Here, we use a non-parametric Bayesian approach to model sensor data from multiple people and discover the dynamic behaviors they share. We apply this method to data collected from sensors worn by a population of hospital workers and show that the learned states can cluster participants into meaningful groups and better predict their cognitive and psychological states. This method offers a way to learn interpretable compact behavioral representations from multivariate sensor signals.

Submitted 4 July, 2020; v1 submitted 16 November, 2019; originally announced November 2019.

arXiv:1910.06317 [pdf, other]

doi 10.3847/1538-4357/ab4ceb

Classification of New X-ray Counterparts for Fermi Unassociated Gamma Ray Sources Using the Swift X-Ray Telescope

Authors: Amanpreet Kaur, Abraham D Falcone, Michael D Stroh, Jamie A Kennea, Elizabeth C Ferrara

Abstract: Approximately one-third of the gamma-ray sources in the third Fermi-LAT catalog are unidentified or unassociated with objects at other wavelengths. Observations with Swift-XRT have yielded possible counterparts in $\sim$30% of these source regions. The objective of this work is to identify the nature of these possible counterparts, utilizing their gamma-ray properties coupled with the Swift-derived X-ray properties. The majority of the known sources in the Fermi catalogs are blazars, which constitute the bulk of the extragalactic gamma-ray source population. The galactic population, on the other hand, is dominated by pulsars. Blazars and pulsars occupy different parameter space when X-ray fluxes are compared with various gamma-ray properties. In this work, we utilize the X-ray observations performed with the Swift-XRT for the unknown Fermi sources and compare their X-ray and gamma-ray properties to differentiate between the two source classes. We apply two machine learning algorithms, decision tree and random forest classifier, to our high signal-to-noise ratio sample of 217 sources, each of which corresponds to a Fermi unassociated region. The accuracy scores for the two methods were found to be 97% and 99%, respectively. The random forest classifier, which is based on the application of a multitude of decision trees, assigned each source a probability (P$_{bzr}$) of being a blazar. This yielded 173 blazar candidates with P$_{bzr}$ $\geq$ 90%, of which 134 had P$_{bzr}$ $\geq$ 99%. The results yielded 13 sources with P$_{bzr}$ $\leq$ 10%, which we deemed reasonable candidates for pulsars, 7 of which had P$_{bzr}$ $\leq$ 1%. There were 31 sources that exhibited intermediate probabilities and were termed ambiguous due to their unclear characterization as a pulsar or a blazar.

Submitted 14 October, 2019; originally announced October 2019.

Comments: accepted in ApJ

arXiv:1910.05870 [pdf, other]

doi 10.1103/PhysRevE.102.052316

Network Modularity Controls the Speed of Information Diffusion

Authors: Hao Peng, Azadeh Nematzadeh, Daniel M. Romero, Emilio Ferrara

Abstract: The rapid diffusion of information and the adoption of social behaviors are of critical importance in situations as diverse as collective actions, pandemic prevention, or advertising and marketing. Although the dynamics of large cascades have been extensively studied in various contexts, few have systematically examined the impact of network topology on the efficiency of information diffusion. Here, by employing the linear threshold model on networks with communities, we demonstrate that a prominent network feature, the modular structure, strongly affects the speed of information diffusion in complex contagion. Our simulations show that there always exists an optimal network modularity for the most efficient spreading process. Beyond this critical value, either a stronger or a weaker modular structure actually hinders the diffusion speed. These results are confirmed by an analytical approximation. We further demonstrate that the optimal modularity varies with both the seed size and the target cascade size, and is ultimately dependent on the network under investigation. We underscore the importance of our findings in applications from marketing to epidemiology, from neuroscience to engineering, where the understanding of the structural design of complex systems focuses on the efficiency of information propagation.

Submitted 30 July, 2020; v1 submitted 13 October, 2019; originally announced October 2019.

arXiv:1910.01720 [pdf, other]

Bots, elections, and social media: a brief overview

Authors: Emilio Ferrara

Abstract: Bots, software-controlled accounts that operate on social media, have been used to manipulate and deceive. We studied the characteristics and activity of bots around major political events, including elections in various countries. In this chapter, we summarize our findings of bot operations in the context of the 2016 and 2018 US Presidential and Midterm elections and the 2017 French Presidential election.

Submitted 3 October, 2019; originally announced October 2019.

arXiv:1909.08644 [pdf, other]

doi 10.3847/1538-4357/ab68db

The NANOGrav 11-Year Data Set: Evolution of Gravitational Wave Background Statistics

Authors: J. S. Hazboun, J. Simon, S. R. Taylor, M. T. Lam, S. J. Vigeland, K. Islo, J. S. Key, Z. Arzoumanian, P. T. Baker, A. Brazier, P. R. Brook, S. Burke-Spolaor, S. Chatterjee, J. M. Cordes, N. J. Cornish, F. Crawford, K. Crowter, H. T. Cromartie, M. DeCesar, P. B. Demorest, T. Dolch, J. A. Ellis, R. D. Ferdman, E. Ferrara, E. Fonseca, et al. (38 additional authors not shown)

Abstract: An ensemble of inspiraling supermassive black hole binaries should produce a stochastic background of very low frequency gravitational waves. This stochastic background is predicted to be a power law, with a spectral index of -2/3, and it should be detectable by a network of precisely timed millisecond pulsars, widely distributed on the sky. This paper reports a new "time slicing" analysis of the 11-year data release from the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) using 34 millisecond pulsars. Methods to flag potential "false positive" signatures are developed, including techniques to identify responsible pulsars. Mitigation strategies are then presented. We demonstrate how an incorrect noise model can lead to spurious signals, and show how independently modeling noise across 30 Fourier components, spanning NANOGrav's frequency range, effectively diagnoses and absorbs the excess power in gravitational-wave searches. This results in a nominal, and expected, progression of our gravitational-wave statistics. Additionally, we show that the first interstellar medium event in PSR J1713+0747 pollutes the common red noise process with low-spectral index noise, and use a tailored noise model to remove these effects.

Submitted 20 September, 2019; v1 submitted 18 September, 2019; originally announced September 2019.

Comments: 14 pages, 13 figures, fixed typo in abstract of earlier version

arXiv:1909.04534 [pdf, other]

doi 10.1093/mnras/stz2857

The International Pulsar Timing Array: Second data release

Authors: B. B. P. Perera, M. E. DeCesar, P. B. Demorest, M. Kerr, L. Lentati, D. J. Nice, S. Oslowski, S. M. Ransom, M. J. Keith, Z. Arzoumanian, M. Bailes, P. T. Baker, C. G. Bassa, N. D. R. Bhat, A. Brazier, M. Burgay, S. Burke-Spolaor, R. N. Caballero, D. J. Champion, S. Chatterjee, S. Chen, I. Cognard, J. M. Cordes, K. Crowter, S. Dai, et al. (50 additional authors not shown)

Abstract: In this paper, we describe the International Pulsar Timing Array second data release, which includes recent pulsar timing data obtained by three regional consortia: the European Pulsar Timing Array, the North American Nanohertz Observatory for Gravitational Waves, and the Parkes Pulsar Timing Array. We analyse and where possible combine high-precision timing data for 65 millisecond pulsars which are regularly observed by these groups. A basic noise analysis, including the processes which are both correlated and uncorrelated in time, provides noise models and timing ephemerides for the pulsars. We find that the timing precisions of pulsars are generally improved compared to the previous data release, mainly due to the addition of new data in the combination. The main purpose of this work is to create the most up-to-date IPTA data release. These data are publicly available for searches for low-frequency gravitational waves and other pulsar science.

Submitted 10 September, 2019; originally announced September 2019.

Comments: Submitted to MNRAS and in review, 23 pages, 5 figures

Showing 101–150 of 324 results for author: Ferrara, E