Showing 1–1 of 1 results for author: Pabbi, D

  1. arXiv:2005.06012  [pdf, other]

    cs.SI cs.CL

    Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19

    Authors: Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Dinesh Pabbi, Kunal Verma, Rannie Lin

    Abstract: We describe Mega-COV, a billion-scale dataset from Twitter for studying COVID-19. The dataset is diverse (covers 268 countries), longitudinal (goes back as far as 2007), multilingual (comes in 100+ languages), and has a significant number of location-tagged tweets (~169M tweets). We release tweet IDs from the dataset. We also develop and release two powerful models, one for identifying whether or not…

    Submitted 5 February, 2021; v1 submitted 2 May, 2020; originally announced May 2020.