Computer Science > Computers and Society
[Submitted on 9 Mar 2023]
Title:TGDataset: a Collection of Over One Hundred Thousand Telegram Channels
View PDFAbstract:Telegram is one of the most popular instant messaging apps in today's digital age. In addition to providing a private messaging service, Telegram, with its channels, represents a valid medium for rapidly broadcasting content to a large audience (COVID-19 announcements), but, unfortunately, also for disseminating radical ideologies and coordinating attacks (Capitol Hill riot). This paper presents the TGDataset, a new dataset that includes 120,979 Telegram channels and over 400 million messages, making it the largest collection of Telegram channels to the best of our knowledge. After a brief introduction to the data collection process, we analyze the languages spoken within our dataset and the topic covered by English channels. Finally, we discuss some use cases in which our dataset can be extremely useful to understand better the Telegram ecosystem, as well as to study the diffusion of questionable news. In addition to the raw dataset, we released the scripts we used to analyze the dataset and the list of channels belonging to the network of a new conspiracy theory called Sabmyk.
Submission history
From: Alberto Maria Mongardini [view email][v1] Thu, 9 Mar 2023 15:42:38 UTC (747 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.