Showing 1–41 of 41 results for author: Martino, G D S

Searching in archive cs.
  1. arXiv:2311.03179  [pdf, other]

    cs.CL cs.AI

    ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text

    Authors: Maram Hasanain, Firoj Alam, Hamdy Mubarak, Samir Abdaljalil, Wajdi Zaghouani, Preslav Nakov, Giovanni Da San Martino, Abed Alhakim Freihat

    Abstract: We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference co-located with EMNLP 2023. ArAIEval offers two tasks over Arabic text: (i) persuasion technique detection, focusing on identifying persuasion techniques in tweets and news articles, and (ii) disinformation detection in binary and multiclass setups over tweets. A total of 20 teams participa…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted at ArabicNLP-23 (EMNLP-23), propaganda, disinformation, misinformation, fake news

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  2. arXiv:2301.06774  [pdf, other]

    cs.SI cs.CL cs.CY

    Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

    Authors: Serena Tardelli, Leonardo Nizzoli, Maurizio Tesconi, Mauro Conti, Preslav Nakov, Giovanni Da San Martino, Stefano Cresci

    Abstract: Large-scale online campaigns, malicious or otherwise, require a significant degree of coordination among participants, which sparked interest in the study of coordinated online behavior. State-of-the-art methods for detecting coordinated behavior perform static analyses, disregarding the temporal dynamics of coordination. Here, we carry out the first dynamic analysis of coordinated behavior. To re…

    Submitted 9 May, 2024; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: Article published in PNAS 121 (20). Please, cite the published version

    Journal ref: Proceedings of the National Academy of Sciences 121 (20), e2307038121, 2024

  3. arXiv:2211.10057  [pdf, other]

    cs.CL cs.AI cs.LG

    Overview of the WANLP 2022 Shared Task on Propaganda Detection in Arabic

    Authors: Firoj Alam, Hamdy Mubarak, Wajdi Zaghouani, Giovanni Da San Martino, Preslav Nakov

    Abstract: Propaganda is the expression of an opinion or an action by an individual or a group deliberately designed to influence the opinions or the actions of other individuals or groups with reference to predetermined ends, which is achieved by means of well-defined rhetorical and psychological devices. Propaganda techniques are commonly used in social media to manipulate or to mislead users. Thus, there…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Accepted at WANLP-22 (EMNLP-22), propaganda, disinformation, misinformation, fake news, memes, multimodality. arXiv admin note: text overlap with arXiv:2109.08013, arXiv:2105.09284

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  4. arXiv:2205.04274  [pdf, other]

    cs.CL cs.AI cs.CV

    Detecting and Understanding Harmful Memes: A Survey

    Authors: Shivam Sharma, Firoj Alam, Md. Shad Akhtar, Dimitar Dimitrov, Giovanni Da San Martino, Hamed Firooz, Alon Halevy, Fabrizio Silvestri, Preslav Nakov, Tanmoy Chakraborty

    Abstract: The automatic identification of harmful content online is of major concern for social media platforms, policymakers, and society. Researchers have studied textual, visual, and audio content, but typically in isolation. Yet, harmful content often combines multiple modalities, as in the case of memes, which are of particular interest due to their viral nature. With this in mind, here we offer a comp…

    Submitted 29 May, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: Accepted at IJCAI-ECAI 2022 (Survey Track) - Editorial Feedback Revised, 9 pages (7 main + 2 reference pages)

  5. arXiv:2109.15118  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG cs.SI

    Overview of the CLEF-2019 CheckThat!: Automatic Identification and Verification of Claims

    Authors: Tamer Elsayed, Preslav Nakov, Alberto Barrón-Cedeño, Maram Hasanain, Reem Suwaileh, Giovanni Da San Martino, Pepa Atanasova

    Abstract: We present an overview of the second edition of the CheckThat! Lab at CLEF 2019. The lab featured two tasks in two different languages: English and Arabic. Task 1 (English) challenged the participating systems to predict which claims in a political debate or speech should be prioritized for fact-checking. Task 2 (Arabic) asked to (A) rank a given set of Web pages with respect to a check-worthy cla…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: Check-worthiness Estimation, Fact-Checking, Veracity, Evidence-based Verification, Fake News Detection, Computational Journalism, Disinformation, Misinformation. arXiv admin note: text overlap with arXiv:2012.09263 by other authors

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: CLEF-2019

  6. arXiv:2109.13046  [pdf, other]

    cs.SI cs.AI cs.CL

    The Spread of Propaganda by Coordinated Communities on Social Media

    Authors: Kristina Hristakieva, Stefano Cresci, Giovanni Da San Martino, Mauro Conti, Preslav Nakov

    Abstract: Large-scale manipulations on social media have two important characteristics: (i) use of propaganda to influence others, and (ii) adoption of coordinated behavior to spread it and to amplify its impact. Despite the connection between them, these two characteristics have so far been considered in isolation. Here we aim to bridge this gap. In particular, we analyze the spread of propaganda and its i…

    Submitted 21 May, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: The 14th ACM Web Science Conference 2022 (WebSci '22)

  7. arXiv:2109.12987  [pdf, other]

    cs.CL cs.IR cs.LG cs.SI

    Overview of the CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News

    Authors: Preslav Nakov, Giovanni Da San Martino, Tamer Elsayed, Alberto Barrón-Cedeño, Rubén Míguez, Shaden Shaar, Firoj Alam, Fatima Haouari, Maram Hasanain, Watheq Mansour, Bayan Hamdan, Zien Sheikh Ali, Nikolay Babulkov, Alex Nikolov, Gautam Kishore Shahi, Julia Maria Struß, Thomas Mandl, Mucahid Kutlu, Yavuz Selim Kartal

    Abstract: We describe the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting tasks related to factuality, and covers Arabic, Bulgarian, English, Spanish, and Turkish. Task 1 asks to predict which posts in a Twitter stream are worth fact-checking, focusing on COVID-19 and politics (in all five languages). Task 2 a…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: Check-Worthiness Estimation, Fact-Checking, Veracity, Evidence-based Verification, Detecting Previously Fact-Checked Claims, Social Media Verification, Computational Journalism, COVID-19

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: CLEF-2021

  8. arXiv:2109.12986  [pdf, other]

    cs.CL cs.IR cs.LG cs.SI

    Findings of the NLP4IF-2021 Shared Tasks on Fighting the COVID-19 Infodemic and Censorship Detection

    Authors: Shaden Shaar, Firoj Alam, Giovanni Da San Martino, Alex Nikolov, Wajdi Zaghouani, Preslav Nakov, Anna Feldman

    Abstract: We present the results and the main findings of the NLP4IF-2021 shared tasks. Task 1 focused on fighting the COVID-19 infodemic in social media, and it was offered in Arabic, Bulgarian, and English. Given a tweet, it asked to predict whether that tweet contains a verifiable claim, and if so, whether it is likely to be false, is of general interest, is likely to be harmful, and is worthy of manual…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: COVID-19, infodemic, harmfulness, check-worthiness, censorship, social media, tweets, Arabic, Bulgarian, English, Chinese

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: NLP4IF-2021

  9. arXiv:2109.11372  [pdf, other]

    cs.CL cs.SI

    A Second Pandemic? Analysis of Fake News About COVID-19 Vaccines in Qatar

    Authors: Preslav Nakov, Firoj Alam, Shaden Shaar, Giovanni Da San Martino, Yifan Zhang

    Abstract: While COVID-19 vaccines are finally becoming widely available, a second pandemic that revolves around the circulation of anti-vaxxer fake news may hinder efforts to recover from the first one. With this in mind, we performed an extensive analysis of Arabic and English tweets about COVID-19 vaccines, with focus on messages originating from Qatar. We found that Arabic tweets contain a lot of false i…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: COVID-19, disinformation, misinformation, factuality, fact-checking, fact-checkers, check-worthiness, framing, harmfulness, social media platforms, social media

    Report number: RANLP-2021 MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: RANLP-2021

  10. arXiv:2109.08013  [pdf, other]

    cs.CV cs.CL cs.LG cs.MM

    Detecting Propaganda Techniques in Memes

    Authors: Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

    Abstract: Propaganda can be defined as a form of communication that aims to influence the opinions or the actions of people towards a specific goal; this is achieved by means of well-defined rhetorical and psychological devices. Propaganda, in the form we know it today, can be dated back to the beginning of the 17th century. However, it is with the advent of the Internet and the social media that it has sta…

    Submitted 7 August, 2021; originally announced September 2021.

    Comments: propaganda, disinformation, fake news, memes, multimodality. arXiv admin note: text overlap with arXiv:2105.09284

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: ACL-2021

  11. arXiv:2109.07410  [pdf, other]

    cs.CL cs.AI cs.CY cs.IR cs.LG

    Assisting the Human Fact-Checkers: Detecting All Previously Fact-Checked Claims in a Document

    Authors: Shaden Shaar, Nikola Georgiev, Firoj Alam, Giovanni Da San Martino, Aisha Mohamed, Preslav Nakov

    Abstract: Given the recent proliferation of false claims online, there has been a lot of manual fact-checking effort. As this is very time-consuming, human fact-checkers can benefit from tools that can support them and make them more efficient. Here, we focus on building a system that could provide such support. Given an input document, it aims to detect all sentences that contain a claim that can be verifi…

    Submitted 15 November, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: detecting previously fact-checked claims, fact-checking, disinformation, fake news, social media, political debates

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: EMNLP-2022

  12. arXiv:2108.12802  [pdf, other]

    cs.CL cs.AI cs.LG

    Interpretable Propaganda Detection in News Articles

    Authors: Seunghak Yu, Giovanni Da San Martino, Mitra Mohtarami, James Glass, Preslav Nakov

    Abstract: Online users today are exposed to misleading and propagandistic news articles and media posts on a daily basis. To counter this, a number of approaches have been designed aiming to achieve a healthier and safer online news and media consumption. Automatic systems are able to support humans in detecting such content; yet, a major impediment to their broad adoption is that besides being accurate, th…

    Submitted 29 August, 2021; originally announced August 2021.

    Comments: propaganda, propaganda techniques, disinformation, misinformation, fake news, explainability, interpretability

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: RANLP-2021

  13. arXiv:2105.09284  [pdf, other]

    cs.MM cs.CL cs.LG

    SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images

    Authors: Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

    Abstract: We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems. The task focused on memes and had three subtasks: (i) detecting the techniques in the text, (ii) detecting the text spans where the techniques are used, and (iii) detecting techniques in the entire meme, i.…

    Submitted 25 April, 2021; originally announced May 2021.

    Comments: propaganda, disinformation, misinformation, fake news, memes, multimodality

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: SemEval-2021

  14. arXiv:2104.07423  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG cs.NE

    The Role of Context in Detecting Previously Fact-Checked Claims

    Authors: Shaden Shaar, Firoj Alam, Giovanni Da San Martino, Preslav Nakov

    Abstract: Recent years have seen the proliferation of disinformation and fake news online. The traditional approach to mitigating these issues is to use manual or automatic fact-checking. Recently, another approach has emerged: checking whether the input claim has previously been fact-checked, which can be done automatically, and thus fast, while also offering credibility and explainability, thanks to the human…

    Submitted 9 May, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted as Findings of NAACL-2022, detecting previously fact-checked claims, fact-checking, disinformation, fake news, social media, political debates

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  15. arXiv:2103.12541  [pdf, other]

    cs.MM cs.AI cs.CL cs.CR cs.CY cs.LG cs.SI

    A Survey on Multimodal Disinformation Detection

    Authors: Firoj Alam, Stefano Cresci, Tanmoy Chakraborty, Fabrizio Silvestri, Dimiter Dimitrov, Giovanni Da San Martino, Shaden Shaar, Hamed Firooz, Preslav Nakov

    Abstract: Recent years have witnessed the proliferation of offensive content online such as fake news, propaganda, misinformation, and disinformation. While initially this was mostly about textual content, over time images and videos gained popularity, as they are much easier to consume, attract more attention, and spread further than text. As a result, researchers started leveraging different modalities an…

    Submitted 28 September, 2022; v1 submitted 13 March, 2021; originally announced March 2021.

    Comments: Accepted at COLING-2022, disinformation, misinformation, factuality, harmfulness, fake news, propaganda, multimodality, text, images, videos, network structure, temporality

    MSC Class: 68T50 ACM Class: I.2.7

  16. arXiv:2103.07769  [pdf, other]

    cs.AI cs.CL cs.CR cs.IR cs.LG

    Automated Fact-Checking for Assisting Human Fact-Checkers

    Authors: Preslav Nakov, David Corney, Maram Hasanain, Firoj Alam, Tamer Elsayed, Alberto Barrón-Cedeño, Paolo Papotti, Shaden Shaar, Giovanni Da San Martino

    Abstract: The reporting and the analysis of current events around the globe has expanded from professional, editor-led journalism all the way to citizen journalism. Nowadays, politicians and other key players enjoy direct access to their audiences through social media, bypassing the filters of official cables or traditional media. However, the multiple advantages of free speech and direct communication are…

    Submitted 22 May, 2021; v1 submitted 13 March, 2021; originally announced March 2021.

    Comments: fact-checking, fact-checkers, check-worthiness, detecting previously fact-checked claims, evidence retrieval

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: IJCAI-2021

  17. arXiv:2010.05338  [pdf, other]

    cs.CL

    We Can Detect Your Bias: Predicting the Political Ideology of News Articles

    Authors: Ramy Baly, Giovanni Da San Martino, James Glass, Preslav Nakov

    Abstract: We explore the task of predicting the leading political ideology or bias of news articles. First, we collect and release a large dataset of 34,737 articles that were manually annotated for political ideology (left, center, or right), which is well-balanced across both topics and media. We further use a challenging experimental setup where the test examples come from media that were not seen during…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: Political bias, bias in news, neural networks bias, adversarial adaptation, triplet loss, transformers, recurrent neural networks

    Journal ref: EMNLP-2020

  18. arXiv:2009.02931  [pdf, ps, other]

    cs.CL cs.IR cs.LG cs.SI

    Team Alex at CLEF CheckThat! 2020: Identifying Check-Worthy Tweets With Transformer Models

    Authors: Alex Nikolov, Giovanni Da San Martino, Ivan Koychev, Preslav Nakov

    Abstract: While misinformation and disinformation have been thriving in social media for years, with the emergence of the COVID-19 pandemic, the political and the health misinformation merged, thus elevating the problem to a whole new level and giving rise to the first global infodemic. The fight against this infodemic has many aspects, with fact-checking and debunking false and misleading claims being amon…

    Submitted 7 September, 2020; originally announced September 2020.

    Comments: Check-worthiness; Fact-Checking; Veracity

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: CLEF-2020

  19. arXiv:2009.02696  [pdf, other]

    cs.CL cs.CY

    SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles

    Authors: G. Da San Martino, A. Barrón-Cedeño, H. Wachsmuth, R. Petrov, P. Nakov

    Abstract: We present the results and the main findings of SemEval-2020 Task 11 on Detection of Propaganda Techniques in News Articles. The task featured two subtasks. Subtask SI is about Span Identification: given a plain-text document, spot the specific text fragments containing propaganda. Subtask TC is about Technique Classification: given a specific text fragment, in the context of a full document, dete…

    Submitted 6 September, 2020; originally announced September 2020.

    Comments: 37 pages, to be published in Proceedings of the 14th International Workshop on Semantic Evaluation

  20. arXiv:2007.08024  [pdf, other]

    cs.CL cs.IR cs.LG

    A Survey on Computational Propaganda Detection

    Authors: Giovanni Da San Martino, Stefano Cresci, Alberto Barron-Cedeno, Seunghak Yu, Roberto Di Pietro, Preslav Nakov

    Abstract: Propaganda campaigns aim at influencing people's mindset with the purpose of advancing a specific agenda. They exploit the anonymity of the Internet, the micro-profiling ability of social networks, and the ease of automatically creating and managing coordinated networks of accounts, to reach millions of social network users with persuasive messages, specifically targeted to topics each individual…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: propaganda detection, disinformation, misinformation, fake news, media bias

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: IJCAI-2020

  21. arXiv:2007.07997  [pdf, other]

    cs.CL cs.IR cs.LG

    Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media

    Authors: Alberto Barron-Cedeno, Tamer Elsayed, Preslav Nakov, Giovanni Da San Martino, Maram Hasanain, Reem Suwaileh, Fatima Haouari, Nikolay Babulkov, Bayan Hamdan, Alex Nikolov, Shaden Shaar, Zien Sheikh Ali

    Abstract: We present an overview of the third edition of the CheckThat! Lab at CLEF 2020. The lab featured five tasks in two different languages: English and Arabic. The first four tasks compose the full pipeline of claim verification in social media: Task 1 on check-worthiness estimation, Task 2 on retrieving previously fact-checked claims, Task 3 on evidence retrieval, and Task 4 on claim verification. Th…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: Check-Worthiness Estimation, Fact-Checking, Veracity, Evidence-based Verification, Detecting Previously Fact-Checked Claims, Social Media Verification, Computational Journalism, COVID-19

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: CLEF-2020

  22. arXiv:2007.07996  [pdf, other]

    cs.IR cs.CL cs.LG cs.SI

    Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms

    Authors: Firoj Alam, Fahim Dalvi, Shaden Shaar, Nadir Durrani, Hamdy Mubarak, Alex Nikolov, Giovanni Da San Martino, Ahmed Abdelali, Hassan Sajjad, Kareem Darwish, Preslav Nakov

    Abstract: With the outbreak of the COVID-19 pandemic, people turned to social media to read and to share timely information including statistics, warnings, advice, and inspirational stories. Unfortunately, alongside all this useful information, there was also a new blending of medical and political misinformation and disinformation, which gave rise to the first global infodemic. While fighting this infodemi…

    Submitted 9 April, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: COVID-19, Infodemic, Disinformation, Misinformation, Fake News, Call to Arms, Crowdsourcing Annotations

    MSC Class: 68T50 ACM Class: I.2.7

  23. arXiv:2005.06058  [pdf, other]

    cs.CL cs.IR cs.LG cs.SI

    That is a Known Lie: Detecting Previously Fact-Checked Claims

    Authors: Shaden Shaar, Giovanni Da San Martino, Nikolay Babulkov, Preslav Nakov

    Abstract: The recent proliferation of "fake news" has triggered a number of responses, most notably the emergence of several manual fact-checking initiatives. As a result and over time, a large number of fact-checked claims have been accumulated, which increases the likelihood that a new claim in social media or a new statement by a politician might have already been fact-checked by some trusted fact-checki…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: detecting previously fact-checked claims, fact-checking, disinformation, fake news, social media, political debates

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: ACL-2020

  24. arXiv:2005.05854  [pdf, other]

    cs.CL cs.IR cs.LG cs.NE

    Prta: A System to Support the Analysis of Propaganda Techniques in the News

    Authors: Giovanni Da San Martino, Shaden Shaar, Yifan Zhang, Seunghak Yu, Alberto Barrón-Cedeño, Preslav Nakov

    Abstract: Recent events, such as the 2016 US Presidential Campaign, Brexit and the COVID-19 "infodemic", have brought into the spotlight the dangers of online disinformation. There has been a lot of research focusing on fact-checking and disinformation detection. However, little attention has been paid to the specific rhetorical and psychological techniques used to convey propaganda messages. Revealing the…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: propaganda, disinformation, fake news, media bias, COVID-19

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: ACL-2020

  25. arXiv:2005.00033  [pdf, other]

    cs.CL cs.CY cs.IR

    Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society

    Authors: Firoj Alam, Shaden Shaar, Fahim Dalvi, Hassan Sajjad, Alex Nikolov, Hamdy Mubarak, Giovanni Da San Martino, Ahmed Abdelali, Nadir Durrani, Kareem Darwish, Abdulaziz Al-Homaid, Wajdi Zaghouani, Tommaso Caselli, Gijs Danoe, Friso Stolk, Britt Bruntink, Preslav Nakov

    Abstract: With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreadin…

    Submitted 22 September, 2021; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: disinformation, misinformation, factuality, fact-checking, fact-checkers, check-worthiness, Social Media Platforms, COVID-19, social media

    MSC Class: 68T50 ACM Class: I.2; I.2.7

    Journal ref: EMNLP-2021 (Findings)

  26. arXiv:2001.08546  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    CheckThat! at CLEF 2020: Enabling the Automatic Identification and Verification of Claims in Social Media

    Authors: Alberto Barron-Cedeno, Tamer Elsayed, Preslav Nakov, Giovanni Da San Martino, Maram Hasanain, Reem Suwaileh, Fatima Haouari

    Abstract: We describe the third edition of the CheckThat! Lab, which is part of the 2020 Cross-Language Evaluation Forum (CLEF). CheckThat! proposes four complementary tasks and a related task from previous lab editions, offered in English, Arabic, and Spanish. Task 1 asks to predict which tweets in a Twitter stream are worth fact-checking. Task 2 asks to determine whether a claim posted in a tweet can be v…

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: Computational journalism, Check-worthiness, Fact-checking, Veracity, CLEF-2020 CheckThat! Lab

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: CLEF-2018 ECIR-2020

  27. arXiv:1912.06810  [pdf, other]

    cs.CL cs.IR cs.LG

    Proppy: A System to Unmask Propaganda in Online News

    Authors: Alberto Barrón-Cedeño, Giovanni Da San Martino, Israa Jaradat, Preslav Nakov

    Abstract: We present proppy, the first publicly available real-world, real-time propaganda detection system for online news, which aims at raising awareness, thus potentially limiting the impact of propaganda and helping fight disinformation. The system constantly monitors a number of news sources, deduplicates and clusters the news into events, and organizes the articles about an event on the basis of the…

    Submitted 14 December, 2019; originally announced December 2019.

    Comments: propaganda, disinformation, fake news

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-2019)

  28. arXiv:1911.08755  [pdf, ps, other]

    cs.CL cs.AI cs.IR cs.LO

    Global Thread-Level Inference for Comment Classification in Community Question Answering

    Authors: Shafiq Joty, Alberto Barrón-Cedeño, Giovanni Da San Martino, Simone Filice, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov

    Abstract: Community question answering, a recent evolution of question answering in the Web context, allows a user to quickly consult the opinion of a number of people on a particular topic, thus taking advantage of the wisdom of the crowd. Here we try to help the user by deciding automatically which answers are good and which are bad for a given question. In particular, we focus on exploiting the output st…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: community question answering, thread-level inference, graph-cut, inductive logic programming

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EMNLP-2015

  29. arXiv:1911.06815  [pdf, other]

    cs.CL cs.IR

    Experiments in Detecting Persuasion Techniques in the News

    Authors: Seunghak Yu, Giovanni Da San Martino, Preslav Nakov

    Abstract: Many recent political events, like the 2016 US Presidential elections or the 2018 Brazilian elections have raised the attention of institutions and of the general public on the role of Internet and social media in influencing the outcome of these events. We argue that a safe democracy is one in which citizens have tools to make them aware of propaganda campaigns. We propose a novel task: performin…

    Submitted 15 November, 2019; originally announced November 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1910.02517

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: NeurIPS-2019 workshop on AI for Social Good

  30. arXiv:1910.09982  [pdf, other]

    cs.CL cs.SI

    Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection

    Authors: Giovanni Da San Martino, Alberto Barrón-Cedeño, Preslav Nakov

    Abstract: We present the shared task on Fine-Grained Propaganda Detection, which was organized as part of the NLP4IF workshop at EMNLP-IJCNLP 2019. There were two subtasks. FLC is a fragment-level task that asks for the identification of propagandist text fragments in a news article and also for the prediction of the specific propaganda technique used in each such fragment (18-way classification task). SLC…

    Submitted 20 October, 2019; originally announced October 2019.

    Comments: propaganda, disinformation, fake news. arXiv admin note: text overlap with arXiv:1910.02517

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: NLP4IF@EMNLP-2019

  31. arXiv:1910.02517  [pdf, other]

    cs.CL cs.AI cs.IR

    Fine-Grained Analysis of Propaganda in News Articles

    Authors: Giovanni Da San Martino, Seunghak Yu, Alberto Barrón-Cedeño, Rostislav Petrov, Preslav Nakov

    Abstract: Propaganda aims at influencing people's mindset with the purpose of advancing a specific agenda. Previous work has addressed propaganda detection at the document level, typically labelling all articles from a propagandistic news outlet as propaganda. Such noisy gold labels inevitably affect the quality of any learning system trained on them. A further issue with most existing systems is the lack o…

    Submitted 6 October, 2019; originally announced October 2019.

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EMNLP-2019

  32. arXiv:1910.02028  [pdf, other]

    cs.CL cs.IR

    Tanbih: Get To Know What You Are Reading

    Authors: Yifan Zhang, Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Jisun An, Haewoon Kwak, Todor Staykovski, Israa Jaradat, Georgi Karadzhov, Ramy Baly, Kareem Darwish, James Glass, Preslav Nakov

    Abstract: We introduce Tanbih, a news aggregator with intelligent analysis tools to help readers understand what's behind a news story. Our system displays news grouped into events and generates media profiles that show the general factuality of reporting, the degree of propagandistic content, hyper-partisanship, leading political ideology, general frame of reporting, and stance with respect to various c…

    Submitted 4 October, 2019; originally announced October 2019.

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EMNLP-2019

  33. arXiv:1904.03513  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Team QCRI-MIT at SemEval-2019 Task 4: Propaganda Analysis Meets Hyperpartisan News Detection

    Authors: Abdelrhman Saleh, Ramy Baly, Alberto Barrón-Cedeño, Giovanni Da San Martino, Mitra Mohtarami, Preslav Nakov, James Glass

    Abstract: In this paper, we describe our submission to SemEval-2019 Task 4 on Hyperpartisan News Detection. Our system relies on a variety of engineered features originally used to detect propaganda. This is based on the assumption that biased messages are propagandistic in the sense that they promote a particular political cause or viewpoint. We trained a logistic regression model with features ranging fro…

    Submitted 6 April, 2019; originally announced April 2019.

    Comments: Hyperpartisanship, propaganda, news media, fake news, SemEval-2018

  34. arXiv:1808.05542  [pdf, other]

    cs.CL

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness

    Authors: Pepa Atanasova, Alberto Barron-Cedeno, Tamer Elsayed, Reem Suwaileh, Wajdi Zaghouani, Spas Kyuchukov, Giovanni Da San Martino, Preslav Nakov

    Abstract: We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 1: Check-Worthiness. The task asks to predict which claims in a political debate should be prioritized for fact-checking. In particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for…

    Submitted 8 August, 2018; originally announced August 2018.

    Comments: Computational journalism, Check-worthiness, Fact-checking, Veracity

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: CLEF-2018

  35. arXiv:1710.01487  [pdf, other]

    cs.CL

    Cross-Language Question Re-Ranking

    Authors: Giovanni Da San Martino, Salvatore Romeo, Alberto Barron-Cedeno, Shafiq Joty, Lluis Marquez, Alessandro Moschitti, Preslav Nakov

    Abstract: We study how to find relevant questions in community forums when the language of the new questions is different from that of the existing questions in the forum. In particular, we explore the Arabic-English language pair. We compare a kernel-based system with a feed-forward neural network in a scenario where a large parallel corpus is available for training a machine translation system, bilingual…

    Submitted 4 October, 2017; originally announced October 2017.

    Comments: SIGIR-2017; Community Question Answering; Cross-language Approaches; Question Retrieval; Kernel-based Methods; Neural Networks; Distributed Representations

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: SIGIR 2017: 1145-1148

  36. arXiv:1710.00689  [pdf, ps, other]

    cs.CL

    Building Chatbots from Forum Data: Model Selection Using Question Answering Metrics

    Authors: Martin Boyanov, Ivan Koychev, Preslav Nakov, Alessandro Moschitti, Giovanni Da San Martino

    Abstract: We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new…

    Submitted 2 October, 2017; originally announced October 2017.

    Comments: RANLP-2017

    MSC Class: 68T50 ACM Class: I.2.7

  37. arXiv:1610.05522  [pdf, other]

    cs.CL

    Addressing Community Question Answering in English and Arabic

    Authors: Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Alessandro Moschitti, Shafiq Joty, Fahad A. Al Obaidli, Kateryna Tymoshenko, Antonio Uva

    Abstract: This paper studies the impact of different types of features applied to learning to re-rank questions in community Question Answering. We tested our models on two datasets released in SemEval-2016 Task 3 on "Community Question Answering". Task 3 targeted real-life Web fora both in English and Arabic. Our models include bag-of-words features (BoW), syntactic tree kernels (TKs), rank features, embed…

    Submitted 18 October, 2016; originally announced October 2016.

    Comments: presented at Second WebQA workshop, SIGIR2016 (http://plg2.cs.uwaterloo.ca/~avtyurin/WebQA2016/)

    ACM Class: I.2.7; H.3.4

  38. Graph Kernels exploiting Weisfeiler-Lehman Graph Isomorphism Test Extensions

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: In this paper we present a novel graph kernel framework inspired by the Weisfeiler-Lehman (WL) isomorphism tests. Any WL test comprises a relabelling phase of the nodes based on test-specific information extracted from the graph, for example the set of neighbours of a node. We defined a novel relabelling and derived two kernels of the framework from it. The novel kernels are very fast to compu…

    Submitted 22 September, 2015; originally announced September 2015.

    Journal ref: Neural Information Processing, Volume 8835 of the series Lecture Notes in Computer Science pp 93-100, 2014 Springer International Publishing

  39. A tree-based kernel for graphs with continuous attributes

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: The availability of graph data with node attributes that can be either discrete or real-valued is constantly increasing. While existing kernel methods are effective techniques for dealing with graphs having discrete node labels, their adaptation to non-discrete or continuous node attributes has been limited, mainly for computational issues. Recently, a few kernels especially tailored for this doma…

    Submitted 20 December, 2016; v1 submitted 3 September, 2015; originally announced September 2015.

    Comments: This work has been submitted to the IEEE Transactions on Neural Networks and Learning Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  40. Ordered Decompositional DAG Kernels Enhancements

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: In this paper, we show how the Ordered Decomposition DAGs (ODD) kernel framework, a framework that allows the definition of graph kernels from tree kernels, makes it easy to define new state-of-the-art graph kernels. Here we consider a fast graph kernel based on the Subtree kernel (ST), and we propose various enhancements to increase its expressiveness. The proposed DAG kernel has the same worst…

    Submitted 28 December, 2015; v1 submitted 13 July, 2015; originally announced July 2015.

    Comments: Paper accepted for publication in Neurocomputing

    Journal ref: Neurocomputing, Volume 192, 5 June 2016, Pages 92--103

  41. arXiv:1507.02158  [pdf, other]

    cs.LG

    An Empirical Study on Budget-Aware Online Kernel Algorithms for Streams of Graphs

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: Kernel methods are considered an effective technique for on-line learning. Many approaches have been developed for compactly representing the dual solution of a kernel method when the problem imposes memory constraints. However, in the literature, no work is specifically tailored to streams of graphs. Motivated by the fact that the size of the feature space representation of many state-of-the-art graph…

    Submitted 20 July, 2016; v1 submitted 8 July, 2015; originally announced July 2015.

    Comments: Author's version of the manuscript, to appear in Neurocomputing (ELSEVIER)