Showing 1–13 of 13 results for author: Moro, R

  1. arXiv:2406.12549  [pdf, other]

    cs.CL cs.AI

    MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts

    Authors: Dominik Macko, Jakub Kopal, Robert Moro, Ivan Srba

    Abstract: Recent LLMs are able to generate high-quality multilingual texts, indistinguishable for humans from authentic human-written ones. Research in machine-generated text detection is however mostly focused on the English language and longer texts, such as news articles, scientific papers or student essays. Social-media texts are usually much shorter and often feature informal language, grammatical erro…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2401.07867  [pdf, other]

    cs.CL

    Authorship Obfuscation in Multilingual Machine-Generated Text Detection

    Authors: Dominik Macko, Robert Moro, Adaku Uchendu, Ivan Srba, Jason Samuel Lucas, Michiharu Yamashita, Nafis Irtiza Tripto, Dongwon Lee, Jakub Simko, Maria Bielikova

    Abstract: High-quality text generation capability of recent Large Language Models (LLMs) causes concerns about their misuse (e.g., in massive generation/spread of disinformation). Machine-generated text (MGT) detection is important to cope with such threats. However, it is susceptible to authorship obfuscation (AO) methods, such as paraphrasing, which can cause MGTs to evade detection. So far, this was eval…

    Submitted 18 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

  3. arXiv:2311.08838  [pdf, other]

    cs.CL

    Disinformation Capabilities of Large Language Models

    Authors: Ivan Vykopal, Matúš Pikuliak, Ivan Srba, Robert Moro, Dominik Macko, Maria Bielikova

    Abstract: Automated disinformation generation is often listed as an important risk associated with large language models (LLMs). The theoretical ability to flood the information space with disinformation content might have dramatic consequences for societies around the world. This paper presents a comprehensive study of the disinformation capabilities of the current generation of LLMs to generate false news…

    Submitted 23 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  4. arXiv:2311.08374  [pdf, other]

    cs.CL

    A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

    Authors: Nafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Robert Moro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee

    Abstract: In the realm of text manipulation and linguistic transformation, the question of authorship has been a subject of fascination and philosophical inquiry. Much like the Ship of Theseus paradox, which ponders whether a ship remains the same when each of its original planks is replaced, our research delves into an intriguing question: Does a text retain its original authorship when it undergoes numero…

    Submitted 6 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: To appear in Association for Computational Linguistics (ACL 2024)

  5. arXiv:2311.06121  [pdf, other]

    cs.CL

    Is it indeed bigger better? The comprehensive study of claim detection LMs applied for disinformation tackling

    Authors: Martin Hyben, Sebastian Kula, Ivan Srba, Robert Moro, Jakub Simko

    Abstract: This study compares the performance of (1) fine-tuned models and (2) extremely large language models on the task of check-worthy claim detection. For the purpose of the comparison we composed a multilingual and multi-topical dataset comprising texts of various sources and styles. Building on this, we performed a benchmark analysis to determine the most general multilingual and multi-topical claim…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: 27 pages, 10 figures

  6. MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

    Authors: Dominik Macko, Robert Moro, Adaku Uchendu, Jason Samuel Lucas, Michiharu Yamashita, Matúš Pikuliak, Ivan Srba, Thai Le, Dongwon Lee, Jakub Simko, Maria Bielikova

    Abstract: There is a lack of research into capabilities of recent LLMs to generate convincing text in languages other than English and into performance of detectors of machine-generated text in multilingual settings. This is also reflected in the available benchmarks which lack authentic texts in languages other than English and predominantly cover older generators. To fill this gap, we introduce MULTITuDE,…

    Submitted 20 October, 2023; originally announced October 2023.

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

  7. Multilingual Previously Fact-Checked Claim Retrieval

    Authors: Matúš Pikuliak, Ivan Srba, Robert Moro, Timo Hromadka, Timotej Smolen, Martin Melisek, Ivan Vykopal, Jakub Simko, Juraj Podrouzek, Maria Bielikova

    Abstract: Fact-checkers are often hampered by the sheer amount of online content that needs to be fact-checked. NLP can help them by retrieving already existing fact-checks relevant to the content being investigated. This paper introduces a new multilingual dataset -- MultiClaim -- for previously fact-checked claim retrieval. We collected 28k posts in 27 languages from social media, 206k fact-checks in 39 l…

    Submitted 13 October, 2023; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2023

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

  8. Eye Tracking as a Source of Implicit Feedback in Recommender Systems: A Preliminary Analysis

    Authors: Santiago de Leon-Martinez, Robert Moro, Maria Bielikova

    Abstract: Eye tracking in recommender systems can provide an additional source of implicit feedback, while helping to evaluate other sources of feedback. In this study, we use eye tracking data to inform a collaborative filtering model for movie recommendation providing an improvement over the click-based implementations and additionally analyze the area of interest (AOI) duration as related to the known in…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Paper accepted to Eyes4ICU workshop at ETRA 2023

  9. arXiv:2211.12143  [pdf]

    cs.CY cs.AI cs.HC

    Automated, not Automatic: Needs and Practices in European Fact-checking Organizations as a basis for Designing Human-centered AI Systems

    Authors: Andrea Hrckova, Robert Moro, Ivan Srba, Jakub Simko, Maria Bielikova

    Abstract: To mitigate the negative effects of false information more effectively, the development of automated AI (artificial intelligence) tools assisting fact-checkers is needed. Despite the existing research, there is still a gap between the fact-checking practitioners' needs and pains and the current AI research. We aspire to bridge this gap by employing methods of information behavior research to ident…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: 41 pages, 13 figures, 1 table, 2 annexes

  10. arXiv:2210.10085  [pdf, other]

    cs.IR cs.LG cs.SI

    Auditing YouTube's Recommendation Algorithm for Misinformation Filter Bubbles

    Authors: Ivan Srba, Robert Moro, Matus Tomlein, Branislav Pecher, Jakub Simko, Elena Stefancova, Michal Kompan, Andrea Hrckova, Juraj Podrouzek, Adrian Gavornik, Maria Bielikova

    Abstract: In this paper, we present results of an auditing study performed over YouTube aimed at investigating how fast a user can get into a misinformation filter bubble, but also what it takes to "burst the bubble", i.e., revert the bubble enclosure. We employ a sock puppet audit methodology, in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching misi…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: Just accepted to ACM Transactions on Recommender Systems (ACM TORS). arXiv admin note: substantial text overlap with arXiv:2203.13769

    Journal ref: ACM Transactions on Recommender Systems. 1, 1, Article 6 (March 2023), 33 pages

  11. arXiv:2204.12294  [pdf, other]

    cs.CL cs.CY cs.IR cs.LG

    Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims

    Authors: Ivan Srba, Branislav Pecher, Matus Tomlein, Robert Moro, Elena Stefancova, Jakub Simko, Maria Bielikova

    Abstract: False information has a significant negative influence on individuals as well as on the whole society. Especially in the current COVID-19 era, we witness an unprecedented growth of medical misinformation. To help tackle this problem with machine learning approaches, we are publishing a feature-rich dataset of approx. 317k medical news articles/blogs and 3.5k fact-checked claims. It also contains 5…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: 11 pages, 4 figures, SIGIR 2022 Resource paper track

    Journal ref: ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022)

  12. An Audit of Misinformation Filter Bubbles on YouTube: Bubble Bursting and Recent Behavior Changes

    Authors: Matus Tomlein, Branislav Pecher, Jakub Simko, Ivan Srba, Robert Moro, Elena Stefancova, Michal Kompan, Andrea Hrckova, Juraj Podrouzek, Maria Bielikova

    Abstract: The negative effects of misinformation filter bubbles in adaptive systems have been known to researchers for some time. Several studies investigated, most prominently on YouTube, how fast a user can get into a misinformation filter bubble simply by selecting wrong choices from the items offered. Yet, no studies so far have investigated what it takes to burst the bubble, i.e., revert the bubble enc…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: RecSys '21: Fifteenth ACM Conference on Recommender Systems

    Journal ref: RecSys '21: Fifteenth ACM Conference on Recommender Systems, 2021

  13. arXiv:2109.12523  [pdf, other]

    cs.HC cs.CY cs.LG cs.SI

    A Study of Fake News Reading and Annotating in Social Media Context

    Authors: Jakub Simko, Patrik Racsko, Matus Tomlein, Martin Hanakova, Robert Moro, Maria Bielikova

    Abstract: The online spreading of fake news is a major issue threatening entire societies. Much of this spreading is enabled by new media formats, namely social networks and online media sites. Researchers and practitioners have been trying to answer this by characterizing the fake news and devising automated methods for detecting them. The detection methods had so far only limited success, mostly due to th…

    Submitted 26 April, 2022; v1 submitted 26 September, 2021; originally announced September 2021.

    ACM Class: H.5.2; H.5.4; K.4.2; H.3.1

    Journal ref: New Review of Hypermedia and Multimedia. pages 1-31 (2021)