Showing 1–1 of 1 results for author: Tuck, B E

Search v0.5.6 released 2020-02-24

arXiv:2406.17967

cs.CL

Unmasking the Imposters: In-Domain Detection of Human vs. Machine-Generated Tweets

Authors: Bryan E. Tuck, Rakesh M. Verma

Abstract: The rapid development of large language models (LLMs) has significantly improved the generation of fluent and convincing text, raising concerns about their misuse on social media platforms. We present a methodology using Twitter datasets to examine the generative capabilities of four LLMs: Llama 3, Mistral, Qwen2, and GPT4o. We evaluate 7B and 8B parameter base-instruction models of the three open-source LLMs and validate the impact of further fine-tuning and "uncensored" versions. Our findings show that "uncensored" models with additional in-domain fine-tuning dramatically reduce the effectiveness of automated detection methods. This study addresses a gap by exploring smaller open-source models and the effects of "uncensoring," providing insights into how fine-tuning and content moderation influence machine-generated text detection.

Submitted 25 June, 2024; originally announced June 2024.
