Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies

RRV, Aswin; Tyagi, Nemika; Uddin, Md Nayem; Varshney, Neeraj; Baral, Chitta

arXiv:2406.03827 (cs)

[Submitted on 6 Jun 2024]

Abstract: This study explores the sycophantic tendencies of Large Language Models (LLMs), where these models tend to provide answers that match what users want to hear, even if they are not entirely correct. The motivation behind this exploration stems from the common behavior observed in individuals searching the internet for facts with partial or misleading knowledge. Similar to using web search engines, users may recall fragments of misleading keywords and submit them to an LLM, hoping for a comprehensive response. Our empirical analysis of several LLMs shows the potential danger of these models amplifying misinformation when presented with misleading keywords. Additionally, we thoroughly assess four existing hallucination mitigation strategies to reduce LLMs' sycophantic behavior. Our experiments demonstrate the effectiveness of these strategies for generating factually correct statements. Furthermore, our analyses delve into knowledge-probing experiments on factual keywords and different categories of sycophancy mitigation.

Comments:	To be published in Findings of ACL 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2406.03827 [cs.CL]
	(or arXiv:2406.03827v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.03827

Submission history

From: Aswin Ravikumar Rangasamy Veerasamy
[v1] Thu, 6 Jun 2024 08:03:05 UTC (6,759 KB)
