Showing 1–2 of 2 results for author: RRV, A

Searching in archive cs.
  1. arXiv:2406.03827  [pdf, other]

    cs.CL

    Chaos with Keywords: Exposing Large Language Models Sycophancy to Misleading Keywords and Evaluating Defense Strategies

    Authors: Aswin RRV, Nemika Tyagi, Md Nayem Uddin, Neeraj Varshney, Chitta Baral

    Abstract: This study explores the sycophantic tendencies of Large Language Models (LLMs), where these models tend to provide answers that match what users want to hear, even if they are not entirely correct. The motivation behind this exploration stems from the common behavior observed in individuals searching the internet for facts with partial or misleading knowledge. Similar to using web search engines,…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: To be published in Findings of ACL 2024

  2. arXiv:2405.16681  [pdf, other]

    cs.CL

    Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization

    Authors: Amir Saeidi, Shivanshu Verma, Aswin RRV, Chitta Baral

    Abstract: Large Language Models (LLMs) perform well across diverse tasks, but aligning them with human demonstrations is challenging. Recently, Reinforcement Learning (RL)-free methods like Direct Preference Optimization (DPO) have emerged, offering improved stability and scalability while retaining competitive performance relative to RL-based methods. However, while RL-free methods deliver satisfactory per…

    Submitted 26 May, 2024; originally announced May 2024.