Search | arXiv e-print repository

Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation Dialogues

Authors: Deuksin Kwon, Emily Weiss, Tara Kulshrestha, Kushal Chawla, Gale M. Lucas, Jonathan Gratch

Abstract: A successful negotiation demands a deep comprehension of the conversation context, Theory-of-Mind (ToM) skills to infer the partner's motives, as well as strategic reasoning and effective communication, making it challenging for automated systems. Given the remarkable performance of LLMs across a variety of NLP tasks, in this work, we aim to understand how LLMs can advance different aspects of neg… ▽ More A successful negotiation demands a deep comprehension of the conversation context, Theory-of-Mind (ToM) skills to infer the partner's motives, as well as strategic reasoning and effective communication, making it challenging for automated systems. Given the remarkable performance of LLMs across a variety of NLP tasks, in this work, we aim to understand how LLMs can advance different aspects of negotiation research, ranging from designing dialogue systems to providing pedagogical feedback and scaling up data collection practices. To this end, we devise a methodology to analyze the multifaceted capabilities of LLMs across diverse dialogue scenarios covering all the time stages of a typical negotiation interaction. Our analysis adds to the increasing evidence for the superiority of GPT-4 across various tasks while also providing insights into specific tasks that remain difficult for LLMs. For instance, the models correlate poorly with human players when making subjective assessments about the negotiation dialogues and often struggle to generate responses that are contextually appropriate as well as strategically advantageous. △ Less

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2311.10781 [pdf, other]

Can Language Model Moderators Improve the Health of Online Discourse?

Authors: Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

Abstract: Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establis… ▽ More Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establish a systematic definition of conversational moderation effectiveness grounded on moderation literature and establish design criteria for conducting realistic yet safe evaluation. We then propose a comprehensive evaluation framework to assess models' moderation capabilities independently of human intervention. With our framework, we conduct the first known study of language models as conversational moderators, finding that appropriately prompted models that incorporate insights from social science can provide specific and fair feedback on toxic behavior but struggle to influence users to increase their levels of respect and cooperation. △ Less

Submitted 6 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 9 pages, NAACL 2024 Main

arXiv:2311.03551 [pdf, other]

Context Unlocks Emotions: Text-based Emotion Classification Dataset Auditing with Large Language Models

Authors: Daniel Yang, Aditya Kommineni, Mohammad Alshehri, Nilamadhab Mohanty, Vedant Modi, Jonathan Gratch, Shrikanth Narayanan

Abstract: The lack of contextual information in text data can make the annotation process of text-based emotion classification datasets challenging. As a result, such datasets often contain labels that fail to consider all the relevant emotions in the vocabulary. This misalignment between text inputs and labels can degrade the performance of machine learning models trained on top of them. As re-annotating e… ▽ More The lack of contextual information in text data can make the annotation process of text-based emotion classification datasets challenging. As a result, such datasets often contain labels that fail to consider all the relevant emotions in the vocabulary. This misalignment between text inputs and labels can degrade the performance of machine learning models trained on top of them. As re-annotating entire datasets is a costly and time-consuming task that cannot be done at scale, we propose to use the expressive capabilities of large language models to synthesize additional context for input text to increase its alignment with the annotated emotional labels. In this work, we propose a formal definition of textual context to motivate a prompting strategy to enhance such contextual information. We provide both human and empirical evaluation to demonstrate the efficacy of the enhanced context. Our method improves alignment between inputs and their human-annotated labels from both an empirical and human-evaluated standpoint. △ Less

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2310.14404 [pdf, other]

Be Selfish, But Wisely: Investigating the Impact of Agent Personality in Mixed-Motive Human-Agent Interactions

Authors: Kushal Chawla, Ian Wu, Yu Rong, Gale M. Lucas, Jonathan Gratch

Abstract: A natural way to design a negotiation dialogue system is via self-play RL: train an agent that learns to maximize its performance by interacting with a simulated user that has been designed to imitate human-human dialogue data. Although this procedure has been adopted in prior work, we find that it results in a fundamentally flawed system that fails to learn the value of compromise in a negotiatio… ▽ More A natural way to design a negotiation dialogue system is via self-play RL: train an agent that learns to maximize its performance by interacting with a simulated user that has been designed to imitate human-human dialogue data. Although this procedure has been adopted in prior work, we find that it results in a fundamentally flawed system that fails to learn the value of compromise in a negotiation, which can often lead to no agreements (i.e., the partner walking away without a deal), ultimately hurting the model's overall performance. We investigate this observation in the context of the DealOrNoDeal task, a multi-issue negotiation over books, hats, and balls. Grounded in negotiation theory from Economics, we modify the training procedure in two novel ways to design agents with diverse personalities and analyze their performance with human partners. We find that although both techniques show promise, a selfish agent, which maximizes its own performance while also avoiding walkaways, performs superior to other variants by implicitly learning to generate value for both itself and the negotiation partner. We discuss the implications of our findings for what it means to be a successful negotiation dialogue system and how these systems should be designed in the future. △ Less

Submitted 22 October, 2023; originally announced October 2023.

Comments: Accepted at EMNLP 2023 (Main)

arXiv:2307.13779 [pdf]

Is GPT a Computational Model of Emotion? Detailed Analysis

Authors: Ala N. Tak, Jonathan Gratch

Abstract: This paper investigates the emotional reasoning abilities of the GPT family of large language models via a component perspective. The paper first examines how the model reasons about autobiographical memories. Second, it systematically varies aspects of situations to impact emotion intensity and co** tendencies. Even without the use of prompt engineering, it is shown that GPT's predictions align… ▽ More This paper investigates the emotional reasoning abilities of the GPT family of large language models via a component perspective. The paper first examines how the model reasons about autobiographical memories. Second, it systematically varies aspects of situations to impact emotion intensity and co** tendencies. Even without the use of prompt engineering, it is shown that GPT's predictions align significantly with human-provided appraisals and emotional labels. However, GPT faces difficulties predicting emotion intensity and co** responses. GPT-4 showed the highest performance in the initial study but fell short in the second, despite providing superior results after minor prompt engineering. This assessment brings up questions on how to effectively employ the strong points and address the weak areas of these models, particularly concerning response variability. These studies underscore the merits of evaluating models from a componential perspective. △ Less

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2210.05664 [pdf, other]

Social Influence Dialogue Systems: A Survey of Datasets and Models For Social Influence Tasks

Authors: Kushal Chawla, Weiyan Shi, **gwen Zhang, Gale Lucas, Zhou Yu, Jonathan Gratch

Abstract: Dialogue systems capable of social influence such as persuasion, negotiation, and therapy, are essential for extending the use of technology to numerous realistic scenarios. However, existing research primarily focuses on either task-oriented or open-domain scenarios, a categorization that has been inadequate for capturing influence skills systematically. There exists no formal definition or categ… ▽ More Dialogue systems capable of social influence such as persuasion, negotiation, and therapy, are essential for extending the use of technology to numerous realistic scenarios. However, existing research primarily focuses on either task-oriented or open-domain scenarios, a categorization that has been inadequate for capturing influence skills systematically. There exists no formal definition or category for dialogue systems with these skills and data-driven efforts in this direction are highly limited. In this work, we formally define and introduce the category of social influence dialogue systems that influence users' cognitive and emotional responses, leading to changes in thoughts, opinions, and behaviors through natural conversations. We present a survey of various tasks, datasets, and methods, compiling the progress across seven diverse domains. We discuss the commonalities and differences between the examined systems, identify limitations, and recommend future directions. This study serves as a comprehensive reference for social influence dialogue systems to inspire more dedicated research and discussion in this emerging area. △ Less

Submitted 24 January, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: Accepted at EACL 2023

arXiv:2207.00925 [pdf]

The Impact of Partner Expressions on Felt Emotion in the Iterated Prisoner's Dilemma: An Event-level Analysis

Authors: Maria Angelika-Nikita, Celso M. de Melo, Kazunori Terada, Gale Lucas, Jonathan Gratch

Abstract: Social games like the prisoner's dilemma are often used to develop models of the role of emotion in social decision-making. Here we examine an understudied aspect of emotion in such games: how an individual's feelings are shaped by their partner's expressions. Prior research has tended to focus on other aspects of emotion. Research on felt-emotion has focused on how an individual's feelings shape… ▽ More Social games like the prisoner's dilemma are often used to develop models of the role of emotion in social decision-making. Here we examine an understudied aspect of emotion in such games: how an individual's feelings are shaped by their partner's expressions. Prior research has tended to focus on other aspects of emotion. Research on felt-emotion has focused on how an individual's feelings shape how they treat their partner, or whether these feelings are authentically expressed. Research on expressed-emotion has focused on how an individual's decisions are shaped by their partner's expressions, without regard for whether these expressions actually evoke feelings. Here, we use computer-generated characters to examine how an individual's moment-to-moment feelings are shaped by (1) how they are treated by their partner and (2) what their partner expresses during this treatment. Surprisingly, we find that partner expressions are far more important than actions in determining self-reported feelings. In other words, our partner can behave in a selfish and exploitive way, but if they show a collaborative pattern of expressions, we will feel greater pleasure collaborating with them. These results also emphasize the importance of context in determining how someone will feel in response to an expression (i.e., knowing a partner is happy is insufficient; we must know what they are happy-at). We discuss the implications of this work for cognitive-system design, emotion theory, and methodological practice in affective computing. △ Less

Submitted 2 July, 2022; originally announced July 2022.

Comments: 18 pages, 7 figures, Ninth Annual Conference on Advances in Cognitive Systems

arXiv:2205.00344 [pdf, other]

Opponent Modeling in Negotiation Dialogues by Related Data Adaptation

Authors: Kushal Chawla, Gale M. Lucas, Jonathan May, Jonathan Gratch

Abstract: Opponent modeling is the task of inferring another party's mental state within the context of social interactions. In a multi-issue negotiation, it involves inferring the relative importance that the opponent assigns to each issue under discussion, which is crucial for finding high-value deals. A practical model for this task needs to infer these priorities of the opponent on the fly based on part… ▽ More Opponent modeling is the task of inferring another party's mental state within the context of social interactions. In a multi-issue negotiation, it involves inferring the relative importance that the opponent assigns to each issue under discussion, which is crucial for finding high-value deals. A practical model for this task needs to infer these priorities of the opponent on the fly based on partial dialogues as input, without needing additional annotations for training. In this work, we propose a ranker for identifying these priorities from negotiation dialogues. The model takes in a partial dialogue as input and predicts the priority order of the opponent. We further devise ways to adapt related data sources for this task to provide more explicit supervision for incorporating the opponent's preferences and offers, as a proxy to relying on granular utterance-level annotations. We show the utility of our proposed approach through extensive experiments based on two dialogue datasets. We find that the proposed data adaptations lead to strong performance in zero-shot and few-shot scenarios. Moreover, they allow the model to perform better than baselines while accessing fewer utterances from the opponent. We release our code to support future work in this direction. △ Less

Submitted 3 May, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

Comments: Appearing at Findings of NAACL 2022

arXiv:2110.06486 [pdf, other]

Understanding of Emotion Perception from Art

Authors: Digbalay Bose, Krishna Somandepalli, Souvik Kundu, Rimita Lahiri, Jonathan Gratch, Shrikanth Narayanan

Abstract: Computational modeling of the emotions evoked by art in humans is a challenging problem because of the subjective and nuanced nature of art and affective signals. In this paper, we consider the above-mentioned problem of understanding emotions evoked in viewers by artwork using both text and visual modalities. Specifically, we analyze images and the accompanying text captions from the viewers expr… ▽ More Computational modeling of the emotions evoked by art in humans is a challenging problem because of the subjective and nuanced nature of art and affective signals. In this paper, we consider the above-mentioned problem of understanding emotions evoked in viewers by artwork using both text and visual modalities. Specifically, we analyze images and the accompanying text captions from the viewers expressing emotions as a multimodal classification task. Our results show that single-stream multimodal transformer-based models like MMBT and VisualBERT perform better compared to both image-only models and dual-stream multimodal models having separate pathways for text and image modalities. We also observe improvements in performance for extreme positive and negative emotion classes, when a single-stream model like MMBT is compared with a text-only transformer model like BERT. △ Less

Submitted 13 October, 2021; originally announced October 2021.

Comments: 5 pages, 5 figures. Accepted at ICCV2021: 4th Workshop on Closing the loop between Vision and Language

arXiv:2107.13165 [pdf, other]

Towards Emotion-Aware Agents For Negotiation Dialogues

Authors: Kushal Chawla, Rene Clever, Jaysa Ramirez, Gale Lucas, Jonathan Gratch

Abstract: Negotiation is a complex social interaction that encapsulates emotional encounters in human decision-making. Virtual agents that can negotiate with humans are useful in pedagogy and conversational AI. To advance the development of such agents, we explore the prediction of two important subjective goals in a negotiation - outcome satisfaction and partner perception. Specifically, we analyze the ext… ▽ More Negotiation is a complex social interaction that encapsulates emotional encounters in human decision-making. Virtual agents that can negotiate with humans are useful in pedagogy and conversational AI. To advance the development of such agents, we explore the prediction of two important subjective goals in a negotiation - outcome satisfaction and partner perception. Specifically, we analyze the extent to which emotion attributes extracted from the negotiation help in the prediction, above and beyond the individual difference variables. We focus on a recent dataset in chat-based negotiations, grounded in a realistic cam** scenario. We study three degrees of emotion dimensions - emoticons, lexical, and contextual by leveraging affective lexicons and a state-of-the-art deep learning architecture. Our insights will be helpful in designing adaptive negotiation agents that interact through realistic communication interfaces. △ Less

Submitted 28 July, 2021; originally announced July 2021.

Comments: Accepted at 9th International Conference on Affective Computing & Intelligent Interaction (ACII 2021)

arXiv:2103.15721 [pdf, other]

CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automatic Negotiation Systems

Authors: Kushal Chawla, Jaysa Ramirez, Rene Clever, Gale Lucas, Jonathan May, Jonathan Gratch

Abstract: Automated systems that negotiate with humans have broad applications in pedagogy and conversational AI. To advance the development of practical negotiation systems, we present CaSiNo: a novel corpus of over a thousand negotiation dialogues in English. Participants take the role of campsite neighbors and negotiate for food, water, and firewood packages for their upcoming trip. Our design results in… ▽ More Automated systems that negotiate with humans have broad applications in pedagogy and conversational AI. To advance the development of practical negotiation systems, we present CaSiNo: a novel corpus of over a thousand negotiation dialogues in English. Participants take the role of campsite neighbors and negotiate for food, water, and firewood packages for their upcoming trip. Our design results in diverse and linguistically rich negotiations while maintaining a tractable, closed-domain environment. Inspired by the literature in human-human negotiations, we annotate persuasion strategies and perform correlation analysis to understand how the dialogue behaviors are associated with the negotiation performance. We further propose and evaluate a multi-task framework to recognize these strategies in a given utterance. We find that multi-task learning substantially improves the performance for all strategy labels, especially for the ones that are the most skewed. We release the dataset, annotations, and the code to propel future work in human-machine negotiations: https://github.com/kushalchawla/CaSiNo △ Less

Submitted 28 April, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

Comments: Accepted at NAACL 2021

arXiv:2004.02363 [pdf, ps, other]

Exploring Early Prediction of Buyer-Seller Negotiation Outcomes

Authors: Kushal Chawla, Gale Lucas, Jonathan May, Jonathan Gratch

Abstract: Agents that negotiate with humans find broad applications in pedagogy and conversational AI. Most efforts in human-agent negotiations rely on restrictive menu-driven interfaces for communication. To advance the research in language-based negotiation systems, we explore a novel task of early prediction of buyer-seller negotiation outcomes, by varying the fraction of utterances that the model can ac… ▽ More Agents that negotiate with humans find broad applications in pedagogy and conversational AI. Most efforts in human-agent negotiations rely on restrictive menu-driven interfaces for communication. To advance the research in language-based negotiation systems, we explore a novel task of early prediction of buyer-seller negotiation outcomes, by varying the fraction of utterances that the model can access. We explore the feasibility of early prediction by using traditional feature-based methods, as well as by incorporating the non-linguistic task context into a pretrained language model using sentence templates. We further quantify the extent to which linguistic features help in making better predictions apart from the task-specific price information. Finally, probing the pretrained model helps us to identify specific features, such as trust and agreement, that contribute to the prediction performance. △ Less

Submitted 25 February, 2021; v1 submitted 5 April, 2020; originally announced April 2020.

ACM Class: I.2.7

arXiv:1605.01600 [pdf, other]

AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge

Authors: Michel Valstar, Jonathan Gratch, Bjorn Schuller, Fabien Ringeval, Denis Lalanne, Mercedes Torres Torres, Stefan Scherer, Guiota Stratou, Roddy Cowie, Maja Pantic

Abstract: The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) "Depression, Mood and Emotion" will be the sixth competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and physiological depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common ben… ▽ More The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) "Depression, Mood and Emotion" will be the sixth competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and physiological depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multi-modal information processing and to bring together the depression and emotion recognition communities, as well as the audio, video and physiological processing communities, to compare the relative merits of the various approaches to depression and emotion recognition under well-defined and strictly comparable conditions and establish to what extent fusion of the approaches is possible and beneficial. This paper presents the challenge guidelines, the common data used, and the performance of the baseline system on the two tasks. △ Less

Submitted 22 November, 2016; v1 submitted 5 May, 2016; originally announced May 2016.

Comments: Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, AVEC'16, co-located with the 24th ACM International Conference on Multimedia, MM 2016, pages 3-10, Amsterdam, The Netherlands, October 2016. ACM

arXiv:cs/9605104 [pdf, ps]

Adaptive Problem-solving for Large-scale Scheduling Problems: A Case Study

Authors: J. Gratch, S. Chien

Abstract: Although most scheduling problems are NP-hard, domain specific techniques perform well in practice but are quite expensive to construct. In adaptive problem-solving solving, domain specific knowledge is acquired automatically for a general problem solver with a flexible control architecture. In this approach, a learning system explores a space of possible heuristic methods for one well-suited to… ▽ More Although most scheduling problems are NP-hard, domain specific techniques perform well in practice but are quite expensive to construct. In adaptive problem-solving solving, domain specific knowledge is acquired automatically for a general problem solver with a flexible control architecture. In this approach, a learning system explores a space of possible heuristic methods for one well-suited to the eccentricities of the given domain and problem distribution. In this article, we discuss an application of the approach to scheduling satellite communications. Using problem distributions based on actual mission requirements, our approach identifies strategies that not only decrease the amount of CPU time required to produce schedules, but also increase the percentage of problems that are solvable within computational resource limitations. △ Less

Submitted 30 April, 1996; originally announced May 1996.

Comments: See http://www.jair.org/ for any accompanying files

Journal ref: Journal of Artificial Intelligence Research, Vol 4, (1996), 365-396

Showing 1–14 of 14 results for author: Gratch, J