-
Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation Dialogues
Authors:
Deuksin Kwon,
Emily Weiss,
Tara Kulshrestha,
Kushal Chawla,
Gale M. Lucas,
Jonathan Gratch
Abstract:
A successful negotiation demands a deep comprehension of the conversation context, Theory-of-Mind (ToM) skills to infer the partner's motives, as well as strategic reasoning and effective communication, making it challenging for automated systems. Given the remarkable performance of LLMs across a variety of NLP tasks, in this work, we aim to understand how LLMs can advance different aspects of negotiation research, ranging from designing dialogue systems to providing pedagogical feedback and scaling up data collection practices. To this end, we devise a methodology to analyze the multifaceted capabilities of LLMs across diverse dialogue scenarios covering all the time stages of a typical negotiation interaction. Our analysis adds to the increasing evidence for the superiority of GPT-4 across various tasks while also providing insights into specific tasks that remain difficult for LLMs. For instance, the models correlate poorly with human players when making subjective assessments about the negotiation dialogues and often struggle to generate responses that are contextually appropriate as well as strategically advantageous.
Submitted 21 February, 2024;
originally announced February 2024.
-
Can Language Model Moderators Improve the Health of Online Discourse?
Authors:
Hyundong Cho,
Shuai Liu,
Taiwei Shi,
Darpan Jain,
Basem Rizk,
Yuyang Huang,
Zixun Lu,
Nuan Wen,
Jonathan Gratch,
Emilio Ferrara,
Jonathan May
Abstract:
Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establish a systematic definition of conversational moderation effectiveness grounded on moderation literature and establish design criteria for conducting realistic yet safe evaluation. We then propose a comprehensive evaluation framework to assess models' moderation capabilities independently of human intervention. With our framework, we conduct the first known study of language models as conversational moderators, finding that appropriately prompted models that incorporate insights from social science can provide specific and fair feedback on toxic behavior but struggle to influence users to increase their levels of respect and cooperation.
Submitted 6 May, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Context Unlocks Emotions: Text-based Emotion Classification Dataset Auditing with Large Language Models
Authors:
Daniel Yang,
Aditya Kommineni,
Mohammad Alshehri,
Nilamadhab Mohanty,
Vedant Modi,
Jonathan Gratch,
Shrikanth Narayanan
Abstract:
The lack of contextual information in text data can make the annotation process of text-based emotion classification datasets challenging. As a result, such datasets often contain labels that fail to consider all the relevant emotions in the vocabulary. This misalignment between text inputs and labels can degrade the performance of machine learning models trained on top of them. As re-annotating entire datasets is a costly and time-consuming task that cannot be done at scale, we propose to use the expressive capabilities of large language models to synthesize additional context for input text to increase its alignment with the annotated emotional labels. In this work, we propose a formal definition of textual context to motivate a prompting strategy to enhance such contextual information. We provide both human and empirical evaluation to demonstrate the efficacy of the enhanced context. Our method improves alignment between inputs and their human-annotated labels from both an empirical and human-evaluated standpoint.
Submitted 6 November, 2023;
originally announced November 2023.
-
Be Selfish, But Wisely: Investigating the Impact of Agent Personality in Mixed-Motive Human-Agent Interactions
Authors:
Kushal Chawla,
Ian Wu,
Yu Rong,
Gale M. Lucas,
Jonathan Gratch
Abstract:
A natural way to design a negotiation dialogue system is via self-play RL: train an agent that learns to maximize its performance by interacting with a simulated user that has been designed to imitate human-human dialogue data. Although this procedure has been adopted in prior work, we find that it results in a fundamentally flawed system that fails to learn the value of compromise in a negotiation, which can often lead to no agreements (i.e., the partner walking away without a deal), ultimately hurting the model's overall performance. We investigate this observation in the context of the DealOrNoDeal task, a multi-issue negotiation over books, hats, and balls. Grounded in negotiation theory from Economics, we modify the training procedure in two novel ways to design agents with diverse personalities and analyze their performance with human partners. We find that although both techniques show promise, a selfish agent, which maximizes its own performance while also avoiding walkaways, outperforms the other variants by implicitly learning to generate value for both itself and the negotiation partner. We discuss the implications of our findings for what it means to be a successful negotiation dialogue system and how these systems should be designed in the future.
Submitted 22 October, 2023;
originally announced October 2023.
-
Is GPT a Computational Model of Emotion? Detailed Analysis
Authors:
Ala N. Tak,
Jonathan Gratch
Abstract:
This paper investigates the emotional reasoning abilities of the GPT family of large language models via a component perspective. The paper first examines how the model reasons about autobiographical memories. Second, it systematically varies aspects of situations to impact emotion intensity and coping tendencies. Even without the use of prompt engineering, it is shown that GPT's predictions align significantly with human-provided appraisals and emotional labels. However, GPT faces difficulties predicting emotion intensity and coping responses. GPT-4 showed the highest performance in the initial study but fell short in the second, despite providing superior results after minor prompt engineering. This assessment brings up questions on how to effectively employ the strong points and address the weak areas of these models, particularly concerning response variability. These studies underscore the merits of evaluating models from a componential perspective.
Submitted 25 July, 2023;
originally announced July 2023.
-
Social Influence Dialogue Systems: A Survey of Datasets and Models For Social Influence Tasks
Authors:
Kushal Chawla,
Weiyan Shi,
Jingwen Zhang,
Gale Lucas,
Zhou Yu,
Jonathan Gratch
Abstract:
Dialogue systems capable of social influence such as persuasion, negotiation, and therapy, are essential for extending the use of technology to numerous realistic scenarios. However, existing research primarily focuses on either task-oriented or open-domain scenarios, a categorization that has been inadequate for capturing influence skills systematically. There exists no formal definition or category for dialogue systems with these skills and data-driven efforts in this direction are highly limited. In this work, we formally define and introduce the category of social influence dialogue systems that influence users' cognitive and emotional responses, leading to changes in thoughts, opinions, and behaviors through natural conversations. We present a survey of various tasks, datasets, and methods, compiling the progress across seven diverse domains. We discuss the commonalities and differences between the examined systems, identify limitations, and recommend future directions. This study serves as a comprehensive reference for social influence dialogue systems to inspire more dedicated research and discussion in this emerging area.
Submitted 24 January, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
The Impact of Partner Expressions on Felt Emotion in the Iterated Prisoner's Dilemma: An Event-level Analysis
Authors:
Maria Angelika-Nikita,
Celso M. de Melo,
Kazunori Terada,
Gale Lucas,
Jonathan Gratch
Abstract:
Social games like the prisoner's dilemma are often used to develop models of the role of emotion in social decision-making. Here we examine an understudied aspect of emotion in such games: how an individual's feelings are shaped by their partner's expressions. Prior research has tended to focus on other aspects of emotion. Research on felt-emotion has focused on how an individual's feelings shape how they treat their partner, or whether these feelings are authentically expressed. Research on expressed-emotion has focused on how an individual's decisions are shaped by their partner's expressions, without regard for whether these expressions actually evoke feelings. Here, we use computer-generated characters to examine how an individual's moment-to-moment feelings are shaped by (1) how they are treated by their partner and (2) what their partner expresses during this treatment. Surprisingly, we find that partner expressions are far more important than actions in determining self-reported feelings. In other words, our partner can behave in a selfish and exploitive way, but if they show a collaborative pattern of expressions, we will feel greater pleasure collaborating with them. These results also emphasize the importance of context in determining how someone will feel in response to an expression (i.e., knowing a partner is happy is insufficient; we must know what they are happy-at). We discuss the implications of this work for cognitive-system design, emotion theory, and methodological practice in affective computing.
Submitted 2 July, 2022;
originally announced July 2022.
-
Opponent Modeling in Negotiation Dialogues by Related Data Adaptation
Authors:
Kushal Chawla,
Gale M. Lucas,
Jonathan May,
Jonathan Gratch
Abstract:
Opponent modeling is the task of inferring another party's mental state within the context of social interactions. In a multi-issue negotiation, it involves inferring the relative importance that the opponent assigns to each issue under discussion, which is crucial for finding high-value deals. A practical model for this task needs to infer these priorities of the opponent on the fly based on partial dialogues as input, without needing additional annotations for training. In this work, we propose a ranker for identifying these priorities from negotiation dialogues. The model takes in a partial dialogue as input and predicts the priority order of the opponent. We further devise ways to adapt related data sources for this task to provide more explicit supervision for incorporating the opponent's preferences and offers, as a proxy to relying on granular utterance-level annotations. We show the utility of our proposed approach through extensive experiments based on two dialogue datasets. We find that the proposed data adaptations lead to strong performance in zero-shot and few-shot scenarios. Moreover, they allow the model to perform better than baselines while accessing fewer utterances from the opponent. We release our code to support future work in this direction.
Submitted 3 May, 2022; v1 submitted 30 April, 2022;
originally announced May 2022.
-
Understanding of Emotion Perception from Art
Authors:
Digbalay Bose,
Krishna Somandepalli,
Souvik Kundu,
Rimita Lahiri,
Jonathan Gratch,
Shrikanth Narayanan
Abstract:
Computational modeling of the emotions evoked by art in humans is a challenging problem because of the subjective and nuanced nature of art and affective signals. In this paper, we consider the above-mentioned problem of understanding emotions evoked in viewers by artwork using both text and visual modalities. Specifically, we analyze images and the accompanying text captions from the viewers expressing emotions as a multimodal classification task. Our results show that single-stream multimodal transformer-based models like MMBT and VisualBERT perform better compared to both image-only models and dual-stream multimodal models having separate pathways for text and image modalities. We also observe improvements in performance for extreme positive and negative emotion classes, when a single-stream model like MMBT is compared with a text-only transformer model like BERT.
Submitted 13 October, 2021;
originally announced October 2021.
-
Towards Emotion-Aware Agents For Negotiation Dialogues
Authors:
Kushal Chawla,
Rene Clever,
Jaysa Ramirez,
Gale Lucas,
Jonathan Gratch
Abstract:
Negotiation is a complex social interaction that encapsulates emotional encounters in human decision-making. Virtual agents that can negotiate with humans are useful in pedagogy and conversational AI. To advance the development of such agents, we explore the prediction of two important subjective goals in a negotiation - outcome satisfaction and partner perception. Specifically, we analyze the extent to which emotion attributes extracted from the negotiation help in the prediction, above and beyond the individual difference variables. We focus on a recent dataset in chat-based negotiations, grounded in a realistic camping scenario. We study three degrees of emotion dimensions - emoticons, lexical, and contextual by leveraging affective lexicons and a state-of-the-art deep learning architecture. Our insights will be helpful in designing adaptive negotiation agents that interact through realistic communication interfaces.
Submitted 28 July, 2021;
originally announced July 2021.
-
CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automatic Negotiation Systems
Authors:
Kushal Chawla,
Jaysa Ramirez,
Rene Clever,
Gale Lucas,
Jonathan May,
Jonathan Gratch
Abstract:
Automated systems that negotiate with humans have broad applications in pedagogy and conversational AI. To advance the development of practical negotiation systems, we present CaSiNo: a novel corpus of over a thousand negotiation dialogues in English. Participants take the role of campsite neighbors and negotiate for food, water, and firewood packages for their upcoming trip. Our design results in diverse and linguistically rich negotiations while maintaining a tractable, closed-domain environment. Inspired by the literature in human-human negotiations, we annotate persuasion strategies and perform correlation analysis to understand how the dialogue behaviors are associated with the negotiation performance. We further propose and evaluate a multi-task framework to recognize these strategies in a given utterance. We find that multi-task learning substantially improves the performance for all strategy labels, especially for the ones that are the most skewed. We release the dataset, annotations, and the code to propel future work in human-machine negotiations: https://github.com/kushalchawla/CaSiNo
Submitted 28 April, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
-
Exploring Early Prediction of Buyer-Seller Negotiation Outcomes
Authors:
Kushal Chawla,
Gale Lucas,
Jonathan May,
Jonathan Gratch
Abstract:
Agents that negotiate with humans find broad applications in pedagogy and conversational AI. Most efforts in human-agent negotiations rely on restrictive menu-driven interfaces for communication. To advance the research in language-based negotiation systems, we explore a novel task of early prediction of buyer-seller negotiation outcomes, by varying the fraction of utterances that the model can access. We explore the feasibility of early prediction by using traditional feature-based methods, as well as by incorporating the non-linguistic task context into a pretrained language model using sentence templates. We further quantify the extent to which linguistic features help in making better predictions apart from the task-specific price information. Finally, probing the pretrained model helps us to identify specific features, such as trust and agreement, that contribute to the prediction performance.
Submitted 25 February, 2021; v1 submitted 5 April, 2020;
originally announced April 2020.
-
AVEC 2016 - Depression, Mood, and Emotion Recognition Workshop and Challenge
Authors:
Michel Valstar,
Jonathan Gratch,
Bjorn Schuller,
Fabien Ringeval,
Denis Lalanne,
Mercedes Torres Torres,
Stefan Scherer,
Guiota Stratou,
Roddy Cowie,
Maja Pantic
Abstract:
The Audio/Visual Emotion Challenge and Workshop (AVEC 2016) "Depression, Mood and Emotion" will be the sixth competition event aimed at comparison of multimedia processing and machine learning methods for automatic audio, visual and physiological depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multi-modal information processing and to bring together the depression and emotion recognition communities, as well as the audio, video and physiological processing communities, to compare the relative merits of the various approaches to depression and emotion recognition under well-defined and strictly comparable conditions and establish to what extent fusion of the approaches is possible and beneficial. This paper presents the challenge guidelines, the common data used, and the performance of the baseline system on the two tasks.
Submitted 22 November, 2016; v1 submitted 5 May, 2016;
originally announced May 2016.
-
Adaptive Problem-solving for Large-scale Scheduling Problems: A Case Study
Authors:
J. Gratch,
S. Chien
Abstract:
Although most scheduling problems are NP-hard, domain-specific techniques perform well in practice but are quite expensive to construct. In adaptive problem solving, domain-specific knowledge is acquired automatically for a general problem solver with a flexible control architecture. In this approach, a learning system explores a space of possible heuristic methods for one well-suited to the eccentricities of the given domain and problem distribution. In this article, we discuss an application of the approach to scheduling satellite communications. Using problem distributions based on actual mission requirements, our approach identifies strategies that not only decrease the amount of CPU time required to produce schedules, but also increase the percentage of problems that are solvable within computational resource limitations.
Submitted 30 April, 1996;
originally announced May 1996.