The Good, the Bad, and the Hulk-like GPT: Analyzing Emotional Decisions of Large Language Models in Cooperation and Bargaining Games
Authors:
Mikhail Mozikov,
Nikita Severin,
Valeria Bodishtianu,
Maria Glushanina,
Mikhail Baklashkin,
Andrey V. Savchenko,
Ilya Makarov
Abstract:
Behavior study experiments are an important part of society modeling and understanding human interactions. In practice, many behavioral experiments encounter challenges related to internal and external validity, reproducibility, and social bias due to the complexity of social interactions and cooperation in human user studies. Recent advances in Large Language Models (LLMs) have provided researche…
▽ More
Behavior study experiments are an important part of society modeling and understanding human interactions. In practice, many behavioral experiments encounter challenges related to internal and external validity, reproducibility, and social bias due to the complexity of social interactions and cooperation in human user studies. Recent advances in Large Language Models (LLMs) have provided researchers with a new promising tool for the simulation of human behavior. However, existing LLM-based simulations operate under the unproven hypothesis that LLM agents behave similarly to humans as well as ignore a crucial factor in human decision-making: emotions.
In this paper, we introduce a novel methodology and the framework to study both, the decision-making of LLMs and their alignment with human behavior under emotional states. Experiments with GPT-3.5 and GPT-4 on four games from two different classes of behavioral game theory showed that emotions profoundly impact the performance of LLMs, leading to the development of more optimal strategies. While there is a strong alignment between the behavioral responses of GPT-3.5 and human participants, particularly evident in bargaining games, GPT-4 exhibits consistent behavior, ignoring induced emotions for rationality decisions. Surprisingly, emotional prompting, particularly with `anger' emotion, can disrupt the "superhuman" alignment of GPT-4, resembling human emotional responses.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
Temporal Graph Network Embedding with Causal Anonymous Walks Representations
Authors:
Ilya Makarov,
Andrey Savchenko,
Arseny Korovko,
Leonid Sherstyuk,
Nikita Severin,
Aleksandr Mikheev,
Dmitrii Babaev
Abstract:
Many tasks in graph machine learning, such as link prediction and node classification, are typically solved by using representation learning, in which each node or edge in the network is encoded via an embedding. Though there exists a lot of network embeddings for static graphs, the task becomes much more complicated when the dynamic (i.e. temporal) network is analyzed. In this paper, we propose a…
▽ More
Many tasks in graph machine learning, such as link prediction and node classification, are typically solved by using representation learning, in which each node or edge in the network is encoded via an embedding. Though there exists a lot of network embeddings for static graphs, the task becomes much more complicated when the dynamic (i.e. temporal) network is analyzed. In this paper, we propose a novel approach for dynamic network representation learning based on Temporal Graph Network by using a highly custom message generating function by extracting Causal Anonymous Walks. For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings. This work provides the first comprehensive comparison framework for temporal network representation learning in every available setting for graph machine learning problems involving node classification and link prediction. The proposed model outperforms state-of-the-art baseline models. The work also justifies the difference between them based on evaluation in various transductive/inductive edge/node classification tasks. In addition, we show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks, involving credit scoring based on transaction data.
△ Less
Submitted 24 August, 2021; v1 submitted 19 August, 2021;
originally announced August 2021.