
Showing 1–7 of 7 results for author: Abdulhai, M

  1. arXiv:2407.06576  [pdf, other]

    cs.CL cs.AI

    Virtual Personas for Language Models via an Anthology of Backstories

    Authors: Suhong Moon, Marwa Abdulhai, Minwoo Kang, Joseph Suh, Widyadewi Soedarmadji, Eran Kohen Behar, David M. Chan

    Abstract: Large language models (LLMs) are trained from vast repositories of text authored by millions of distinct authors, reflecting an enormous diversity of human traits. While these models bear the potential to be used as approximations of human subjects in behavioral studies, prior efforts have been limited in steering model responses to match individual human users. In this work, we introduce "Antholo…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2311.18232  [pdf, other]

    cs.CL cs.AI cs.LG

    LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

    Authors: Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine

    Abstract: Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional or goal-directed agents and might necessitate considerable prompt tuning. This becomes particularly apparent in multi-turn conversations: even the best current LLMs rarely ask clarifying questions, engage in explicit information gathering,…

    Submitted 29 November, 2023; originally announced November 2023.

  3. arXiv:2310.15337  [pdf, other]

    cs.AI cs.CL cs.CY

    Moral Foundations of Large Language Models

    Authors: Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, Natasha Jaques

    Abstract: Moral foundations theory (MFT) is a psychological assessment tool that decomposes human moral reasoning into five factors, including care/harm, liberty/oppression, and sanctity/degradation (Graham et al., 2009). People vary in the weight they place on these dimensions when making moral decisions, in part due to their cultural upbringing and political ideology. As large language models (LLMs) are t…

    Submitted 23 October, 2023; originally announced October 2023.

  4. arXiv:2307.00184  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC

    Personality Traits in Large Language Models

    Authors: Greg Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, Peter Romero, Marwa Abdulhai, Aleksandra Faust, Maja Matarić

    Abstract: The advent of large language models (LLMs) has revolutionized natural language processing, enabling the generation of coherent and contextually relevant human-like text. As LLMs increasingly power conversational agents used by the general public world-wide, the synthetic personality embedded in these models, by virtue of training on large amounts of human data, is becoming increasingly important.…

    Submitted 21 September, 2023; v1 submitted 30 June, 2023; originally announced July 2023.

    MSC Class: 68T35 ACM Class: I.2.7

  5. arXiv:2208.04919  [pdf, other]

    cs.LG

    Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

    Authors: Marwa Abdulhai, Natasha Jaques, Sergey Levine

    Abstract: This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior. IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them. However, effective IRL is challenging,…

    Submitted 9 August, 2022; originally announced August 2022.

  6. arXiv:2109.09876  [pdf, other]

    cs.LG cs.AI

    Context-Specific Representation Abstraction for Deep Option Learning

    Authors: Marwa Abdulhai, Dong-Ki Kim, Matthew Riemer, Miao Liu, Gerald Tesauro, Jonathan P. How

    Abstract: Hierarchical reinforcement learning has focused on discovering temporally extended actions, such as options, that can provide benefits in problems requiring extensive exploration. One promising approach that learns these options end-to-end is the option-critic (OC) framework. We examine and show in this paper that OC does not decompose a problem into simpler sub-problems, but instead increases the…

    Submitted 23 April, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

    Comments: Accepted at AAAI 2022

  7. arXiv:2011.00382  [pdf, other]

    cs.LG cs.AI cs.MA

    A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

    Authors: Dong-Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan P. How

    Abstract: A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents. In particular, each agent perceives the environment as effectively non-stationary due to the changing policies of other agents. Moreover, each agent is itself constantly learning, leading to natural non-stationarity in the distribution of…

    Submitted 11 June, 2021; v1 submitted 31 October, 2020; originally announced November 2020.

    Comments: Accepted to ICML 2021. Code at https://github.com/dkkim93/meta-mapg and Videos at https://sites.google.com/view/meta-mapg/home