-
STEER: Unified Style Transfer with Expert Reinforcement
Authors:
Skyler Hallinan,
Faeze Brahman,
Ximing Lu,
Jaehun Jung,
Sean Welleck,
Yejin Choi
Abstract:
While text style transfer has many applications across natural language processing, the core premise of transferring from a single source style is unrealistic in a real-world setting. In this work, we focus on arbitrary style transfer: rewriting a text from an arbitrary, unknown style to a target style.
We propose STEER: Unified Style Transfer with Expert Reinforcement, a unified framework developed to overcome the challenge of limited parallel data for style transfer. STEER involves automatically generating a corpus of style-transfer pairs using a product of experts during decoding. The generated offline data is then used to pre-train an initial policy before switching to online, off-policy reinforcement learning for further improvements via fine-grained reward signals. STEER is unified and can transfer to multiple target styles from an arbitrary, unknown source style, making it particularly flexible and efficient.
Experimental results on a challenging dataset with text from a diverse set of styles demonstrate state-of-the-art results compared to competitive baselines. Remarkably, STEER outperforms the 175B parameter instruction-tuned GPT-3 on overall style transfer quality, despite being 226 times smaller in size. We also show STEER is robust, maintaining its style transfer capabilities on out-of-domain data, and surpassing nearly all baselines across various styles. The success of our method highlights the potential of RL algorithms when augmented with controllable decoding to overcome the challenge of limited data supervision.
Submitted 13 November, 2023;
originally announced November 2023.
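The STEER abstract above mentions a product of experts during decoding. As a rough illustration only, the sketch below combines next-token distributions by multiplying them and renormalizing, which is the generic definition of a product of experts. The function name, the toy vocabulary, the weights, and the probabilities are all assumptions made for this example; the abstract does not specify which experts STEER uses or how they are weighted.

```python
import math

def product_of_experts(distributions, weights=None):
    """Combine token distributions by a weighted product, then renormalize.

    Hypothetical helper for illustration; assumes every distribution covers
    the same vocabulary and assigns a strictly positive probability to each token.
    """
    weights = weights or [1.0] * len(distributions)
    vocab = distributions[0].keys()
    # Work in log space: a product of probabilities is a sum of log-probabilities.
    log_scores = {
        tok: sum(w * math.log(d[tok]) for d, w in zip(distributions, weights))
        for tok in vocab
    }
    # Subtract the maximum before exponentiating for numerical stability.
    max_log = max(log_scores.values())
    unnormalized = {tok: math.exp(s - max_log) for tok, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {tok: v / total for tok, v in unnormalized.items()}

# Toy next-token distributions (made-up numbers): a base LM and a "formal style" expert.
base = {"hi": 0.5, "hey": 0.3, "greetings": 0.2}
formal_expert = {"hi": 0.1, "hey": 0.1, "greetings": 0.8}

combined = product_of_experts([base, formal_expert])
best_token = max(combined, key=combined.get)
```

In this toy setting the expert pulls probability mass toward the token it favors, even though the base model ranked that token last.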
-
Tailoring Self-Rationalizers with Multi-Reward Distillation
Authors:
Sahana Ramnath,
Brihi Joshi,
Skyler Hallinan,
Ximing Lu,
Liunian Harold Li,
Aaron Chan,
Jack Hessel,
Yejin Choi,
Xiang Ren
Abstract:
Large language models (LMs) are capable of generating free-text rationales to aid question answering. However, prior work 1) suggests that useful self-rationalization is emergent only at significant scales (e.g., 175B parameter GPT-3); and 2) focuses largely on downstream performance, ignoring the semantics of the rationales themselves, e.g., are they faithful, true, and helpful for humans? In this work, we enable small-scale LMs (approx. 200x smaller than GPT-3) to generate rationales that not only improve downstream task performance, but are also more plausible, consistent, and diverse, as assessed by both automatic and human evaluation. Our method, MaRio (Multi-rewArd RatIOnalization), is a multi-reward conditioned self-rationalization algorithm that optimizes multiple distinct properties like plausibility, diversity, and consistency. Results on five difficult question-answering datasets (StrategyQA, QuaRel, OpenBookQA, NumerSense, and QASC) show that MaRio not only improves task accuracy, but also improves the self-rationalization quality of small LMs across the aforementioned axes better than a supervised fine-tuning (SFT) baseline. Extensive human evaluations confirm that MaRio rationales are preferred over SFT rationales and show qualitative improvements in plausibility and consistency.
Submitted 22 May, 2024; v1 submitted 5 November, 2023;
originally announced November 2023.
-
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
Authors:
Ximing Lu,
Faeze Brahman,
Peter West,
Jaehun Jang,
Khyathi Chandu,
Abhilasha Ravichander,
Lianhui Qin,
Prithviraj Ammanabrolu,
Liwei Jiang,
Sahana Ramnath,
Nouha Dziri,
Jillian Fisher,
Bill Yuchen Lin,
Skyler Hallinan,
Xiang Ren,
Sean Welleck,
Yejin Choi
Abstract:
While extreme-scale language models have demonstrated exceptional performance on a variety of language tasks, the degree of control over these language models through pure prompting can often be limited. Directly fine-tuning such language models can be effective for tailoring them, but it can be either extremely costly (e.g., GPT-3) or not even feasible for the broader community (e.g., GPT-4).
We propose Inference-time Policy Adapters (IPA), which efficiently tailors a language model such as GPT-3 without fine-tuning it. IPA guides a large base model during decoding time through a lightweight policy adapter trained to optimize an arbitrary user objective with reinforcement learning.
On five challenging text generation tasks, such as toxicity reduction and lexically constrained generation, IPA consistently brings significant improvements over off-the-shelf language models. It outperforms competitive baseline methods, sometimes even including expensive fine-tuning. In particular, tailoring GPT-2 with IPA can outperform GPT-3, while tailoring GPT-3 with IPA brings a major performance boost over GPT-3 (and sometimes even over GPT-4). Our promising results highlight the potential of IPA as a lightweight alternative to tailoring extreme-scale language models.
Submitted 6 December, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Self-Refine: Iterative Refinement with Self-Feedback
Authors:
Aman Madaan,
Niket Tandon,
Prakhar Gupta,
Skyler Hallinan,
Luyu Gao,
Sarah Wiegreffe,
Uri Alon,
Nouha Dziri,
Shrimai Prabhumoye,
Yiming Yang,
Shashank Gupta,
Bodhisattwa Prasad Majumder,
Katherine Hermann,
Sean Welleck,
Amir Yazdanbakhsh,
Peter Clark
Abstract:
Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides feedback on its output and uses that feedback to refine the output, iteratively. Self-Refine does not require any supervised training data, additional training, or reinforcement learning, and instead uses a single LLM as the generator, refiner, and feedback provider. We evaluate Self-Refine across 7 diverse tasks, ranging from dialog response generation to mathematical reasoning, using state-of-the-art LLMs (GPT-3.5, ChatGPT, and GPT-4). Across all evaluated tasks, outputs generated with Self-Refine are preferred by humans and automatic metrics over those generated with the same LLM using conventional one-step generation, improving by ~20% absolute on average in task performance. Our work demonstrates that even state-of-the-art LLMs like GPT-4 can be further improved at test time using our simple, standalone approach.
Submitted 25 May, 2023; v1 submitted 30 March, 2023;
originally announced March 2023.
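The generate, feedback, refine loop described in the Self-Refine abstract above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the prompt wording, the `max_iters` limit, the `LOOKS GOOD` stop phrase, and the `toy_llm` stand-in (a deterministic stub used here in place of a real model call) are all hypothetical.

```python
from typing import Callable, List, Tuple

def self_refine(
    task: str,
    llm: Callable[[str], str],
    max_iters: int = 3,
    stop_phrase: str = "LOOKS GOOD",
) -> Tuple[str, List[str]]:
    """Run one model as generator, feedback provider, and refiner (illustrative)."""
    output = llm(f"Task: {task}\nAnswer:")
    history = [output]
    for _ in range(max_iters):
        # The same model critiques its own current answer.
        feedback = llm(f"Task: {task}\nAnswer: {output}\nFeedback:")
        if stop_phrase in feedback:
            break  # the model judges the answer good enough
        # The same model rewrites the answer using its own feedback.
        output = llm(
            f"Task: {task}\nAnswer: {output}\nFeedback: {feedback}\nRefined answer:"
        )
        history.append(output)
    return output, history

def toy_llm(prompt: str) -> str:
    """Deterministic stand-in for a real LLM call (assumption for this demo)."""
    if prompt.endswith("Refined answer:"):
        draft = prompt.split("\nAnswer: ")[1].split("\nFeedback: ")[0]
        return draft + ", please"
    if prompt.endswith("Feedback:"):
        draft = prompt.split("\nAnswer: ")[1].rsplit("\nFeedback:", 1)[0]
        return "LOOKS GOOD" if "please" in draft else "Be more polite."
    return "Pass the salt"

final, history = self_refine("Ask for the salt politely", toy_llm)
```

With a real model, `llm` would wrap an API or local inference call. The loop itself needs no training data or gradient updates, which matches the abstract's description of a test-time method.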
-
Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts
Authors:
Skyler Hallinan,
Alisa Liu,
Yejin Choi,
Maarten Sap
Abstract:
Text detoxification has the potential to mitigate the harms of toxicity by rephrasing text to remove offensive meaning, but subtle toxicity remains challenging to tackle. We introduce MaRCo, a detoxification algorithm that combines controllable generation and text rewriting methods using a Product of Experts with autoencoder language models (LMs). MaRCo uses likelihoods under a non-toxic LM (expert) and a toxic LM (anti-expert) to find candidate words to mask and potentially replace. We evaluate our method on several subtle toxicity and microaggressions datasets, and show that it not only outperforms baselines on automatic metrics, but MaRCo's rewrites are preferred 2.1 $\times$ more in human evaluation. Its applicability to instances of subtle toxicity is especially promising, demonstrating a path forward for addressing increasingly elusive online hate.
Submitted 26 May, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
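The MaRCo abstract above says that likelihoods under an expert and an anti-expert LM are used to find candidate words to mask. The sketch below illustrates that idea with a simple log-likelihood ratio and a fixed threshold. Both the scoring rule and the threshold are assumptions chosen for illustration, since the abstract does not state the exact criterion, and the per-token probabilities are made-up numbers rather than model outputs.

```python
import math
from typing import List

def mask_candidates(
    tokens: List[str],
    expert_probs: List[float],
    anti_expert_probs: List[float],
    threshold: float = 1.0,
    mask_token: str = "<mask>",
) -> List[str]:
    """Mask tokens that the anti-expert finds much more likely than the expert.

    Hypothetical scoring rule: log p_anti(token) - log p_expert(token) > threshold.
    Assumes all probabilities are strictly positive.
    """
    masked = []
    for tok, p_expert, p_anti in zip(tokens, expert_probs, anti_expert_probs):
        score = math.log(p_anti) - math.log(p_expert)
        masked.append(mask_token if score > threshold else tok)
    return masked

# Made-up per-token likelihoods under a non-toxic LM (expert) and a toxic LM (anti-expert).
tokens = ["you", "are", "so", "dumb"]
expert_probs = [0.2, 0.3, 0.1, 0.001]
anti_expert_probs = [0.2, 0.3, 0.15, 0.05]

masked_tokens = mask_candidates(tokens, expert_probs, anti_expert_probs)
```

Only the last token exceeds the threshold here, so it is the only one replaced by the mask. A rewriting step would then propose non-toxic replacements for the masked positions.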
-
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
Authors:
Jiacheng Liu,
Skyler Hallinan,
Ximing Lu,
Pengfei He,
Sean Welleck,
Hannaneh Hajishirzi,
Yejin Choi
Abstract:
Knowledge underpins reasoning. Recent research demonstrates that when relevant knowledge is provided as additional context to commonsense question answering (QA), it can substantially enhance performance, even on top of state-of-the-art models. The fundamental challenge is where and how to find such knowledge that is high quality and on point with respect to the question; knowledge retrieved from knowledge bases is incomplete and knowledge generated from language models is inconsistent. We present Rainier, or Reinforced Knowledge Introspector, which learns to generate contextually relevant knowledge in response to given questions. Our approach starts by imitating knowledge generated by GPT-3, then learns to generate its own knowledge via reinforcement learning where rewards are shaped based on the increased performance on the resulting question answering. Rainier demonstrates substantial and consistent performance gains when tested over 9 different commonsense benchmarks, including 5 datasets that are seen during model training and 4 datasets that are kept unseen. Our work is the first to report that knowledge generated by models that are orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of commonsense knowledge elicited from GPT-3.
Submitted 22 October, 2022; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Investigating behavior change indicators and cognitive measures in persuasive health games
Authors:
S. Durga,
S. Hallinan,
M. Seif El-Nasr,
M. Shiyko,
C. Sceppa
Abstract:
Outcome-driven studies that evaluate the potential effects of games and apps designed to promote healthy eating and exercise remain limited: they either target design or usability factors while omitting health-based outcomes altogether, or focus too narrowly on behavioral outcomes within short periods of time and are thereby less likely to influence longitudinal factors that can help sustain healthy habits. In this paper we argue for a unified approach to tackling behavioral change by focusing on both health outcomes and cognitive precursors, such as players' attitudes and behaviors around healthy eating and exercising, motivation stage, and knowledge and awareness about nutrition or physical activity. Key findings from a 3-month gameplay study with 47 female participants indicate that there are clear shifts in players' perceptions about health and knowledge about eating. This paper extends current understanding of approaches for evaluating health games and presents a unified approach to assessing the effectiveness of game-based health interventions by combining health-based outcomes and shifts in players' cognitive precursors.
Submitted 25 June, 2021;
originally announced June 2021.
-
Misinfo Reaction Frames: Reasoning about Readers' Reactions to News Headlines
Authors:
Saadia Gabriel,
Skyler Hallinan,
Maarten Sap,
Pemi Nguyen,
Franziska Roesner,
Eunsol Choi,
Yejin Choi
Abstract:
Even to a simple and short news headline, readers react in a multitude of ways: cognitively (e.g. inferring the writer's intent), emotionally (e.g. feeling distrust), and behaviorally (e.g. sharing the news with their friends). Such reactions are instantaneous and yet complex, as they rely on factors that go beyond interpreting factual content of news. We propose Misinfo Reaction Frames (MRF), a pragmatic formalism for modeling how readers might react to a news headline. In contrast to categorical schema, our free-text dimensions provide a more nuanced way of understanding intent beyond being benign or malicious. We also introduce a Misinfo Reaction Frames corpus, a crowdsourced dataset of reactions to over 25k news headlines focusing on global crises: the Covid-19 pandemic, climate change, and cancer. Empirical results confirm that it is indeed possible for neural models to predict the prominent patterns of readers' reactions to previously unseen news headlines. Additionally, our user study shows that displaying machine-generated MRF implications alongside news headlines to readers can increase their trust in real news while decreasing their trust in misinformation. Our work demonstrates the feasibility and importance of pragmatic inferences on news headlines to help enhance AI-guided misinformation detection and mitigation.
Submitted 22 March, 2022; v1 submitted 18 April, 2021;
originally announced April 2021.