Make The Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback
Authors:
Duy-Hung Nguyen,
Nguyen Viet Dung Nghiem,
Bao-Sinh Nguyen,
Dung Tien Le,
Shahab Sabahi,
Minh-Tien Nguyen,
Hung Le
Abstract:
For summarization, human preference is critical for steering the summarizer's outputs toward human interests, as ground-truth summaries are scarce and ambiguous. Practical settings require dynamic exchanges between a human and an AI agent, wherein feedback is provided online, a few items at a time. In this paper, we introduce a new framework to train summarization models with preference feedback interactively. By properly leveraging offline data and a novel reward model, we improve performance in terms of ROUGE scores and sample efficiency. Our experiments on three different datasets confirm the benefit of the proposed framework in active, few-shot and online settings of preference learning.
Submitted 11 May, 2022; v1 submitted 11 April, 2022;
originally announced April 2022.
Robust Deep Reinforcement Learning for Extractive Legal Summarization
Authors:
Duy-Hung Nguyen,
Bao-Sinh Nguyen,
Nguyen Viet Dung Nghiem,
Dung Tien Le,
Mim Amina Khatun,
Minh-Tien Nguyen,
Hung Le
Abstract:
Automatic summarization of legal texts is an important and still challenging task, since legal documents are often long and complicated, with unusual structures and styles. Recent deep models trained end-to-end with differentiable losses can summarize natural text well, yet when applied to the legal domain, they show limited results. In this paper, we propose using reinforcement learning to train current deep summarization models to improve their performance in the legal domain. To this end, we adopt proximal policy optimization methods and introduce novel reward functions that encourage the generation of candidate summaries satisfying both lexical and semantic criteria. We apply our method to training different summarization backbones and observe a consistent and significant performance gain across three public legal datasets.
Submitted 23 November, 2021; v1 submitted 13 November, 2021;
originally announced November 2021.