RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Dong, Hanze; Xiong, Wei; Goyal, Deepanshu; Pan, Rui; Diao, Shizhe; Zhang, Jipeng; Shum, Kashun; Zhang, Tong


arXiv:2304.06767v2 (cs)

[Submitted on 13 Apr 2023 (v1), revised 25 May 2023 (this version, v2), latest version 1 Dec 2023 (v4)]

Abstract: Generative foundation models are susceptible to implicit biases that can arise from extensive unsupervised training data. Such biases can produce suboptimal samples, skewed outcomes, and unfairness, with potentially significant repercussions. Consequently, aligning these models with human ethics and preferences is an essential step toward ensuring their responsible and effective deployment in real-world applications. Prior research has primarily employed Reinforcement Learning from Human Feedback (RLHF) as a means of addressing this problem, wherein generative models are fine-tuned using RL algorithms guided by a human-feedback-informed reward model. However, the inefficiencies and instabilities associated with RL algorithms frequently present substantial obstacles to the successful alignment of generative models, necessitating the development of a more robust and streamlined approach. To this end, we introduce a new framework, Reward rAnked FineTuning (RAFT), designed to align generative models more effectively. Utilizing a reward model and a sufficient number of samples, our approach selects the high-quality samples, discarding those that exhibit undesired behavior, and subsequently assembles a streaming dataset. This dataset serves as the basis for aligning the generative model and can be employed under both offline and online settings. Notably, the sample generation process within RAFT is gradient-free, rendering it compatible with black-box generators. Through extensive experiments, we demonstrate that our proposed algorithm exhibits strong performance in the context of both large language models and diffusion models.
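The selection step described in the abstract (draw samples, score them with a reward model, keep the high-reward ones, and use them as fine-tuning data) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `reward_ranked_batch`, `make_toy_generator`, and `toy_reward` are hypothetical, the per-prompt "keep the top `keep` of `k` samples" rule is an assumption about how the ranking is applied, and the toy generator and length-based reward merely stand in for a real generative model and a learned reward model.

```python
from itertools import cycle
from typing import Callable, List, Tuple


def reward_ranked_batch(
    prompts: List[str],
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    k: int = 4,
    keep: int = 1,
) -> List[Tuple[str, str]]:
    """For each prompt, draw k samples, rank them by reward, keep the top `keep`.

    The generator is only called, never differentiated through, so it can be
    treated as a black box.
    """
    selected: List[Tuple[str, str]] = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(k)]
        ranked = sorted(candidates, key=lambda c: reward(prompt, c), reverse=True)
        selected.extend((prompt, c) for c in ranked[:keep])
    return selected


def make_toy_generator() -> Callable[[str], str]:
    """Hypothetical stand-in for a generative model: cycles through fixed replies."""
    replies = cycle(["bad", "ok", "a much better answer", "fine"])
    return lambda prompt: next(replies)


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model: longer responses score higher."""
    return float(len(response))


if __name__ == "__main__":
    batch = reward_ranked_batch(["p1", "p2"], make_toy_generator(), toy_reward)
    # In a full pipeline, `batch` would be passed to an ordinary supervised
    # fine-tuning step, and the loop would repeat with the updated model.
    for prompt, response in batch:
        print(prompt, "->", response)
```

In this sketch the only interaction with the generator is sampling, which is what makes the procedure compatible with black-box generators; the fine-tuning step itself is deliberately left out.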

Comments:	26 pages, 8 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:2304.06767 [cs.LG]
	(or arXiv:2304.06767v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2304.06767

Submission history

From: Hanze Dong
[v1] Thu, 13 Apr 2023 18:22:40 UTC (62,967 KB)
[v2] Thu, 25 May 2023 06:27:31 UTC (42,022 KB)
[v3] Wed, 30 Aug 2023 01:25:29 UTC (33,955 KB)
[v4] Fri, 1 Dec 2023 14:28:06 UTC (34,049 KB)
