Showing 1–1 of 1 results for author: Seetharaman, K V

  1. arXiv:2405.15065  [pdf, other]

    cs.LG

    Direct Preference Optimization With Unobserved Preference Heterogeneity

    Authors: Keertana Chidambaram, Karthik Vinay Seetharaman, Vasilis Syrgkanis

    Abstract: RLHF has emerged as a pivotal step in aligning language models with human objectives and values. It typically involves learning a reward model from human preference data and then using reinforcement learning to update the generative model accordingly. Conversely, Direct Preference Optimization (DPO) directly optimizes the generative model with preference data, skipping reinforcement learning. Howe…

    Submitted 23 May, 2024; originally announced May 2024.