Showing 1–1 of 1 results for author: Muldrew, W

  1. arXiv:2402.08114  [pdf, other]

    cs.LG cs.AI cs.CL

    Active Preference Learning for Large Language Models

    Authors: William Muldrew, Peter Hayes, Mingtian Zhang, David Barber

    Abstract: As large language models (LLMs) become more capable, fine-tuning techniques for aligning with human intent are increasingly important. A key consideration for aligning these models is how to most effectively use human resources, or model resources in the case where LLMs themselves are used as oracles. Reinforcement learning from Human or AI preferences (RLHF/RLAIF) is the most prominent example of…

    Submitted 28 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: 13 pages, 5 figures, 6 tables