Learning Personalized Alignment for Evaluating Open-ended Text Generation

Wang, Danqing; Yang, Kevin; Zhu, Hanlin; Yang, Xiaomeng; Cohen, Andrew; Li, Lei; Tian, Yuandong

arXiv:2310.03304 (cs)

[Submitted on 5 Oct 2023 (v1), last revised 19 Jun 2024 (this version, v4)]

Abstract: With rapid progress made in language qualities such as fluency and consistency via large language models (LLMs), there has been increasing interest in assessing alignment with diverse human preferences. Traditional metrics heavily rely on lexical similarity with human-written references and have been observed to suffer from a poor correlation with human evaluation. Furthermore, they ignore the diverse preferences of humans, a key aspect in evaluating open-ended tasks like story generation. Inspired by these challenges, we introduce an interpretable open-ended evaluation framework PerSE to assess the alignment with a specific human preference. It is tuned to deduce the specific preference from a given personal profile and evaluate the alignment between the generation and the personal preference. PerSE also explains its assessment by a detailed comment or several fine-grained scores. This enhances its interpretability, making it more suitable to tailor a personalized generation. Our 13B LLaMA-2-based PerSE shows a 15.8% increase in Kendall correlation and a 13.7% rise in accuracy on zero-shot reviewers compared to GPT-4. It also outperforms GPT-4 by 46.01% in the Kendall correlation on new domains, indicating its transferability.
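
The abstract reports results as Kendall correlation and accuracy against individual reviewers. As a rough illustration of what those two numbers measure, the sketch below computes a Kendall tau-b coefficient between one reviewer's scores and an evaluator's predicted scores, plus a simple accuracy over pairwise preference labels. This is not the authors' evaluation code: the function names `kendall_tau_b` and `preference_accuracy`, the choice of the tau-b variant, and the sample scores are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall tau-b between two equal-length score lists (handles ties)."""
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both lists: contributes to neither term
        if dx == 0:
            ties_x += 1  # tied only in xs
        elif dy == 0:
            ties_y += 1  # tied only in ys
        elif dx * dy > 0:
            concordant += 1  # both lists order the pair the same way
        else:
            discordant += 1  # the lists disagree on the pair's order
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0


def preference_accuracy(predicted, actual):
    """Fraction of pairwise comparisons where the predicted choice matches."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical data: one reviewer's scores and an evaluator's predictions.
reviewer_scores = [1, 2, 3, 4, 5]
evaluator_scores = [1, 3, 2, 4, 5]
tau = kendall_tau_b(reviewer_scores, evaluator_scores)

# Hypothetical pairwise labels: which of two stories ("A" or "B") is preferred.
acc = preference_accuracy(["A", "B", "A", "A"], ["A", "B", "B", "A"])
```

With the sample scores above, one of the ten pairs is ordered differently by the two lists, so the coefficient comes out below 1; a personalized evaluator is judged by how close this value gets to 1 for each individual reviewer rather than for an averaged rating.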

Comments:	19 pages
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2310.03304 [cs.CL]
	(or arXiv:2310.03304v4 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.03304

Submission history

From: Danqing Wang
[v1] Thu, 5 Oct 2023 04:15:48 UTC (2,406 KB)
[v2] Fri, 6 Oct 2023 17:59:16 UTC (2,406 KB)
[v3] Tue, 10 Oct 2023 15:15:54 UTC (2,406 KB)
[v4] Wed, 19 Jun 2024 22:05:12 UTC (2,599 KB)
