Evaluation of African American Language Bias in Natural Language Generation

Deas, Nicholas; Grieser, Jessi; Kleiner, Shana; Patton, Desmond; Turcan, Elsbeth; McKeown, Kathleen

arXiv:2305.14291 (cs)

[Submitted on 23 May 2023 (v1), last revised 13 Nov 2023 (this version, v2)]

Abstract: We evaluate how well LLMs understand African American Language (AAL) in comparison to their performance on White Mainstream English (WME), the encouraged "standard" form of English taught in American classrooms. We measure LLM performance using automatic metrics and human judgments for two tasks: a counterpart generation task, where a model generates AAL (or WME) given WME (or AAL), and a masked span prediction (MSP) task, where models predict a phrase that was removed from their input. Our contributions include: (1) an evaluation of six pre-trained large language models on the two language generation tasks; (2) a novel dataset of AAL text from multiple contexts (social media, hip-hop lyrics, focus groups, and linguistic interviews) with human-annotated counterparts in WME; and (3) documentation of model performance gaps that suggest bias, along with identification of trends in the models' lack of understanding of AAL features.

Comments: EMNLP 2023 Camera-Ready
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2305.14291 [cs.CL] (or arXiv:2305.14291v2 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2305.14291

Submission history

From: Nicholas Deas
[v1] Tue, 23 May 2023 17:34:37 UTC (608 KB)
[v2] Mon, 13 Nov 2023 01:41:43 UTC (319 KB)
