Towards Controllable Biases in Language Generation

Sheng, Emily; Chang, Kai-Wei; Natarajan, Premkumar; Peng, Nanyun

arXiv:2005.00268 (cs)

[Submitted on 1 May 2020 (v1), last revised 7 Oct 2020 (this version, v2)]

Abstract: We present a general approach towards controllable societal biases in natural language generation (NLG). Building upon the idea of adversarial triggers, we develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups. We then analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics. The former scenario enables us to detect the types of biases present in the model. Specifically, we show the effectiveness of our approach at facilitating bias analysis by finding topics that correspond to demographic inequalities in generated text and comparing the relative effectiveness of inducing biases for different demographics. The second scenario is useful for mitigating biases in downstream applications such as dialogue generation. In our experiments, the mitigation technique proves to be effective at equalizing the amount of biases across demographics while simultaneously generating less negatively biased text overall.

Comments:	16 pages, Findings of EMNLP 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2005.00268 [cs.CL]
	(or arXiv:2005.00268v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2005.00268

Submission history

From: Emily Sheng
[v1] Fri, 1 May 2020 08:25:11 UTC (218 KB)
[v2] Wed, 7 Oct 2020 05:17:16 UTC (7,208 KB)
