-
The Problem of Alignment
Authors:
Tsvetelina Hristova,
Liam Magee,
Karen Soldatic
Abstract:
Large Language Models produce sequences learned as statistical patterns from large corpora. In order not to reproduce corpus biases, after initial training models must be aligned with human values, preferencing certain continuations over others. Alignment, which can be viewed as the superimposition of normative structure onto a statistical model, reveals a conflicted and complex interrelationship between language and technology. This relationship shapes theories of language, linguistic practice and subjectivity, which are especially relevant to the current sophistication in artificially produced text. We examine this practice of structuration as a two-way interaction between users and models by analysing how ChatGPT4 redacts perceived 'anomalous' language in fragments of Joyce's Ulysses and the new linguistic practice of prompt engineering. We then situate this alignment problem historically, revisiting earlier postwar linguistic debates which counterposed two views of meaning: as discrete structures, and as continuous probability distributions. We discuss the largely occluded work of the Moscow Linguistic School, which sought to reconcile this opposition. Our attention to the Moscow School and later related arguments by Searle and Kristeva casts the problem of alignment in a new light: as one involving attention to the social structuration of linguistic practice, including structuration of anomalies that, like the Joycean text, exist in defiance of expressive conventions. These debates around the communicative orientation toward language can help explain some of the contemporary behaviours and interdependencies that take place between users and LLMs.
Submitted 30 December, 2023;
originally announced January 2024.
-
Intersectional Inquiry, on the Ground and in the Algorithm
Authors:
Shanthi Robertson,
Liam Magee,
Karen Soldatić
Abstract:
This article makes two key contributions to methodological debates in automation research. First, we argue for and demonstrate how methods in this field must account for intersections of social difference, such as race, class, ethnicity, culture, and disability, in more nuanced ways. Second, we consider the complexities of bringing together computational and qualitative methods in an intersectional methodological approach while also arguing that in their respective subjects (machines and human subjects) and conceptual scope they enable a specific dialogue on intersectionality and automation to be articulated. We draw on field reflections from a project that combines an analysis of intersectional bias in language models with findings from a community workshop on the frustrations and aspirations produced through engagement with everyday AI-driven technologies in the context of care.
Submitted 29 August, 2023;
originally announced August 2023.
-
Intersectional Bias in Causal Language Models
Authors:
Liam Magee,
Lida Ghahremanlou,
Karen Soldatic,
Shanthi Robertson
Abstract:
To examine whether intersectional bias can be observed in language generation, we examine GPT-2 and GPT-NEO models, ranging in size from 124 million to ~2.7 billion parameters. We conduct an experiment combining up to three social categories - gender, religion and disability - into unconditional or zero-shot prompts used to generate sentences that are then analysed for sentiment. Our results confirm earlier tests conducted with auto-regressive causal models, including the GPT family of models. We also illustrate why bias may be resistant to techniques that target single categories (e.g. gender, religion and race), as it can also manifest, in often subtle ways, in texts prompted by concatenated social categories. To address these difficulties, we suggest technical and community-based approaches need to combine to acknowledge and address complex and intersectional language model bias.
Submitted 15 July, 2021;
originally announced July 2021.