-
You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes
Authors:
Jabez Magomere,
Shu Ishida,
Tejumade Afonja,
Aya Salama,
Daniel Kochin,
Foutse Yuehgoh,
Imane Hamzaoui,
Raesetje Sefala,
Aisha Alaagib,
Elizaveta Semenova,
Lauren Crais,
Siobhan Mackenzie Hall
Abstract:
Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in different parts of the world. To assess these growing regional disparities, we present World Wide Dishes, a mixed text and image dataset consisting of 765 dishes, with dish names collected in 131 local languages. World Wide Dishes has been collected purely through human contribution and decentralised means, by creating a website widely distributed through social networks. Using the dataset, we demonstrate a novel means of operationalising capability and representational biases in foundation models such as language models and text-to-image generative models. We enrich these studies with a pilot community review to understand, from a first-person perspective, how these models generate images for people in five African countries and the United States.
We find that these models generally do not produce quality text and image outputs of dishes specific to different regions. This is true even for the US, which is typically considered to be better resourced in training data, though the generation of US dishes does outperform that of the investigated African countries. The models demonstrate a propensity to produce outputs that are inaccurate as well as culturally misrepresentative, flattening, and insensitive. These failures in capability and representational bias have the potential to further reinforce stereotypes and disproportionately contribute to erasure based on region. The dataset and code are available at https://github.com/oxai/world-wide-dishes/.
Submitted 13 June, 2024;
originally announced June 2024.
-
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Authors:
Siobhan Mackenzie Hall,
Fernanda Gonçalves Abrantes,
Hanwen Zhu,
Grace Sodunke,
Aleksandar Shtedritski,
Hannah Rose Kirk
Abstract:
We introduce VisoGender, a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas, where each image is associated with a caption containing a pronoun relationship of subjects and objects in the scene. VisoGender is balanced by gender representation in professional roles, supporting bias evaluation in two ways: i) resolution bias, where we evaluate the difference between pronoun resolution accuracies for image subjects with gender presentations perceived as masculine versus feminine by human annotators, and ii) retrieval bias, where we compare ratios of professionals perceived to have masculine and feminine gender presentations retrieved for a gender-neutral search query. We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes. While the direction and magnitude of gender bias depend on the task and the model being evaluated, captioning models are generally less biased than Vision-Language Encoders. Dataset and code are available at https://github.com/oxai/visogender
Submitted 12 December, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets
Authors:
Brandon Smith,
Miguel Farinha,
Siobhan Mackenzie Hall,
Hannah Rose Kirk,
Aleksandar Shtedritski,
Max Bain
Abstract:
Vision-language models are growing in popularity and public visibility to generate, edit, and caption images at scale, but their outputs can perpetuate and amplify societal biases learned during pre-training on uncurated image-text pairs from the internet. Although debiasing methods have been proposed, we argue that these measurements of model bias lack validity due to dataset bias. We demonstrate there are spurious correlations in COCO Captions, the most commonly used dataset for evaluating bias, between background context and the gender of people in-situ. This is problematic because commonly-used bias metrics (such as Bias@K) rely on per-gender base rates. To address this issue, we propose a novel dataset debiasing pipeline to augment the COCO dataset with synthetic, gender-balanced contrast sets, where only the gender of the subject is edited and the background is fixed. However, existing image editing methods have limitations and sometimes produce low-quality images, so we introduce a method to automatically filter the generated images based on their similarity to real images. Using our balanced synthetic contrast sets, we benchmark bias in multiple CLIP-based models, demonstrating how metrics are skewed by imbalance in the original COCO images. Our results indicate that the proposed approach improves the validity of the evaluation, ultimately contributing to a more realistic understanding of bias in vision-language models.
Submitted 24 May, 2023;
originally announced May 2023.
-
A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning
Authors:
Hugo Berg,
Siobhan Mackenzie Hall,
Yash Bhalgat,
Wonsuk Yang,
Hannah Rose Kirk,
Aleksandar Shtedritski,
Max Bain
Abstract:
Vision-language models can encode societal biases and stereotypes, but there are challenges to measuring and mitigating these multimodal harms due to a lack of measurement robustness and feature degradation. To address these challenges, we investigate bias measures and apply ranking metrics for image-text representations. We then investigate debiasing methods and show that prepending learned embeddings to text queries that are jointly trained with adversarial debiasing and a contrastive loss reduces various bias measures with minimal degradation to the image-text representation.
Submitted 25 October, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Energetics of star-disc encounters in the non-linear regime
Authors:
S. M. Hall,
C. J. Clarke,
J. E. Pringle
Abstract:
We investigate the response of a circumstellar accretion disc to the fly-by of a perturbing mass on a parabolic orbit. The energy and angular momentum transferred during the encounter are calculated using a reduced three-body method. In almost all close encounters the energy and angular momentum transfer is dominated by disc material becoming unbound from the system, with the contributions from close disc particle-star encounters being significant. For more distant encounters with some prograde element to the motion the disc material loses energy and angular momentum to the perturber's orbit through a resonance feature. The magnitude of the energy transfer calculated in our simulations is greater than that of the binding energy of material exterior to periastron by a factor of two in the prograde case, and up to a factor of five in the case of the retrograde encounter. The destructive nature of the encounters indicates that a non-linear treatment is essential in all but the most distant encounters.
Submitted 31 October, 1995;
originally announced October 1995.