Showing 1–4 of 4 results for author: Reusens, M

Searching in archive cs.
  1. arXiv:2406.17385

    cs.CL

    Native Design Bias: Studying the Impact of English Nativeness on Language Model Performance

    Authors: Manon Reusens, Philipp Borchert, Jochen De Weerdt, Bart Baesens

    Abstract: Large Language Models (LLMs) excel at providing information acquired during pretraining on large-scale corpora and following instructions through user prompts. This study investigates whether the quality of LLM responses varies depending on the demographic profile of users. Considering English as the global lingua franca, along with the diversity of its dialects among speakers of different native…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2310.10310

    cs.CL

    Investigating Bias in Multilingual Language Models: Cross-Lingual Transfer of Debiasing Techniques

    Authors: Manon Reusens, Philipp Borchert, Margot Mieskes, Jochen De Weerdt, Bart Baesens

    Abstract: This paper investigates the transferability of debiasing techniques across different languages within multilingual models. We examine the applicability of these techniques in English, French, German, and Dutch. Using multilingual BERT (mBERT), we demonstrate that cross-lingual transfer of debiasing techniques is not only feasible but also yields promising results. Surprisingly, our findings reveal…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 main conference

  3. arXiv:2310.06675

    cs.CL

    SEER : A Knapsack approach to Exemplar Selection for In-Context HybridQA

    Authors: Jonathan Tonglet, Manon Reusens, Philipp Borchert, Bart Baesens

    Abstract: Question answering over hybrid contexts is a complex task, which requires the combination of information extracted from unstructured texts and structured tables in various ways. Recently, In-Context Learning demonstrated significant performance advances for reasoning tasks. In this paradigm, a large language model performs predictions based on a small set of supporting exemplars. The performance o…

    Submitted 20 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Camera ready revision for EMNLP 2023 main conference. Code available at https://github.com/jtonglet/SEER

  4. arXiv:2306.04338

    stat.ML cs.LG

    Changing Data Sources in the Age of Machine Learning for Official Statistics

    Authors: Cedric De Boom, Michael Reusens

    Abstract: Data science has become increasingly essential for the production of official statistics, as it enables the automated collection, processing, and analysis of large amounts of data. With such data science practices in place, it enables more timely, more insightful and more flexible reporting. However, the quality and integrity of data-science-driven statistics rely on the accuracy and reliability o…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Presented at UNECE Machine Learning for Official Statistics Workshop 2023

    Journal ref: UNECE Machine Learning for Official Statistics Workshop 2023