
Showing 1–11 of 11 results for author: Kreuter, F

  1. arXiv:2403.01208  [pdf, other]

    cs.HC stat.ME

    Position: Insights from Survey Methodology can Improve Training Data

    Authors: Stephanie Eckman, Barbara Plank, Frauke Kreuter

    Abstract: Whether future AI models are fair, trustworthy, and aligned with the public's interests rests in part on our ability to collect accurate data about what we want the models to do. However, collecting high-quality data is difficult, and few AI/ML researchers are trained in data collection methods. Recent research in data-centric AI has shown that higher quality training data leads to better performin…

    Submitted 7 June, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures. ICML 2024 Position Paper, forthcoming

    ACM Class: E.0

  2. arXiv:2311.14212  [pdf, other]

    stat.ML cs.CL cs.LG stat.ME

    Annotation Sensitivity: Training Data Collection Methods Affect Model Performance

    Authors: Christoph Kern, Stephanie Eckman, Jacob Beck, Rob Chew, Bolei Ma, Frauke Kreuter

    Abstract: When training data are collected from human annotators, the design of the annotation instrument, the instructions given to annotators, the characteristics of the annotators, and their interactions can impact training data. This study demonstrates that design choices made when creating an annotation instrument also impact the models trained on the resulting annotations. We introduce the term annota…

    Submitted 22 January, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 Findings: https://aclanthology.org/2023.findings-emnlp.992/

  3. arXiv:2310.19091  [pdf, other]

    cs.LG cs.CY cs.HC stat.ME

    Bridging the Gap: Towards an Expanded Toolkit for ML-Supported Decision-Making in the Public Sector

    Authors: Unai Fischer-Abaigar, Christoph Kern, Noam Barda, Frauke Kreuter

    Abstract: Machine Learning (ML) systems are becoming instrumental in the public sector, with applications spanning areas like criminal justice, social welfare, financial fraud detection, and public health. While these systems offer great potential benefits to institutional decision-making processes, such as improved efficiency and reliability, they still face the challenge of aligning nuanced policy objecti…

    Submitted 26 April, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  4. arXiv:2305.16703  [pdf, other]

    stat.ML cs.LG

    Sources of Uncertainty in Machine Learning -- A Statisticians' View

    Authors: Cornelia Gruber, Patrick Oliver Schenk, Malte Schierholz, Frauke Kreuter, Göran Kauermann

    Abstract: Machine Learning and Deep Learning have achieved an impressive standard today, enabling us to answer questions that were inconceivable a few years ago. Besides these successes, it becomes clear that, beyond pure prediction, which is the primary strength of most supervised machine learning algorithms, the quantification of uncertainty is relevant and necessary as well. While first concepts and idea…

    Submitted 26 May, 2023; originally announced May 2023.

  5. arXiv:2303.05349  [pdf, other]

    stat.AP cs.CL cs.CY

    Seeing ChatGPT Through Students' Eyes: An Analysis of TikTok Data

    Authors: Anna-Carolina Haensch, Sarah Ball, Markus Herklotz, Frauke Kreuter

    Abstract: Advanced large language models like ChatGPT have gained considerable attention recently, including among students. However, while the debate on ChatGPT in academia is making waves, more understanding is needed among lecturers and teachers on how students use and perceive ChatGPT. To address this gap, we analyzed the content on ChatGPT available on TikTok in February 2023. TikTok is a rapidly growi…

    Submitted 9 March, 2023; originally announced March 2023.

  6. arXiv:2205.13380  [pdf, ps, other]

    stat.ME stat.ML

    Classification ensembles for multivariate functional data with application to mouse movements in web surveys

    Authors: Amanda Fernández-Fontelo, Felix Henninger, Pascal J. Kieslich, Frauke Kreuter, Sonja Greven

    Abstract: We propose new ensemble models for multivariate functional data classification as combinations of semi-metric-based weak learners. Our models extend current semi-metric-type methods from the univariate to the multivariate case, propose new semi-metrics to compute distances between functions, and consider more flexible options for combining weak learners using stacked generalisation methods. We app…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: 24 pages, 3 tables, 0 figures

  7. arXiv:2108.04134  [pdf, other]

    cs.CY cs.LG stat.AP

    Fairness in Algorithmic Profiling: A German Case Study

    Authors: Christoph Kern, Ruben L. Bach, Hannah Mautner, Frauke Kreuter

    Abstract: Algorithmic profiling is increasingly used in the public sector as a means to allocate limited public resources effectively and objectively. One example is the prediction-based statistical profiling of job seekers to guide the allocation of support measures by public employment services. However, empirical evaluations of potential side-effects such as unintended discrimination and fairness concern…

    Submitted 4 August, 2021; originally announced August 2021.

  8. arXiv:2105.01441  [pdf, other]

    stat.ML cs.LG

    Distributive Justice and Fairness Metrics in Automated Decision-making: How Much Overlap Is There?

    Authors: Matthias Kuppler, Christoph Kern, Ruben L. Bach, Frauke Kreuter

    Abstract: The advent of powerful prediction algorithms led to increased automation of high-stake decisions regarding the allocation of scarce resources such as government spending and welfare support. This automation bears the risk of perpetuating unwanted discrimination against vulnerable and historically disadvantaged groups. Research on algorithmic discrimination in computer science and other disciplines…

    Submitted 6 May, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

  9. arXiv:2012.11678  [pdf]

    stat.AP

    Global Trends and Predictors of Face Mask Usage During the COVID-19 Pandemic

    Authors: Elena Badillo-Goicoechea, Ting-Hsuan Chang, Esther Kim, Sarah LaRocca, Katherine Morris, Xiaoyi Deng, Samantha Chiu, Adrianne Bradford, Andres Garcia, Christoph Kern, Curtiss Cobb, Frauke Kreuter, Elizabeth A. Stuart

    Abstract: Background: Guidelines and recommendations from public health authorities related to face masks have been essential in containing the COVID-19 pandemic. We assessed the prevalence and correlates of mask usage during the pandemic. Methods: We examined a total of 13,723,810 responses to a daily cross-sectional representative online survey in 38 countries who completed from April 23, 2020 to Octobe…

    Submitted 8 January, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: 39 pages, 2 main figures, Appendix

  10. arXiv:2011.06916  [pdf]

    cs.HC cs.LG stat.AP

    Predicting respondent difficulty in web surveys: A machine-learning approach based on mouse movement features

    Authors: Amanda Fernández-Fontelo, Pascal J. Kieslich, Felix Henninger, Frauke Kreuter, Sonja Greven

    Abstract: A central goal of survey research is to collect robust and reliable data from respondents. However, despite researchers' best efforts in designing questionnaires, respondents may experience difficulty understanding questions' intent and therefore may struggle to respond appropriately. If it were possible to detect such difficulty, this knowledge could be used to inform real-time interventions thro…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: 40 pages, 2 Figures, 3 Tables

  11. arXiv:1508.05502  [pdf, other]

    stat.AP

    Evaluating the quality of survey and administrative data with generalized multitrait-multimethod models

    Authors: Daniel Leonard Oberski, Antje Kirchner, Stephanie Eckman, Frauke Kreuter

    Abstract: Administrative register data are increasingly important in statistics, but, like other types of data, may contain measurement errors. To prevent such errors from invalidating analyses of scientific interest, it is therefore essential to estimate the extent of measurement errors in administrative data. Currently, however, most approaches to evaluate such errors involve either prohibitively expensiv…

    Submitted 22 August, 2015; originally announced August 2015.