Showing 1–11 of 11 results for author: Belli, L

  1. arXiv:2211.08667 [pdf, other]

    cs.SI

    County-level Algorithmic Audit of Racial Bias in Twitter's Home Timeline

    Authors: Luca Belli, Kyra Yee, Uthaipon Tantipongpipat, Aaron Gonzales, Kristian Lum, Moritz Hardt

    Abstract: We report on the outcome of an audit of Twitter's Home Timeline ranking system. The goal of the audit was to determine if authors from some racial groups experience systematically higher impression counts for their Tweets than others. A central obstacle for any such audit is that Twitter does not ordinarily collect or associate racial information with its users, thus prohibiting an analysis at the…

    Submitted 10 February, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

  2. arXiv:2210.06351 [pdf, other]

    cs.CL

    A Keyword Based Approach to Understanding the Overpenalization of Marginalized Groups by English Marginal Abuse Models on Twitter

    Authors: Kyra Yee, Alice Schoenauer Sebag, Olivia Redfield, Emily Sheng, Matthias Eck, Luca Belli

    Abstract: Harmful content detection models tend to have higher false positive rates for content from marginalized groups. In the context of marginal abuse modeling on Twitter, such disproportionate penalization poses the risk of reduced visibility, where marginalized communities lose the opportunity to voice their opinion on the platform. Current approaches to algorithmic harm mitigation, and bias detection…

    Submitted 7 October, 2022; originally announced October 2022.

  3. arXiv:2209.05000 [pdf, other]

    cs.IR cs.SI

    Random Isn't Always Fair: Candidate Set Imbalance and Exposure Inequality in Recommender Systems

    Authors: Amanda Bower, Kristian Lum, Tomo Lazovich, Kyra Yee, Luca Belli

    Abstract: Traditionally, recommender systems operate by returning a user a set of items, ranked in order of estimated relevance to that user. In recent years, methods relying on stochastic ordering have been developed to create "fairer" rankings that reduce inequality in who or what is shown to users. Complete randomization -- ordering candidate items randomly, independent of estimated relevance -- is large…

    Submitted 11 September, 2022; originally announced September 2022.

    Comments: 12 pages

  4. Measuring Disparate Outcomes of Content Recommendation Algorithms with Distributional Inequality Metrics

    Authors: Tomo Lazovich, Luca Belli, Aaron Gonzales, Amanda Bower, Uthaipon Tantipongpipat, Kristian Lum, Ferenc Huszar, Rumman Chowdhury

    Abstract: The harmful impacts of algorithmic decision systems have recently come into focus, with many examples of systems such as machine learning (ML) models amplifying existing societal biases. Most metrics attempting to quantify disparities resulting from ML algorithms focus on differences between groups, dividing users based on demographic identities and comparing model performance or overall outcomes…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: 11 pages, 7 figures

  5. Algorithmic Amplification of Politics on Twitter

    Authors: Ferenc Huszár, Sofia Ira Ktena, Conor O'Brien, Luca Belli, Andrew Schlaikjer, Moritz Hardt

    Abstract: Content on Twitter's home timeline is selected and ordered by personalization algorithms. By consistently ranking certain content higher, these algorithms may amplify some messages while reducing the visibility of others. There's been intense public and scholarly debate about the possibility that some political groups benefit more from algorithmic amplification than others. We provide quantitative…

    Submitted 21 October, 2021; originally announced October 2021.

  6. arXiv:2109.08245 [pdf, other]

    cs.SI

    The 2021 RecSys Challenge Dataset: Fairness is not optional

    Authors: Luca Belli, Alykhan Tejani, Frank Portman, Alexandre Lung-Yut-Fong, Ben Chamberlain, Yuanpu Xie, Kristian Lum, Jonathan Hunt, Michael Bronstein, Vito Walter Anelli, Saikishore Kalloori, Bruce Ferwerda, Wenzhe Shi

    Abstract: After the success of the RecSys 2020 Challenge, we are describing a novel and bigger dataset that was released in conjunction with the ACM RecSys Challenge 2021. This year's dataset is not only bigger (~1B data points, a 5-fold increase), but for the first time it takes into consideration fairness aspects of the challenge. Unlike many static datasets, a lot of effort went into making sure that the dat…

    Submitted 21 September, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

  7. Causal Inference Struggles with Agency on Online Platforms

    Authors: Smitha Milli, Luca Belli, Moritz Hardt

    Abstract: Online platforms regularly conduct randomized experiments to understand how changes to the platform causally affect various outcomes of interest. However, experimentation on online platforms has been criticized for having, among other issues, a lack of meaningful oversight and user consent. As platforms give users greater agency, it becomes possible to conduct observational studies in which users…

    Submitted 10 May, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: Accepted to FAccT'22

  8. arXiv:2008.12623 [pdf, other]

    cs.SI cs.LG stat.ML

    From Optimizing Engagement to Measuring Value

    Authors: Smitha Milli, Luca Belli, Moritz Hardt

    Abstract: Most recommendation engines today are based on predicting user engagement, e.g. predicting whether a user will click on an item or not. However, there is potentially a large gap between engagement signals and a desired notion of "value" that is worth optimizing for. We use the framework of measurement theory to (a) confront the designer with a normative question about what the designer values, (b)…

    Submitted 19 July, 2021; v1 submitted 20 August, 2020; originally announced August 2020.

    Comments: Published at FAccT'21

  9. arXiv:2008.03415 [pdf, other]

    cs.CL cs.CY cs.IR cs.LG

    Assessing Demographic Bias in Named Entity Recognition

    Authors: Shubhanshu Mishra, Sijun He, Luca Belli

    Abstract: Named Entity Recognition (NER) is often the first step towards automated Knowledge Base (KB) generation from raw text. In this work, we assess the bias in various Named Entity Recognition (NER) systems for English across different demographic groups with synthetically generated corpora. Our analysis reveals that models perform better at identifying names from specific demographic groups across two…

    Submitted 7 August, 2020; originally announced August 2020.

    Comments: Presented at the AKBC Workshop on Bias in Automatic Knowledge Graph Construction, 2020 (arXiv:2007.11659)

    Report number: REPORT-NO:KGBias/2020/02

    MSC Class: 68T50 (Primary); 68T30 (Secondary); 68U15

    ACM Class: I.2.7; I.2.1; I.2.6; H.3.1; H.3.3; H.1.2; K.4.2

  10. arXiv:2004.13715 [pdf, ps, other]

    cs.SI cs.LG stat.ML

    Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline

    Authors: Luca Belli, Sofia Ira Ktena, Alykhan Tejani, Alexandre Lung-Yut-Fong, Frank Portman, Xiao Zhu, Yuanpu Xie, Akshay Gupta, Michael Bronstein, Amra Delić, Gabriele Sottocornola, Walter Anelli, Nazareno Andrade, Jessie Smith, Wenzhe Shi

    Abstract: Recommender systems constitute the core engine of most social network platforms nowadays, aiming to maximize user satisfaction along with other key business objectives. Twitter is no exception. Despite the fact that Twitter data has been extensively used to understand socioeconomic and political phenomena and user behaviour, the implicit feedback provided by users on Tweets through their engagemen…

    Submitted 7 October, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: 16 pages, 2 tables

  11. arXiv:1809.07703 [pdf, other]

    cs.SI cs.LG stat.ML

    Fighting Redundancy and Model Decay with Embeddings

    Authors: Dan Shiebler, Luca Belli, Jay Baxter, Hanchen Xiong, Abhishek Tayal

    Abstract: Every day, hundreds of millions of new Tweets containing over 40 languages of ever-shifting vernacular flow through Twitter. Models that attempt to extract insight from this firehose of information must face the torrential covariate shift that is endemic to the Twitter platform. While regularly-retrained algorithms can maintain performance in the face of this shift, fixed model features that fail…

    Submitted 18 September, 2018; originally announced September 2018.

    Comments: Presented at the Common Model Infrastructure Workshop at KDD 2018 (link: https://cmi2018.sdsc.edu/)