Search | arXiv e-print repository

Understanding the Role of Invariance in Transfer Learning

Authors: Till Speicher, Vedant Nanda, Krishna P. Gummadi

Abstract: Transfer learning is a powerful technique for knowledge-sharing between different tasks. Recent work has found that the representations of models with certain invariances, such as to adversarial input perturbations, achieve higher performance on downstream tasks. These findings suggest that invariance may be an important property in the context of transfer learning. However, the relationship of invariance with transfer performance is not fully understood yet and a number of questions remain. For instance, how important is invariance compared to other factors of the pretraining task? How transferable is learned invariance? In this work, we systematically investigate the importance of representational invariance for transfer learning, as well as how it interacts with other parameters during pretraining. To do so, we introduce a family of synthetic datasets that allow us to precisely control factors of variation both in training and test data. Using these datasets, we a) show that for learning representations with high transfer performance, invariance to the right transformations is as, or often more, important than most other factors such as the number of training samples, the model architecture and the identity of the pretraining classes, b) show conditions under which invariance can harm the ability to transfer representations and c) explore how transferable invariance is between tasks. The code is available at \url{https://github.com/tillspeicher/representation-invariance-transfer}.

Submitted 5 July, 2024; originally announced July 2024.

Comments: Published at TMLR 2024

arXiv:2407.01732 [pdf, other]

Investigating Nudges toward Related Sellers on E-commerce Marketplaces: A Case Study on Amazon

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

Abstract: E-commerce marketplaces provide business opportunities to millions of sellers worldwide. Some of these sellers have special relationships with the marketplace by virtue of using their subsidiary services (e.g., fulfillment and/or shipping services provided by the marketplace) -- we refer to such sellers collectively as Related Sellers. When multiple sellers offer to sell the same product, the marketplace helps a customer in selecting an offer (by a seller) through (a) a default offer selection algorithm, (b) showing features about each of the offers and the corresponding sellers (price, seller performance metrics, seller's number of ratings etc.), and (c) finally evaluating the sellers along these features. In this paper, we perform an end-to-end investigation into how the above apparatus can nudge customers toward the Related Sellers on Amazon's four different marketplaces in India, USA, Germany and France. We find that given explicit choices, customers' preferred offers and algorithmically selected offers can be significantly different. We highlight that Amazon is adopting different performance metric evaluation policies for different sellers, potentially benefiting Related Sellers. For instance, such policies result in notable discrepancy between the actual performance metric and the presented performance metric of Related Sellers. We further observe that among the seller-centric features visible to customers, sellers' number of ratings influences their decisions the most, yet it may not reflect the true quality of service by the seller, rather reflecting the scale at which the seller operates, thereby implicitly steering customers toward larger Related Sellers. Moreover, when customers are shown the rectified metrics for the different sellers, their preference toward Related Sellers is almost halved.

Submitted 1 July, 2024; originally announced July 2024.

Comments: This work has been accepted for presentation at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW) 2024. It will appear in Proceedings of the ACM on Human-Computer Interaction

arXiv:2404.12957 [pdf, other]

Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction

Authors: Qinyuan Wu, Mohammad Aflah Khan, Soumi Das, Vedant Nanda, Bishwamittra Ghosh, Camila Kolling, Till Speicher, Laurent Bindschaedler, Krishna P. Gummadi, Evimaria Terzi

Abstract: We propose an approach for estimating the latent knowledge embedded inside large language models (LLMs). We leverage the in-context learning (ICL) abilities of LLMs to estimate the extent to which an LLM knows the facts stored in a knowledge base. Our knowledge estimator avoids reliability concerns with previous prompting-based methods, is both conceptually simpler and easier to apply, and we demonstrate that it can surface more of the latent knowledge embedded in LLMs. We also investigate how different design choices affect the performance of ICL-based knowledge estimation. Using the proposed estimator, we perform a large-scale evaluation of the factual knowledge of a variety of open source LLMs, like OPT, Pythia, Llama(2), Mistral, Gemma, etc. over a large set of relations and facts from the Wikidata knowledge base. We observe differences in the factual knowledge between different model families and models of different sizes, that some relations are consistently better known than others but that models differ in the precise facts they know, and differences in the knowledge of base models and their finetuned counterparts.

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2403.18623 [pdf, other]

Antitrust, Amazon, and Algorithmic Auditing

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Jens Frankenreiter, Stefan Bechtold, Krishna P. Gummadi

Abstract: In digital markets, antitrust law and special regulations aim to ensure that markets remain competitive despite the dominating role that digital platforms play today in everyone's life. Unlike traditional markets, market participant behavior is easily observable in these markets. We present a series of empirical investigations into the extent to which Amazon engages in practices that are typically described as self-preferencing. We discuss how the computer science tools used in this paper can be used in a regulatory environment that is based on algorithmic auditing and requires regulating digital markets at scale.

Submitted 25 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: The paper has been accepted to appear at Journal of Institutional and Theoretical Economics (JITE) 2024

arXiv:2403.12410 [pdf]

TikTok and the Art of Personalization: Investigating Exploration and Exploitation on Social Media Feeds

Authors: Karan Vombatkere, Sepehr Mousavi, Savvas Zannettou, Franziska Roesner, Krishna P. Gummadi

Abstract: Recommendation algorithms for social media feeds often function as black boxes from the perspective of users. We aim to detect whether social media feed recommendations are personalized to users, and to characterize the factors contributing to personalization in these feeds. We introduce a general framework to examine a set of social media feed recommendations for a user as a timeline. We label items in the timeline as the result of exploration vs. exploitation of the user's interests on the part of the recommendation algorithm and introduce a set of metrics to capture the extent of personalization across user timelines. We apply our framework to a real TikTok dataset and validate our results using a baseline generated from automated TikTok bots, as well as a randomized baseline. We also investigate the extent to which factors such as video viewing duration, liking, and following drive the personalization of content on TikTok. Our results demonstrate that our framework produces intuitive and explainable results, and can be used to audit and understand personalization in social media feeds.

Submitted 18 March, 2024; originally announced March 2024.

Comments: ACM Web Conference 2024

arXiv:2306.00183 [pdf, other]

Diffused Redundancy in Pre-trained Representations

Authors: Vedant Nanda, Till Speicher, John P. Dickerson, Soheil Feizi, Krishna P. Gummadi, Adrian Weller

Abstract: Representations learned by pre-training a neural network on a large dataset are increasingly used successfully to perform a variety of downstream tasks. In this work, we take a closer look at how features are encoded in such pre-trained representations. We find that learned representations in a given layer exhibit a degree of diffuse redundancy, i.e., any randomly chosen subset of neurons in the layer that is larger than a threshold size shares a large degree of similarity with the full layer and is able to perform similarly as the whole layer on a variety of downstream tasks. For example, a linear probe trained on $20\%$ of randomly picked neurons from the penultimate layer of a ResNet50 pre-trained on ImageNet1k achieves an accuracy within $5\%$ of a linear probe trained on the full layer of neurons for downstream CIFAR10 classification. We conduct experiments on different neural architectures (including CNNs and Transformers) pre-trained on both ImageNet1k and ImageNet21k and evaluate a variety of downstream tasks taken from the VTAB benchmark. We find that the loss and dataset used during pre-training largely govern the degree of diffuse redundancy and the "critical mass" of neurons needed often depends on the downstream task, suggesting that there is a task-inherent redundancy-performance Pareto frontier. Our findings shed light on the nature of representations learned by pre-trained deep neural networks and suggest that entire layers might not be necessary to perform many downstream tasks. We investigate the potential for exploiting this redundancy to achieve efficient generalization for downstream tasks and also draw caution to certain possible unintended consequences. Our code is available at \url{https://github.com/nvedant07/diffused-redundancy}.

Submitted 14 November, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: NeurIPS 2023

arXiv:2305.19294 [pdf, other]

Pointwise Representational Similarity

Authors: Camila Kolling, Till Speicher, Vedant Nanda, Mariya Toneva, Krishna P. Gummadi

Abstract: With the increasing reliance on deep neural networks, it is important to develop ways to better understand their learned representations. Representation similarity measures have emerged as a popular tool for examining learned representations. However, existing measures only provide aggregate estimates of similarity at a global level, i.e. over a set of representations for N input examples. As such, these measures are not well-suited for investigating representations at a local level, i.e. representations of a single input example. Local similarity measures are needed, for instance, to understand which individual input representations are affected by training interventions to models (e.g. to be more fair and unbiased) or are at greater risk of being misclassified. In this work, we fill in this gap and propose Pointwise Normalized Kernel Alignment (PNKA), a measure that quantifies how similarly an individual input is represented in two representation spaces. Intuitively, PNKA compares the similarity of an input's neighborhoods across both spaces. Using our measure, we are able to analyze properties of learned representations at a finer granularity than what was previously possible. Concretely, we show how PNKA can be leveraged to develop a deeper understanding of (a) the input examples that are likely to be misclassified, (b) the concepts encoded by (individual) neurons in a layer, and (c) the effects of fairness interventions on learned representations.

Submitted 30 May, 2023; originally announced May 2023.
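
Illustrative sketch (not part of the listing): the abstract above describes comparing how an individual input relates to the other inputs in two representation spaces. The snippet below is a minimal sketch of that idea, assuming a linear kernel and a cosine comparison of each input's kernel row. It is not claimed to be the paper's exact PNKA definition, and the function names `gram`, `cosine`, and `pointwise_similarity` are hypothetical.

```python
import math


def gram(rows):
    """Linear-kernel Gram matrix: entry [i][j] is the dot product of rows i and j."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def pointwise_similarity(reps_a, reps_b):
    """One score per input: how similar its kernel row is across the two spaces.

    Assumption: row i of reps_a and row i of reps_b represent the same input.
    """
    if len(reps_a) != len(reps_b):
        raise ValueError("both spaces must represent the same inputs")
    kernel_a, kernel_b = gram(reps_a), gram(reps_b)
    return [cosine(row_a, row_b) for row_a, row_b in zip(kernel_a, kernel_b)]


# Made-up toy data: space_b is space_a with its two feature axes swapped,
# so every input keeps the same relationships to all other inputs.
space_a = [[1, 0], [0, 1], [1, 1]]
space_b = [[0, 1], [1, 0], [1, 1]]
# space_c moves the second input on top of the first, changing relationships.
space_c = [[1, 0], [1, 0], [1, 1]]

same_scores = pointwise_similarity(space_a, space_b)
changed_scores = pointwise_similarity(space_a, space_c)
```

Because the score is computed per input rather than averaged over the dataset, inputs whose relationships change between the two spaces can be singled out, which is the kind of local analysis the abstract motivates.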

arXiv:2305.17655 [pdf, other]

Understanding Blockchain Governance: Analyzing Decentralized Voting to Amend DeFi Smart Contracts

Authors: Johnnatan Messias, Vabuk Pahari, Balakrishnan Chandrasekaran, Krishna P. Gummadi, Patrick Loiseau

Abstract: Smart contracts are contractual agreements between participants of a blockchain, who cannot implicitly trust one another. They are software programs that run on top of a blockchain, and we may need to change them from time to time (e.g., to fix bugs or address new use cases). Governance protocols define the means for amending or changing these smart contracts without any centralized authority. They distribute the decision-making power to every user of the smart contract: Users vote on accepting or rejecting every change. In this work, we review and characterize decentralized governance in practice, using Compound and Uniswap -- two widely used governance protocols -- as a case study. We reveal a high concentration of voting power in both Compound and Uniswap: 10 voters hold together 57.86% and 44.72% of the voting power, respectively. Although proposals to change or amend the protocol receive, on average, a substantial number of votes (i.e., 89.39%) in favor within the Compound protocol, they require fewer than three voters to obtain 50% or more votes. We show that voting on Compound proposals can be unfairly expensive for small token holders, and we discover voting coalitions that can further marginalize these users.

Submitted 21 April, 2024; v1 submitted 28 May, 2023; originally announced May 2023.
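
Illustrative sketch (not part of the listing): the concentration figures quoted in the abstract above (the share held by the top 10 voters, and the number of voters needed to reach 50% of the votes) are simple aggregate statistics. The snippet below shows one way such numbers could be computed, assuming a plain mapping from voter to voting power. The function names and the sample data are hypothetical and are not taken from the paper or its data sets.

```python
def top_k_share(voting_power, k=10):
    """Fraction of total voting power held by the k largest holders."""
    total = sum(voting_power.values())
    if total <= 0:
        raise ValueError("total voting power must be positive")
    largest = sorted(voting_power.values(), reverse=True)[:k]
    return sum(largest) / total


def min_voters_for_share(voting_power, threshold=0.5):
    """Smallest number of holders whose combined power reaches the threshold."""
    total = sum(voting_power.values())
    running = 0.0
    ranked = sorted(voting_power.values(), reverse=True)
    for count, power in enumerate(ranked, start=1):
        running += power
        if running >= threshold * total:
            return count
    return len(voting_power)


# Made-up snapshot of voting power per voter (arbitrary units).
example = {"a": 40, "b": 25, "c": 15, "d": 10, "e": 5, "f": 5}

share_top_two = top_k_share(example, k=2)
voters_needed = min_voters_for_share(example, threshold=0.5)
```

In this toy snapshot the two largest holders together pass the 50% mark, which mirrors the kind of concentration the abstract reports without reproducing its numbers.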

arXiv:2302.06962 [pdf, other]

Dissecting Bitcoin and Ethereum Transactions: On the Lack of Transaction Contention and Prioritization Transparency in Blockchains

Authors: Johnnatan Messias, Vabuk Pahari, Balakrishnan Chandrasekaran, Krishna P. Gummadi, Patrick Loiseau

Abstract: In permissionless blockchains, transaction issuers include a fee to incentivize miners to include their transactions. To accurately estimate this prioritization fee for a transaction, transaction issuers (or blockchain participants, more generally) rely on two fundamental notions of transparency, namely contention and prioritization transparency. Contention transparency implies that participants are aware of every pending transaction that will contend with a given transaction for inclusion. Prioritization transparency states that the participants are aware of the transaction or prioritization fees paid by every such contending transaction. Neither of these notions of transparency holds well today. Private relay networks, for instance, allow users to send transactions privately to miners. Besides, users can offer fees to miners via either direct transfers to miners' wallets or off-chain payments -- neither of which are public. In this work, we characterize the lack of contention and prioritization transparency in Bitcoin and Ethereum resulting from such practices. We show that private relay networks are widely used and private transactions are quite prevalent. We show that the lack of transparency facilitates miners to collude and overcharge users who may use these private relay networks despite them offering little to no guarantees on transaction prioritization. The lack of these transparencies in blockchains has crucial implications for transaction issuers as well as the stability of blockchains. Finally, we make our data sets and scripts publicly available.

Submitted 24 May, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: This is a pre-print of our paper accepted to appear to the Financial Cryptography and Data Security 2023 (FC '23)

Journal ref: In Proceedings of the Financial Cryptography and Data Security 2023 (FC '23)

arXiv:2301.04945 [pdf, other]

Analyzing User Engagement with TikTok's Short Format Video Recommendations using Data Donations

Authors: Savvas Zannettou, Olivia-Nemes Nemeth, Oshrat Ayalon, Angelica Goetzen, Krishna P. Gummadi, Elissa M. Redmiles, Franziska Roesner

Abstract: Short-format videos have exploded on platforms like TikTok, Instagram, and YouTube. Despite this, the research community lacks large-scale empirical studies into how people engage with short-format videos and the role of recommendation systems that offer endless streams of such content. In this work, we analyze user engagement on TikTok using data we collect via a data donation system that allows TikTok users to donate their data. We recruited 347 TikTok users and collected 9.2M TikTok video recommendations they received. By analyzing user engagement, we find that the average daily usage time increases over the users' lifetime while the user attention remains stable at around 45%. We also find that users like more videos uploaded by people they follow than those recommended by people they do not follow. Our study offers valuable insights into how users engage with short-format videos on TikTok and lessons learned from designing a data donation system.

Submitted 20 March, 2024; v1 submitted 12 January, 2023; originally announced January 2023.

Comments: CHI Conference on Human Factors in Computing Systems 2024 (CHI '24)

arXiv:2209.03821 [pdf, other]

Taking Advice from (Dis)Similar Machines: The Impact of Human-Machine Similarity on Machine-Assisted Decision-Making

Authors: Nina Grgić-Hlača, Claude Castelluccia, Krishna P. Gummadi

Abstract: Machine learning algorithms are increasingly used to assist human decision-making. When the goal of machine assistance is to improve the accuracy of human decisions, it might seem appealing to design ML algorithms that complement human knowledge. While neither the algorithm nor the human are perfectly accurate, one could expect that their complementary expertise might lead to improved outcomes. In this study, we demonstrate that in practice decision aids that are not complementary, but make errors similar to human ones may have their own benefits. In a series of human-subject experiments with a total of 901 participants, we study how the similarity of human and machine errors influences human perceptions of and interactions with algorithmic decision aids. We find that (i) people perceive more similar decision aids as more useful, accurate, and predictable, and that (ii) people are more likely to take opposing advice from more similar decision aids, while (iii) decision aids that are less similar to humans have more opportunities to provide opposing advice, resulting in a higher influence on people's decisions overall.

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2206.11939 [pdf, other]

Measuring Representational Robustness of Neural Networks Through Shared Invariances

Authors: Vedant Nanda, Till Speicher, Camila Kolling, John P. Dickerson, Krishna P. Gummadi, Adrian Weller

Abstract: A major challenge in studying robustness in deep learning is defining the set of ``meaningless'' perturbations to which a given Neural Network (NN) should be invariant. Most work on robustness implicitly uses a human as the reference model to define such perturbations. Our work offers a new view on robustness by using another reference NN to define the set of perturbations a given NN should be invariant to, thus generalizing the reliance on a reference ``human NN'' to any NN. This makes measuring robustness equivalent to measuring the extent to which two NNs share invariances, for which we propose a measure called STIR. STIR re-purposes existing representation similarity measures to make them suitable for measuring shared invariances. Using our measure, we are able to gain insights into how shared invariances vary with changes in weight initialization, architecture, loss functions, and training dataset. Our implementation is available at: \url{https://github.com/nvedant07/STIR}.

Submitted 23 June, 2022; originally announced June 2022.

Comments: Accepted for oral presentation at ICML 2022

arXiv:2205.04790 [pdf, other]

doi 10.1145/3531146.3533199

Don't Throw it Away! The Utility of Unlabeled Data in Fair Decision Making

Authors: Miriam Rateike, Ayan Majumdar, Olga Mineeva, Krishna P. Gummadi, Isabel Valera

Abstract: Decision making algorithms, in practice, are often trained on data that exhibits a variety of biases. Decision-makers often aim to take decisions based on some ground-truth target that is assumed or expected to be unbiased, i.e., equally distributed across socially salient groups. In many practical settings, the ground-truth cannot be directly observed, and instead, we have to rely on a biased proxy measure of the ground-truth, i.e., biased labels, in the data. In addition, data is often selectively labeled, i.e., even the biased labels are only observed for a small fraction of the data that received a positive decision. To overcome label and selection biases, recent work proposes to learn stochastic, exploring decision policies via i) online training of new policies at each time-step and ii) enforcing fairness as a constraint on performance. However, the existing approach uses only labeled data, disregarding a large amount of unlabeled data, and thereby suffers from high instability and variance in the learned decision policies at different times. In this paper, we propose a novel method based on a variational autoencoder for practical fair decision-making. Our method learns an unbiased data representation leveraging both labeled and unlabeled data and uses the representations to learn a policy in an online process. Using synthetic data, we empirically validate that our method converges to the optimal (fair) policy according to the ground-truth with low variance. In real-world experiments, we further show that our training approach not only offers a more stable learning process but also yields policies with higher fairness as well as utility than previous approaches.

Submitted 4 July, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

arXiv:2204.12062 [pdf, other]

doi 10.1145/3485447.3512136

Scheduling Virtual Conferences Fairly: Achieving Equitable Participant and Speaker Satisfaction

Authors: Gourab K. Patro, Prithwish Jana, Abhijnan Chakraborty, Krishna P. Gummadi, Niloy Ganguly

Abstract: Recently, almost all conferences have moved to virtual mode due to the pandemic-induced restrictions on travel and social gathering. Contrary to in-person conferences, virtual conferences face the challenge of efficiently scheduling talks, accounting for the availability of participants from different timezones and their interests in attending different talks. A natural objective for conference organizers is to maximize efficiency, e.g., total expected audience participation across all talks. However, we show that optimizing for efficiency alone can result in an unfair virtual conference schedule, where individual utilities for participants and speakers can be highly unequal. To address this, we formally define fairness notions for participants and speakers, and derive suitable objectives to account for them. As the efficiency and fairness objectives can be in conflict with each other, we propose a joint optimization framework that allows conference organizers to design schedules that balance (i.e., allow trade-offs) among efficiency, participant fairness and speaker fairness objectives. While the optimization problem can be solved using integer programming to schedule smaller conferences, we provide two scalable techniques to cater to bigger conferences. Extensive evaluations over multiple real-world datasets show the efficacy and flexibility of our proposed approaches.

Submitted 25 April, 2022; originally announced April 2022.

Comments: In proceedings of the Thirty-first Web Conference (WWW-2022). arXiv admin note: text overlap with arXiv:2010.14624

arXiv:2204.00241 [pdf, other]

FaiRIR: Mitigating Exposure Bias from Related Item Recommendations in Two-Sided Platforms

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

Abstract: Related Item Recommendations (RIRs) are ubiquitous in most online platforms today, including e-commerce and content streaming sites. These recommendations not only help users compare items related to a given item, but also play a major role in bringing traffic to individual items, thus deciding the exposure that different items receive. With a growing number of people depending on such platforms to earn their livelihood, it is important to understand whether different items are receiving their desired exposure. To this end, our experiments on multiple real-world RIR datasets reveal that the existing RIR algorithms often result in very skewed exposure distribution of items, and the quality of items is not a plausible explanation for such skew in exposure. To mitigate this exposure bias, we introduce multiple flexible interventions (FaiRIR) in the RIR pipeline. We instantiate these mechanisms with two well-known algorithms for constructing related item recommendations -- rating-SVD and item2vec -- and show on real-world data that our mechanisms allow for a fine-grained control on the exposure distribution, often at a small or no cost in terms of recommendation quality, measured in terms of relatedness and user satisfaction.

Submitted 1 April, 2022; originally announced April 2022.

Comments: This work has been accepted as a regular paper in IEEE Transactions on Computational Social Systems 2022 (IEEE TCSS'22)

arXiv:2202.03934 [pdf, other]

Alexa, in you, I trust! Fairness and Interpretability Issues in E-commerce Search through Smart Speakers

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

Abstract: In traditional (desktop) e-commerce search, a customer issues a specific query and the system returns a ranked list of products in order of relevance to the query. An increasingly popular alternative in e-commerce search is to issue a voice-query to a smart speaker (e.g., Amazon Echo) powered by a voice assistant (VA, e.g., Alexa). In this situation, the VA usually spells out the details of only one product, an explanation citing the reason for its selection, and a default action of adding the product to the customer's cart. This reduced autonomy of the customer in the choice of a product during voice-search makes it necessary for a VA to be far more responsible and trustworthy in its explanation and default action. In this paper, we ask whether the explanation presented for a product selection by the Alexa VA installed on an Amazon Echo device is consistent with human understanding as well as with the observations on other traditional mediums (e.g., desktop ecommerce search). Through a user survey, we find that in 81% cases the interpretation of 'a top result' by the users is different from that of Alexa. While investigating for the fairness of the default action, we observe that over a set of as many as 1000 queries, in nearly 68% cases, there exist one or more products which are more relevant (as per Amazon's own desktop search results) than the product chosen by Alexa. Finally, we conducted a survey over 30 queries for which the Alexa-selected product was different from the top desktop search result, and observed that in nearly 73% cases, the participants preferred the top desktop search result as opposed to the product chosen by Alexa. Our results raise several concerns and necessitate more discussions around the related fairness and interpretability issues of VAs for e-commerce search.

Submitted 8 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

Comments: This work has been accepted at The Web Conference 2022 (WWW'22)

arXiv:2201.07726 [pdf, other]

"Learn the Facts About COVID-19": Analyzing the Use of Warning Labels on TikTok Videos

Authors: Chen Ling, Krishna P. Gummadi, Savvas Zannettou

Abstract: During the COVID-19 pandemic, health-related misinformation and harmful content shared online had a significant adverse effect on society. To mitigate this adverse effect, mainstream social media platforms employed soft moderation interventions (i.e., warning labels) on potentially harmful posts. Despite the recent popularity of these moderation interventions, we lack empirical analyses aiming to uncover how these warning labels are used in the wild, particularly during challenging times like the COVID-19 pandemic. In this work, we analyze the use of warning labels on TikTok, focusing on COVID-19 videos. First, we construct a set of 26 COVID-19 related hashtags, then we collect 41K videos that include those hashtags in their description. Second, we perform a quantitative analysis on the entire dataset to understand the use of warning labels on TikTok. Then, we perform an in-depth qualitative study, using thematic analysis, on 222 COVID-19 related videos to assess the content and the connection between the content and the warning labels. Our analysis shows that TikTok broadly applies warning labels to videos, likely based on hashtags included in the description. More worrying is the addition of COVID-19 warning labels on videos whose actual content is not related to COVID-19 (23% of the cases in a sample of 143 English videos that are not related to COVID-19). Finally, our qualitative analysis on a sample of 222 videos shows that 7.7% of the videos share misinformation/harmful content and do not include warning labels, 37.3% share benign information and include warning labels, and that 35% of the videos that share misinformation/harmful content (and need a warning label) are made for fun. Our study demonstrates the need to develop more accurate and precise soft moderation systems, especially on a platform like TikTok that is extremely popular among people of younger age.

Submitted 19 January, 2022; originally announced January 2022.

Comments: 11 pages (include reference), 4 figures

arXiv:2201.01180 [pdf, other]

doi 10.1145/3503624

Towards Fair Recommendation in Two-Sided Platforms

Authors: Arpita Biswas, Gourab K Patro, Niloy Ganguly, Krishna P. Gummadi, Abhijnan Chakraborty

Abstract: Many online platforms today (such as Amazon, Netflix, Spotify, LinkedIn, and AirBnB) can be thought of as two-sided markets with producers and customers of goods and services. Traditionally, recommendation services in these platforms have focused on maximizing customer satisfaction by tailoring the results according to the personalized preferences of individual customers. However, our investigation reinforces the fact that such customer-centric design of these services may lead to unfair distribution of exposure to the producers, which may adversely impact their well-being. On the other hand, a pure producer-centric design might become unfair to the customers. As more and more people are depending on such platforms to earn a living, it is important to ensure fairness to both producers and customers. In this work, by mapping a fair personalized recommendation problem to a constrained version of the problem of fairly allocating indivisible goods, we propose to provide fairness guarantees for both sides. Formally, our proposed FairRec algorithm guarantees Maxi-Min Share ($α$-MMS) of exposure for the producers, and Envy-Free up to One Item (EF1) fairness for the customers. Extensive evaluations over multiple real-world datasets show the effectiveness of FairRec in ensuring two-sided fairness while incurring a marginal loss in overall recommendation quality. Finally, we present a modification of FairRec (named FairRecPlus) that, at the cost of additional computation time, improves the recommendation performance for the customers while maintaining the same fairness guarantees.

Submitted 26 December, 2021; originally announced January 2022.

Comments: ACM Transactions on the Web, Volume 16, Issue 2 May 2022, Article no 8. arXiv admin note: substantial text overlap with arXiv:2002.10764

arXiv:2112.05630 [pdf, other]

doi 10.1016/j.artint.2021.103609

On Fair Selection in the Presence of Implicit and Differential Variance

Authors: Vitalii Emelianov, Nicolas Gast, Krishna P. Gummadi, Patrick Loiseau

Abstract: Discrimination in selection problems such as hiring or college admission is often explained by implicit bias from the decision maker against disadvantaged demographic groups. In this paper, we consider a model where the decision maker receives a noisy estimate of each candidate's quality, whose variance depends on the candidate's group -- we argue that such differential variance is a key feature of many selection problems. We analyze two notable settings: in the first, the noise variances are unknown to the decision maker who simply picks the candidates with the highest estimated quality independently of their group; in the second, the variances are known and the decision maker picks candidates having the highest expected quality given the noisy estimate. We show that both baseline decision makers yield discrimination, although in opposite directions: the first leads to underrepresentation of the low-variance group while the second leads to underrepresentation of the high-variance group. We study the effect on the selection utility of imposing a fairness mechanism that we term the $γ$-rule (it is an extension of the classical four-fifths rule and it also includes demographic parity). In the first setting (with unknown variances), we prove that under mild conditions, imposing the $γ$-rule increases the selection utility -- here there is no trade-off between fairness and utility. In the second setting (with known variances), imposing the $γ$-rule decreases the utility but we prove a bound on the utility loss due to the fairness mechanism.

Submitted 10 December, 2021; originally announced December 2021.

Comments: Accepted for publication in the Artificial Intelligence Journal. This paper is an extended version of our paper arXiv:2006.13699: we added a Bayesian-optimal baseline (in addition to the group-oblivious baseline) and generalized the model by assuming a group-dependent distribution of quality and an implicit bias; but we removed the part on two-stage selection

arXiv:2111.14726 [pdf, other]

Do Invariances in Deep Neural Networks Align with Human Perception?

Authors: Vedant Nanda, Ayan Majumdar, Camila Kolling, John P. Dickerson, Krishna P. Gummadi, Bradley C. Love, Adrian Weller

Abstract: An evaluation criterion for safe and trustworthy deep learning is how well the invariances captured by representations of deep neural networks (DNNs) are shared with humans. We identify challenges in measuring these invariances. Prior works used gradient-based methods to generate identically represented inputs (IRIs), i.e., inputs which have identical representations (on a given layer) of a neural network, and thus capture invariances of a given network. One necessary criterion for a network's invariances to align with human perception is for its IRIs to look 'similar' to humans. Prior works, however, have mixed takeaways; some argue that later layers of DNNs do not learn human-like invariances (\cite{jenelle2019metamers}) yet others seem to indicate otherwise (\cite{mahendran2014understanding}). We argue that the loss function used to generate IRIs can heavily affect takeaways about invariances of the network and is the primary reason for these conflicting findings. We propose an adversarial regularizer on the IRI generation loss that finds IRIs that make any model appear to have very little shared invariance with humans. Based on this evidence, we argue that there is scope for improving models to have human-like invariances, and further, to have meaningful comparisons between models one should use IRIs generated using the regularizer-free loss. We then conduct an in-depth investigation of how different components (e.g., architectures, training losses, data augmentations) of the deep learning pipeline contribute to learning models that have good alignment with humans. We find that architectures with residual connections trained using a (self-supervised) contrastive loss with $\ell_p$ ball adversarial data augmentation tend to learn invariances that are most aligned with humans. Code: \url{github.com/nvedant07/Human-NN-Alignment}.

Submitted 2 December, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: AAAI 2023

arXiv:2110.11740 [pdf, other]

doi 10.1145/3487552.3487823

Selfish & Opaque Transaction Ordering in the Bitcoin Blockchain: The Case for Chain Neutrality

Authors: Johnnatan Messias, Mohamed Alzayat, Balakrishnan Chandrasekaran, Krishna P. Gummadi, Patrick Loiseau, Alan Mislove

Abstract: Most public blockchain protocols, including the popular Bitcoin and Ethereum blockchains, do not formally specify the order in which miners should select transactions from the pool of pending (or uncommitted) transactions for inclusion in the blockchain. Over the years, informal conventions or "norms" for transaction ordering have, however, emerged via the use of shared software by miners, e.g., the GetBlockTemplate (GBT) mining protocol in Bitcoin Core. Today, a widely held view is that Bitcoin miners prioritize transactions based on their offered "transaction fee-per-byte." Bitcoin users are, consequently, encouraged to increase the fees to accelerate the commitment of their transactions, particularly during periods of congestion. In this paper, we audit the Bitcoin blockchain and present statistically significant evidence of mining pools deviating from the norms to accelerate the commitment of transactions for which they have (i) a selfish or vested interest, or (ii) received dark-fee payments via opaque (non-public) side-channels. As blockchains are increasingly being used as a record-keeping substrate for a variety of decentralized (financial technology) systems, our findings call for an urgent discussion on defining neutrality norms that miners must adhere to when ordering transactions in the chains. Finally, we make our data sets and scripts publicly available.

Submitted 22 October, 2021; originally announced October 2021.

Comments: This is a pre-print of our paper accepted to appear at ACM IMC 2021

Journal ref: In Proceedings of the ACM SIGCOMM Internet Measurement Conference (IMC 2021)

arXiv:2109.04432 [pdf, other]

Detecting and Mitigating Test-time Failure Risks via Model-agnostic Uncertainty Learning

Authors: Preethi Lahoti, Krishna P. Gummadi, Gerhard Weikum

Abstract: Reliably predicting potential failure risks of machine learning (ML) systems when deployed with production data is a crucial aspect of trustworthy AI. This paper introduces Risk Advisor, a novel post-hoc meta-learner for estimating failure risks and predictive uncertainties of any already-trained black-box classification model. In addition to providing a risk score, the Risk Advisor decomposes the uncertainty estimates into aleatoric and epistemic uncertainty components, thus giving informative insights into the sources of uncertainty inducing the failures. Consequently, Risk Advisor can distinguish between failures caused by data variability, data shifts and model limitations and advise on mitigation actions (e.g., collecting more data to counter data shift). Extensive experiments on various families of black-box classification models and on real-world and synthetic datasets covering common ML failure scenarios show that the Risk Advisor reliably predicts deployment-time failure risks in all the scenarios, and outperforms strong baselines.

Submitted 9 September, 2021; originally announced September 2021.

Comments: To appear in the 21st IEEE International Conference on Data Mining (ICDM 2021), Auckland, New Zealand

arXiv:2106.02970 [pdf, other]

Modeling Coordinated vs. P2P Mining: An Analysis of Inefficiency and Inequality in Proof-of-Work Blockchains

Authors: Mohamed Alzayat, Johnnatan Messias, Balakrishnan Chandrasekaran, Krishna P. Gummadi, Patrick Loiseau

Abstract: We study efficiency in a proof-of-work blockchain with non-zero latencies, focusing in particular on the (inequality in) individual miners' efficiencies. Prior work attributed differences in miners' efficiencies mostly to attacks, but we pursue a different question: Can inequality in miners' efficiencies be explained by delays, even when all miners are honest? Traditionally, such efficiency-related questions were tackled only at the level of the overall system, and in a peer-to-peer (P2P) setting where miners directly connect to one another. Despite it being common today for miners to pool compute capacities in a mining pool managed by a centralized coordinator, efficiency in such a coordinated setting has barely been studied. In this paper, we propose a simple model of a proof-of-work blockchain with latencies for both the P2P and the coordinated settings. We derive a closed-form expression for the efficiency in the coordinated setting with an arbitrary number of miners and arbitrary latencies, both for the overall system and for each individual miner. We leverage this result to show that inequalities arise from variability in the delays, but that if all miners are equidistant from the coordinator, they have equal efficiency irrespective of their compute capacities. We then prove that, under a natural consistency condition, the overall system efficiency in the P2P setting is higher than that in the coordinated setting. Finally, we perform a simulation-based study to demonstrate that even in the P2P setting delays between miners introduce inequalities, and that there is a more complex interplay between delays and compute capacities.

Submitted 5 June, 2021; originally announced June 2021.

Comments: 12 pages, 11 figures

arXiv:2105.04273 [pdf, other]

doi 10.1145/3461702.3462630

Loss-Aversively Fair Classification

Authors: Junaid Ali, Muhammad Bilal Zafar, Adish Singla, Krishna P. Gummadi

Abstract: The use of algorithmic (learning-based) decision making in scenarios that affect human lives has motivated a number of recent studies to investigate such decision making systems for potential unfairness, such as discrimination against subjects based on their sensitive features like gender or race. However, when judging the fairness of a newly designed decision making system, these studies have overlooked an important influence on people's perceptions of fairness, which is how the new algorithm changes the status quo, i.e., decisions of the existing decision making system. Motivated by extensive literature in behavioral economics and behavioral psychology (prospect theory), we propose a notion of fair updates that we refer to as loss-averse updates. Loss-averse updates constrain the updates to yield improved (more beneficial) outcomes to subjects compared to the status quo. We propose tractable proxy measures that would allow this notion to be incorporated in the training of a variety of linear and non-linear classifiers. We show how our proxy measures can be combined with existing measures for training nondiscriminatory classifiers. Our evaluation using synthetic and real-world datasets demonstrates that the proposed proxy measures are effective for their desired tasks.

Submitted 10 May, 2021; originally announced May 2021.

Comments: 8 pages, Accepted at AIES 2019

Journal ref: In AAAI/ACM Conference on AI, Ethics, and Society (AIES 2019), January 27-28 2019 Honolulu, HI, USA

arXiv:2105.04249 [pdf, other]

doi 10.1145/3461702.3462630

Accounting for Model Uncertainty in Algorithmic Discrimination

Authors: Junaid Ali, Preethi Lahoti, Krishna P. Gummadi

Abstract: Traditional approaches to ensure group fairness in algorithmic decision making aim to equalize "total" error rates for different subgroups in the population. In contrast, we argue that the fairness approaches should instead focus only on equalizing errors arising due to model uncertainty (a.k.a. epistemic uncertainty), caused due to lack of knowledge about the best model or due to lack of data. In other words, our proposal calls for ignoring the errors that occur due to uncertainty inherent in the data, i.e., aleatoric uncertainty. We draw a connection between predictive multiplicity and model uncertainty and argue that the techniques from predictive multiplicity could be used to identify errors made due to model uncertainty. We propose scalable convex proxies to come up with classifiers that exhibit predictive multiplicity and empirically show that our methods are comparable in performance and up to four orders of magnitude faster than the current state-of-the-art. We further propose methods to achieve our goal of equalizing group error rates arising due to model uncertainty in algorithmic decision making and demonstrate the effectiveness of these methods using synthetic and real-world datasets.

Submitted 10 May, 2021; originally announced May 2021.

Comments: 12 pages, Accepted at AIES 2021

arXiv:2105.02725 [pdf, other]

CrossWalk: Fairness-enhanced Node Representation Learning

Authors: Ahmad Khajehnejad, Moein Khajehnejad, Mahmoudreza Babaei, Krishna P. Gummadi, Adrian Weller, Baharan Mirzasoleiman

Abstract: The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. Much recent work has focused on developing algorithmic tools to assess and mitigate such unfairness. However, there is little work on enhancing fairness in graph algorithms. Here, we develop a simple, effective and general method, CrossWalk, that enhances fairness of various graph algorithms, including influence maximization, link prediction and node classification, applied to node embeddings. CrossWalk is applicable to any random walk based node representation learning algorithm, such as DeepWalk and Node2Vec. The key idea is to bias random walks to cross group boundaries, by upweighting edges which (1) are closer to the groups' peripheries or (2) connect different groups in the network. CrossWalk pulls nodes that are near groups' peripheries towards their neighbors from other groups in the embedding space, while preserving the necessary structural properties of the graph. Extensive experiments show the effectiveness of our algorithm to enhance fairness in various graph algorithms, including influence maximization, link prediction and node classification in synthetic and real networks, with only a very small decrease in performance.

Submitted 25 March, 2022; v1 submitted 6 May, 2021; originally announced May 2021.

Comments: Association for the Advancement of Artificial Intelligence (AAAI) 2022

arXiv:2102.00141 [pdf, other]

When the Umpire is also a Player: Bias in Private Label Product Recommendations on E-commerce Marketplaces

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

Abstract: Algorithmic recommendations mediate interactions between millions of customers and products (in turn, their producers and sellers) on large e-commerce marketplaces like Amazon. In recent years, the producers and sellers have raised concerns about the fairness of black-box recommendation algorithms deployed on these marketplaces. Many complaints are centered around marketplaces biasing the algorithms to preferentially favor their own 'private label' products over competitors. These concerns are exacerbated as marketplaces increasingly de-emphasize or replace 'organic' recommendations with ad-driven 'sponsored' recommendations, which include their own private labels. While these concerns have been covered in popular press and have spawned regulatory investigations, to our knowledge, there has not been any public audit of these marketplace algorithms. In this study, we bridge this gap by performing an end-to-end systematic audit of related item recommendations on Amazon. We propose a network-centric framework to quantify and compare the biases across organic and sponsored related item recommendations. Along a number of our proposed bias measures, we find that the sponsored recommendations are significantly more biased toward Amazon private label products compared to organic recommendations. While our findings are primarily interesting to producers and sellers on Amazon, our proposed bias measures are generally useful for measuring link formation bias in any social or content networks.

Submitted 1 February, 2021; v1 submitted 29 January, 2021; originally announced February 2021.

Comments: This work has been accepted for presentation at the ACM Conference on Fairness, Accountability, and Transparency 2021 (ACM FAccT 2021)

arXiv:2010.14624 [pdf, other]

On Fair Virtual Conference Scheduling: Achieving Equitable Participant and Speaker Satisfaction

Authors: Gourab K Patro, Abhijnan Chakraborty, Niloy Ganguly, Krishna P. Gummadi

Abstract: The (COVID-19) pandemic-induced restrictions on travel and social gatherings have prompted most conference organizers to move their events online. However, in contrast to physical conferences, virtual conferences face a challenge in efficiently scheduling talks, accounting for the availability of participants from different time-zones as well as their interests in attending different talks. In such settings, a natural objective for the conference organizers would be to maximize some global welfare measure, such as the total expected audience participation across all talks. However, we show that optimizing for global welfare could result in a schedule that is unfair to the stakeholders, i.e., the individual utilities for participants and speakers can be highly unequal. To address the fairness concerns, we formally define fairness notions for participants and speakers, and subsequently derive suitable fairness objectives for them. We show that the welfare and fairness objectives can be in conflict with each other, and there is a need to maintain a balance between these objectives while caring for them simultaneously. Thus, we propose a joint optimization framework that allows conference organizers to design talk schedules that balance (i.e., allow trade-offs) between global welfare, participant fairness and the speaker fairness objectives. We show that the optimization problem can be solved using integer linear programming, and empirically evaluate the necessity and benefits of such a joint optimization approach in virtual conference scheduling.

Submitted 24 October, 2020; originally announced October 2020.

arXiv:2007.00251 [pdf, other]

Unifying Model Explainability and Robustness via Machine-Checkable Concepts

Authors: Vedant Nanda, Till Speicher, John P. Dickerson, Krishna P. Gummadi, Muhammad Bilal Zafar

Abstract: As deep neural networks (DNNs) get adopted in an ever-increasing number of applications, explainability has emerged as a crucial desideratum for these models. In many real-world tasks, one of the principal reasons for requiring explainability is to in turn assess prediction robustness, where predictions (i.e., class labels) that do not conform to their respective explanations (e.g., presence or absence of a concept in the input) are deemed to be unreliable. However, most, if not all, prior methods for checking explanation-conformity (e.g., LIME, TCAV, saliency maps) require significant manual intervention, which hinders their large-scale deployability. In this paper, we propose a robustness-assessment framework, at the core of which is the idea of using machine-checkable concepts. Our framework defines a large number of concepts that the DNN explanations could be based on and performs the explanation-conformity check at test time to assess prediction robustness. Both steps are executed in an automated manner without requiring any human intervention and are easily scaled to datasets with a very large number of classes. Experiments on real-world datasets and human surveys show that our framework is able to enhance prediction robustness significantly: the predictions marked to be robust by our framework have significantly higher accuracy and are more robust to adversarial perturbations.

Submitted 2 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

Comments: 22 pages, 12 figures, 11 tables

arXiv:2006.13699 [pdf, other]

doi 10.1145/3391403.3399482

On Fair Selection in the Presence of Implicit Variance

Authors: Vitalii Emelianov, Nicolas Gast, Krishna P. Gummadi, Patrick Loiseau

Abstract: Quota-based fairness mechanisms like the so-called Rooney rule or four-fifths rule are used in selection problems such as hiring or college admission to reduce inequalities based on sensitive demographic attributes. These mechanisms are often viewed as introducing a trade-off between selection fairness and utility. In recent work, however, Kleinberg and Raghavan showed that, in the presence of implicit bias in estimating candidates' quality, the Rooney rule can increase the utility of the selection process. We argue that even in the absence of implicit bias, the estimates of candidates' quality from different groups may differ in another fundamental way, namely, in their variance. We term this phenomenon implicit variance and we ask: can fairness mechanisms be beneficial to the utility of a selection process in the presence of implicit variance (even in the absence of implicit bias)? To answer this question, we propose a simple model in which candidates have a true latent quality that is drawn from a group-independent normal distribution. To make the selection, a decision maker receives an unbiased estimate of the quality of each candidate, with normal noise, but whose variance depends on the candidate's group. We then compare the utility obtained by imposing a fairness mechanism that we term $γ$-rule (it includes demographic parity and the four-fifths rule as special cases), to that of a group-oblivious selection algorithm that picks the candidates with the highest estimated quality independently of their group. Our main result shows that the demographic parity mechanism always increases the selection utility, while any $γ$-rule weakly increases it. We extend our model to a two-stage selection process where the true quality is observed at the second stage. We discuss multiple extensions of our results, in particular to different distributions of the true latent quality.

Submitted 24 June, 2020; originally announced June 2020.

Comments: 27 pages, 10 figures, Economics and Computation (EC'20)
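
The implicit-variance model described in the abstract above can be explored with a small simulation. The following is a minimal sketch, not the authors' code: the function name `simulate`, the group labels, the noise levels (0.2 and 2.0), the selection rate (10%) and the use of mean true quality as the utility are all assumptions chosen for illustration. It draws a latent quality from a standard normal for every candidate, adds unbiased normal noise whose standard deviation depends on the group, and compares a group-oblivious top-k selection with a demographic-parity selection that applies the same selection rate within each group.

```python
import random


def simulate(n_per_group=5000, sigma_low=0.2, sigma_high=2.0, alpha=0.1, seed=0):
    """Return (group-oblivious utility, demographic-parity utility).

    Utility is the mean true quality of the selected candidates
    (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    candidates = []  # tuples of (group, true_quality, noisy_estimate)
    for group, sigma in (("A", sigma_low), ("B", sigma_high)):
        for _ in range(n_per_group):
            quality = rng.gauss(0.0, 1.0)              # group-independent latent quality
            estimate = quality + rng.gauss(0.0, sigma)  # unbiased, group-dependent noise
            candidates.append((group, quality, estimate))

    # Rank everyone by the noisy estimate, best first.
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)

    # Group-oblivious: take the overall top alpha fraction.
    oblivious = ranked[: int(alpha * len(ranked))]

    # Demographic parity: take the top alpha fraction within each group.
    parity = []
    for group in ("A", "B"):
        members = [c for c in ranked if c[0] == group]
        parity.extend(members[: int(alpha * len(members))])

    def mean_quality(selected):
        return sum(c[1] for c in selected) / len(selected)

    return mean_quality(oblivious), mean_quality(parity)


if __name__ == "__main__":
    u_oblivious, u_parity = simulate()
    print(f"group-oblivious utility:    {u_oblivious:.3f}")
    print(f"demographic-parity utility: {u_parity:.3f}")
```

With these assumed parameters, the group-oblivious rule selects mostly from the high-noise group, whose extreme estimates reflect noise more than quality, so the parity selection should end up with higher average true quality.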

arXiv:2005.09209 [pdf, other]

doi 10.1145/3386392.3399568

Fair Inputs and Fair Outputs: The Incompatibility of Fairness in Privacy and Accuracy

Authors: Bashir Rastegarpanah, Mark Crovella, Krishna P. Gummadi

Abstract: Fairness concerns about algorithmic decision-making systems have been mainly focused on the outputs (e.g., the accuracy of a classifier across individuals or groups). However, one may additionally be concerned with fairness in the inputs. In this paper, we propose and formulate two properties regarding the inputs of (features used by) a classifier. In particular, we claim that fair privacy (whether individuals are all asked to reveal the same information) and need-to-know (whether users are only asked for the minimal information required for the task at hand) are desirable properties of a decision system. We explore the interaction between these properties and fairness in the outputs (fair prediction accuracy). We show that for an optimal classifier these three properties are in general incompatible, and we explain what common properties of data make them incompatible. Finally we provide an algorithm to verify if the trade-off between the three properties exists in a given dataset, and use the algorithm to show that this trade-off is common in real data.

Submitted 24 May, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

arXiv:2002.10764 [pdf, other]

doi 10.1145/3366423.3380196

FairRec: Two-Sided Fairness for Personalized Recommendations in Two-Sided Platforms

Authors: Gourab K Patro, Arpita Biswas, Niloy Ganguly, Krishna P. Gummadi, Abhijnan Chakraborty

Abstract: We investigate the problem of fair recommendation in the context of two-sided online platforms, comprising customers on one side and producers on the other. Traditionally, recommendation services in these platforms have focused on maximizing customer satisfaction by tailoring the results according to the personalized preferences of individual customers. However, our investigation reveals that such customer-centric design may lead to unfair distribution of exposure among the producers, which may adversely impact their well-being. On the other hand, a producer-centric design might become unfair to the customers. Thus, we consider fairness issues that span both customers and producers. Our approach involves a novel mapping of the fair recommendation problem to a constrained version of the problem of fairly allocating indivisible goods. Our proposed FairRec algorithm guarantees at least Maximin Share (MMS) of exposure for most of the producers and Envy-Free up to One item (EF1) fairness for every customer. Extensive evaluations over multiple real-world datasets show the effectiveness of FairRec in ensuring two-sided fairness while incurring a marginal loss in the overall recommendation quality.

Submitted 23 June, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: In Proceedings of The Web Conference (WWW) 2020

arXiv:1910.13983 [pdf, other]

DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning

Authors: Michiel A. Bakker, Duy Patrick Tu, Humberto Riverón Valdés, Krishna P. Gummadi, Kush R. Varshney, Adrian Weller, Alex Pentland

Abstract: We introduce a framework for dynamic adversarial discovery of information (DADI), motivated by a scenario where information (a feature set) is used by third parties with unknown objectives. We train a reinforcement learning agent to sequentially acquire a subset of the information while balancing accuracy and fairness of predictors downstream. Based on the set of already acquired features, the agent decides dynamically to either collect more information from the set of available features or to stop and predict using the information that is currently available. Building on previous work exploring adversarial representation learning, we attain group fairness (demographic parity) by rewarding the agent with the adversary's loss, computed over the final feature set. Importantly, however, the framework provides a more general starting point for fair or private dynamic information discovery. Finally, we demonstrate empirically, using two real-world datasets, that we can trade off fairness and predictive performance.

Submitted 30 October, 2019; originally announced October 2019.

Comments: Accepted at NeurIPS 2019 HCML Workshop

arXiv:1910.10255 [pdf, other]

An Empirical Study on Learning Fairness Metrics for COMPAS Data with Human Supervision

Authors: Hanchen Wang, Nina Grgic-Hlaca, Preethi Lahoti, Krishna P. Gummadi, Adrian Weller

Abstract: The notion of individual fairness requires that similar people receive similar treatment. However, this is hard to achieve in practice since it is difficult to specify the appropriate similarity metric. In this work, we attempt to learn such a similarity metric from human-annotated data. We gather a new dataset of human judgments on a criminal recidivism prediction (COMPAS) task. By assuming the human supervision obeys the principle of individual fairness, we leverage prior work on metric learning, evaluate the performance of several metric learning methods on our dataset, and show that the learned metrics outperform the Euclidean and Precision metric under various criteria. We do not provide a way to directly learn a similarity metric satisfying individual fairness; rather, we provide an empirical study on how to derive the similarity metric from human supervisors, which future work can use as a tool to understand human supervision.

Submitted 31 October, 2019; v1 submitted 22 October, 2019; originally announced October 2019.

Comments: Accepted at NeurIPS 2019 HCML Workshop

arXiv:1909.10005 [pdf, ps, other]

Incremental Fairness in Two-Sided Market Platforms: On Smoothly Updating Recommendations

Authors: Gourab K Patro, Abhijnan Chakraborty, Niloy Ganguly, Krishna P. Gummadi

Abstract: Major online platforms today can be thought of as two-sided markets with producers and customers of goods and services. There have been concerns that over-emphasis on customer satisfaction by the platforms may affect the well-being of the producers. To counter such issues, a few recent works have attempted to incorporate fairness for the producers. However, these studies have overlooked an important issue in such platforms -- to supposedly improve customer utility, the underlying algorithms are frequently updated, causing abrupt changes in the exposure of producers. In this work, we focus on the fairness issues arising out of such frequent updates, and argue for incremental updates of the platform algorithms so that the producers have enough time to adjust (both logistically and mentally) to the change. However, naive incremental updates may become unfair to the customers. Thus focusing on recommendations deployed on two-sided platforms, we formulate an ILP based online optimization to deploy changes incrementally in n steps, where we can ensure smooth transition of the exposure of items while guaranteeing a minimum utility for every customer. Evaluations over multiple real world datasets show that our proposed mechanism for platform updates can be efficient and fair to both the producers and the customers in two-sided platforms.

Submitted 20 November, 2019; v1 submitted 22 September, 2019; originally announced September 2019.

Comments: To Appear In the Proceedings of 34th AAAI Conference on Artificial Intelligence (AAAI), New York, USA, Feb 2020

arXiv:1909.05583 [pdf, other]

Minimizing Margin of Victory for Fair Political and Educational Districting

Authors: Ana-Andreea Stoica, Abhijnan Chakraborty, Palash Dey, Krishna P. Gummadi

Abstract: In many practical scenarios, a population is divided into disjoint groups for better administration, e.g., electorates into political districts, employees into departments, students into school districts, and so on. However, grouping people arbitrarily may lead to biased partitions, raising concerns of gerrymandering in political districting, racial segregation in schools, etc. To counter such issues, in this paper, we conceptualize such problems in a voting scenario, and propose the FAIR DISTRICTING problem to divide a given set of people having preference over candidates into k groups such that the maximum margin of victory of any group is minimized. We also propose the FAIR CONNECTED DISTRICTING problem which additionally requires each group to be connected. We show that the FAIR DISTRICTING problem is NP-complete for plurality voting even if we have only 3 candidates but admits polynomial time algorithms if we assume k to be some constant or everyone can be moved to any group. In contrast, we show that the FAIR CONNECTED DISTRICTING problem is NP-complete for plurality voting even if we have only 2 candidates and k = 2. Finally, we propose heuristic algorithms for both the problems and show their effectiveness in UK political districting and in lowering racial segregation in public schools in the US.

Submitted 12 September, 2019; originally announced September 2019.
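
The objective in the abstract above (minimize the largest margin of victory over all groups) can be illustrated by evaluating candidate partitions. The sketch below is only an illustration under stated assumptions: it uses the gap between the top two plurality vote counts as a simplified proxy for the margin of victory, which may differ from the paper's formal definition, and the function names `vote_gap` and `max_vote_gap` as well as the toy electorate are hypothetical. It evaluates partitions; it does not search for an optimal one.

```python
from collections import Counter


def vote_gap(district):
    """Gap between the two largest plurality vote counts in one district.

    `district` is a list with each voter's top-ranked candidate.
    This gap is a simplified proxy for the margin of victory (assumption).
    """
    counts = sorted(Counter(district).values(), reverse=True)
    runner_up = counts[1] if len(counts) > 1 else 0
    return counts[0] - runner_up


def max_vote_gap(partition):
    """Worst-case (largest) gap over all districts of a partition."""
    return max(vote_gap(district) for district in partition)


# The same 12 voters (six prefer "x", six prefer "y"), split in two ways.
lopsided = [["x"] * 5 + ["y"], ["y"] * 5 + ["x"]]
balanced = [["x"] * 3 + ["y"] * 3, ["x"] * 3 + ["y"] * 3]

if __name__ == "__main__":
    print(max_vote_gap(lopsided))  # each district is decided by 4 votes
    print(max_vote_gap(balanced))  # both districts are tied
```

Under this proxy, the balanced partition is preferable because no district is decided by a large gap.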

arXiv:1907.01439 [pdf, other]

doi 10.14778/3372716.3372723

Operationalizing Individual Fairness with Pairwise Fair Representations

Authors: Preethi Lahoti, Krishna P. Gummadi, Gerhard Weikum

Abstract: We revisit the notion of individual fairness proposed by Dwork et al. A central challenge in operationalizing their approach is the difficulty in eliciting a human specification of a similarity metric. In this paper, we propose an operationalization of individual fairness that does not rely on a human specification of a distance metric. Instead, we propose novel approaches to elicit and leverage side-information on equally deserving individuals to counter subordination between social groups. We model this knowledge as a fairness graph, and learn a unified Pairwise Fair Representation (PFR) of the data that captures both data-driven similarity between individuals and the pairwise side-information in the fairness graph. We elicit fairness judgments from a variety of sources, including human judgments for two real-world datasets on recidivism prediction (COMPAS) and violent neighborhood prediction (Crime & Communities). Our experiments show that the PFR model for operationalizing individual fairness is practically viable.

Submitted 1 December, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

Comments: To be published in the proceedings of the VLDB Endowment, Vol. 13, Issue. 4

arXiv:1905.06618 [pdf, other]

On the Fairness of Time-Critical Influence Maximization in Social Networks

Authors: Junaid Ali, Mahmoudreza Babaei, Abhijnan Chakraborty, Baharan Mirzasoleiman, Krishna P. Gummadi, Adish Singla

Abstract: Influence maximization has found applications in a wide range of real-world problems, for instance, viral marketing of products in an online social network, and information propagation of valuable information such as job vacancy advertisements and health-related information. While existing algorithmic techniques usually aim at maximizing the total number of people influenced, the population often comprises several socially salient groups, e.g., based on gender or race. As a result, these techniques could lead to disparity across different groups in receiving important information. Furthermore, in many of these applications, the spread of influence is time-critical, i.e., it is only beneficial to be influenced before a time deadline. As we show in this paper, the time-criticality of the information could further exacerbate the disparity of influence across groups. This disparity, introduced by algorithms aimed at maximizing total influence, could have far-reaching consequences, impacting people's prosperity and putting minority groups at a big disadvantage. In this work, we propose a notion of group fairness in time-critical influence maximization. We introduce surrogate objective functions to solve the influence maximization problem under fairness considerations. By exploiting the submodularity structure of our objectives, we provide computationally efficient algorithms with guarantees that are effective in enforcing fairness during the propagation process. We demonstrate the effectiveness of our approach through synthetic and real-world experiments.

Submitted 3 November, 2021; v1 submitted 16 May, 2019; originally announced May 2019.

Comments: Accepted at TKDE and Human-Centric Machine Learning (HCML), Workshop at NeurIPS 2019

arXiv:1903.01209 [pdf, other]

On the Long-term Impact of Algorithmic Decision Policies: Effort Unfairness and Feature Segregation through Social Learning

Authors: Hoda Heidari, Vedant Nanda, Krishna P. Gummadi

Abstract: Most existing notions of algorithmic fairness are one-shot: they ensure some form of allocative equality at the time of decision making, but do not account for the adverse impact of the algorithmic decisions today on the long-term welfare and prosperity of certain segments of the population. We take a broader perspective on algorithmic fairness. We propose an effort-based measure of fairness and present a data-driven framework for characterizing the long-term impact of algorithmic policies on reshaping the underlying population. Motivated by the psychological literature on social learning and the economic literature on equality of opportunity, we propose a micro-scale model of how individuals may respond to decision-making algorithms. We employ existing measures of segregation from sociology and economics to quantify the resulting macro-scale population-level change. Importantly, we observe that different models may shift the group-conditional distribution of qualifications in different directions. Our findings raise a number of important questions regarding the formalization of fairness for decision-making models.

Submitted 27 June, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

arXiv:1812.01504 [pdf, other]

doi 10.1145/3289600.3291002

Fighting Fire with Fire: Using Antidote Data to Improve Polarization and Fairness of Recommender Systems

Authors: Bashir Rastegarpanah, Krishna P. Gummadi, Mark Crovella

Abstract: The increasing role of recommender systems in many aspects of society makes it essential to consider how such systems may impact social good. Various modifications to recommendation algorithms have been proposed to improve their performance for specific socially relevant measures. However, previous proposals are often not easily adapted to different measures, and they generally require the ability to modify either existing system inputs, the system's algorithm, or the system's outputs. As an alternative, in this paper we introduce the idea of improving the social desirability of recommender system outputs by adding more data to the input, an approach we view as providing `antidote' data to the system. We formalize the antidote data problem, and develop optimization-based solutions. We take as our model system the matrix factorization approach to recommendation, and we propose a set of measures to capture the polarization or fairness of recommendations. We then show how to generate antidote data for each measure, pointing out a number of computational efficiencies, and discuss the impact on overall system accuracy. Our experiments show that a modest budget for antidote data can lead to significant improvements in the polarization or fairness of recommendations.

Submitted 25 January, 2019; v1 submitted 2 December, 2018; originally announced December 2018.

Comments: References to appendices are fixed

arXiv:1811.08690 [pdf, other]

Equality of Voice: Towards Fair Representation in Crowdsourced Top-K Recommendations

Authors: Abhijnan Chakraborty, Gourab K Patro, Niloy Ganguly, Krishna P. Gummadi, Patrick Loiseau

Abstract: To help their users to discover important items at a particular time, major websites like Twitter, Yelp, TripAdvisor or NYTimes provide Top-K recommendations (e.g., 10 Trending Topics, Top 5 Hotels in Paris or 10 Most Viewed News Stories), which rely on crowdsourced popularity signals to select the items. However, different sections of a crowd may have different preferences, and there is a large silent majority who do not explicitly express their opinion. Also, the crowd often consists of actors like bots, spammers, or people running orchestrated campaigns. Recommendation algorithms today largely do not consider such nuances, hence are vulnerable to strategic manipulation by small but hyper-active user groups. To fairly aggregate the preferences of all users while recommending top-K items, we borrow ideas from prior research on social choice theory, and identify a voting mechanism called Single Transferable Vote (STV) as having many of the fairness properties we desire in top-K item (s)elections. We develop an innovative mechanism to attribute preferences of the silent majority which also makes STV completely operational. We show the generalizability of our approach by implementing it on two different real-world datasets. Through extensive experimentation and comparison with state-of-the-art techniques, we show that our proposed approach provides maximum user satisfaction, and cuts down drastically on items disliked by most but hyper-actively promoted by a few users.

Submitted 21 November, 2018; originally announced November 2018.

Comments: In the proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19). Please cite the conference version

arXiv:1809.03400 [pdf, other]

A Moral Framework for Understanding of Fair ML through Economic Models of Equality of Opportunity

Authors: Hoda Heidari, Michele Loi, Krishna P. Gummadi, Andreas Krause

Abstract: We map the recently proposed notions of algorithmic fairness to economic models of Equality of opportunity (EOP)---an extensively studied ideal of fairness in political philosophy. We formally show that through our conceptual mapping, many existing definitions of algorithmic fairness, such as predictive value parity and equality of odds, can be interpreted as special cases of EOP. In this respect, our work serves as a unifying moral framework for understanding existing notions of algorithmic fairness. Most importantly, this framework allows us to explicitly spell out the moral assumptions underlying each notion of fairness, and interpret recent fairness impossibility results in a new light. Last but not least and inspired by luck egalitarian models of EOP, we propose a new family of measures for algorithmic fairness. We illustrate our proposal empirically and show that employing a measure of algorithmic (un)fairness when its underlying moral assumptions are not satisfied, can have devastating consequences for the disadvantaged group's welfare.

Submitted 27 November, 2018; v1 submitted 10 September, 2018; originally announced September 2018.

arXiv:1808.09218 [pdf, other]

doi 10.1145/3287560.3287580

On Microtargeting Socially Divisive Ads: A Case Study of Russia-Linked Ad Campaigns on Facebook

Authors: Filipe N. Ribeiro, Koustuv Saha, Mahmoudreza Babaei, Lucas Henrique, Johnnatan Messias, Fabricio Benevenuto, Oana Goga, Krishna P. Gummadi, Elissa M. Redmiles

Abstract: Targeted advertising is meant to improve the efficiency of matching advertisers to their customers. However, targeted advertising can also be abused by malicious advertisers to efficiently reach people susceptible to false stories, stoke grievances, and incite social conflict. Since targeted ads are not seen by non-targeted and non-vulnerable people, malicious ads are likely to go unreported and their effects undetected. This work examines a specific case of malicious advertising, exploring the extent to which political ads from the Russian Intelligence Research Agency (IRA) run prior to 2016 U.S. elections exploited Facebook's targeted advertising infrastructure to efficiently target ads on divisive or polarizing topics (e.g., immigration, race-based policing) at vulnerable sub-populations. In particular, we do the following: (a) We conduct U.S. census-representative surveys to characterize how users with different political ideologies report, approve, and perceive truth in the content of the IRA ads. Our surveys show that many ads are "divisive": they elicit very different reactions from people belonging to different socially salient groups. (b) We characterize how these divisive ads are targeted to sub-populations that feel particularly aggrieved by the status quo. Our findings support existing calls for greater transparency of content and targeting of political ads. (c) We particularly focus on how the Facebook ad API facilitates such targeting.
We show how the enormous amount of personal data Facebook aggregates about users and makes available to advertisers enables such malicious targeting.

Submitted 21 November, 2018; v1 submitted 28 August, 2018; originally announced August 2018.

Comments: This is a preprint of a full paper accepted at ACM FAT*'19 (ACM Conference on Fairness, Accountability, and Transparency). Please cite that version instead

arXiv:1807.00787 [pdf, other]

doi 10.1145/3219819.3220046

A Unified Approach to Quantifying Algorithmic Unfairness: Measuring Individual & Group Unfairness via Inequality Indices

Authors: Till Speicher, Hoda Heidari, Nina Grgic-Hlaca, Krishna P. Gummadi, Adish Singla, Adrian Weller, Muhammad Bilal Zafar

Abstract: Discrimination via algorithmic decision making has received considerable attention. Prior work largely focuses on defining conditions for fairness, but does not define satisfactory measures of algorithmic unfairness. In this paper, we focus on the following question: Given two unfair algorithms, how should we determine which of the two is more unfair? Our core idea is to use existing inequality indices from economics to measure how unequally the outcomes of an algorithm benefit different individuals or groups in a population. Our work offers a justified and general framework to compare and contrast the (un)fairness of algorithmic predictors. This unifying approach enables us to quantify unfairness both at the individual and the group level. Further, our work reveals overlooked tradeoffs between different fairness notions: using our proposed measures, the overall individual-level unfairness of an algorithm can be decomposed into a between-group and a within-group component. Earlier methods are typically designed to tackle only between-group unfairness, which may be justified for legal or other reasons. However, we demonstrate that minimizing exclusively the between-group component may, in fact, increase the within-group, and hence the overall unfairness. We characterize and illustrate the tradeoffs between our measures of (un)fairness and the prediction accuracy.

Submitted 2 July, 2018; originally announced July 2018.

Comments: 12 pages, 7 figures. To be published in: KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Proceedings
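
The between-group and within-group decomposition mentioned in the abstract above can be illustrated with a standard inequality index from economics. The sketch below assumes the generalized entropy index with parameter alpha = 2; the abstract does not name a specific index, so this choice, the function names, and the toy "benefit" values are all assumptions made for illustration. The benefits stand in for hypothetical non-negative per-individual outcomes of a predictor.

```python
def generalized_entropy(benefits, alpha=2):
    """Generalized entropy index of a list of non-negative benefits (alpha not in {0, 1})."""
    n = len(benefits)
    mu = sum(benefits) / n
    return sum((b / mu) ** alpha - 1 for b in benefits) / (n * alpha * (alpha - 1))


def decompose(benefits, groups, alpha=2):
    """Split the overall index into (within-group, between-group) components."""
    n = len(benefits)
    mu = sum(benefits) / n
    by_group = {}
    for b, g in zip(benefits, groups):
        by_group.setdefault(g, []).append(b)

    within = 0.0
    between = 0.0
    for members in by_group.values():
        n_g = len(members)
        mu_g = sum(members) / n_g
        ratio = (mu_g / mu) ** alpha
        # Within: each group's own index, weighted by its size and relative mean.
        within += (n_g / n) * ratio * generalized_entropy(members, alpha)
        # Between: the index obtained if every member received the group mean.
        between += n_g * (ratio - 1) / (n * alpha * (alpha - 1))
    return within, between


# Hypothetical per-individual benefits and group memberships.
benefits = [1, 2, 1, 0, 2, 1, 1, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

if __name__ == "__main__":
    total = generalized_entropy(benefits)
    within, between = decompose(benefits, groups)
    print(total, within, between)
```

For this index the two components sum to the overall value, which is why reducing only the between-group term leaves room for the within-group term to grow.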

arXiv:1806.04959 [pdf, other]

Fairness Behind a Veil of Ignorance: A Welfare Analysis for Automated Decision Making

Authors: Hoda Heidari, Claudio Ferrari, Krishna P. Gummadi, Andreas Krause

Abstract: We draw attention to an important, yet largely overlooked aspect of evaluating fairness for automated decision making systems---namely risk and welfare considerations. Our proposed family of measures corresponds to the long-established formulations of cardinal social welfare in economics, and is justified by the Rawlsian conception of fairness behind a veil of ignorance. The convex formulation of our welfare-based measures of fairness allows us to integrate them as a constraint into any convex loss minimization pipeline. Our empirical analysis reveals interesting trade-offs between our proposal and (a) prediction accuracy, (b) group discrimination, and (c) Dwork et al.'s notion of individual fairness. Furthermore and perhaps most importantly, our work provides both heuristic justification and empirical evidence suggesting that a lower-bound on our measures often leads to bounded inequality in algorithmic outcomes; hence presenting the first computationally feasible mechanism for bounding individual-level inequality.

Submitted 11 January, 2019; v1 submitted 13 June, 2018; originally announced June 2018.

Comments: Conference: Thirty-second Conference on Neural Information Processing Systems (NIPS 2018)

arXiv:1806.03281 [pdf, other]

Blind Justice: Fairness with Encrypted Sensitive Attributes

Authors: Niki Kilbertus, Adrià Gascón, Matt J. Kusner, Michael Veale, Krishna P. Gummadi, Adrian Weller

Abstract: Recent work has explored how to train machine learning models which do not discriminate against any subgroup of the population as determined by sensitive attributes such as gender or race. To avoid disparate treatment, sensitive attributes should not be considered. On the other hand, in order to avoid disparate impact, sensitive attributes must be examined, e.g., in order to learn a fair model, or to check if a given model is fair. We introduce methods from secure multi-party computation which allow us to avoid both. By encrypting sensitive attributes, we show how an outcome-based fair model may be learned, checked, or have its outputs verified and held to account, without users revealing their sensitive attributes.

Submitted 8 June, 2018; originally announced June 2018.

Comments: published at ICML 2018

Journal ref: Proceedings of the 35th International Conference on Machine Learning, PMLR 80:2630-2639, 2018

arXiv:1806.01059 [pdf, other]

iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making

Authors: Preethi Lahoti, Krishna P. Gummadi, Gerhard Weikum

Abstract: People are rated and ranked, towards algorithmic decision making in an increasing number of applications, typically based on machine learning. Research on how to incorporate fairness into such tasks has prevalently pursued the paradigm of group fairness: giving adequate success rates to specifically protected groups. In contrast, the alternative paradigm of individual fairness has received relatively little attention, and this paper advances this less explored direction. The paper introduces a method for probabilistically mapping user records into a low-rank representation that reconciles individual fairness and the utility of classifiers and rankings in downstream applications. Our notion of individual fairness requires that users who are similar in all task-relevant attributes such as job qualification, and disregarding all potentially discriminating attributes such as gender, should have similar outcomes. We demonstrate the versatility of our method by applying it to classification and learning-to-rank tasks on a variety of real-world datasets. Our experiments show substantial improvements over the best prior work for this setting.

Submitted 6 February, 2019; v1 submitted 4 June, 2018; originally announced June 2018.

Comments: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings version

arXiv:1805.01788 [pdf, other]

doi 10.1145/3209978.3210063

Equity of Attention: Amortizing Individual Fairness in Rankings

Authors: Asia J. Biega, Krishna P. Gummadi, Gerhard Weikum

Abstract: Rankings of people and items are at the heart of selection-making, match-making, and recommender systems, ranging from employment sites to sharing economy platforms. As ranking positions influence the amount of attention the ranked subjects receive, biases in rankings can lead to unfair distribution of opportunities and resources, such as jobs or income. This paper proposes new measures and mechanisms to quantify and mitigate unfairness from a bias inherent to all rankings, namely, the position bias, which leads to disproportionately less attention being paid to low-ranked subjects. Our approach differs from recent fair ranking approaches in two important ways. First, existing works measure unfairness at the level of subject groups while our measures capture unfairness at the level of individual subjects, and as such subsume group unfairness. Second, as no single ranking can achieve individual attention fairness, we propose a novel mechanism that achieves amortized fairness, where attention accumulated across a series of rankings is proportional to accumulated relevance. We formulate the challenge of achieving amortized individual fairness subject to constraints on ranking quality as an online optimization problem and show that it can be solved as an integer linear program. Our experimental evaluation reveals that unfair attention distribution in rankings can be substantial, and demonstrates that our method can improve individual fairness while retaining high ranking quality.

Submitted 4 May, 2018; originally announced May 2018.

Comments: Accepted to SIGIR 2018

arXiv:1802.09548 [pdf, other]

Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction

Authors: Nina Grgić-Hlača, Elissa M. Redmiles, Krishna P. Gummadi, Adrian Weller

Abstract: As algorithms are increasingly used to make important decisions that affect human lives, ranging from social benefit assignment to predicting risk of criminal recidivism, concerns have been raised about the fairness of algorithmic decision making. Most prior works on algorithmic fairness normatively prescribe how fair decisions ought to be made. In contrast, here, we descriptively survey users for how they perceive and reason about fairness in algorithmic decision making. A key contribution of this work is the framework we propose to understand why people perceive certain features as fair or unfair to be used in algorithms. Our framework identifies eight properties of features, such as relevance, volitionality and reliability, as latent considerations that inform people's moral judgments about the fairness of feature use in decision-making algorithms. We validate our framework through a series of scenario-based surveys with 576 people. We find that, based on a person's assessment of the eight latent properties of a feature in our exemplar scenario, we can accurately (> 85%) predict if the person will judge the use of the feature as fair. Our findings have important implications. At a high-level, we show that people's unfairness concerns are multi-dimensional and argue that future studies need to address unfairness concerns beyond discrimination. At a low-level, we find considerable disagreements in people's fairness judgments. We identify root causes of the disagreements, and note possible pathways to resolve them.

Submitted 26 February, 2018; originally announced February 2018.

Comments: To appear in the Proceedings of the Web Conference (WWW 2018). Code available at https://fate-computing.mpi-sws.org/procedural_fairness/

arXiv:1708.00670 [pdf, other]

On Quantifying Knowledge Segregation in Society

Authors: Abhijnan Chakraborty, Muhammad Ali, Saptarshi Ghosh, Niloy Ganguly, Krishna P. Gummadi

Abstract: With rapid increase in online information consumption, especially via social media sites, there have been concerns on whether people are getting selective exposure to a biased subset of the information space, where a user is receiving more of what she already knows, and thereby potentially getting trapped in echo chambers or filter bubbles. Even though such concerns are being debated for some time, it is not clear how to quantify such echo chamber effect. In this position paper, we introduce Information Segregation (or Informational Segregation) measures, which follow the long lines of work on residential segregation. We believe that information segregation nicely captures the notion of exposure to different information by different population in a society, and would help in quantifying the extent of social media sites offering selective (or diverse) information to their users.

Submitted 2 August, 2017; originally announced August 2017.

Comments: Accepted for publication in the proceedings of FATREC Workshop on Responsible Recommendation at RecSys 2017
