-
On Efficient Computation of DiRe Committees
Authors:
Kunal Relia
Abstract:
Consider a committee election consisting of (i) a set of candidates who are divided into arbitrary groups each of size \emph{at most} two and a diversity constraint that stipulates the selection of \emph{at least} one candidate from each group and (ii) a set of voters who are divided into arbitrary populations each approving \emph{at most} two candidates and a representation constraint that stipulates the selection of \emph{at least} one candidate from each population who has a non-null set of approved candidates.
The DiRe (Diverse + Representative) committee feasibility problem (a.k.a. the minimum vertex cover problem on unweighted undirected graphs) concerns the determination of the smallest size committee that satisfies the given constraints. Here, for this problem, we discover an unconditional deterministic polynomial-time algorithm that is an amalgamation of maximum matching, breadth-first search, maximal matching, and local minimization.
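To make the vertex-cover reading of the feasibility problem concrete, here is a minimal sketch. It is not the algorithm announced in the abstract: it is the textbook maximal-matching heuristic followed by a simple local pruning pass, so it returns a feasible committee but does not guarantee a minimum one. The function names and the encoding are assumptions for illustration: each group or population of size two is an edge `(u, v)`, and one of size one can be written as `(u, u)`.

```python
def is_feasible(committee, edges):
    """True if every group/population (edge) has at least one member selected."""
    return all(u in committee or v in committee for u, v in edges)


def greedy_committee(edges):
    """Maximal-matching heuristic plus local pruning (hypothetical helper).

    Step 1 builds a cover from a maximal matching: whenever an edge is
    uncovered, both endpoints are added.
    Step 2 drops any candidate whose removal keeps the committee feasible.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))

    # Local minimization: try removing each selected candidate in turn.
    for w in sorted(cover):
        trial = cover - {w}
        if is_feasible(trial, edges):
            cover = trial
    return cover


# Four candidates a, b, c, d; three constraints of size two.
constraints = [("a", "b"), ("b", "c"), ("c", "d")]
committee = greedy_committee(constraints)
print(sorted(committee))
```

On this small instance the pruning pass happens to reach a committee of size two, which is optimal here; in general the result is only guaranteed to be a feasible committee with no redundant member.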
Submitted 29 February, 2024;
originally announced February 2024.
-
On the Complexity of Finding a Diverse and Representative Committee using a Monotone, Separable Positional Multiwinner Voting Rule
Authors:
Kunal Relia
Abstract:
Fairness in multiwinner elections, a growing line of research in computational social choice, primarily concerns the use of constraints to ensure fairness. Recent work proposed a model to find a diverse \emph{and} representative committee and studied the model's computational aspects. However, the work gave complexity results under major assumptions on how the candidates and the voters are grouped. Here, we close this gap and classify the complexity of finding a diverse and representative committee using a monotone, separable positional multiwinner voting rule, conditioned \emph{only} on the assumption that P $\neq$ NP.
Submitted 23 November, 2022;
originally announced November 2022.
-
Fairly Allocating Utility in Constrained Multiwinner Elections
Authors:
Kunal Relia
Abstract:
Fairness in multiwinner elections is studied in varying contexts. For instance, diversity of candidates and representation of voters are both separately termed as being fair. A common denominator to ensure fairness across all such contexts is the use of constraints. However, across these contexts, the candidates selected to satisfy the given constraints may systematically lead to unfair outcomes for historically disadvantaged voter populations, as the cost of fairness may be borne unequally. Hence, we develop a model to select candidates that satisfy the constraints fairly across voter populations. To do so, the model maps the constrained multiwinner election problem to a problem of fairly allocating indivisible goods. We propose three variants of the model, namely, global, localized, and intersectional. Next, we analyze the model's computational complexity. We then present an empirical analysis of the utility traded off across various settings of the three variants, and discuss the impact of Simpson's paradox using synthetic datasets and a dataset of voting at the United Nations. Finally, we discuss the implications of our work for AI and machine learning, especially for studies that use constraints to guarantee fairness.
Submitted 23 November, 2022;
originally announced November 2022.
-
DiRe Committee : Diversity and Representation Constraints in Multiwinner Elections
Authors:
Kunal Relia
Abstract:
The study of fairness in multiwinner elections focuses on settings where candidates have attributes. However, voters may also be divided into predefined populations under one or more attributes (e.g., "California" and "Illinois" populations under the "state" attribute), which may be the same as or different from candidate attributes. Models that focus on candidate attributes alone may systematically under-represent smaller voter populations. Hence, we develop a model, DiRe Committee Winner Determination (DRCWD), which delineates candidate and voter attributes to select a committee by specifying diversity and representation constraints and a voting rule. We analyze its computational complexity, inapproximability, and parameterized complexity. We develop a heuristic-based algorithm, which finds the winning DiRe committee in under two minutes on 63% of the instances of synthetic datasets and on 100% of the instances of real-world datasets. We present an empirical analysis of the running time, feasibility, and utility traded off.
Overall, DRCWD motivates that a study of multiwinner elections should consider both its actors, namely candidates and voters, as candidate-specific models can unknowingly harm voter populations, and vice versa. Additionally, even when the attributes of candidates and voters coincide, it is important to treat them separately as diversity does not imply representation and vice versa. This is to say that having a female candidate on the committee, for example, is different from having a candidate on the committee who is preferred by the female voters, and who themselves may or may not be female.
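The distinction between diversity and representation can be illustrated with a small check. This is a hedged sketch under simplifying assumptions, not the DRCWD model itself: each candidate group and each voter population is given as a set of candidates plus a lower bound, where a population's set stands for the candidates its voters prefer. All names and the data are hypothetical.

```python
def check_constraints(committee, groups, populations):
    """Return (diverse, representative) for a committee.

    groups:      {name: (candidates in the group, minimum to select)}
    populations: {name: (candidates preferred by that population, minimum to select)}
    """
    committee = set(committee)
    diverse = all(
        len(committee & members) >= lower for members, lower in groups.values()
    )
    representative = all(
        len(committee & preferred) >= lower
        for preferred, lower in populations.values()
    )
    return diverse, representative


# c1 is female; c2 and c3 are male. Female voters prefer c3, male voters prefer c2.
groups = {"female": ({"c1"}, 1), "male": ({"c2", "c3"}, 1)}
populations = {"female_voters": ({"c3"}, 1), "male_voters": ({"c2"}, 1)}

print(check_constraints({"c1", "c2"}, groups, populations))        # diverse only
print(check_constraints({"c2", "c3"}, groups, populations))        # representative only
print(check_constraints({"c1", "c2", "c3"}, groups, populations))  # both
```

The first committee includes a female candidate yet leaves female voters without a preferred candidate, and the second does the reverse, which is the sense in which neither property implies the other.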
Submitted 23 September, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Algorithmic Techniques for Necessary and Possible Winners
Authors:
Vishal Chakraborty,
Theo Delemazure,
Benny Kimelfeld,
Phokion G. Kolaitis,
Kunal Relia,
Julia Stoyanovich
Abstract:
We investigate the practical aspects of computing the necessary and possible winners in elections over incomplete voter preferences. In the case of the necessary winners, we show how to implement and accelerate the polynomial-time algorithm of Xia and Conitzer. In the case of the possible winners, where the problem is NP-hard, we give a natural reduction to Integer Linear Programming (ILP) for all positional scoring rules and implement it in a leading commercial optimization solver. Further, we devise optimization techniques to minimize the number of ILP executions and, oftentimes, avoid them altogether. We conduct a thorough experimental study that includes the construction of a rich benchmark of election data based on real and synthetic data. Our findings suggest that, the worst-case intractability of the possible winners notwithstanding, the algorithmic techniques presented here scale well and can be used to compute the possible winners in realistic scenarios.
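For intuition about the two notions, the sketch below computes necessary and possible co-winners by brute force over all completions of partial votes. It is neither the Xia and Conitzer algorithm nor the ILP reduction described in the abstract; it is exponential and only suitable for toy instances. The encoding is an assumption: each partial vote is a list of pairs `(a, b)` meaning `a` is preferred to `b`, and `scores` is a positional scoring vector.

```python
from itertools import permutations, product


def completions(candidates, partial):
    """All full rankings consistent with the pairwise preferences in `partial`."""
    result = []
    for ranking in permutations(candidates):
        pos = {c: i for i, c in enumerate(ranking)}
        if all(pos[a] < pos[b] for a, b in partial):
            result.append(ranking)
    return result


def winners(profile, scores):
    """Co-winners of a complete profile under a positional scoring rule."""
    totals = {}
    for ranking in profile:
        for i, c in enumerate(ranking):
            totals[c] = totals.get(c, 0) + scores[i]
    best = max(totals.values())
    return {c for c, t in totals.items() if t == best}


def necessary_and_possible(candidates, partial_profile, scores):
    """Necessary winners win in every completion; possible winners in at least one."""
    per_voter = [completions(candidates, p) for p in partial_profile]
    necessary, possible = set(candidates), set()
    for profile in product(*per_voter):
        w = winners(profile, scores)
        necessary &= w
        possible |= w
    return necessary, possible


cands = ["a", "b", "c"]
borda = [2, 1, 0]
# Voter 1 ranks c > b > a completely; voter 2 only states a > b.
votes = [[("c", "b"), ("b", "a")], [("a", "b")]]
necessary, possible = necessary_and_possible(cands, votes, borda)
print(sorted(necessary), sorted(possible))
```

Here `c` wins or ties in every completion, while `a` and `b` tie for the win only when voter 2's ranking is completed as a > b > c.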
Submitted 14 May, 2020;
originally announced May 2020.
-
Race, Ethnicity and National Origin-based Discrimination in Social Media and Hate Crimes Across 100 U.S. Cities
Authors:
Kunal Relia,
Zhengyi Li,
Stephanie H. Cook,
Rumi Chunara
Abstract:
We study malicious online content via a specific type of hate speech: race, ethnicity and national-origin based discrimination in social media, alongside hate crimes motivated by those characteristics, in 100 cities across the United States. We develop a spatially-diverse training dataset and classification pipeline to delineate targeted and self-narration of discrimination on social media, accounting for language across geographies. Controlling for census parameters, we find that the proportion of discrimination that is targeted is associated with the number of hate crimes. Finally, we explore the linguistic features of discrimination Tweets in relation to hate crimes by city, features used by users who Tweet different amounts of discrimination, and features of discrimination compared to non-discrimination Tweets. Findings from this spatial study can inform future studies of how discrimination in physical and virtual worlds varies by place, or how physical and virtual world discrimination may synergize.
Submitted 31 January, 2019;
originally announced February 2019.
-
From the User to the Medium: Neural Profiling Across Web Communities
Authors:
Mohammad Akbari,
Kunal Relia,
Anas Elghafari,
Rumi Chunara
Abstract:
Online communities provide a unique way for individuals to access information from those in similar circumstances, which can be critical for health conditions that require daily and personalized management. As these groups and topics often arise organically, identifying the types of topics discussed is necessary to understand their needs. Moreover, these communities and the people in them can be quite diverse, and existing community detection methods have not been extended towards evaluating these heterogeneities. Progress has been limited because community detection methodologies have not focused on semantic relations between textual features of user-generated content. Thus, here we develop an approach, NeuroCom, that optimally finds dense groups of users as communities in a latent space inferred by a neural representation of the users' published content. By embedding words and messages, we show that NeuroCom demonstrates improved clustering and identifies more nuanced discussion topics in contrast to other common unsupervised learning approaches.
Submitted 3 December, 2018;
originally announced December 2018.
-
Socio-spatial Self-organizing Maps: Using Social Media to Assess Relevant Geographies for Exposure to Social Processes
Authors:
Kunal Relia,
Mohammad Akbari,
Dustin Duncan,
Rumi Chunara
Abstract:
Social media offers a unique window into attitudes like racism and homophobia, exposure to which is an important, hard-to-measure, and understudied social determinant of health. However, individual geo-located observations from social media are noisy and geographically inconsistent. Existing areas by which exposures are measured, like Zip codes, average over irrelevant administratively-defined boundaries. Hence, in order to enable studies of online social environmental measures like attitudes on social media and their possible relationship to health outcomes, first there is a need for a method to define the collective, underlying degree of social media attitudes by region. To address this, we create the Socio-spatial Self-organizing Map ("SS-SOM") pipeline to best identify regions by their latent social attitude from Twitter posts. SS-SOMs use neural embedding for text classification, and augment traditional SOMs to generate a controlled number of non-overlapping, topologically-constrained and topically-similar clusters. We find that SS-SOMs are robust to missing data, and that the exposure of a cohort of men who are susceptible to multiple racism- and homophobia-linked health outcomes changes by up to 42% when using SS-SOM measures as compared to Zip code-based measures.
Submitted 4 September, 2018; v1 submitted 23 March, 2018;
originally announced March 2018.
-
Creating Full Individual-level Location Timelines from Sparse Social Media Data
Authors:
Nabeel Abdur Rehman,
Kunal Relia,
Rumi Chunara
Abstract:
In many domain applications, a continuous timeline of human locations is critical, for example for understanding possible locations where a disease may spread, or the flow of traffic. While data sources such as GPS trackers or Call Data Records are temporally rich, they are expensive, often not publicly available, or garnered only in select locations, restricting their wide use. Conversely, geo-located social media data are publicly and freely available, but present challenges, especially for full timeline inference, due to their sparse nature. We propose a stochastic framework, Intermediate Location Computing (ILC), which uses prior knowledge about human mobility patterns to predict every missing location from an individual's social media timeline. We compare ILC with a state-of-the-art RNN baseline as well as methods that are optimized for next-location prediction only. For three major cities, ILC predicts the top 1 location for all missing locations in a timeline, at 1 and 2-hour resolution, with up to 77.2% accuracy (up to 6% better accuracy than all compared methods). Specifically, ILC also outperforms the RNN in low-data settings: both with a very small number of users (under 50) and with more users but sparser timelines. In general, the RNN model needs a higher number of users to achieve the same performance as ILC. Overall, this work illustrates the tradeoff between prior knowledge of heuristics and more data, for an important societal problem of filling in entire timelines using freely available, but sparse social media data.
Submitted 22 November, 2019; v1 submitted 6 October, 2017;
originally announced October 2017.