-
Allocation Requires Prediction Only if Inequality Is Low
Authors:
Ali Shirali,
Rediet Abebe,
Moritz Hardt
Abstract:
Algorithmic predictions are emerging as a promising solution concept for efficiently allocating societal resources. Fueling their use is an underlying assumption that such systems are necessary to identify individuals for interventions. We propose a principled framework for assessing this assumption: Using a simple mathematical model, we evaluate the efficacy of prediction-based allocations in settings where individuals belong to larger units such as hospitals, neighborhoods, or schools. We find that prediction-based allocations outperform baseline methods using aggregate unit-level statistics only when between-unit inequality is low and the intervention budget is high. Our results hold for a wide range of settings for the price of prediction, treatment effect heterogeneity, and unit-level statistics' learnability. Combined, we highlight the potential limits to improving the efficacy of interventions through prediction.
Submitted 19 June, 2024;
originally announced June 2024.
-
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks
Authors:
Eve Fleisig,
Rediet Abebe,
Dan Klein
Abstract:
Though majority vote among annotators is typically used for ground truth labels in natural language processing, annotator disagreement in tasks such as hate speech detection may reflect differences in opinion across groups, not noise. Thus, a crucial problem in hate speech detection is determining whether a statement is offensive to the demographic group that it targets, when that group may constitute a small fraction of the annotator pool. We construct a model that predicts individual annotator ratings on potentially offensive text and combines this information with the predicted target group of the text to model the opinions of target group members. We show gains across a range of metrics, including raising performance over the baseline by 22% at predicting individual annotators' ratings and by 33% at predicting variance among annotators, which provides a metric for model uncertainty downstream. We find that annotator ratings can be predicted using their demographic information and opinions on online content, without the need to track identifying annotator IDs that link each annotator to their ratings. We also find that use of non-invasive survey questions on annotators' online experiences helps to maximize privacy and minimize unnecessary collection of demographic information when predicting annotators' opinions.
Submitted 17 March, 2024; v1 submitted 11 May, 2023;
originally announced May 2023.
-
Difficult Lessons on Social Prediction from Wisconsin Public Schools
Authors:
Juan C. Perdomo,
Tolani Britton,
Moritz Hardt,
Rediet Abebe
Abstract:
Early warning systems (EWS) are predictive tools at the center of recent efforts to improve graduation rates in public schools across the United States. These systems assist in targeting interventions to individual students by predicting which students are at risk of dropping out. Despite significant investments in their widespread adoption, there remain large gaps in our understanding of the efficacy of EWS, and the role of statistical risk scores in education.
In this work, we draw on nearly a decade's worth of data from a system used throughout Wisconsin to provide the first large-scale evaluation of the long-term impact of EWS on graduation outcomes. We present empirical evidence that the prediction system accurately sorts students by their dropout risk. We also find that it may have caused a single-digit percentage increase in graduation rates, though our empirical analyses cannot reliably rule out that there has been no positive treatment effect.
Going beyond a retrospective evaluation of DEWS, we draw attention to a central question at the heart of the use of EWS: Are individual risk scores necessary for effectively targeting interventions? We propose a simple mechanism that only uses information about students' environments -- such as their schools, and districts -- and argue that this mechanism can target interventions just as efficiently as the individual risk score-based mechanism. Our argument holds even if individual predictions are highly accurate and effective interventions exist. In addition to motivating this simple targeting mechanism, our work provides a novel empirical backbone for the robust qualitative understanding among education researchers that dropout is structurally determined. Combined, our insights call into question the marginal value of individual predictions in settings where outcomes are driven by high levels of inequality.
Submitted 18 September, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
A Theory of Dynamic Benchmarks
Authors:
Ali Shirali,
Rediet Abebe,
Moritz Hardt
Abstract:
Dynamic benchmarks interweave model fitting and data collection in an attempt to mitigate the limitations of static benchmarks. In contrast to an extensive theoretical and empirical study of the static setting, the dynamic counterpart lags behind due to limited empirical studies and no apparent theoretical foundation to date. Responding to this deficit, we initiate a theoretical study of dynamic benchmarking. We examine two realizations, one capturing current practice and the other modeling more complex settings. In the first model, where data collection and model fitting alternate sequentially, we prove that model performance improves initially but can stall after only three rounds. Label noise arising from, for instance, annotator disagreement leads to even stronger negative results. Our second model generalizes the first to the case where data collection and model fitting have a hierarchical dependency structure. We show that this design guarantees strictly more progress than the first, albeit at a significant increase in complexity. We support our theoretical analysis by simulating dynamic benchmarks on two popular datasets. These results illuminate the benefits and practical limitations of dynamic benchmarking, providing both a theoretical foundation and a causal explanation for observed bottlenecks in empirical work.
Submitted 1 March, 2023; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Lost in Translation: Reimagining the Machine Learning Life Cycle in Education
Authors:
Lydia T. Liu,
Serena Wang,
Tolani Britton,
Rediet Abebe
Abstract:
Machine learning (ML) techniques are increasingly prevalent in education, from their use in predicting student dropout, to assisting in university admissions, and facilitating the rise of MOOCs. Given the rapid growth of these novel uses, there is a pressing need to investigate how ML techniques support long-standing education principles and goals. In this work, we shed light on this complex landscape drawing on qualitative insights from interviews with education experts. These interviews comprise in-depth evaluations of ML for education (ML4Ed) papers published in preeminent applied ML conferences over the past decade. Our central research goal is to critically examine how the stated or implied education and societal objectives of these papers are aligned with the ML problems they tackle. That is, to what extent do the technical problem formulation, objectives, approach, and interpretation of results align with the education problem at hand? We find that a cross-disciplinary gap exists and is particularly salient in two parts of the ML life cycle: the formulation of an ML problem from education goals and the translation of predictions to interventions. We use these insights to propose an extended ML life cycle, which may also apply to the use of ML in other domains. Our work joins a growing number of meta-analytical studies across education and ML research, as well as critical analyses of the societal impact of ML. Specifically, it fills a gap between the prevailing technical understanding of machine learning and the perspective of education researchers working with students and in policy.
Submitted 8 September, 2022;
originally announced September 2022.
-
Adversarial Scrutiny of Evidentiary Statistical Software
Authors:
Rediet Abebe,
Moritz Hardt,
Angela Jin,
John Miller,
Ludwig Schmidt,
Rebecca Wexler
Abstract:
The U.S. criminal legal system increasingly relies on software output to convict and incarcerate people. In a large number of cases each year, the government makes these consequential decisions based on evidence from statistical software -- such as probabilistic genotyping, environmental audio detection, and toolmark analysis tools -- that defense counsel cannot fully cross-examine or scrutinize. This undermines the commitments of the adversarial criminal legal system, which relies on the defense's ability to probe and test the prosecution's case to safeguard individual rights.
Responding to this need to adversarially scrutinize output from such software, we propose robust adversarial testing as an audit framework to examine the validity of evidentiary statistical software. We define and operationalize this notion of robust adversarial testing for defense use by drawing on a large body of recent work in robust machine learning and algorithmic fairness. We demonstrate how this framework both standardizes the process for scrutinizing such tools and empowers defense lawyers to examine their validity for instances most relevant to the case at hand. We further discuss existing structural and institutional challenges within the U.S. criminal legal system that may create barriers for implementing this and other such audit frameworks and close with a discussion on policy changes that could help address these concerns.
Submitted 30 September, 2022; v1 submitted 18 June, 2022;
originally announced June 2022.
-
On the Effect of Triadic Closure on Network Segregation
Authors:
Rediet Abebe,
Nicole Immorlica,
Jon Kleinberg,
Brendan Lucier,
Ali Shirali
Abstract:
The tendency for individuals to form social ties with others who are similar to themselves, known as homophily, is one of the most robust sociological principles. Since this phenomenon can lead to patterns of interactions that segregate people along different demographic dimensions, it can also lead to inequalities in access to information, resources, and opportunities. As we consider potential interventions that might alleviate the effects of segregation, we face the challenge that homophily constitutes a pervasive and organic force that is difficult to push back against. Designing effective interventions can therefore benefit from identifying counterbalancing social processes that might be harnessed to work in opposition to segregation.
In this work, we show that triadic closure -- another common phenomenon that posits that individuals with a mutual connection are more likely to be connected to one another -- can be one such process. In doing so, we challenge a long-held belief that triadic closure and homophily work in tandem. By analyzing several fundamental network models using popular integration measures, we demonstrate the desegregating potential of triadic closure. We further empirically investigate this effect on real-world dynamic networks, surfacing observations that mirror our theoretical findings. We leverage these insights to discuss simple interventions that can help reduce segregation in settings that exhibit an interplay between triadic closure and homophily. We conclude with a discussion on qualitative implications for the design of interventions in settings where individuals arrive in an online fashion, and the designer can influence the initial set of connections.
Submitted 26 May, 2022;
originally announced May 2022.
-
An Algorithmic Introduction to Savings Circles
Authors:
Rediet Abebe,
Adam Eck,
Christian Ikeokwu,
Samuel Taggart
Abstract:
Rotating savings and credit associations (roscas) are informal financial organizations common in settings where communities have reduced access to formal financial institutions. In a rosca, a fixed group of participants regularly contribute sums of money to a pot. This pot is then allocated periodically using lottery, aftermarket, or auction mechanisms. Roscas are empirically well-studied in economics. They are, however, challenging to study theoretically due to their dynamic nature. Typical economic analyses of roscas stop at coarse ordinal welfare comparisons to other credit allocation mechanisms, leaving much of roscas' ubiquity unexplained. In this work, we take an algorithmic perspective on the study of roscas. Building on techniques from the price of anarchy literature, we present worst-case welfare approximation guarantees. We further experimentally compare the welfare of outcomes as key features of the environment vary. These cardinal welfare analyses further rationalize the prevalence of roscas. We conclude by discussing several other promising avenues.
Submitted 23 March, 2022;
originally announced March 2022.
-
Narratives and Counternarratives on Data Sharing in Africa
Authors:
Rediet Abebe,
Kehinde Aruleba,
Abeba Birhane,
Sara Kingsley,
George Obaido,
Sekou L. Remy,
Swathi Sadagopan
Abstract:
As machine learning and data science applications grow ever more prevalent, there is an increased focus on data sharing and open data initiatives, particularly in the context of the African continent. Many argue that data sharing can support research and policy design to alleviate poverty, inequality, and derivative effects in Africa. Despite the fact that the datasets in question are often extracted from African communities, conversations around the challenges of accessing and sharing African data are too often driven by non-African stakeholders. These perspectives frequently employ deficit narratives, often focusing on lack of education, training, and technological resources in the continent as the leading causes of friction in the data ecosystem. We argue that these narratives obfuscate and distort the full complexity of the African data sharing landscape. In particular, we use storytelling via fictional personas built from a series of interviews with African data experts to complicate dominant narratives and to provide counternarratives. Coupling these personas with research on data practices within the continent, we identify recurring barriers to data sharing as well as inequities in the distribution of data sharing benefits. In particular, we discuss issues arising from power imbalances resulting from the legacies of colonialism, ethno-centrism, and slavery, disinvestment in building trust, lack of acknowledgement of historical and present-day extractive practices, and Western-centric policies that are ill-suited to the African context. After outlining these problems, we discuss avenues for addressing them when sharing data generated in the continent.
Submitted 1 March, 2021;
originally announced March 2021.
-
Opinion Dynamics with Varying Susceptibility to Persuasion via Non-Convex Local Search
Authors:
Rediet Abebe,
T-H. Hubert Chan,
Jon Kleinberg,
Zhibin Liang,
David Parkes,
Mauro Sozio,
Charalampos E. Tsourakakis
Abstract:
A long line of work in social psychology has studied variations in people's susceptibility to persuasion -- the extent to which they are willing to modify their opinions on a topic. This body of literature suggests an interesting perspective on theoretical models of opinion formation by interacting parties in a network: in addition to considering interventions that directly modify people's intrinsic opinions, it is also natural to consider interventions that modify people's susceptibility to persuasion. In this work, motivated by this fact, we propose a new framework for social influence. Specifically, we adopt a popular model for social opinion dynamics, where each agent has some fixed innate opinion, and a resistance that measures the importance it places on its innate opinion; agents influence one another's opinions through an iterative process. Under non-trivial conditions, this iterative process converges to some equilibrium opinion vector. For the unbudgeted variant of the problem, the goal is to select the resistance of each agent (from some given range) such that the sum of the equilibrium opinions is minimized. We prove that the objective function is in general non-convex. Hence, formulating the problem as a convex program as in an early version of this work (Abebe et al., KDD'18) might have potential correctness issues. We instead analyze the structure of the objective function, and show that any local optimum is also a global optimum, which is somewhat surprising as the objective function might not be convex. Furthermore, we combine the iterative process and the local search paradigm to design very efficient algorithms that can solve the unbudgeted variant of the problem optimally on large-scale graphs containing millions of nodes. Finally, we propose and evaluate experimentally a family of heuristics for the budgeted variation of the problem.
Submitted 4 November, 2020;
originally announced November 2020.
-
Quantifying Community Characteristics of Maternal Mortality Using Social Media
Authors:
Rediet Abebe,
Salvatore Giorgi,
Anna Tedijanto,
Anneke Buffone,
H. Andrew Schwartz
Abstract:
While most mortality rates have decreased in the US, maternal mortality has increased and is among the highest of any OECD nation. Extensive public health research is ongoing to better understand the characteristics of communities with relatively high or low rates. In this work, we explore the role that social media language can play in providing insights into such community characteristics. Analyzing pregnancy-related tweets generated in US counties, we reveal a diverse set of latent topics including Morning Sickness, Celebrity Pregnancies, and Abortion Rights. We find that rates of mentioning these topics on Twitter predict maternal mortality rates with higher accuracy than standard socioeconomic and risk variables such as income, race, and access to health-care, holding even after reducing the analysis to six topics chosen for their interpretability and connections to known risk factors. We then investigate psychological dimensions of community language, finding the use of less trustful, more stressed, and more negative affective language is significantly associated with higher mortality rates, while trust and negative affect also explain a significant portion of racial disparities in maternal mortality. We discuss the potential for these insights to inform actionable health interventions at the community-level.
Submitted 14 April, 2020;
originally announced April 2020.
-
Roles for Computing in Social Change
Authors:
Rediet Abebe,
Solon Barocas,
Jon Kleinberg,
Karen Levy,
Manish Raghavan,
David G. Robinson
Abstract:
A recent normative turn in computer science has brought concerns about fairness, bias, and accountability to the core of the field. Yet recent scholarship has warned that much of this technical work treats problematic features of the status quo as fixed, and fails to address deeper patterns of injustice and inequality. While acknowledging these critiques, we posit that computational research has valuable roles to play in addressing social problems -- roles whose value can be recognized even from a perspective that aspires toward fundamental social change. In this paper, we articulate four such roles, through an analysis that considers the opportunities as well as the significant risks inherent in such work. Computing research can serve as a diagnostic, helping us to understand and measure social problems with precision and clarity. As a formalizer, computing shapes how social problems are explicitly defined -- changing how those problems, and possible responses to them, are understood. Computing serves as rebuttal when it illuminates the boundaries of what is possible through technical means. And computing acts as synecdoche when it makes long-standing social problems newly salient in the public eye. We offer these paths forward as modalities that leverage the particular strengths of computational work in the service of social change, without overclaiming computing's capacity to solve social problems on its own.
Submitted 9 July, 2020; v1 submitted 10 December, 2019;
originally announced December 2019.
-
A Conjectural Brouwer Inequality for Higher-Dimensional Laplacian Spectra
Authors:
Rediet Abebe
Abstract:
We present a generalization of Brouwer's conjectural family of inequalities -- a popular family of inequalities in spectral graph theory bounding the partial sum of the Laplacian eigenvalues of graphs -- for the case of abstract simplicial complexes of any dimension. We prove that this family of inequalities holds for shifted simplicial complexes, which generalize threshold graphs, and give tighter bounds (linear in the dimension of the complexes) for simplicial trees. We prove that the conjecture holds for the first, second, and last partial sums for all simplicial complexes, generalizing many known proofs for graphs to the case of simplicial complexes. We also show that the conjecture holds for the $t$th partial sum for all simplicial complexes with dimension at least $t$ and matching number greater than $t$. Returning to the special case of graphs, we expand on a known proof to show that Brouwer's conjecture holds with equality for the $t$th partial sum where $t$ is the maximum clique size of the graph minus one (or, equivalently, the number of cone vertices). Along the way, we develop machinery that may give further insights into related long-standing conjectures.
Submitted 17 July, 2019;
originally announced July 2019.
-
A Truthful Cardinal Mechanism for One-Sided Matching
Authors:
Rediet Abebe,
Richard Cole,
Vasilis Gkatzelis,
Jason D. Hartline
Abstract:
We revisit the well-studied problem of designing mechanisms for one-sided matching markets, where a set of $n$ agents needs to be matched to a set of $n$ heterogeneous items. Each agent $i$ has a value $v_{i,j}$ for each item $j$, and these values are private information that the agents may misreport if doing so leads to a preferred outcome. Ensuring that the agents have no incentive to misreport requires a careful design of the matching mechanism, and mechanisms proposed in the literature mitigate this issue by eliciting only the \emph{ordinal} preferences of the agents, i.e., their ranking of the items from most to least preferred. However, the efficiency guarantees of these mechanisms are based only on weak measures that are oblivious to the underlying values. In this paper we achieve stronger performance guarantees by introducing a mechanism that truthfully elicits the full \emph{cardinal} preferences of the agents, i.e., all of the $v_{i,j}$ values. We evaluate the performance of this mechanism using the much more demanding Nash bargaining solution as a benchmark, and we prove that our mechanism significantly outperforms all ordinal mechanisms (even non-truthful ones). To prove our approximation bounds, we also study the population monotonicity of the Nash bargaining solution in the context of matching markets, providing both upper and lower bounds which are of independent interest.
Submitted 20 January, 2020; v1 submitted 18 March, 2019;
originally announced March 2019.
-
Mechanism Design for Social Good
Authors:
Rediet Abebe,
Kira Goldner
Abstract:
Across various domains--such as health, education, and housing--improving societal welfare involves allocating resources, setting policies, targeting interventions, and regulating activities. These solutions have an immense impact on the day-to-day lives of individuals, whether in the form of access to quality healthcare, labor market outcomes, or how votes are accounted for in a democratic society. Problems that can have an out-sized impact on individuals whose opportunities have historically been limited often pose conceptual and technical challenges, requiring insights from many disciplines. Conversely, the lack of an interdisciplinary approach can leave these urgent needs unaddressed and can even exacerbate underlying socioeconomic inequalities. To realize the opportunities in these domains, we need to correctly set objectives and reason about human behavior and actions. Doing so requires a deep grounding in the field of interest and collaboration with domain experts who understand the societal implications and feasibility of proposed solutions. These insights can play an instrumental role in proposing algorithmically-informed policies.
In this article, we describe the Mechanism Design for Social Good (MD4SG) research agenda, which involves using insights from algorithms, optimization, and mechanism design to improve access to opportunity. The MD4SG research community takes an interdisciplinary, multi-stakeholder approach to improve societal welfare. We discuss three exciting research avenues within MD4SG related to improving access to opportunity in the developing world, labor markets and discrimination, and housing. For each of these, we showcase ongoing work, underline new directions, and discuss potential for implementing existing work in practice.
Submitted 21 October, 2018;
originally announced October 2018.
-
Using Search Queries to Understand Health Information Needs in Africa
Authors:
Rediet Abebe,
Shawndra Hill,
Jennifer Wortman Vaughan,
Peter M. Small,
H. Andrew Schwartz
Abstract:
The lack of comprehensive, high-quality health data in developing nations creates a roadblock for combating the impacts of disease. One key challenge is understanding the health information needs of people in these nations. Without understanding people's everyday needs, concerns, and misconceptions, health organizations and policymakers lack the ability to effectively target education and programming efforts. In this paper, we propose a bottom-up approach that uses search data from individuals to uncover and gain insight into health information needs in Africa. We analyze Bing searches related to HIV/AIDS, malaria, and tuberculosis from all 54 African nations. For each disease, we automatically derive a set of common search themes or topics, revealing a widespread interest in various types of information, including disease symptoms, drugs, concerns about breastfeeding, as well as stigma, beliefs in natural cures, and other topics that may be hard to uncover through traditional surveys. We expose the different patterns that emerge in health information needs by demographic groups (age and sex) and country. We also uncover discrepancies in the quality of content returned by search engines to users by topic. Combined, our results suggest that search data can help illuminate health information needs in Africa and inform discussions on health policy and targeted education efforts both on- and offline.
Submitted 17 April, 2019; v1 submitted 14 June, 2018;
originally announced June 2018.
-
Simplicial Closure and higher-order link prediction
Authors:
Austin R. Benson,
Rediet Abebe,
Michael T. Schaub,
Ali Jadbabaie,
Jon Kleinberg
Abstract:
Networks provide a powerful formalism for modeling complex systems by using a model of pairwise interactions. But much of the structure within these systems involves interactions that take place among more than two nodes at once; for example, communication within a group rather than person-to-person, collaboration among a team rather than a pair of coauthors, or biological interaction between a set of molecules rather than just two. Such higher-order interactions are ubiquitous, but their empirical study has received limited attention, and little is known about possible organizational principles of such structures. Here we study the temporal evolution of 19 datasets with explicit accounting for higher-order interactions. We show that there is a rich variety of structure in our datasets but datasets from the same system types have consistent patterns of higher-order structure. Furthermore, we find that tie strength and edge density are competing positive indicators of higher-order organization, and these trends are consistent across interactions involving differing numbers of nodes. To systematically further the study of theories for such higher-order structures, we propose higher-order link prediction as a benchmark problem to assess models and algorithms that predict higher-order structure. We find fundamental differences from traditional pairwise link prediction, with a greater role for local rather than long-range information in predicting the appearance of new interactions.
Submitted 11 December, 2018; v1 submitted 19 February, 2018;
originally announced February 2018.
-
Opinion Dynamics with Varying Susceptibility to Persuasion
Authors:
Rediet Abebe,
Jon Kleinberg,
David Parkes,
Charalampos E. Tsourakakis
Abstract:
A long line of work in social psychology has studied variations in people's susceptibility to persuasion -- the extent to which they are willing to modify their opinions on a topic. This body of literature suggests an interesting perspective on theoretical models of opinion formation by interacting parties in a network: in addition to considering interventions that directly modify people's intrinsic opinions, it is also natural to consider interventions that modify people's susceptibility to persuasion. In this work, we adopt a popular model for social opinion dynamics, and we formalize the opinion maximization and minimization problems where interventions happen at the level of susceptibility.
We show that modeling interventions at the level of susceptibility leads to an interesting family of new questions in network opinion dynamics. We find that the questions are quite different depending on whether there is an overall budget constraining the number of agents we can target or not. We give a polynomial-time algorithm for finding the optimal target-set to optimize the sum of opinions when there are no budget constraints on the size of the target-set. We show that this problem is NP-hard when there is a budget, and that the objective function is neither submodular nor supermodular. Finally, we propose a heuristic for the budgeted opinion optimization and show its efficacy at finding target-sets that optimize the sum of opinions on real-world networks, including a Twitter network with real opinion estimates.
Submitted 24 January, 2018;
originally announced January 2018.
-
Mitigating Overexposure in Viral Marketing
Authors:
Rediet Abebe,
Lada Adamic,
Jon Kleinberg
Abstract:
In traditional models for word-of-mouth recommendations and viral marketing, the objective function has generally been based on reaching as many people as possible. However, a number of studies have shown that the indiscriminate spread of a product by word-of-mouth can result in overexposure, reaching people who evaluate it negatively. This can lead to an effect in which the over-promotion of a product can produce negative reputational effects, by reaching a part of the audience that is not receptive to it.
How should one make use of social influence when there is a risk of overexposure? In this paper, we develop and analyze a theoretical model for this process; we show how it captures a number of the qualitative phenomena associated with overexposure, and for the main formulation of our model, we provide a polynomial-time algorithm to find the optimal marketing strategy. We also present simulations of the model on real network topologies, quantifying the extent to which our optimal strategies outperform natural baselines.
Submitted 8 November, 2017; v1 submitted 12 September, 2017;
originally announced September 2017.
-
Fair Division via Social Comparison
Authors:
Rediet Abebe,
Jon Kleinberg,
David Parkes
Abstract:
In the classical cake cutting problem, a resource must be divided among agents with different utilities so that each agent believes they have received a fair share of the resource relative to the other agents. We introduce a variant of the problem in which we model an underlying social network on the agents with a graph, and agents only evaluate their shares relative to their neighbors' in the network. This formulation captures many situations in which it is unrealistic to assume a global view, and also exposes interesting phenomena in the original problem.
Specifically, we say an allocation is locally envy-free if no agent envies a neighbor's allocation and locally proportional if each agent values her own allocation at least as much as the average value of her neighbors' allocations, with the former implying the latter. While global envy-freeness implies local envy-freeness, global proportionality does not imply local proportionality, or vice versa. A general result is that for any two distinct graphs on the same set of nodes and an allocation, there exists a set of valuation functions such that the allocation is locally proportional on one but not the other.
We fully characterize the set of graphs for which an oblivious single-cutter protocol -- a protocol that uses a single agent to cut the cake into pieces -- admits a bounded protocol with $O(n^2)$ query complexity for locally envy-free allocations in the Robertson-Webb model. We also consider the price of envy-freeness, which compares the total utility of an optimal allocation to the best utility of an allocation that is envy-free. We show that a lower bound of $\Omega(\sqrt{n})$ on the price of envy-freeness for global allocations in fact holds for local envy-freeness in any connected undirected graph. Thus, sparse graphs surprisingly do not provide more flexibility with respect to the quality of envy-free allocations.
Submitted 25 February, 2018; v1 submitted 20 November, 2016;
originally announced November 2016.
-
Long-distance spin-spin coupling via floating gates
Authors:
Luka Trifunovic,
Oliver Dial,
Mircea Trif,
James R. Wootton,
Rediet Abebe,
Amir Yacoby,
Daniel Loss
Abstract:
The electron spin is a natural two-level system that allows a qubit to be encoded. When localized in a gate-defined quantum dot, the electron spin provides a promising platform for a future functional quantum computer. The essential ingredient of any quantum computer is entanglement, here between electron spin qubits, which is commonly achieved via the exchange interaction. Nevertheless, there is an immense challenge as to how to scale the system up to include many qubits. Here we propose a novel architecture of a large-scale quantum computer based on a realization of long-distance quantum gates between electron spins localized in quantum dots. The crucial ingredients of such a long-distance coupling are floating metallic gates that mediate electrostatic coupling over large distances. We show, both analytically and numerically, that distant electron spins in an array of quantum dots can be coupled selectively, with coupling strengths that are larger than the electron spin decay and with switching times on the order of nanoseconds.
Submitted 6 October, 2011;
originally announced October 2011.