-
Participatory Budgeting Design for the Real World
Authors:
Roy Fairstein,
Gerdus Benadè,
Kobi Gal
Abstract:
Participatory budgeting engages the public in the process of allocating public money to different types of projects. PB designs differ in how voters are asked to express their preferences over candidate projects and how these preferences are aggregated to determine which projects to fund. This paper studies two fundamental questions in PB design: which voting format and aggregation method to use, and how to evaluate the outcomes of these design decisions. We conduct an extensive empirical study in which 1,800 participants vote in four participatory budgeting elections in a controlled setting to evaluate the practical effects of the choice of voting format and aggregation rule. We find that k-approval leads to the best user experience. With respect to the aggregation rule, greedy aggregation leads to outcomes that are highly sensitive to the input format used and the fraction of the population that participates. The method of equal shares, in contrast, leads to outcomes that are not sensitive to the type of voting format used, and these outcomes are remarkably stable even when the majority of the population does not participate in the election. These results carry valuable insights for PB practitioners and social choice researchers.
Submitted 26 February, 2023;
originally announced February 2023.
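As a rough illustration of the greedy aggregation named in the abstract above, the sketch below funds projects in decreasing order of approval count while they still fit the budget. This is only one common reading of "greedy"; the function name, the example projects and costs, and the alphabetical tie-breaking rule are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def greedy_approval(costs, ballots, budget):
    """Fund projects by descending approval count, skipping any that no longer fit."""
    # Count how many ballots approve each project.
    approvals = Counter(project for ballot in ballots for project in ballot)
    # Rank by approvals (descending); ties are broken alphabetically (an assumption).
    ranked = sorted(costs, key=lambda p: (-approvals[p], p))
    funded, remaining = [], budget
    for project in ranked:
        if costs[project] <= remaining:
            funded.append(project)
            remaining -= costs[project]
    return funded


# Hypothetical election: three projects, four approval ballots.
costs = {"park": 60, "library": 50, "bike_lane": 40}
ballots = [
    {"park", "library"},
    {"park", "bike_lane"},
    {"library", "bike_lane"},
    {"park"},
]
print(greedy_approval(costs, ballots, budget=100))
```

Because the rule only looks at raw approval counts, changing the ballot format or the set of participating voters can reorder the ranking and change the outcome, which is the kind of sensitivity the abstract reports.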
-
Automatic Creativity Measurement in Scratch Programs Across Modalities
Authors:
Anastasia Kovalkov,
Benjamin Paaßen,
Avi Segal,
Niels Pinkwart,
Kobi Gal
Abstract:
Promoting creativity is considered an important goal of education, but creativity is notoriously hard to measure. In this paper, we make the journey from defining a formal measure of creativity that is efficiently computable to applying the measure in a practical domain. The measure is general and relies on core theoretical concepts in creativity theory, namely fluency, flexibility, and originality, integrating with prior cognitive science literature. We adapted the general measure for projects in the popular visual programming language Scratch. We designed a machine learning model for predicting the creativity of Scratch projects, trained and evaluated on human expert creativity assessments in an extensive user study. Our results show that opinions about creativity in Scratch varied widely across experts. The automatic creativity assessment aligned with the assessment of the human experts more than the experts agreed with each other. This is a first step in providing computational models for measuring creativity that can be applied to educational technologies, and to scale up the benefit of creativity education in schools.
Submitted 7 November, 2022;
originally announced November 2022.
-
Detecting Suicide Risk in Online Counseling Services: A Study in a Low-Resource Language
Authors:
Amir Bialer,
Daniel Izmaylov,
Avi Segal,
Oren Tsur,
Yossi Levi-Belz,
Kobi Gal
Abstract:
With the increased awareness of situations of mental crisis and their societal impact, online services providing emergency support are becoming commonplace in many countries. Computational models, trained on discussions between help-seekers and providers, can support suicide prevention by identifying at-risk individuals. However, the lack of domain-specific models, especially in low-resource languages, poses a significant challenge for the automatic detection of suicide risk. We propose a model that combines pre-trained language models (PLMs) with a fixed set of manually crafted (and clinically approved) suicidal cues, followed by a two-stage fine-tuning process. Our model achieves 0.91 ROC-AUC and an F2-score of 0.55, significantly outperforming an array of strong baselines even early on in the conversation, which is critical for real-time detection in the field. Moreover, the model performs well across genders and age groups.
Submitted 11 September, 2022;
originally announced September 2022.
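For readers unfamiliar with the F2-score reported in the abstract above, the sketch below computes the standard F-beta score, which with beta = 2 weights recall more heavily than precision. The function name and the example precision and recall values are illustrative assumptions; the abstract does not report the model's precision or recall.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0  # Conventional value when both terms are zero.
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values, chosen only to show the effect of beta.
print(f_beta(0.25, 1.0))            # F2: perfect recall, low precision
print(f_beta(0.25, 1.0, beta=1.0))  # F1 on the same values, for comparison
```

Weighting recall is a natural fit for risk detection, where missing an at-risk individual is typically costlier than raising a false alarm.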
-
Welfare vs. Representation in Participatory Budgeting
Authors:
Roy Fairstein,
Reshef Meir,
Dan Vilenchik,
Kobi Gal
Abstract:
Participatory budgeting (PB) is a democratic process for allocating funds to projects based on the votes of members of the community. Different rules have been used to aggregate participants' votes. Lackner and Skowron (2020) studied the trade-off between notions of social welfare and fairness in the multi-winner setting (a special case of participatory budgeting with identical project costs), but there is little understanding of this trade-off in the more general PB setting. This paper provides a theoretical and empirical study of the worst-case guarantees of several common rules to better understand the trade-off between social welfare and representation. We show that many of the guarantees from the multi-winner setting do not generalize to the PB setting, and that the introduction of costs leads to substantially worse guarantees, thereby exacerbating the welfare-representation trade-off. We extend our theoretical analysis to study how the requirement of proportionality over voting rules affects the guarantees on social welfare and representation. We study the latter point also empirically, both on real and synthetic datasets. We show that variants of the recently suggested voting rule Rule-X (which satisfies proportionality) do very well in practice both with respect to social welfare and representation.
Submitted 25 May, 2022; v1 submitted 19 January, 2022;
originally announced January 2022.
-
MultiplexNet: Towards Fully Satisfied Logical Constraints in Neural Networks
Authors:
Nicholas Hoernle,
Rafael Michael Karampatsis,
Vaishak Belle,
Kobi Gal
Abstract:
We propose a novel way to incorporate expert knowledge into the training of deep neural networks. Many approaches encode domain constraints directly into the network architecture, requiring non-trivial or domain-specific engineering. In contrast, our approach, called MultiplexNet, represents domain knowledge as a logical formula in disjunctive normal form (DNF) which is easy to encode and to elicit from human experts. It introduces a Categorical latent variable that learns to choose which constraint term optimizes the error function of the network and it compiles the constraints directly into the output of existing learning algorithms. We demonstrate the efficacy of this approach empirically on several classical deep learning tasks, such as density estimation and classification in both supervised and unsupervised settings where prior knowledge about the domains was expressed as logical constraints. Our results show that the MultiplexNet approach learned to approximate unknown distributions well, often requiring fewer data samples than the alternative approaches. In some cases, MultiplexNet finds better solutions than the baselines; or solutions that could not be achieved with the alternative approaches. Our contribution is in encoding domain knowledge in a way that facilitates inference that is shown to be both efficient and general; and critically, our approach guarantees 100% constraint satisfaction in a network's output.
Submitted 2 November, 2021;
originally announced November 2021.
-
Proportional Participatory Budgeting with Substitute Projects
Authors:
Roy Fairstein,
Reshef Meir,
Kobi Gal
Abstract:
Participatory budgeting is a democratic process for allocating funds to projects based on the votes of members of the community. However, most input methods of voters' preferences prevent the voters from expressing complex relationships among projects, leading to outcomes that do not reflect their preferences well enough. In this paper, we propose an input method that begins to address this challenge, by allowing participants to express substitutes over projects. Then, we extend a known aggregation mechanism from the literature (Rule X) to handle substitute projects. We prove that our extended rule preserves proportionality under natural conditions, and show empirically that it obtains substantially more welfare than the original mechanism on instances with substitutes.
Submitted 9 June, 2021;
originally announced June 2021.
-
Revisiting Citizen Science Through the Lens of Hybrid Intelligence
Authors:
Janet Rafner,
Miroslav Gajdacz,
Gitte Kragh,
Arthur Hjorth,
Anna Gander,
Blanka Palfi,
Aleks Berditchevskaia,
François Grey,
Kobi Gal,
Avi Segal,
Mike Walmsley,
Josh Aaron Miller,
Dominik Dellerman,
Muki Haklay,
Pietro Michelucci,
Jacob Sherson
Abstract:
Artificial Intelligence (AI) can augment and sometimes even replace human cognition. Inspired by efforts to value human agency alongside productivity, we discuss the benefits of solving Citizen Science (CS) tasks with Hybrid Intelligence (HI), a synergetic mixture of human and artificial intelligence. Currently there is no clear framework or methodology on how to create such an effective mixture. Due to the unique participant-centered set of values and the abundance of tasks drawing upon both human common sense and complex 21st century skills, we believe that the field of CS offers an invaluable testbed for the development of HI and human-centered AI of the 21st century, while benefiting CS as well. In order to investigate this potential, we first relate CS to adjacent computational disciplines. Then, we demonstrate that CS projects can be grouped according to their potential for HI-enhancement by examining two key dimensions: the level of digitization and the amount of knowledge or experience required for participation. Finally, we propose a framework for types of human-AI interaction in CS based on established criteria of HI. This "HI lens" provides the CS community with an overview of several ways to utilize the combination of AI and human intelligence in their projects. It also allows the AI community to gain ideas on how developing AI in CS projects can further their own field.
Submitted 30 April, 2021;
originally announced April 2021.
-
One Size Does Not Fit All: A Study of Badge Behavior in Stack Overflow
Authors:
Stav Yanovsky,
Nicholas Hoernle,
Omer Lev,
Kobi Gal
Abstract:
Badges are endemic to online interaction sites, from Question and Answer (Q&A) websites to ride sharing, as systems for rewarding participants for their contributions. This paper studies how badge design affects people's contributions and behavior over time. Past work has shown that badges "steer" people's behavior toward substantially increasing the amount of contributions before obtaining the badge, and immediately decreasing their contributions thereafter, returning to their baseline contribution levels. In contrast, we find that the steering effect depends on the type of user, as modeled by the rate and intensity of the user's contributions. We use these measures to distinguish between different groups of user activity, including users who are not affected by the badge system despite being significant contributors to the site. We provide a predictive model of how users change their activity group over the course of their lifetime in the system. We demonstrate our approach empirically in three different Q&A sites on Stack Exchange with hundreds of thousands of users, for two types of activities (editing and voting on posts).
Submitted 13 August, 2020;
originally announced August 2020.
-
In the Eye of the Beholder? Detecting Creativity in Visual Programming Environments
Authors:
Anastasia Kovalkov,
Avi Segal,
Kobi Gal
Abstract:
Visual programming environments are increasingly part of the curriculum in schools. Their potential for promoting creative thinking of students is an important factor in their adoption. However, there does not exist a standard approach for detecting creativity in students' programming behavior, and analyzing programs manually requires human expertise and is time consuming. This work provides a computational tool for measuring creativity in visual programming that combines theory from the literature with data mining approaches. It adapts the classical dimensions of creative processes to our setting, as well as considering new aspects such as visual elements of the projects. We apply this approach to the Scratch programming environment, measuring the creativity score of hundreds of projects. We show that current metrics of computational thinking in Scratch fail to capture important aspects of creativity, such as the visual artifacts of projects. Interviews conducted with Scratch teachers validate our approach.
Submitted 10 April, 2020;
originally announced April 2020.
-
Personalization in Human-AI Teams: Improving the Compatibility-Accuracy Tradeoff
Authors:
Jonathan Martinez,
Kobi Gal,
Ece Kamar,
Levi H. S. Lelis
Abstract:
AI systems that model and interact with users can update their models over time to reflect new information and changes in the environment. Although these updates may improve the overall performance of the AI system, they may actually hurt the performance with respect to individual users. Prior work has studied the trade-off between improving the system's accuracy following an update and the compatibility of the updated system with prior user experience. The more the model is forced to be compatible with a prior version, the higher loss in accuracy it will incur. In this paper, we show that by personalizing the loss function to specific users, in some cases it is possible to improve the compatibility-accuracy trade-off with respect to these users (increase the compatibility of the model while sacrificing less accuracy). We present experimental results indicating that this approach provides moderate improvements on average (around 20%) but large improvements for certain users (up to 300%).
Submitted 19 August, 2020; v1 submitted 5 April, 2020;
originally announced April 2020.
-
Applying Transparency in Artificial Intelligence based Personalization Systems
Authors:
Laura Schelenz,
Avi Segal,
Kobi Gal
Abstract:
Artificial Intelligence based systems increasingly use personalization to provide users with relevant content, products, and solutions. Personalization is intended to support users and address their respective needs and preferences. However, users are becoming increasingly vulnerable to online manipulation due to algorithmic advancements and lack of transparency. Such manipulation decreases users' levels of trust, autonomy, and satisfaction concerning the systems with which they interact. Increasing transparency is an important goal for personalization based systems. Unfortunately, system designers lack guidance in assessing and implementing transparency in their developed systems.
In this work we combine insights from technology ethics and computer science to generate a list of transparency best practices for machine generated personalization. Based on these best practices, we develop a checklist to be used by designers wishing to evaluate and increase the transparency of their algorithmic systems. Adopting a designer perspective, we apply the checklist to prominent online services and discuss its advantages and shortcomings. We encourage researchers to adopt the checklist in various environments and to work towards a consensus-based tool for measuring transparency in the personalization community.
Submitted 21 August, 2020; v1 submitted 2 April, 2020;
originally announced April 2020.
-
The Phantom Steering Effect in Q&A Websites
Authors:
Nicholas Hoernle,
Gregory Kehne,
Ariel D. Procaccia,
Kobi Gal
Abstract:
Badges are commonly used in online platforms as incentives for promoting contributions. It is widely accepted that badges "steer" people's behavior toward increasing their rate of contributions before obtaining the badge. This paper provides a new probabilistic model of user behavior in the presence of badges. By applying the model to data from thousands of users on the Q&A site Stack Overflow, we find that steering is not as widely applicable as was previously understood. Rather, the majority of users remain apathetic toward badges, while still providing a substantial number of contributions to the site. An interesting statistical phenomenon, termed "Phantom Steering," accounts for the interaction data of these users and this may have contributed to some previous conclusions about steering. Our results suggest that a small population, approximately 20%, of users respond to the badge incentives. Moreover, we conduct a qualitative survey of the users on Stack Overflow which provides further evidence that the insights from the model reflect the true behavior of the community. We argue that while badges might contribute toward a suite of effective rewards in an online system, research into other aspects of reward systems such as Stack Overflow reputation points should become a focus of the community.
Submitted 21 August, 2020; v1 submitted 14 February, 2020;
originally announced February 2020.
-
EduBERT: Pretrained Deep Language Models for Learning Analytics
Authors:
Benjamin Clavié,
Kobi Gal
Abstract:
The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to domain-specific NLP tasks such as re-hospitalization prediction from clinical notes. This paper demonstrates that using large pretrained models produces excellent results on common learning analytics tasks. Pre-training deep language models using student forum data from a wide array of online courses improves performance beyond the state of the art on three text classification tasks. We also show that a smaller, distilled version of our model produces the best results on two of the three tasks while limiting computational cost. We make both models available to the research community at large.
Submitted 2 December, 2019;
originally announced December 2019.
-
Interpretable Models for Understanding Immersive Simulations
Authors:
Nicholas Hoernle,
Kobi Gal,
Barbara Grosz,
Leilah Lyons,
Ada Ren,
Andee Rubin
Abstract:
This paper describes methods for comparative evaluation of the interpretability of models of high dimensional time series data inferred by unsupervised machine learning algorithms. The time series data used in this investigation were logs from an immersive simulation like those commonly used in education and healthcare training. The structures learnt by the models provide representations of participants' activities in the simulation which are intended to be meaningful to people's interpretation. To choose the model that induces the best representation, we designed two interpretability tests, each of which evaluates the extent to which a model's output aligns with people's expectations or intuitions of what has occurred in the simulation. We compared the performance of the models on these interpretability tests to their performance on statistical information criteria. We show that the models that optimize interpretability quality differ from those that optimize (statistical) information theoretic criteria. Furthermore, we found that a model using a fully Bayesian approach performed well on both the statistical and human-interpretability measures. The Bayesian approach is a good candidate for fully automated model selection, i.e., when direct empirical investigations of interpretability are costly or infeasible.
Submitted 4 May, 2020; v1 submitted 24 September, 2019;
originally announced September 2019.
-
Modeling People's Voting Behavior with Poll Information
Authors:
Roy Fairstein,
Adam Lauz,
Kobi Gal,
Reshef Meir
Abstract:
Despite the prevalence of voting systems in the real world, there is no consensus among researchers on how people vote strategically, even in simple voting settings. This paper addresses this gap by comparing different approaches that have been used to model strategic voting, including expected utility maximization, heuristic decision-making, and bounded rationality models. The models are applied to data collected from hundreds of people in controlled voting experiments, where people vote after observing non-binding poll information. We introduce a new voting model, the Attainability-Utility (AU) heuristic, which weighs the popularity of a candidate according to the poll against the utility of the candidate to the voter. We argue that the AU model is cognitively plausible, and show that it is able to predict people's voting behavior significantly better than other models from the literature. It was almost on par with (and sometimes better than) a machine learning algorithm that uses substantially more information. Our results provide new insights into the strategic considerations of voters that undermine the prevalent assumptions of much theoretical work in social choice.
Submitted 23 September, 2019;
originally announced September 2019.
-
A difficulty ranking approach to personalization in E-learning
Authors:
Avi Segal,
Kobi Gal,
Guy Shani,
Bracha Shapira
Abstract:
The prevalence of e-learning systems and on-line courses has made educational material widely accessible to students of varying abilities and backgrounds. There is thus a growing need to accommodate individual differences in e-learning systems. This paper presents an algorithm called EduRank for personalizing educational content to students that combines a collaborative filtering algorithm with voting methods. EduRank constructs a difficulty ranking for each student by aggregating the rankings of similar students using different aspects of their performance on common questions. These aspects include grades, number of retries, and time spent solving questions. It infers a difficulty ranking directly over the questions for each student, rather than ordering them according to the student's predicted score. The EduRank algorithm was tested on two data sets containing thousands of students and a million records. It was able to outperform the state-of-the-art ranking approaches as well as a domain expert. EduRank was used by students in a classroom activity, where a prior model was incorporated to predict the difficulty rankings of students with no prior history in the system. It was shown to lead students to solve more difficult questions than an ordering by a domain expert, without reducing their performance.
Submitted 28 July, 2019;
originally announced July 2019.
-
Predicting Strategic Voting Behavior with Poll Information
Authors:
Roy Fairstein,
Adam Lauz,
Kobi Gal,
Reshef Meir
Abstract:
The question of how people vote strategically under uncertainty has attracted much attention in several disciplines. Theoretical decision models have been proposed which vary in their assumptions on the sophistication of the voters and on the information made available to them about others' preferences and their voting behavior. This work focuses on modeling strategic voting behavior under poll information. It proposes a new heuristic for voting behavior that weighs the success of each candidate according to the poll score against the utility of the candidate given the voters' preferences. The model weights can be tuned individually for each voter. We compared this model with other relevant voting models from the literature on data obtained from a recently released large-scale study. We show that the new heuristic outperforms all other tested models. The prediction errors of the model can be partly explained by inconsistent voters who vote for (weakly) dominated candidates.
Submitted 19 May, 2018;
originally announced May 2018.
-
Combining Difficulty Ranking with Multi-Armed Bandits to Sequence Educational Content
Authors:
Avi Segal,
Yossi Ben David,
Joseph Jay Williams,
Kobi Gal,
Yaar Shalom
Abstract:
As e-learning systems become more prevalent, there is a growing need for them to accommodate individual differences between students. This paper addresses the problem of how to personalize educational content to students in order to maximize their learning gains over time. We present a new computational approach to this problem called MAPLE (Multi-Armed Bandits based Personalization for Learning Environments) that combines difficulty ranking with multi-armed bandits. Given a set of target questions, MAPLE estimates the expected learning gains for each question and uses an exploration-exploitation strategy to choose the next question to pose to the student. It maintains a personalized ranking over the difficulties of questions in the target set, which is used in two ways: first, to obtain initial estimates over the learning gains for the set of questions; second, to update the estimates over time based on the students' responses. We show in simulations that MAPLE was able to improve students' learning gains compared to approaches that sequence questions in increasing level of difficulty, or rely on content experts. When implemented in a live e-learning system in the wild, MAPLE showed promising results. This work demonstrates the efficacy of using stochastic approaches to the sequencing problem when augmented with information about question difficulty.
Submitted 14 April, 2018;
originally announced April 2018.
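The exploration-exploitation loop described in the abstract above can be illustrated with a generic epsilon-greedy bandit. This is not MAPLE itself: the abstract does not specify its exploration strategy or update rule, so the epsilon-greedy choice, the running-mean update, the function names, and the example priors below are all assumptions made for illustration.

```python
import random


def choose_question(estimates, epsilon, rng):
    """Epsilon-greedy: explore a random question with probability epsilon, else exploit."""
    questions = sorted(estimates)  # Fixed order keeps the choice deterministic on ties.
    if rng.random() < epsilon:
        return rng.choice(questions)
    return max(questions, key=lambda q: estimates[q])


def update(estimates, counts, question, observed_gain):
    """Fold one observed learning gain into the running mean for that question."""
    counts[question] += 1
    estimates[question] += (observed_gain - estimates[question]) / counts[question]


# Hypothetical priors, standing in for estimates seeded from a difficulty ranking.
# Each prior is treated as one pseudo-observation (count = 1).
estimates = {"q_easy": 0.1, "q_medium": 0.2, "q_hard": 0.05}
counts = {q: 1 for q in estimates}

rng = random.Random(0)
question = choose_question(estimates, epsilon=0.0, rng=rng)  # Pure exploitation.
update(estimates, counts, question, observed_gain=1.0)
print(question, estimates[question])
```

Seeding the estimates from a prior, rather than starting from zero, is what lets ranking information shape the first few choices before any responses have been observed.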