Search | arXiv e-print repository

Acoustic tactile sensing for mobile robot wheels

Authors: Wilfred Mason, David Brenken, Falcon Z. Dai, Ricardo Gonzalo Cruz Castillo, Olivier St-Martin Cormier, Audrey Sedal

Abstract: Tactile sensing in mobile robots remains under-explored, mainly due to challenges related to sensor integration and the complexities of distributed sensing. In this work, we present a tactile sensing architecture for mobile robots based on wheel-mounted acoustic waveguides. Our sensor architecture enables tactile sensing along the entire circumference of a wheel with a single active component: an off-the-shelf acoustic rangefinder. We present findings showing that our sensor, mounted on the wheel of a mobile robot, is capable of discriminating between different terrains, detecting and classifying obstacles with different geometries, and performing collision detection via contact localization. We also present a comparison between our sensor and sensors traditionally used in mobile robots, and point to the potential for sensor fusion approaches that leverage the unique capabilities of our tactile sensing architecture. Our findings demonstrate that autonomous mobile robots can further leverage our sensor architecture for diverse mapping tasks requiring knowledge of terrain material, surface topology, and underlying structure.

Submitted 28 February, 2024; originally announced February 2024.

Comments: 12 pages, 12 figures

arXiv:2401.15994 [pdf]

Extracting and visualizing a new classification system for Colombia's National Administrative Department of Statistics. A visual analytics framework case study

Authors: Pierre Raimbaud, Jaime Camilo Espitia Castillo, John Guerra-Gomez

Abstract: In a world filled with data, it is expected for a nation to take decisions informed by data. However, countries need to first collect and publish such data in a way meaningful for both citizens and policy makers. A good thematic classification could be instrumental in helping users navigate and find the right resources on a rich data repository as the one collected by Colombia's National Administrative Department of Statistics (DANE). The Visual Analytics Framework is a methodology for conducting visual analysis developed by T. Munzner et al. [T. Munzner, Visualization Analysis and Design, A K Peters Visualization Series, 1, 2014] that could help with this task. This paper presents a case study applying such framework conducted to help the DANE better visualize their data repository, and present a more understandable classification of it. It describes three main analysis tasks identified, the proposed solutions and the collection of insights generated from them.

Submitted 29 January, 2024; originally announced January 2024.

Comments: V Jornadas Iberoamericanas de Interacción Humano-Computador 2019, Benemérita Universidad Autónoma de Puebla, Jun 2019, Puebla (Mexico), Mexico

arXiv:2401.00420 [pdf, other]

SynCDR : Training Cross Domain Retrieval Models with Synthetic Data

Authors: Samarth Mishra, Carlos D. Castillo, Hongcheng Wang, Kate Saenko, Venkatesh Saligrama

Abstract: In cross-domain retrieval, a model is required to identify images from the same semantic category across two visual domains. For instance, given a sketch of an object, a model needs to retrieve a real image of it from an online store's catalog. A standard approach for such a problem is learning a feature space of images where Euclidean distances reflect similarity. Even without human annotations, which may be expensive to acquire, prior methods function reasonably well using unlabeled images for training. Our problem constraint takes this further to scenarios where the two domains do not necessarily share any common categories in training data. This can occur when the two domains in question come from different versions of some biometric sensor recording identities of different people. We posit a simple solution, which is to generate synthetic data to fill in these missing category examples across domains. This, we do via category preserving translation of images from one visual domain to another. We compare approaches specifically trained for this translation for a pair of domains, as well as those that can use large-scale pre-trained text-to-image diffusion models via prompts, and find that the latter can generate better replacement synthetic data, leading to more accurate cross-domain retrieval models. Our best SynCDR model can outperform prior art by up to 15%. Code for our work is available at https://github.com/samarth4149/SynCDR.

Submitted 19 March, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Pre-print

arXiv:2311.11776 [pdf, ps, other]

Responsible AI Research Needs Impact Statements Too

Authors: Alexandra Olteanu, Michael Ekstrand, Carlos Castillo, Jina Suh

Abstract: All types of research, development, and policy work can have unintended, adverse consequences - work in responsible artificial intelligence (RAI), ethical AI, or ethics in AI is no exception.

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2308.09596 [pdf, other]

Disparity, Inequality, and Accuracy Tradeoffs in Graph Neural Networks for Node Classification

Authors: Arpit Merchant, Carlos Castillo

Abstract: Graph neural networks (GNNs) are increasingly used in critical human applications for predicting node labels in attributed graphs. Their ability to aggregate features from nodes' neighbors for accurate classification also has the capacity to exacerbate existing biases in data or to introduce new ones towards members from protected demographic groups. Thus, it is imperative to quantify how GNNs may be biased and to what extent their harmful effects may be mitigated. To this end, we propose two new GNN-agnostic interventions namely, (i) PFR-AX which decreases the separability between nodes in protected and non-protected groups, and (ii) PostProcess which updates model predictions based on a blackbox policy to minimize differences between error rates across demographic groups. Through a large set of experiments on four datasets, we frame the efficacies of our approaches (and three variants) in terms of their algorithmic fairness-accuracy tradeoff and benchmark our results against three strong baseline interventions on three state-of-the-art GNN models. Our results show that no single intervention offers a universally optimal tradeoff, but PFR-AX and PostProcess provide granular control and improve model confidence when correctly predicting positive outcomes for nodes in protected groups.

Submitted 18 August, 2023; originally announced August 2023.

Comments: Accepted to CIKM 2023

arXiv:2305.19160 [pdf, other]

Recognizing People by Body Shape Using Deep Networks of Images and Words

Authors: Blake A. Myers, Lucas Jaggernauth, Thomas M. Metz, Matthew Q. Hill, Veda Nandan Gandi, Carlos D. Castillo, Alice J. O'Toole

Abstract: Common and important applications of person identification occur at distances and viewpoints in which the face is not visible or is not sufficiently resolved to be useful. We examine body shape as a biometric across distance and viewpoint variation. We propose an approach that combines standard object classification networks with representations based on linguistic (word-based) descriptions of bodies. Algorithms with and without linguistic training were compared on their ability to identify people from body shape in images captured across a large range of distances/views (close-range, 100m, 200m, 270m, 300m, 370m, 400m, 490m, 500m, 600m, and at elevated pitch in images taken by an unmanned aerial vehicle [UAV]). Accuracy, as measured by identity-match ranking and false accept errors in an open-set test, was surprisingly good. For identity-ranking, linguistic models were more accurate for close-range images, whereas non-linguistic models fared better at intermediary distances. Fusion of the linguistic and non-linguistic embeddings improved performance at all but the farthest distance. Although the non-linguistic model yielded fewer false accepts at all distances, fusion of the linguistic and non-linguistic models decreased false accepts for all but the UAV images. We conclude that linguistic and non-linguistic representations of body shape can offer complementary identity information for bodies that can improve identification in applications of interest.

Submitted 30 May, 2023; originally announced May 2023.

Comments: 9 pages, 5 figures, 4 tables

arXiv:2305.09319 [pdf, other]

Fairness and Diversity in Information Access Systems

Authors: Lorenzo Porcaro, Carlos Castillo, Emilia Gómez, João Vinagre

Abstract: Among the seven key requirements to achieve trustworthy AI proposed by the High-Level Expert Group on Artificial Intelligence (AI-HLEG) established by the European Commission (EC), the fifth requirement ("Diversity, non-discrimination and fairness") declares: "In order to achieve Trustworthy AI, we must enable inclusion and diversity throughout the entire AI system's life cycle. [...] This requirement is closely linked with the principle of fairness". In this paper, we try to shed light on how closely these two distinct concepts, diversity and fairness, may be treated by focusing on information access systems and ranking literature. These concepts should not be used interchangeably because they do represent two different values, but what we argue is that they also cannot be considered totally unrelated or divergent. Having diversity does not imply fairness, but fostering diversity can effectively lead to fair outcomes, an intuition behind several methods proposed to mitigate the disparate impact of information access systems, i.e. recommender systems and search engines.

Submitted 16 May, 2023; originally announced May 2023.

Comments: Presented at the European Workshop on Algorithmic Fairness (EWAF'23) Winterthur, Switzerland, June 7-9, 2023

arXiv:2302.13897 [pdf]

Resistance Maintained in Digital Organisms despite Guanine/Cytosine-Based Fitness Cost and Extended De-Selection: Implications to Microbial Antibiotics Resistance

Authors: Clarence FG Castillo, Zhu En Chay, Maurice HT Ling

Abstract: Antibiotics resistance has caused much complication in the treatment of diseases, where the pathogen is no longer susceptible to specific antibiotics and the use of such antibiotics is no longer effective for treatment. A recent study that utilizes digital organisms suggests that complete elimination of specific antibiotic resistance is unlikely after the disuse of antibiotics, assuming that there are no fitness costs for maintaining resistance once resistance is established. Fitness costs are referred to as a reaction to change in environment, where an organism improves its abilities in one area at the expense of the other. Our goal in this study is to use digital organisms to examine the rate of gain and loss of resistance where fitness costs have incurred in maintaining resistance. Our results showed that GC-content based fitness cost during de-selection by removal of antibiotic-induced selective pressure portrayed similar trends in resistance compared to that of no fitness cost, at all stages of initial selection, repeated de-selection and re-introduction of selective pressure. Paired t-test suggested that prolonged stabilization of resistance after initial loss is not statistically significant for its difference to that of no fitness cost. This suggests that complete elimination of specific antibiotics resistance is unlikely after the disuse of antibiotics despite presence of fitness cost in maintaining antibiotic resistance during the disuse of antibiotics, once a resistant pool of micro-organism has been established.

Submitted 19 February, 2023; originally announced February 2023.

Journal ref: MOJ Proteomics & Bioinformatics 2(2): 00039 (2015)

arXiv:2212.08969 [pdf, other]

A Brief Survey on Person Recognition at a Distance

Authors: Christopher B. Nalty, Neehar Peri, Joshua Gleason, Carlos D. Castillo, Shuowen Hu, Thirimachos Bourlai, Rama Chellappa

Abstract: Person recognition at a distance entails recognizing the identity of an individual appearing in images or videos collected by long-range imaging systems such as drones or surveillance cameras. Despite recent advances in deep convolutional neural networks (DCNNs), this remains challenging. Images or videos collected by long-range cameras often suffer from atmospheric turbulence, blur, low-resolution, unconstrained poses, and poor illumination. In this paper, we provide a brief survey of recent advances in person recognition at a distance. In particular, we review recent work in multi-spectral face verification, person re-identification, and gait-based analysis techniques. Furthermore, we discuss the merits and drawbacks of existing approaches and identify important, yet under-explored challenges for deploying remote person recognition systems in-the-wild.

Submitted 17 December, 2022; originally announced December 2022.

Comments: This work has been accepted to the IEEE Asilomar Conference on Signals, Systems, and Computers (ACSSC) 2022

arXiv:2212.00592 [pdf, other]

Assessing the Impact of Music Recommendation Diversity on Listeners: A Longitudinal Study

Authors: Lorenzo Porcaro, Emilia Gómez, Carlos Castillo

Abstract: We present the results of a 12-week longitudinal user study wherein the participants, 110 subjects from Southern Europe, received on a daily basis Electronic Music (EM) diversified recommendations. By analyzing their explicit and implicit feedback, we show that exposure to specific levels of music recommendation diversity may be responsible for long-term impacts on listeners' attitudes. In particular, we highlight the function of diversity in increasing the openness in listening to EM, a music genre not particularly known or liked by the participants previous to their participation in the study. Moreover, we demonstrate that recommendations may help listeners in removing positive and negative attachments towards EM, deconstructing pre-existing implicit associations but also stereotypes associated with this music. In addition, our results show the significant clout that recommendation diversity has in generating curiosity in listeners.

Submitted 1 December, 2022; originally announced December 2022.

arXiv:2211.01679 [pdf]

doi 10.22152/programming-journal.org/2023/7/5

Out-of-Things Debugging: A Live Debugging Approach for Internet of Things

Authors: Carlos Rojas Castillo, Matteo Marra, Jim Bauwens, Elisa Gonzalez Boix

Abstract: Context: Internet of Things (IoT) has become an important kind of distributed systems thanks to the wide-spread of cheap embedded devices equipped with different networking technologies. Although ubiquitous, developing IoT systems remains challenging. Inquiry: A recent field study with 194 IoT developers identifies debugging as one of the main challenges faced when developing IoT systems. This comes from the lack of debugging tools taking into account the unique properties of IoT systems such as non-deterministic data, and hardware restricted devices. On the one hand, offline debuggers allow developers to analyse post-failure recorded program information, but impose too much overhead on the devices while generating such information. Furthermore, the analysis process is also time-consuming and might miss contextual information relevant to find the root cause of bugs. On the other hand, online debuggers do allow debugging a program upon a failure while providing contextual information (e.g., stack trace). In particular, remote online debuggers enable debugging of devices without physical access to them. However, they experience debugging interference due to network delays which complicates bug reproducibility, and have limited support for dynamic software updates on remote devices. Approach: This paper proposes out-of-things debugging, an online debugging approach especially designed for IoT systems. The debugger is always-on as it ensures constant availability to for instance debug post-deployment situations. Upon a failure or breakpoint, out-of-things debugging moves the state of a deployed application to the developer's machine. Developers can then debug the application locally by applying operations (e.g., step commands) to the retrieved state. Once debugging is finished, developers can commit bug fixes to the device through live update capabilities. Finally, by means of a fine-grained flexible interface for accessing remote resources, developers have full control over the debugging overhead imposed on the device, and the access to device hardware resources (e.g., sensors) needed during local debugging. Knowledge: Out-of-things debugging maintains good properties of remote debugging as it does not require physical access to the device to debug it, while reducing debugging interference since there are no network delays on operations (e.g., stepping) issued on the debugger since those happen locally. Furthermore, device resources are only accessed when requested by the user which further mitigates overhead and opens avenues for mocking or simulation of non-accessed resources. Grounding: We implemented an out-of-things debugger as an extension to a WebAssembly Virtual Machine and benchmarked its suitability for IoT. In particular, we compared our solution to remote debugging alternatives based on metrics such as network overhead, memory usage, scalability, and usability in production settings. From the benchmarks, we conclude that our debugger exhibits competitive performance in addition to confining overhead without sacrificing debugging convenience and flexibility. Importance: Out-of-things debugging enables debugging of IoT systems by means of classical online operations (e.g., stepwise execution) while addressing IoT-specific concerns (e.g., hardware limitations). We show that having the debugger always-on does not have to come at cost of performance loss or increased overhead but instead can enforce a smooth-going and flexible debugging experience of IoT systems.

Submitted 3 November, 2022; originally announced November 2022.

Journal ref: The Art, Science, and Engineering of Programming, 2023, Vol. 7, Issue 2, Article 5

arXiv:2207.05316 [pdf, other]

Twin identification over viewpoint change: A deep convolutional neural network surpasses humans

Authors: Connor J. Parde, Virginia E. Strehle, Vivekjyoti Banerjee, Ying Hu, Jacqueline G. Cavazos, Carlos D. Castillo, Alice J. O'Toole

Abstract: Deep convolutional neural networks (DCNNs) have achieved human-level accuracy in face identification (Phillips et al., 2018), though it is unclear how accurately they discriminate highly-similar faces. Here, humans and a DCNN performed a challenging face-identity matching task that included identical twins. Participants (N=87) viewed pairs of face images of three types: same-identity, general imposter pairs (different identities from similar demographic groups), and twin imposter pairs (identical twin siblings). The task was to determine whether the pairs showed the same person or different people. Identity comparisons were tested in three viewpoint-disparity conditions: frontal to frontal, frontal to 45-degree profile, and frontal to 90-degree profile. Accuracy for discriminating matched-identity pairs from twin-imposters and general imposters was assessed in each viewpoint-disparity condition. Humans were more accurate for general-imposter pairs than twin-imposter pairs, and accuracy declined with increased viewpoint disparity between the images in a pair. A DCNN trained for face identification (Ranjan et al., 2018) was tested on the same image pairs presented to humans. Machine performance mirrored the pattern of human accuracy, but with performance at or above all humans in all but one condition. Human and machine similarity scores were compared across all image-pair types. This item-level analysis showed that human and machine similarity ratings correlated significantly in six of nine image-pair types [range r=0.38 to r=0.63], suggesting general accord between the perception of face similarity by humans and the DCNN. These findings also contribute to our understanding of DCNN performance for discriminating high-resemblance faces, demonstrate that the DCNN performs at a level at or above humans, and suggest a degree of parity between the features used by humans and the DCNN.

Submitted 12 July, 2022; originally announced July 2022.

arXiv:2205.07970 [pdf, other]

SciLander: Mapping the Scientific News Landscape

Authors: Maurício Gruppi, Panayiotis Smeros, Sibel Adalı, Carlos Castillo, Karl Aberer

Abstract: The COVID-19 pandemic has fueled the spread of misinformation on social media and the Web as a whole. The phenomenon dubbed 'infodemic' has taken the challenges of information veracity and trust to new heights by massively introducing seemingly scientific and technical elements into misleading content. Despite the existing body of work on modeling and predicting misinformation, the coverage of very complex scientific topics with inherent uncertainty and an evolving set of findings, such as COVID-19, provides many new challenges that are not easily solved by existing tools. To address these issues, we introduce SciLander, a method for learning representations of news sources reporting on science-based topics. SciLander extracts four heterogeneous indicators for the news sources; two generic indicators that capture (1) the copying of news stories between sources, and (2) the use of the same terms to mean different things (i.e., the semantic shift of terms), and two scientific indicators that capture (1) the usage of jargon and (2) the stance towards specific citations. We use these indicators as signals of source agreement, sampling pairs of positive (similar) and negative (dissimilar) samples, and combine them in a unified framework to train unsupervised news source embeddings with a triplet margin loss objective. We evaluate our method on a novel COVID-19 dataset containing nearly 1M news articles from 500 sources spanning a period of 18 months since the beginning of the pandemic in 2020. Our results show that the features learned by our model outperform state-of-the-art baseline methods on the task of news veracity classification. Furthermore, a clustering analysis suggests that the learned representations encode information about the reliability, political leaning, and partisanship bias of these sources.

Submitted 16 May, 2022; originally announced May 2022.

arXiv:2204.13861 [pdf, other]

Where in the World is this Image? Transformer-based Geo-localization in the Wild

Authors: Shraman Pramanick, Ewa M. Nowara, Joshua Gleason, Carlos D. Castillo, Rama Chellappa

Abstract: Predicting the geographic location (geo-localization) from a single ground-level RGB image taken anywhere in the world is a very challenging problem. The challenges include huge diversity of images due to different environmental scenarios, drastic variation in the appearance of the same location depending on the time of the day, weather, season, and more importantly, the prediction is made from a single image possibly having only a few geo-locating cues. For these reasons, most existing works are restricted to specific cities, imagery, or worldwide landmarks. In this work, we focus on developing an efficient solution to planet-scale single-image geo-localization. To this end, we propose TransLocator, a unified dual-branch transformer network that attends to tiny details over the entire image and produces robust feature representation under extreme appearance variations. TransLocator takes an RGB image and its semantic segmentation map as inputs, interacts between its two parallel branches after each transformer layer, and simultaneously performs geo-localization and scene recognition in a multi-task fashion. We evaluate TransLocator on four benchmark datasets - Im2GPS, Im2GPS3k, YFCC4k, YFCC26k and obtain 5.5%, 14.1%, 4.9%, 9.9% continent-level accuracy improvement over the state-of-the-art. TransLocator is also validated on real-world test images and found to be more effective than previous methods.

Submitted 25 July, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: Accepted in ECCV 2022

arXiv:2204.12591 [pdf, other]

The Influence of the Other-Race Effect on Susceptibility to Face Morphing Attacks

Authors: Snipta Mallick, Geraldine Jeckeln, Connor J. Parde, Carlos D. Castillo, Alice J. O'Toole

Abstract: Facial morphs created between two identities resemble both of the faces used to create the morph. Consequently, humans and machines are prone to mistake morphs made from two identities for either of the faces used to create the morph. This vulnerability has been exploited in "morph attacks" in security scenarios. Here, we asked whether the "other-race effect" (ORE) -- the human advantage for identifying own- vs. other-race faces -- exacerbates morph attack susceptibility for humans. We also asked whether face-identification performance in a deep convolutional neural network (DCNN) is affected by the race of morphed faces. Caucasian (CA) and East-Asian (EA) participants performed a face-identity matching task on pairs of CA and EA face images in two conditions. In the morph condition, different-identity pairs consisted of an image of identity "A" and a 50/50 morph between images of identity "A" and "B". In the baseline condition, morphs of different identities never appeared. As expected, morphs were identified mistakenly more often than original face images. Moreover, CA participants showed an advantage for CA faces in comparison to EA faces (a partial ORE). Of primary interest, morph identification was substantially worse for cross-race faces than for own-race faces. Similar to humans, the DCNN performed more accurately for original face images than for morphed image pairs. Notably, the deep network proved substantially more accurate than humans in both cases. The results point to the possibility that DCNNs might be useful for improving face identification accuracy when morphed faces are presented. They also indicate the significance of the ORE in morph attack susceptibility in applied settings.

Submitted 26 April, 2022; originally announced April 2022.

Comments: 4 figures, 11 pages

arXiv:2204.10230 [pdf, other]

Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive Approach Using Transformers

Authors: Fedor Vitiugin, Carlos Castillo

Abstract: Relevant and timely information collected from social media during crises can be an invaluable resource for emergency management. However, extracting this information remains a challenging task, particularly when dealing with social media postings in multiple languages. This work proposes a cross-lingual method for retrieving and summarizing crisis-relevant information from social media postings. We describe a uniform way of expressing various information needs through structured queries and a way of creating summaries answering those information needs. The method is based on multilingual transformers embeddings. Queries are written in one of the languages supported by the embeddings, and the extracted sentences can be in any of the other languages supported. Abstractive summaries are created by transformers. The evaluation, done by crowdsourcing evaluators and emergency management experts, and carried out on collections extracted from Twitter during five large-scale disasters spanning ten languages, shows the flexibility of our approach. The generated summaries are regarded as more focused, structured, and coherent than existing state-of-the-art methods, and experts compare them favorably against summaries created by existing, state-of-the-art methods.

Submitted 21 April, 2022; originally announced April 2022.

arXiv:2203.15514 [pdf, other]

Human Response to an AI-Based Decision Support System: A User Study on the Effects of Accuracy and Bias

Authors: David Solans, Andrea Beretta, Manuel Portela, Carlos Castillo, Anna Monreale

Abstract: Artificial Intelligence (AI) is increasingly used to build Decision Support Systems (DSS) across many domains. This paper describes a series of experiments designed to observe human response to different characteristics of a DSS such as accuracy and bias, particularly the extent to which participants rely on the DSS, and the performance they achieve. In our experiments, participants play a simple online game inspired by so-called "wildcat" (i.e., exploratory) drilling for oil. The landscape has two layers: a visible layer describing the costs (terrain), and a hidden layer describing the reward (oil yield). Participants in the control group play the game without receiving any assistance, while in treatment groups they are assisted by a DSS suggesting places to drill. For certain treatments, the DSS does not consider costs, but only rewards, which introduces a bias that is observable by users. Between subjects, we vary the accuracy and bias of the DSS, and observe the participants' total score, time to completion, and the extent to which they follow or ignore suggestions. We also measure the acceptability of the DSS in an exit survey. Our results show that participants tend to score better with the DSS, that the score increase is due to users following the DSS advice, and related to the difficulty of the game and the accuracy of the DSS. We observe that this setting elicits mostly rational behavior from participants, who place a moderate amount of trust in the DSS and show neither algorithmic aversion (under-reliance) nor automation bias (over-reliance). However, their stated willingness to accept the DSS in the exit survey seems less sensitive to the accuracy of the DSS than their behavior, suggesting that users are only partially aware of the (lack of) accuracy of the DSS.

Submitted 24 March, 2022; originally announced March 2022.

arXiv:2202.00640 [pdf, other]

doi 10.1145/3485447.3512143

Rewiring What-to-Watch-Next Recommendations to Reduce Radicalization Pathways

Authors: Francesco Fabbri, Yanhao Wang, Francesco Bonchi, Carlos Castillo, Michael Mathioudakis

Abstract: Recommender systems typically suggest to users content similar to what they consumed in the past. If a user happens to be exposed to strongly polarized content, she might subsequently receive recommendations which may steer her towards more and more radicalized content, eventually being trapped in what we call a "radicalization pathway". In this paper, we study the problem of mitigating radicalization pathways using a graph-based approach. Specifically, we model the set of recommendations of a "what-to-watch-next" recommender as a d-regular directed graph where nodes correspond to content items, links to recommendations, and paths to possible user sessions. We measure the "segregation" score of a node representing radicalized content as the expected length of a random walk from that node to any node representing non-radicalized content. High segregation scores are associated to larger chances to get users trapped in radicalization pathways. Hence, we define the problem of reducing the prevalence of radicalization pathways by selecting a small number of edges to "rewire", so to minimize the maximum of segregation scores among all radicalized nodes, while maintaining the relevance of the recommendations. We prove that the problem of finding the optimal set of recommendations to rewire is NP-hard and NP-hard to approximate within any factor. Therefore, we turn our attention to heuristics, and propose an efficient yet effective greedy algorithm based on the absorbing random walk theory. Our experiments on real-world datasets in the context of video and news recommendations confirm the effectiveness of our proposal.

Submitted 1 February, 2022; originally announced February 2022.

Comments: To appear in the Web conference 2022 (WWW '22)

arXiv:2201.11080 [pdf, other]

doi 10.1007/s10506-024-09393-y

A Comparative User Study of Human Predictions in Algorithm-Supported Recidivism Risk Assessment

Authors: Manuel Portela, Carlos Castillo, Songül Tolan, Marzieh Karimi-Haghighi, Antonio Andres Pueyo

Abstract: In this paper, we study the effects of using an algorithm-based risk assessment instrument to support the prediction of risk of criminal recidivism. The instrument we use in our experiments is a machine learning version of RiskEval (name changed for double-blind review), which is the main risk assessment instrument used by the Justice Department of Country (omitted for double-blind review). The task is to predict whether a person who has been released from prison will commit a new crime, leading to re-incarceration, within the next two years. We measure, among other variables, the accuracy of human predictions with and without algorithmic support. This user study is done with (1) general participants from diverse backgrounds recruited through a crowdsourcing platform, (2) targeted participants who are students and practitioners of data science, criminology, or social work and professionals who work with RiskEval. Among other findings, we observe that algorithmic support systematically leads to more accurate predictions from all participants, but that statistically significant gains are only seen in the performance of targeted participants with respect to that of crowdsourced participants. We also run focus groups with participants of the targeted study to interpret the quantitative results, including people who use RiskEval in a professional capacity. Among other comments, professional participants indicate that they would not foresee using a fully-automated system in criminal risk assessment, but do consider it valuable for training, standardization, and to fine-tune or double-check their predictions on particularly difficult cases.

Submitted 27 January, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

arXiv:2201.10249 [pdf, ps, other]

Diversity in the Music Listening Experience: Insights from Focus Group Interviews

Authors: Lorenzo Porcaro, Emilia Gómez, Carlos Castillo

Abstract: Music listening in today's digital spaces is highly characterized by the availability of huge music catalogues, accessible by people all over the world. In this scenario, recommender systems are designed to guide listeners in finding tracks and artists that best fit their requests, having therefore the power to influence the diversity of the music they listen to. Albeit several works have proposed new techniques for developing diversity-aware recommendations, little is known about how people perceive diversity while interacting with music recommendations. In this study, we interview several listeners about the role that diversity plays in their listening experience, trying to get a better understanding of how they interact with music recommendations. We recruit the listeners among the participants of a previous quantitative study, where they were confronted with the notion of diversity when asked to identify, from a series of electronic music lists, the most diverse ones according to their beliefs. As a follow-up, in this qualitative study we carry out semi-structured interviews to understand how listeners may assess the diversity of a music list and to investigate their experiences with music recommendation diversity. We report here our main findings on 1) what can influence the diversity assessment of tracks and artists' music lists, and 2) which factors can characterize listeners' interaction with music recommendation diversity.

Submitted 25 January, 2022; originally announced January 2022.

arXiv:2112.09786 [pdf, other]

Distill and De-bias: Mitigating Bias in Face Verification using Knowledge Distillation

Authors: Prithviraj Dhar, Joshua Gleason, Aniket Roy, Carlos D. Castillo, P. Jonathon Phillips, Rama Chellappa

Abstract: Face recognition networks generally demonstrate bias with respect to sensitive attributes like gender, skintone etc. For gender and skintone, we observe that the regions of the face that a network attends to vary by the category of an attribute. This might contribute to bias. Building on this intuition, we propose a novel distillation-based approach called Distill and De-bias (D&D) to enforce a network to attend to similar face regions, irrespective of the attribute category. In D&D, we train a teacher network on images from one category of an attribute; e.g. light skintone. Then distilling information from the teacher, we train a student network on images of the remaining category; e.g., dark skintone. A feature-level distillation loss constrains the student network to generate teacher-like representations. This allows the student network to attend to similar face regions for all attribute categories and enables it to reduce bias. We also propose a second distillation step on top of D&D, called D&D++. Here, we distill the 'un-biasedness' of the D&D network into a new student network, the D&D++ network, while training this new network on all attribute categories; e.g., both light and dark skintones. This helps us train a network that is less biased for an attribute, while obtaining higher face verification performance than D&D. We show that D&D++ outperforms existing baselines in reducing gender and skintone bias on the IJB-C dataset, while obtaining higher face verification performance than existing adversarial de-biasing methods. We evaluate the effectiveness of our proposed methods on two state-of-the-art face recognition networks: ArcFace and Crystalface.

Submitted 16 April, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

arXiv:2112.08237 [pdf, other]

Exposure Inequality in People Recommender Systems: The Long-Term Effects

Authors: Francesco Fabbri, Maria Luisa Croci, Francesco Bonchi, Carlos Castillo

Abstract: People recommender systems may affect the exposure that users receive in social networking platforms, influencing attention dynamics and potentially strengthening pre-existing inequalities that disproportionately affect certain groups. In this paper we introduce a model to simulate the feedback loop created by multiple rounds of interactions between users and a link recommender in a social network. This allows us to study the long-term consequences of those particular recommendation algorithms. Our model is equipped with several parameters to control (i) the level of homophily in the network, (ii) the relative size of the groups, (iii) the choice among several state-of-the-art link recommenders, and (iv) the choice among three different user behavior models, that decide which recommendations are accepted or rejected. Our extensive experimentation with the proposed model shows that a minority group, if homophilic enough, can get a disproportionate advantage in exposure from all link recommenders. Instead, when it is heterophilic, it gets under-exposed. Moreover, while the homophily level of the minority affects the speed of the growth of the disparate exposure, the relative size of the minority affects the magnitude of the effect. Finally, link recommenders strengthen exposure inequalities at the individual level, exacerbating the "rich-get-richer" effect: this happens for both the minority and the majority class and independently of their level of homophily.

Submitted 15 December, 2021; originally announced December 2021.

Comments: To appear in ICWSM 2022

arXiv:2110.13090 [pdf, other]

doi 10.1145/3459637.3482475

SciClops: Detecting and Contextualizing Scientific Claims for Assisting Manual Fact-Checking

Authors: Panayiotis Smeros, Carlos Castillo, Karl Aberer

Abstract: This paper describes SciClops, a method to help combat online scientific misinformation. Although automated fact-checking methods have gained significant attention recently, they require pre-existing ground-truth evidence, which, in the scientific context, is sparse and scattered across a constantly-evolving scientific literature. Existing methods do not exploit this literature, which can effectively contextualize and combat science-related fallacies. Furthermore, these methods rarely require human intervention, which is essential for the convoluted and critical domain of scientific misinformation. SciClops involves three main steps to process scientific claims found in online news articles and social media postings: extraction, clustering, and contextualization. First, the extraction of scientific claims takes place using a domain-specific, fine-tuned transformer model. Second, similar claims extracted from heterogeneous sources are clustered together with related scientific literature using a method that exploits their content and the connections among them. Third, check-worthy claims, broadcasted by popular yet unreliable sources, are highlighted together with an enhanced fact-checking context that includes related verified claims, news articles, and scientific papers. Extensive experiments show that SciClops tackles sufficiently these three steps, and effectively assists non-expert fact-checkers in the verification of complex scientific claims, outperforming commercial fact-checking systems.

Submitted 25 October, 2021; originally announced October 2021.

Comments: Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM '21). November 1-5, 2021. QLD, Australia

ACM Class: H.3.1; I.2.7

arXiv:2108.09558 [pdf, other]

A Synthesis-Based Approach for Thermal-to-Visible Face Verification

Authors: Neehar Peri, Joshua Gleason, Carlos D. Castillo, Thirimachos Bourlai, Vishal M. Patel, Rama Chellappa

Abstract: In recent years, visible-spectrum face verification systems have been shown to match the performance of experienced forensic examiners. However, such systems are ineffective in low-light and nighttime conditions. Thermal face imagery, which captures body heat emissions, effectively augments the visible spectrum, capturing discriminative facial features in scenes with limited illumination. Due to the increased cost and difficulty of obtaining diverse, paired thermal and visible spectrum datasets, not many algorithms and large-scale benchmarks for low-light recognition are available. This paper presents an algorithm that achieves state-of-the-art performance on both the ARL-VTF and TUFTS multi-spectral face datasets. Importantly, we study the impact of face alignment, pixel-level correspondence, and identity classification with label smoothing for multi-spectral face synthesis and verification. We show that our proposed method is widely applicable, robust, and highly effective. In addition, we show that the proposed method significantly outperforms face frontalization methods on profile-to-frontal verification. Finally, we present MILAB-VTF(B), a challenging multi-spectral face dataset that is composed of paired thermal and visible videos. To the best of our knowledge, with face data from 400 subjects, this dataset represents the most extensive collection of indoor and long-range outdoor thermal-visible face imagery. Lastly, we show that our end-to-end thermal-to-visible face verification system provides strong performance on the MILAB-VTF(B) dataset.

Submitted 6 November, 2022; v1 submitted 21 August, 2021; originally announced August 2021.

Comments: This work has been accepted to the IEEE International Conference on Automatic Face and Gesture Recognition (FG) 2021

arXiv:2108.03764 [pdf, other]

PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition

Authors: Prithviraj Dhar, Joshua Gleason, Aniket Roy, Carlos D. Castillo, Rama Chellappa

Abstract: Face recognition networks encode information about sensitive attributes while being trained for identity classification. Such encoding has two major issues: (a) it makes the face representations susceptible to privacy leakage (b) it appears to contribute to bias in face recognition. However, existing bias mitigation approaches generally require end-to-end training and are unable to achieve high verification accuracy. Therefore, we present a descriptor-based adversarial de-biasing approach called 'Protected Attribute Suppression System (PASS)'. PASS can be trained on top of descriptors obtained from any previously trained high-performing network to classify identities and simultaneously reduce encoding of sensitive attributes. This eliminates the need for end-to-end training. As a component of PASS, we present a novel discriminator training strategy that discourages a network from encoding protected attribute information. We show the efficacy of PASS to reduce gender and skintone information in descriptors from SOTA face recognition networks like Arcface. As a result, PASS descriptors outperform existing baselines in reducing gender and skintone bias on the IJB-C dataset, while maintaining a high verification accuracy.

Submitted 8 August, 2021; originally announced August 2021.

Comments: Accepted to ICCV 2021

arXiv:2103.09068 [pdf]

Predicting Early Dropout: Calibration and Algorithmic Fairness Considerations

Authors: Marzieh Karimi-Haghighi, Carlos Castillo, Davinia Hernandez-Leo, Veronica Moreno Oliver

Abstract: In this work, the problem of predicting dropout risk in undergraduate studies is addressed from a perspective of algorithmic fairness. We develop a machine learning method to predict the risks of university dropout and underperformance. The objective is to understand if such a system can identify students at risk while avoiding potential discriminatory biases. When modeling both risks, we obtain prediction models with an Area Under the ROC Curve (AUC) of 0.77-0.78 based on the data available at the enrollment time, before the first year of studies starts. This data includes the students' demographics, the high school they attended, and their admission (average) grade. Our models are calibrated: they produce estimated probabilities for each risk, not mere scores. We analyze if this method leads to discriminatory outcomes for some sensitive groups in terms of prediction accuracy (AUC) and error rates (Generalized False Positive Rate, GFPR, or Generalized False Negative Rate, GFNR). The models exhibit some equity in terms of AUC and GFNR along groups. The similar GFNR means a similar probability of failing to detect risk for students who drop out. The disparities in GFPR are addressed through a mitigation process that does not affect the calibration of the model.

Submitted 16 March, 2021; originally announced March 2021.

Comments: 10 pages, Companion Proceedings 11th International Conference on Learning Analytics & Knowledge (LAK21)

arXiv:2101.11916 [pdf, other]

doi 10.1145/3512956

Perceptions of Diversity in Electronic Music: the Impact of Listener, Artist, and Track Characteristics

Authors: Lorenzo Porcaro, Emilia Gómez, Carlos Castillo

Abstract: Shared practices to assess the diversity of retrieval system results are still debated in the Information Retrieval community, partly because of the challenges of determining what diversity means in specific scenarios, and of understanding how diversity is perceived by end-users. The field of Music Information Retrieval is not exempt from this issue. Even if fields such as Musicology or Sociology of Music have a long tradition in questioning the representation and the impact of diversity in cultural environments, such knowledge has not been yet embedded into the design and development of music technologies. In this paper, focusing on electronic music, we investigate the characteristics of listeners, artists, and tracks that are influential in the perception of diversity. Specifically, we center our attention on 1) understanding the relationship between perceived diversity and computational methods to measure diversity, and 2) analyzing how listeners' domain knowledge and familiarity influence such perceived diversity. To accomplish this, we design a user-study in which listeners are asked to compare pairs of lists of tracks and artists, and to select the most diverse list from each pair. We compare participants' ratings with results obtained through computational models built using audio tracks' features and artist attributes. We find that such models are generally aligned with participants' choices when most of them agree that one list is more diverse than the other, while they present a mixed behaviour in cases where participants have little agreement. Moreover, we observe how differences in domain knowledge, familiarity, and demographics can influence the level of agreement among listeners, and between listeners and diversity metrics computed automatically.

Submitted 26 November, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

arXiv:2012.12795 [pdf, other]

A Note on the Significance Adjustment for FA*IR with Two Protected Groups

Authors: Meike Zehlike, Tom Sühr, Carlos Castillo

Abstract: In this report we provide an improvement of the significance adjustment from the FA*IR algorithm of Zehlike et al., which did not work for very short rankings in combination with a low minimum proportion $p$ for the protected group. We show how the minimum number of protected candidates per ranking position can be calculated exactly and provide a mapping from the continuous space of significance levels ($α$) to a discrete space of tables, which allows us to find $α_c$ using a binary search heuristic.

Submitted 23 December, 2020; originally announced December 2020.

arXiv:2012.05852 [pdf, other]

Social Media Alerts can Improve, but not Replace Hydrological Models for Forecasting Floods

Authors: Valerio Lorini, Carlos Castillo, Domenico Nappo, Francesco Dottori, Peter Salamon

Abstract: Social media can be used for disaster risk reduction as a complement to traditional information sources, and the literature has suggested numerous ways to achieve this. In the case of floods, for instance, data collection from social media can be triggered by a severe weather forecast and/or a flood prediction. By way of contrast, in this paper we explore the possibility of having an entirely independent flood monitoring system which is based completely on social media, and which is completely self-activated. This independence and self-activation would bring increased robustness, as the system would not depend on other mechanisms for forecasting. We observe that social media can indeed help in the early detection of some flood events that would otherwise not be detected until later, albeit at the cost of many false positives. Overall, our experiments suggest that social media signals should only be used to complement existing monitoring systems, and we provide various explanations to support this argument.

Submitted 10 December, 2020; originally announced December 2020.

arXiv:2009.01715 [pdf, other]

Exploring Artist Gender Bias in Music Recommendation

Authors: Dougal Shakespeare, Lorenzo Porcaro, Emilia Gómez, Carlos Castillo

Abstract: Music Recommender Systems (mRS) are designed to give personalised and meaningful recommendations of items (i.e. songs, playlists or artists) to a user base, thereby reflecting and further complementing individual users' specific music preferences. Whilst accuracy metrics have been widely applied to evaluate recommendations in mRS literature, evaluating a user's item utility from other impact-oriented perspectives, including their potential for discrimination, is still a novel evaluation practice in the music domain. In this work, we center our attention on a specific phenomenon for which we want to estimate if mRS may exacerbate its impact: gender bias. Our work presents an exploratory study, analyzing the extent to which commonly deployed state-of-the-art Collaborative Filtering (CF) algorithms may act to further increase or decrease artist gender bias. To assess group biases introduced by CF, we deploy a recently proposed metric of bias disparity on two listening event datasets: the LFM-1b dataset, and the earlier constructed Celma's dataset. Our work traces the causes of disparity to variations in input gender distributions and user-item preferences, highlighting the effect such configurations can have on users' gender bias after recommendation generation.

Submitted 6 October, 2020; v1 submitted 3 September, 2020; originally announced September 2020.

Comments: Presented at the 2nd Workshop on the Impact of Recommender Systems (ImpactRS), at the 14th ACM Conference on Recommender Systems (RecSys 2020)

arXiv:2008.12039 [pdf, other]

doi 10.14778/3415478.3415521

SciLens News Platform: A System for Real-Time Evaluation of News Articles

Authors: Angelika Romanou, Panayiotis Smeros, Carlos Castillo, Karl Aberer

Abstract: We demonstrate the SciLens News Platform, a novel system for evaluating the quality of news articles. The SciLens News Platform automatically collects contextual information about news articles in real-time and provides quality indicators about their validity and trustworthiness. These quality indicators derive from i) social media discussions regarding news articles, showcasing the reach and stance towards these articles, and ii) their content and their referenced sources, showcasing the journalistic foundations of these articles. Furthermore, the platform enables domain-experts to review articles and rate the quality of news sources. This augmented view of news articles, which combines automatically extracted indicators and domain-expert reviews, has provably helped the platform users to have a better consensus about the quality of the underlying articles. The platform is built in a distributed and robust fashion and runs operationally, handling thousands of news articles daily. We evaluate the SciLens News Platform on the emerging topic of COVID-19 where we highlight the discrepancies between low and high-quality news outlets based on three axes, namely their newsroom activity, evidence seeking and social engagement. A live demonstration of the platform can be found here: http://scilens.epfl.ch.

Submitted 27 August, 2020; originally announced August 2020.

Comments: Conference demo paper, 4 pages, 5 figures

Journal ref: Proceedings of the 46th International Conference on Very Large Data Bases, Tokyo, Japan, Aug 31-Sept 4, 2020

arXiv:2007.14775 [pdf, other]

Intersectional Affirmative Action Policies for Top-k Candidates Selection

Authors: Giorgio Barnabo', Carlos Castillo, Michael Mathioudakis, Sergio Celis

Abstract: We study the problem of selecting the top-k candidates from a pool of applicants, where each candidate is associated with a score indicating his/her aptitude. Depending on the specific scenario, such as job search or college admissions, these scores may be the results of standardized tests or other predictors of future performance and utility. We consider a situation in which some groups of candidates experience historical and present disadvantage that makes their chances of being accepted much lower than other groups. In these circumstances, we wish to apply an affirmative action policy to reduce acceptance rate disparities, while avoiding any large decrease in the aptitude of the candidates that are eventually selected. Our algorithmic design is motivated by the frequently observed phenomenon that discrimination disproportionately affects individuals who simultaneously belong to multiple disadvantaged groups, defined along intersecting dimensions such as gender, race, sexual orientation, socio-economic status, and disability. In short, our algorithm's objective is to simultaneously: select candidates with high utility, and level up the representation of disadvantaged intersectional classes. This naturally involves trade-offs and is computationally challenging due to the combinatorial explosion of potential subgroups as more attributes are considered. We propose two algorithms to solve this problem, analyze them, and evaluate them experimentally using a dataset of university application scores and admissions to bachelor degrees in an OECD country.
Our conclusion is that it is possible to significantly reduce disparities in admission rates affecting intersectional classes with a small loss in terms of selected candidate aptitude. To the best of our knowledge, we are the first to study fairness constraints with regards to intersectional classes in the context of top-k selection.

Submitted 5 March, 2021; v1 submitted 29 July, 2020; originally announced July 2020.

arXiv:2007.03177 [pdf, other]

doi 10.1016/j.ijhcs.2022.102772

Modeling and mitigating human annotation errors to design efficient stream processing systems with human-in-the-loop machine learning

Authors: Rahul Pandey, Hemant Purohit, Carlos Castillo, Valerie L. Shalin

Abstract: High-quality human annotations are necessary for creating effective machine learning-driven stream processing systems. We study hybrid stream processing systems based on a Human-In-The-Loop Machine Learning (HITL-ML) paradigm, in which one or many human annotators and an automatic classifier (trained at least partially by the human annotators) label an incoming stream of instances. This is typical of many near-real-time social media analytics and web applications, including annotating social media posts during emergencies by digital volunteer groups. From a practical perspective, low-quality human annotations result in wrong labels for retraining automated classifiers and indirectly contribute to the creation of inaccurate classifiers. Considering human annotation as a psychological process allows us to address these limitations. We show that human annotation quality is dependent on the ordering of instances shown to annotators and can be improved by local changes in the instance sequence/order provided to the annotators, yielding a more accurate annotation of the stream. We adapt a theoretically-motivated human error framework of mistakes and slips for the human annotation task to study the effect of ordering instances (i.e., an "annotation schedule"). Further, we propose an error-avoidance approach to the active learning paradigm for stream processing applications robust to these likely human errors (in the form of slips) when deciding a human annotation schedule.
We support the human error framework using crowdsourcing experiments and evaluate the proposed algorithm against standard baselines for active learning via extensive experimentation on classification tasks of filtering relevant social media posts during natural disasters.

Submitted 18 January, 2022; v1 submitted 6 July, 2020; originally announced July 2020.

Comments: Accepted at International Journal of Human-Computer Studies on January 4th, 2022

Journal ref: IJHCS 160 (2022) 102772

arXiv:2007.01202 [pdf, other]

Towards Data-Driven Affirmative Action Policies under Uncertainty

Authors: Corinna Hertweck, Carlos Castillo, Michael Mathioudakis

Abstract: In this paper, we study university admissions under a centralized system that uses grades and standardized test scores to match applicants to university programs. We consider affirmative action policies that seek to increase the number of admitted applicants from underrepresented groups. Since such a policy has to be announced before the start of the application period, there is uncertainty about the score distribution of the students applying to each program. This poses a difficult challenge for policy-makers. We explore the possibility of using a predictive model trained on historical data to help optimize the parameters of such policies.

Submitted 2 July, 2020; originally announced July 2020.

Comments: 4 pages

arXiv:2006.07845 [pdf, other]

Towards Gender-Neutral Face Descriptors for Mitigating Bias in Face Recognition

Authors: Prithviraj Dhar, Joshua Gleason, Hossein Souri, Carlos D. Castillo, Rama Chellappa

Abstract: State-of-the-art deep networks implicitly encode gender information while being trained for face recognition. Gender is often viewed as an important attribute with respect to identifying faces. However, the implicit encoding of gender information in face descriptors has two major issues: (a.) It makes the descriptors susceptible to privacy leakage, i.e. a malicious agent can be trained to predict the face gender from such descriptors. (b.) It appears to contribute to gender bias in face recognition, i.e. we find a significant difference in the recognition accuracy of DCNNs on male and female faces. Therefore, we present a novel 'Adversarial Gender De-biasing algorithm (AGENDA)' to reduce the gender information present in face descriptors obtained from previously trained face recognition networks. We show that AGENDA significantly reduces gender predictability of face descriptors. Consequently, we are also able to reduce gender bias in face verification while maintaining reasonable recognition performance.

Submitted 17 September, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: Under submission

arXiv:2004.07401 [pdf, other]

Poisoning Attacks on Algorithmic Fairness

Authors: David Solans, Battista Biggio, Carlos Castillo

Abstract: Research in adversarial machine learning has shown how the performance of machine learning models can be seriously compromised by injecting even a small fraction of poisoning points into the training data. While the effects on model accuracy of such poisoning attacks have been widely studied, their potential effects on other model performance metrics remain to be evaluated. In this work, we introduce an optimization framework for poisoning attacks against algorithmic fairness, and develop a gradient-based poisoning attack aimed at introducing classification disparities among different groups in the data. We empirically show that our attack is effective not only in the white-box setting, in which the attacker has full access to the target model, but also in a more challenging black-box scenario in which the attacks are optimized against a substitute model and then transferred to the target model. We believe that our findings pave the way towards the definition of an entirely novel set of adversarial attacks targeting algorithmic fairness in different scenarios, and that investigating such vulnerabilities will help design more robust algorithms and countermeasures in the future.

Submitted 26 June, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

arXiv:2003.04794 [pdf, other]

Addressing multiple metrics of group fairness in data-driven decision making

Authors: Marius Miron, Songül Tolan, Emilia Gómez, Carlos Castillo

Abstract: The Fairness, Accountability, and Transparency in Machine Learning (FAT-ML) literature proposes a varied set of group fairness metrics to measure discrimination against socio-demographic groups that are characterized by a protected feature, such as gender or race. Such a system can be deemed as either fair or unfair depending on the choice of the metric. Several metrics have been proposed, some of them incompatible with each other. We do so empirically, by observing that several of these metrics cluster together in two or three main clusters for the same groups and machine learning methods. In addition, we propose a robust way to visualize multidimensional fairness in two dimensions through a Principal Component Analysis (PCA) of the group fairness metrics. Experimental results on multiple datasets show that the PCA decomposition explains the variance between the metrics with one to three components.

Submitted 10 March, 2020; originally announced March 2020.

arXiv:2002.07618 [pdf, ps, other]

doi 10.1145/3219819.3220056

Algorithms for Hiring and Outsourcing in the Online Labor Market

Authors: Aris Anagnostopoulos, Carlos Castillo, Adriano Fazzone, Stefano Leonardi, Evimaria Terzi

Abstract: Although freelancing work has grown substantially in recent years, in part facilitated by a number of online labor marketplaces (e.g., Guru, Freelancer, Amazon Mechanical Turk), traditional forms of "in-sourcing" work continue being the dominant form of employment. This means that, at least for the time being, freelancing and salaried employment will continue to co-exist. In this paper, we provide algorithms for outsourcing and hiring workers in a general setting, where workers form a team and contribute different skills to perform a task. We call this model team formation with outsourcing. In our model, tasks arrive in an online fashion: neither the number nor the composition of the tasks is known a priori. At any point in time, there is a team of hired workers who receive a fixed salary independently of the work they perform. This team is dynamic: new members can be hired and existing members can be fired, at some cost. Additionally, some parts of the arriving tasks can be outsourced and thus completed by non-team members, at a premium. Our contribution is an efficient online cost-minimizing algorithm for hiring and firing team members and outsourcing tasks. We present theoretical bounds obtained using a primal-dual scheme proving that our algorithms have a logarithmic competitive approximation ratio. We complement these results with experiments using semi-synthetic datasets based on actual task requirements and worker skills from three large online labor marketplaces.

Submitted 16 February, 2020; originally announced February 2020.

Comments: Published at 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2018

arXiv:2002.06274 [pdf, other]

Single Unit Status in Deep Convolutional Neural Network Codes for Face Identification: Sparseness Redefined

Authors: Connor J. Parde, Y. Ivette Colón, Matthew Q. Hill, Carlos D. Castillo, Prithviraj Dhar, Alice J. O'Toole

Abstract: Deep convolutional neural networks (DCNNs) trained for face identification develop representations that generalize over variable images, while retaining subject (e.g., gender) and image (e.g., viewpoint) information. Identity, gender, and viewpoint codes were studied at the "neural unit" and ensemble levels of a face-identification network. At the unit level, identification, gender classification, and viewpoint estimation were measured by deleting units to create variably-sized, randomly-sampled subspaces at the top network layer. Identification of 3,531 identities remained high (area under the ROC approximately 1.0) as dimensionality decreased from 512 units to 16 (0.95), 4 (0.80), and 2 (0.72) units. Individual identities separated statistically on every top-layer unit. Cross-unit responses were minimally correlated, indicating that units code non-redundant identity cues. This "distributed" code requires only a sparse, random sample of units to identify faces accurately. Gender classification declined gradually and viewpoint estimation fell steeply as dimensionality decreased. Individual units were weakly predictive of gender and viewpoint, but ensembles proved effective predictors. Therefore, distributed and sparse codes co-exist in the network units to represent different face attributes. At the ensemble level, principal component analysis of face representations showed that identity, gender, and viewpoint information separated into high-dimensional subspaces, ordered by explained variance.
Identity, gender, and viewpoint information contributed to all individual unit responses, undercutting a neural tuning analogy for face attributes. Interpretation of neural-like codes from DCNNs, and by analogy, high-level visual codes, cannot be inferred from single unit responses. Instead, "meaning" is encoded by directions in the high-dimensional space.

Submitted 1 March, 2020; v1 submitted 14 February, 2020; originally announced February 2020.

arXiv:2001.08810 [pdf, other]

Uneven Coverage of Natural Disasters in Wikipedia: the Case of Flood

Authors: Valerio Lorini, Javier Rando, Diego Saez-Trumper, Carlos Castillo

Abstract: The usage of non-authoritative data for disaster management presents the opportunity of accessing timely information that might not be available through other means, as well as the challenge of dealing with several layers of biases. Wikipedia, a collaboratively-produced encyclopedia, includes in-depth information about many natural and human-made disasters, and its editors are particularly good at adding information in real-time as a crisis unfolds. In this study, we focus on the English version of Wikipedia, which is by far the most comprehensive version of this encyclopedia. Wikipedia tends to have good coverage of disasters, particularly those having a large number of fatalities. However, we also show that a tendency to cover events in wealthy countries and not cover events in poorer ones permeates Wikipedia as a source for disaster-related information. By performing careful automatic content analysis at a large scale, we show how the coverage of floods in Wikipedia is skewed towards rich, English-speaking countries, in particular the US and Canada. We also note how coverage of floods in countries with the lowest income, as well as countries in South America, is substantially lower than the coverage of floods in middle-income countries. These results have implications for systems using Wikipedia or similar collaborative media platforms as an information source for detecting emergencies or for gathering valuable information for disaster response.

Submitted 23 January, 2020; originally announced January 2020.

Comments: 17 pages, submitted to ISCRAM 2020 conference

arXiv:1912.07398 [pdf, other]

Accuracy comparison across face recognition algorithms: Where are we on measuring race bias?

Authors: Jacqueline G. Cavazos, P. Jonathon Phillips, Carlos D. Castillo, Alice J. O'Toole

Abstract: Previous generations of face recognition algorithms differ in accuracy for images of different races (race bias). Here, we present the possible underlying factors (data-driven and scenario modeling) and methodological considerations for assessing race bias in algorithms. We discuss data-driven factors (e.g., image quality, image population statistics, and algorithm architecture), and scenario modeling factors that consider the role of the "user" of the algorithm (e.g., threshold decisions and demographic constraints). To illustrate how these issues apply, we present data from four face recognition algorithms (a previous-generation algorithm and three deep convolutional neural networks, DCNNs) for East Asian and Caucasian faces. First, dataset difficulty affected both overall recognition accuracy and race bias, such that race bias increased with item difficulty. Second, for all four algorithms, the degree of bias varied depending on the identification decision threshold. To achieve equal false accept rates (FARs), East Asian faces required higher identification thresholds than Caucasian faces, for all algorithms. Third, demographic constraints on the formulation of the distributions used in the test impacted estimates of algorithm accuracy. We conclude that race bias needs to be measured for individual applications and we provide a checklist for measuring this bias in face recognition algorithms.

Submitted 4 June, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

arXiv:1912.02484 [pdf, ps, other]

EviDense: a Graph-based Method for Finding Unique High-impact Events with Succinct Keyword-based Descriptions

Authors: Oana Balalau, Carlos Castillo, Mauro Sozio

Abstract: Despite the significant efforts made by the research community in recent years, automatically acquiring valuable information about high-impact events from social media remains challenging. We present EviDense, a graph-based approach for finding high-impact events (such as disaster events) in social media. One of the challenges we address in our work is to provide for each event a succinct keyword-based description, containing the most relevant information about it, such as what happened, the location, as well as its timeframe. We evaluate our approach on a large collection of tweets posted over a period of 19 months, using a crowdsourcing platform. Our evaluation shows that our method outperforms state-of-the-art approaches for the same problem, in terms of having higher precision, lower number of duplicates, and presenting a keyword-based description that is succinct and informative. We further improve the results of our algorithm by incorporating news from mainstream media. A preliminary version of this work was presented as a 4-page short paper at ICWSM 2018.

Submitted 5 December, 2019; originally announced December 2019.

Comments: 20 pages

arXiv:1910.12591 [pdf, other]

Conflict and Cooperation: AI Research and Development in terms of the Economy of Conventions

Authors: David Solans, Christopher Tauchmann, Aideen Farrell, Karolin Kappler, Hans-Hendrik Huber, Carlos Castillo, Kristian Kersting

Abstract: Artificial Intelligence (AI) and its relation with societies is increasingly becoming an interesting object of study from the perspective of sociology and other disciplines. Theories such as the Economy of Conventions (EC) are usually applied in the context of interpersonal relations but there is still a clear lack of studies around how this and other theories can shed light on interactions between human and autonomous systems. This work is focused on studying a preliminary step that is a key enabler for the subsequent interaction between machines and humans: how the processes of researching, designing and developing AI related systems reflect different moral registers, represented by conventions within the EC. Having a better understanding of those conventions guiding the advances in AI is considered as the first and required advance to understand the conventions afterwards reflected by those autonomous systems in the interactions with societies. For this purpose, we develop an iterative tool based on active learning to label a data set from the field of AI and Machine Learning (ML) research and present preliminary results of a supervised classifier trained on these conventions. To further demonstrate the feasibility of the approach, the results are contrasted with a classifier trained on software conventions.

Submitted 1 September, 2020; v1 submitted 17 October, 2019; originally announced October 2019.

Comments: Accepted at ICWSM 2021

arXiv:1910.05657 [pdf, other]

How are attributes expressed in face DCNNs?

Authors: Prithviraj Dhar, Ankan Bansal, Carlos D. Castillo, Joshua Gleason, P. Jonathon Phillips, Rama Chellappa

Abstract: As deep networks become increasingly accurate at recognizing faces, it is vital to understand how these networks process faces. While these networks are solely trained to recognize identities, they also contain face related information such as sex, age, and pose of the face. The networks are not trained to learn these attributes. We introduce expressivity as a measure of how much a feature vector informs us about an attribute, where a feature vector can be from internal or final layers of a network. Expressivity is computed by a second neural network whose inputs are features and attributes. The output of the second neural network approximates the mutual information between feature vectors and an attribute. We investigate the expressivity for two different deep convolutional neural network (DCNN) architectures: a Resnet-101 and an Inception Resnet v2. In the final fully connected layer of the networks, we found the order of expressivity for facial attributes to be Age > Sex > Yaw. Additionally, we studied the changes in the encoding of facial attributes over training iterations. We found that as training progresses, expressivities of yaw, sex, and age decrease. Our technique can be a tool for investigating the sources of bias in a network and a step towards explaining the network's identity decisions.

Submitted 12 October, 2019; originally announced October 2019.

arXiv:1908.06520 [pdf, other]

doi 10.1145/3359253

Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate

Authors: Ugur Kursuncu, Manas Gaur, Carlos Castillo, Amanuel Alambo, K. Thirunarayan, Valerie Shalin, Dilshod Achilov, I. Budak Arpinar, Amit Sheth

Abstract: Terror attacks have been linked in part to online extremist content. Although tens of thousands of Islamist extremism supporters consume such content, they are a small fraction relative to peaceful Muslims. The efforts to contain the ever-evolving extremism on social media platforms have remained inadequate and mostly ineffective. Divergent extremist and mainstream contexts challenge machine interpretation, with a particular threat to the precision of classification algorithms. Our context-aware computational approach to the analysis of extremist content on Twitter breaks down this persuasion process into building blocks that acknowledge inherent ambiguity and sparsity that likely challenge both manual and automated classification. We model this process using a combination of three contextual dimensions -- religion, ideology, and hate -- each elucidating a degree of radicalization and highlighting independent features to render them computationally accessible. We utilize domain-specific knowledge resources for each of these contextual dimensions such as the Qur'an for religion, the books of extremist ideologues and preachers for political ideology and a social media hate speech corpus for hate.
Our study makes three contributions to reliable analysis: (i) Development of a computational approach rooted in the contextual dimensions of religion, ideology, and hate that reflects strategies employed by online Islamist extremist groups, (ii) An in-depth analysis of relevant tweet datasets with respect to these dimensions to exclude likely mislabeled users, and (iii) A framework for understanding online radicalization as a process to assist counter-programming. Given the potentially significant social impact, we evaluate the performance of our algorithms to minimize mislabeling, where our approach outperforms a competitive baseline by 10.2% in precision.

Submitted 5 October, 2020; v1 submitted 18 August, 2019; originally announced August 2019.

Comments: 22 pages

Journal ref: Proceedings of the ACM on Human-Computer Interaction. 3 (2019)

arXiv:1907.07228 [pdf, other]

doi 10.1145/3341161.3342931

Modeling Human Annotation Errors to Design Bias-Aware Systems for Social Stream Processing

Authors: Rahul Pandey, Carlos Castillo, Hemant Purohit

Abstract: High-quality human annotations are necessary to create effective machine learning systems for social media. Low-quality human annotations indirectly contribute to the creation of inaccurate or biased learning systems. We show that human annotation quality is dependent on the ordering of instances shown to annotators (referred as 'annotation schedule'), and can be improved by local changes in the instance ordering provided to the annotators, yielding a more accurate annotation of the data stream for efficient real-time social media analytics. We propose an error-mitigating active learning algorithm that is robust with respect to some cases of human errors when deciding an annotation schedule. We validate the human error model and evaluate the proposed algorithm against strong baselines by experimenting on classification tasks of relevant social media posts during crises. According to these experiments, considering the order in which data instances are presented to human annotators leads to both an increase in accuracy for machine learning and awareness toward some potential biases in human learning that may affect the automated classifier.

Submitted 16 July, 2019; originally announced July 2019.

Comments: To appear in International Conference on Advances in Social Networks Analysis and Mining (ASONAM '19), Vancouver, BC, Canada

arXiv:1905.13134 [pdf, other]

doi 10.1145/3366424.3383534

FairSearch: A Tool For Fairness in Ranked Search Results

Authors: Meike Zehlike, Tom Sühr, Carlos Castillo, Ivan Kitanovski

Abstract: Ranked search results and recommendations have become the main mechanism by which we find content, products, places, and people online. With hiring, selecting, purchasing, and dating being increasingly mediated by algorithms, rankings may determine career and business opportunities, educational placement, access to benefits, and even social and reproductive success. It is therefore of societal and ethical importance to ask whether search results can demote, marginalize, or exclude individuals of unprivileged groups or promote products with undesired features. In this paper we present FairSearch, the first fair open source search API to provide fairness notions in ranked search results. We implement two algorithms from the fair ranking literature, namely FA*IR (Zehlike et al., 2017) and DELTR (Zehlike and Castillo, 2018) and provide them as stand-alone libraries in Python and Java. Additionally we implement interfaces to Elasticsearch for both algorithms, that use the aforementioned Java libraries and are then provided as Elasticsearch plugins. Elasticsearch is a well-known search engine API based on Apache Lucene. With our plugins we enable search engine developers who wish to ensure fair search results of different styles to easily integrate DELTR and FA*IR into their existing Elasticsearch environment.

Submitted 23 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

Comments: 4 pages, demo paper

ACM Class: H.3.3

Journal ref: Companion Proceedings of the Web Conference 2020 (WWW '20 Companion), April 20--24, 2020, Taipei, Taiwan

arXiv:1905.09947 [pdf, other]

Affirmative Action Policies for Top-k Candidates Selection, With an Application to the Design of Policies for University Admissions

Authors: Michael Mathioudakis, Carlos Castillo, Giorgio Barnabo, Sergio Celis

Abstract: We consider the problem of designing affirmative action policies for selecting the top-k candidates from a pool of applicants. We assume that for each candidate we have socio-demographic attributes and a series of variables that serve as indicators of future performance (e.g., results on standardized tests). We further assume that we have access to historical data including the actual performance of previously selected candidates. Critically, performance information is only available for candidates who were selected under some previous selection policy. In this work we assume that due to legal requirements or voluntary commitments, an organization wants to increase the presence of people from disadvantaged socio-demographic groups among the selected candidates. Hence, we seek to design an affirmative action or positive action policy. This policy has two concurrent objectives: (i) to select candidates who, given what can be learnt from historical data, are more likely to perform well, and (ii) to select candidates in a way that increases the representation of disadvantaged socio-demographic groups. Our motivating application is the design of university admission policies to bachelor's degrees. We use a causal model as a framework to describe several families of policies (changing component weights, giving bonuses, and enacting quotas), and compare them both theoretically and through extensive experimentation on a large real-world dataset containing thousands of university applicants.
Our paper is the first to place the problem of affirmative-action policy design within the framework of algorithmic fairness. Our empirical results indicate that simple policies could favor the admission of disadvantaged groups without significantly compromising on the quality of accepted candidates.

Submitted 9 March, 2021; v1 submitted 23 May, 2019; originally announced May 2019.

Comments: 10 pages

arXiv:1905.02756 [pdf, other]

Uncertainty Modeling of Contextual-Connections between Tracklets for Unconstrained Video-based Face Recognition

Authors: Jingxiao Zheng, Ruichi Yu, Jun-Cheng Chen, Boyu Lu, Carlos D. Castillo, Rama Chellappa

Abstract: Unconstrained video-based face recognition is a challenging problem due to significant within-video variations caused by pose, occlusion and blur. To tackle this problem, an effective idea is to propagate the identity from high-quality faces to low-quality ones through contextual connections, which are constructed based on context such as body appearance. However, previous methods have often propagated erroneous information due to lack of uncertainty modeling of the noisy contextual connections. In this paper, we propose the Uncertainty-Gated Graph (UGG), which conducts graph-based identity propagation between tracklets, which are represented by nodes in a graph. UGG explicitly models the uncertainty of the contextual connections by adaptively updating the weights of the edge gates according to the identity distributions of the nodes during inference. UGG is a generic graphical model that can be applied at only inference time or with end-to-end training. We demonstrate the effectiveness of UGG with state-of-the-art results in the recently released challenging Cast Search in Movies and IARPA Janus Surveillance Video Benchmark dataset.

Submitted 21 August, 2019; v1 submitted 7 May, 2019; originally announced May 2019.

Comments: To appear in ICCV 2019

arXiv:1904.10876 [pdf, other]

Integrating Social Media into a Pan-European Flood Awareness System: A Multilingual Approach

Authors: V. Lorini, C. Castillo, F. Dottori, M. Kalas, D. Nappo, P. Salamon

Abstract: This paper describes a prototype system that integrates social media analysis into the European Flood Awareness System (EFAS). This integration allows the collection of social media data to be automatically triggered by flood risk warnings determined by a hydro-meteorological model. Then, we adopt a multi-lingual approach to find flood-related messages by employing two state-of-the-art methodologies: language-agnostic word embeddings and language-aligned word embeddings. Both approaches can be used to bootstrap a classifier of social media messages for a new language with little or no labeled data. Finally, we describe a method for selecting relevant and representative messages and displaying them back in the interface of EFAS.

Submitted 24 April, 2019; originally announced April 2019.

Comments: accepted at ISCRAM2019 Conference

Showing 1–50 of 89 results for author: Castillo, C