-
This Looks Better than That: Better Interpretable Models with ProtoPNeXt
Authors:
Frank Willard,
Luke Moffett,
Emmanuel Mokel,
Jon Donnelly,
Stark Guo,
Julia Yang,
Giyoung Kim,
Alina Jade Barnett,
Cynthia Rudin
Abstract:
Prototypical-part models are a popular interpretable alternative to black-box deep learning models for computer vision. However, they are difficult to train, with high sensitivity to hyperparameter tuning, inhibiting their application to new datasets and our understanding of which methods truly improve their performance. To facilitate the careful study of prototypical-part networks (ProtoPNets), we create a new framework for integrating components of prototypical-part models -- ProtoPNeXt. Using ProtoPNeXt, we show that applying Bayesian hyperparameter tuning and an angular prototype similarity metric to the original ProtoPNet is sufficient to produce new state-of-the-art accuracy for prototypical-part models on CUB-200 across multiple backbones. We further deploy this framework to jointly optimize for accuracy and prototype interpretability as measured by metrics included in ProtoPNeXt. Using the same resources, this produces models with substantially superior semantics and changes in accuracy between +1.3% and -1.5%. The code and trained models will be made publicly available upon publication.
Submitted 20 June, 2024;
originally announced June 2024.
-
FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
Authors:
Julia Yang,
Alina Jade Barnett,
Jon Donnelly,
Satvik Kishore,
Jerry Fang,
Fides Regina Schwartz,
Chaofan Chen,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are unsuitable and there is a call in these fields to make interpretable models. Recent work in interpretable computer vision provides transparency to these formerly black boxes by utilizing prototypes for case-based explanations, achieving high accuracy in applications including mammography. However, these models struggle with precise feature localization, reasoning on large portions of an image when only a small part is relevant. This paper addresses this gap by proposing a novel multi-scale interpretable deep learning model for mammographic mass margin classification. Our contribution not only offers an interpretable model with reasoning aligned with radiologist practices, but also provides a general architecture for computer vision with user-configurable prototypes ranging from coarse- to fine-grained.
Submitted 10 June, 2024;
originally announced June 2024.
-
Simulating Policy Impacts: Developing a Generative Scenario Writing Method to Evaluate the Perceived Effects of Regulation
Authors:
Julia Barnett,
Kimon Kieslich,
Nicholas Diakopoulos
Abstract:
The rapid advancement of AI technologies yields numerous future impacts on individuals and society. Policy-makers are therefore tasked to react quickly and establish policies that mitigate those impacts. However, anticipating the effectiveness of policies is a difficult task, as some impacts might only be observable in the future and respective policies might not be applicable to the future development of AI. In this work we develop a method for using large language models (LLMs) to evaluate the efficacy of a given piece of policy at mitigating specified negative impacts. We do so by using GPT-4 to generate scenarios both pre- and post-introduction of policy and translating these vivid stories into metrics based on human perceptions of impacts. We leverage an already established taxonomy of impacts of generative AI in the media environment to generate a set of scenario pairs both mitigated and non-mitigated by the transparency legislation of Article 50 of the EU AI Act. We then run a user study (n=234) to evaluate these scenarios across four risk-assessment dimensions: severity, plausibility, magnitude, and specificity to vulnerable populations. We find that this transparency legislation is perceived to be effective at mitigating harms in areas such as labor and well-being, but largely ineffective in areas such as social cohesion and security. Through this case study on generative AI harms we demonstrate the efficacy of our method as a tool to iterate on the effectiveness of policy on mitigating various negative impacts. We expect this method to be useful to researchers or other stakeholders who want to brainstorm the potential utility of different pieces of policy or other mitigation strategies.
Submitted 15 May, 2024;
originally announced May 2024.
-
Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model
Authors:
Julia Barnett,
Hugo Flores Garcia,
Bryan Pardo
Abstract:
Every artist has a creative process that draws inspiration from previous artists and their works. Today, "inspiration" has been automated by generative music models. The black box nature of these models obscures the identity of the works that influence their creative output. As a result, users may inadvertently appropriate, misuse, or copy existing artists' works. We establish a replicable methodology to systematically identify similar pieces of music audio in a manner that is useful for understanding training data attribution. A key aspect of our approach is to harness an effective music audio similarity measure. We compare the effect of applying CLMR and CLAP embeddings to similarity measurement in a set of 5 million audio clips used to train VampNet, a recent open source generative music model. We validate this approach with a human listening study. We also explore the effect that modifications of an audio example (e.g., pitch shifting, time stretching, background noise) have on similarity measurements. This work is foundational to incorporating automated influence attribution into generative modeling, which promises to let model creators and users move from ignorant appropriation to informed creation. Audio samples that accompany this paper are available at https://tinyurl.com/exploring-musical-roots.
Submitted 25 January, 2024;
originally announced January 2024.
-
ProtoEEGNet: An Interpretable Approach for Detecting Interictal Epileptiform Discharges
Authors:
Dennis Tang,
Frank Willard,
Ronan Tegerdine,
Luke Triplett,
Jon Donnelly,
Luke Moffett,
Lesia Semenova,
Alina Jade Barnett,
** **g,
Cynthia Rudin,
Brandon Westover
Abstract:
In electroencephalogram (EEG) recordings, the presence of interictal epileptiform discharges (IEDs) serves as a critical biomarker for seizures or seizure-like events. Detecting IEDs can be difficult; even highly trained experts disagree on the same sample. As a result, specialists have turned to machine-learning models for assistance. However, many existing models are black boxes and do not provide any human-interpretable reasoning for their decisions. In high-stakes medical applications, it is critical to have interpretable models so that experts can validate the reasoning of the model before making important diagnoses. We introduce ProtoEEGNet, a model that achieves state-of-the-art accuracy for IED detection while additionally providing an interpretable justification for its classifications. Specifically, it can reason that one EEG looks similar to another "prototypical" EEG that is known to contain an IED. ProtoEEGNet can therefore help medical professionals effectively detect IEDs while maintaining a transparent decision-making process.
Submitted 3 December, 2023;
originally announced December 2023.
-
Entropy-Based Strategies for Multi-Bracket Pools
Authors:
Ryan S. Brill,
Abraham J. Wyner,
Ian J. Barnett
Abstract:
Much work in the parimutuel betting literature has discussed estimating event outcome probabilities or developing optimal wagering strategies, particularly for horse race betting. Some betting pools, however, involve betting not just on a single event, but on a tuple of events. For example, pick six betting in horse racing, March Madness bracket challenges, and predicting a randomly drawn bitstring each involve making a series of individual forecasts. Although traditional optimal wagering strategies work well when the size of the tuple is very small (e.g., betting on the winner of a horse race), they are intractable for more general betting pools in higher dimensions (e.g., March Madness bracket challenges). Hence we pose the multi-brackets problem: supposing we wish to predict a tuple of events and that we know the true probabilities of each potential outcome of each event, what is the best way to tractably generate a set of $n$ predicted tuples? The most general version of this problem is extremely difficult, so we begin with a simpler setting. In particular, we generate $n$ independent predicted tuples according to a distribution having optimal entropy. This entropy-based approach is tractable, scalable, and performs well.
Submitted 20 March, 2024; v1 submitted 28 August, 2023;
originally announced August 2023.
-
The Ethical Implications of Generative Audio Models: A Systematic Literature Review
Authors:
Julia Barnett
Abstract:
Generative audio models typically focus their applications in music and speech generation, with recent models having human-like quality in their audio output. This paper conducts a systematic literature review of 884 papers in the area of generative audio models in order to both quantify the degree to which researchers in the field are considering potential negative impacts and identify the types of ethical implications researchers in this area need to consider. Though 65% of generative audio research papers note positive potential impacts of their work, less than 10% discuss any negative impacts. This jarringly small percentage of papers considering negative impact is particularly worrying because the issues brought to light by the few papers doing so are raising serious ethical implications and concerns relevant to the broader field such as the potential for fraud, deep-fakes, and copyright infringement. By quantifying this lack of ethical consideration in generative audio research and identifying key areas of potential harm, this paper lays the groundwork for future work in the field at a critical point in time in order to guide more conscientious research as this field progresses.
Submitted 7 July, 2023;
originally announced July 2023.
-
Neural-Network-Augmented Projection-Based Model Order Reduction for Mitigating the Kolmogorov Barrier to Reducibility of CFD Models
Authors:
Joshua L Barnett,
Charbel Farhat,
Yvon Maday
Abstract:
Inspired by our previous work on mitigating the Kolmogorov barrier using a quadratic approximation manifold, we propose in this paper a computationally tractable approach for combining a projection-based reduced-order model (PROM) and an artificial neural network (ANN) for mitigating the Kolmogorov barrier to reducibility of convection-dominated flow problems. The main objective of the PROM-ANN concept that we propose is to reduce the dimensionality of the online approximation of the solution beyond what is possible using affine and quadratic approximation manifolds. In contrast to previous approaches for constructing arbitrarily nonlinear manifold approximations for nonlinear model reduction that exploited one form or another of ANN, the training of the PROM-ANN we propose in this paper does not involve data whose dimension scales with that of the high-dimensional model; and this PROM-ANN is hyperreducible using any well-established hyperreduction method. Hence, unlike many other ANN-based approaches, the PROM-ANN concept we propose in this paper is practical for large-scale and industry-relevant CFD problems. Its potential is demonstrated here for a parametric, shock-dominated, benchmark problem.
Submitted 17 December, 2022;
originally announced December 2022.
-
Interpretable Machine Learning System to EEG Patterns on the Ictal-Interictal-Injury Continuum
Authors:
Alina Jade Barnett,
Zhicheng Guo,
** **g,
Wendong Ge,
Cynthia Rudin,
M. Brandon Westover
Abstract:
In intensive care units (ICUs), critically ill patients are monitored with electroencephalograms (EEGs) to prevent serious brain injury. The number of patients who can be monitored is constrained by the availability of trained physicians to read EEGs, and EEG interpretation can be subjective and prone to inter-observer variability. Automated deep learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, black box deep learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of trust and adoption by clinicians. To address these challenges, we propose a novel interpretable deep learning model that not only predicts the presence of harmful brainwave patterns but also provides high-quality case-based explanations of its decisions. Our model performs better than the corresponding black box model, despite being constrained to be interpretable. The learned 2D embedded space provides the first global overview of the structure of ictal-interictal-injury continuum brainwave patterns. The ability to understand how our model arrived at its decisions will not only help clinicians to diagnose and treat harmful brain activities more accurately but also increase their trust and adoption of machine learning models in clinical practice; this could be an integral component of the ICU neurologists' standard workflow.
Submitted 11 April, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Crowdsourcing Impacts: Exploring the Utility of Crowds for Anticipating Societal Impacts of Algorithmic Decision Making
Authors:
Julia Barnett,
Nicholas Diakopoulos
Abstract:
With the increasing pervasiveness of algorithms across industry and government, a growing body of work has grappled with how to understand their societal impact and ethical implications. Various methods have been used at different stages of algorithm development to encourage researchers and designers to consider the potential societal impact of their research. An understudied yet promising area in this realm is using participatory foresight to anticipate these different societal impacts. We employ crowdsourcing as a means of participatory foresight to uncover four different types of impact areas based on a set of governmental algorithmic decision making tools: (1) perceived valence, (2) societal domains, (3) specific abstract impact types, and (4) ethical algorithm concerns. Our findings suggest that this method is effective at leveraging the cognitive diversity of the crowd to uncover a range of issues. We further analyze the complexities within the interaction of the impact areas identified to demonstrate how crowdsourcing can illuminate patterns around the connections between impacts. Ultimately this work establishes crowdsourcing as an effective means of anticipating algorithmic impact which complements other approaches towards assessing algorithms in society by leveraging participatory foresight and cognitive diversity.
Submitted 19 July, 2022;
originally announced July 2022.
-
Quadratic Approximation Manifold for Mitigating the Kolmogorov Barrier in Nonlinear Projection-Based Model Order Reduction
Authors:
Joshua Barnett,
Charbel Farhat
Abstract:
A quadratic approximation manifold is presented for performing nonlinear, projection-based, model order reduction (PMOR). It constitutes a departure from the traditional affine subspace approximation that is aimed at mitigating the Kolmogorov barrier for nonlinear PMOR, particularly for convection-dominated transport problems. It builds on the data-driven approach underlying the traditional construction of projection-based reduced-order models (PROMs); is application-independent; is linearization-free; and therefore is robust for highly nonlinear problems. Most importantly, this approximation leads to quadratic PROMs that deliver the same accuracy as their traditional counterparts while using a much smaller dimension -- typically, $n_2 \sim \sqrt{n_1}$, where $n_2$ and $n_1$ denote the dimensions of the quadratic and traditional PROMs, respectively. The computational advantages of the proposed high-order approach to nonlinear PMOR over the traditional approach are highlighted for the detached-eddy simulation-based prediction of the Ahmed body turbulent wake flow, which is a popular CFD benchmark problem in the automotive industry. For a fixed accuracy level, these advantages include: a reduction of the total offline computational cost by a factor greater than five; a reduction of its online wall clock time by a factor greater than 32; and a reduction of the wall clock time of the underlying high-dimensional model by a factor greater than two orders of magnitude.
Submitted 5 April, 2022;
originally announced April 2022.
-
Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity
Authors:
Shiyun Xu,
Zhiqi Bu,
Pratik Chaudhari,
Ian J. Barnett
Abstract:
Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAM) bring interpretability to black-box deep learning and achieve state-of-the-art accuracy among the large family of generalized additive models. In order to empower NAM with feature selection and improve generalization, we propose the sparse neural additive models (SNAM) that employ group sparsity regularization (e.g. Group LASSO), where each feature is learned by a sub-network whose trainable parameters are clustered as a group. We study the theoretical properties of SNAM with novel techniques to tackle the non-parametric truth, thus extending from classical sparse linear models such as the LASSO, which only works on the parametric truth.
Specifically, we show that SNAM with subgradient and proximal gradient descents provably converges to zero training loss as $t\to\infty$, and that the estimation error of SNAM vanishes asymptotically as $n\to\infty$. We also prove that SNAM, similar to LASSO, can have exact support recovery, i.e. perfect feature selection, with appropriate regularization. Moreover, we show that SNAM can generalize well and preserve the 'identifiability', recovering each feature's effect. We validate our theories via extensive experiments and further testify to the good accuracy and efficiency of SNAM.
Submitted 24 February, 2022;
originally announced February 2022.
-
Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes
Authors:
Jon Donnelly,
Alina Jade Barnett,
Chaofan Chen
Abstract:
We present a deformable prototypical part network (Deformable ProtoPNet), an interpretable image classifier that integrates the power of deep learning and the interpretability of case-based reasoning. This model classifies input images by comparing them with prototypes learned during training, yielding explanations in the form of "this looks like that." However, while previous methods use spatially rigid prototypes, we address this shortcoming by proposing spatially flexible prototypes. Each prototype is made up of several prototypical parts that adaptively change their relative spatial positions depending on the input image. Consequently, a Deformable ProtoPNet can explicitly capture pose variations and context, improving both model accuracy and the richness of explanations provided. Compared to other case-based interpretable models using prototypes, our approach achieves state-of-the-art accuracy and gives an explanation with greater context. The code is available at https://github.com/jdonnelly36/Deformable-ProtoPNet.
Submitted 2 May, 2024; v1 submitted 29 November, 2021;
originally announced November 2021.
-
Interpretable Mammographic Image Classification using Case-Based Reasoning and Deep Learning
Authors:
Alina Jade Barnett,
Fides Regina Schwartz,
Chaofan Tao,
Chaofan Chen,
Yinhao Ren,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
When we deploy machine learning models in high-stakes medical settings, we must ensure these models make accurate predictions that are consistent with known medical science. Inherently interpretable networks address this need by explaining the rationale behind each decision while maintaining equal or higher accuracy compared to black-box models. In this work, we present a novel interpretable neural network algorithm that uses case-based reasoning for mammography. Designed to aid a radiologist in their decisions, our network presents both a prediction of malignancy and an explanation of that prediction using known medical features. In order to yield helpful explanations, the network is designed to mimic the reasoning processes of a radiologist: our network first detects the clinically relevant semantic features of each image by comparing each new image with a learned set of prototypical image parts from the training images, then uses those clinical features to predict malignancy. Compared to other methods, our model detects clinical features (mass margins) with equal or higher accuracy, provides a more detailed explanation of its prediction, and is better able to differentiate the classification-relevant parts of the image.
Submitted 4 October, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
IAIA-BL: A Case-based Interpretable Deep Learning Model for Classification of Mass Lesions in Digital Mammography
Authors:
Alina Jade Barnett,
Fides Regina Schwartz,
Chaofan Tao,
Chaofan Chen,
Yinhao Ren,
Joseph Y. Lo,
Cynthia Rudin
Abstract:
Interpretability in machine learning models is important in high-stakes decisions, such as whether to order a biopsy based on a mammographic exam. Mammography poses important challenges that are not present in other computer vision tasks: datasets are small, confounding information is present, and it can be difficult even for a radiologist to decide between watchful waiting and biopsy based on a mammogram alone. In this work, we present a framework for interpretable machine learning-based mammography. In addition to predicting whether a lesion is malignant or benign, our work aims to follow the reasoning processes of radiologists in detecting clinically relevant semantic features of each image, such as the characteristics of the mass margins. The framework includes a novel interpretable neural network algorithm that uses case-based reasoning for mammography. Our algorithm can incorporate a combination of data with whole image labelling and data with pixel-wise annotations, leading to better accuracy and interpretability even with a small number of images. Our interpretable models are able to highlight the classification-relevant parts of the image, whereas other methods highlight healthy tissue and confounding information. Our models are decision aids, rather than decision makers, aimed at better overall human-machine collaboration. We do not observe a loss in mass margin classification accuracy over a black box neural network trained on the same data.
Submitted 23 March, 2021;
originally announced March 2021.
-
A Socio-Informatic Approach to Automated Account Classification on Social Media
Authors:
Laurenz A Cornelissen,
Petrus Schoonwinkel,
Richard J Barnett
Abstract:
Automated accounts on social media have become increasingly problematic. We propose a key feature in combination with existing methods to improve machine learning algorithms for bot detection. We successfully improve classification performance through including the proposed feature.
Submitted 27 April, 2019;
originally announced April 2019.
-
A Network Topology Approach to Bot Classification
Authors:
Laurenz A Cornelissen,
Richard J Barnett,
Petrus Schoonwinkel,
Brent D. Eichstadt,
Hluma B. Magodla
Abstract:
Automated social agents, or bots, are increasingly becoming a problem on social media platforms. There is a growing body of literature and multiple tools to aid in the detection of such agents on online social networking platforms. We propose that the social network topology of a user would be sufficient to determine whether the user is an automated agent or a human. To test this, we use a publicly available dataset containing users on Twitter labelled as either automated social agent or human. Using an unsupervised machine learning approach, we obtain a detection accuracy rate of 70%.
Submitted 17 September, 2018;
originally announced September 2018.
-
Deploying South African Social Honeypots on Twitter
Authors:
Laurenz A Cornelissen,
Richard J Barnett,
Morakane AM Kepa,
Daniel Loebenberg-Novitzkas,
Jacques Jordaan
Abstract:
Inspired by the simple, yet effective, method of tweeting gibberish to attract automated social agents (bots), we attempt to create localised honeypots in the South African political context. We produce a series of defined techniques and combine them to generate interactions from users on Twitter. The paper offers two key contributions. Conceptually, an argument is made that honeypots should not be confused with bot detection methods, but are rather methods to capture low-quality users. Secondly, we successfully generate a list of 288 local low-quality users active in the political context.
Submitted 17 September, 2018;
originally announced September 2018.
-
This Looks Like That: Deep Learning for Interpretable Image Recognition
Authors:
Chaofan Chen,
Oscar Li,
Chaofan Tao,
Alina Jade Barnett,
Jonathan Su,
Cynthia Rudin
Abstract:
When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image, and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture -- prototypical part network (ProtoPNet) -- that reasons in a similar way: the network dissects the image by finding prototypical parts, and combines evidence from the prototypes to make a final classification. The model thus reasons in a way that is qualitatively similar to the way ornithologists, physicians, and others would explain to people how to solve challenging image classification tasks. The network uses only image-level labels for training without any annotations for parts of images. We demonstrate our method on the CUB-200-2011 dataset and the Stanford Cars dataset. Our experiments show that ProtoPNet can achieve comparable accuracy with its analogous non-interpretable counterpart, and when several ProtoPNets are combined into a larger network, it can achieve an accuracy that is on par with some of the best-performing deep models. Moreover, ProtoPNet provides a level of interpretability that is absent in other interpretable deep models.
Submitted 28 December, 2019; v1 submitted 27 June, 2018;
originally announced June 2018.