-
Data Quality in Crowdsourcing and Spamming Behavior Detection
Authors:
Yang Ba,
Michelle V. Mancenido,
Erin K. Chiou,
Rong Pan
Abstract:
As crowdsourcing emerges as an efficient and cost-effective method for obtaining labels for machine learning datasets, it is important to assess the quality of crowd-provided data, so as to improve analysis performance and reduce biases in subsequent machine learning tasks. Given the lack of ground truth in most cases of crowdsourcing, we refer to data quality as annotators' consistency and credibility. Unlike the simple scenarios where the Kappa coefficient and the intraclass correlation coefficient usually apply, online crowdsourcing requires dealing with more complex situations. We introduce a systematic method for evaluating data quality and detecting spamming threats via variance decomposition, and we classify spammers into three categories based on their different behavioral patterns. A spammer index is proposed to assess the consistency of the entire dataset, and two metrics are developed to measure crowd workers' credibility by utilizing the Markov chain and generalized random effects models. Furthermore, we showcase the practicality of our techniques and their advantages by applying them to a face verification task with both simulation and real-world data collected from two crowdsourcing platforms.
Submitted 3 April, 2024;
originally announced April 2024.
-
PADTHAI-MM: A Principled Approach for Designing Trustable, Human-centered AI systems using the MAST Methodology
Authors:
Nayoung Kim,
Myke C. Cohen,
Yang Ba,
Anna Pan,
Shawaiz Bhatti,
Pouria Salehi,
James Sung,
Erik Blasch,
Michelle V. Mancenido,
Erin K. Chiou
Abstract:
Designing for AI trustworthiness is challenging, with a lack of practical guidance despite extensive literature on trust. The Multisource AI Scorecard Table (MAST), a checklist rating system, addresses this gap in designing and evaluating AI-enabled decision support systems. We propose the Principled Approach for Designing Trustable Human-centered AI systems using MAST Methodology (PADTHAI-MM), a nine-step framework that we demonstrate through the iterative design of a text analysis platform called the REporting Assistant for Defense and Intelligence Tasks (READIT). We designed two versions of READIT: high-MAST, including AI context and explanations, and low-MAST, resembling a "black box" type system. Participant feedback and state-of-the-art AI knowledge were integrated into the design process, leading to a redesigned prototype tested by participants in an intelligence reporting task. Results show that MAST-guided design can improve trust perceptions, and that MAST criteria can be linked to performance, process, and purpose information, providing a practical and theory-informed basis for AI system design.
Submitted 24 January, 2024;
originally announced January 2024.
-
Evaluating Trustworthiness of AI-Enabled Decision Support Systems: Validation of the Multisource AI Scorecard Table (MAST)
Authors:
Pouria Salehi,
Yang Ba,
Nayoung Kim,
Ahmadreza Mosallanezhad,
Anna Pan,
Myke C. Cohen,
Yixuan Wang,
Jieqiong Zhao,
Shawaiz Bhatti,
James Sung,
Erik Blasch,
Michelle V. Mancenido,
Erin K. Chiou
Abstract:
The Multisource AI Scorecard Table (MAST) is a checklist tool based on analytic tradecraft standards to inform the design and evaluation of trustworthy AI systems. In this study, we evaluate whether MAST is associated with people's trust perceptions in AI-enabled decision support systems (AI-DSSs). Evaluating trust in AI-DSSs poses challenges to researchers and practitioners. These challenges include identifying the components, capabilities, and potential of these systems, many of which are based on the complex deep learning algorithms that drive DSS performance and preclude complete manual inspection. We developed two interactive, AI-DSS test environments using the MAST criteria. One emulated an identity verification task in security screening, and another emulated a text summarization system to aid in an investigative reporting task. Each test environment had one version designed to match low-MAST ratings, and another designed to match high-MAST ratings, with the hypothesis that MAST ratings would be positively related to the trust ratings of these systems. A total of 177 subject matter experts were recruited to interact with and evaluate these systems. Results generally show higher MAST ratings for the high-MAST conditions compared to the low-MAST groups, and that measures of trust perception are highly correlated with the MAST ratings. We conclude that MAST can be a useful tool for designing and evaluating systems that will engender high trust perceptions, including AI-DSS that may be used to support visual screening and text summarization tasks. However, higher MAST ratings may not translate to higher joint performance.
Submitted 29 November, 2023;
originally announced November 2023.
-
Beyond Deterministic Translation for Unsupervised Domain Adaptation
Authors:
Eleni Chiou,
Eleftheria Panagiotaki,
Iasonas Kokkinos
Abstract:
In this work we challenge the common approach of using a one-to-one mapping ('translation') between the source and target domains in unsupervised domain adaptation (UDA). Instead, we rely on stochastic translation to capture inherent translation ambiguities. This allows us to (i) train more accurate target networks by generating multiple outputs conditioned on the same source image, leveraging both accurate translation and data augmentation for appearance variability, (ii) impute robust pseudo-labels for the target data by averaging the predictions of a source network on multiple translated versions of a single target image and (iii) train and ensemble diverse networks in the target domain by modulating the degree of stochasticity in the translations. We report improvements over strong recent baselines, leading to state-of-the-art UDA results on two challenging semantic segmentation benchmarks. Our code is available at https://github.com/elchiou/Beyond-deterministic-translation-for-UDA.
Submitted 20 November, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
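Point (ii) of the abstract above, imputing pseudo-labels by averaging a source network's predictions over several translated versions of one target image, can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked in the abstract); the function name `average_pseudo_labels` and the input layout `(K, C, H, W)` are assumptions made for illustration, and the per-class probability maps are assumed to have been computed already.

```python
import numpy as np


def average_pseudo_labels(prob_maps):
    """Average per-class probabilities over K translated versions of one image.

    prob_maps: array-like of shape (K, C, H, W), where K is the number of
    stochastic translations and C the number of classes (hypothetical layout).
    Returns (labels, mean_probs) with shapes (H, W) and (C, H, W).
    """
    prob_maps = np.asarray(prob_maps, dtype=float)
    # Average the K predictions for every class and pixel.
    mean_probs = prob_maps.mean(axis=0)
    # The pseudo-label of a pixel is the class with the highest mean probability.
    labels = mean_probs.argmax(axis=0)
    return labels, mean_probs


# Two translated versions (K=2), two classes (C=2), one row of two pixels.
example = [
    [[[0.9, 0.4]], [[0.1, 0.6]]],  # predictions on translation 1
    [[[0.7, 0.2]], [[0.3, 0.8]]],  # predictions on translation 2
]
labels, mean_probs = average_pseudo_labels(example)
```

Averaging before taking the argmax lets a confident prediction on one translation outweigh a marginal one on another, which is one plausible reading of why the averaged labels are described as robust.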
-
Unsupervised Domain Adaptation with Semantic Consistency across Heterogeneous Modalities for MRI Prostate Lesion Segmentation
Authors:
Eleni Chiou,
Francesco Giganti,
Shonit Punwani,
Iasonas Kokkinos,
Eleftheria Panagiotaki
Abstract:
Any novel medical imaging modality that differs from previous protocols, e.g. in the number of imaging channels, introduces a new domain that is heterogeneous from previous ones. This common medical imaging scenario is rarely considered in the domain adaptation literature, which handles shifts across domains of the same dimensionality. In our work we rely on stochastic generative modeling to translate across two heterogeneous domains at pixel space and introduce two new loss functions that promote semantic consistency. Firstly, we introduce a semantic cycle-consistency loss in the source domain to ensure that the translation preserves the semantics. Secondly, we introduce a pseudo-labelling loss, where we translate target data to source, label them by a source-domain network, and use the generated pseudo-labels to supervise the target-domain network. Our results show that this allows us to extract systematically better representations for the target domain. In particular, we address the challenge of enhancing performance on VERDICT-MRI, an advanced diffusion-weighted imaging technique, by exploiting labeled mp-MRI data. When compared to several unsupervised domain adaptation approaches, our approach yields substantial improvements that consistently carry over to the semi-supervised and supervised learning settings.
Submitted 19 September, 2021;
originally announced September 2021.
-
Harnessing Uncertainty in Domain Adaptation for MRI Prostate Lesion Segmentation
Authors:
Eleni Chiou,
Francesco Giganti,
Shonit Punwani,
Iasonas Kokkinos,
Eleftheria Panagiotaki
Abstract:
The need for training data can impede the adoption of novel imaging modalities for learning-based medical image analysis. Domain adaptation methods partially mitigate this problem by translating training data from a related source domain to a novel target domain, but typically assume that a one-to-one translation is possible. Our work addresses the challenge of adapting to a more informative target domain where multiple target samples can emerge from a single source sample. In particular we consider translating from mp-MRI to VERDICT, a richer MRI modality involving an optimized acquisition protocol for cancer characterization. We explicitly account for the inherent uncertainty of this mapping and exploit it to generate multiple outputs conditioned on a single input. Our results show that this allows us to extract systematically better image representations for the target domain, when used in tandem with both simple, CycleGAN-based baselines, as well as more powerful approaches that integrate discriminative segmentation losses and/or residual adapters. When compared to its deterministic counterparts, our approach yields substantial improvements across a broad range of dataset sizes, increasingly strong baselines, and evaluation measures.
Submitted 18 January, 2021; v1 submitted 14 October, 2020;
originally announced October 2020.