Search | arXiv e-print repository

Collaborative human-AI trust (CHAI-T): A process framework for active management of trust in human-AI collaboration

Authors: Melanie J. McGrath, Andreas Duenser, Justine Lacey, Cecile Paris

Abstract: Collaborative human-AI (HAI) teaming combines the unique skills and capabilities of humans and machines in sustained teaming interactions leveraging the strengths of each. In tasks involving regular exposure to novelty and uncertainty, collaboration between adaptive, creative humans and powerful, precise artificial intelligence (AI) promises new solutions and efficiencies. User trust is essential to creating and maintaining these collaborative relationships. Established models of trust in traditional forms of AI typically recognize the contribution of three primary categories of trust antecedents: characteristics of the human user, characteristics of the technology, and environmental factors. The emergence of HAI teams, however, requires an understanding of human trust that accounts for the specificity of task contexts and goals, integrates processes of interaction, and captures how trust evolves in a teaming environment over time. Drawing on both the psychological and computer science literature, the process framework of trust in collaborative HAI teams (CHAI-T) presented in this paper adopts the tripartite structure of antecedents established by earlier models, while incorporating team processes and performance phases to capture the dynamism inherent to trust in teaming contexts. These features enable active management of trust in collaborative AI systems, with practical implications for the design and deployment of collaborative HAI teams.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 36 pages, 2 figures

arXiv:2304.14044

doi 10.1007/s10032-023-00427-w

Large Scale Genealogical Information Extraction From Handwritten Quebec Parish Records

Authors: Solène Tarride, Martin Maarand, Mélodie Boillet, James McGrath, Eugénie Capel, Hélène Vézina, Christopher Kermorvant

Abstract: This paper presents a complete workflow designed for extracting information from Quebec handwritten parish registers. The acts in these documents contain individual and family information highly valuable for genetic, demographic and social studies of the Quebec population. From an image of parish records, our workflow is able to identify the acts and extract personal information. The workflow is divided into successive steps: page classification, text line detection, handwritten text recognition, named entity recognition and act detection and classification. For all these steps, different machine learning models are compared. Once the information is extracted, validation rules designed by experts are then applied to standardize the extracted information and ensure its consistency with the type of act (birth, marriage, and death). This validation step is able to reject records that are considered invalid or merged. The full workflow has been used to process over two million pages of Quebec parish registers from the 19-20th centuries. On a sample comprising 65% of registers, 3.2 million acts were recognized. Verification of the birth and death acts from this sample shows that 74% of them are considered complete and valid. These records will be integrated into the BALSAC database and linked together to recreate family and genealogical relations at large scale.

Submitted 27 April, 2023; originally announced April 2023.

Journal ref: International Journal on Document Analysis and Recognition (IJDAR) (2023)

arXiv:2109.12143

Weather of the Dorm WIFI Ecosystem at the University of Colorado Boulder for Fall Semester 2019 to Spring Semester 2020 a Case Study of WIFI and a Campus Response to the COVID-19 Perturbation

Authors: Jake Mcgrath, Armen Davis, James Curry, Orrie Gartner, Glenn Rodrigues, Seth Spielman, Daniel Massey

Abstract: Growing use of network technology in Higher Education means that there has been increasing demand to adapt technology platforms and tools that transform student learning strategies, faculty teaching, research modalities, as well as general operations. Many of the new modalities are necessary for IHE business. In August 2019, we began collecting and analyzing data from the campus WIFI network. A goal of the research was to answer questions such as what passive sensing of the IHE WIFI might tell us about the dynamics of the WIFI weather in the IHE ecosystem, and what anonymized data tell us about the IHE ecosystem. The analogy with weather prediction seemed appropriate and a viable approach. Starting Fall 2019, data were collected in the observational phase. In the analysis phase, we applied Singular Spectrum Analysis (SSA) decomposition to deconstruct WIFI data from dorms, the central campus dining cafeteria, the recreation center, and other buildings on campus. That analysis led to the identification of clusters of buildings that behaved similarly. Just as in the case of models of the weather, a final component of this research was forecasting. We found that weekly forecasts of WIFI behavior in Fall 2019 were straightforward using SSA and seemed to present the behavior of a low-dimensional dynamical system. However, in Spring 2020, with the COVID perturbation, the campus ecosystem received a shock, and the data show that the campus changed very quickly.
We found that as the campus moved to remote learning and teaching, the closure of research labs, and the edict to work remotely, SSA forecasting techniques not trained on the Spring 2020 data after the shock performed poorly, while SSA forecasting trained on a portion of that data did better.

Submitted 24 September, 2021; originally announced September 2021.

Comments: Contact E-mail: [email protected], Applied Mathematics, University of Colorado, Boulder 80309-0526

arXiv:2009.00749

Iris Liveness Detection Competition (LivDet-Iris) -- The 2020 Edition

Authors: Priyanka Das, Joseph McGrath, Zhaoyuan Fang, Aidan Boyd, Ganghee Jang, Amir Mohammadi, Sandip Purnapatra, David Yambay, Sébastien Marcel, Mateusz Trokielewicz, Piotr Maciejewicz, Kevin Bowyer, Adam Czajka, Stephanie Schuckers, Juan Tapia, Sebastian Gonzalez, Meiling Fang, Naser Damer, Fadi Boutros, Arjan Kuijper, Renu Sharma, Cunjian Chen, Arun Ross

Abstract: Launched in 2013, LivDet-Iris is an international competition series open to academia and industry with the aim to assess and report advances in iris Presentation Attack Detection (PAD). This paper presents results from the fourth competition of the series: LivDet-Iris 2020. This year's competition introduced several novel elements: (a) incorporated new types of attacks (samples displayed on a screen, cadaver eyes and prosthetic eyes), (b) initiated LivDet-Iris as an on-going effort, with a testing protocol available now to everyone via the Biometrics Evaluation and Testing (BEAT) (https://www.idiap.ch/software/beat/) open-source platform to facilitate reproducibility and benchmarking of new algorithms continuously, and (c) performance comparison of the submitted entries with three baseline methods (offered by the University of Notre Dame and Michigan State University), and three open-source iris PAD methods available in the public domain. The best performing entry to the competition reported a weighted average APCER of 59.10% and a BPCER of 0.46% over all five attack types. This paper serves as the latest evaluation of iris PAD on a large spectrum of presentation attack instruments.

Submitted 1 September, 2020; originally announced September 2020.

Comments: 9 pages, 3 figures, 3 tables, Accepted for presentation at International Joint Conference on Biometrics (IJCB 2020)

arXiv:1910.12462

Fine-Grained Object Detection over Scientific Document Images with Region Embeddings

Authors: Ankur Goswami, Joshua McGrath, Shanan Peters, Theodoros Rekatsinas

Abstract: We study the problem of object detection over scanned images of scientific documents. We consider images that contain objects of varying aspect ratios and sizes and range from coarse elements such as tables and figures to fine elements such as equations and section headers. We find that current object detectors fail to produce properly localized region proposals over such page objects. We revisit the original R-CNN model and present a method for generating fine-grained proposals over document elements. We also present a region embedding model that uses the convolutional maps of a proposal's neighbors as context to produce an embedding for each proposal. This region embedding is able to capture the semantic relationships between a target region and its surrounding context. Our end-to-end model produces an embedding for each proposal, then classifies each proposal by using a multi-head attention model that attends to the most important neighbors of a proposal. To evaluate our model, we collect and annotate a dataset of publications from heterogeneous journals. We show that our model, referred to as Attentive-RCNN, yields a 17% mAP improvement compared to standard object detection models.

Submitted 30 October, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

arXiv:1904.02285

doi 10.1145/3299869.3319888

HoloDetect: Few-Shot Learning for Error Detection

Authors: Alireza Heidari, Joshua McGrath, Ihab F. Ilyas, Theodoros Rekatsinas

Abstract: We introduce a few-shot learning framework for error detection. We show that data augmentation (a form of weak supervision) is key to training high-quality, ML-based error detection models that require minimal human involvement. Our framework consists of two parts: (1) an expressive model to learn rich representations that capture the inherent syntactic and semantic heterogeneity of errors; and (2) a data augmentation model that, given a small seed of clean records, uses dataset-specific transformations to automatically generate additional training data. Our key insight is to learn data augmentation policies from the noisy input dataset in a weakly supervised manner. We show that our framework detects errors with an average precision of ~94% and an average recall of ~93% across a diverse array of datasets that exhibit different types and amounts of errors. We compare our approach to a comprehensive collection of error detection methods, ranging from traditional rule-based methods to ensemble-based and active learning approaches. We show that data augmentation yields an average improvement of 20 F1 points while it requires access to 3x fewer labeled examples compared to other ML approaches.

Submitted 3 April, 2019; originally announced April 2019.

Comments: 18 pages

Journal ref: ACM SIGMOD 2019

arXiv:1809.10172

Open Source Presentation Attack Detection Baseline for Iris Recognition

Authors: Joseph McGrath, Kevin W. Bowyer, Adam Czajka

Abstract: This paper proposes the first, known to us, open source presentation attack detection (PAD) solution to distinguish between authentic iris images (possibly wearing clear contact lenses) and irises with textured contact lenses. This software can serve as a baseline in various PAD evaluations, and also as an open-source platform with an up-to-date reference method for iris PAD. The software is written in C++ and Python and uses only open source resources, such as OpenCV. This method does not incorporate iris image segmentation, which may be problematic for unknown fake samples. Instead, it makes a best guess to localize the rough position of the iris. The PAD-related features are extracted with the Binary Statistical Image Features (BSIF), which are classified by an ensemble of classifiers incorporating support vector machine, random forest and multilayer perceptron. The models attached to the current software have been trained with the NDCLD'15 database and evaluated on the independent datasets included in the LivDet-Iris 2017 competition. The software implements the functionality of retraining the classifiers with any database of authentic and attack images. The accuracy of the current version offered with this paper exceeds 99% when tested on subject-disjoint subsets of NDCLD'15, and oscillates around 85% when tested on the LivDet-Iris 2017 benchmarks, which is on par with the results obtained by the LivDet-Iris 2017 winner.

Submitted 8 May, 2019; v1 submitted 26 September, 2018; originally announced September 2018.

Showing 1–7 of 7 results for author: McGrath, J