CattleFace-RGBT: RGB-T Cattle Facial Landmark Benchmark

Authors: Ethan Coffman, Reagan Clark, Nhat-Tan Bui, Trong Thang Pham, Beth Kegley, Jeremy G. Powell, Jiangchao Zhao, Ngan Le

Abstract: To address this challenge, we introduce CattleFace-RGBT, an RGB-T Cattle Facial Landmark dataset consisting of 2,300 RGB-T image pairs, a total of 4,600 images. Creating a landmark dataset is time-consuming, but AI-assisted annotation can help. However, applying AI to thermal images is challenging due to suboptimal results from direct thermal training and infeasible RGB-thermal alignment due to different camera views. Therefore, we opt to transfer models trained on RGB to thermal images and refine them using our AI-assisted annotation tool following a semi-automatic annotation approach. Accurately localizing facial key points on both RGB and thermal images enables us to not only discern the cattle's respiratory signs but also measure temperatures to assess the animal's thermal state. To the best of our knowledge, this is the first dataset for cattle facial landmarks on RGB-T images. We conduct benchmarking of the CattleFace-RGBT dataset across various backbone architectures, with the objective of establishing baselines for future research, analysis, and comparison. The dataset and models are at https://github.com/UARK-AICV/CattleFace-RGBT-benchmark

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2309.13550 [pdf, other]

I-AI: A Controllable & Interpretable AI System for Decoding Radiologists' Intense Focus for Accurate CXR Diagnoses

Authors: Trong Thang Pham, Jacob Brecheisen, Anh Nguyen, Hien Nguyen, Ngan Le

Abstract: In the field of chest X-ray (CXR) diagnosis, existing works often focus solely on determining where a radiologist looks, typically through tasks such as detection, segmentation, or classification. However, these approaches are often designed as black-box models, lacking interpretability. In this paper, we introduce Interpretable Artificial Intelligence (I-AI), a novel and unified controllable interpretable pipeline for decoding the intense focus of radiologists in CXR diagnosis. Our I-AI addresses three key questions: where a radiologist looks, how long they focus on specific areas, and what findings they diagnose. By capturing the intensity of the radiologist's gaze, we provide a unified solution that offers insights into the cognitive process underlying radiological interpretation. Unlike current methods that rely on black-box machine learning models, which can be prone to extracting erroneous information from the entire input image during the diagnosis process, we tackle this issue by effectively masking out irrelevant information. Our proposed I-AI leverages a vision-language model, allowing for precise control over the interpretation process while ensuring the exclusion of irrelevant features. To train our I-AI model, we utilize an eye gaze dataset to extract anatomical gaze information and generate ground truth heatmaps. Through extensive experimentation, we demonstrate the efficacy of our method. We showcase that the attention heatmaps, designed to mimic radiologists' focus, encode sufficient and relevant information, enabling accurate classification tasks using only a portion of CXR. The code, checkpoints, and data are at https://github.com/UARK-AICV/IAI

Submitted 9 December, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

Comments: Accepted at WACV 2024

arXiv:2204.01626 [pdf]

doi 10.18419/darus-2785

Stuttgart Open Relay Degradation Dataset (SOReDD)

Authors: Benjamin Maschler, Angel Iliev, Thi Thu Huong Pham, Michael Weyrich

Abstract: Real-life industrial use cases for machine learning oftentimes involve heterogeneous and dynamic assets, processes and data, resulting in a need to continuously adapt the learning algorithm accordingly. Industrial transfer learning offers to lower the effort of such adaptation by allowing the utilization of previously acquired knowledge in solving new (variants of) tasks. Being data-driven methods, the development of industrial transfer learning algorithms naturally requires appropriate datasets for training. However, open-source datasets suitable for transfer learning training, i.e. spanning different assets, processes and data (variants), are rare. With the Stuttgart Open Relay Degradation Dataset (SOReDD) we want to offer such a dataset. It provides data on the degradation of different electromechanical relays under different operating conditions, allowing for a large number of different transfer scenarios. Although such relays themselves are usually inexpensive standard components, their failure often leads to the failure of a machine as a whole due to their role as the central power switching element of a machine. The main cost factor in the event of a relay defect is therefore not the relay itself, but the reduced machine availability. It is therefore desirable to predict relay degradation as accurately as possible for specific applications in order to be able to replace relays in good time and avoid unplanned machine downtimes. Nevertheless, data-driven failure prediction for electromechanical relays faces the challenge that relay degradation behavior is highly dependent on the operating conditions, high-resolution measurement data on relay degradation behavior is only collected in rare cases, and such data can then only cover a fraction of the possible operating environments. Relays are thus representative of many other central standard components in automation technology.

Submitted 4 April, 2022; originally announced April 2022.

Comments: Dataset description (8 pages, 4 figures, 8 tables)

arXiv:2111.05523 [pdf, ps, other]

Anonymous communication system provides a secure environment without leaking metadata, which has many application scenarios in IoT

Authors: Ngoc Ai Van Nguyen, Minh Thuy Truc Pham

Abstract: Anonymous Identity Based Encryption (AIBET) scheme allows a tracer to use the tracing key to reveal the recipient's identity from the ciphertext while keeping other data anonymous. This special feature makes AIBET a promising solution to distributed IoT data security. In this paper, we construct an efficient quantum-safe Hierarchical Identity-Based cryptosystem with Traceable Identities (AHIBET) with fully anonymous ciphertexts. We prove the security of the AHIBET scheme under the Learning with Errors (LWE) problem in the standard model.

Submitted 9 November, 2021; originally announced November 2021.

arXiv:2101.00509 [pdf]

doi 10.13140/RG.2.2.15631.00163

Regularization-based Continual Learning for Anomaly Detection in Discrete Manufacturing

Authors: Benjamin Maschler, Thi Thu Huong Pham, Michael Weyrich

Abstract: The early and robust detection of anomalies occurring in discrete manufacturing processes allows operators to prevent harm, e.g. defects in production machinery or products. While current approaches for data-driven anomaly detection provide good results on the exact processes they were trained on, they often lack the ability to flexibly adapt to changes, e.g. in products. Continual learning promises such flexibility, allowing for an automatic adaption of previously learnt knowledge to new tasks. Therefore, this article discusses different continual learning approaches from the group of regularization strategies, which are implemented, evaluated and compared based on a real industrial metal forming dataset.

Submitted 2 January, 2021; originally announced January 2021.

Comments: 6 pages, 5 figures, 3 tables, submitted to the CIRP Conference on Manufacturing Systems 2021

arXiv:1905.13125 [pdf, other]

Seeker: Real-Time Interactive Search

Authors: Ari Biswas, Thai T Pham, Michael Vogelsong, Benjamin Snyder, Houssam Nassif

Abstract: This paper introduces Seeker, a system that allows users to interactively refine search rankings in real time, through feedback in the form of likes and dislikes. When searching online, users may not know how to accurately describe their product of choice in words. An alternative approach is to search an embedding space, allowing the user to query using a representation of the item (like a tune for a song, or a picture for an object). However, this approach requires the user to possess an example representation of their desired item. Additionally, most current search systems do not allow the user to dynamically adapt the results with further feedback. On the other hand, users often have a mental picture of the desired item and are able to answer ordinal questions of the form: "Is this item similar to what you have in mind?" With this assumption, our algorithm allows for users to provide sequential feedback on search results to adapt the search feed. We show that our proposed approach works well both qualitatively and quantitatively. Unlike most previous representation-based search systems, we can quantify the quality of our algorithm by evaluating humans-in-the-loop experiments.

Submitted 17 May, 2019; originally announced May 2019.

Comments: This paper will appear in KDD 2019

Journal ref: Knowledge Discovery in Databases Conference (KDD'19), Anchorage, Alaska, pp. 2867-2875, 2019
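
The feedback loop described in the Seeker abstract (likes and dislikes adapting a ranking over an embedding space) can be illustrated with a small sketch. This is not the Seeker algorithm, which the abstract does not specify; it substitutes a simple Rocchio-style update, and the names `update_query`, `rank`, the `step` parameter, and the toy catalog are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def update_query(query, item, liked, step=0.5):
    """Rocchio-style update (an assumption, not the paper's method):
    move the query toward a liked item and away from a disliked one."""
    sign = 1.0 if liked else -1.0
    return [q + sign * step * x for q, x in zip(query, item)]

def rank(query, catalog):
    """Return item names ordered by decreasing similarity to the query."""
    return sorted(catalog, key=lambda name: cosine(query, catalog[name]), reverse=True)

# Toy two-dimensional embeddings, purely illustrative.
catalog = {
    "red_shoe": [1.0, 0.0],
    "blue_shoe": [0.0, 1.0],
    "purple_shoe": [0.7, 0.7],
}
query = [0.5, 0.5]
query = update_query(query, catalog["red_shoe"], liked=True)
query = update_query(query, catalog["blue_shoe"], liked=False)
ranking = rank(query, catalog)
```

Each piece of feedback shifts the query vector, so the ranking can be recomputed immediately after every like or dislike, which is the interactive behaviour the abstract describes.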

arXiv:1704.00860 [pdf, other]

Simultaneous Feature Aggregating and Hashing for Large-scale Image Search

Authors: Thanh-Toan Do, Dang-Khoa Le Tan, Trung T. Pham, Ngai-Man Cheung

Abstract: In most state-of-the-art hashing-based visual search systems, local image descriptors of an image are first aggregated as a single feature vector. This feature vector is then subjected to a hashing function that produces a binary hash code. In previous work, the aggregating and the hashing processes are designed independently. In this paper, we propose a novel framework where feature aggregating and hashing are designed simultaneously and optimized jointly. Specifically, our joint optimization produces aggregated representations that can be better reconstructed by some binary codes. This leads to more discriminative binary hash codes and improved retrieval accuracy. In addition, we also propose a fast version of the recently-proposed Binary Autoencoder to be used in our proposed framework. We perform extensive retrieval experiments on several benchmark datasets with both SIFT and convolutional features. Our results suggest that the proposed framework achieves significant improvements over the state of the art.

Submitted 3 April, 2017; originally announced April 2017.

Comments: Accepted to CVPR 2017
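
The conventional two-stage pipeline that the hashing abstract contrasts itself with (aggregate local descriptors into one vector, then hash that vector to a binary code) can be sketched as follows. This shows the independent baseline only, not the paper's joint optimization. Mean pooling and sign-of-random-projection hashing are assumed stand-ins for the aggregator and hash function, and all function names are hypothetical.

```python
import random

def aggregate(descriptors):
    """Mean-pool local descriptors into a single feature vector (assumed aggregator)."""
    count = len(descriptors)
    dim = len(descriptors[0])
    return [sum(d[i] for d in descriptors) / count for i in range(dim)]

def make_projection(dim, n_bits, seed=0):
    """Draw a fixed random Gaussian projection matrix with n_bits rows (assumed hash)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(feature, projection):
    """One bit per projection row: 1 if the dot product is non-negative, else 0."""
    return [
        1 if sum(w * x for w, x in zip(row, feature)) >= 0 else 0
        for row in projection
    ]

def hamming(code_a, code_b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(a != b for a, b in zip(code_a, code_b))

# Toy image with three 4-dimensional local descriptors, purely illustrative.
image = [[0.2, 0.1, 0.9, 0.4], [0.3, 0.0, 0.8, 0.5], [0.1, 0.2, 1.0, 0.3]]
projection = make_projection(dim=4, n_bits=16)
code = hash_code(aggregate(image), projection)
```

At query time, images would be compared by the Hamming distance between their codes. Because the two stages here are designed separately, nothing encourages the aggregated vector to be easy to binarize, which is the gap the abstract says joint optimization addresses.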

arXiv:1609.07849 [pdf, other]

Meaningful Maps With Object-Oriented Semantic Mapping

Authors: Niko Sünderhauf, Trung T. Pham, Yasir Latif, Michael Milford, Ian Reid

Abstract: For intelligent robots to interact in meaningful ways with their environment, they must understand both the geometric and semantic properties of the scene surrounding them. The majority of research to date has addressed these mapping challenges separately, focusing on either geometric or semantic mapping. In this paper we address the problem of building environmental maps that include both semantically meaningful, object-level entities and point- or mesh-based geometrical representations. We simultaneously build geometric point cloud models of previously unseen instances of known object classes and create a map that contains these object models as central entities. Our system leverages sparse, feature-based RGB-D SLAM, image-based deep-learning object detection and 3D unsupervised segmentation.

Submitted 2 August, 2017; v1 submitted 26 September, 2016; originally announced September 2016.

Showing 1–8 of 8 results for author: Pham, T T