
Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer

Authors: Quoc Khanh Nguyen, Truong Thanh Hung Nguyen, Vo Thanh Khang Nguyen, Van Binh Truong, Tuong Phan, Hung Cao

Abstract: To address the challenges of providing quick and plausible explanations in Explainable AI (XAI) for object detection models, we introduce the Gaussian Class Activation Mapping Explainer (G-CAME). Our method efficiently generates concise saliency maps by utilizing activation maps from selected layers and applying a Gaussian kernel to emphasize critical image regions for the predicted object. Compared with other Region-based approaches, G-CAME significantly reduces explanation time to 0.5 seconds without compromising the quality. Our evaluation of G-CAME, using Faster-RCNN and YOLOX on the MS-COCO 2017 dataset, demonstrates its ability to offer highly plausible and faithful explanations, especially in reducing the bias on tiny object detection.

Submitted 20 April, 2024; originally announced April 2024.

Comments: Canadian AI 2024

arXiv:2402.12525 [pdf, other]

LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks

Authors: Truong Thanh Hung Nguyen, Tobias Clement, Phuc Truong Loc Nguyen, Nils Kemmerzell, Van Binh Truong, Vo Thanh Khang Nguyen, Mohamed Abdelaal, Hung Cao

Abstract: LangXAI is a framework that integrates Explainable Artificial Intelligence (XAI) with advanced vision models to generate textual explanations for visual recognition tasks. Despite XAI advancements, an understanding gap persists for end-users with limited domain knowledge in artificial intelligence and computer vision. LangXAI addresses this by furnishing text-based explanations for classification, object detection, and semantic segmentation model outputs to end-users. Preliminary results demonstrate LangXAI's enhanced plausibility, with high BERTScore across tasks, fostering a more transparent and reliable AI framework on vision tasks for end-users.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.12179 [pdf, other]

Examining Monitoring System: Detecting Abnormal Behavior In Online Examinations

Authors: Dinh An Ngo, Thanh Dat Nguyen, Thi Le Chi Dang, Huy Hoan Le, Ton Bao Ho, Vo Thanh Khang Nguyen, Truong Thanh Hung Nguyen

Abstract: Cheating in online exams has become a prevalent issue over the past decade, especially during the COVID-19 pandemic. To address this issue of academic dishonesty, our "Exam Monitoring System: Detecting Abnormal Behavior in Online Examinations" is designed to assist proctors in identifying unusual student behavior. Our system demonstrates high accuracy and speed in detecting cheating in real-time scenarios, providing valuable information, and aiding proctors in decision-making. This article outlines our methodology and the effectiveness of our system in mitigating the widespread problem of cheating in online exams.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2401.09900 [pdf, other]

XAI-Enhanced Semantic Segmentation Models for Visual Quality Inspection

Authors: Tobias Clement, Truong Thanh Hung Nguyen, Mohamed Abdelaal, Hung Cao

Abstract: Visual quality inspection systems, crucial in sectors like manufacturing and logistics, employ computer vision and machine learning for precise, rapid defect detection. However, their unexplained nature can hinder trust, error identification, and system improvement. This paper presents a framework to bolster visual quality inspection by using CAM-based explanations to refine semantic segmentation models. Our approach consists of 1) Model Training, 2) XAI-based Model Explanation, 3) XAI Evaluation, and 4) Annotation Augmentation for Model Enhancement, informed by explanations and expert insights. Evaluations show XAI-enhanced models surpass original DeepLabv3-ResNet101 models, especially in intricate object segmentation.

Submitted 18 January, 2024; originally announced January 2024.

Comments: IEEE ICCE 2024

arXiv:2401.09852 [pdf, other]

Enhancing the Fairness and Performance of Edge Cameras with Explainable AI

Authors: Truong Thanh Hung Nguyen, Vo Thanh Khang Nguyen, Quoc Hung Cao, Van Binh Truong, Quoc Khanh Nguyen, Hung Cao

Abstract: The rising use of Artificial Intelligence (AI) in human detection on Edge camera systems has led to accurate but complex models, challenging to interpret and debug. Our research presents a diagnostic method using Explainable AI (XAI) for model debugging, with expert-driven problem identification and solution creation. Validated on the Bytetrack model in a real-world office Edge network, we found the training dataset as the main bias source and suggested model augmentation as a solution. Our approach helps identify model biases, essential for achieving fair and trustworthy models.

Submitted 18 January, 2024; originally announced January 2024.

Comments: IEEE ICCE 2024

arXiv:2307.04137 [pdf, other]

A Novel Explainable Artificial Intelligence Model in Image Classification problem

Authors: Quoc Hung Cao, Truong Thanh Hung Nguyen, Vo Thanh Khang Nguyen, Xuan Phong Nguyen

Abstract: In recent years, artificial intelligence is increasingly being applied widely in many different fields and has a profound and direct impact on human life. Following this is the need to understand the principles of the model making predictions. Since most of the current high-precision models are black boxes, neither the AI scientist nor the end-user deeply understands what's going on inside these models. Therefore, many algorithms are studied for the purpose of explaining AI models, especially those in the problem of image classification in the field of computer vision such as LIME, CAM, GradCAM. However, these algorithms still have limitations such as LIME's long execution time and CAM's confusing interpretation of concreteness and clarity. Therefore, in this paper, we propose a new method called Segmentation - Class Activation Mapping (SeCAM) that combines the advantages of these algorithms above, while at the same time overcoming their disadvantages. We tested this algorithm with various models, including ResNet50, Inception-v3, VGG16 from ImageNet Large Scale Visual Recognition Challenge (ILSVRC) data set. Outstanding results when the algorithm has met all the requirements for a specific explanation in a remarkably concise time.

Submitted 9 July, 2023; originally announced July 2023.

Comments: Published in the Proceedings of FAIC 2021

arXiv:2306.03400 [pdf, other]

G-CAME: Gaussian-Class Activation Mapping Explainer for Object Detectors

Authors: Quoc Khanh Nguyen, Truong Thanh Hung Nguyen, Vo Thanh Khang Nguyen, Van Binh Truong, Quoc Hung Cao

Abstract: Nowadays, deep neural networks for object detection in images are very prevalent. However, due to the complexity of these networks, users find it hard to understand why these objects are detected by models. We proposed Gaussian Class Activation Mapping Explainer (G-CAME), which generates a saliency map as the explanation for object detection models. G-CAME can be considered a CAM-based method that uses the activation maps of selected layers combined with the Gaussian kernel to highlight the important regions in the image for the predicted box. Compared with other Region-based methods, G-CAME can transcend time constraints as it takes a very short time to explain an object. We also evaluated our method qualitatively and quantitatively with YOLOX on the MS-COCO 2017 dataset and guided to apply G-CAME into the two-stage Faster-RCNN model.

Submitted 6 June, 2023; originally announced June 2023.

Comments: 10 figures

arXiv:2306.02744 [pdf, other]

Towards Better Explanations for Object Detection

Authors: Van Binh Truong, Truong Thanh Hung Nguyen, Vo Thanh Khang Nguyen, Quoc Khanh Nguyen, Quoc Hung Cao

Abstract: Recent advances in Artificial Intelligence (AI) technology have promoted their use in almost every field. The growing complexity of deep neural networks (DNNs) makes it increasingly difficult and important to explain the inner workings and decisions of the network. However, most current techniques for explaining DNNs focus mainly on interpreting classification tasks. This paper proposes a method to explain the decision for any object detection model called D-CLOSE. To closely track the model's behavior, we used multiple levels of segmentation on the image and a process to combine them. We performed tests on the MS-COCO dataset with the YOLOX model, which shows that our method outperforms D-RISE and can give a better quality and less noise explanation.

Submitted 6 June, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: 9 pages, 10 figures

arXiv:2305.07476 [pdf, ps, other]

A Hilbert Bundles Description of Complex Brunn-Minkowski Theory

Authors: Tai Terje Huu Nguyen

Abstract: The following is a Ph.D. thesis. The thesis is submitted in partial fulfillment of the requirements for the degree of Philosophiae Doctor (Ph.D.) at the Norwegian University of Science and Technology.

Submitted 12 May, 2023; originally announced May 2023.

arXiv:2303.04731 [pdf, other]

Towards Trust of Explainable AI in Thyroid Nodule Diagnosis

Authors: Truong Thanh Hung Nguyen, Van Binh Truong, Vo Thanh Khang Nguyen, Quoc Hung Cao, Quoc Khanh Nguyen

Abstract: The ability to explain the prediction of deep learning models to end-users is an important feature to leverage the power of artificial intelligence (AI) for the medical decision-making process, which is usually considered non-transparent and challenging to comprehend. In this paper, we apply state-of-the-art eXplainable artificial intelligence (XAI) methods to explain the prediction of the black-box AI models in the thyroid nodule diagnosis application. We propose new statistic-based XAI methods, namely Kernel Density Estimation and Density map, to explain the case of no nodule detected. XAI methods' performances are considered under a qualitative and quantitative comparison as feedback to improve the data quality and the model performance. Finally, we survey to assess doctors' and patients' trust in XAI explanations of the model's decisions on thyroid nodule images.

Submitted 8 March, 2023; originally announced March 2023.

Comments: Accepted by AAAI 2023 The 7th International Workshop on Health Intelligence (W3PHIAI-23)

arXiv:2112.10491 [pdf, ps, other]

Secrecy Performance of RIS-assisted Wireless Networks under Rician fading

Authors: Thi Tuyet Hai Nguyen, Tien Hoa Nguyen

Abstract: Secrecy outage probability (SOP) and secrecy rate (SR) of the reconfigurable intelligent surface (RIS) assisted wireless networks under Rician fading are investigated in this paper. More precisely, we enhance the secrecy performance of the considered networks by suppressing the wiretap channel instead of maximizing the main channel. We propose a simple heuristic algorithm to find out the optimal phase-shift of each RIS's element. Simulation results based on the Monte-Carlo method are given to verify the superiority of the proposed optimal phase-shifts compared to the random phase-shifts design.

Submitted 10 December, 2021; originally announced December 2021.

arXiv:2111.08422 [pdf, ps, other]

A Hilbert bundle approach to the sharp strong openness theorem and the Ohsawa-Takegoshi extension theorem

Authors: Tai Terje Huu Nguyen, Xu Wang

Abstract: The following paper is around parts of the first named author's thesis. We discuss (what we call) a Hilbert bundle approach to complex Brunn-Minkowski theory and obtain a general monotonicity theorem. As two applications, we prove a generalization of Guan's sharp strong openness theorem and a sharp Ohsawa-Takegoshi extension theorem. A second proof of Guan-Zhou's strong openness theorem using a Donnelly-Fefferman estimate is also given.

Submitted 23 February, 2024; v1 submitted 16 November, 2021; originally announced November 2021.

Comments: New version, accepted by Contemporary Mathematics

arXiv:2105.00947 [pdf, ps, other]

On a remark by Ohsawa related to the Berndtsson-Lempert method for $L^2$-holomorphic extension

Authors: Tai Terje Huu Nguyen, Xu Wang

Abstract: We utilize the Legendre-Fenchel transform and weak geodesics for plurisubharmonic functions to construct a weight function that can be used in the Berndtsson-Lempert method, to give an Ohsawa-Takegoshi extension type of result. Theorem 4.1 and Theorem 0.1 in \cite{OT2017} (Theorem \ref{Theorem A} and \ref{Theorem B} below) follow as two special cases of this result, thus answering affirmatively a question posed by Ohsawa in remark 4.1 in \cite{OT2017}, on the Berndtsson-Lempert method.

Submitted 3 May, 2021; originally announced May 2021.

arXiv:2010.00198 [pdf, other]

Improving Vietnamese Named Entity Recognition from Speech Using Word Capitalization and Punctuation Recovery Models

Authors: Thai Binh Nguyen, Quang Minh Nguyen, Thi Thu Hien Nguyen, Quoc Truong Do, Chi Mai Luong

Abstract: Studies on the Named Entity Recognition (NER) task have shown outstanding results that reach human parity on input texts with correct text formattings, such as with proper punctuation and capitalization. However, such conditions are not available in applications where the input is speech, because the text is generated from a speech recognition system (ASR), and that the system does not consider the text formatting. In this paper, we (1) presented the first Vietnamese speech dataset for NER task, and (2) the first pre-trained public large-scale monolingual language model for Vietnamese that achieved the new state-of-the-art for the Vietnamese NER task by 1.3% absolute F1 score comparing to the latest study. And finally, (3) we proposed a new pipeline for NER task from speech that overcomes the text formatting problem by introducing a text capitalization and punctuation recovery model (CaPu) into the pipeline. The model takes input text from an ASR system and performs two tasks at the same time, producing proper text formatting that helps to improve NER performance. Experimental results indicated that the CaPu model helps to improve by nearly 4% of F1-score.

Submitted 1 October, 2020; originally announced October 2020.

Comments: Accepted in Interspeech 2020

arXiv:2009.10954 [pdf]

doi 10.1039/D0NR00165A

Polytypism in Few-Layer Gallium Selenide

Authors: Soo Yeon Lim, Jae-Ung Lee, Jung Hwa Kim, Liangbo Liang, Xiangru Kong, Thi Thanh Huong Nguyen, Zonghoon Lee, Sunglae Cho, Hyeonsik Cheong

Abstract: Gallium selenide (GaSe) is one of layered group-III metal monochalcogenides, which has an indirect bandgap in monolayer and direct bandgap in bulk unlike other conventional transition metal dichalcogenides (TMDs) such as MoX2 and WX2 (X=S and Se). Four polytypes of bulk GaSe, designated as beta-, epsilon-, gamma-, and delta-GaSe, have been reported. Since different polytypes result in different optical and electrical properties even for the same thickness, identifying the polytype is essential in utilizing this material for various optoelectronic applications. We performed polarized Raman measurement on GaSe and found different ultra-low-frequency Raman spectra of inter-layer vibrational modes even for the same thickness due to different stacking sequences of the polytypes. By comparing the ultra-low-frequency Raman spectra with theoretical calculations and high-resolution electron microscopy measurements, we established the correlation between the ultra-low-frequency Raman spectra and the stacking sequences for trilayer GaSe. We further found that the AB-type stacking is more stable than the AA'-type stacking in GaSe.

Submitted 23 September, 2020; originally announced September 2020.

Journal ref: Nanoscale, 2020,12, 8563-8573

arXiv:1707.08031 [pdf, other]

Optimal Timing in Dynamic and Robust Attacker Engagement During Advanced Persistent Threats

Authors: Jeffrey Pawlick, Thi Thu Hang Nguyen, Edward Colbert, Quanyan Zhu

Abstract: Advanced persistent threats (APTs) are stealthy attacks which make use of social engineering and deception to give adversaries insider access to networked systems. Against APTs, active defense technologies aim to create and exploit information asymmetry for defenders. In this paper, we study a scenario in which a powerful defender uses honeynets for active defense in order to observe an attacker who has penetrated the network. Rather than immediately eject the attacker, the defender may elect to gather information. We introduce an undiscounted, infinite-horizon Markov decision process on a continuous state space in order to model the defender's problem. We find a threshold of information that the defender should gather about the attacker before ejecting him. Then we study the robustness of this policy using a Stackelberg game. Finally, we simulate the policy for a conceptual network. Our results provide a quantitative foundation for studying optimal timing for attacker engagement in network defense.

Submitted 22 January, 2019; v1 submitted 25 July, 2017; originally announced July 2017.

Comments: Submitted to the 2019 Intl. Symp. Modeling and Optimization in Mobile, Ad Hoc, and Wireless Nets. (WiOpt)

Showing 1–16 of 16 results for author: Nguyen, T T H