Search | arXiv e-print repository

Text Classification: A Review, Empirical, and Experimental Evaluation

Authors: Kamal Taha, Paul D. Yoo, Chan Yeun, Aya Taha

Abstract: The explosive and widespread growth of data necessitates the use of text classification to extract crucial information from vast amounts of data. Consequently, there has been a surge of research in both classical and deep learning text classification methods. Despite the numerous methods proposed in the literature, there is still a pressing need for a comprehensive and up-to-date survey. Existing survey papers categorize algorithms for text classification into broad classes, which can lead to the misclassification of unrelated algorithms and incorrect assessments of their qualities and behaviors using the same metrics. To address these limitations, our paper introduces a novel methodological taxonomy that classifies algorithms hierarchically into fine-grained classes and specific techniques. The taxonomy includes methodology categories, methodology techniques, and methodology sub-techniques. Our study is the first survey to utilize this methodological taxonomy for classifying algorithms for text classification. Furthermore, our study also conducts empirical evaluation and experimental comparisons and rankings of different algorithms that employ the same specific sub-technique, different sub-techniques within the same technique, different techniques within the same category, and categories

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.01896 [pdf]

Reputation-Based Federated Learning Defense to Mitigate Threats in EEG Signal Classification

Authors: Zhibo Zhang, Pengfei Li, Ahmed Y. Al Hammadi, Fusen Guo, Ernesto Damiani, Chan Yeob Yeun

Abstract: This paper presents a reputation-based threat mitigation framework that defends against potential security threats in electroencephalogram (EEG) signal classification during model aggregation of Federated Learning. While EEG signal analysis has attracted attention because of the emergence of brain-computer interface (BCI) technology, it is difficult to create efficient learning models for EEG analysis because of the distributed nature of EEG data and related privacy and security concerns. To address these challenges, the proposed defending framework leverages the Federated Learning paradigm to preserve privacy by collaborative model training with localized data from dispersed sources and introduces a reputation-based mechanism to mitigate the influence of data poisoning attacks and identify compromised participants. To assess the efficiency of the proposed reputation-based federated learning defense framework, data poisoning attacks based on the risk level of training data derived by Explainable Artificial Intelligence (XAI) techniques are conducted on both publicly available EEG signal datasets and the self-established EEG signal dataset. Experimental results on the poisoned datasets show that the proposed defense methodology performs well in EEG signal classification while reducing the risks associated with security threats.

Submitted 22 October, 2023; originally announced January 2024.

arXiv:2401.01895 [pdf]

A Robust Adversary Detection-Deactivation Method for Metaverse-oriented Collaborative Deep Learning

Authors: Pengfei Li, Zhibo Zhang, Ameena S. Al-Sumaiti, Naoufel Werghi, Chan Yeob Yeun

Abstract: Metaverse is trending to create a digital circumstance that can transfer the real world to an online platform supported by large quantities of real-time interactions. Pre-trained Artificial Intelligence (AI) models are demonstrating their increasing capability in aiding the metaverse to achieve an excellent response with negligible delay, and nowadays, many large models are collaboratively trained by various participants in a manner named collaborative deep learning (CDL). However, several security weaknesses can threaten the safety of the CDL training process, which might result in fatal attacks to either the pre-trained large model or the local sensitive data sets possessed by an individual entity. In CDL, malicious participants can hide among the innocent majority and silently upload deceptive parameters to degrade the model performance, or they can abuse the downloaded parameters to construct a Generative Adversarial Network (GAN) to acquire the private information of others illegally. To compensate for these vulnerabilities, this paper proposes an adversary detection-deactivation method, which can limit and isolate the access of potential malicious participants, and quarantine and disable the GAN attack or harmful backpropagation of received threatening gradients. A detailed protection analysis has been conducted on a Multiview CDL case, and results show that the protocol can effectively prevent harmful access by heuristic manner analysis and can protect the existing model by swiftly checking received gradients using only one low-cost branch with an embedded firewall.

Submitted 21 October, 2023; originally announced January 2024.

arXiv:2302.04224 [pdf]

Data Poisoning Attacks on EEG Signal-based Risk Assessment Systems

Authors: Zhibo Zhang, Sani Umar, Ahmed Y. Al Hammadi, Sangyoung Yoon, Ernesto Damiani, Chan Yeob Yeun

Abstract: Industrial insider risk assessment using electroencephalogram (EEG) signals has consistently attracted a lot of research attention. However, EEG signal-based risk assessment systems, which could evaluate the emotional states of humans, have shown several vulnerabilities to data poison attacks. In this paper, from the attackers' perspective, data poison attacks involving label-flipping occurring in the training stages of different machine learning models intrude on the EEG signal-based risk assessment systems using these machine learning models. This paper aims to propose two categories of label-flipping methods to attack different machine learning classifiers including Adaptive Boosting (AdaBoost), Multilayer Perceptron (MLP), Random Forest, and K-Nearest Neighbors (KNN) dedicated to the classification of 4 different human emotions using EEG signals. This aims to degrade the performance of the aforementioned machine learning models concerning the classification task. The experimental results show that the proposed data poison attacks are model-agnostically effective whereas different models have different resilience to the data poison attacks.

Submitted 8 February, 2023; originally announced February 2023.

Comments: 2nd International Conference on Business Analytics For Technology and Security (ICBATS)

arXiv:2302.04109 [pdf]

Explainable Label-flipping Attacks on Human Emotion Assessment System

Authors: Zhibo Zhang, Ahmed Y. Al Hammadi, Ernesto Damiani, Chan Yeob Yeun

Abstract: This paper's main goal is to provide an attacker's point of view on data poisoning assaults that use label-flipping during the training phase of systems that use electroencephalogram (EEG) signals to evaluate human emotion. To attack different machine learning classifiers such as Adaptive Boosting (AdaBoost) and Random Forest dedicated to the classification of 4 different human emotions using EEG signals, this paper proposes two scenarios of label-flipping methods. The results of the studies show that the proposed data poison attacks based on label-flipping are successful regardless of the model, but different models show different degrees of resistance to the assaults. In addition, numerous Explainable Artificial Intelligence (XAI) techniques are used to explain the data poison attacks on EEG signal-based human emotion evaluation systems.

Submitted 8 February, 2023; originally announced February 2023.

arXiv:2301.06923 [pdf]

doi 10.1109/ACCESS.2023.3245813

Explainable Data Poison Attacks on Human Emotion Evaluation Systems based on EEG Signals

Authors: Zhibo Zhang, Sani Umar, Ahmed Y. Al Hammadi, Sangyoung Yoon, Ernesto Damiani, Claudio Agostino Ardagna, Nicola Bena, Chan Yeob Yeun

Abstract: The major aim of this paper is to explain the data poisoning attacks using label-flipping during the training stage of the electroencephalogram (EEG) signal-based human emotion evaluation systems deploying Machine Learning models from the attackers' perspective. Human emotion evaluation using EEG signals has consistently attracted a lot of research attention. The identification of human emotional states based on EEG signals is effective to detect potential internal threats caused by insider individuals. Nevertheless, EEG signal-based human emotion evaluation systems have shown several vulnerabilities to data poison attacks. The findings of the experiments demonstrate that the suggested data poison assaults are model-independently successful, although various models exhibit varying levels of resilience to the attacks. In addition, the data poison attacks on the EEG signal-based human emotion evaluation systems are explained with several Explainable Artificial Intelligence (XAI) methods, including Shapley Additive Explanation (SHAP) values, Local Interpretable Model-agnostic Explanations (LIME), and Generated Decision Trees. The code for this paper is publicly available on GitHub.

Submitted 17 January, 2023; originally announced January 2023.

Journal ref: IEEE Access 2023

arXiv:2210.14616

A Late Multi-Modal Fusion Model for Detecting Hybrid Spam E-mail

Authors: Zhibo Zhang, Ernesto Damiani, Hussam Al Hamadi, Chan Yeob Yeun, Fatma Taher

Abstract: In recent years, spammers have been trying to obfuscate their intents by introducing hybrid spam e-mail combining both image and text parts, which is more challenging to detect in comparison to e-mails containing text or image only. The motivation behind this research is to design an effective approach filtering out hybrid spam e-mails to avoid situations where traditional text-based or image-based only filters fail to detect hybrid spam e-mails. To the best of our knowledge, a few studies have been conducted with the goal of detecting hybrid spam e-mails. Ordinarily, Optical Character Recognition (OCR) technology is used to eliminate the image parts of spam by transforming images into text. However, the research questions are that although OCR scanning is a very successful technique in processing text-and-image hybrid spam, it is not an effective solution for dealing with huge quantities due to the CPU power required and the execution time it takes to scan e-mail files. Moreover, OCR techniques are not always reliable in the transformation processes. To address such problems, we propose new late multi-modal fusion training frameworks for a text-and-image hybrid spam e-mail filtering system compared to the classical early fusion detection frameworks based on the OCR method. Convolutional Neural Network (CNN) and Continuous Bag of Words were implemented to extract features from image and text parts of hybrid spam respectively, whereas generated features were fed to a sigmoid layer and Machine Learning based classifiers including Random Forest (RF), Decision Tree (DT), Naive Bayes (NB) and Support Vector Machine (SVM) to determine whether the e-mail is ham or spam.

Submitted 15 May, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: The content of this paper needs to be updated

Journal ref: Index in journal International Journal of Computer Theory and Engineering (IJCTE), 2023

arXiv:2210.11592 [pdf, other]

New data poison attacks on machine learning classifiers for mobile exfiltration

Authors: Miguel A. Ramirez, Sangyoung Yoon, Ernesto Damiani, Hussam Al Hamadi, Claudio Agostino Ardagna, Nicola Bena, Young-Ji Byon, Tae-Yeon Kim, Chung-Suk Cho, Chan Yeob Yeun

Abstract: Most recent studies have shown several vulnerabilities to attacks with the potential to jeopardize the integrity of the model, opening in a few recent years a new window of opportunity in terms of cyber-security. The main interest of this paper is directed towards data poisoning attacks involving label-flipping. These attacks occur during the training phase, where the attacker aims to compromise the integrity of the targeted machine learning model by drastically reducing the overall accuracy of the model and/or achieving the misclassification of determined samples. This paper proposes two new kinds of data poisoning attacks based on label-flipping; the target of the attack is a variety of machine learning classifiers dedicated to malware detection using mobile exfiltration data. The proposed attacks are proven to be model-agnostic, having successfully corrupted a wide variety of machine learning models; Logistic Regression, Decision Tree, Random Forest and KNN are some examples. The first attack performs label-flipping actions randomly while the second attack performs label-flipping on only one of the 2 classes in particular. The effects of each attack are analyzed in further detail with special emphasis on the accuracy drop and the misclassification rate. Finally, this paper pursues further research directions by suggesting the development of a defense technique that could promise feasible detection and/or mitigation mechanisms; such a technique should be capable of conferring a certain level of robustness to a target model against potential attackers.

Submitted 20 October, 2022; originally announced October 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2202.10276

arXiv:2209.14013 [pdf, other]

doi 10.1109/TSUSC.2023.3293269

On the Robustness of Random Forest Against Untargeted Data Poisoning: An Ensemble-Based Approach

Authors: Marco Anisetti, Claudio A. Ardagna, Alessandro Balestrucci, Nicola Bena, Ernesto Damiani, Chan Yeob Yeun

Abstract: Machine learning is becoming ubiquitous. From finance to medicine, machine learning models are boosting decision-making processes and even outperforming humans in some tasks. This huge progress in terms of prediction quality does not however find a counterpart in the security of such models and corresponding predictions, where perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy. Research on poisoning attacks and defenses received increasing attention in the last decade, leading to several promising solutions aiming to increase the robustness of machine learning. Among them, ensemble-based defenses, where different models are trained on portions of the training set and their predictions are then aggregated, provide strong theoretical guarantees at the price of a linear overhead. Surprisingly, ensemble-based defenses, which do not pose any restrictions on the base model, have not been applied to increase the robustness of random forest models. The work in this paper aims to fill in this gap by designing and implementing a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks. An extensive experimental evaluation measures the performance of our approach against a variety of attacks, as well as its sustainability in terms of resource consumption and performance, and compares it with a traditional monolithic model based on random forest. A final discussion presents our main findings and compares our approach with existing poisoning defenses targeting random forests.

Submitted 28 August, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

Comments: Accepted in IEEE Transactions on Sustainable Computing; 15 pages, 8 figures

arXiv:2209.03166 [pdf]

doi 10.1109/ICCR56254.2022.9995839

Explainable Artificial Intelligence to Detect Image Spam Using Convolutional Neural Network

Authors: Zhibo Zhang, Ernesto Damiani, Hussam Al Hamadi, Chan Yeob Yeun, Fatma Taher

Abstract: Image spam threat detection has continually been a popular area of research with the internet's phenomenal expansion. This research presents an explainable framework for detecting spam images using Convolutional Neural Network (CNN) algorithms and Explainable Artificial Intelligence (XAI) algorithms. In this work, we use a CNN model to classify image spam, whereas post-hoc XAI methods including Local Interpretable Model Agnostic Explanation (LIME) and Shapley Additive Explanations (SHAP) were deployed to provide explanations for the decisions that the black-box CNN models made about spam image detection. We train and then evaluate the performance of the proposed approach on a 6636 image dataset including spam images and normal images collected from three different publicly available email corpora. The experimental results show that the proposed framework achieved satisfactory detection results in terms of different performance metrics whereas the model-independent XAI algorithms could provide explanations for the decisions of different models which could be utilized for comparison in future studies.

Submitted 7 September, 2022; originally announced September 2022.

Comments: Under review by International Conference on Cyber Resilience (ICCR), Dubai 2022

arXiv:2208.14937 [pdf]

doi 10.1109/ACCESS.2022.3204051

Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research

Authors: Zhibo Zhang, Hussam Al Hamadi, Ernesto Damiani, Chan Yeob Yeun, Fatma Taher

Abstract: This survey presents a comprehensive review of current literature on Explainable Artificial Intelligence (XAI) methods for cyber security applications. Due to the rapid development of Internet-connected systems and Artificial Intelligence in recent years, Artificial Intelligence including Machine Learning (ML) and Deep Learning (DL) has been widely utilized in the fields of cyber security including intrusion detection, malware detection, and spam filtering. However, although Artificial Intelligence-based approaches for the detection and defense of cyber attacks and threats are more advanced and efficient compared to the conventional signature-based and rule-based cyber security strategies, most ML-based techniques and DL-based techniques are deployed in the black-box manner, meaning that security experts and customers are unable to explain how such procedures reach particular conclusions. The deficiencies of transparency and interpretability of existing Artificial Intelligence techniques would decrease human users' confidence in the models utilized for the defense against cyber attacks, especially in current situations where cyber attacks become increasingly diverse and complicated. Therefore, it is essential to apply XAI in the establishment of cyber security models to create more explainable models while maintaining high accuracy and allowing human users to comprehend, trust, and manage the next generation of cyber defense mechanisms. Although there are papers reviewing Artificial Intelligence applications in cyber security areas and the vast literature on applying XAI in many fields including healthcare, financial services, and criminal justice, the surprising fact is that there are currently no survey research articles that concentrate on XAI applications in cyber security.

Submitted 31 August, 2022; originally announced August 2022.

Comments: Accepted by IEEE Access

Journal ref: IEEE Access 2022

arXiv:2202.10276 [pdf, other]

Poisoning Attacks and Defenses on Artificial Intelligence: A Survey

Authors: Miguel A. Ramirez, Song-Kyoo Kim, Hussam Al Hamadi, Ernesto Damiani, Young-Ji Byon, Tae-Yeon Kim, Chung-Suk Cho, Chan Yeob Yeun

Abstract: Machine learning models have been widely adopted in several fields. However, most recent studies have shown several vulnerabilities from attacks with a potential to jeopardize the integrity of the model, presenting a new window of research opportunity in terms of cyber-security. This survey is conducted with a main intention of highlighting the most relevant information related to security vulnerabilities in the context of machine learning (ML) classifiers; more specifically, directed towards training procedures against data poisoning attacks, representing a type of attack that consists of tampering with the data samples fed to the model during the training phase, leading to a degradation in the model's accuracy during the inference phase. This work compiles the most relevant insights and findings found in the latest existing literature addressing this type of attack. Moreover, this paper also covers several defense techniques that promise feasible detection and mitigation mechanisms, capable of conferring a certain level of robustness to a target model against an attacker. A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions, performing quantitative and qualitative analyses. This paper analyzes the main characteristics for each approach including performance success metrics, required hyperparameters, and deployment complexity. Moreover, this paper emphasizes the underlying assumptions and limitations considered by both attackers and defenders along with their intrinsic properties such as: availability, reliability, privacy, accountability, interpretability, etc. Finally, this paper concludes by making references to some of the main existing research trends that provide pathways towards future research directions in the field of cyber-security.

Submitted 22 February, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

arXiv:2111.09529 [pdf, other]

Blockchain Interoperability in UAV Networks: State-of-the-art and Open Issues

Authors: Ruba Alkadi, Noura Alnuaimi, Abdulhadi Shoufan, Chan Yeun

Abstract: The breakthrough of blockchain technology has facilitated the emergence and deployment of a wide range of Unmanned Aerial Vehicles (UAV) network-based applications. Yet, the full utilization of these applications is still limited due to the fact that each application is operating on an isolated blockchain. Thus, it is inevitable to orchestrate these blockchain fragments by introducing a cross-blockchain platform that governs the inter-communication and transfer of assets in the UAV networks context. In this paper, we provide an up-to-date survey of blockchain-based UAV networks applications. We also survey the literature on the state-of-the-art cross blockchain frameworks to highlight the latest advances in the field. Based on the outcomes of our survey, we introduce a spectrum of scenarios related to UAV networks that may leverage the potentials of the currently available cross-blockchain solutions. Finally, we identify open issues and potential challenges associated with the application of a cross-blockchain scheme for UAV networks that will hopefully guide future research directions.

Submitted 18 November, 2021; originally announced November 2021.

Comments: This paper is submitted to an IEEE journal for possible publication

arXiv:2012.00348 [pdf]

Deep Learning-Based Arrhythmia Detection Using RR-Interval Framed Electrocardiograms

Authors: Song-Kyoo Kim, Chan Yeob Yeun, Paul D. Yoo, Nai-Wei Lo, Ernesto Damiani

Abstract: Deep learning applied to electrocardiogram (ECG) data can be used to achieve personal authentication in biometric security applications, but it has not been widely used to diagnose cardiovascular disorders. We developed a deep learning model for the detection of arrhythmia in which time-sliced ECG data representing the distance between successive R-peaks are used as the input for a convolutional neural network (CNN). The main objective is to develop a compact deep learning-based detection system that uses minimal data but delivers a confident accuracy rate for arrhythmia detection. This compact system can be implemented in wearable devices or real-time monitoring equipment because the feature extraction step is not required for complex ECG waveforms, only the R-peak data is needed. The results of both tests indicated that the Compact Arrhythmia Detection System (CADS) matched the performance of conventional systems for the detection of arrhythmia in two consecutive test runs. All features of the CADS are fully implemented and publicly available in MATLAB.

Submitted 1 December, 2020; originally announced December 2020.

Comments: This paper is considered to be submitted to an international journal

arXiv:1909.05417 [pdf, other]

Deep User Identification Model with Multiple Biometrics

Authors: Hyoung-Kyu Song, Ebrahim AlAlkeem, Jaewoong Yun, Tae-Ho Kim, Tae-Ho Kim, Hyerin Yoo, Dasom Heo, Chan Yeob Yeun, Myungsu Chae

Abstract: Identification using biometrics is an important yet challenging task. Abundant research has been conducted on identifying personal identity or gender using given signals. Various types of biometrics such as electrocardiogram (ECG), electroencephalogram (EEG), face, fingerprint, and voice have been used for these tasks. Most research has only focused on single modality or a single task, while the combination of input modality or tasks is yet to be investigated. In this paper, we propose deep identification and gender classification using multimodal biometrics. Our model uses ECG, fingerprint, and facial data. It then performs two tasks: gender identification and classification. By engaging multi-modality, a single model can handle various input domains without training each modality independently, and the correlation between domains can increase its generalization performance on the tasks.

Submitted 3 September, 2019; originally announced September 2019.

Comments: Accepted, CIKM 2019 Workshop on DTMBio

arXiv:1907.13517 [pdf]

doi 10.1109/ACCESS.2019.2954576

An Enhanced Machine Learning-based Biometric Authentication System Using RR-Interval Framed Electrocardiograms

Authors: Amang Song-Kyoo Kim, Chan Yeob Yeun, Paul D. Yoo

Abstract: This paper is targeted in the area of biometric data enabled security system based on the machine learning for the digital health. The disadvantages of traditional authentication systems include the risks of forgetfulness, loss, and theft. Biometric authentication is therefore rapidly replacing traditional authentication methods and is becoming an everyday part of life. The electrocardiogram (ECG) was recently introduced as a biometric authentication system suitable for security checks. The proposed authentication system helps investigators studying ECG-based biometric authentication techniques to reshape input data by slicing based on the RR-interval, and defines the Overall Performance (OP), which is the combined performance metric of multiple authentication measures. We evaluated the performance of the proposed system using a confusion matrix and achieved up to 95% accuracy by compact data analysis. We also used the Amang ECG (amgecg) toolbox in MATLAB to investigate the upper-range control limit (UCL) based on the mean square error, which directly affects three authentication performance metrics: the accuracy, the number of accepted samples, and the OP. Using this approach, we found that the OP can be optimized by using a UCL of 0.0028, which indicates 61 accepted samples out of 70 and ensures that the proposed authentication system achieves an accuracy of 95%.

Submitted 30 November, 2019; v1 submitted 27 July, 2019; originally announced July 2019.

Comments: The paper has been accepted and published in the IEEE Access

Journal ref: IEEE Access 7 (2019), pp. 168669-168674

arXiv:1907.05887 [pdf]

doi 10.3390/math7111005

A Versatile Queuing System For Sharing Economy Platform Operations

Authors: Song-Kyoo Kim, Chan Yeob Yeun

Abstract: The paper deals with a sharing economy system with various management factors by using a bulk input G/M/1 type queuing model. The effective management of operating costs is vital for controlling the sharing economy platform and this research builds the theoretical background to understand the sharing economy business model. Analytically, the techniques include a classical Markov process of the single channel queueing system, semi-Markov process and semi-regenerative process. It uses the stochastic congruent properties to find the probability distribution of the number of contractors in the sharing economy platform. The obtained explicit formulas demonstrate the usage of functional for the main stochastic characteristics including sharing expenses due to over contracted resources and optimization of their objective function.

Submitted 23 October, 2019; v1 submitted 12 July, 2019; originally announced July 2019.

Comments: This original paper has been published in the Mathematics

MSC Class: 60J10; 60K15; 60K25; 60K20; 90B05; 90B22; 90B50; 90C15

arXiv:1907.00366 [pdf]

doi 10.1109/ACCESS.2019.2937357

An Enhanced Electrocardiogram Biometric Authentication System Using Machine Learning

Authors: Ebrahim Al Alkeem, Song-Kyoo Kim, Chan Yeob Yeun, M. Jamal Zemerly, Kin Poon, Paul D. Yoo

Abstract: Traditional authentication systems use alphanumeric or graphical passwords, or token-based techniques that require "something you know and something you have". The disadvantages of these systems include the risks of forgetfulness, loss, and theft. To address these shortcomings, biometric authentication is rapidly replacing traditional authentication methods and is becoming a part of everyday life. The electrocardiogram (ECG) is one of the most recent traits considered for biometric purposes. In this work we describe an ECG-based authentication system suitable for security checks and hospital environments. The proposed system will help investigators studying ECG-based biometric authentication techniques to define dataset boundaries and to acquire high-quality training data. We evaluated the performance of the proposed system and found that it could achieve up to the 92 percent identification accuracy. In addition, by applying the Amang ECG (amgecg) toolbox within MATLAB, we investigated the two parameters that directly affect the accuracy of authentication: the ECG slicing time (sliding window) and the sampling time period, and found their optimal values.

Submitted 24 September, 2019; v1 submitted 30 June, 2019; originally announced July 2019.

Comments: This paper has been published in the IEEE Access

Journal ref: IEEE Access 7 (2019), pp. 123069-123075

arXiv:1903.12340 [pdf]

doi 10.1109/ACCESS.2019.2927079

A Machine Learning Framework for Biometric Authentication using Electrocardiogram

Authors: Song-Kyoo Kim, Chan Yeob Yeun, Ernesto Damiani, Nai-Wei Lo

Abstract: This paper introduces a framework for how to appropriately adopt and adjust Machine Learning (ML) techniques used to construct Electrocardiogram (ECG) based biometric authentication schemes. The proposed framework can help investigators and developers on ECG based biometric authentication mechanisms define the boundaries of required datasets and get training data with good quality. To determine the boundaries of datasets, use case analysis is adopted. Based on various application scenarios on ECG based authentication, three distinct use cases (or authentication categories) are developed. With more qualified training data given to corresponding machine learning schemes, the precision on ML-based ECG biometric authentication mechanisms is increased in consequence. ECG time slicing technique with the R-peak anchoring is utilized in this framework to acquire ML training data with good quality. In the proposed framework four new measure metrics are introduced to evaluate the quality of ML training and testing data. In addition, a Matlab toolbox, containing all proposed mechanisms, metrics and sample data with demonstrations using various ML techniques, is developed and made publicly available for further investigation. For developing ML-based ECG biometric authentication, the proposed framework can guide researchers to prepare the proper ML setups and the ML training datasets along with three identified user case scenarios. For researchers adopting ML techniques to design new schemes in other research domains, the proposed framework is still useful for generating ML-based training and testing datasets with good quality and utilizing new measure metrics.

Submitted 5 August, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

Comments: This paper has been published in the IEEE Access

Journal ref: IEEE Access 7 (2019), pp. 94858-94868

arXiv:1706.07187 [pdf]

Pay-with-a-Selfie, a human-centred digital payment system

Authors: Ernesto Damiani, Perpetus Jacques Houngbo, Rasool Asal, Stelvio Cimato, Fulvio Frati, Joel T. Honsou, Dina Shehada, Chan Yeob Yeun

Abstract: Mobile payment systems are increasingly used to simplify the way in which money transfers and transactions can be performed. We argue that, to achieve their full potential as economic boosters in developing countries, mobile payment systems need to rely on new metaphors suitable for the business models, lifestyle, and technology availability conditions of the targeted communities. The Pay-with-a-Group-Selfie (PGS) project, funded by the Melinda & Bill Gates Foundation, has developed a micro-payment system that supports everyday small transactions by extending the reach of, rather than substituting, existing payment frameworks. PGS is based on a simple gesture and a readily understandable metaphor. The gesture - taking a selfie - has become part of the lifestyle of mobile phone users worldwide, including non-technology-savvy ones. The metaphor likens computing two visual shares of the selfie to ripping a banknote in two, a technique used for decades for delayed payment in cash-only markets. PGS is designed to work with devices with limited computational power and when connectivity is patchy or not always available. Thanks to visual cryptography techniques PGS uses for computing the shares, the original selfie can be recomposed simply by stacking the shares, preserving the analogy with re-joining the two parts of the banknote.

Submitted 22 June, 2017; originally announced June 2017.

Showing 1–20 of 20 results for author: Yeun, C