Data Quality in Edge Machine Learning: A State-of-the-Art Survey

Authors: Mohammed Djameleddine Belgoumri, Mohamed Reda Bouadjenek, Sunil Aryal, Hakim Hacid

Abstract: Data-driven Artificial Intelligence (AI) systems trained using Machine Learning (ML) are shaping an ever-increasing (in size and importance) portion of our lives, including, but not limited to, recommendation systems, autonomous driving technologies, healthcare diagnostics, financial services, and personalized marketing. On the one hand, the outsized influence of these systems imposes a high standard of quality, particularly in the data used to train them. On the other hand, establishing and maintaining standards of Data Quality (DQ) becomes more challenging due to the proliferation of Edge Computing and Internet of Things devices, along with their increasing adoption for training and deploying ML models. The nature of the edge environment -- characterized by limited resources, decentralized data storage, and processing -- exacerbates data-related issues, making them more frequent, severe, and difficult to detect and mitigate. From these observations, it follows that DQ research for edge ML is a critical and urgent exploration track for the safety and robust usefulness of present and future AI systems. Despite this fact, DQ research for edge ML is still in its infancy. The literature on this subject remains fragmented and scattered across different research communities, with no comprehensive survey to date. Hence, this paper aims to fill this gap by providing a global view of the existing literature from multiple disciplines that can be grouped under the umbrella of DQ for edge ML.
Specifically, we present a tentative definition of data quality in Edge computing, which we use to establish a set of DQ dimensions. We explore each dimension in detail, including existing solutions for mitigation.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 31 pages, 5 figures

arXiv:2405.18843 [pdf]

Data-driven Machinery Fault Detection: A Comprehensive Review

Authors: Dhiraj Neupane, Mohamed Reda Bouadjenek, Richard Dazeley, Sunil Aryal

Abstract: In this era of advanced manufacturing, it's now more crucial than ever to diagnose machine faults as early as possible to guarantee their safe and efficient operation. With the massive surge in industrial big data and advancement in sensing and computational technologies, data-driven Machinery Fault Diagnosis (MFD) solutions based on machine/deep learning approaches have been used ubiquitously in manufacturing. Timely and accurately identifying faulty machine signals is vital in industrial applications for which many relevant solutions have been proposed and are reviewed in many articles. Despite the availability of numerous solutions and reviews on MFD, existing works often lack several aspects. Most of the available literature has limited applicability in a wide range of manufacturing settings due to their concentration on a particular type of equipment or method of analysis. Additionally, discussions regarding the challenges associated with implementing data-driven approaches, such as dealing with noisy data, selecting appropriate features, and adapting models to accommodate new or unforeseen faults, are often superficial or completely overlooked.
Thus, this survey provides a comprehensive review of the articles using different types of machine learning approaches for the detection and diagnosis of various types of machinery faults, highlights their strengths and limitations, provides a review of the methods used for condition-based analyses, comprehensively discusses the available machinery fault datasets, introduces future researchers to the possible challenges they have to encounter while using these approaches for MFD and recommends the probable solutions to mitigate those problems. The future research prospects are also pointed out for a better understanding of the field. We believe this article will help researchers and contribute to the further development of the field.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2404.08511 [pdf, other]

Leveraging Multi-AI Agents for Cross-Domain Knowledge Discovery

Authors: Shiva Aryal, Tuyen Do, Bisesh Heyojoo, Sandeep Chataut, Bichar Dip Shrestha Gurung, Venkataramana Gadhamshetty, Etienne Gnimpieba

Abstract: In the rapidly evolving field of artificial intelligence, the ability to harness and integrate knowledge across various domains stands as a paramount challenge and opportunity. This study introduces a novel approach to cross-domain knowledge discovery through the deployment of multi-AI agents, each specialized in distinct knowledge domains. These AI agents, designed to function as domain-specific experts, collaborate in a unified framework to synthesize and provide comprehensive insights that transcend the limitations of single-domain expertise. By facilitating seamless interaction among these agents, our platform aims to leverage the unique strengths and perspectives of each, thereby enhancing the process of knowledge discovery and decision-making. We present a comparative analysis of the different multi-agent workflow scenarios evaluating their performance in terms of efficiency, accuracy, and the breadth of knowledge integration. Through a series of experiments involving complex, interdisciplinary queries, our findings demonstrate the superior capability of domain specific multi-AI agent system in identifying and bridging knowledge gaps. This research not only underscores the significance of collaborative AI in driving innovation but also sets the stage for future advancements in AI-driven, cross-disciplinary research and application. Our methods were evaluated on a small pilot dataset and showed the trend we expected; if we increase the amount of data used to custom-train the agents, the trend is expected to be smoother.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.04905 [pdf, other]

Review for Handling Missing Data with special missing mechanism

Authors: Youran Zhou, Sunil Aryal, Mohamed Reda Bouadjenek

Abstract: Missing data poses a significant challenge in data science, affecting decision-making processes and outcomes. Understanding what missing data is, how it occurs, and why it is crucial to handle it appropriately is paramount when working with real-world data, especially in tabular data, one of the most commonly used data types in the real world. Three missing mechanisms are defined in the literature: Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), each presenting unique challenges in imputation. Most existing work is focused on MCAR, which is relatively easy to handle. The special missing mechanisms of MNAR and MAR are less explored and understood. This article reviews existing literature on handling missing values. It compares and contrasts existing methods in terms of their ability to handle different missing mechanisms and data types. It identifies research gaps in the existing literature and lays out potential directions for future research in the field. The information in this review will help data analysts and researchers to adopt and promote good practices for handling missing data in real-world problems.

Submitted 7 April, 2024; originally announced April 2024.
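
Illustrative note on the abstract above: the three missing mechanisms differ only in what the probability of a value being missing depends on. Under MCAR it depends on nothing, under MAR it depends on other, observed columns, and under MNAR it depends on the unobserved value itself. The following is a minimal Python sketch of that distinction and is not taken from the paper. The function names, the toy "age" and "income" columns, and all thresholds and probabilities are assumptions made up for illustration.

```python
import random

def mcar_mask(n, p, rng):
    """MCAR: every entry is missing with the same probability p, independent of any data."""
    return [rng.random() < p for _ in range(n)]

def dependent_mask(driver, threshold, p_above, p_below, rng):
    """Missingness whose probability depends on the values in `driver`.

    If `driver` is a different, fully observed column, the target is MAR.
    If `driver` is the target column itself, the target is MNAR.
    """
    return [rng.random() < (p_above if v > threshold else p_below) for v in driver]

def apply_mask(values, mask):
    """Replace masked entries with None (used here as a hypothetical missing-value marker)."""
    return [None if m else v for v, m in zip(values, mask)]

rng = random.Random(0)
age = [rng.uniform(20, 70) for _ in range(1000)]            # toy column, always observed
income = [1000 + 50 * a + rng.gauss(0, 200) for a in age]   # toy column that loses values

income_mcar = apply_mask(income, mcar_mask(len(income), 0.2, rng))
income_mar = apply_mask(income, dependent_mask(age, 45, 0.4, 0.05, rng))       # driven by age
income_mnar = apply_mask(income, dependent_mask(income, 3250, 0.4, 0.05, rng))  # driven by income
```

In this toy setup the MAR and MNAR cases share one helper on purpose: the only difference is whether the column driving the missingness is still observable after masking.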

arXiv:2404.02330 [pdf, other]

Comparative Study of Domain Driven Terms Extraction Using Large Language Models

Authors: Sandeep Chataut, Tuyen Do, Bichar Dip Shrestha Gurung, Shiva Aryal, Anup Khanal, Carol Lushbough, Etienne Gnimpieba

Abstract: Keywords play a crucial role in bridging the gap between human understanding and machine processing of textual data. They are essential to data enrichment because they form the basis for detailed annotations that provide a more insightful and in-depth view of the underlying data. Keyword/domain driven term extraction is a pivotal task in natural language processing, facilitating information retrieval, document summarization, and content categorization. This review focuses on keyword extraction methods, emphasizing the use of three major Large Language Models (LLMs): Llama2-7B, GPT-3.5, and Falcon-7B. We employed a custom Python package to interface with these LLMs, simplifying keyword extraction. Our study, utilizing the Inspec and PubMed datasets, evaluates the performance of these models. The Jaccard similarity index was used for assessment, yielding scores of 0.64 (Inspec) and 0.21 (PubMed) for GPT-3.5, 0.40 and 0.17 for Llama2-7B, and 0.23 and 0.12 for Falcon-7B. This paper underlines the role of prompt engineering in LLMs for better keyword extraction and discusses the impact of hallucination in LLMs on result evaluation. It also sheds light on the challenges in using LLMs for keyword extraction, including model complexity, resource demands, and optimization techniques.

Submitted 2 April, 2024; originally announced April 2024.
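
Illustrative note on the abstract above: the Jaccard similarity index of two sets A and B is |A ∩ B| / |A ∪ B|, so a score of 0.64 means roughly two thirds of the combined keyword set is shared. A minimal Python sketch follows. The keyword lists are made up, and the normalization step (lowercasing and trimming whitespace) is an assumption; the paper's own evaluation code may treat keywords differently.

```python
def jaccard(predicted, gold):
    """Jaccard similarity between two keyword collections, compared as sets."""
    a = {k.strip().lower() for k in predicted}  # assumed normalization
    b = {k.strip().lower() for k in gold}
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets count as identical
    return len(a & b) / len(a | b)

# Hypothetical model output and reference keywords for one document.
predicted = ["Neural Networks", "keyword extraction", "prompt engineering"]
gold = ["keyword extraction", "prompt engineering", "large language models", "evaluation"]

score = jaccard(predicted, gold)  # 2 shared keywords out of 5 distinct ones
```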

arXiv:2403.17333 [pdf, other]

The Pursuit of Fairness in Artificial Intelligence Models: A Survey

Authors: Tahsin Alamgir Kheya, Mohamed Reda Bouadjenek, Sunil Aryal

Abstract: Artificial Intelligence (AI) models are now being utilized in all facets of our lives such as healthcare, education and employment. Since they are used in numerous sensitive environments and make decisions that can be life altering, potential biased outcomes are a pressing matter. Developers should ensure that such models don't manifest any unexpected discriminatory practices like partiality for certain genders, ethnicities or disabled people. With the ubiquitous dissemination of AI systems, researchers and practitioners are becoming more aware of unfair models and are bound to mitigate bias in them. Significant research has been conducted in addressing such issues to ensure models don't intentionally or unintentionally perpetuate bias. This survey offers a synopsis of the different ways researchers have promoted fairness in AI systems. We explore the different definitions of fairness existing in the current literature. We create a comprehensive taxonomy by categorizing different types of bias and investigate cases of biased AI in different application domains. A thorough study is conducted of the approaches and techniques employed by researchers to mitigate bias in AI models. Moreover, we also delve into the impact of biased models on user experience and the ethical considerations to contemplate when developing and deploying such models. We hope this survey helps researchers and practitioners understand the intricate details of fairness and bias in AI systems.
By sharing this thorough survey, we aim to promote additional discourse in the domain of equitable and responsible AI.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 37 pages, 6 figures

arXiv:2403.13013 [pdf, other]

Hierarchical Classification for Intrusion Detection System: Effective Design and Empirical Analysis

Authors: Md. Ashraf Uddin, Sunil Aryal, Mohamed Reda Bouadjenek, Muna Al-Hawawreh, Md. Alamin Talukder

Abstract: With the increased use of network technologies like Internet of Things (IoT) in many real-world applications, new types of cyberattacks have been emerging. To safeguard critical infrastructures from these emerging threats, it is crucial to deploy an Intrusion Detection System (IDS) that can detect different types of attacks accurately while minimizing false alarms. Machine learning approaches have been used extensively in IDS and they are mainly using flat multi-class classification to differentiate normal traffic and different types of attacks. Though cyberattack types exhibit a hierarchical structure where similar granular attack subtypes can be grouped into more high-level attack types, the hierarchical classification approach has not been explored well. In this paper, we investigate the effectiveness of the hierarchical classification approach in IDS. We use a three-level hierarchical classification model to classify various network attacks, where the first level classifies benign or attack, the second level classifies coarse high-level attack types, and the third level classifies granular-level attack types. Our empirical results of using 10 different classification algorithms in 10 different datasets show that there is no significant difference in terms of overall classification performance (i.e., detecting normal and different types of attack correctly) of hierarchical and flat classification approaches.
However, the flat classification approach misclassifies attacks as normal whereas the hierarchical approach misclassifies one type of attack as another attack type. In other words, the hierarchical classification approach significantly reduces the number of attacks misclassified as normal traffic, which is more important in critical systems.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Deakin University, Australia | This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4003

arXiv:2403.13010 [pdf, other]

A Dual-Tier Adaptive One-Class Classification IDS for Emerging Cyberthreats

Authors: Md. Ashraf Uddin, Sunil Aryal, Mohamed Reda Bouadjenek, Muna Al-Hawawreh, Md. Alamin Talukder

Abstract: In today's digital age, our dependence on IoT (Internet of Things) and IIoT (Industrial IoT) systems has grown immensely, which facilitates sensitive activities such as banking transactions and personal, enterprise data, and legal document exchanges. Cyberattackers consistently exploit weak security measures and tools. The Network Intrusion Detection System (IDS) acts as a primary tool against such cyber threats. However, machine learning-based IDSs, when trained on specific attack patterns, often misclassify new emerging cyberattacks. Further, the limited availability of attack instances for training a supervised learner and the ever-evolving nature of cyber threats further complicate the matter. This emphasizes the need for an adaptable IDS framework capable of recognizing and learning from unfamiliar/unseen attacks over time. In this research, we propose a one-class classification-driven IDS system structured on two tiers. The first tier distinguishes between normal activities and attacks/threats, while the second tier determines if the detected attack is known or unknown. Within this second tier, we also embed a multi-classification mechanism coupled with a clustering algorithm. This model not only identifies unseen attacks but also uses them for retraining by clustering unseen attacks. This enables our model to be future-proofed, capable of evolving with emerging threat patterns.
Leveraging one-class classifiers (OCC) at the first level, our approach bypasses the need for attack samples, addressing data imbalance and zero-day attack concerns, and OCC at the second level can effectively separate unknown attacks from the known attacks. Our methodology and evaluations indicate that the presented framework exhibits promising potential for real-world deployments.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Deakin University, Australia | This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4003

arXiv:2403.11180 [pdf, other]

usfAD Based Effective Unknown Attack Detection Focused IDS Framework

Authors: Md. Ashraf Uddin, Sunil Aryal, Mohamed Reda Bouadjenek, Muna Al-Hawawreh, Md. Alamin Talukder

Abstract: The rapid expansion of varied network systems, including the Internet of Things (IoT) and Industrial Internet of Things (IIoT), has led to an increasing range of cyber threats. Ensuring robust protection against these threats necessitates the implementation of an effective Intrusion Detection System (IDS). For more than a decade, researchers have delved into supervised machine learning techniques to develop IDS to classify normal and attack traffic. However, building effective IDS models using supervised learning requires a substantial number of benign and attack samples. To collect a sufficient number of attack samples from real-life scenarios is not possible since cyber attacks occur occasionally. Further, IDS trained and tested on known datasets fails in detecting zero-day or unknown attacks due to the swift evolution of attack patterns. To address this challenge, we put forth two strategies for semi-supervised learning based IDS where training samples of attacks are not required: 1) training a supervised machine learning model using randomly and uniformly dispersed synthetic attack samples; 2) building a One Class Classification (OCC) model that is trained exclusively on benign network traffic. We have implemented both approaches and compared their performances using 10 recent benchmark IDS datasets.
Our findings demonstrate that the OCC model based on the state-of-art anomaly detection technique called usfAD significantly outperforms conventional supervised classification and other OCC based techniques when trained and tested considering real-life scenarios, particularly to detect previously unseen attacks.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Deakin University, Australia | This material is based upon work supported by the Air Force Office of Scientific Research under award number FA2386-23-1-4003

arXiv:2403.02619 [pdf, other]

Training Machine Learning models at the Edge: A Survey

Authors: Aymen Rayane Khouas, Mohamed Reda Bouadjenek, Hakim Hacid, Sunil Aryal

Abstract: Edge Computing (EC) has gained significant traction in recent years, promising enhanced efficiency by integrating Artificial Intelligence (AI) capabilities at the edge. While the focus has primarily been on the deployment and inference of Machine Learning (ML) models at the edge, the training aspect remains less explored. This survey delves into Edge Learning (EL), specifically the optimization of ML model training at the edge. The objective is to comprehensively explore diverse approaches and methodologies in EL, synthesize existing knowledge, identify challenges, and highlight future trends. Utilizing Scopus' advanced search, relevant literature on EL was identified, revealing a concentration of research efforts in distributed learning methods, particularly Federated Learning (FL). This survey further provides a guideline for comparing techniques used to optimize ML for edge learning, along with an exploration of different frameworks, libraries, and simulation tools available for EL. In doing so, the paper contributes to a holistic understanding of the current landscape and future directions in the intersection of edge computing and machine learning, paving the way for informed comparisons between optimization methods and techniques designed for edge learning.

Submitted 13 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 30 pages, 7 figures, submitted to IEEE Communications Surveys & Tutorials

arXiv:2403.00980 [pdf, other]

doi 10.1007/978-3-031-63646-2_3

Even-Ifs From If-Onlys: Are the Best Semi-Factual Explanations Found Using Counterfactuals As Guides?

Authors: Saugat Aryal, Mark T. Keane

Abstract: Recently, counterfactuals using "if-only" explanations have become very popular in eXplainable AI (XAI), as they describe which changes to feature-inputs of a black-box AI system result in changes to a (usually negative) decision-outcome. Even more recently, semi-factuals using "even-if" explanations have gained more attention. They elucidate the feature-input changes that do not change the decision-outcome of the AI system, with a potential to suggest more beneficial recourses. Some semi-factual methods use counterfactuals to the query-instance to guide semi-factual production (so-called counterfactual-guided methods), whereas others do not (so-called counterfactual-free methods). In this work, we perform comprehensive tests of 8 semi-factual methods on 7 datasets using 5 key metrics, to determine whether counterfactual guidance is necessary to find the best semi-factuals. The results of these tests suggest not, but rather that computing other aspects of the decision space leads to better semi-factual XAI.

Submitted 25 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

Comments: 16 pages, 5 figures

Journal ref: 32nd International Conference on Case-Based Reasoning (ICCBR) 2024, Merida, Mexico

arXiv:2402.17807 [pdf, other]

Exploring Gene Regulatory Interaction Networks and predicting therapeutic molecules for Hypopharyngeal Cancer and EGFR-mutated lung adenocarcinoma

Authors: Abanti Bhattacharjya, Md Manowarul Islam, Md Ashraf Uddin, Md. Alamin Talukder, AKM Azad, Sunil Aryal, Bikash Kumar Paul, Wahia Tasnim, Muhammad Ali Abdulllah Almoyad, Mohammad Ali Moni

Abstract: With the advent of Information technology, the Bioinformatics research field is becoming increasingly attractive to researchers and academicians. The recent development of various Bioinformatics toolkits has facilitated the rapid processing and analysis of vast quantities of biological data for human perception. Most studies focus on locating two connected diseases and making some observations to construct diverse gene regulatory interaction networks, a forerunner to general drug design for curing illness. For instance, Hypopharyngeal cancer is a disease that is associated with EGFR-mutated lung adenocarcinoma. In this study, we select EGFR-mutated lung adenocarcinoma and Hypopharyngeal cancer by finding the Lung metastases in hypopharyngeal cancer. To conduct this study, we collect Microarray datasets from GEO (Gene Expression Omnibus), an online database controlled by NCBI. Differentially expressed genes, common genes, and hub genes between the selected two diseases are detected for the succeeding move. Our research findings have suggested common therapeutic molecules for the selected diseases based on 10 hub genes with the highest interactions according to the degree topology method and the maximum clique centrality (MCC). Our suggested therapeutic molecules will be fruitful for patients with those two diseases simultaneously.

Submitted 27 February, 2024; originally announced February 2024.

Comments: Accepted In The FEBS OPEN BIO (Q2, SCOPUS, SCIE, IF: 2.6, CS: 4.7), Wiley Journal, On FEB 25, 2024

arXiv:2402.13277 [pdf, other]

MLSTL-WSN: Machine Learning-based Intrusion Detection using SMOTETomek in WSNs

Authors: Md. Alamin Talukder, Selina Sharmin, Md Ashraf Uddin, Md Manowarul Islam, Sunil Aryal

Abstract: Wireless Sensor Networks (WSNs) play a pivotal role as infrastructures, encompassing both stationary and mobile sensors. These sensors self-organize and establish multi-hop connections for communication, collectively sensing, gathering, processing, and transmitting data about their surroundings. Despite their significance, WSNs face rapid and detrimental attacks that can disrupt functionality. Existing intrusion detection methods for WSNs encounter challenges such as low detection rates, computational overhead, and false alarms. These issues stem from sensor node resource constraints, data redundancy, and high correlation within the network. To address these challenges, we propose an innovative intrusion detection approach that integrates Machine Learning (ML) techniques with the Synthetic Minority Oversampling Technique Tomek Link (SMOTE-TomekLink) algorithm. This blend synthesizes minority instances and eliminates Tomek links, resulting in a balanced dataset that significantly enhances detection accuracy in WSNs. Additionally, we incorporate feature scaling through standardization to render input features consistent and scalable, facilitating more precise training and detection. To counteract imbalanced WSN datasets, we employ the SMOTE-Tomek resampling technique, mitigating overfitting and underfitting issues. Our comprehensive evaluation, using the WSN Dataset (WSN-DS) containing 374,661 records, identifies the optimal model for intrusion detection in WSNs. The standout outcome of our research is the remarkable performance of our model.
In binary, it achieves an accuracy rate of 99.78% and in multiclass, it attains an exceptional accuracy rate of 99.92%. These findings underscore the efficiency and superiority of our proposal in the context of WSN intrusion detection, showcasing its effectiveness in detecting and mitigating intrusions in WSNs.

Submitted 22 February, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: International Journal of Information Security, Springer Journal - Q1, Scopus, ISI, SCIE, IF: 3.2 - Accepted on Jan 17, 2024
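
Illustrative note on the abstract above: the two halves of SMOTE-Tomek resampling can be shown on a toy dataset. SMOTE creates a synthetic minority sample by interpolating between a minority sample and one of its minority-class nearest neighbours. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour. The following pure-Python sketch is an assumption-laden illustration, not the paper's pipeline and not a library implementation: the function names and the six toy points are made up, only one neighbour (k = 1) is used for brevity, and dropping both members of each link is one of several possible cleaning choices.

```python
import math
import random

def nearest(i, points, candidates):
    """Index among `candidates` of the point closest to points[i], excluding i itself."""
    return min((j for j in candidates if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def smote(points, labels, minority, n_new, rng):
    """Create n_new synthetic minority samples by interpolating towards the
    nearest minority neighbour (k = 1 here; real SMOTE picks among k neighbours)."""
    idx = [i for i, y in enumerate(labels) if y == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(idx)
        j = nearest(i, points, idx)
        u = rng.random()  # position along the segment from points[i] to points[j]
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(points[i], points[j])))
    return synthetic

def tomek_links(points, labels):
    """Pairs (i, j), i < j, of opposite-class samples that are mutual nearest neighbours."""
    everyone = range(len(points))
    links = []
    for i in everyone:
        j = nearest(i, points, everyone)
        if labels[i] != labels[j] and i < j and nearest(j, points, everyone) == i:
            links.append((i, j))
    return links

# Toy data: class 0 is the majority; the last class-1 point sits next to a class-0 point.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (1.0, 1.0), (1.2, 1.0), (0.25, 0.1)]
labels = [0, 0, 0, 1, 1, 1]

rng = random.Random(42)
synthetic = smote(points, labels, minority=1, n_new=2, rng=rng)
all_points = points + synthetic
all_labels = labels + [1] * len(synthetic)

# Cleaning step: drop both members of every Tomek link (an assumed policy).
drop = {i for pair in tomek_links(all_points, all_labels) for i in pair}
cleaned = [(p, y) for i, (p, y) in enumerate(zip(all_points, all_labels)) if i not in drop]
```

On the original six points the only Tomek link is the pair at indices 2 and 5, the two opposite-class points that sit 0.05 apart.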

arXiv:2401.11402 [pdf, other]

Enabling clustering algorithms to detect clusters of varying densities through scale-invariant data preprocessing

Authors: Sunil Aryal, Jonathan R. Wells, Arbind Agrahari Baniya, KC Santosh

Abstract: In this paper, we show that preprocessing data using a variant of rank transformation called 'Average Rank over an Ensemble of Sub-samples (ARES)' makes clustering algorithms robust to data representation and enables them to detect varying density clusters. Our empirical results, obtained using three most widely used clustering algorithms, namely KMeans, DBSCAN, and DP (Density Peak), across a wide range of real-world datasets, show that clustering after ARES transformation produces better and more consistent results.

Submitted 20 January, 2024; originally announced January 2024.
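
Illustrative note on the abstract above: the abstract names the ARES transformation but does not spell out its procedure. The sketch below shows one plausible reading of "average rank over an ensemble of sub-samples" and should be treated as an assumption, not as the authors' algorithm. For a single feature column it draws t random sub-samples of size psi, takes the rank of every value within each sub-sample (the number of sub-sample values smaller than it), and averages those ranks. The function name, the parameters t and psi, and their defaults are hypothetical.

```python
import bisect
import math
import random

def average_rank_transform(column, t=10, psi=4, rng=None):
    """Replace each value by its average rank across t random sub-samples of the column."""
    if rng is None:
        rng = random.Random(0)
    totals = [0.0] * len(column)
    for _ in range(t):
        sub = sorted(rng.sample(column, min(psi, len(column))))
        for i, v in enumerate(column):
            totals[i] += bisect.bisect_left(sub, v)  # count of sub-sample values < v
    return [s / t for s in totals]

# The same toy feature expressed on a raw scale and on a log scale.
raw = [1.0, 2.0, 4.0, 8.0, 100.0, 3.0]
logged = [math.log(v) for v in raw]

ranks_raw = average_rank_transform(raw, rng=random.Random(1))
ranks_log = average_rank_transform(logged, rng=random.Random(1))
```

Because ranks depend only on the ordering of values, any strictly increasing rescaling of the feature (such as the logarithm here) yields the same transformed column when the same sub-sample positions are drawn, which is the kind of robustness to data representation the abstract refers to.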

arXiv:2311.16593 [pdf, other]

Empowering COVID-19 Detection: Optimizing Performance Through Fine-Tuned EfficientNet Deep Learning Architecture

Authors: Md. Alamin Talukder, Md. Abu Layek, Mohsin Kazi, Md Ashraf Uddin, Sunil Aryal

Abstract: The worldwide COVID-19 pandemic has profoundly influenced the health and everyday experiences of individuals across the planet. It is a highly contagious respiratory disease requiring early and accurate detection to curb its rapid transmission. Initial testing methods primarily revolved around identifying the genetic composition of the coronavirus, exhibiting a relatively low detection rate and requiring a time-intensive procedure. To address this challenge, experts have suggested using radiological imagery, particularly chest X-rays, as a valuable approach within the diagnostic protocol. This study investigates the potential of leveraging radiographic imaging (X-rays) with deep learning algorithms to swiftly and precisely identify COVID-19 patients. The proposed approach elevates the detection accuracy by fine-tuning with appropriate layers on various established transfer learning models. The experimentation was conducted on a COVID-19 X-ray dataset containing 2000 images. The accuracy achieved was an impressive 100% for the EfficientNetB4 model. The fine-tuned EfficientNetB4 achieved an excellent accuracy score, showcasing its potential as a robust COVID-19 detection model. Furthermore, EfficientNetB4 excelled in identifying Lung disease using Chest X-ray dataset containing 4,350 Images, achieving remarkable performance with an accuracy of 99.17%, precision of 99.13%, recall of 99.16%, and f1-score of 99.14%.
These results highlight the promise of fine-tuned transfer learning for efficient lung detection through medical imaging, especially with X-ray images. This research offers radiologists an effective means of aiding rapid and precise COVID-19 diagnosis and contributes valuable assistance for healthcare professionals in accurately identifying affected patients.

Submitted 28 November, 2023; originally announced November 2023.

Comments: Computers in Biology and Medicine [Q1, IF: 7.7, CS: 9.2]

Journal ref: Computers in Biology and Medicine, Elsevier 2023

arXiv:2310.14120 [pdf, other]

Sentiment Analysis Across Multiple African Languages: A Current Benchmark

Authors: Saurav K. Aryal, Howard Prioleau, Surakshya Aryal

Abstract: Sentiment analysis is a fundamental and valuable task in NLP. However, due to limitations in data and technological availability, research into sentiment analysis of African languages has been fragmented and lacking. With the recent release of the AfriSenti-SemEval Shared Task 12, hosted as a part of The 17th International Workshop on Semantic Evaluation, an annotated sentiment analysis of 14 African languages was made available. We benchmarked and compared current state-of-the-art transformer models across 12 languages and compared the performance of training one-model-per-language versus single-model-all-languages. We also evaluated the performance of standard multilingual models and their ability to learn and transfer cross-lingual representation from non-African to African languages. Our results show that despite work in low resource modeling, more data still produces better models on a per-language basis. Models explicitly developed for African languages outperform other models on all tasks. Additionally, no one-model-fits-all solution exists for a per-language evaluation of the models evaluated. Moreover, for some languages with a smaller sample size, a larger multilingual model may perform better than a dedicated per-language model for sentiment classification.

Submitted 21 October, 2023; originally announced October 2023.

Comments: Accepted to be published as part of SIAIA @ AAAI 2023

Journal ref: AAAI 2023

arXiv:2307.03871 [pdf]

HUMS2023 Data Challenge Result Submission

Authors: Dhiraj Neupane, Lakpa Dorje Tamang, Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Sunil Aryal

Abstract: We implemented a simple method for early detection in this research. The implemented methods are plotting the given mat files and analyzing scalogram images generated by performing Continuous Wavelet Transform (CWT) on the samples. Finding the mean, standard deviation (STD), and peak-to-peak (P2P) values from each signal also helped detect faulty signs. We have implemented the autoregressive integrated moving average (ARIMA) method to track the progression.

Submitted 14 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: This report is being submitted as part of the Data Challenge organized by HUMS2023
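The per-signal statistics named in the abstract above (mean, STD, P2P) can be illustrated with a minimal sketch. The function name `signal_stats` and the use of NumPy are assumptions made here for illustration; the report's own code and data loading are not shown in this listing.

```python
import numpy as np

def signal_stats(signal):
    """Return mean, standard deviation, and peak-to-peak value of a 1-D signal.

    Hypothetical helper for illustration only; not taken from the report.
    """
    x = np.asarray(signal, dtype=float)
    return {
        "mean": float(x.mean()),
        "std": float(x.std()),            # population standard deviation
        "p2p": float(x.max() - x.min()),  # peak-to-peak amplitude
    }

stats = signal_stats([1.0, 2.0, 3.0, 4.0])
```

A sudden rise in `std` or `p2p` between successive recordings is the kind of change such statistics are typically used to flag.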

arXiv:2302.00837 [pdf]

SHINE: Deep Learning-Based Accessible Parking Management System

Authors: Dhiraj Neupane, Aashish Bhattarai, Sunil Aryal, Mohamed Reda Bouadjenek, Uk-Min Seok, Jongwon Seok

Abstract: The ongoing expansion of urban areas facilitated by advancements in science and technology has resulted in a considerable increase in the number of privately owned vehicles worldwide, including in South Korea. However, this gradual increment in the number of vehicles has inevitably led to parking-related issues, including the abuse of disabled parking spaces (hereafter referred to as accessible parking spaces) designated for individuals with disabilities. Traditional license plate recognition (LPR) systems have proven inefficient in addressing such a problem in real-time due to the high frame rate of surveillance cameras, the presence of natural and artificial noise, and variations in lighting and weather conditions that impede detection and recognition by these systems. With the growing concept of parking 4.0, many sensors, IoT and deep learning-based approaches have been applied to automatic LPR and parking management systems. Nonetheless, the studies show a need for a robust and efficient model for managing accessible parking spaces in South Korea. To address this, we have proposed a novel system called SHINE, which uses a deep learning-based object detection algorithm to detect the vehicle, license plate, and disability badges (referred to as cards, badges, or access badges hereafter) and verifies the rights of the driver to use accessible parking spaces by coordinating with the central server. Our model, which achieves a mean average precision of 92.16%, is expected to address the issue of accessible parking space abuse and contribute significantly towards efficient and effective parking management in urban environments.

Submitted 17 October, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

arXiv:2301.11970 [pdf]

Even if Explanations: Prior Work, Desiderata & Benchmarks for Semi-Factual XAI

Authors: Saugat Aryal, Mark T Keane

Abstract: Recently, eXplainable AI (XAI) research has focused on counterfactual explanations as post-hoc justifications for AI-system decisions (e.g. a customer refused a loan might be told: If you asked for a loan with a shorter term, it would have been approved). Counterfactuals explain what changes to the input-features of an AI system change the output-decision. However, there is a sub-type of counterfactual, the semi-factual, that has received less attention in AI (though the Cognitive Sciences have studied them extensively). This paper surveys these literatures to summarise historical and recent breakthroughs in this area. It defines key desiderata for semi-factual XAI and reports benchmark tests of historical algorithms (along with a novel, naive method) to provide a solid basis for future algorithmic developments.

Submitted 8 May, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: 14 pages, 4 Figures

MSC Class: 68-11 ACM Class: H.5.2; I.2.1

Journal ref: 32nd International Joint Conference on Artificial Intelligence (IJCAI-23), China, Macao, 2023

arXiv:2211.02799 [pdf]

Evaluating Novel Mask-RCNN Architectures for Ear Mask Segmentation

Authors: Saurav K. Aryal, Teanna Barrett, Gloria Washington

Abstract: The human ear is generally universal, collectible, distinct, and permanent. Ear-based biometric recognition is a niche and recent approach that is being explored. For any ear-based biometric algorithm to perform well, ear detection and segmentation need to be accurately performed. While significant work has been done in existing literature for bounding boxes, few approaches output a segmentation mask for ears. This paper trains and compares three newer models to the state-of-the-art MaskRCNN (ResNet 101 + FPN) model across four different datasets. The Average Precision (AP) scores reported show that the newer models outperform the state-of-the-art but no one model performs the best over multiple datasets.

Submitted 4 November, 2022; originally announced November 2022.

Comments: Accepted into ICCBS 2022

arXiv:2210.16461 [pdf]

Sentiment Classification of Code-Switched Text using Pre-trained Multilingual Embeddings and Segmentation

Authors: Saurav K. Aryal, Howard Prioleau, Gloria Washington

Abstract: With increasing globalization and immigration, various studies have estimated that about half of the world population is bilingual. Consequently, individuals concurrently use two or more languages or dialects in casual conversational settings. However, most research in natural language processing is focused on monolingual text. To further the work in code-switched sentiment analysis, we propose a multi-step natural language processing algorithm utilizing points of code-switching in mixed text and conduct sentiment analysis around those identified points. The proposed sentiment analysis algorithm uses semantic similarity derived from large pre-trained multilingual models with a handcrafted set of positive and negative words to determine the polarity of code-switched text. The proposed approach outperforms a comparable baseline model by 11.2% for accuracy and 11.64% for F1-score on a Spanish-English dataset. Theoretically, the proposed algorithm can be expanded for sentiment analysis of multiple languages with limited human expertise.

Submitted 28 October, 2022; originally announced October 2022.

arXiv:2210.03325 [pdf, other]

Elastic Step DQN: A novel multi-step algorithm to alleviate overestimation in Deep QNetworks

Authors: Adrian Ly, Richard Dazeley, Peter Vamplew, Francisco Cruz, Sunil Aryal

Abstract: Deep Q-Networks algorithm (DQN) was the first reinforcement learning algorithm using deep neural network to successfully surpass human level performance in a number of Atari learning environments. However, divergent and unstable behaviour have been long standing issues in DQNs. The unstable behaviour is often characterised by overestimation in the $Q$-values, commonly referred to as the overestimation bias. To address the overestimation bias and the divergent behaviour, a number of heuristic extensions have been proposed. Notably, multi-step updates have been shown to drastically reduce unstable behaviour while improving agent's training performance. However, agents are often highly sensitive to the selection of the multi-step update horizon ($n$), and our empirical experiments show that a poorly chosen static value for $n$ can in many cases lead to worse performance than single-step DQN. Inspired by the success of $n$-step DQN and the effects that multi-step updates have on overestimation bias, this paper proposes a new algorithm that we call 'Elastic Step DQN' (ES-DQN). It dynamically varies the step size horizon in multi-step updates based on the similarity of states visited. Our empirical evaluation shows that ES-DQN out-performs $n$-step with fixed $n$ updates, Double DQN and Average DQN in several OpenAI Gym environments while at the same time alleviating the overestimation bias.

Submitted 7 October, 2022; originally announced October 2022.
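For readers unfamiliar with the multi-step update horizon $n$ discussed in the abstract above, the following is a minimal sketch of a standard fixed-$n$ return target. It is not the ES-DQN algorithm: the abstract does not specify how the horizon is varied, so that part is not attempted here. The function name `n_step_target` and its parameters are hypothetical.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Fixed-horizon n-step target:
    r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value,
    where n = len(rewards) and bootstrap_value stands in for max_a Q(s_n, a).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    target = bootstrap_value
    # Fold rewards in from the last step back to the first.
    for r in reversed(rewards):
        target = r + gamma * target
    return target

two_step = n_step_target([1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
```

With `len(rewards) == 1` this reduces to the usual single-step DQN target, which is why the choice of $n$ changes how much the target depends on the bootstrapped estimate.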

arXiv:2209.06085 [pdf, other]

Acoustic-Linguistic Features for Modeling Neurological Task Score in Alzheimer's

Authors: Saurav K. Aryal, Howard Prioleau, Legand Burge

Abstract: The average life expectancy is increasing globally due to advancements in medical technology, preventive health care, and a growing emphasis on gerontological health. Therefore, developing technologies that detect and track aging-associated disease in cognitive function among older adult populations is imperative. In particular, research related to automatic detection and evaluation of Alzheimer's disease (AD) is critical given the disease's prevalence and the cost of current methods. As AD impacts the acoustics of speech and vocabulary, natural language processing and machine learning provide promising techniques for reliably detecting AD. We compare and contrast the performance of ten linear regression models for predicting Mini-Mental Status Exam scores on the ADReSS challenge dataset. We extracted 13000+ handcrafted and learned features that capture linguistic and acoustic phenomena. Using a subset of 54 top features selected by two methods: (1) recursive elimination and (2) correlation scores, we outperform a state-of-the-art baseline for the same task. Upon scoring and evaluating the statistical significance of each of the selected subset of features for each model, we find that, for the given task, handcrafted linguistic features are more significant than acoustic and learned features.

Submitted 13 September, 2022; originally announced September 2022.

Comments: The paper has been accepted to Pacific Symposium on Biocomputing © 2022 World Scientific Publishing Co., Singapore, http://psb.stanford.edu/ and is currently being camera-readied

arXiv:2207.07074 [pdf, other]

doi 10.1021/acs.jpclett.2c02856

Comparative Study of Covalent and van der Waals CdS Quantum Dot Assemblies from Many-Body Perturbation Theory

Authors: Sandip Aryal, Joseph Frimpong, Zhen-Fei Liu

Abstract: Quantum dot (QD) assemblies are nanostructured networks made from aggregates of QDs and feature improved charge and energy transfer efficiencies compared to discrete QDs. Using first-principles many-body perturbation theory, we systematically compare the electronic and optical properties of two types of CdS QD assemblies that have been experimentally investigated: QD gels, where individual QDs are covalently connected via di- or poly-sulfide bonds, and QD nanocrystals, where individual QDs are bound via van der Waals interactions. Our work illustrates how the electronic, excitonic, and optical properties evolve when discrete QDs are assembled into 1D, 2D, and 3D gels and nanocrystals, as well as how the one-body and many-body interactions in these systems impact the trends as the dimensionality of the assembly increases. Furthermore, our work reveals the crucial role of the covalent di- or poly-sulfide bonds in the localization of the excitons, which highlights the difference between QD gels and QD nanocrystals.

Submitted 14 July, 2022; originally announced July 2022.

Comments: 25 pages, 4 figures

Journal ref: J. Phys. Chem. Lett. 13, 10153 (2022)

arXiv:2203.04089 [pdf]

Associating eHealth Policies and National Data Privacy Regulations

Authors: Saurav K. Aryal, Peter A. Keiller

Abstract: As electronic data becomes the lifeline of modern society, privacy concerns increase. These concerns are reflected by the European Union's enactment of the General Data Protection Regulation (GDPR), one of the most comprehensive and robust privacy regulations globally. This project aims to evaluate and highlight associations between eHealth systems' policies and personal data privacy regulations. Using bias-corrected Cramér's V and Theil's U tests, we found weak and zero associations between eHealth systems' rules and protections for data privacy. A simple decision tree model is trained, which validates the association scores obtained.

Submitted 27 February, 2022; originally announced March 2022.

arXiv:2111.04253 [pdf, other]

A Novel Data Pre-processing Technique: Making Data Mining Robust to Different Units and Scales of Measurement

Authors: Arbind Agrahari Baniya, Sunil Aryal, Santosh KC

Abstract: Many existing data mining algorithms use feature values directly in their model, making them sensitive to units/scales used to measure/represent data. Pre-processing of data based on rank transformation has been suggested as a potential solution to overcome this issue. However, the resulting data after pre-processing with rank transformation is uniformly distributed, which may not be very useful in many data mining applications. In this paper, we present a better and effective alternative based on ranks over multiple sub-samples of data. We call the proposed pre-processing technique ARES (Average Rank over an Ensemble of Sub-samples). Our empirical results of widely used data mining algorithms for classification and anomaly detection in a wide range of data sets suggest that ARES results in more consistent task-specific outcomes across various algorithms and data sets. In addition to this, it results in better or competitive outcomes most of the time compared to the most widely used min-max normalisation and the traditional rank transformation.

Submitted 7 November, 2021; originally announced November 2021.

Comments: This paper is published in a special issue of the Australian Journal of Intelligent Information Processing Systems as part of the proceedings of the International Conference on Neural Information Processing (ICONIP) 2019

Journal ref: Australian Journal of Intelligent Information Processing Systems, Vol. 16, Issue: 3 pp.1-8, 2019
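The idea of averaging ranks over sub-samples, as described in the abstract above, can be sketched as follows. This is only one plausible reading of "Average Rank over an Ensemble of Sub-samples": the paper's exact rank definition, sub-sample size, and tie handling are not given in this listing, so the function name `average_rank_transform`, its parameters, and the "count of sub-sample points less than or equal to the value" rank are all assumptions.

```python
import numpy as np

def average_rank_transform(values, n_subsamples=10, subsample_size=8, seed=0):
    """Replace each value of one feature by its rank averaged over random sub-samples.

    Assumed rank definition: number of sub-sample points <= the value.
    Illustrative sketch only; not the authors' reference implementation.
    """
    x = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    size = min(subsample_size, x.size)
    total = np.zeros_like(x)
    for _ in range(n_subsamples):
        # Draw a sub-sample without replacement and sort it once.
        sample = np.sort(rng.choice(x, size=size, replace=False))
        # searchsorted with side="right" counts sub-sample points <= each value.
        total += np.searchsorted(sample, x, side="right")
    return total / n_subsamples

feature_m = np.array([3.0, 1.0, 2.0, 10.0, 7.0])  # e.g. a length in metres
feature_mm = feature_m * 1000.0                     # the same data in millimetres
same = np.allclose(average_rank_transform(feature_m, subsample_size=3),
                   average_rank_transform(feature_mm, subsample_size=3))
```

Because only the ordering of values enters the computation, rescaling the feature (metres to millimetres here) leaves the transformed values unchanged for a fixed seed, which is the unit/scale robustness the abstract refers to.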

arXiv:2107.03178 [pdf, other]

doi 10.1016/j.artint.2021.103525

Levels of explainable artificial intelligence for human-aligned conversational explanations

Authors: Richard Dazeley, Peter Vamplew, Cameron Foale, Charlotte Young, Sunil Aryal, Francisco Cruz

Abstract: Over the last few years there has been rapid research growth into eXplainable Artificial Intelligence (XAI) and the closely aligned Interpretable Machine Learning (IML). Drivers for this growth include recent legislative changes and increased investments by industry and governments, along with increased concern from the general public. People are affected by autonomous decisions every day and the public need to understand the decision-making process to accept the outcomes. However, the vast majority of the applications of XAI/IML are focused on providing low-level 'narrow' explanations of how an individual decision was reached based on a particular datum. While important, these explanations rarely provide insights into an agent's: beliefs and motivations; hypotheses of other (human, animal or AI) agents' intentions; interpretation of external cultural expectations; or, processes used to generate its own explanation. Yet all of these factors, we propose, are essential to providing the explanatory depth that people require to accept and trust the AI's decision-making. This paper aims to define levels of explanation and describe how they can be integrated to create a human-aligned conversational explanation system. In so doing, this paper will survey current approaches and discuss the integration of different technologies to achieve these levels with Broad eXplainable Artificial Intelligence (Broad-XAI), and thereby move towards high-level 'strong' explanations.

Submitted 7 July, 2021; originally announced July 2021.

Comments: 35 pages, 13 figures

Journal ref: Artificial Intelligence, 299, 103525 (2021)

arXiv:2012.15413 [pdf, other]

doi 10.1007/s13755-021-00152-w

New Bag of Deep Visual Words based features to classify chest x-ray images for COVID-19 diagnosis

Authors: Chiranjibi Sitaula, Sunil Aryal

Abstract: Because the infection by Severe Acute Respiratory Syndrome Coronavirus 2 (COVID-19) causes a pneumonia-like effect in the lungs, the examination of chest x-rays can help to diagnose the disease. For automatic analysis of images, they are represented in machines by a set of semantic features. Deep Learning (DL) models are widely used to extract features from images. General deep features may not be appropriate to represent chest x-rays as they have a few semantic regions. Though the Bag of Visual Words (BoVW) based features are shown to be more appropriate for x-ray type of images, existing BoVW features may not capture enough information to differentiate COVID-19 infection from other pneumonia-related infections. In this paper, we propose a new BoVW method over deep features, called Bag of Deep Visual Words (BoDVW), by removing the feature map normalization step and adding a deep features normalization step on the raw feature maps. This helps to preserve the semantics of each feature map that may have important clues to differentiate COVID-19 from pneumonia. We evaluate the effectiveness of our proposed BoDVW features in chest x-ray classification using Support Vector Machine (SVM) to diagnose COVID-19. Our results on a publicly available COVID-19 x-ray dataset reveal that our features produce stable and prominent classification accuracy, particularly differentiating COVID-19 infection from other pneumonia, in shorter computation time compared to the state-of-the-art methods. Thus, our method could be a very useful tool for quick diagnosis of COVID-19 patients on a large scale.

Submitted 28 January, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

Comments: Submitted to Health Information Science and Systems (Springer) for review

Journal ref: Health Information Science and Systems (Springer), 2021

arXiv:2006.03217 [pdf, other]

doi 10.1016/j.knosys.2021.107470

Content and Context Features for Scene Image Representation

Authors: Chiranjibi Sitaula, Sunil Aryal, Yong Xiang, Anish Basnet, Xuequan Lu

Abstract: Existing research in scene image classification has focused on either content features (e.g., visual information) or context features (e.g., annotations). As they capture different information about images which can be complementary and useful to discriminate images of different classes, we suppose the fusion of them will improve classification results. In this paper, we propose new techniques to compute content features and context features, and then fuse them together. For content features, we design multi-scale deep features based on background and foreground information in images. For context features, we use annotations of similar images available on the web to design filter words (a codebook). Our experiments on three widely used benchmark scene datasets using a support vector machine classifier reveal that our proposed context and content features produce better results than existing context and content features, respectively. The fusion of the proposed two types of features significantly outperforms numerous state-of-the-art features.

Submitted 24 April, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

Comments: Submitted to Knowledge-Based Systems (Elsevier) for consideration

Journal ref: Knowledge-based Systems (Elsevier), 2021

arXiv:2006.03199 [pdf, other]

doi 10.1016/j.eswa.2021.115285

Scene Image Representation by Foreground, Background and Hybrid Features

Authors: Chiranjibi Sitaula, Yong Xiang, Sunil Aryal, Xuequan Lu

Abstract: Previous methods for representing scene images based on deep learning primarily consider either the foreground or background information as the discriminating clues for the classification task. However, scene images also require additional information (hybrid) to cope with the inter-class similarity and intra-class variation problems. In this paper, we propose to use hybrid features in addition to foreground and background features to represent scene images. We suppose that these three types of information could jointly help to represent scene images more accurately. To this end, we adopt three VGG-16 architectures pre-trained on ImageNet, Places, and Hybrid (both ImageNet and Places) datasets for the corresponding extraction of foreground, background and hybrid information. All three types of deep features are further aggregated to achieve our final features for the representation of scene images. Extensive experiments on two large benchmark scene datasets (MIT-67 and SUN-397) show that our method produces state-of-the-art classification performance.

Submitted 4 June, 2020; originally announced June 2020.

Comments: Submitted to Expert Systems with Applications (ESWA), 28 pages and 17 images

Journal ref: Expert Systems with Applications (ESWA), 2021

arXiv:2005.02637 [pdf, other]

A Comprehensive Survey on Outlying Aspect Mining Methods

Authors: Durgesh Samariya, Jiangang Ma, Sunil Aryal

Abstract: In recent years, researchers have become increasingly interested in outlying aspect mining. Outlying aspect mining is the task of finding a set of feature(s), where a given data object is different from the rest of the data objects. Remarkably few studies have been designed to address the problem of outlying aspect mining; therefore, little is known among researchers about outlying aspect mining approaches and their strengths and weaknesses. In this work, we have grouped existing outlying aspect mining approaches into three different categories. For each category, we have provided existing work that falls in that category and then provided their strengths and weaknesses. We also offer a time complexity comparison of the current techniques since it is a crucial issue in real-world scenarios. The motive behind this paper is to give a better understanding of the existing outlying aspect mining techniques and how these techniques have been developed.

Submitted 27 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 12 Pages

arXiv:2004.13550

A new effective and efficient measure for outlying aspect mining

Authors: Durgesh Samariya, Sunil Aryal, Kai Ming Ting

Abstract: Outlying Aspect Mining (OAM) aims to find the subspaces (a.k.a. aspects) in which a given query is an outlier with respect to a given dataset. Existing OAM algorithms use traditional distance/density-based outlier scores to rank subspaces. Because these distance/density-based scores depend on the dimensionality of subspaces, they cannot be compared directly between subspaces of different dimensionality. $Z$-score normalisation has been used to make them comparable. It requires computing outlier scores of all instances in each subspace. This adds significant computational overhead on top of already expensive density estimation, making OAM algorithms infeasible to run on large and/or high-dimensional datasets. We also discover that $Z$-score normalisation is inappropriate for OAM in some cases. In this paper, we introduce a new score called SiNNE, which is independent of the dimensionality of subspaces. This enables the scores in subspaces with different dimensionalities to be compared directly without any additional normalisation. Our experimental results revealed that SiNNE produces better or at least the same results as existing scores, and it significantly improves the runtime of an existing OAM algorithm based on beam search.

Submitted 26 May, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

Comments: Co-authors do not agree with the submission of the paper on arXiv
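
The $Z$-score normalisation mentioned in the abstract above (arXiv:2004.13550) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `z_score`, the use of the population standard deviation, and the zero-variance fallback are assumptions made for illustration. It does show why the step is costly: the raw outlier score of every instance in the subspace is needed before the query's score can be normalised.

```python
from statistics import mean, pstdev

def z_score(raw_scores, query_index):
    """Normalise the query's raw outlier score within one subspace.

    raw_scores holds the outlier score of every instance in that subspace
    (hypothetical input; any distance/density-based score could produce it).
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation (an assumption)
    if sigma == 0:
        return 0.0  # all scores identical: the query does not stand out
    return (raw_scores[query_index] - mu) / sigma
```

Normalised values from subspaces of different dimensionality are then on a common scale, at the price of scoring all instances in each candidate subspace.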

arXiv:2003.09773 [pdf, other]

doi 10.1109/IJCNN48605.2020.9207106

HDF: Hybrid Deep Features for Scene Image Representation

Authors: Chiranjibi Sitaula, Yong Xiang, Anish Basnet, Sunil Aryal, Xuequan Lu

Abstract: Nowadays it is prevalent to take features extracted from pre-trained deep learning models as image representations which have achieved promising classification performance. Existing methods usually consider either object-based features or scene-based features only. However, both types of features are important for complex images like scene images, as they can complement each other. In this paper, we propose a novel type of features -- hybrid deep features, for scene images. Specifically, we exploit both object-based and scene-based features at two levels: part image level (i.e., parts of an image) and whole image level (i.e., a whole image), which produces a total number of four types of deep features. Regarding the part image level, we also propose two new slicing techniques to extract part based features. Finally, we aggregate these four types of deep features via the concatenation operator. We demonstrate the effectiveness of our hybrid deep features on three commonly used scene datasets (MIT-67, Scene-15, and Event-8), in terms of the scene image classification task. Extensive comparisons show that our introduced features can produce state-of-the-art classification accuracies which are more consistent and stable than the results of existing features across all datasets.

Submitted 21 March, 2020; originally announced March 2020.

Comments: 8 pages, Accepted in IEEE WCCI 2020 Conference

Journal ref: Proceedings of IJCNN2020

arXiv:1909.12702 [pdf, other]

Improved histogram-based anomaly detector with the extended principal component features

Authors: Sunil Aryal, Arbind Agrahari Baniya, KC Santosh

Abstract: In this era of big data, databases are growing rapidly in terms of the number of records. Fast automatic detection of anomalous records in these massive databases is a challenging task. Traditional distance based anomaly detectors are not applicable in these massive datasets. Recently, a simple but extremely fast anomaly detector using one-dimensional histograms has been introduced. The anomaly score of a data instance is computed as the product of the probability masses of the histogram bins into which it falls in each dimension. It is shown to produce competitive results compared to many state-of-the-art methods in many datasets. Because it assumes data features are independent of each other, it results in poor detection accuracy when there is correlation between features. To address this issue, we propose to increase the feature size by adding more features based on principal components. Our results show that using the original input features together with principal components improves the detection accuracy of histogram-based anomaly detector significantly without compromising much in terms of run-time.

Submitted 27 September, 2019; originally announced September 2019.
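
The histogram-based score described in the abstract above (arXiv:1909.12702) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: equal-width bins, the bin count, the clamping of out-of-range values to the edge bins, and the names `fit_histograms` and `anomaly_score` are all hypothetical choices. The principal-component features proposed in the paper are not included.

```python
from math import prod

def fit_histograms(data, bins=4):
    """Build one equal-width histogram per dimension (bin count is an assumption)."""
    n = len(data)
    models = []
    for column in zip(*data):
        lo, hi = min(column), max(column)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant features
        counts = [0] * bins
        for v in column:
            idx = min(int((v - lo) // width), bins - 1)
            counts[idx] += 1
        # store the probability mass of each bin
        models.append((lo, width, [c / n for c in counts]))
    return models

def anomaly_score(x, models):
    """Product of the bin masses x falls into; lower means more anomalous."""
    masses = []
    for v, (lo, width, mass) in zip(x, models):
        idx = int((v - lo) // width)
        idx = max(0, min(idx, len(mass) - 1))  # clamp out-of-range values (assumption)
        masses.append(mass[idx])
    return prod(masses)
```

Because each dimension is scored independently and the results are multiplied, the score treats features as independent, which is the limitation the abstract attributes to this kind of detector.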

arXiv:1909.10708 [pdf, other]

doi 10.1007/978-3-030-34879-3_31

Unsupervised Deep Features for Privacy Image Classification

Authors: Chiranjibi Sitaula, Yong Xiang, Sunil Aryal, Xuequan Lu

Abstract: Sharing images online poses security threats to a wide range of users due to the unawareness of privacy information. Deep features have been demonstrated to be a powerful representation for images. However, deep features usually suffer from the issues of a large size and requiring a huge amount of data for fine-tuning. In contrast to normal images (e.g., scene images), privacy images are often limited because of sensitive information. In this paper, we propose a novel approach that can work on limited data and generate deep features of smaller size. For training images, we first extract the initial deep features from the pre-trained model and then employ the K-means clustering algorithm to learn the centroids of these initial deep features. We use the learned centroids from training features to extract the final features for each testing image and encode our final features with the triangle encoding. To improve the discriminability of the features, we further perform the fusion of two proposed unsupervised deep features obtained from different layers. Experimental results show that the proposed features outperform state-of-the-art deep features, in terms of both classification accuracy and testing time.

Submitted 24 September, 2019; originally announced September 2019.

Comments: Accepted in PSIVT2019 Conference

Journal ref: PSIVT 2019. Lecture Notes in Computer Science, vol 11854

arXiv:1909.09999 [pdf, other]

doi 10.1007/978-3-030-36718-3_8

Tag-based Semantic Features for Scene Image Classification

Authors: Chiranjibi Sitaula, Yong Xiang, Anish Basnet, Sunil Aryal, Xuequan Lu

Abstract: The existing image feature extraction methods are primarily based on the content and structure information of images, and rarely consider the contextual semantic information. Regarding some types of images such as scenes and objects, the annotations and descriptions of them available on the web may provide reliable contextual semantic information for feature extraction. In this paper, we introduce novel semantic features of an image based on the annotations and descriptions of its similar images available on the web. Specifically, we propose a new method which consists of two consecutive steps to extract our semantic features. For each image in the training set, we initially search the top $k$ most similar images from the internet and extract their annotations/descriptions (e.g., tags or keywords). The annotation information is employed to design a filter bank for each image category and generate filter words (codebook). Finally, each image is represented by the histogram of the occurrences of filter words in all categories. We evaluate the performance of the proposed features in scene image classification on three commonly-used scene image datasets (i.e., MIT-67, Scene15 and Event8). Our method typically produces a lower feature dimension than existing feature extraction methods. Experimental results show that the proposed features generate better classification accuracies than vision based and tag based features, and comparable results to deep learning based features.

Submitted 22 September, 2019; originally announced September 2019.

Comments: Accepted by ICONIP2019 conference

Journal ref: In: Gedeon T., Wong K., Lee M. (eds) Neural Information Processing. ICONIP 2019., vol 11955 (2019)

arXiv:1906.04987 [pdf, other]

doi 10.1109/ACCESS.2019.2925002

Indoor image representation by high-level semantic features

Authors: Chiranjibi Sitaula, Yong Xiang, Yushu Zhang, Xuequan Lu, Sunil Aryal

Abstract: Indoor image features extraction is a fundamental problem in multiple fields such as image processing, pattern recognition, robotics and so on. Nevertheless, most of the existing feature extraction methods, which extract features based on pixels, color, shape/object parts or objects on images, suffer from limited capabilities in describing semantic information (e.g., object association). These techniques, therefore, involve undesired classification performance. To tackle this issue, we propose the notion of high-level semantic features and design four steps to extract them. Specifically, we first construct the objects pattern dictionary through extracting raw objects in the images, and then retrieve and extract semantic objects from the objects pattern dictionary. We finally extract our high-level semantic features based on the calculated probability and delta parameter. Experiments on three publicly available datasets (MIT-67, Scene15 and NYU V1) show that our feature extraction approach outperforms state-of-the-art feature extraction methods for indoor image classification, given a lower dimension of our features than those methods.

Submitted 11 July, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: This paper has been accepted in IEEE Access

Journal ref: IEEE Access 7, 2019

arXiv:1902.03402 [pdf, ps, other]

A new simple and effective measure for bag-of-word inter-document similarity measurement

Authors: Sunil Aryal, Kai Ming Ting, Takashi Washio, Gholamreza Haffari

Abstract: To measure the similarity of two documents in the bag-of-words (BoW) vector representation, different term weighting schemes are used to improve the performance of cosine similarity, the most widely used inter-document similarity measure in text mining. In this paper, we identify the shortcomings of the underlying assumptions of term weighting in the inter-document similarity measurement task; and provide a more fit-to-the-purpose alternative. Based on this new assumption, we introduce a new simple but effective similarity measure which does not require explicit term weighting. The proposed measure employs a more nuanced probabilistic approach than those used in term weighting to measure the similarity of two documents w.r.t. each term occurring in the two documents. Our empirical comparison with the existing similarity measures using different term weighting schemes shows that the new measure produces (i) better results in the binary BoW representation; and (ii) competitive and more consistent results in the term-frequency-based BoW representation.

Submitted 9 February, 2019; originally announced February 2019.

arXiv:1603.06043 [pdf, ps, other]

Hamburger moment sequences and their moment subsequences

Authors: Saroj Aryal, Hayoung Choi, Farhad Jafari

Abstract: In this paper a connection between Hamburger moment sequences and their moment subsequences is given and the determinacy of these problems are related.

Submitted 18 March, 2016; originally announced March 2016.

Comments: arXiv admin note: text overlap with arXiv:1203.5859

arXiv:1603.06040

Sparse Hamburger Moment Sequences and Completions in Several Variables

Authors: Saroj Aryal, Hayoung Choi, Farhad Jafari

Abstract: Putinar and Vasilescu have given an algebraic characterization of Hamburger moment sequences in several variables. In this paper we give a characterization of sparse moment subsequences of Hamburger moment sequences and consider the problem of completion of these moment subsequences.

Submitted 8 April, 2016; v1 submitted 18 March, 2016; originally announced March 2016.

Comments: This paper has been withdrawn by the author due to almost same results on another paper

arXiv:1306.3540 [pdf]

doi 10.1111/jace.12987

Crystal Structure and Elastic Properties of Hypothesized MAX Phase-like Compound (Cr2Hf)2Al3C3

Authors: Yuxiang Mo, Sitaram Aryal, Paul Rulis, Wai-Yim Ching

Abstract: The term "MAX phase" refers to a very interesting and important class of layered ternary transition-metal carbides and nitrides with a novel combination of both metal and ceramic-like properties that have made these materials highly regarded candidates for numerous technological and engineering applications. Using (Cr2Hf)2Al3C3 as an example, we demonstrate the possibility of incorporating more types of elements into a MAX phase while maintaining the crystallinity, instead of creating solid-solution phases. The crystal structure and elastic properties of MAX-like (Cr2Hf)2Al3C3 are studied using the Vienna Ab initio Simulation Package. Unlike MAX phases with a hexagonal symmetry (P63/mmc, #194), (Cr2Hf)2Al3C3 crystallizes in the monoclinic space group of P21/m (#11) with lattice parameters of a = 5.1739 Å, b = 5.1974 Å, c = 12.8019 Å; α = β = 90°, γ = 119.8509°. Its structure is found to be energetically much more favorable with an energy (per formula unit) of -102.11 eV, significantly lower than those of the allotropic segregation (-100.05 eV) and solid-solution (-100.13 eV) phases. Calculations using a stress vs. strain approach and the VRH approximation for polycrystals also show that (Cr2Hf)2Al3C3 has outstanding elastic moduli.

Submitted 20 February, 2014; v1 submitted 14 June, 2013; originally announced June 2013.

Journal ref: Journal of the American Ceramic Society, 2014, 97, 6

arXiv:1211.1341 [pdf, ps, other]

doi 10.1088/0004-6256/145/1/15

New Young Star Candidates in BRC 27 and BRC 34

Authors: L. M. Rebull, C. H. Johnson, J. C. Gibbs, M. Linahan, D. Sartore, R. Laher, M. Legassie, J. D. Armstrong, L. E. Allen, P. McGehee, D. L. Padgett, S. Aryal, K. S. Badura, T. S. Canakapalli, S. Carlson, M. Clark, N. Ezyk, J. Fagan, N. Killingstad, S. Koop, T. McCanna, M. M. Nishida, T. R. Nuthmann, A. O'Bryan, A. Pullinger, et al. (4 additional authors not shown)

Abstract: We used archival Spitzer Space Telescope mid-infrared data to search for young stellar objects (YSOs) in the immediate vicinity of two bright-rimmed clouds, BRC 27 (part of CMa R1) and BRC 34 (part of the IC 1396 complex). These regions both appear to be actively forming young stars, perhaps triggered by the proximate OB stars. In BRC 27, we find clear infrared excesses around 22 of the 26 YSOs or YSO candidates identified in the literature, and identify 16 new YSO candidates that appear to have IR excesses. In BRC 34, the one literature-identified YSO has an IR excess, and we suggest 13 new YSO candidates in this region, including a new Class I object. Considering the entire ensemble, both BRCs are likely of comparable ages, within the uncertainties of small number statistics and without spectroscopy to confirm or refute the YSO candidates. Similarly, no clear conclusions can yet be drawn about any possible age gradients that may be present across the BRCs.

Submitted 6 November, 2012; originally announced November 2012.

Comments: 54 pages, 19 figures, accepted by AJ

arXiv:1203.5859 [pdf, ps, other]

Sparse Hamburger Moment Sequences

Authors: Saroj Aryal, Hayoung Choi, Farhad Jafari

Abstract: Putinar and Vasilescu [6] have given an algebraic characterization of Hamburger moment sequences in several variables. In this paper we study some sparse moment subsequences of Hamburger moment sequences and consider the problem of completion of these moment subsequences.

Submitted 11 April, 2016; v1 submitted 26 March, 2012; originally announced March 2012.

Comments: The previously submitted paper has been re-written extensively and needs to be replaced with this one

MSC Class: 44A60

arXiv:0911.4159 [pdf]

Comparison of Fluid Attenuated Inversion Recovery Sequence with Spin Echo T2-Weighted MRI for Characterization of Brain Pathology

Authors: Indra Dev Sahu, Sheshkant Aryal, Shanta Lal Shrestha, Ram Kumar Ghimire

Abstract: Twenty cases of different brain pathology have been studied via MRI using an open resistive magnet with magnetic field strength of 0.2 Tesla. The relative signal intensity with respect to the repetition time (TR) at fixed echo time (TE) 0.117 sec has been studied. It was found that the signal intensity saturates for most lesions beyond a certain TR ~6 sec in the T2-weighted image. The signal intensity differs with respect to the inversion time (TI) for fat and cerebrospinal fluid (CSF). It was found that the intensity is nulled for CSF at TI ~1.5 sec and for fat at TI ~0.10 sec in the FLAIR imaging sequence. Thus the intensity of the lesions is qualitatively different for the two sequences. From the radiological diagnostic point of view, it was concluded that the FLAIR sequence is more useful for the detection of lesions compared to T2 sequences.

Submitted 5 December, 2009; v1 submitted 21 November, 2009; originally announced November 2009.

Comments: 9 pages, 6 figures

Showing 1–44 of 44 results for author: Aryal, S