Search | arXiv e-print repository

arXiv:2406.07716 [pdf]

Unleashing the Power of Transfer Learning Model for Sophisticated Insect Detection: Revolutionizing Insect Classification

Authors: Md. Mahmudul Hasan, SM Shaqib, Ms. Sharmin Akter, Rabiul Alam, Afraz Ul Haque, Shahrun akter khushbu

Abstract: The purpose of the Insect Detection System for Crop and Plant Health is to monitor and identify insect infestations in farming areas. By utilizing cutting-edge technology like computer vision and machine learning, the system seeks to identify hazardous insects early and accurately. This would enable prompt response to save crops and maintain optimal plant health. The method of this study includes data acquisition, preprocessing, data splitting, model implementation, and model evaluation. Four models were used in this study: MobileNetV2, ResNet152V2, Xception, and a custom CNN. In order to categorize insect photos, a Convolutional Neural Network (CNN) based on the ResNet152V2 architecture is constructed and evaluated in this work. Achieving 99% training accuracy and 97% testing accuracy, ResNet152V2 demonstrates the best performance among the four implemented models. The results highlight its potential for real-world applications in insect classification and entomology studies, emphasizing efficiency and accuracy. To ensure food security and sustain agricultural output globally, finding insects is crucial. Cutting-edge technology, such as ResNet152V2 models, greatly helps in automating insect identification and improving its accuracy. Efficient insect detection not only minimizes crop losses but also enhances agricultural productivity, contributing to sustainable food production. This underscores the pivotal role of technology in addressing challenges related to global food security.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2404.02269 [pdf, other]

Extracting Norms from Contracts Via ChatGPT: Opportunities and Challenges

Authors: Amanul Haque, Munindar P. Singh

Abstract: We investigate the effectiveness of ChatGPT in extracting norms from contracts. Norms provide a natural way to engineer multiagent systems by capturing how to govern the interactions between two or more autonomous parties. We extract norms of commitment, prohibition, authorization, and power, along with associated norm elements (the parties involved, antecedents, and consequents) from contracts. Our investigation reveals ChatGPT's effectiveness and limitations in norm extraction from contracts. ChatGPT demonstrates promising performance in norm extraction without requiring training or fine-tuning, thus obviating the need for annotated data, which is not generally available in this domain. However, we found some limitations of ChatGPT in extracting these norms that lead to incorrect norm extractions. The limitations include oversight of crucial details, hallucination, incorrect parsing of conjunctions, and empty norm elements. Enhanced norm extraction from contracts can foster the development of more transparent and trustworthy formal agent interaction specifications, thereby contributing to the improvement of multiagent systems.

Submitted 2 April, 2024; originally announced April 2024.

Comments: Accepted at COINE-AAMAS 2024

arXiv:2403.14643 [pdf]

doi 10.1007/s43681-024-00435-4

Exploring ChatGPT and its Impact on Society

Authors: Md. Asraful Haque, Shuai Li

Abstract: Artificial intelligence has been around for a while, but it has suddenly received more attention than ever before, thanks to innovations from companies like Google, Microsoft, Meta, and other major technology brands. OpenAI, though, pushed the button with its ground-breaking invention, ChatGPT. ChatGPT is a Large Language Model (LLM) based on the Transformer architecture that has the ability to generate human-like responses in a conversational context. It uses deep learning algorithms to generate natural language responses to input text. Its large number of parameters, contextual generation, and open-domain training make it a versatile and effective tool for a wide range of applications, from chatbots to customer service to language translation. It has the potential to revolutionize various industries and transform the way we interact with technology. However, the use of ChatGPT has also raised several concerns, including ethical, social, and employment challenges, which must be carefully considered to ensure the responsible use of this technology. The article provides an overview of ChatGPT, delving into its architecture and training process. It highlights the potential impacts of ChatGPT on society. In this paper, we suggest some approaches involving technology, regulation, education, and ethics in an effort to maximize ChatGPT's benefits while minimizing its negative impacts. This study is expected to contribute to a greater understanding of ChatGPT and aid in predicting the potential changes it may bring about.

Submitted 25 March, 2024; v1 submitted 21 February, 2024; originally announced March 2024.

Comments: 13 Pages

MSC Class: 68Txx

Journal ref: AI and Ethics (2024)

arXiv:2311.02102 [pdf]

Notion of Explainable Artificial Intelligence -- An Empirical Investigation from A Users Perspective

Authors: AKM Bahalul Haque, A. K. M. Najmul Islam, Patrick Mikalef

Abstract: The growing attention to artificial intelligence-based applications has led to research interest in explainability issues. This emerging research attention on explainable AI (XAI) advocates the need to investigate end user-centric explainable AI. Thus, this study aims to investigate user-centric explainable AI and considers recommendation systems as the study context. We conducted focus group interviews to collect qualitative data on the recommendation system. We asked participants about the end users' comprehension of a recommended item, its probable explanation, and their opinion of making a recommendation explainable. Our findings reveal that end users want a non-technical and tailor-made explanation with on-demand supplementary information. Moreover, we also observed users requiring an explanation about personal data usage, detailed user feedback, and authentic and reliable explanations. Finally, we propose a synthesized framework that aims at involving the end user in the development process for requirements collection and validation.

Submitted 1 November, 2023; originally announced November 2023.

Comments: 26 Pages, 3 Figures, 1 Table, Accepted version for publication in European Conference on Information Systems (ECIS), 2023

arXiv:2311.01996 [pdf]

Detection of keratoconus Diseases using deep Learning

Authors: AKM Enzam-Ul Haque, Golam Rabbany, Md. Siam

Abstract: One of the most serious corneal disorders, keratoconus is difficult to diagnose in its early stages and can result in blindness. This illness, which often appears in the second decade of life, affects people of all sexes and races. Convolutional neural networks (CNNs), one of the deep learning approaches, have recently come to light as particularly promising tools for the accurate and timely diagnosis of keratoconus. The purpose of this study was to evaluate how well different D-CNN models identified keratoconus-related diseases. To be more precise, we compared five different CNN-based deep learning architectures (DenseNet201, InceptionV3, MobileNetV2, VGG19, Xception). In our extensive experimental analysis, the DenseNet201-based model performed very well in keratoconus disease identification. This model outperformed its D-CNN equivalents, with an accuracy of 89.14% across three classes: Keratoconus, Normal, and Suspect. The results demonstrate not only the stability and robustness of the model but also its practical usefulness in real-world applications for accurate and dependable keratoconus identification. Beyond accuracy, the DenseNet201 D-CNN also performs very well in terms of precision, recall, and F1 score. These measures validate the model's usefulness as an effective diagnostic tool by highlighting its capacity to reliably detect instances of keratoconus and to reduce false positives and negatives.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2307.02412 [pdf]

Android Malware Detection using Machine learning: A Review

Authors: Md Naseef-Ur-Rahman Chowdhury, Ahshanul Haque, Hamdy Soliman, Mohammad Sahinur Hossen, Tanjim Fatima, Imtiaz Ahmed

Abstract: Malware for Android is becoming increasingly dangerous to the safety of mobile devices and the data they hold. Although machine learning (ML) techniques have been shown to be effective at detecting malware for Android, a comprehensive analysis of the methods used is required. We review the current state of Android malware detection using machine learning in this paper. We begin by providing an overview of Android malware and the security issues it causes. Then, we look at the various supervised, unsupervised, and deep learning machine learning approaches that have been utilized for Android malware detection. Additionally, we present a comparison of the performance of various Android malware detection methods and talk about the performance evaluation metrics that are utilized to evaluate their efficacy. Finally, we draw attention to the drawbacks and difficulties of the methods that are currently in use and suggest possible future directions for research in this area. In addition to providing insights into the current state of Android malware detection using machine learning, our review provides a comprehensive overview of the subject.

Submitted 15 March, 2023; originally announced July 2023.

Comments: 22 pages, 2 figures, IntelliSys 2023

arXiv:2306.03412 [pdf, other]

DEK-Forecaster: A Novel Deep Learning Model Integrated with EMD-KNN for Traffic Prediction

Authors: Sajal Saha, Sudipto Baral, Anwar Haque

Abstract: Internet traffic volume estimation has a significant impact on the business policies of the ISP (Internet Service Provider) industry and business successions. Forecasting the internet traffic demand helps to shed light on the future traffic trend, which is often helpful for ISPs' decision-making in network planning activities and investments. Besides, the capability to understand future trends contributes to managing regular and long-term operations. This study aims to predict the network traffic volume demand using deep sequence methods that incorporate Empirical Mode Decomposition (EMD) based noise reduction, Empirical rule based outlier detection, and $K$-Nearest Neighbour (KNN) based outlier mitigation. In contrast to the former studies, the proposed model does not rely on a particular EMD decomposed component called Intrinsic Mode Function (IMF) for signal denoising. In our proposed traffic prediction model, we used an average of all IMF components for signal denoising. Moreover, the abnormal data points are replaced by the average of the $K$ nearest data points, and the value for $K$ has been optimized based on the KNN regressor prediction error measured in Root Mean Squared Error (RMSE). Finally, we selected the best time-lagged feature subset for our prediction model based on AutoRegressive Integrated Moving Average (ARIMA) and Akaike Information Criterion (AIC) value. Our experiments are conducted on real-world internet traffic datasets from industry, and the proposed method is compared with various traditional deep sequence baseline models. Our results show that the proposed EMD-KNN integrated prediction models outperform comparative models.

Submitted 6 June, 2023; originally announced June 2023.

Comments: 13 pages, 9 figures

arXiv:2306.00421 [pdf, other]

Introduction to Medical Imaging Informatics

Authors: Md. Zihad Bin Jahangir, Ruksat Hossain, Riadul Islam, MD Abdullah Al Nasim, Md. Mahim Anjum Haque, Md Jahangir Alam, Sajedul Talukder

Abstract: Medical imaging informatics is a rapidly growing field that combines the principles of medical imaging and informatics to improve the acquisition, management, and interpretation of medical images. This chapter introduces the basic concepts of medical imaging informatics, including image processing, feature engineering, and machine learning. It also discusses the recent advancements in computer vision and deep learning technologies and how they are used to develop new quantitative image markers and prediction models for disease detection, diagnosis, and prognosis prediction. By covering the basic knowledge of medical imaging informatics, this chapter provides a foundation for understanding the role of informatics in medicine and its potential impact on patient care.

Submitted 17 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 18 pages, 11 figures, 2 tables; Acceptance of the chapter for the Springer book "Data-driven approaches to medical imaging"

arXiv:2303.12789 [pdf, other]

Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions

Authors: Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa

Abstract: We propose a method for editing NeRF scenes with text-instructions. Given a NeRF of a scene and the collection of images used to reconstruct it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene, resulting in an optimized 3D scene that respects the edit instruction. We demonstrate that our proposed method is able to edit large-scale, real-world scenes, and is able to accomplish more realistic, targeted edits than prior work.

Submitted 1 June, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: Project website: https://instruct-nerf2nerf.github.io; v1. Revisions to related work and discussion

arXiv:2303.11434 [pdf, other]

ResDTA: Predicting Drug-Target Binding Affinity Using Residual Skip Connections

Authors: Partho Ghosh, Md. Aynal Haque

Abstract: The discovery of novel drug target (DT) interactions is an important step in the drug development process. The majority of computer techniques for predicting DT interactions have focused on binary classification, with the goal of determining whether or not a DT pair interacts. Protein ligand interactions, on the other hand, assume a continuous range of binding strength values, also known as binding affinity, and forecasting this value remains a difficulty. As the amount of affinity data in DT knowledge-bases grows, advanced learning techniques such as deep learning architectures can be used to predict binding affinities. In this paper, we present a deep-learning-based methodology for predicting DT binding affinities using just sequence information from both targets and drugs. The results show that the proposed deep learning-based model that uses the 1D representations of targets and drugs is an effective approach for drug target binding affinity prediction and it does not require additional chemical domain knowledge to work with. The best-performing model constructs high-level representations of a drug and a target via CNNs that use residual skip connections, with an additional stream that creates a high-level combined representation of the drug-target pair. It achieved the best Concordance Index (CI) performance on one of the largest benchmark datasets, outperforming the recent state-of-the-art method AttentionDTA and many other machine-learning and deep-learning based baseline methods for DT binding affinity prediction that use the 1D representations of targets and drugs.

Submitted 20 March, 2023; originally announced March 2023.

Comments: 40 pages, 10 figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:1801.10193, arXiv:1902.04166 by other authors

arXiv:2303.09645 [pdf]

Development of a Voice Controlled Robotic Arm

Authors: Akkas U. Haque, Humayun Kabir, S. C. Banik, M. T. Islam

Abstract: This paper describes a robotic arm with 5 degrees-of-freedom (DOF) which is controlled by human voice and has been developed in the Mechatronics Laboratory, CUET. This robotic arm is interfaced with a PC by serial communication (RS-232). The user's voice command is captured by a microphone, and this voice is processed by software developed in Microsoft Visual Studio. Then the specific signal (obtained by signal processing) is sent to the control unit. The main control unit used in the robotic arm is a PIC18F452 microcontroller. The control unit then drives the actuators (Hitec HS-422, HS-81) according to the signal or signals to give the required motion of the robotic arm. At present the robotic arm can perform set actions like pick & pull, gripping, holding & releasing, and some extra functions like dance-like movement, and can turn according to the voice commands.

Submitted 16 March, 2023; originally announced March 2023.

arXiv:2303.08823 [pdf]

Wireless Sensor Networks anomaly detection using Machine Learning: A Survey

Authors: Ahsnaul Haque, Md Naseef-Ur-Rahman Chowdhury, Hamdy Soliman, Mohammad Sahinur Hossen, Tanjim Fatima, Imtiaz Ahmed

Abstract: Wireless Sensor Networks (WSNs) have become increasingly valuable in various civil/military applications like industrial process control, civil engineering applications such as building structural strength monitoring, environmental monitoring, border intrusion, IoT (Internet of Things), and healthcare. However, the sensed data generated by WSNs is often noisy and unreliable, making it a challenge to detect and diagnose anomalies. Machine learning (ML) techniques have been widely used to address this problem by detecting and identifying unusual patterns in the sensed data. This survey paper provides an overview of the state-of-the-art applications of ML techniques for data anomaly detection in WSN domains. We first introduce the characteristics of WSNs and the challenges of anomaly detection in WSNs. Then, we review various ML techniques such as supervised, unsupervised, and semi-supervised learning that have been applied to WSN data anomaly detection. We also compare different ML-based approaches and their performance evaluation metrics. Finally, we discuss open research challenges and future directions for applying ML techniques in WSN sensed data anomaly detection.

Submitted 15 March, 2023; originally announced March 2023.

Comments: 19 pages, 4 figures, IntelliSys 2023

arXiv:2302.13991 [pdf, other]

doi 10.1109/JBHI.2024.3372999

Learning to Generalize towards Unseen Domains via a Content-Aware Style Invariant Model for Disease Detection from Chest X-rays

Authors: Mohammad Zunaed, Md. Aynal Haque, Taufiq Hasan

Abstract: Performance degradation due to distribution discrepancy is a longstanding challenge in intelligent imaging, particularly for chest X-rays (CXRs). Recent studies have demonstrated that CNNs are biased toward styles (e.g., uninformative textures) rather than content (e.g., shape), in stark contrast to the human vision system. Radiologists tend to learn visual cues from CXRs and thus perform well across multiple domains. Motivated by this, we employ the novel on-the-fly style randomization modules at both image (SRM-IL) and feature (SRM-FL) levels to create rich style perturbed features while keeping the content intact for robust cross-domain performance. Previous methods simulate unseen domains by constructing new styles via interpolation or swapping styles from existing data, limiting them to available source domains during training. However, SRM-IL samples the style statistics from the possible value range of a CXR image instead of the training data to achieve more diversified augmentations. Moreover, we utilize pixel-wise learnable parameters in the SRM-FL compared to pre-defined channel-wise mean and standard deviations as style embeddings for capturing more representative style features. Additionally, we leverage consistency regularizations on global semantic features and predictive distributions from versions of the same CXR with and without style perturbation to tweak the model's sensitivity toward content markers for accurate predictions. Our proposed method, trained on CheXpert and MIMIC-CXR datasets, achieves 77.32$\pm$0.35, 88.38$\pm$0.19, 82.63$\pm$0.13 AUCs(%) on the unseen domain test datasets, i.e., BRAX, VinDr-CXR, and NIH chest X-ray14, respectively, compared to 75.56$\pm$0.80, 87.57$\pm$0.46, 82.07$\pm$0.19 from state-of-the-art models on five-fold cross-validation with statistically significant results in thoracic disease classification.

Submitted 6 March, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: Accepted to IEEE Journal of Biomedical and Health Informatics

arXiv:2302.13380 [pdf]

doi 10.1002/adma.202302575

Closed-loop Error Correction Learning Accelerates Experimental Discovery of Thermoelectric Materials

Authors: Hitarth Choubisa, Md Azimul Haque, Tong Zhu, Lewei Zeng, Maral Vafaie, Derya Baran, Edward H Sargent

Abstract: The exploration of thermoelectric materials is challenging considering the large materials space, combined with added exponential degrees of freedom coming from doping and the diversity of synthetic pathways. Here we seek to incorporate historical data and update and refine it using experimental feedback by employing error-correction learning (ECL). We thus learn from prior datasets and then adapt the model to differences in synthesis and characterization that are otherwise difficult to parameterize. We then apply this strategy to discovering thermoelectric materials where we prioritize synthesis at temperatures < 300°C. We document a previously unreported chemical family of thermoelectric materials, PbSe:SnSb, finding that the best candidate in this chemical family, 2 wt% SnSb doped PbSe, exhibits a power factor more than 2x that of PbSe. Our investigations show that our closed-loop experimentation strategy reduces the required number of experiments to find an optimized material by as much as 3x compared to high-throughput searches powered by state-of-the-art machine learning models. We also observe that this improvement is dependent on the accuracy of the prior in a manner that exhibits diminishing returns, and after a certain accuracy is reached, it is factors associated with experimental pathways that dictate the trends.

Submitted 26 February, 2023; originally announced February 2023.

arXiv:2302.11775 [pdf, other]

doi 10.1145/3544548.3580791

Don't Look at the Data! How Differential Privacy Reconfigures the Practices of Data Science

Authors: Jayshree Sarathy, Sophia Song, Audrey Haque, Tania Schlatter, Salil Vadhan

Abstract: Across academia, government, and industry, data stewards are facing increasing pressure to make datasets more openly accessible for researchers while also protecting the privacy of data subjects. Differential privacy (DP) is one promising way to offer privacy along with open access, but further inquiry is needed into the tensions between DP and data science. In this study, we conduct interviews with 19 data practitioners who are non-experts in DP as they use a DP data analysis prototype to release privacy-preserving statistics about sensitive data, in order to understand perceptions, challenges, and opportunities around using DP. We find that while DP is promising for providing wider access to sensitive datasets, it also introduces challenges into every stage of the data science workflow. We identify ethics and governance questions that arise when socializing data scientists around new privacy constraints and offer suggestions to better integrate DP and data science.

Submitted 22 February, 2023; originally announced February 2023.

arXiv:2301.03368 [pdf, other]

DRL-GAN: A Hybrid Approach for Binary and Multiclass Network Intrusion Detection

Authors: Caroline Strickland, Chandrika Saha, Muhammad Zakar, Sareh Nejad, Noshin Tasnim, Daniel Lizotte, Anwar Haque

Abstract: Our increasingly connected world continues to face an ever-growing amount of network-based attacks. Intrusion detection systems (IDS) are an essential security technology for detecting these attacks. Although numerous machine learning-based IDS have been proposed for the detection of malicious network traffic, the majority have difficulty properly detecting and classifying the more uncommon attack types. In this paper, we implement a novel hybrid technique using synthetic data produced by a Generative Adversarial Network (GAN) to use as input for training a Deep Reinforcement Learning (DRL) model. Our GAN model is trained with the NSL-KDD dataset for four attack categories as well as normal network flow. Ultimately, our findings demonstrate that training the DRL on specific synthetic datasets can result in better performance in correctly classifying minority classes over training on the true imbalanced dataset.

Submitted 5 January, 2023; originally announced January 2023.

arXiv:2211.15343 [pdf]

doi 10.1016/j.techfore.2022.122120

Explainable Artificial Intelligence (XAI) from a user perspective- A synthesis of prior literature and problematizing avenues for future research

Authors: AKM Bahalul Haque, A. K. M. Najmul Islam, Patrick Mikalef

Abstract: The final search query for the Systematic Literature Review (SLR) was conducted on 15th July 2022. Initially, we extracted 1707 journal and conference articles from the Scopus and Web of Science databases. Inclusion and exclusion criteria were then applied, and 58 articles were selected for the SLR. The findings show four dimensions that shape the AI explanation, which are format (explanation representation format), completeness (explanation should contain all required information, including the supplementary information), accuracy (information regarding the accuracy of the explanation), and currency (explanation should contain recent information). Moreover, along with the automatic representation of the explanation, the users can request additional information if needed. We have also found five dimensions of XAI effects: trust, transparency, understandability, usability, and fairness. In addition, we investigated current knowledge from selected articles to problematize future research agendas as research questions along with possible research paths. Consequently, a comprehensive framework of XAI and its possible effects on user behavior has been developed.

Submitted 24 November, 2022; originally announced November 2022.

Journal ref: Technological Forecasting and Social Change (Volume 186, Part A, January 2023, 122120)

arXiv:2211.00058 [pdf]

doi 10.1155/2022/6807484

Semantic Web in Healthcare: A Systematic Literature Review of Application, Research Gap, and Future Research Avenues

Authors: A. K. M. Bahalul Haque, B. M. Arifuzzaman, Sayed Abu Noman Siddik, Abul Kalam, Tabassum Sadia Shahjahan, T. S. Saleena, Morshed Alam, Md. Rabiul Islam, Foyez Ahmmed, and Md. Jamal Hossain

Abstract: Today, healthcare has become one of the largest and most fast-paced industries due to the rapid development of digital healthcare technologies. The fundamental thing to enhance healthcare services is communicating and linking massive volumes of available healthcare data. However, the key challenge in reaching this ambitious goal is letting the information exchange across heterogeneous sources and methods as well as establishing efficient tools and techniques. Semantic Web (SW) technology can help to tackle these problems. They can enhance knowledge exchange, information management, data interoperability, and decision support in healthcare systems. They can also be utilized to create various e-healthcare systems that aid medical practitioners in making decisions and provide patients with crucial medical information and automated hospital services. This systematic literature review (SLR) on SW in healthcare systems aims to assess and critique previous findings while adhering to appropriate research procedures. We looked at 65 papers and came up with five themes: e-service, disease, information management, frontier technology, and regulatory conditions. In each thematic research area, we presented the contributions of previous literature. We emphasized the topic by responding to five specific research questions. We have finished the SLR study by identifying research gaps and establishing future research goals that will help to minimize the difficulty of adopting SW in healthcare systems and provide new approaches for SW-based medical systems progress.

Submitted 19 October, 2022; originally announced November 2022.

Comments: 27 Pages

arXiv:2210.13336 [pdf, other]

Brain Tumor Segmentation using Enhanced U-Net Model with Empirical Analysis

Authors: MD Abdullah Al Nasim, Abdullah Al Munem, Maksuda Islam, Md Aminul Haque Palash, MD. Mahim Anjum Haque, Faisal Muhammad Shah

Abstract: Cancer of the brain is deadly and requires careful surgical segmentation. The brain tumors were segmented with U-Net, a Convolutional Neural Network (CNN). When looking for overlaps of necrotic, edematous, growing, and healthy tissue, it might be hard to get relevant information from the images. The 2D U-Net network was improved and trained with the BraTS datasets to find these four areas. U-Net can set up many encoder and decoder routes that can be used to get information from images that can be used in different ways. To reduce computational time, we use image segmentation to exclude insignificant background details. Experiments on the BraTS datasets show that our proposed model for segmenting brain tumors from magnetic resonance imaging (MRI) works well. In this study, we demonstrate that results on the BraTS 2017, 2018, 2019, and 2020 datasets do not differ significantly from the Dice scores attained on the BraTS 2019 dataset: 0.8717 (necrotic), 0.9506 (edema), and 0.9427 (enhancing).

Submitted 15 January, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

Comments: 5 tables, 4 figures, 5 equations
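
The Dice scores quoted in the abstract above measure the overlap between a predicted segmentation mask and the ground-truth mask. The following is a minimal sketch of the standard Dice coefficient for flat binary masks; it is not taken from the paper, and the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|) for two flat binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 values.
    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(f"Dice: {dice_score(predicted, reference):.4f}")
```

In a multi-class setting such as the necrotic, edema, and enhancing regions mentioned above, one common approach is to compute this score once per class on the corresponding binary mask.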

arXiv:2210.12609 [pdf]

doi 10.1109/ACCESS.2022.3198956

Blockchain and Machine Learning for Fraud Detection: A Privacy-Preserving and Adaptive Incentive Based Approach

Authors: Tahmid Hasan Pranto, Kazi Tamzid Akhter Md Hasib, Tahsinur Rahman, AKM Bahalul Haque, A. K. M. Najmul Islam, Rashedur M. Rahman

Abstract: Financial fraud cases are on the rise even with the current technological advancements. Due to the lack of inter-organization synergy and because of privacy concerns, authentic financial transaction data is rarely available. On the other hand, data-driven technologies like machine learning need authentic data to perform precisely in real-world systems. This study proposes a blockchain and smart contract-based approach to achieve a robust Machine Learning (ML) algorithm for e-commerce fraud detection by facilitating inter-organizational collaboration. The proposed method uses blockchain to secure the privacy of the data. A smart contract deployed inside the network fully automates the system. An ML model is incrementally upgraded from collaborative data provided by the organizations connected to the blockchain. To incentivize the organizations, we have introduced an incentive mechanism that is adaptive to the difficulty level in updating a model. The organizations receive incentives based on the difficulty faced in updating the ML model. A mining criterion has been proposed to mine the block efficiently. Finally, the blockchain network is tested under different difficulty levels and under different volumes of data to test its efficiency. The model achieved 98.93% testing accuracy and 98.22% Fbeta score (recall-biased f measure) over eight incremental updates. Our experiment shows that both data volume and difficulty level of blockchain impact the mining time. For difficulty levels less than five, mining time and difficulty level have a positive correlation. For difficulty levels two and three, less than a second is required to mine a block in our system. Difficulty level five makes mining the blocks much harder.

Submitted 23 October, 2022; originally announced October 2022.

arXiv:2209.04614 [pdf]

doi 10.1080/24751839.2022.2117121

A customer satisfaction centric food delivery system based on blockchain and smart contract

Authors: A. A. Talha Talukder, Md. Anisul Islam Mahmud, Arbiya Sultana, Tahmid Hasan Pranto, AKM Bahalul Haque, Rashedur M. Rahman

Abstract: Food delivery systems are gaining popularity recently due to the expansion of internet connectivity and the increasing availability of devices. The growing popularity of such systems has raised concerns regarding (i) Information security, (ii) Business to business (B2B) deep discounting race, and (iii) Strict policy enforcement. Sensitive personal data and financial information of the users must be safeguarded. Additionally, in pursuit of gaining profit, the restaurants tend to offer deep discounts resulting in a higher volume of orders than usual. Therefore, the restaurants and the delivery persons fail to maintain the delivery time and often impair the food quality. In this paper, we have proposed a blockchain and smart contract-based food delivery system to address these issues. The main goal is to remove commission schemes and decrease service delays caused by a high volume of orders. The protocols have been deployed and tested on the Ethereum test network. The simulation manifests a successful implementation of our desired system, with the payment being controlled by our system. The actors (restaurant, delivery-person or consumer) are bound to be compliant with the policies or penalized otherwise.

Submitted 10 September, 2022; originally announced September 2022.

arXiv:2208.14614 [pdf, other]

doi 10.1145/3511808.3557433

Rethinking Conversational Recommendations: Is Decision Tree All You Need?

Authors: A S M Ahsan-Ul Haque, Hongning Wang

Abstract: Conversational recommender systems (CRS) dynamically obtain the user preferences via multi-turn questions and answers. The existing CRS solutions are widely dominated by deep reinforcement learning algorithms. However, deep reinforcement learning methods are often criticised for lacking interpretability and requiring a large amount of training data to perform. In this paper, we explore a simpler alternative and propose a decision tree based solution to CRS. The underlying challenge in CRS is that the same item can be described differently by different users. We show that decision trees are sufficient to characterize the interactions between users and items, and solve the key challenges in multi-turn CRS: namely which questions to ask, how to rank the candidate items, when to recommend, and how to handle negative feedback on the recommendations. Firstly, the training of decision trees enables us to find questions which effectively narrow down the search space. Secondly, by learning embeddings for each item and tree nodes, the candidate items can be ranked based on their similarity to the conversation context encoded by the tree nodes. Thirdly, the diversity of items associated with each tree node allows us to develop an early stopping strategy to decide when to make recommendations. Fourthly, when the user rejects a recommendation, we adaptively choose the next decision tree to improve subsequent questions and recommendations. Extensive experiments on three publicly available benchmark CRS datasets show that our approach provides significant improvement to the state of the art CRS methods.

Submitted 30 August, 2022; originally announced August 2022.

Comments: 19 pages, 5 figures

arXiv:2208.07399 [pdf, other]

A Survey of Recommender System Techniques and the Ecommerce Domain

Authors: Imran Hossain, Md Aminul Haque Palash, Anika Tabassum Sejuty, Noor A Tanjim, MD Abdullah AL Nasim, Sarwar Saif, Abu Bokor Suraj, Md Mahim Anjum Haque, Nazmul Karim

Abstract: In this big data era, it is hard for the current generation to find the right data from the huge amount of data contained within online platforms. In such a situation, there is a need for an information filtering system that might help them find the information they are looking for. In recent years, a research field has emerged known as recommender systems. Recommenders have become important as they have many real-life applications. This paper reviews the different techniques and developments of recommender systems in e-commerce, e-tourism, e-resources, e-government, e-learning, and e-library. By analyzing recent work on this topic, we will be able to provide a detailed overview of current developments and identify existing difficulties in recommendation systems. The final results give practitioners and researchers the necessary guidance and insights into the recommendation system and its application.

Submitted 21 February, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

Comments: 22 pages, 13 figures

arXiv:2208.06827 [pdf]

BDSL 49: A Comprehensive Dataset of Bangla Sign Language

Authors: Ayman Hasib, Saqib Sizan Khan, Jannatul Ferdous Eva, Mst. Nipa Khatun, Ashraful Haque, Nishat Shahrin, Rashik Rahman, Hasan Murad, Md. Rajibul Islam, Molla Rashied Hussein

Abstract: Language is a method by which individuals express their thoughts. Each language has its own set of alphabetic and numeric characters. People can communicate with one another through either oral or written communication. However, each language has a sign language counterpart. Individuals who are deaf and/or mute communicate through sign language. The Bangla language also has a sign language, which is called BDSL. The dataset is about Bangla hand sign images. The collection contains 49 individual Bangla alphabet images in sign language. BDSL49 is a dataset that consists of 29,490 images with 49 labels. Images of 14 different adult individuals, each with a distinct background and appearance, have been recorded during data collection. Several strategies have been used to eliminate noise from datasets during preparation. This dataset is available to researchers for free. They can develop automated systems using machine learning, computer vision, and deep learning techniques. In addition, two models were used in this dataset. The first is for detection, while the second is for recognition.

Submitted 14 August, 2022; originally announced August 2022.

Comments: 16 pages; 6 figures; Submitted to Data in Brief, a multidisciplinary, open-access and peer-reviewed journal for reviewing

arXiv:2208.04278 [pdf, other]

Self-Supervised Contrastive Representation Learning for 3D Mesh Segmentation

Authors: Ayaan Haque, Hankyu Moon, Heng Hao, Sima Didari, Jae Oh Woo, Patrick Bangert

Abstract: 3D deep learning is a growing field of interest due to the vast amount of information stored in 3D formats. Triangular meshes are an efficient representation for irregular, non-uniform 3D objects. However, meshes are often challenging to annotate due to their high geometrical complexity. Specifically, creating segmentation masks for meshes is tedious and time-consuming. Therefore, it is desirable to train segmentation networks with limited-labeled data. Self-supervised learning (SSL), a form of unsupervised representation learning, is a growing alternative to fully-supervised learning which can decrease the burden of supervision for training. We propose SSL-MeshCNN, a self-supervised contrastive learning method for pre-training CNNs for mesh segmentation. We take inspiration from traditional contrastive learning frameworks to design a novel contrastive learning algorithm specifically for meshes. Our preliminary experiments show promising results in reducing the heavy labeled data requirement needed for mesh segmentation by at least 33%.

Submitted 21 December, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

Comments: AAAI 2023

arXiv:2207.10807 [pdf]

A Machine Learning Approach for Driver Identification Based on CAN-BUS Sensor Data

Authors: Md. Abbas Ali Khan, Mphammad Hanif Ali, AKM Fazlul Haque, Md. Tarek Habib

Abstract: Driver identification is a momentous field of modern decorated vehicles in the controller area network (CAN-BUS) perspective. Many conventional systems are used to identify the driver. One step ahead, most of the researchers use sensor data of CAN-BUS but there are some difficulties because of the variation of the protocol of different models of vehicle. Our aim is to identify the driver through supervised learning algorithms based on driving behavior analysis. To determine the driver, a driver verification technique is proposed that evaluates driving patterns using the measurement of CAN sensor data. In this paper on-board diagnostic (OBD-II) is used to capture the data from the CAN-BUS sensor and the sensors are listed under SAE J1979 statement. According to the service of OBD-II, driver identification is possible. However, we have gained two types of accuracy on a complete data set with 10 drivers and a partial data set with two drivers. The accuracy is good with fewer drivers compared to the higher number of drivers. We have achieved statistically significant results in terms of accuracy in contrast to the baseline algorithm.

Submitted 15 July, 2022; originally announced July 2022.

arXiv:2206.07796 [pdf, other]

FixEval: Execution-based Evaluation of Program Fixes for Programming Problems

Authors: Md Mahim Anjum Haque, Wasi Uddin Ahmad, Ismini Lourentzou, Chris Brown

Abstract: The complexity of modern software has led to a drastic increase in the time and cost associated with detecting and rectifying software bugs. In response, researchers have explored various methods to automatically generate fixes for buggy code. However, due to the large combinatorial space of possible fixes for any given bug, few tools and datasets are available to evaluate model-generated fixes effectively. To address this issue, we introduce FixEval, a benchmark comprising buggy code submissions to competitive programming problems and their corresponding fixes. FixEval offers an extensive collection of unit tests to evaluate the correctness of model-generated program fixes and assess further information regarding time, memory constraints, and acceptance based on a verdict. We consider two Transformer language models pretrained on programming languages as our baseline and compare them using match-based and execution-based evaluation metrics. Our experiments show that match-based metrics do not reflect model-generated program fixes accurately. At the same time, execution-based methods evaluate programs through all cases and scenarios designed explicitly for that solution. Therefore, we believe FixEval provides a step towards real-world automatic bug fixing and model-generated code evaluation. The dataset and models are open-sourced at https://github.com/mahimanzum/FixEval.

Submitted 30 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

arXiv:2205.10830 [pdf, ps, other]

A review on Deep Neural Network for Computer Network Traffic Classification

Authors: Md. Ariful Haque, Dr. Rajesh Palit

Abstract: This paper focuses on Deep Neural Network based classification of malicious and normal computer network traffic (such as attacks, phishing, any other illegal activity, and normal traffic identification). The main idea is to review existing Neural Network based network traffic classification, which covers intrusion activity classification and detection. It is very important to classify network traffic to safeguard any system connected to a computer network. There are a variety of NN architectures for this task, with different rates of accuracy. In this paper we make a relative comparison among them. Index Terms-Computer Network, Network traffic, Packet, Intrusion, DOS (Denial-of-service), unauthorized access, IDS (Intrusion Detection System), IPS (Intrusion Prevention Systems), R2L (Remote to Local Attack), Probing, U2R (User to Root Attack), DNN (Deep Neural Network), CRNN (Convolutional Recurrent Neural Network), RPROP (Resilient propagation).

Submitted 22 May, 2022; originally announced May 2022.

arXiv:2205.04344 [pdf, other]

Transfer Learning Based Efficient Traffic Prediction with Limited Training Data

Authors: Sajal Saha, Anwar Haque, Greg Sidebottom

Abstract: Efficient prediction of internet traffic is an essential part of Self Organizing Network (SON) for ensuring proactive management. There are many existing solutions for internet traffic prediction with higher accuracy using deep learning. But designing individual predictive models for each service provider in the network is challenging due to data heterogeneity, scarcity, and abnormality. Moreover, the performance of the deep sequence model in network traffic prediction with limited training data has not been studied extensively in the current works. In this paper, we investigated and evaluated the performance of the deep transfer learning technique in traffic prediction with inadequate historical data leveraging the knowledge of our pre-trained model. First, we used a comparatively larger real-world traffic dataset for source domain prediction based on five different deep sequence models: Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), LSTM Encoder-Decoder (LSTM_En_De), LSTM_En_De with Attention layer (LSTM_En_De_Atn), and Gated Recurrent Unit (GRU). Then, two best-performing models, LSTM_En_De and LSTM_En_De_Atn, from the source domain with an accuracy of 96.06% and 96.05% are considered for the target domain prediction. Finally, four smaller traffic datasets collected for four particular source and destination pairs are used in the target domain to compare the performance of the standard learning and transfer learning in terms of accuracy and execution time. According to our experimental result, transfer learning helps to reduce the execution time for most cases, while the model's accuracy is improved in transfer learning with a larger training session.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 7 pages, 7 figures, Submitted IEEE Global Communications Conference, 2022

arXiv:2205.04333 [pdf, other]

Wavelet-Based Hybrid Machine Learning Model for Out-of-distribution Internet Traffic Prediction

Authors: Sajal Saha, Anwar Haque, Greg Sidebottom

Abstract: Efficient prediction of internet traffic is essential for ensuring proactive management of computer networks. Nowadays, machine learning approaches show promising performance in modeling real-world complex traffic. However, most existing works assumed that model training and evaluation data came from identical distribution. But in practice, there is a high probability that the model will deal with data from a slightly or entirely unknown distribution in the deployment phase. This paper investigated and evaluated machine learning performances using eXtreme Gradient Boosting, Light Gradient Boosting Machine, Stochastic Gradient Descent, Gradient Boosting Regressor, CatBoost Regressor, and their stacked ensemble model using data from both identical and out-of-distribution sources. Also, we proposed a hybrid machine learning model integrating wavelet decomposition for improving out-of-distribution prediction as standalone models were unable to generalize very well. Our experimental results show the best performance of the standalone ensemble model with an accuracy of 96.4%, while the hybrid ensemble model improved it by 1% for in-distribution data. But its performance dropped significantly when tested with three different datasets with a distribution shift relative to the training set. However, our proposed hybrid model considerably reduces the performance gap between identical and out-of-distribution evaluation compared with the standalone model, indicating the decomposition technique's effectiveness in the case of out-of-distribution generalization.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 6 pages, 7 images, Submitted IEEE Global Communications Conference, 2022

arXiv:2205.01685 [pdf, other]

Deep Sequence Modeling for Anomalous ISP Traffic Prediction

Authors: Sajal Saha, Anwar Haque, Greg Sidebottom

Abstract: Internet traffic in the real world is susceptible to various external and internal factors which may abruptly change the normal traffic flow. Those unexpected changes are considered outliers in traffic. Deep sequence models have been used to predict complex IP traffic, but their comparative performance for anomalous traffic has not been studied extensively. In this paper, we investigated and evaluated the performance of different deep sequence models for anomalous traffic prediction. Several deep sequence models were implemented to predict real traffic without and with outliers and show the significance of outlier detection in real-world traffic prediction. First, two different outlier detection techniques, the Three-Sigma rule and Isolation Forest, were applied to identify the anomaly. Second, we adjusted those abnormal data points using the Backward Filling technique before training the model. Finally, the performance of different models was compared for abnormal and adjusted traffic. LSTM_Encoder_Decoder (LSTM_En_De) is the best prediction model in our experiment, reducing the deviation between actual and predicted traffic by more than 11% after adjusting the outliers. All other models, including Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), LSTM_En_De with Attention layer (LSTM_En_De_Atn), Gated Recurrent Unit (GRU), show better prediction after replacing the outliers and decreasing prediction error by more than 29%, 24%, 19%, and 10% respectively. Our experimental results indicate that the outliers in the data can significantly impact the quality of the prediction. Thus, outlier detection and mitigation assist the deep sequence model in learning the general trend and making better predictions.

Submitted 3 May, 2022; originally announced May 2022.

Comments: 6 pages, 6 images, To appear in the Proceedings of IEEE International Conference on Communications, Seoul, South Korea, 2022. arXiv admin note: substantial text overlap with arXiv:2205.01300
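
The outlier handling described in the abstract above (flag anomalies with the Three-Sigma rule, then adjust them with backward filling) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the mean and population standard deviation are computed over the whole series, and flagged points at the very end of the series (with no later valid value) are left unchanged.

```python
from statistics import mean, pstdev


def three_sigma_outliers(values):
    """Flag points lying more than 3 standard deviations from the mean.

    Assumption: statistics are computed over the full series at once.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [abs(v - mu) > 3 * sigma for v in values]


def backward_fill(values, is_outlier):
    """Replace each flagged point with the next non-flagged value."""
    filled = list(values)
    next_valid = None
    # Walk from the end so the "next valid value" is always known.
    for i in range(len(filled) - 1, -1, -1):
        if is_outlier[i]:
            if next_valid is not None:
                filled[i] = next_valid
            # Assumption: trailing outliers with no later valid value stay as-is.
        else:
            next_valid = filled[i]
    return filled


if __name__ == "__main__":
    # Hypothetical traffic volumes with one abrupt spike.
    traffic = [10] * 20 + [1000] + [12] * 5
    flags = three_sigma_outliers(traffic)
    cleaned = backward_fill(traffic, flags)
    print("flagged indices:", [i for i, f in enumerate(flags) if f])
    print("cleaned tail:", cleaned[18:])
```

Note that the Three-Sigma rule needs a reasonably long series: in a very short series a single spike inflates the standard deviation so much that it may never exceed the 3-sigma threshold.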

arXiv:2205.01590 [pdf, other]

doi 10.1109/IWCMC55113.2022.9825059

An Empirical Study on Internet Traffic Prediction Using Statistical Rolling Model

Authors: Sajal Saha, Anwar Haque, Greg Sidebottom

Abstract: Real-world IP network traffic is susceptible to external and internal factors such as new internet service integration, traffic migration, internet application, etc. Due to these factors, the actual internet traffic is non-linear and challenging to analyze using a statistical model for future prediction. In this paper, we investigated and evaluated the performance of different statistical prediction models for real IP network traffic, and showed a significant improvement in prediction using the rolling prediction technique. Initially, a set of best hyper-parameters for the corresponding prediction model is identified by analyzing the traffic characteristics and implementing a grid search algorithm based on the minimum Akaike Information Criterion (AIC). Then, we performed a comparative performance analysis among AutoRegressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA), SARIMA with eXogenous factors (SARIMAX), and Holt-Winter for single-step prediction. The seasonality of our traffic has been explicitly modeled using SARIMA, which reduces the rolling prediction Mean Absolute Percentage Error (MAPE) by more than 4% compared to ARIMA (incapable of handling the seasonality). We further improved traffic prediction using SARIMAX to learn different exogenous factors extracted from the original traffic, which yielded the best rolling prediction results with a MAPE of 6.83%. Finally, we applied the exponential smoothing technique to handle the variability in traffic following the Holt-Winter model, which exhibited a better prediction than ARIMA (around 1.5% less MAPE). The rolling prediction technique reduced prediction error using real Internet Service Provider (ISP) traffic data by more than 50% compared to the standard prediction method.

Submitted 3 May, 2022; originally announced May 2022.

Comments: 6 pages, 2 figures, To appear in the Proceedings of the International Wireless Communications and Mobile Computing Conference in Dubrovnik, Croatia, 2022

Journal ref: 2022 International Wireless Communications and Mobile Computing (IWCMC), 2022, pp. 1058-1063
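Note on the entry above: the abstract contrasts rolling single-step prediction with the standard multi-step method and scores both by MAPE. The following is a minimal sketch of that contrast, not the authors' implementation. It assumes a seasonal-naive forecaster as a stand-in for SARIMA, and the function names and the toy series are hypothetical.

```python
from typing import List, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(history: Sequence[float], season: int) -> float:
    """Stand-in forecaster (assumption): repeat the value from one season ago."""
    return history[-season]


def rolling_forecast(train: Sequence[float], test: Sequence[float], season: int) -> List[float]:
    """Single-step rolling prediction: the true value is revealed after each step."""
    history = list(train)
    predictions = []
    for observed in test:
        predictions.append(seasonal_naive(history, season))
        history.append(observed)  # update the history with the real observation
    return predictions


def static_forecast(train: Sequence[float], horizon: int, season: int) -> List[float]:
    """Standard multi-step prediction: the model only ever sees its own forecasts."""
    history = list(train)
    predictions = []
    for _ in range(horizon):
        next_value = seasonal_naive(history, season)
        predictions.append(next_value)
        history.append(next_value)  # feed the forecast back in place of real data
    return predictions


# Toy traffic series (hypothetical): season length 4, test period 20% above training level.
train = [10.0, 20.0, 30.0, 20.0] * 3
test = [12.0, 24.0, 36.0, 24.0] * 2

rolling_error = mape(test, rolling_forecast(train, test, season=4))
static_error = mape(test, static_forecast(train, len(test), season=4))
print(f"rolling MAPE: {rolling_error:.2f}%, static MAPE: {static_error:.2f}%")
```

On this toy series the rolling variant adapts to the level shift after one season, while the static variant keeps repeating the training pattern, so its error stays higher.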

arXiv:2205.01300 [pdf, other]

Towards an Ensemble Regressor Model for Anomalous ISP Traffic Prediction

Authors: Sajal Saha, Anwar Haque, Greg Sidebottom

Abstract: Prediction of network traffic behavior is significant for the effective management of modern telecommunication networks. However, the intuitive approach of predicting network traffic using administrative experience and market analysis data is inadequate for an efficient forecast framework. As a result, many different mathematical models have been studied to capture the general trend of the network traffic and predict accordingly. But the comprehensive performance analysis of varying regression models and their ensemble has not been studied before for analyzing real-world anomalous traffic. In this paper, several regression models such as Extra Gradient Boost (XGBoost), Light Gradient Boosting Machine (LightGBM), Stochastic Gradient Descent (SGD), Gradient Boosting Regressor (GBR), and CatBoost Regressor were analyzed to predict real traffic without and with outliers and show the significance of outlier detection in real-world traffic prediction. Also, we showed the outperformance of the ensemble regression model over the individual prediction model. We compared the performance of different regression models based on five different feature sets of lengths 6, 9, 12, 15, and 18. Our ensemble regression model achieved the minimum average gap of 5.04% between actual and predicted traffic with nine outlier-adjusted inputs. In general, our experimental results indicate that the outliers in the data can significantly impact the quality of the prediction. Thus, outlier detection and mitigation assist the regression model in learning the general trend and making better predictions.

Submitted 3 May, 2022; originally announced May 2022.

Comments: 7 pages, 7 figures, To appear in the proceedings on International Symposium on Networks, Computers and Communications in China, from July 19 to 22, 2022
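Note on the entry above: the abstract combines outlier mitigation with an ensemble of regressors. The sketch below illustrates the general idea only. The IQR clipping rule, the plain averaging of model outputs, and the two toy predictors are all assumptions made for illustration; the abstract does not state which outlier rule or ensembling scheme the authors used.

```python
from statistics import mean, quantiles
from typing import Callable, List, Sequence


def clip_outliers_iqr(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Clamp values lying more than k * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [min(max(v, lower), upper) for v in values]


def ensemble_predict(models: Sequence[Callable[[Sequence[float]], float]],
                     window: Sequence[float]) -> float:
    """Average the single-step predictions of several models (assumed scheme)."""
    return mean(model(window) for model in models)


# Hypothetical traffic window with one anomalous spike at the end.
raw_window = [10.0, 11.0, 12.0, 11.0, 10.0, 12.0, 11.0, 100.0]
clean_window = clip_outliers_iqr(raw_window)

# Two toy predictors standing in for trained regressors.
models = [
    lambda w: w[-1],   # persistence: repeat the last value
    lambda w: mean(w), # window average
]

print("prediction on raw window:  ", ensemble_predict(models, raw_window))
print("prediction on clean window:", ensemble_predict(models, clean_window))
```

Clipping the spike before prediction keeps the ensemble output near the typical traffic level instead of being pulled toward the anomaly.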

arXiv:2205.01042 [pdf]

doi 10.33166/AETiC.2022.02.002

Machine Learning and Artificial Intelligence in Circular Economy: A Bibliometric Analysis and Systematic Literature Review

Authors: Abdulla All noman, Umma Habiba Akter, Tahmid Hasan Pranto, AKM Bahalul Haque

Abstract: With unorganized, unplanned and improper use of limited raw materials, an abundant amount of waste is being produced, which is harmful to our environment and ecosystem. While traditional linear production lines fail to address far-reaching issues like waste production and a shorter product life cycle, a prospective concept, namely circular economy (CE), has shown promising prospects to be adopted at industrial and governmental levels. CE aims to complete the product life cycle loop by bringing out the highest values from raw materials in the design phase and later on by reusing, recycling, and remanufacturing. Innovative technologies like artificial intelligence (AI) and machine learning (ML) provide vital assistance in effectively adopting and implementing CE in real-world practices. This study explores the adoption and integration of applied AI techniques in CE. First, we conducted bibliometric analysis on a collection of 104 SCOPUS indexed documents exploring the critical research criteria in AI and CE. Forty papers were picked to conduct a systematic literature review from these documents. The selected documents were further divided into six categories: sustainable development, reverse logistics, waste management, supply chain management, recycle & reuse, and manufacturing development. Comprehensive research insights and trends have been extracted and delineated. Finally, the research gap needing further attention has been identified and the future research directions have also been discussed.

Submitted 1 April, 2022; originally announced May 2022.

Comments: 28 Pages, 7 Figures, 7 Tables

Journal ref: Annals of Emerging Technologies in Computing (AETiC), 2022

arXiv:2201.00458 [pdf, other]

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Authors: Parnian Afshar, Arash Mohammadi, Konstantinos N. Plataniotis, Keyvan Farahani, Justin Kirby, Anastasia Oikonomou, Amir Asif, Leonard Wee, Andre Dekker, Xin Wu, Mohammad Ariful Haque, Shahruk Hossain, Md. Kamrul Hasan, Uday Kamal, Winston Hsu, Jhih-Yuan Lin, M. Sohel Rahman, Nabil Ibtehaz, Sh. M. Amir Foisol, Kin-Man Lam, Zhong Guang, Runze Zhang, Sumohana S. Channappayya, Shashank Gupta, Chander Dev

Abstract: Lung cancer is one of the deadliest cancers, and in part its effective diagnosis and treatment depend on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability, and is also time-consuming, considering the fact that only experts are capable of providing annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, as different researchers have validated their algorithms using various datasets and performance metrics, reliably evaluating these methods is still an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark created through 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and pre-defined metrics, so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with a global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members clustered into 28 teams from 10 countries, out of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In a nutshell, all the algorithms proposed during the competition, are based on deep learning models combined with a false positive reduction technique. Methods developed by the three finalists show promising results in tumor segmentation, however, more effort should be put into reducing the false positive rate. This competition manuscript presents an overview of the VIP-Cup challenge, along with the proposed algorithms and results.

Submitted 2 January, 2022; originally announced January 2022.

arXiv:2111.09537 [pdf, other]

The Prominence of Artificial Intelligence in COVID-19

Authors: MD Abdullah Al Nasim, Aditi Dhali, Faria Afrin, Noshin Tasnim Zaman, Nazmul Karimm, Md Mahim Anjum Haque

Abstract: In December 2019, a novel virus called COVID-19 had caused an enormous number of casualties to date. The battle with the novel Coronavirus is baffling and horrifying after the Spanish Flu 2019. While the front-line doctors and medical researchers have made significant progress in controlling the spread of the highly contagious virus, technology has also proved its significance in the battle. Moreover, Artificial Intelligence has been adopted in many medical applications to diagnose many diseases, even baffling experienced doctors. Therefore, this survey paper explores the methodologies proposed that can aid doctors and researchers in early and inexpensive methods of diagnosis of the disease. Most developing countries have difficulties carrying out tests using the conventional manner, but a significant way can be adopted with Machine and Deep Learning. On the other hand, the access to different types of medical images has motivated the researchers. As a result, a mammoth number of techniques are proposed. This paper first details the background knowledge of the conventional methods in the Artificial Intelligence domain. Following that, we gather the commonly used datasets and their use cases to date. In addition, we also show the percentage of researchers adopting Machine Learning over Deep Learning. Thus we provide a thorough analysis of this scenario. Lastly, in the research challenges, we elaborate on the problems faced in COVID-19 research, and we address the issues with our understanding to build a bright and healthy environment.

Submitted 29 March, 2023; v1 submitted 18 November, 2021; originally announced November 2021.

Comments: 63 pages, 3 tables, 17 figures

arXiv:2110.13185 [pdf, other]

doi 10.59275/j.melba.2021-d8a3

Generalized Multi-Task Learning from Substantially Unlabeled Multi-Source Medical Image Data

Authors: Ayaan Haque, Abdullah-Al-Zubaer Imran, Adam Wang, Demetri Terzopoulos

Abstract: Deep learning-based models, when trained in a fully-supervised manner, can be effective in performing complex image analysis tasks, although contingent upon the availability of large labeled datasets. Especially in the medical imaging domain, however, expert image annotation is expensive, time-consuming, and prone to variability. Semi-supervised learning from limited quantities of labeled data has shown promise as an alternative. Maximizing knowledge gains from copious unlabeled data benefits semi-supervised learning models. Moreover, learning multiple tasks within the same model further improves its generalizability. We propose MultiMix, a new multi-task learning model that jointly learns disease classification and anatomical segmentation in a semi-supervised manner, while preserving explainability through a novel saliency bridge between the two tasks. Our experiments with varying quantities of multi-source labeled data in the training sets confirm the effectiveness of MultiMix in the simultaneous classification of pneumonia and segmentation of the lungs in chest X-ray images. Moreover, both in-domain and cross-domain evaluations across these tasks further showcase the potential of our model to adapt to challenging generalization scenarios.

Submitted 25 October, 2021; originally announced October 2021.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://www.melba-journal.org/

arXiv:2109.13862 [pdf, other]

3N-GAN: Semi-Supervised Classification of X-Ray Images with a 3-Player Adversarial Framework

Authors: Shafin Haque, Ayaan Haque

Abstract: The success of deep learning for medical imaging tasks, such as classification, is heavily reliant on the availability of large-scale datasets. However, acquiring datasets with large quantities of labeled data is challenging, as labeling is expensive and time-consuming. Semi-supervised learning (SSL) is a growing alternative to fully-supervised learning, but requires unlabeled samples for training. In medical imaging, many datasets lack unlabeled data entirely, so SSL can't be conventionally utilized. We propose 3N-GAN, or 3 Network Generative Adversarial Networks, to perform semi-supervised classification of medical images in fully-supervised settings. We incorporate a classifier into the adversarial relationship such that the generator trains adversarially against both the classifier and discriminator. Our preliminary results show improved classification performance and GAN generations over various algorithms. Our work can seamlessly integrate with numerous other medical imaging model architectures and SSL methods for greater performance.

Submitted 22 September, 2021; originally announced September 2021.

Comments: 5 pages, 2 figures; authors contributed equally

arXiv:2109.10867 [pdf]

doi 10.3390/app11136132

Towards a GDPR-Compliant Blockchain-Based COVID Vaccination Passport

Authors: AKM Bahalul Haque, Bilal Naqvi, A. K. M. Najmul Islam, Sami Hyrynsalmi

Abstract: The COVID-19 pandemic has shaken the world and limited work/personal life activities. Besides the loss of human lives and agony faced by humankind, the pandemic has badly hit different sectors economically, including the travel industry. Special arrangements, including COVID test before departure and on arrival, and voluntary quarantine, were enforced to limit the risk of transmission. However, the hope for returning to a normal (pre-COVID) routine relies on the success of the current COVID vaccination drives administered by different countries. To open for tourism and other necessary travel, a need is realized for a universally accessible proof of COVID vaccination, allowing travelers to cross the borders without any hindrance. This paper presents an architectural framework for a GDPR-compliant blockchain-based COVID vaccination passport (VacciFi), whilst considering the relevant developments, especially in the European Union region.

Submitted 1 July, 2021; originally announced September 2021.

Comments: Applied Sciences, 2021

arXiv:2109.10213 [pdf, other]

Blockchain-based Covid Vaccination Registration and Monitoring

Authors: Shirajus Salekin Nabil, Md. Sabbir Alam Pran, Ali Abrar Al Haque, Narayan Ranjan Chakraborty, Mohammad Jabed Morshed Chowdhury, Md Sadek Ferdous

Abstract: Covid-19 (SARS-CoV-2) has changed almost all the aspects of our living. Governments around the world have imposed lockdown to slow down the transmissions. In the meantime, researchers worked hard to find the vaccine. Fortunately, we have found the vaccine, in fact a good number of them. However, managing the testing and vaccination process of the total population is a mammoth job. There are multiple government and private sector organisations that are working together to ensure proper testing and vaccination. However, there is always delay or data silo problems in multi-organisational works. Therefore, streamlining this process is vital to improve the efficiency and save more lives. It is already proved that technology has a significant impact on the health sector, including blockchain. Blockchain provides a distributed system along with greater privacy, transparency and authenticity. In this article, we have presented a blockchain-based system that seamlessly integrates testing and vaccination system, allowing the system to be transparent. The instant verification of any tamper-proof result and a transparent and efficient vaccination system have been exhibited and implemented in the research. We have also implemented the system as "Digital Vaccine Passport" (DVP) and analysed its performance.

Submitted 20 September, 2021; originally announced September 2021.

Comments: 12 pages

arXiv:2108.04358 [pdf, other]

Convolutional Nets for Diabetic Retinopathy Screening in Bangladeshi Patients

Authors: Ayaan Haque, Ipsita Sutradhar, Mahziba Rahman, Mehedi Hasan, Malabika Sarker

Abstract: Diabetes is one of the most prevalent chronic diseases in Bangladesh, and as a result, Diabetic Retinopathy (DR) is widespread in the population. DR, an eye illness caused by diabetes, can lead to blindness if it is not identified and treated in its early stages. Unfortunately, diagnosis of DR requires medically trained professionals, but Bangladesh has limited specialists in comparison to its population. Moreover, the screening process is often expensive, prohibiting many from receiving timely and proper diagnosis. To address the problem, we introduce a deep learning algorithm which screens for different stages of DR. We use a state-of-the-art CNN architecture to diagnose patients based on retinal fundus imagery. This paper is an experimental evaluation of the algorithm we developed for DR diagnosis and screening specifically for Bangladeshi patients. We perform this validation study using separate pools of retinal image data of real patients from a hospital and field studies in Bangladesh. Our results show that the algorithm is effective at screening Bangladeshi eyes even when trained on a public dataset which is out of domain, and can accurately determine the stage of DR as well, achieving an overall accuracy of 92.27% and 93.02% on two validation sets of Bangladeshi eyes. The results confirm the ability of the algorithm to be used in real clinical settings and applications due to its high accuracy and classwise metrics. Our algorithm is implemented in the application Drishti, which is used to screen for DR in patients living in rural areas in Bangladesh, where access to professional screening is limited.

Submitted 30 July, 2021; originally announced August 2021.

Comments: 8 pages, 6 figures, 4 tables

arXiv:2105.14220 [pdf, other]

CoDesc: A Large Code-Description Parallel Dataset

Authors: Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Uddin Ahmad, Anindya Iqbal, Rifat Shahriyar

Abstract: Translation between natural language and source code can help software development by enabling developers to comprehend, ideate, search, and write computer programs in natural language. Despite growing interest from the industry and the research community, this task is often difficult due to the lack of large standard datasets suitable for training deep neural models, standard noise removal methods, and evaluation benchmarks. This leaves researchers to collect new small-scale datasets, resulting in inconsistencies across published works. In this study, we present CoDesc, a large parallel dataset composed of 4.2 million Java methods and natural language descriptions. With extensive analysis, we identify and remove prevailing noise patterns from the dataset. We demonstrate the proficiency of CoDesc in two complementary tasks for code-description pairs: code summarization and code search. We show that the dataset helps improve code search by up to 22% and achieves the new state-of-the-art in code summarization. Furthermore, we show CoDesc's effectiveness in a pre-training and fine-tuning setup, opening possibilities in building pretrained language models for Java. To facilitate future research, we release the dataset, a data processing tool, and a benchmark at https://github.com/csebuetnlp/CoDesc.

Submitted 29 May, 2021; originally announced May 2021.

Comments: Findings of the Association for Computational Linguistics, ACL 2021 (camera-ready)

arXiv:2105.12833

Simulated Data Generation Through Algorithmic Force Coefficient Estimation for AI-Based Robotic Projectile Launch Modeling

Authors: Sajiv Shah, Ayaan Haque, Fei Liu

Abstract: Modeling of non-rigid object launching and manipulation is complex considering the wide range of dynamics affecting trajectory, many of which may be unknown. Using physics models can be inaccurate because they cannot account for unknown factors and the effects of the deformation of the object as it is launched; moreover, deriving force coefficients for these models is not possible without extensive experimental testing. Recently, advancements in data-powered artificial intelligence methods have allowed learnable models and systems to emerge. It is desirable to train a model for launch prediction on a robot, as deep neural networks can account for immeasurable dynamics. However, the inability to collect large amounts of experimental data decreases performance of deep neural networks. Through estimating force coefficients, the accepted physics models can be leveraged to produce adequate supplemental data to artificially increase the size of the training set, yielding improved neural networks. In this paper, we introduce a new framework for algorithmic estimation of force coefficients for non-rigid object launching, which can be generalized to other domains, in order to generate large datasets. We implement a novel training algorithm and objective for our deep neural network to accurately model launch trajectory of non-rigid objects and predict whether they will hit a series of targets. Our experimental results demonstrate the effectiveness of using simulated data from force coefficient estimation and show the importance of simulated data for training an effective neural network.

Submitted 26 January, 2024; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: not relevant work

arXiv:2105.07153 [pdf, other]

Window-Level is a Strong Denoising Surrogate

Authors: Ayaan Haque, Adam Wang, Abdullah-Al-Zubaer Imran

Abstract: CT image quality is heavily reliant on radiation dose, which causes a trade-off between radiation dose and image quality that affects the subsequent image-based diagnostic performance. However, high radiation can be harmful to both patients and operators. Several (deep learning-based) approaches have been attempted to denoise low dose images. However, those approaches require access to large training sets, specifically the full dose CT images for reference, which can often be difficult to obtain. Self-supervised learning is an emerging alternative for lowering the reference data requirement facilitating unsupervised learning. Currently available self-supervised CT denoising works are either dependent on foreign domain or pretexts are not very task-relevant. To tackle the aforementioned challenges, we propose a novel self-supervised learning approach, namely Self-Supervised Window-Leveling for Image DeNoising (SSWL-IDN), leveraging an innovative, task-relevant, simple, yet effective surrogate: prediction of the window-leveled equivalent. SSWL-IDN leverages residual learning and a hybrid loss combining perceptual loss and MSE, all incorporated in a VAE framework. Our extensive (in- and cross-domain) experimentation demonstrates the effectiveness of SSWL-IDN in aggressive denoising of CT (abdomen and chest) images acquired at 5% dose level only.

Submitted 15 May, 2021; originally announced May 2021.

Comments: 11 pages, 4 figures

arXiv:2105.05338 [pdf]

doi 10.1049/blc2.12005

SmartOil: Blockchain and smart contract-based oil supply chain management

Authors: AKM Bahalul Haque, Md. Rifat Hasan, Md. Oahiduzzaman Mondol Zihad

Abstract: The traditional oil supply chain suffers from various shortcomings regarding crude oil extraction, processing, distribution, environmental pollution, and traceability. It offers only a forward flow of products with almost no security and tracking process. In time, the system will lag behind due to the limitations in quality inspection, fraudulent information, and monopolistic behavior of supply chain entities. Inclusion of counterfeiting products and opaqueness of the system urge renovation in this sector. The recent evolution of Industry 4.0 leads to the alternation in the supply chain introducing the smart supply chain. Technological advancement can now reshape the infrastructure of the supply chain for the future. In this paper, we suggest a conceptual framework utilizing Blockchain and Smart Contract to monitor the overall oil supply chain. Blockchain is a groundbreaking technology to monitor and support the security building of a decentralized type supply chain over a peer-to-peer network. The use of the Internet of Things (IoT), especially sensors, opens a broader window to track the global supply chain in real-time. We construct a methodology to support reverse traceability for each participant of the supply chain. The functions and characteristics of Blockchain and Smart Contract are defined. Implementation of Smart Contracts has also been shown with detailed analysis. We further describe the challenges of implementing such a system and validate our framework's adaptability in the real world. The paper concludes with future research scope to mitigate the restrictions of data management and maintenance with advanced working prototypes and agile systems achieving greater traceability and transparency.

Submitted 11 May, 2021; originally announced May 2021.

Comments: Accepted as Open access article in IET Blockchain

Journal ref: IET Blockchain, 2021

arXiv:2104.08017 [pdf, other]

BERT2Code: Can Pretrained Language Models be Leveraged for Code Search?

Authors: Abdullah Al Ishtiaq, Masum Hasan, Md. Mahim Anjum Haque, Kazi Sajeed Mehrab, Tanveer Muttaqueen, Tahmid Hasan, Anindya Iqbal, Rifat Shahriyar

Abstract: Millions of repetitive code snippets are submitted to code repositories every day. To search from these large codebases using simple natural language queries would allow programmers to ideate, prototype, and develop easier and faster. Although the existing methods have shown good performance in searching codes when the natural language description contains keywords from the code, they are still far behind in searching codes based on the semantic meaning of the natural language query and semantic structure of the code. In recent years, both natural language and programming language research communities have created techniques to embed them in vector spaces. In this work, we leverage the efficacy of these embedding models using a simple, lightweight 2-layer neural network in the task of semantic code search. We show that our model learns the inherent relationship between the embedding spaces and further probes into the scope of improvement by empirically analyzing the embedding methods. In this analysis, we show that the quality of the code embedding model is the bottleneck for our model's performance, and discuss future directions of study in this area.

Submitted 16 April, 2021; originally announced April 2021.

Comments: Submitted to ICANN2021

arXiv:2104.02573 [pdf]

doi 10.1088/1742-6596/1767/1/012041

Prediction of Solar Radiation Using Artificial Neural Network

Authors: Shahriar Rahman, Shazzadur Rahman, A K M Bahalul Haque

Abstract: Most solar applications and systems can be reliably used to generate electricity and power in many homes and offices. Recently, there is an increase in many solar required systems that can be found not only in electricity generation but other applications such as solar distillation, water heating, heating of buildings, meteorology and producing solar conversion energy. Prediction of solar radiation is very significant in order to accomplish the previously mentioned objectives. In this paper, the main target is to present an algorithm that can be used to predict an hourly activity of solar radiation. Using a dataset that consists of temperature of air, time, humidity, wind speed, atmospheric pressure, direction of wind and solar radiation data, an Artificial Neural Network (ANN) model is constructed to effectively forecast solar radiation using the available weather forecast data. Two models are created to efficiently create a system capable of interpreting patterns through supervised learning data and predict the correct amount of radiation present in the atmosphere. The results of the two statistical indicators: Mean Absolute Error (MAE) and Mean Squared Error (MSE) are performed and compared with observed and predicted data. These two models were able to generate efficient predictions with sufficient performance accuracy.

Submitted 1 April, 2021; originally announced April 2021.

Comments: Published as open access, 12 pages, 13 images and 2 tables

Journal ref: Journal of Physics: Conference Series, 2021

arXiv:2104.02173 [pdf]

doi 10.5121/ijaia.2020.11406

Insight about Detection, Prediction and Weather Impact of Coronavirus (Covid-19) using Neural Network

Authors: A K M Bahalul Haque, Tahmid Hasan Pranto, Abdulla All Noman, Atik Mahmood

Abstract: The world is facing a tough situation due to the catastrophic pandemic caused by novel coronavirus (COVID-19). The number of people affected by this virus is increasing exponentially day by day and the number has already crossed 6.4 million. As no vaccine has been discovered yet, the early detection of patients and isolation is the only and most effective way to reduce the spread of the virus. Detecting infected persons from chest X-Ray by using Deep Neural Networks can be applied as a time- and labor-saving solution. In this study, we tried to detect Covid-19 by classification of Covid-19, pneumonia and normal chest X-Rays. We used five different Convolutional Pre-Trained Neural Network models (VGG16, VGG19, Xception, InceptionV3 and Resnet50) and compared their performance. VGG16 and VGG19 show precise performance in classification. Both models can classify between three kinds of X-Rays with an accuracy over 92%. Another part of our study was to find the impact of weather factors (temperature, humidity, sun hour and wind speed) on this pandemic using Decision Tree Regressor. We found that temperature, humidity and sun-hour jointly hold 85.88% impact on escalation of Covid-19 and 91.89% impact on death due to Covid-19 where humidity has 8.09% impact on death. We also tried to predict the death of an individual based on age, gender, country, and location due to COVID-19 using the LogisticRegression, which can predict death of an individual with a model accuracy of 94.40%.

Submitted 5 April, 2021; originally announced April 2021.

Comments: 15 Pages, 13 Figures and 4 Tables

Journal ref: International Journal of Artificial Intelligence & Applications 11(4):67-81, July 2020

arXiv:2104.00648 [pdf]

doi 10.1109/ACCESS.2021.3069877

GDPR Compliant Blockchains-A Systematic Literature Review

Authors: AKM Bahalul Haque, AKM Najmul Islam, Sami Hyrynsalmi, Bilal Naqvi, Kari Smolander

Abstract: Although blockchain-based digital services promise trust, accountability, and transparency, multiple paradoxes between blockchains and GDPR have been highlighted in the recent literature. Some of the recent literature also proposed possible solutions to these paradoxes. This article aims to conduct a systematic literature review on GDPR compliant blockchains and synthesize the findings. In particular, the goal was to identify 1) the GDPR articles that have been explored in prior literature; 2) the relevant research domains that have been explored, and 3) the research gaps. Our findings synthesized that the blockchains relevant GDPR articles can be categorized into six major groups, namely data deletion and modification (Article 16, 17, and 18), protection by design by default (Article 25), responsibilities of controllers and processors (Article 24, 26, and 28), consent management (Article 7), data processing principles and lawfulness (Article 5, 6, and 12), and territorial scope (Article 3). We also found seven research domains where GDPR compliant blockchains have been discussed, which include IoT, financial data, healthcare, personal identity, online data, information governance, and smart city. From our analysis, we have identified a few key research gaps and present a future research direction.

Submitted 1 April, 2021; originally announced April 2021.

Comments: Accepted for Publication in IEEE Access

Journal ref: IEEE Access, 2021

arXiv:2104.00632 [pdf]

doi 10.7717/peerj-cs.407

Blockchain and smart contract for IoT enabled smart agriculture

Authors: Tahmid Hasan Pranto, Abdulla All Noman, Atik Mahmud, AKM Bahalul Haque

Abstract: The agricultural sector is still lagging behind all other sectors in terms of using the newest technologies. For production, the latest machines are being introduced and adopted. However, pre-harvest and post-harvest processing are still done by following traditional methodologies while tracing, storing, and publishing agricultural data. As a result, farmers are not getting deserved payment, consumers are not getting enough information before buying their product, and intermediate person/processors are increasing retail prices. Using blockchain, smart contracts, and IoT devices, we can fully automate the process while establishing absolute trust among all these parties. In this research, we explored the different aspects of using blockchain and smart contracts with the integration of IoT devices in pre-harvesting and post-harvesting segments of agriculture. We proposed a system that uses blockchain as the backbone while IoT devices collect data from the field level, and smart contracts regulate the interaction among all these contributing parties. The system implementation has been shown in diagrams and with proper explanations. Gas costs of every operation have also been attached for a better understanding of the costs. We also analyzed the system in terms of challenges and advantages. The overall impact of this research was to show the immutable, available, transparent, and robustly secure characteristics of blockchain in the field of agriculture while also emphasizing the vigorous mechanism that the collaboration of blockchain, smart contract, and IoT presents.

Submitted 1 April, 2021; originally announced April 2021.

Journal ref: PeerJ Computer Science, 2021

Showing 1–50 of 100 results for author: Haque, A