-
M2ANET: Mobile Malaria Attention Network for efficient classification of plasmodium parasites in blood cells
Authors:
Salam Ahmed Ali,
Peshraw Salam Abdulqadir,
Shan Ali Abdullah,
Haruna Yunusa
Abstract:
Malaria is a life-threatening infectious disease caused by Plasmodium parasites, which poses a significant public health challenge worldwide, particularly in tropical and subtropical regions. Timely and accurate detection of malaria parasites in blood cells is crucial for effective treatment and control of the disease. In recent years, deep learning techniques have demonstrated remarkable success in medical image analysis tasks, offering promising avenues for improving diagnostic accuracy. However, studies on hybrid mobile models remain limited, owing to the complexity of combining two distinct models and the significant memory demand of the self-attention mechanism, especially on edge devices. In this study, we explore the potential of designing a hybrid mobile model for efficient classification of Plasmodium parasites in blood cell images, and present M2ANET (Mobile Malaria Attention Network). The model integrates MBConv3 (MobileNetV3 blocks) for efficient extraction of local features within blood cell images and a modified global-MHSA (multi-head self-attention) mechanism in the latter stages of the network for capturing global context. Through extensive experimentation on benchmark data, we demonstrate that M2ANET outperforms some state-of-the-art lightweight and mobile networks in terms of both accuracy and efficiency. Moreover, we discuss the potential implications of M2ANET in advancing malaria diagnosis and treatment, highlighting its suitability for deployment in resource-constrained healthcare settings. The development of M2ANET represents a significant advancement in the pursuit of efficient and accurate malaria detection, with broader implications for medical image analysis and global healthcare initiatives.
Submitted 23 May, 2024;
originally announced May 2024.
-
An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape
Authors:
Sifat Muhammad Abdullah,
Aravind Cheruvu,
Shravya Kanchi,
Taejoong Chung,
Peng Gao,
Murtuza Jadliwala,
Bimal Viswanath
Abstract:
Deepfake or synthetic images produced using deep generative models pose serious risks to online platforms. This has triggered several research efforts to accurately detect deepfake images, achieving excellent performance on publicly available deepfake datasets. In this work, we study 8 state-of-the-art detectors and argue that they are far from being ready for deployment due to two recent developments. First, the emergence of lightweight methods to customize large generative models can enable an attacker to create many customized generators (to create deepfakes), thereby substantially increasing the threat surface. We show that existing defenses fail to generalize well to such user-customized generative models that are publicly available today. We discuss new machine learning approaches based on content-agnostic features and ensemble modeling to improve generalization performance against user-customized models. Second, the emergence of vision foundation models (machine learning models trained on broad data that can be easily adapted to several downstream tasks) can be misused by attackers to craft adversarial deepfakes that can evade existing defenses. We propose a simple adversarial attack that leverages existing foundation models to craft adversarial samples without adding any adversarial noise, through careful semantic manipulation of the image content. We highlight the vulnerabilities of several defenses against our attack, and explore directions leveraging advanced foundation models and adversarial training to defend against this new threat.
Submitted 24 April, 2024;
originally announced April 2024.
-
A Genetic Algorithm-Based Support Vector Machine Approach for Intelligent Usability Assessment of m-Learning Applications
Authors:
Muhammad Asghar,
Imran Sarwar Bajwa,
Shabana Ramzan,
Hina Afreen,
Saima Abdullah
Abstract:
In the field of human-computer interaction (HCI), the usability assessment of m-learning (mobile-learning) applications is a real challenge. Such assessment typically involves extraction of the best features of an application, such as efficiency, effectiveness, learnability, cognition, and memorability, and further ranking of those features for an overall assessment of the quality of the mobile application. The previous literature offers neither a theory nor a tool to measure a user's perception and assessment of the usability features of an m-learning application in order to rank the graphical user interface of a mobile application in terms of user acceptance and satisfaction. This paper presents a novel approach based on quantitative and qualitative analysis of mobile applications. A criterion is defined from a set of important features, based on user requirements and perception. Afterward, for the qualitative analysis, a genetic algorithm (GA) is used to score the prescribed features for the usability assessment of a mobile application. The approach assigns a score to each usability feature according to the user requirement and the weight of each feature. GA initially performs the rank assessment process by performing feature selection and scoring the best features of the application. The assessment analysis of GA is compared with that of various machine learning models: K-nearest neighbours, Naive Bayes, and Random Forests. It was found that a GA-based support vector machine (SVM) is more accurate in the extraction of the best features of a mobile application and the further ranking of those features.
Submitted 4 April, 2024;
originally announced April 2024.
-
RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Authors:
Parth Sarthi,
Salman Abdullah,
Aditi Tuli,
Shubh Khanna,
Anna Goldie,
Christopher D. Manning
Abstract:
Retrieval-augmented language models can better adapt to changes in world state and incorporate long-tail knowledge. However, most existing methods retrieve only short contiguous chunks from a retrieval corpus, limiting holistic understanding of the overall document context. We introduce the novel approach of recursively embedding, clustering, and summarizing chunks of text, constructing a tree with differing levels of summarization from the bottom up. At inference time, our RAPTOR model retrieves from this tree, integrating information across lengthy documents at different levels of abstraction. Controlled experiments show that retrieval with recursive summaries offers significant improvements over traditional retrieval-augmented LMs on several tasks. On question-answering tasks that involve complex, multi-step reasoning, we show state-of-the-art results; for example, by coupling RAPTOR retrieval with the use of GPT-4, we can improve the best performance on the QuALITY benchmark by 20% in absolute accuracy.
Submitted 31 January, 2024;
originally announced January 2024.
-
Designing Voice Interfaces to Support Mindfulness-Based Pain Management
Authors:
Sanjana Mendu,
Sebrina L. Doyle Fosco,
Stephanie T. Lanza,
Saeed Abdullah
Abstract:
Objective: Chronic pain is a critical public health issue affecting approximately 20% of the adult population in the United States. Given the opioid crisis, there has been an urgent focus on non-addictive pain management methods including Mindfulness-Based Stress Reduction (MBSR). Prior work has successfully used MBSR for pain management. However, ensuring longitudinal engagement to MBSR practices remains a serious challenge. In this work, we explore the utility of a voice interface to support MBSR home practice.
Methods: We interviewed ten mindfulness program facilitators to understand how such a technology might fit in the context of the MBSR class and identify potential usability issues with our prototype. We then used directed content analysis to identify key themes and sub-themes within the interview data.
Results: Our findings show that facilitators supported the use of the voice interface for MBSR, particularly for individuals with limited motor function. Facilitators also highlighted unique affordances of voice interfaces, including perceived social presence, to support sustained engagement.
Conclusion: We demonstrate the acceptability of a voice interface to support home practice for MBSR participants among trained mindfulness facilitators. Based on our findings, we outline design recommendations for technologies aiming to provide longitudinal support for mindfulness-based interventions. Future work should further these efforts towards making non-addictive pain management interventions accessible and efficacious for a wide audience of users.
Submitted 12 September, 2023;
originally announced September 2023.
-
Supportive Fintech for Individuals with Bipolar Disorder: Financial Data Sharing Preferences to Support Longitudinal Care Management
Authors:
Jeff Brozena,
Johnna Blair,
Thomas Richardson,
Mark Matthews,
Dahlia Mukherjee,
Erika F. H. Saunders,
Saeed Abdullah
Abstract:
Financial stability is a key challenge for individuals living with bipolar disorder (BD). Symptomatic periods in BD are associated with poor financial decision-making, contributing to a negative cycle of worsening symptoms and an increased risk of bankruptcy. There has been an increased focus on designing supportive financial technologies (fintech) to address varying and intermittent needs across different stages of BD. However, little is known about this population's expectations and privacy preferences related to financial data sharing for longitudinal care management. To address this knowledge gap, we have deployed a factorial vignette survey using the Contextual Integrity framework. Our data from individuals with BD (N=480) shows that they are open to sharing financial data for long term care management. We have also identified significant differences in sharing preferences across age, gender, and diagnostic subtype. We discuss the implications of these findings in designing equitable fintech to support this marginalized community.
Submitted 28 February, 2024; v1 submitted 27 June, 2023;
originally announced June 2023.
-
Word level Bangla Sign Language Dataset for Continuous BSL Recognition
Authors:
Md Shamimul Islam,
A. J. M. Akhtarujjaman Joha,
Md Nur Hossain,
Sohaib Abdullah,
Ibrahim Elwarfalli,
Md Mahedi Hasan
Abstract:
A robust sign language recognition system can greatly alleviate communication barriers, particularly for people who struggle with verbal communication. This is crucial for human growth and progress as it enables the expression of thoughts, feelings, and ideas. However, sign recognition is a complex task that faces numerous challenges, such as identical gesture patterns for multiple signs, lighting, clothing, carrying conditions, and the presence of large poses, as well as illumination discrepancies across different views. Additionally, the absence of an extensive Bangla sign language video dataset makes it even more challenging to operate recognition systems, particularly when utilizing deep learning techniques. To address this issue, we first created a large-scale dataset called MVBSL-W50, which comprises 50 isolated words across 13 categories. Second, we developed an attention-based Bi-GRU model that captures the temporal dynamics of pose information for individuals communicating through sign language. The proposed model utilizes human pose information, which has been shown to be successful in analyzing sign language patterns. By focusing solely on movement information and disregarding body appearance and environmental factors, the model is simplified and can achieve faster performance. The accuracy of the model is reported to be 85.64%.
Submitted 9 April, 2023; v1 submitted 22 February, 2023;
originally announced February 2023.
-
Deepfake Text Detection: Limitations and Opportunities
Authors:
Jiameng Pu,
Zain Sarwar,
Sifat Muhammad Abdullah,
Abdullah Rehman,
Yoonjin Kim,
Parantapa Bhattacharya,
Mobin Javed,
Bimal Viswanath
Abstract:
Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from 4 online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks, and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their original claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.
Submitted 17 October, 2022;
originally announced October 2022.
-
Privacy Sensitive Speech Analysis Using Federated Learning to Assess Depression
Authors:
Suhas BN,
Saeed Abdullah
Abstract:
Recent studies have used speech signals to assess depression. However, speech features can lead to serious privacy concerns. To address these concerns, prior work has used privacy-preserving speech features. However, using a subset of features can lead to information loss and, consequently, non-optimal model performance. Furthermore, prior work relies on a centralized approach to support continuous model updates, posing privacy risks. This paper proposes to use Federated Learning (FL) to enable decentralized, privacy-preserving speech analysis to assess depression. Using an existing dataset (DAIC-WOZ), we show that FL models enable a robust assessment of depression with only 4–6% accuracy loss compared to a centralized approach. These models also outperform prior work using the same dataset. Furthermore, the FL models have short inference latency and small memory footprints while being energy-efficient. These models, thus, can be deployed on mobile devices for real-time, continuous, and privacy-preserving depression assessment at scale.
Submitted 20 May, 2022; v1 submitted 29 April, 2022;
originally announced May 2022.
-
Alexa as an Active Listener: How Backchanneling Can Elicit Self-Disclosure and Promote User Experience
Authors:
Eugene Cho,
Nasim Motalebi,
S. Shyam Sundar,
Saeed Abdullah
Abstract:
Active listening is a well-known skill applied in human communication to build intimacy and elicit self-disclosure to support a wide variety of cooperative tasks. When applied to conversational UIs, active listening from machines can also elicit greater self-disclosure by signaling to the users that they are being heard, which can have positive outcomes. However, it takes considerable engineering effort and training to embed active listening skills in machines at scale, given the need to personalize active-listening cues to individual users and their specific utterances. A more generic solution is needed given the increasing use of conversational agents, especially by the growing number of socially isolated individuals. With this in mind, we developed an Amazon Alexa skill that provides privacy-preserving and pseudo-random backchanneling to indicate active listening. User study (N = 40) data show that backchanneling improves perceived degree of active listening by smart speakers. It also results in more emotional disclosure, with participants using more positive words. Perception of smart speakers as active listeners is positively associated with perceived emotional support. Interview data corroborate the feasibility of using smart speakers to provide emotional support. These findings have important implications for smart speaker interaction design in several domains of cooperative work and social computing.
Submitted 22 September, 2022; v1 submitted 21 April, 2022;
originally announced April 2022.
-
Financial technologies (FinTech) for mental health: The potential of objective financial data to better understand the relationships between financial behavior and mental health
Authors:
Johnna Blair,
Jeff Brozena,
Mark Matthews,
Thomas Richardson,
Saeed Abdullah
Abstract:
In this paper, we present novel research methods for collecting and analyzing personal financial data alongside mental health factors, illustrated through an N=1 case study using data from one individual with bipolar disorder. While we have not found statistically significant trends and our findings are not generalizable beyond this case, our approach provides an insight into the challenges of accessing objective financial data. We outline what data is currently available, what can be done with it, and what factors to consider when working with financial data. More specifically, using these methods, researchers might be able to identify symptomatic traces of mental ill health in personal financial data, such as early warning signs, and thereby enable preemptive care for individuals with serious mental illnesses. Based on this work, we have also explored future directions for developing interventions to support financial wellbeing. Furthermore, we have described the technical, ethical, and equity challenges for financial data-driven assessments and intervention methods, as well as provided a broad research agenda to address these challenges. Leveraging objective, personalized financial data in a privacy-preserving and ethical manner can help lead to a shift in mental health care.
Submitted 9 January, 2023; v1 submitted 11 April, 2022;
originally announced April 2022.
-
Generative Adversarial Network (GAN) and Enhanced Root Mean Square Error (ERMSE): Deep Learning for Stock Price Movement Prediction
Authors:
Ashish Kumar,
Abeer Alsadoon,
P. W. C. Prasad,
Salma Abdullah,
Tarik A. Rashid,
Duong Thu Hang Pham,
Tran Quoc Vinh Nguyen
Abstract:
The prediction of stock price movement direction is significant in financial circles and academia. Stock prices contain complex, incomplete, and fuzzy information, which makes it extremely difficult to predict their development trend. Predicting and analysing financial data is a nonlinear, time-dependent problem. With rapid development in machine learning and deep learning, this task can be performed more effectively by a purposely designed network. This paper aims to improve prediction accuracy and minimize forecasting error loss through a deep learning architecture that uses Generative Adversarial Networks. A generic model is proposed, consisting of the Phase-space Reconstruction (PSR) method for reconstructing price series and a Generative Adversarial Network (GAN) that combines two neural networks, Long Short-Term Memory (LSTM) as the generative model and a Convolutional Neural Network (CNN) as the discriminative model, for adversarial training to forecast the stock market. LSTM generates new instances based on historical basic indicator information, and CNN then estimates whether the data is predicted by LSTM or is real. It was found that the GAN performed well on the enhanced root mean square error relative to LSTM: it was 4.35% more accurate in predicting the direction and reduced processing time and RMSE by 78 seconds and 0.029, respectively. The proposed system concentrates on minimizing the root mean square error and processing time while improving the direction prediction accuracy, and provides a better result in the accuracy of the stock index.
Submitted 30 November, 2021;
originally announced December 2021.
-
A Novel Visualization System of Using Augmented Reality in Knee Replacement Surgery: Enhanced Bidirectional Maximum Correntropy Algorithm
Authors:
Nitish Maharjan,
Abeer Alsadoon,
P. W. C. Prasad,
Salma Abdullah,
Tarik A. Rashid
Abstract:
Background and aim: Image registration and alignment are the main limitations of augmented reality-based knee replacement surgery. This research aims to decrease the registration error, eliminate outcomes that are trapped in local minima to improve the alignment problems, handle the occlusion, and maximize the overlapping parts. Methodology: A markerless image registration method was used for augmented reality-based knee replacement surgery to guide and visualize the surgical operation, while a weighted least squares algorithm was used to enhance stereo camera-based tracking by filling border occlusion in the right-to-left direction and non-border occlusion in the left-to-right direction. Results: This study has improved video precision to an alignment error of 0.57–0.61 mm. Furthermore, with the use of bidirectional points, that is, forward and backward directional cloud points, the number of iterations in image registration was decreased, which improved the processing time as well. The processing time of video frames was improved to 7.4–11.74 fps. Conclusions: The proposed system has focused on overcoming the misalignment difficulty caused by patient movement and on enhancing the AR visualization during knee replacement surgery. The proposed system was reliable and favorable, helping to eliminate alignment error by ascertaining the optimal rigid transformation between two cloud points and removing the outliers and non-Gaussian noise. The proposed augmented reality system helps in accurate visualization and navigation of knee anatomy such as the femur, tibia, cartilage, and blood vessels.
Submitted 13 March, 2021;
originally announced April 2021.
-
Design a Technology Based on the Fusion of Genetic Algorithm, Neural network and Fuzzy logic
Authors:
Raid R. Al-Nima,
Fawaz S. Abdullah,
Ali N. Hamoodi
Abstract:
This paper describes the design and development of a prototype technique for artificial intelligence based on the fusion of a genetic algorithm, a neural network, and fuzzy logic. It starts by establishing a relationship between the neural network and fuzzy logic. Then, it combines the genetic algorithm with them. Information fusion takes place at the confidence level, where matching scores can be reported and discussed. The technique is called Genetic Neuro-Fuzzy (GNF). It can be used in high-accuracy real-time environments.
Submitted 16 February, 2021;
originally announced February 2021.
-
A First Look at Privacy Analysis of COVID-19 Contact Tracing Mobile Applications
Authors:
Muhammad Ajmal Azad,
Junaid Arshad,
Ali Akmal,
Farhan Riaz,
Sidrah Abdullah,
Muhammad Imran,
Farhan Ahmad
Abstract:
Today's smartphones are equipped with a large number of powerful value-added sensors and features, such as a low-power Bluetooth sensor, a digital compass, an accelerometer, GPS sensors, Wi-Fi capabilities, a microphone, humidity sensors, health tracking sensors, and a camera. These value-added sensors have revolutionized human lives in many ways, such as tracking the health of patients and the movement of doctors, tracking employees' movement in large manufacturing units, and monitoring the environment. These embedded sensors could also be used for large-scale personal, group, and community sensing applications, especially tracing the spread of certain diseases. Governments and regulators are turning to these features to trace people thought to have symptoms of certain diseases or viruses, e.g. COVID-19. The outbreak of COVID-19 in December 2019 has led to a surge of mobile applications for tracing, tracking, and isolating persons showing COVID-19 symptoms to limit the spread of the disease to the larger community. The use of embedded sensors could disclose private information of the users, thus potentially threatening their privacy and security. In this paper, we analyze a large set of smartphone applications that have been designed to contain the spread of the COVID-19 virus and bring people back to normal life. Specifically, we analyze what types of permission these smartphone apps require, whether these permissions are necessary for track and trace, how data from the user devices is transported to the analytic center, and what security measures these apps have deployed to ensure the privacy and security of users.
Submitted 16 August, 2020; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Catastrophic forgetting: still a problem for DNNs
Authors:
B. Pfülb,
A. Gepperth,
S. Abdullah,
A. Kilian
Abstract:
We investigate the performance of DNNs when trained on class-incremental visual problems consisting of initial training, followed by retraining with added visual classes. Catastrophic forgetting (CF) behavior is measured using a new evaluation procedure that aims at an application-oriented view of incremental learning. In particular, it imposes that model selection must be performed on the initial dataset alone, as well as demanding that retraining control be performed only using the retraining dataset, as initial dataset is usually too large to be kept. Experiments are conducted on class-incremental problems derived from MNIST, using a variety of different DNN models, some of them recently proposed to avoid catastrophic forgetting. When comparing our new evaluation procedure to previous approaches for assessing CF, we find their findings are completely negated, and that none of the tested methods can avoid CF in all experiments. This stresses the importance of a realistic empirical measurement procedure for catastrophic forgetting, and the need for further research in incremental learning for DNNs.
Submitted 20 May, 2019;
originally announced May 2019.
-
Effectiveness of Crypto-Transcoding for H.264/AVC and HEVC Video Bit-streams
Authors:
Rizwan A. Shah,
Mamoona N. Asghar,
Saima Abdullah,
Martin Fleury,
Neelam Gohar
Abstract:
To avoid delays arising from a need to decrypt a video prior to transcoding and then re-encrypt it afterwards, this paper assesses a selective encryption (SE) content protection scheme. The scheme is suited to both recent standardized codecs, namely H.264/Advanced Video Coding (AVC) and High Efficiency Video Coding (HEVC). Specifically, the paper outlines a joint crypto-transcoding scheme for secure transrating of a video bitstream. That is to say, it generates new video bitrates, possibly as part of an HTTP Adaptive Streaming (HAS) content delivery network. The scheme will reduce the bitrate to one or more lower desired bitrates without consuming time in the encryption/decryption process, which would be the case when full encryption is used. In addition, the decryption key no longer needs to be exposed at intermediate middleboxes, including when transrating is performed in a cloud datacenter. The effectiveness of the scheme is variously evaluated: by examination of the SE-generated visual distortion; by the extent of computational and bitrate overheads; and by choice of cipher when encrypting the selected elements within the bitstream. Results indicate that any distortion introduced remains dependent on content, quantization level (after transrating of an encrypted video), and codec type. A further recommendation is that the Advanced Encryption Standard (AES) is preferred for SE to lightweight XOR encryption, despite the latter being taken up elsewhere as a real-time encryption method.
Submitted 19 February, 2019;
originally announced February 2019.
-
Deep Learning based Early Detection and Grading of Diabetic Retinopathy Using Retinal Fundus Images
Authors:
Sheikh Muhammad Saiful Islam,
Md Mahedi Hasan,
Sohaib Abdullah
Abstract:
Diabetic Retinopathy (DR) is a constantly deteriorating disease, being one of the leading causes of vision impairment and blindness. Subtle distinctions among the different grades and the existence of many small but significant features make the task of recognition very challenging. In addition, the present approach to retinopathy detection is a very laborious and time-intensive task, which heavily relies on the skill of a physician. Automated detection of diabetic retinopathy is essential to tackle these problems. Early-stage detection of diabetic retinopathy is also very important for diagnosis, which can prevent blindness with proper treatment. In this paper, we developed a novel deep convolutional neural network, which performs the early-stage detection by identifying all microaneurysms (MAs), the first signs of DR, along with correctly assigning labels to retinal fundus images, which are graded into five categories. We have tested our network on the largest publicly available Kaggle diabetic retinopathy dataset, and achieved a quadratic weighted kappa score of 0.851 and an AUC score of 0.844, which is state-of-the-art performance on severity grading. In the early-stage detection, we have achieved a sensitivity of 98% and a specificity of above 94%, which demonstrates the effectiveness of our proposed method. Our proposed architecture is at the same time very simple and efficient with respect to computational time and space.
Submitted 26 December, 2018;
originally announced December 2018.
-
Edge direction matrixes-based local binar patterns descriptor for shape pattern recognition
Authors:
Mohammed A. Talab,
Siti Norul Huda Sheikh Abdullah,
Bilal Bataineh
Abstract:
Shape and texture image recognition is an essential branch of pattern recognition. It is made up of techniques that aim at extracting information from images via human knowledge and works. Local Binary Pattern (LBP) ensures encoding of global and local information and scaling invariance by introducing a look-up table to reflect the uniformity structure of an object. However, edge direction matrixes (EDMS) apply only a global invariant descriptor, which employs first- and second-order relationships. The main idea behind this methodology is the need for improved recognition capabilities, a goal achieved by the combinative use of these descriptors. This collaboration aims to make use of the major advantages each one presents, by simultaneously complementing each other, in order to compensate for their weak points. By using multiple classifier approaches such as random forest and multi-layer perceptron neural network, the proposed combinative descriptor is compared with state-of-the-art combinative methods based on Gray-Level Co-occurrence matrix (GLCM with EDMS), LBP and moment invariants on four benchmark datasets: MPEG-7 CE-Shape-1, KTH-TIPS image, Enghlishfnt and Arabic calligraphy. The experiments have shown the superiority of the introduced descriptor over the GLCM with EDMS, LBP and moment invariants, and over other well-known descriptors from the literature such as Scale Invariant Feature Transform.
Submitted 26 November, 2014;
originally announced November 2014.
-
Face Detection Using Radial Basis Functions Neural Networks With Fixed Spread
Authors:
K. A. A. Aziz,
S. S. Abdullah
Abstract:
This paper presents a face detection system using Radial Basis Function Neural Networks With Fixed Spread Value. Face detection is the first step in a face recognition system. The purpose is to localize and extract the face region from the background that will be fed into the face recognition system for identification. A general preprocessing approach was used for normalizing the image, and a Radial Basis Function (RBF) Neural Network was used to distinguish between face and non-face. RBF Neural Networks offer several advantages compared to other neural network architectures: they can be trained using a fast two-stage training algorithm, and the network possesses the property of best approximation. The output of the network can be optimized by setting suitable values of the center and spread of the RBF. In this paper, a fixed spread value is used. The Radial Basis Function Neural Network (RBFNN) is used to distinguish faces from non-faces, and the system is evaluated on detection performance, False Acceptance Rate (FAR), False Rejection Rate (FRR) and discriminative properties.
Submitted 7 August, 2014;
originally announced October 2014.
-
Bipolar Fuzzy Soft sets and its applications in decision making problem
Authors:
Muhammad Aslam,
Saleem Abdullah,
Kifayat ullah
Abstract:
In this article, we combine the concepts of a bipolar fuzzy set and a soft set. We introduce the notion of a bipolar fuzzy soft set and study its fundamental properties. We study basic operations on bipolar fuzzy soft sets. We define the extended union and intersection of two bipolar fuzzy soft sets. We also give an application of bipolar fuzzy soft sets to decision making problems. We give a general algorithm to solve decision making problems by using bipolar fuzzy soft sets.
Submitted 23 March, 2013;
originally announced March 2013.
-
Investigating a Hybrid Metaheuristic For Job Shop Rescheduling
Authors:
Salwani Abdullah,
Uwe Aickelin,
Edmund Burke,
Aniza Din,
Rong Qu
Abstract:
Previous research has shown that artificial immune systems can be used to produce robust schedules in a manufacturing environment. The main goal is to develop building blocks (antibodies) of partial schedules that can be used to construct backup solutions (antigens) when disturbances occur during production. The building blocks are created based upon underpinning ideas from artificial immune systems and evolved using a genetic algorithm (Phase I). Each partial schedule (antibody) is assigned a fitness value and the best partial schedules are selected to be converted into complete schedules (antigens). We further investigate whether simulated annealing and the great deluge algorithm can improve the results when hybridised with our artificial immune system (Phase II). We use ten fixed solutions as our target and measure how well we cover these specific scenarios.
Submitted 12 March, 2008;
originally announced March 2008.