Advancing Histopathology-Based Breast Cancer Diagnosis: Insights into Multi-Modality and Explainability

Authors: Faseela Abdullakutty, Younes Akbari, Somaya Al-Maadeed, Ahmed Bouridane, Rifat Hamoudi

Abstract: It is imperative that breast cancer is detected accurately and in a timely manner to improve patient outcomes. Diagnostic methodologies have traditionally relied on unimodal approaches; however, medical data analytics is integrating diverse data sources beyond conventional imaging. The use of multi-modal techniques that integrate both image and non-image data marks a transformative advancement in breast cancer diagnosis. The purpose of this review is to explore the burgeoning field of multimodal techniques, particularly the fusion of histopathology images with non-image data. Further, Explainable AI (XAI) will be used to elucidate the decision-making processes of complex algorithms, emphasizing the necessity of explainability in diagnostic processes. This review utilizes multi-modal data and emphasizes explainability to enhance diagnostic accuracy, clinician confidence, and patient engagement, ultimately fostering more personalized treatment strategies for breast cancer, while also identifying research gaps in multi-modality and explainability, guiding future studies, and contributing to the strategic direction of the field.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 31 pages including references

arXiv:2311.08059 [pdf, other]

FS-Net: Full Scale Network and Adaptive Threshold for Improving Extraction of Micro-Retinal Vessel Structures

Authors: Melaku N. Getahun, Oleg Y. Rogov, Dmitry V. Dylov, Andrey Somov, Ahmed Bouridane, Rifat Hamoudi

Abstract: Retinal vascular segmentation, a widely researched subject in biomedical image processing, aims to relieve ophthalmologists' workload when detecting and treating retinal disorders. However, segmenting retinal vessels has its own set of challenges, with prior techniques failing to generate adequate results when segmenting branches and microvascular structures. Recent neural network approaches are unable to keep local and global properties together and fail to capture tiny end vessels, which makes it challenging to attain the desired result. To address this retinal vessel segmentation problem, we propose a full-scale micro-vessel extraction mechanism based on an encoder-decoder neural network architecture, sigmoid smoothing, and an adaptive threshold method. The network consists of residual, encoder booster, bottleneck enhancement, squeeze, and excitation building blocks. All of these blocks together help to improve the feature extraction and prediction of the segmentation map. The proposed solution has been evaluated using the DRIVE, CHASE-DB1, and STARE datasets, and competitive results are obtained when compared with previous studies. The AUC and accuracy on the DRIVE dataset are 0.9884 and 0.9702, respectively. On the CHASE-DB1 dataset, the scores are 0.9903 and 0.9755, respectively. On the STARE dataset, the scores are 0.9916 and 0.9750, respectively. The performance achieved is one step ahead of what has been done in previous studies, and this results in a higher chance of having this solution in real-life diagnostic centers that seek ophthalmologists' attention.

Submitted 13 December, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: 7 pages, 2 figures, under consideration at Pattern Recognition Letters

arXiv:2301.13151 [pdf]

doi 10.2196/27394

Convolutional Neural Network-Based Automatic Classification of Colorectal and Prostate Tumor Biopsies Using Multispectral Imagery: System Development Study

Authors: Remy Peyret, Duaa alSaeed, Fouad Khelifi, Nadia Al-Ghreimil, Heyam Al-Baity, Ahmed Bouridane

Abstract: Colorectal and prostate cancers are the most common types of cancer in men worldwide. To diagnose colorectal and prostate cancer, a pathologist performs a histological analysis on needle biopsy samples. This manual process is time-consuming and error-prone, resulting in high intra- and interobserver variability, which affects diagnosis reliability. This study aims to develop an automatic computerized system for diagnosing colorectal and prostate tumors by using images of biopsy samples to reduce time and diagnosis error rates associated with human analysis. We propose a CNN model for classifying colorectal and prostate tumors from multispectral images of biopsy samples. The key idea was to remove the last block of the convolutional layers and halve the number of filters per layer. Our results showed excellent performance, with an average test accuracy of 99.8% and 99.5% for the prostate and colorectal data sets, respectively. The system showed excellent performance when compared with pretrained CNNs and other classification methods, as it avoids the preprocessing phase while using a single CNN model for classification. Overall, the proposed CNN architecture was globally the best-performing system for classifying colorectal and prostate tumor images. The proposed CNN was detailed and compared with previously trained network models used as feature extractors. These CNNs were also compared with other classification techniques. As opposed to pretrained CNNs and other classification approaches, the proposed CNN yielded excellent results. The computational complexity of the CNNs was also investigated; it was shown that the proposed CNN is better at classifying images than pretrained networks because it does not require preprocessing. Thus, the overall analysis was that the proposed CNN architecture was globally the best-performing system for classifying colorectal and prostate tumor images.

Submitted 30 January, 2023; originally announced January 2023.

Journal ref: JMIR Bioinform Biotech 2022

arXiv:2211.09317 [pdf, other]

doi 10.1109/JAS.2023.123123

Explainable, Domain-Adaptive, and Federated Artificial Intelligence in Medicine

Authors: Ahmad Chaddad, Qizong lu, Jiali Li, Yousef Katib, Reem Kateb, Camel Tanougast, Ahmed Bouridane, Ahmed Abdulkadir

Abstract: Artificial intelligence (AI) continues to transform data analysis in many domains. Progress in each domain is driven by a growing body of annotated data, increased computational resources, and technological innovations. In medicine, the sensitivity of the data, the complexity of the tasks, the potentially high stakes, and a requirement of accountability give rise to a particular set of challenges. In this review, we focus on three key methodological approaches that address some of the particular challenges in AI-driven medical decision making. (1) Explainable AI aims to produce a human-interpretable justification for each output. Such models increase confidence if the results appear plausible and match the clinicians' expectations. However, the absence of a plausible explanation does not imply an inaccurate model. Especially in highly non-linear, complex models that are tuned to maximize accuracy, such interpretable representations only reflect a small portion of the justification. (2) Domain adaptation and transfer learning enable AI models to be trained and applied across multiple domains. An example is a classification task based on images acquired on different acquisition hardware. (3) Federated learning enables learning large-scale models without exposing sensitive personal health information. Unlike centralized AI learning, where the centralized learning machine has access to the entire training data, the federated learning process iteratively updates models across multiple sites by exchanging only parameter updates, not personal health data. This narrative review covers the basic concepts, highlights relevant cornerstone and state-of-the-art research in the field, and discusses perspectives.

Submitted 16 November, 2022; originally announced November 2022.

Comments: This paper is accepted in IEEE CAA Journal of Automatica Sinica, Nov. 10 2022

Journal ref: 10.1109/JAS.2023.123123

arXiv:2208.04298 [pdf, other]

doi 10.3390/s22145462

Gaze Estimation Approach Using Deep Differential Residual Network

Authors: Longzhao Huang, Yujie Li, Xu Wang, Haoyu Wang, Ahmed Bouridane, Ahmad Chaddad

Abstract: Gaze estimation, which is a method to determine where a person is looking given the person's full face, is a valuable clue for understanding human intention. Similarly to other domains of computer vision, deep learning (DL) methods have gained recognition in the gaze estimation domain. However, there are still gaze calibration problems in the gaze estimation domain, which prevent existing methods from further improving performance. An effective solution is to directly predict the difference information of two human eyes, as in the differential network (Diff-Nn). However, this solution results in a loss of accuracy when using only one inference image. We propose a differential residual model (DRNet) combined with a new loss function to make use of the difference information of two eye images. We treat the difference information as auxiliary information. We assess the proposed model (DRNet) mainly using two public datasets: (1) MpiiGaze and (2) Eyediap. Considering only the eye features, DRNet outperforms the state-of-the-art gaze estimation methods with an angular error of 4.57 and 6.14 using the MpiiGaze and Eyediap datasets, respectively. Furthermore, the experimental results also demonstrate that DRNet is extremely robust to noise images.

Submitted 8 August, 2022; originally announced August 2022.

Journal ref: Sensors 2022, 22(14), 5462

arXiv:2105.07533 [pdf]

Private Facial Diagnosis as an Edge Service for Parkinson's DBS Treatment Valuation

Authors: Richard Jiang, Paul Chazot, Danny Crookes, Ahmed Bouridane, M Emre Celebi

Abstract: Facial phenotyping has recently been successfully exploited for medical diagnosis as a novel way to diagnose a range of diseases, where facial biometrics has been revealed to have rich links to underlying genetic or medical causes. In this paper, taking Parkinson's Disease (PD) as a case study, we proposed an Artificial-Intelligence-of-Things (AIoT) edge-oriented privacy-preserving facial diagnosis framework to analyze the treatment of Deep Brain Stimulation (DBS) on PD patients. In the proposed framework, a new edge-based information-theoretically secure framework is proposed to implement private deep facial diagnosis as a service over a privacy-preserving AIoT-oriented multi-party communication scheme, where partial homomorphic encryption (PHE) is leveraged to enable privacy-preserving deep facial diagnosis directly on encrypted facial patterns. In our experiments with a collected facial dataset from PD patients, for the first time, we demonstrated that facial patterns could be used to valuate the improvement of PD patients undergoing DBS treatment. We further implemented a privacy-preserving deep facial diagnosis framework that can achieve the same accuracy as the non-encrypted one, showing the potential of our privacy-preserving facial diagnosis as a trustworthy edge service for grading the severity of PD in patients.

Submitted 16 May, 2021; originally announced May 2021.

Comments: Under review

arXiv:2008.12616 [pdf]

Towards Building A Facial Identification System Using Quantum Machine Learning Techniques

Authors: Philip Easom-McCaldin, Ahmed Bouridane, Ammar Belatreche, Richard Jiang

Abstract: In the modern world, facial identification is an extremely important task in which many applications rely on high-performing algorithms to detect faces efficiently. Whilst commonly used classical methods such as SVM and k-NN may perform to a good standard, they are often highly complex and take substantial computing power to run effectively. With the rise of quantum computing boasting large speedups without sacrificing large amounts of much-needed performance, we aim to explore the benefits that quantum machine learning techniques can bring when specifically targeted towards facial identification applications. In the following work, we explore a quantum scheme which uses fidelity estimations of feature vectors in order to determine the classification result. Here, we are able to achieve exponential speedups by utilizing the principles of quantum computing without sacrificing large proportions of performance in terms of classification accuracy. We also discuss limitations of the work and where some future efforts should be placed in order to produce robust quantum algorithms that can perform to the same standard as classical methods whilst utilizing the speedup performance gains.

Submitted 26 August, 2020; originally announced August 2020.

Journal ref: 2020 3rd International Conference on Watermarking and Image Processing

arXiv:1909.04974 [pdf]

Computer-Aided Automated Detection of Gene-Controlled Social Actions of Drosophila

Authors: Khan Faraz, Ahmed Bouridane, Richard Jiang, Tiancheng Xia, Paul Chazot, Abdel Ennaceur

Abstract: Gene expression of social actions in Drosophilae has been attracting wide interest from biologists, medical scientists and psychologists. Gene-edited Drosophilae have been used as a test platform for experimental investigation. For example, Parkinson's genes can be embedded into a group of newly bred Drosophilae for research purposes. However, human observation of numerous tiny Drosophilae over long periods is arduous work, and the dependence on acute human perception is highly unreliable. As a result, an automated system of social action detection using machine learning is in high demand. In this study, we propose to automate the detection and classification of two innate aggressive actions demonstrated by Drosophilae. Robust keypoint detection is achieved using selective spatio-temporal interest points (sSTIP) which are then described using the 3D Scale Invariant Feature Transform (3D-SIFT) descriptors. Dimensionality reduction is performed using Spectral Regression Kernel Discriminant Analysis (SR-KDA) and classification is done using the nearest centre rule. The classification accuracy shown demonstrates the feasibility of the proposed system.

Submitted 11 September, 2019; originally announced September 2019.

Comments: Published at the International Conference on Smart Cities at Cambridge 2018

Journal ref: International Conference on Smart Cities at Cambridge 2018

arXiv:1909.02157 [pdf]

Atypical Facial Landmark Localisation with Stacked Hourglass Networks: A Study on 3D Facial Modelling for Medical Diagnosis

Authors: Gary Storey, Ahmed Bouridane, Richard Jiang, Chang-tsun Li

Abstract: While facial biometrics has been widely used for identification purposes, it has recently been researched as medical biometrics for a range of diseases. In this chapter, we investigate facial landmark detection for atypical 3D facial modelling in facial palsy cases, where such modelling can potentially assist medical diagnosis using atypical facial features. In our work, a study of landmark localisation methods such as stacked hourglass networks is conducted and evaluated to ascertain their accuracy when presented with unseen atypical faces. The evaluation highlights that the state-of-the-art stacked hourglass architecture outperforms other traditional methods.

Submitted 4 September, 2019; originally announced September 2019.

Comments: In press, 2019

Journal ref: Deep Biometrics, Springer Book, 2019

arXiv:1907.11272 [pdf, other]

A Novel Approach for Robust Multi Human Action Recognition and Summarization based on 3D Convolutional Neural Networks

Authors: Noor Almaadeed, Omar Elharrouss, Somaya Al-Maadeed, Ahmed Bouridane, Azeddine Beghdadi

Abstract: Human actions in videos are 3D signals. However, there are few methods available for multiple human action recognition. For long videos, it is difficult to search within a video for a specific action and/or person. To this end, this paper proposes a new technique for multiple human action recognition and summarization for surveillance videos. The approach introduces a new representation of the data by extracting the sequence of each person from the scene. This is followed by an analysis of each sequence to detect and recognize the corresponding actions using 3D convolutional neural networks (3DCNNs). Action-based video summarization is performed by saving each person's action at each time of the video. Results of this work revealed that the proposed method provides accurate multi human action recognition that is easily used for summarization of any action. Further, for other videos that can be collected from the internet, which are complex and not built for surveillance applications, the proposed model was evaluated on some datasets like UCF101 and YouTube without any preprocessing. For this category of videos, the summarization is performed on the video sequences by summarizing the actions in each subsequence. The results obtained demonstrate its efficiency compared to state-of-the-art methods.

Submitted 15 March, 2021; v1 submitted 25 July, 2019; originally announced July 2019.

arXiv:1905.13607 [pdf]

3DPalsyNet: A Facial Palsy Grading and Motion Recognition Framework using Fully 3D Convolutional Neural Networks

Authors: Gary Storey, Richard Jiang, Shelagh Keogh, Ahmed Bouridane, Chang-Tsun Li

Abstract: The capability to perform facial analysis from video sequences has significant potential to have a positive impact in many areas of life. One such area relates to the medical domain, specifically to aid in the diagnosis and rehabilitation of patients with facial palsy. With this application in mind, this paper presents an end-to-end framework, named 3DPalsyNet, for the tasks of mouth motion recognition and facial palsy grading. 3DPalsyNet utilizes a 3D CNN architecture with a ResNet backbone for the prediction of these dynamic tasks. Leveraging transfer learning from a 3D CNN pre-trained on the Kinetics data set for general action recognition, the model is modified to apply joint supervised learning using center and softmax loss concepts. 3DPalsyNet is evaluated on a test set consisting of individuals with varying ranges of facial palsy and mouth motions, and the results have shown an attractive level of classification accuracy in these tasks of 82% and 86%, respectively. The effects of frame duration and loss function on the predictive qualities of the proposed 3DPalsyNet were studied, and it was found that a shorter frame duration of 8 performed best for this specific task. Combining centre loss with softmax loss improved spatio-temporal feature learning over softmax loss alone; this is in agreement with earlier work involving the spatial domain.

Submitted 31 May, 2019; originally announced May 2019.

Journal ref: IEEE Access 2019

arXiv:1905.12759 [pdf]

Distant Pedestrian Detection in the Wild using Single Shot Detector with Deep Convolutional Generative Adversarial Networks

Authors: Ranjith Dinakaran, Philip Easom, Li Zhang, Ahmed Bouridane, Richard Jiang, Eran Edirisinghe

Abstract: In this work, we examine the feasibility of applying Deep Convolutional Generative Adversarial Networks (DCGANs) with Single Shot Detector (SSD) as a data-processing technique to handle the challenge of pedestrian detection in the wild. Specifically, we attempted to use in-fill completion (where a portion of the image is masked) to generate random transformations of images with portions missing to expand existing labelled datasets. In our work, the GAN has been trained intensively on low-resolution images in order to neutralize the challenges of pedestrian detection in the wild, and humans and a few other classes were considered for detection in smart cities. The object detector experiment performed by training the GAN model along with SSD provided a substantial improvement in the results. This approach presents a very interesting overview of the current state of the art on GAN networks for object detection. We used the Canadian Institute for Advanced Research (CIFAR), Caltech, and KITTI data sets for training and testing the network under different resolutions, and the experimental results are shown with a comparison between DCGAN cascaded with SSD and SSD itself.

Submitted 29 May, 2019; originally announced May 2019.

Comments: arXiv admin note: text overlap with arXiv:1711.08174, arXiv:1511.06434, arXiv:1706.05274 by other authors

Journal ref: The 2019 International Joint Conference on Neural Networks (IJCNN)

arXiv:1812.00876 [pdf]

Deep Learning based Pedestrian Detection at Distance in Smart Cities

Authors: Ranjith K Dinakaran, Philip Easom, Ahmed Bouridane, Li Zhang, Richard Jiang, Fozia Mehboob, Abdul Rauf

Abstract: Generative adversarial networks (GANs) have been promising for many computer vision problems due to their powerful capabilities to enhance the data for training and testing. In this paper, we leveraged GANs and proposed a new architecture with a cascaded Single Shot Detector (SSD) for pedestrian detection at distance, which remains a challenge due to the varied sizes of pedestrians in videos at distance. To overcome the low-resolution issues in pedestrian detection at distance, DCGAN is employed to improve the resolution first to reconstruct more discriminative features for an SSD to detect objects in images or videos. A crucial advantage of our method is that it learns a multi-scale metric to distinguish multiple objects at different distances within one image, while DCGAN serves as an encoder-decoder platform to generate parts of an image that contain better discriminative information. To measure the effectiveness of our proposed method, experiments were carried out on the Canadian Institute for Advanced Research (CIFAR) dataset, and it was demonstrated that the proposed new architecture achieved a much better detection rate, particularly on vehicles and pedestrians at distance, making it highly suitable for smart cities applications that need to discover key objects or pedestrians at distance.

Submitted 16 May, 2019; v1 submitted 18 November, 2018; originally announced December 2018.

Comments: Artificial Intelligence Conference 2019 | IntelliSys 2019 | https://saiconference.com/IntelliSys

Showing 1–13 of 13 results for author: Bouridane, A