Search | arXiv e-print repository

Linguistic-Based Mild Cognitive Impairment Detection Using Informative Loss

Authors: Ali Pourramezan Fard, Mohammad H. Mahoor, Muath Alsuhaibani, Hiroko H. Dodge

Abstract: This paper presents a deep learning method using Natural Language Processing (NLP) techniques to distinguish between Mild Cognitive Impairment (MCI) and Normal Cognitive (NC) conditions in older adults. We propose a framework that analyzes transcripts generated from video interviews collected within the I-CONECT study project, a randomized controlled trial aimed at improving cognitive functions through video chats. Our proposed NLP framework consists of two Transformer-based modules, namely Sentence Embedding (SE) and Sentence Cross Attention (SCA). First, the SE module captures contextual relationships between words within each sentence. Subsequently, the SCA module extracts temporal features from a sequence of sentences. This feature is then used by a Multi-Layer Perceptron (MLP) for the classification of subjects into MCI or NC. To build a robust model, we propose a novel loss function, called InfoLoss, that considers the reduction in entropy by observing each sequence of sentences to ultimately enhance the classification accuracy. The results of our comprehensive model evaluation using the I-CONECT dataset show that our framework can distinguish between MCI and NC with an average area under the curve of 84.75%.

Submitted 23 January, 2024; originally announced February 2024.
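
The abstract above does not give the form of InfoLoss, so the following is only a minimal sketch of the quantity it mentions: the reduction in entropy after observing evidence. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a two-class distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def entropy_reduction(prior: float, posterior: float) -> float:
    """Bits of uncertainty removed when the MCI probability moves
    from `prior` to `posterior` (assumed illustration, not InfoLoss itself)."""
    return binary_entropy(prior) - binary_entropy(posterior)


# Hypothetical example: an uninformed prior of 0.5 sharpened to 0.9
# after the model has seen one sequence of sentences.
gain = entropy_reduction(0.5, 0.9)
```

A sequence that leaves the prediction at 0.5 yields a reduction of zero, which is one way to read "informative" in this context.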

arXiv:2302.00908 [pdf, other]

GANalyzer: Analysis and Manipulation of GANs Latent Space for Controllable Face Synthesis

Authors: Ali Pourramezan Fard, Mohammad H. Mahoor, Sarah Ariel Lamer, Timothy Sweeny

Abstract: Generative Adversarial Networks (GANs) are capable of synthesizing high-quality facial images. Despite their success, GANs do not provide any information about the relationship between the input vectors and the generated images. Currently, facial GANs are trained on imbalanced datasets, which generate less diverse images. For example, more than 77% of 100K images that we randomly synthesized using StyleGAN3 are classified as Happy, and only around 3% are Angry. The problem becomes even worse when a mixture of facial attributes is desired: less than 1% of the generated samples are Angry Woman, and only around 2% are Happy Black. To address these problems, this paper proposes a framework, called GANalyzer, for the analysis and manipulation of the latent space of well-trained GANs. GANalyzer consists of a set of transformation functions designed to manipulate latent vectors for a specific facial attribute such as facial Expression, Age, Gender, and Race. We analyze facial attribute entanglement in the latent space of GANs and apply the proposed transformation for editing the disentangled facial attributes. Our experimental results demonstrate the strength of GANalyzer in editing facial attributes and generating any desired faces. We also create and release a balanced photo-realistic human face dataset. Our code is publicly available on GitHub.

Submitted 2 February, 2023; originally announced February 2023.
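
The GANalyzer abstract speaks of transformation functions on latent vectors without specifying them. As a hedged illustration only, the sketch below uses a plain linear shift along an attribute direction, a common and simple form of latent editing; it is not confirmed to be GANalyzer's transformation, and `shift_latent`, `age_direction`, and the numbers are hypothetical.

```python
from typing import List


def shift_latent(w: List[float], direction: List[float], alpha: float) -> List[float]:
    """Move latent vector `w` by `alpha` along the unit-normalised `direction`."""
    if len(w) != len(direction):
        raise ValueError("w and direction must have the same length")
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [wi + alpha * di / norm for wi, di in zip(w, direction)]


# Hypothetical 3-dimensional latent and attribute direction (real latents
# are much larger; the direction would be estimated from labelled samples).
w = [0.2, -1.0, 0.5]
age_direction = [0.0, 3.0, 4.0]
w_edited = shift_latent(w, age_direction, alpha=2.0)
```

Feeding `w_edited` instead of `w` to a generator would then change the image along the chosen attribute, to the extent that the attribute is linearly separable and disentangled in the latent space.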

arXiv:2203.15835 [pdf, other]

ACR Loss: Adaptive Coordinate-based Regression Loss for Face Alignment

Authors: Ali Pourramezan Fard, Mohammad H. Mahoor

Abstract: Although deep neural networks have achieved reasonable accuracy in solving face alignment, it is still a challenging task, specifically when we deal with facial images under occlusion or extreme head poses. Heatmap-based Regression (HBR) and Coordinate-based Regression (CBR) are two of the main methods used for face alignment. CBR methods require less computer memory, though they perform worse than HBR methods. In this paper, we propose an Adaptive Coordinate-based Regression (ACR) loss to improve the accuracy of CBR for face alignment. Inspired by the Active Shape Model (ASM), we generate Smooth-Face objects, a set of facial landmark points with less variation compared to the ground truth landmark points. We then introduce a method to estimate the level of difficulty in predicting each landmark point for the network by comparing the distribution of the ground truth landmark points and the corresponding Smooth-Face objects. Our proposed ACR Loss can adaptively modify its curvature and the influence of the loss based on the difficulty level of predicting each landmark point in a face. Accordingly, the ACR Loss guides the network toward challenging points rather than easier points, which improves the accuracy of the face alignment task. Our extensive evaluation shows the capabilities of the proposed ACR Loss in predicting facial landmark points in various facial images.

Submitted 14 September, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

Comments: Accepted in International Conference on Pattern Recognition (ICPR) 2022
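
The ACR Loss abstract describes estimating a per-landmark difficulty by comparing ground truth points with their Smooth-Face counterparts, but gives no formula. The sketch below is an assumed stand-in: it scores difficulty as the normalised distance between the two point sets and uses it to weight an L1 error. It covers only the "influence" part of the description, not the adaptive curvature, and every name and number in it is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def landmark_difficulty(gt: List[Point], smooth: List[Point]) -> List[float]:
    """Per-landmark difficulty in [0, 1]: distance between each ground truth
    point and its smoothed counterpart, divided by the largest such distance."""
    dists = [math.dist(g, s) for g, s in zip(gt, smooth)]
    largest = max(dists)
    if largest == 0.0:
        return [0.0] * len(dists)
    return [d / largest for d in dists]


def weighted_l1(pred: List[Point], gt: List[Point], difficulty: List[float]) -> float:
    """Mean L1 error in which harder landmarks count up to twice as much."""
    total = 0.0
    for p, g, w in zip(pred, gt, difficulty):
        total += (1.0 + w) * (abs(p[0] - g[0]) + abs(p[1] - g[1]))
    return total / len(gt)


# Hypothetical three-landmark face.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
smooth = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
pred = [(0.1, 0.0), (1.1, 0.0), (2.1, 0.0)]

difficulty = landmark_difficulty(gt, smooth)
loss = weighted_l1(pred, gt, difficulty)
```

With equal raw errors on all three points, the third landmark contributes the most to `loss` because smoothing moved it furthest from the ground truth.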

arXiv:2111.10854 [pdf, other]

doi 10.1007/s10846-023-01952-w

XnODR and XnIDR: Two Accurate and Fast Fully Connected Layers For Convolutional Neural Networks

Authors: Jian Sun, Ali Pourramezan Fard, Mohammad H. Mahoor

Abstract: Capsule Network is powerful at defining the positional relationship between features in deep neural networks for visual recognition tasks, but it is computationally expensive and not suitable for running on mobile devices. The bottleneck is in the computational complexity of the Dynamic Routing mechanism used between the capsules. On the other hand, XNOR-Net is fast and computationally efficient, though it suffers from low accuracy due to information loss in the binarization process. To address the computational burdens of the Dynamic Routing mechanism, this paper proposes new Fully Connected (FC) layers by xnorizing the linear projection outside or inside the Dynamic Routing within the CapsFC layer. Specifically, our proposed FC layers have two versions, XnODR (Xnorize the Linear Projection Outside Dynamic Routing) and XnIDR (Xnorize the Linear Projection Inside Dynamic Routing). To test the generalization of both XnODR and XnIDR, we insert them into two different networks, MobileNetV2 and ResNet-50. Our experiments on three datasets, MNIST, CIFAR-10, and MultiMNIST validate their effectiveness. The results demonstrate that both XnODR and XnIDR help networks to have high accuracy with lower FLOPs and fewer parameters (e.g., 96.14% correctness with 2.99M parameters and 311.74M FLOPs on CIFAR-10).

Submitted 19 September, 2023; v1 submitted 21 November, 2021; originally announced November 2021.

Comments: 19 pages, 5 figures, 9 tables, 2 algorithms

Journal ref: J Intell Robot Syst 109, 17 (2023)
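
The XnODR/XnIDR abstract builds on XNOR-Net style binarization. As a rough sketch of that underlying idea only (not of the proposed layers or of Dynamic Routing), the code below binarizes two vectors to signs with a mean-absolute-value scale and computes their dot product with XNOR and popcount. The helper names are hypothetical.

```python
from typing import List, Tuple


def binarize(v: List[float]) -> Tuple[List[int], float]:
    """Return the sign pattern of `v` (+1 or -1) and its mean absolute value."""
    signs = [1 if x >= 0 else -1 for x in v]
    scale = sum(abs(x) for x in v) / len(v)
    return signs, scale


def xnor_popcount_dot(a: List[int], b: List[int]) -> int:
    """Dot product of two +1/-1 vectors using bitwise XNOR and popcount."""
    n = len(a)
    a_bits = sum(1 << i for i, s in enumerate(a) if s > 0)  # +1 -> bit 1
    b_bits = sum(1 << i for i, s in enumerate(b) if s > 0)  # -1 -> bit 0
    mask = (1 << n) - 1
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")  # positions with equal sign
    return 2 * agree - n  # agreements minus disagreements


def approx_dot(x: List[float], w: List[float]) -> float:
    """Binarized approximation of the real-valued dot product x . w."""
    x_signs, x_scale = binarize(x)
    w_signs, w_scale = binarize(w)
    return x_scale * w_scale * xnor_popcount_dot(x_signs, w_signs)


value = approx_dot([2.0, -2.0], [1.0, -1.0])
```

The approximation is exact here because every entry of each vector has the same magnitude; in general the sign-and-scale form discards magnitude information, which is the accuracy loss the abstract refers to.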

arXiv:2111.07047 [pdf, other]

Facial Landmark Points Detection Using Knowledge Distillation-Based Neural Networks

Authors: Ali Pourramezan Fard, Mohammad H. Mahoor

Abstract: Facial landmark detection is a vital step for numerous facial image analysis applications. Although some deep learning-based methods have achieved good performance in this task, they are often not suitable for running on mobile devices. Such methods rely on networks with many parameters, which makes the training and inference time-consuming. Training lightweight neural networks such as MobileNets is often challenging, and the models might have low accuracy. Inspired by knowledge distillation (KD), this paper presents a novel loss function to train a lightweight Student network (e.g., MobileNetV2) for facial landmark detection. We use two Teacher networks, a Tolerant-Teacher and a Tough-Teacher in conjunction with the Student network. The Tolerant-Teacher is trained using Soft-landmarks created by active shape models, while the Tough-Teacher is trained using the ground truth landmark points (aka Hard-landmarks). To utilize the facial landmark points predicted by the Teacher networks, we define an Assistive Loss (ALoss) for each Teacher network. Moreover, we define a loss function called KD-Loss that utilizes the facial landmark points predicted by the two pre-trained Teacher networks (EfficientNet-b3) to guide the lightweight Student network towards predicting the Hard-landmarks. Our experimental results on three challenging facial datasets show that the proposed architecture will result in a better-trained Student network that can extract facial landmark points with high accuracy.

Submitted 13 November, 2021; originally announced November 2021.

Comments: Accepted in Computer Vision and Image Understanding Journal

arXiv:2103.00119 [pdf, other]

ASMNet: a Lightweight Deep Neural Network for Face Alignment and Pose Estimation

Authors: Ali Pourramezan Fard, Hojjat Abdollahi, Mohammad Mahoor

Abstract: Active Shape Model (ASM) is a statistical model of object shapes that represents a target structure. ASM can guide machine learning algorithms to fit a set of points representing an object (e.g., face) onto an image. This paper presents a lightweight Convolutional Neural Network (CNN) architecture with a loss function being assisted by ASM for face alignment and estimating head pose in the wild. We use ASM to first guide the network towards learning a smoother distribution of the facial landmark points. Inspired by transfer learning, during the training process, we gradually harden the regression problem and guide the network towards learning the original landmark points distribution. We define multiple tasks in our loss function that are responsible for detecting facial landmark points as well as estimating the face pose. Learning multiple correlated tasks simultaneously builds synergy and improves the performance of individual tasks. We compare the performance of our proposed model called ASMNet with MobileNetV2 (which is about 2 times bigger than ASMNet) in both the face alignment and pose estimation tasks. Experimental results on challenging datasets show that by using the proposed ASM assisted loss function, the ASMNet performance is comparable with MobileNetV2 in the face alignment task. In addition, for face pose estimation, ASMNet performs much better than MobileNetV2. ASMNet achieves an acceptable performance for facial landmark points detection and pose estimation while having a significantly smaller number of parameters and floating-point operations compared to many CNN-based models.

Submitted 7 May, 2021; v1 submitted 26 February, 2021; originally announced March 2021.

Comments: Accepted at CVPR 2021 Biometrics Workshop, jointly with the Workshop on Analysis and Modeling of Faces and Gestures

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 1521-1530

arXiv:1204.3141 [pdf]

Secure Tracking in Sensor Networks using Adaptive Extended Kalman Filter

Authors: Ali P. Fard, Mahdy Nabaee

Abstract: Location information of sensor nodes has become an essential part of many applications in Wireless Sensor Networks (WSN). The importance of location estimation and object tracking has made them the target of many security attacks. Various methods have tried to provide location information with high accuracy, while many of them have neglected the fact that WSNs may be deployed in hostile environments. In this paper, we address the problem of securely tracking a Mobile Node (MN), which has received very little attention previously. A novel secure tracking algorithm is proposed based on Extended Kalman Filter (EKF) that is capable of tracking a Mobile Node (MN) with high resolution in the presence of compromised or colluding malicious beacon nodes. It filters out and identifies the malicious beacon data in the process of tracking. The proposed method considerably outperforms the previously proposed secure algorithms in terms of either detection rate or MSE. The experimental data based on different settings for the network has shown promising results.

Submitted 14 April, 2012; originally announced April 2012.
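
The secure-tracking abstract says the EKF-based algorithm filters out malicious beacon data but does not say how. One standard way to do this, shown here purely as an assumed illustration and not as the paper's algorithm, is to gate each range measurement on its normalised innovation before applying the EKF update. The function name, the 2-D position-only state, and the gate value of 9.0 are all hypothetical choices.

```python
import math
from typing import List, Tuple

Vec = List[float]
Mat = List[List[float]]


def ekf_range_update(x: Vec, P: Mat, beacon: Tuple[float, float], z: float,
                     r_var: float, gate: float = 9.0) -> Tuple[Vec, Mat, bool]:
    """One EKF update of a 2-D position `x` (covariance `P`) from a range
    measurement `z` to `beacon`. Returns (x, P, accepted); a measurement whose
    squared innovation exceeds `gate` times its variance is rejected."""
    dx, dy = x[0] - beacon[0], x[1] - beacon[1]
    predicted = math.hypot(dx, dy)
    if predicted == 0.0:
        return x, P, False  # Jacobian undefined at the beacon position
    h = [dx / predicted, dy / predicted]  # Jacobian of range w.r.t. position
    Ph = [P[0][0] * h[0] + P[0][1] * h[1],
          P[1][0] * h[0] + P[1][1] * h[1]]
    s = h[0] * Ph[0] + h[1] * Ph[1] + r_var  # innovation variance
    nu = z - predicted                       # innovation
    if nu * nu / s > gate:
        return x, P, False  # inconsistent with the current estimate: suspect beacon
    k = [Ph[0] / s, Ph[1] / s]               # Kalman gain
    x_new = [x[0] + k[0] * nu, x[1] + k[1] * nu]
    # P - K H P, written as P - k (P h)^T, which holds because P is symmetric.
    P_new = [[P[i][j] - k[i] * Ph[j] for j in range(2)] for i in range(2)]
    return x_new, P_new, True


x0, P0 = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
x1, P1, ok_honest = ekf_range_update(x0, P0, (10.0, 0.0), z=9.0, r_var=1.0)
x2, P2, ok_forged = ekf_range_update(x0, P0, (10.0, 0.0), z=30.0, r_var=1.0)
```

In this toy run the plausible range of 9.0 is accepted and pulls the estimate toward the beacon, while the forged range of 30.0 is rejected and leaves the state untouched. A gate of this kind cannot by itself catch colluding beacons that agree with each other, which the abstract lists as part of the threat model.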

Showing 1–7 of 7 results for author: Fard, A P