Phasor-Driven Acceleration for FFT-based CNNs

Authors: Eduardo Reis, Thangarajah Akilan, Mohammed Khalid

Abstract: Recent research in deep learning (DL) has investigated the use of the Fast Fourier Transform (FFT) to accelerate the computations involved in Convolutional Neural Networks (CNNs) by replacing spatial convolution with element-wise multiplications in the spectral domain. These approaches mainly rely on the FFT to reduce the number of operations, which can be further decreased by adopting the Real-Valued FFT. In this paper, we propose using the phasor form, a polar representation of complex numbers, as a more efficient alternative to the traditional approach. The experimental results, evaluated on CIFAR-10, demonstrate that our method achieves superior speed improvements of up to a factor of 1.376 (average of 1.316) during training and up to 1.390 (average of 1.321) during inference when compared to the traditional rectangular form employed in modern CNN architectures. Similarly, when evaluated on CIFAR-100, our method achieves superior speed improvements of up to a factor of 1.375 (average of 1.299) during training and up to 1.387 (average of 1.300) during inference. Most importantly, given the modular aspect of our approach, the proposed method can be applied to any existing convolution-based DL model without design changes.

Submitted 31 May, 2024; originally announced June 2024.

Comments: Presented in the 21st Conference on Robots and Vision (CRV 2024) Workshop
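
The phasor idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation; it relies only on the standard identity that multiplying two complex numbers in polar form multiplies their magnitudes and adds their phases. The helper names `to_phasor`, `mul_rect`, and `mul_phasor`, and the sample values, are hypothetical.

```python
import cmath

def to_phasor(z):
    """Convert a complex number to a (magnitude, phase) pair."""
    return cmath.polar(z)

def mul_rect(a, b):
    """Rectangular-form product: 4 real multiplications, 2 additions."""
    return complex(a.real * b.real - a.imag * b.imag,
                   a.real * b.imag + a.imag * b.real)

def mul_phasor(p, q):
    """Phasor-form product: 1 real multiplication, 1 addition."""
    return (p[0] * q[0], p[1] + q[1])

def spectral_product_rect(xs, ws):
    """Element-wise product of two spectra in rectangular form."""
    return [mul_rect(x, w) for x, w in zip(xs, ws)]

def spectral_product_phasor(xs, ws):
    """Same product computed in phasor form, converted back for comparison."""
    px = [to_phasor(x) for x in xs]
    pw = [to_phasor(w) for w in ws]
    return [cmath.rect(*mul_phasor(p, q)) for p, q in zip(px, pw)]

# Hypothetical spectral coefficients of an input and a kernel.
xs = [complex(1, 2), complex(-0.5, 0.25), complex(3, -1)]
ws = [complex(0.5, -1), complex(2, 2), complex(-1, 0.5)]

rect_out = spectral_product_rect(xs, ws)
phasor_out = spectral_product_phasor(xs, ws)

# Both forms agree up to floating-point rounding.
max_err = max(abs(r - p) for r, p in zip(rect_out, phasor_out))
```

The sketch does not count the cost of converting to and from polar form, so any real speed-up depends on how those conversions are amortized across layers; the abstract does not specify this.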

arXiv:2405.02321 [pdf, other]

Accelerating Medical Knowledge Discovery through Automated Knowledge Graph Generation and Enrichment

Authors: Mutahira Khalid, Raihana Rahman, Asim Abbas, Sushama Kumari, Iram Wajahat, Syed Ahmad Chan Bukhari

Abstract: Knowledge graphs (KGs) serve as powerful tools for organizing and representing structured knowledge. While their utility is widely recognized, challenges persist in their automation and completeness. Despite efforts in automation and the utilization of expert-created ontologies, gaps in connectivity remain prevalent within KGs. In response to these challenges, we propose an innovative approach termed "Medical Knowledge Graph Automation (M-KGA)". M-KGA leverages user-provided medical concepts and enriches them semantically using BioPortal ontologies, thereby enhancing the completeness of knowledge graphs through the integration of pre-trained embeddings. Our approach introduces two distinct methodologies for uncovering hidden connections within the knowledge graph: a cluster-based approach and a node-based approach. Through rigorous testing involving 100 frequently occurring medical concepts in Electronic Health Records (EHRs), our M-KGA framework demonstrates promising results, indicating its potential to address the limitations of existing knowledge graph automation techniques.

Submitted 21 April, 2024; originally announced May 2024.

Comments: 18 pages, 5 figures

arXiv:2401.11113 [pdf, other]

SleepNet: Attention-Enhanced Robust Sleep Prediction using Dynamic Social Networks

Authors: Maryam Khalid, Elizabeth B. Klerman, Andrew W. Mchill, Andrew J. K. Phillips, Akane Sano

Abstract: Sleep behavior significantly impacts health and acts as an indicator of physical and mental well-being. Monitoring and predicting sleep behavior with ubiquitous sensors may therefore assist in both sleep management and tracking of related health conditions. While sleep behavior depends on, and is reflected in the physiology of a person, it is also impacted by external factors such as digital media usage, social network contagion, and the surrounding weather. In this work, we propose SleepNet, a system that exploits social contagion in sleep behavior through graph networks and integrates it with physiological and phone data extracted from ubiquitous mobile and wearable devices for predicting next-day sleep labels about sleep duration. Our architecture overcomes the limitations of large-scale graphs containing connections irrelevant to sleep behavior by devising an attention mechanism. The extensive experimental evaluation highlights the improvement provided by incorporating social networks in the model. Additionally, we conduct robustness analysis to demonstrate the system's performance in real-life conditions. The outcomes affirm the stability of SleepNet against perturbations in input data. Further analyses emphasize the significance of network topology in prediction performance revealing that users with higher eigenvalue centrality are more vulnerable to data perturbations.

Submitted 26 January, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

Comments: Accepted for publication in Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 8 (March 2024)

arXiv:2310.10550 [pdf]

Deep learning applied to EEG data with different montages using spatial attention

Authors: Dung Truong, Muhammad Abdullah Khalid, Arnaud Delorme

Abstract: The ability of Deep Learning to process and extract relevant information in complex brain dynamics from raw EEG data has been demonstrated in various recent works. Deep learning models, however, have also been shown to perform best on large corpora of data. When processing EEG, a natural approach is to combine EEG datasets from different experiments to train large deep-learning models. However, most EEG experiments use custom channel montages, requiring the data to be transformed into a common space. Previous methods have used the raw EEG signal to extract features of interest and focused on using a common feature space across EEG datasets. While this is a sensible approach, it underexploits the potential richness of EEG raw data. Here, we explore using spatial attention applied to EEG electrode coordinates to perform channel harmonization of raw EEG data, allowing us to train deep learning on EEG data using different montages. We test this model on a gender classification task. We first show that spatial attention increases model performance. Then, we show that a deep learning model trained on data using different channel montages performs significantly better than deep learning models trained on fixed 23- and 128-channel data montages.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2304.04858 [pdf, other]

Simulated Annealing in Early Layers Leads to Better Generalization

Authors: Amirmohammad Sarfi, Zahra Karimpour, Muawiz Chaudhary, Nasir M. Khalid, Mirco Ravanelli, Sudhir Mudur, Eugene Belilovsky

Abstract: Recently, a number of iterative learning methods have been introduced to improve generalization. These typically rely on training for longer periods of time in exchange for improved generalization. LLF (later-layer-forgetting) is a state-of-the-art method in this category. It strengthens learning in early layers by periodically re-initializing the last few layers of the network. Our principal innovation in this work is to use Simulated annealing in EArly Layers (SEAL) of the network in place of re-initialization of later layers. Essentially, later layers go through the normal gradient descent process, while the early layers go through short stints of gradient ascent followed by gradient descent. Extensive experiments on the popular Tiny-ImageNet dataset benchmark and a series of transfer learning and few-shot learning tasks show that we outperform LLF by a significant margin. We further show that, compared to normal training, LLF features, although improving on the target task, degrade the transfer learning performance across all datasets we explored. In comparison, our method outperforms LLF across the same target datasets by a large margin. We also show that the prediction depth of our method is significantly lower than that of LLF and normal training, indicating on average better prediction performance.

Submitted 10 April, 2023; originally announced April 2023.

arXiv:2302.07797 [pdf, other]

'Aariz: A Benchmark Dataset for Automatic Cephalometric Landmark Detection and CVM Stage Classification

Authors: Muhammad Anwaar Khalid, Kanwal Zulfiqar, Ulfat Bashir, Areeba Shaheen, Rida Iqbal, Zarnab Rizwan, Ghina Rizwan, Muhammad Moazam Fraz

Abstract: The accurate identification and precise localization of cephalometric landmarks enable the classification and quantification of anatomical abnormalities. The traditional way of marking cephalometric landmarks on lateral cephalograms is a monotonous and time-consuming job. Endeavours to develop automated landmark detection systems have persistently been made; however, they are inadequate for orthodontic applications due to the unavailability of a reliable dataset. We proposed a new state-of-the-art dataset to facilitate the development of robust AI solutions for quantitative morphometric analysis. The dataset includes 1000 lateral cephalometric radiographs (LCRs) obtained from 7 different radiographic imaging devices with varying resolutions, making it the most diverse and comprehensive cephalometric dataset to date. The clinical experts of our team meticulously annotated each radiograph with 29 cephalometric landmarks, including the most significant soft tissue landmarks ever marked in any publicly available dataset. Additionally, our experts also labelled the cervical vertebral maturation (CVM) stage of the patient in a radiograph, making this dataset the first standard resource for CVM classification. We believe that this dataset will be instrumental in the development of reliable automated landmark detection frameworks for use in orthodontics and beyond.

Submitted 15 February, 2023; originally announced February 2023.

arXiv:2301.12605 [pdf, other]

Traffic Prediction in Cellular Networks using Graph Neural Networks

Authors: Maryam Khalid

Abstract: Cellular networks are ubiquitous entities that provide major means of communication all over the world. One major challenge in cellular networks is a dynamic change in the number of users and their usage of telecommunication service, which results in overloading at certain base stations. One class of solution to deal with this overloading issue is the deployment of drones that can act as temporary base stations and offload the traffic from the overloaded base station. There are two main challenges in the development of this solution. Firstly, the drone is expected to be present around the base station where an overload would occur in the future, thus requiring a prediction of traffic overload. Secondly, drones are highly constrained in their resources and can only fly for a few minutes. If the affected base station is too far away, drones cannot reach it. This requires the initial placement of drones in sectors where overloading can occur, thus again requiring a traffic forecast but at a different spatial scale. It must be noted that the spatial extent of the region that the problem poses and the extremely limited power resources available to the drone pose a great challenge that is hard to overcome without deploying the drones in strategic positions to reduce the time to fly to the required high-demand zone. Moreover, since drones fly at a finite speed, it is important that a predictive solution that can forecast traffic surges is adopted so that drones are available to offload the overload before it actually happens. Both these goals require analysis and forecast of cellular network traffic, which is the main goal of this project.

Submitted 29 January, 2023; originally announced January 2023.

arXiv:2212.04808 [pdf, other]

CEPHA29: Automatic Cephalometric Landmark Detection Challenge 2023

Authors: Muhammad Anwaar Khalid, Kanwal Zulfiqar, Ulfat Bashir, Areeba Shaheen, Rida Iqbal, Zarnab Rizwan, Ghina Rizwan, Muhammad Moazam Fraz

Abstract: Quantitative cephalometric analysis is the most widely used clinical and research tool in modern orthodontics. Accurate localization of cephalometric landmarks enables the quantification and classification of anatomical abnormalities; however, the traditional manual way of marking these landmarks is a very tedious job. Endeavours have constantly been made to develop automated cephalometric landmark detection systems but they are inadequate for orthodontic applications. The fundamental reason for this is that the number of publicly available datasets as well as the images provided for training in these datasets are insufficient for an AI model to perform well. To facilitate the development of robust AI solutions for morphometric analysis, we organise the CEPHA29 Automatic Cephalometric Landmark Detection Challenge in conjunction with IEEE International Symposium on Biomedical Imaging (ISBI 2023). In this context, we provide the largest known publicly available dataset, consisting of 1000 cephalometric X-ray images. We hope that our challenge will not only drive forward research and innovation in automatic cephalometric landmark identification but will also signal the beginning of a new era in the discipline.

Submitted 3 April, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

arXiv:2212.01436 [pdf, other]

Unauthorized Drone Detection: Experiments and Prototypes

Authors: Muhammad Asif Khan, Hamid Menouar, Osama Muhammad Khalid, Adnan Abu-Dayya

Abstract: The increase in the number of unmanned aerial vehicles, a.k.a. drones, poses several threats to public privacy, critical infrastructure and cyber security. Hence, detecting unauthorized drones is a significant problem which has received attention in the last few years. In this paper, we present our experimental work on three drone detection methods (i.e., acoustic detection, radio frequency (RF) detection, and visual detection) to evaluate their efficacy in both indoor and outdoor environments. Owing to the limitations of these schemes, we present a novel encryption-based drone detection scheme that uses a two-stage verification of the drone's received signal strength indicator (RSSI) and the encryption key generated from the drone's position coordinates to reliably detect an unauthorized drone in the presence of authorized drones.

Submitted 2 December, 2022; originally announced December 2022.

Comments: This paper has been accepted for presentation in 23rd IEEE International Conference on Industrial Technology (ICIT22), 22 - 25 August, 2022, Shanghai, China

arXiv:2210.02840 [pdf, other]

Deep Reinforcement Learning based Evasion Generative Adversarial Network for Botnet Detection

Authors: Rizwan Hamid Randhawa, Nauman Aslam, Mohammad Alauthman, Muhammad Khalid, Husnain Rafiq

Abstract: Botnet detectors based on machine learning are potential targets for adversarial evasion attacks. Several research works employ adversarial training with samples generated from generative adversarial nets (GANs) to make the botnet detectors adept at recognising adversarial evasions. However, the synthetic evasions may not follow the original semantics of the input samples. This paper proposes a novel GAN model leveraged with deep reinforcement learning (DRL) to explore semantic aware samples and simultaneously harden its detection. A DRL agent is used to attack the discriminator of the GAN that acts as a botnet detector. The discriminator is trained on the crafted perturbations by the agent during the GAN training, which helps the GAN generator converge earlier than the case without DRL. We name this model RELEVAGAN, i.e. ["relive a GAN" or deep REinforcement Learning-based Evasion Generative Adversarial Network] because, with the help of DRL, it minimises the GAN's job by letting its generator explore the evasion samples within the semantic limits. During the GAN training, the attacks are conducted to adjust the discriminator weights for learning crafted perturbations by the agent. RELEVAGAN does not require adversarial training for the ML classifiers since it can act as an adversarial semantic-aware botnet detection model. Code will be available at https://github.com/rhr407/RELEVAGAN.

Submitted 6 October, 2022; originally announced October 2022.

arXiv:2208.00931 [pdf, other]

Networked Drones for Industrial Emergency Events

Authors: Maryam Khalid, Edward W. Knightly

Abstract: Uncontrolled emissions of gases from industrial accidents and disasters result in huge loss of life and property. Such extreme events require a quick and reliable survey of the site for effective rescue strategy planning. To achieve these goals, a network of unmanned aerial vehicles can be deployed that survey the affected region and identify safe and danger zones. Although single UAV-based systems for gas sensing applications are well-studied in literature, research on the deployment of a UAV network for such applications, which is more robust and fault tolerant, is still in its infancy. The objective of this project is to design a system that can be deployed in emergency situations to provide a quick survey and identification of safe and dangerous zones in a given region that contains a toxic plume without making any assumptions about plume location. We focus on an end-to-end solution and formulate a two-phase strategy that can not only guarantee detection/acquisition of plume but also its characterization with high spatial resolution. To guarantee coverage of the region with a certain spatial resolution, we set up a vehicle routing problem. To overcome the limitations imposed by limited range of sensors and drone resources, we estimate the concentration map by using Gaussian kernel extrapolation. Finally, we evaluate the suggested framework in simulations. Our results suggest that this two-phase strategy not only gives better error performance but is also more efficient in terms of mission time. Moreover, the comparison between 2-phase random search and 2-phase uniform coverage suggests that the latter is better for single drone systems whereas for multiple drones the former gives reasonable performance at low computational cost.

Submitted 1 August, 2022; originally announced August 2022.

arXiv:2207.10728 [pdf, other]

Decision-Feedback Detection for Bidirectional Molecular Relaying with Direct Links

Authors: Maryam Khalid, Momin Uppal

Abstract: In this paper, we consider bidirectional relaying between two diffusion-based molecular transceivers (bio-nodes). As opposed to existing literature, we incorporate the effect of direct diffusion links between the nodes and leverage it to improve performance. Assuming network coding type operation at the relay, we devise a detection strategy, based on the maximum-likelihood principle, that combines the signal received from the relay and that received from the direct link. At the same time, since a diffusion-based molecular communication channel is characterized by high inter-symbol interference (ISI), we utilize a decision feedback mechanism to mitigate its effect. Simulation results indicate that the proposed setup incorporating the direct link can achieve notable improvement in error performance over conventional detection schemes that do not exploit the direct link and/or do not attempt to mitigate the effect of ISI.

Submitted 21 July, 2022; originally announced July 2022.

arXiv:2207.05820 [pdf, other]

Exploiting Social Graph Networks for Emotion Prediction

Authors: Maryam Khalid, Akane Sano

Abstract: Emotion prediction plays an essential role in mental health and emotion-aware computing. The complex nature of emotion resulting from its dependency on a person's physiological health, mental state, and surroundings makes its prediction a challenging task. In this work, we utilize mobile sensing data to predict happiness and stress. In addition to a person's physiological features, we also incorporate the environment's impact through weather and social network. To this end, we leverage phone data to construct social networks and develop a machine learning architecture that aggregates information from multiple users of the graph network and integrates it with the temporal dynamics of data to predict emotion for all the users. The construction of social networks does not incur additional cost in terms of EMAs or data collection from users and doesn't raise privacy concerns. We propose an architecture that automates the integration of a user's social network into affect prediction and is capable of dealing with the dynamic distribution of real-life social networks, making it scalable to large-scale networks. Our extensive evaluation highlights the improvement provided by the integration of social networks. We further investigate the impact of graph topology on the model's performance.

Submitted 12 July, 2022; originally announced July 2022.

arXiv:2203.15269 [pdf]

Vision Transformers in Medical Computer Vision -- A Contemplative Retrospection

Authors: Arshi Parvaiz, Muhammad Anwaar Khalid, Rukhsana Zafar, Huma Ameer, Muhammad Ali, Muhammad Moazam Fraz

Abstract: Recent escalation in the field of computer vision underpins a huddle of algorithms with the magnificent potential to unravel the information contained within images. These computer vision algorithms are being practised in medical image analysis and are transfiguring the perception and interpretation of imaging data. Among these algorithms, Vision Transformers have evolved as one of the most contemporary and dominant architectures that are being used in the field of computer vision. These are widely utilized by many researchers to perform new as well as former experiments. Here, in this article, we investigate the intersection of Vision Transformers and medical images and proffer an overview of various ViT-based frameworks that are being used by different researchers in order to decipher the obstacles in Medical Computer Vision. We surveyed the application of Vision Transformers in different areas of medical computer vision such as image-based disease classification, anatomical structure segmentation, registration, region-based lesion detection, captioning, report generation, and reconstruction using multiple medical imaging modalities that greatly assist in medical diagnosis and hence the treatment process. Along with this, we also demystify several imaging modalities used in Medical Computer Vision. Moreover, to get more insight and deeper understanding, the self-attention mechanism of transformers is also explained briefly. Finally, we also shed some light on available data sets, adopted methodology, their performance measures, challenges and their solutions in the form of a discussion. We hope that this review article will open future directions for researchers in medical computer vision.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2203.13333 [pdf, other]

doi 10.1145/3550469.3555392

CLIP-Mesh: Generating textured meshes from text using pretrained image-text models

Authors: Nasir Mohammad Khalid, Tianhao Xie, Eugene Belilovsky, Tiberiu Popa

Abstract: We present a technique for zero-shot generation of a 3D model using only a target text prompt. Without any 3D supervision our method deforms the control shape of a limit subdivided surface along with its texture map and normal map to obtain a 3D asset that corresponds to the input text prompt and can be easily deployed into games or modeling applications. We rely only on a pre-trained CLIP model that compares the input text prompt with differentiably rendered images of our 3D model. While previous works have focused on stylization or required training of generative models we perform optimization on mesh parameters directly to generate shape, texture or both. To constrain the optimization to produce plausible meshes and textures we introduce a number of techniques using image augmentations and the use of a pretrained prior that generates CLIP image embeddings given a text embedding.

Submitted 2 September, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: 8 pages, 8 figures, Accepted at SIGGRAPH ASIA 2022, Project Page at https://www.nasir.lol/clipmesh

arXiv:2201.06610 [pdf, other]

A Brief Survey of Machine Learning Methods for Emotion Prediction using Physiological Data

Authors: Maryam Khalid, Emily Willis

Abstract: Emotion prediction is a key emerging research area that focuses on identifying and forecasting the emotional state of a human from multiple modalities. Among other data sources, physiological data can serve as an indicator for emotions with an added advantage that it cannot be masked/tampered by the individual and can be easily collected. This paper surveys multiple machine learning methods that deploy smartphone and physiological data to predict emotions in real-time, using self-reported ecological momentary assessments (EMA) scores as ground-truth. Comparing regression, long short-term memory (LSTM) networks, convolutional neural networks (CNN), reinforcement online learning (ROL), and deep belief networks (DBN), we showcase the variability of machine learning methods employed to achieve accurate emotion prediction. We compare the state-of-the-art methods and highlight that experimental performance is still not very good. The performance can be improved in future works by considering the following issues: improving scalability and generalizability, synchronizing multimodal data, optimizing EMA sampling, integrating adaptability with sequence prediction, collecting unbiased data, and leveraging sophisticated feature engineering techniques.

Submitted 17 January, 2022; originally announced January 2022.

arXiv:2111.09416 [pdf]

Highly Accurate and Reliable Wireless Network Slicing in 5th Generation Networks: A Hybrid Deep Learning Approach

Authors: Sulaiman Khan, Suleman Khan, Yasir Ali, Muhammad Khalid, Zahid Ullah, Shahid Mumtaz

Abstract: In the current era, the next-generation networks like 5th generation (5G) and 6th generation (6G) networks require high security, low latency with a high reliable standards and capacity. In these networks, reconfigurable wireless network slicing is considered as one of the key elements for 5G and 6G networks. A reconfigurable slicing allows the operators to run various instances of the network using a single infrastructure for a better quality of services (QoS). The QoS can be achieved by reconfiguring and optimizing these networks using Artificial intelligence and machine learning algorithms. To develop a smart decision-making mechanism for network management and restricting network slice failures, machine learning-enabled reconfigurable wireless network solutions are required. In this paper, we propose a hybrid deep learning model that consists of a convolution neural network (CNN) and long short term memory (LSTM). The CNN performs resource allocation, network reconfiguration, and slice selection while the LSTM is used for statistical information (load balancing, error rate etc.) regarding network slices. The applicability of the proposed model is validated by using multiple unknown devices, slice failure, and overloading conditions. The overall accuracy of 95.17% is achieved by the proposed model that reflects its applicability.

Submitted 7 October, 2021; originally announced November 2021.

arXiv:2110.00992 [pdf, other]

Precise Object Placement with Pose Distance Estimations for Different Objects and Grippers

Authors: Kilian Kleeberger, Jonathan Schnitzler, Muhammad Usman Khalid, Richard Bormann, Werner Kraus, Marco F. Huber

Abstract: This paper introduces a novel approach for the grasping and precise placement of various known rigid objects using multiple grippers within highly cluttered scenes. Using a single depth image of the scene, our method estimates multiple 6D object poses together with an object class, a pose distance for object pose estimation, and a pose distance from a target pose for object placement for each automatically obtained grasp pose with a single forward pass of a neural network. By incorporating model knowledge into the system, our approach has higher success rates for grasping than state-of-the-art model-free approaches. Furthermore, our method chooses grasps that result in significantly more precise object placements than prior model-based work.

Submitted 3 October, 2021; originally announced October 2021.

Comments: Accepted at 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

arXiv:2109.11661 [pdf]

Deep Reinforcement Learning-Based Long-Range Autonomous Valet Parking for Smart Cities

Authors: Muhammad Khalid, Liang Wang, Kezhi Wang, Cunhua Pan, Nauman Aslam, Yue Cao

Abstract: In this paper, to reduce the congestion rate at the city center and increase the quality of experience (QoE) of each user, the framework of long-range autonomous valet parking (LAVP) is presented, where an Autonomous Vehicle (AV) is deployed in the city, which can pick up, drop off users at their required spots, and then drive to the car park out of city center autonomously. In this framework, we aim to minimize the overall distance of the AV, while guarantee all users are served, i.e., picking up, and dropping off users at their required spots through optimizing the path planning of the AV and number of serving time slots. To this end, we first propose a learning based algorithm, which is named as Double-Layer Ant Colony Optimization (DL-ACO) algorithm to solve the above problem in an iterative way. Then, to make the real-time decision, while consider the dynamic environment (i.e., the AV may pick up and drop off users from different locations), we further present a deep reinforcement learning (DRL) based algorithm, which is known as deep Q network (DQN). The experimental results show that the DL-ACO and DQN-based algorithms both achieve the considerable performance.

Submitted 4 May, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

Comments: 6 Figures, 1 Table

arXiv:2107.00746 [pdf]

Case study of Innovative Teaching Practices and their Impact for Electrical Engineering Courses during COVID-19 Pandemic

Authors: Amith Khandakar, Muhammad E. H. Chowdhury, Md. Saifuddin Khalid, Nizar Zorba

Abstract: Due to the COVID-19 pandemic, there was an urgent need to move to online teaching and develop innovations to guarantee the Student Learning Outcomes (SLOs) are being fulfilled. The contributions of this paper are two-fold: the effects of an experimented teaching strategy, i.e. multi-course project-based learning (MPL) approach, are presented followed with online assessment techniques investigation for senior level electrical engineering (EE) courses at Qatar University. The course project of the senior course was designed in such a way that it helps in simultaneously attaining the objectives of the senior and capstone courses, that the students were taking at the same time. It is known that the MPL approach enhances the critical thinking capacity of students which is also a major outcome of Education for Sustainable Development (ESD). The developed project ensures the fulfillment of a series of SLOs, that are concentrated on soft engineering and project management skills. The difficulties of adopting the MPL method for the senior level courses are in aligning the project towards fulfilling the learning outcomes of every individual course. The study also provides the students feedback on online assessment techniques incorporated with the MPL, due to online teaching during COVID-19 pandemic. In order to provide a benchmark and to highlight the obtained results, the innovative teaching approaches were compared to conventional methods taught on the same senior course in a previous semester. Based on the feedback from teachers and students from previously conducted case study it was believed that the MPL approach would support the students. With the statistical analysis (Chi-square, two-tailed T statistics and hypothesis testing using z-test) it can be concluded that the MPL and online assessment actually help to achieve better attainment of the SLOs, even during a pandemic situation.

Submitted 1 July, 2021; originally announced July 2021.

Comments: 24 pages, 6 figures

arXiv:2007.06818 [pdf, other]

Securing the Insecure: A First-Line-of-Defense for Nanoscale Communication Systems Operating in THz Band

Authors: Waqas Aman, M. Mahboob Ur Rahman, Hassan T. Abbas, Muhammad Arslan Khalid, Muhammad A. Imran, Akram Alomainy, Qammer H. Abbasi

Abstract: Nanoscale communication systems operating in Terahertz (THz) band are anticipated to revolutionise the healthcare systems of the future. Global wireless data traffic is undergoing a rapid growth. However, wireless systems, due to their broadcasting nature, are vulnerable to malicious security breaches. In addition, advances in quantum computing poses a risk to existing crypto-based information security. It is of the utmost importance to make the THz systems resilient to potential active and passive attacks which may lead to devastating consequences, especially when handling sensitive patient data in healthcare systems. New strategies are needed to analyse these malicious attacks and to propose viable countermeasures. In this manuscript, we present a new authentication mechanism for nanoscale communication systems operating in THz band at the physical layer. We assessed an impersonation attack on a THz system. We propose using path loss as a fingerprint to conduct authentication via two-step hypothesis testing for a transmission device. We used hidden Markov Model (HMM) viterbi algorithm to enhance the output of hypothesis testing. We also conducted transmitter identification using maximum likelihood and Gaussian mixture model (GMM) expectation maximization algorithms. Our simulations showed that the error probabilities are a decreasing functions of SNR. At 10 dB with 0.2 false alarm, the detection probability was almost one. We further observed that HMM outperforms hypothesis testing at low SNR regime (10% increase in accuracy is recorded at SNR = 5 dB) whereas the GMM is useful when ground truths are noisy. Our work addresses major security gaps faced by communication system either through malicious breaches or quantum computing, enabling new applications of nanoscale systems for Industry 4.0.

Submitted 14 July, 2020; originally announced July 2020.

arXiv:2001.02501 [pdf, other]

doi 10.1109/ICDAR.2019.00220

Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks

Authors: Saqib Ali Khan, Syed Muhammad Daniyal Khalid, Muhammad Ali Shahzad, Faisal Shafait

Abstract: Tables present summarized and structured information to the reader, which makes table structure extraction an important part of document understanding applications. However, table structure identification is a hard problem not only because of the large variation in the table layouts and styles, but also owing to the variations in the page layouts and the noise contamination levels. A lot of research has been done to identify table structure, most of which is based on applying heuristics with the aid of optical character recognition (OCR) to hand pick layout features of the tables. These methods fail to generalize well because of the variations in the table layouts and the errors generated by OCR. In this paper, we have proposed a robust deep learning based approach to extract rows and columns from a detected table in document images with a high precision. In the proposed solution, the table images are first pre-processed and then fed to a bi-directional Recurrent Neural Network with Gated Recurrent Units (GRU) followed by a fully-connected layer with soft max activation. The network scans the images from top-to-bottom as well as left-to-right and classifies each input as either a row-separator or a column-separator. We have benchmarked our system on publicly available UNLV as well as ICDAR 2013 datasets on which it outperformed the state-of-the-art table structure extraction systems by a significant margin.

Submitted 8 January, 2020; originally announced January 2020.

Comments: Proceedings of the 15th International Conference on Document Analysis and Recognition (ICDAR) 2019, Sydney, Australia

arXiv:1909.03466 [pdf, other]

Multi-Modal Three-Stream Network for Action Recognition

Authors: Muhammad Usman Khalid, Jie Yu

Abstract: Human action recognition in video is an active yet challenging research topic due to high variation and complexity of data. In this paper, a novel video based action recognition framework utilizing complementary cues is proposed to handle this complex problem. Inspired by the successful two stream networks for action classification, additional pose features are studied and fused to enhance understanding of human action in a more abstract and semantic way. Towards practices, not only ground truth poses but also noisy estimated poses are incorporated in the framework with our proposed pre-processing module. The whole framework and each cue are evaluated on varied benchmarking datasets as JHMDB, sub-JHMDB and Penn Action. Our results outperform state-of-the-art performance on these datasets and show the strength of complementary cues.

Submitted 8 September, 2019; originally announced September 2019.

Comments: Presented in IEEE ICPR 2018

arXiv:1909.03462 [pdf, other]

Deep Workpiece Region Segmentation for Bin Picking

Authors: Muhammad Usman Khalid, Janik M. Hager, Werner Kraus, Marco F. Huber, Marc Toussaint

Abstract: For most industrial bin picking solutions, the pose of a workpiece is localized by matching a CAD model to point cloud obtained from 3D sensor. Distinguishing flat workpieces from bottom of the bin in point cloud imposes challenges in the localization of workpieces that lead to wrong or phantom detections. In this paper, we propose a framework that solves this problem by automatically segmenting workpiece regions from non-workpiece regions in a point cloud data. It is done in real time by applying a fully convolutional neural network trained on both simulated and real data. The real data has been labelled by our novel technique which automatically generates ground truth labels for real point clouds. Along with real time workpiece segmentation, our framework also helps in improving the number of detected workpieces and estimating the correct object poses. Moreover, it decreases the computation time by approximately 1s due to a reduction of the search space for the object pose estimation.

Submitted 8 September, 2019; originally announced September 2019.

Comments: IEEE CASE 2019

arXiv:1811.01393 [pdf, other]

Communication Through Breath: Aerosol Transmission

Authors: Maryam Khalid, Osama Amin, Sajid Ahmed, Basem Shihada, Mohamed-Slim Alouini

Abstract: Exhaled breath can be used in retrieving information and creating innovative communication systems. It contains several volatile organic compounds (VOCs) and biological entities that can act as health biomarkers. For instance, the breath of infected human contains a nonnegligible amount of pathogenic aerosol that can spread or remain suspended in the atmosphere. Therefore, the exhaled breath can be exploited as a source's message in a communication setup to remotely scan the bio-information via an aerosol transmission channel. An overview of the basic configuration is presented along with a description of system components with a particular emphasis on channel modeling. Furthermore, the challenges that arise in theoretical analysis and system development are highlighted. Finally, several open issues are discussed to concretize the proposed communication concept.

Submitted 4 November, 2018; originally announced November 2018.

arXiv:1710.00760 [pdf, ps, other]

Scalable Nonlinear AUC Maximization Methods

Authors: Majdi Khalid, Indrakshi Ray, Hamidreza Chitsaz

Abstract: The area under the ROC curve (AUC) is a measure of interest in various machine learning and data mining applications. It has been widely used to evaluate classification performance on heavily imbalanced data. The kernelized AUC maximization machines have established a superior generalization ability compared to linear AUC machines because of their capability in modeling the complex nonlinear structure underlying most real-world data. However, the high training complexity renders the kernelized AUC machines infeasible for large-scale data. In this paper, we present two nonlinear AUC maximization algorithms that optimize pairwise linear classifiers over a finite-dimensional feature space constructed via the k-means Nyström method. Our first algorithm maximize the AUC metric by optimizing a pairwise squared hinge loss function using the truncated Newton method. However, the second-order batch AUC maximization method becomes expensive to optimize for extremely massive datasets. This motivate us to develop a first-order stochastic AUC maximization algorithm that incorporates a scheduled regularization update and scheduled averaging techniques to accelerate the convergence of the classifier. Experiments on several benchmark datasets demonstrate that the proposed AUC classifiers are more efficient than kernelized AUC machines while they are able to surpass or at least match the AUC performance of the kernelized AUC machines. The experiments also show that the proposed stochastic AUC classifier outperforms the state-of-the-art online AUC maximization methods in terms of AUC classification accuracy.

Submitted 29 April, 2019; v1 submitted 2 October, 2017; originally announced October 2017.

arXiv:1607.00847 [pdf, other]

Confidence-Weighted Bipartite Ranking

Authors: Majdi Khalid, Indrakshi Ray, Hamidreza Chitsaz

Abstract: Bipartite ranking is a fundamental machine learning and data mining problem. It commonly concerns the maximization of the AUC metric. Recently, a number of studies have proposed online bipartite ranking algorithms to learn from massive streams of class-imbalanced data. These methods suggest both linear and kernel-based bipartite ranking algorithms based on first and second-order online learning. Unlike kernelized ranker, linear ranker is more scalable learning algorithm. The existing linear online bipartite ranking algorithms lack either handling non-separable data or constructing adaptive large margin. These limitations yield unreliable bipartite ranking performance. In this work, we propose a linear online confidence-weighted bipartite ranking algorithm (CBR) that adopts soft confidence-weighted learning. The proposed algorithm leverages the same properties of soft confidence-weighted learning in a framework for bipartite ranking. We also develop a diagonal variation of the proposed confidence-weighted bipartite ranking algorithm to deal with high-dimensional data by maintaining only the diagonal elements of the covariance matrix. We empirically evaluate the effectiveness of the proposed algorithms on several benchmark and high-dimensional datasets. The experimental results validate the reliability of the proposed algorithms. The results also show that our algorithms outperform or are at least comparable to the competing online AUC maximization methods.

Submitted 10 March, 2019; v1 submitted 4 July, 2016; originally announced July 2016.

Comments: 15 pages, 6 tables, and 2 figures

arXiv:1002.1881 [pdf]

Evaluation and Design Space Exploration of a Time-Division Multiplexed NoC on FPGA for Image Analysis Applications

Authors: Linlin Zhang, Virginie Fresse, Mohammed Khalid, Dominique Houzet, Anne-Claire Legrand

Abstract: The aim of this paper is to present an adaptable Fat Tree NoC architecture for Field Programmable Gate Array (FPGA) designed for image analysis applications. Traditional NoCs (Network on Chip) are not optimal for dataflow applications with large amount of data. On the opposite, point to point communications are designed from the algorithm requirements but they are expensives in terms of resource and wire. We propose a dedicated communication architecture for image analysis algorithms. This communication mechanism is a generic NoC infrastructure dedicated to dataflow image processing applications, mixing circuit-switching and packet-switching communications. The complete architecture integrates two dedicated communication architectures and reusable IP blocks. Communications are based on the NoC concept to support the high bandwidth required for a large number and type of data.

Submitted 9 February, 2010; originally announced February 2010.

Journal ref: Eurasip Journal on Embedded Systems 2010 (2010) 542035

Showing 1–28 of 28 results for author: Khalid, M