Search | arXiv e-print repository

A Short Survey of Human Mobility Prediction in Epidemic Modeling from Transformers to LLMs

Authors: Christian N. Mayemba, D'Jeff K. Nkashama, Jean Marie Tshimula, Maximilien V. Dialufuma, Jean Tshibangu Muabila, Mbuyi Mukendi Didier, Hugues Kanda, René Manassé Galekwa, Heber Dibwe Fita, Serge Mundele, Kalonji Kalala, Aristarque Ilunga, Lambert Mukendi Ntobo, Dominique Muteba, Aaron Aruna Abedi

Abstract: This paper provides a comprehensive survey of recent advancements in leveraging machine learning techniques, particularly Transformer models, for predicting human mobility patterns during epidemics. Understanding how people move during epidemics is essential for modeling the spread of diseases and devising effective response strategies. Forecasting population movement is crucial for informing epidemiological models and facilitating effective response planning in public health emergencies. Predicting mobility patterns can enable authorities to better anticipate the geographical and temporal spread of diseases, allocate resources more efficiently, and implement targeted interventions. We review a range of approaches utilizing both pretrained language models like BERT and Large Language Models (LLMs) tailored specifically for mobility prediction tasks. These models have demonstrated significant potential in capturing complex spatio-temporal dependencies and contextual patterns in textual data.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2403.17175 [pdf, ps, other]

Engagement Measurement Based on Facial Landmarks and Spatial-Temporal Graph Convolutional Networks

Authors: Ali Abedi, Shehroz S. Khan

Abstract: Engagement in virtual learning is crucial for a variety of factors including learner satisfaction, performance, and compliance with learning programs, but measuring it is a challenging task. There is therefore considerable interest in utilizing artificial intelligence and affective computing to measure engagement in natural settings as well as on a large scale. This paper introduces a novel, privacy-preserving method for engagement measurement from videos. It uses facial landmarks, which carry no personally identifiable information, extracted from videos via the MediaPipe deep learning solution. The extracted facial landmarks are fed to a Spatial-Temporal Graph Convolutional Network (ST-GCN) to output the engagement level of the learner in the video. To integrate the ordinal nature of the engagement variable into the training process, ST-GCNs undergo training in a novel ordinal learning framework based on transfer learning. Experimental results on two video student engagement measurement datasets show the superiority of the proposed method over previous methods, improving the state of the art on the EngageNet dataset with a 3.1% improvement in four-class engagement level classification accuracy and on the Online Student Engagement dataset with a 1.5% improvement in binary engagement classification accuracy. The relatively lightweight ST-GCN and its integration with the real-time MediaPipe deep learning solution make the proposed approach capable of being deployed on virtual learning platforms and measuring engagement in real time.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.02772 [pdf, other]

Rehabilitation Exercise Quality Assessment through Supervised Contrastive Learning with Hard and Soft Negatives

Authors: Mark Karlov, Ali Abedi, Shehroz S. Khan

Abstract: Exercise-based rehabilitation programs have proven to be effective in enhancing the quality of life and reducing mortality and rehospitalization rates. AI-driven virtual rehabilitation, which allows patients to independently complete exercises at home, utilizes AI algorithms to analyze exercise data, providing feedback to patients and updating clinicians on their progress. These programs commonly prescribe a variety of exercise types, leading to a distinct challenge in rehabilitation exercise assessment datasets: while abundant in overall training samples, these datasets often have a limited number of samples for each individual exercise type. This disparity hampers the ability of existing approaches to train generalizable models with such a small sample size per exercise. Addressing this issue, our paper introduces a novel supervised contrastive learning framework with hard and soft negative samples that effectively utilizes the entire dataset to train a single model applicable to all exercise types. This model, with a Spatial-Temporal Graph Convolutional Network (ST-GCN) architecture, demonstrated enhanced generalizability across exercises and a decrease in overall complexity. Through extensive experiments on three publicly available rehabilitation exercise assessment datasets, the University of Idaho-Physical Rehabilitation Movement Data (UI-PRMD), IntelliRehabDS (IRDS), and KInematic assessment of MOvement and clinical scores for remote monitoring of physical REhabilitation (KIMORE), our method is shown to surpass existing methods, setting a new benchmark in rehabilitation exercise assessment accuracy.

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2308.02410 [pdf, ps, other]

RFID-Assisted Indoor Localization Using Hybrid Wireless Data Fusion

Authors: Abouzar Ghavami, Ali Abedi

Abstract: Wireless localization is essential for tracking objects in indoor environments. Internet of Things (IoT) enables localization through its diverse wireless communication protocols. In this paper, a hybrid section-based indoor localization method using a developed Radio Frequency Identification (RFID) tracking device and multiple IoT wireless technologies is proposed. In order to reduce the cost of the RFID tags, the tags are installed only on the borders of each section. The RFID tracking device identifies the section, and the proposed wireless hybrid method finds the location of the object inside the section. The proposed hybrid method is analytically driven by linear location estimates obtained from different IoT wireless technologies. The experimental results using the developed RFID tracking device and RSSI-based localization for Bluetooth, WiFi and ZigBee technologies verify the analytical results.

Submitted 28 July, 2023; originally announced August 2023.

arXiv:2306.09546 [pdf, other]

Cross-Modal Video to Body-joints Augmentation for Rehabilitation Exercise Quality Assessment

Authors: Ali Abedi, Mobin Malmirian, Shehroz S. Khan

Abstract: Exercise-based rehabilitation programs have been shown to enhance quality of life and reduce mortality and rehospitalizations. AI-driven virtual rehabilitation programs enable patients to complete exercises independently at home while AI algorithms can analyze exercise data to provide feedback to patients and report their progress to clinicians. This paper introduces a novel approach to assessing the quality of rehabilitation exercises using RGB video. Sequences of skeletal body joints are extracted from consecutive RGB video frames and analyzed by many-to-one sequential neural networks to evaluate exercise quality. Existing datasets for exercise rehabilitation lack adequate samples for training deep sequential neural networks to generalize effectively. A cross-modal data augmentation approach is proposed to resolve this problem. Visual augmentation techniques are applied to video data, and body joints extracted from the resulting augmented videos are used for training sequential neural networks. Extensive experiments conducted on the KInematic assessment of MOvement and clinical scores for remote monitoring of physical REhabilitation (KIMORE) dataset demonstrate the superiority of the proposed method over previous baseline approaches. The ablation study highlights a significant enhancement in exercise quality assessment following cross-modal augmentation.

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2304.09735 [pdf, other]

Rehabilitation Exercise Repetition Segmentation and Counting using Skeletal Body Joints

Authors: Ali Abedi, Paritosh Bisht, Riddhi Chatterjee, Rachit Agrawal, Vyom Sharma, Dinesh Babu Jayagopi, Shehroz S. Khan

Abstract: Physical exercise is an essential component of rehabilitation programs that improve quality of life and reduce mortality and re-hospitalization rates. In AI-driven virtual rehabilitation programs, patients complete their exercises independently at home, while AI algorithms analyze the exercise data to provide feedback to patients and report their progress to clinicians. To analyze exercise data, the first step is to segment it into consecutive repetitions. There has been a significant amount of research performed on segmenting and counting the repetitive activities of healthy individuals using raw video data, which raises concerns regarding privacy and is computationally intensive. Previous research on patients' rehabilitation exercise segmentation relied on data collected by multiple wearable sensors, which are difficult for rehabilitation patients to use at home. Compared to healthy individuals, segmenting and counting exercise repetitions in patients is more challenging because of the irregular repetition duration and the variation between repetitions. This paper presents a novel approach for segmenting and counting the repetitions of rehabilitation exercises performed by patients, based on their skeletal body joints. Skeletal body joints can be acquired through depth cameras or computer vision techniques applied to RGB videos of patients. Various sequential neural networks are designed to analyze the sequences of skeletal body joints and perform repetition segmentation and counting. Extensive experiments on three publicly available rehabilitation exercise datasets, KIMORE, UI-PRMD, and IntelliRehabDS, demonstrate the superiority of the proposed method compared to previous methods. The proposed method enables accurate exercise analysis while preserving privacy, facilitating the effective delivery of virtual rehabilitation programs.

Submitted 19 April, 2023; originally announced April 2023.

Comments: 8 pages, 1 figure, 2 tables

arXiv:2303.13610 [pdf, other]

doi 10.1038/s41591-023-02252-4

Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging

Authors: Todd C. Hollon, Cheng Jiang, Asadur Chowdury, Mustafa Nasir-Moin, Akhil Kondepudi, Alexander Aabedi, Arjun Adapa, Wajd Al-Holou, Jason Heth, Oren Sagher, Pedro Lowenstein, Maria Castro, Lisa Irina Wadiura, Georg Widhalm, Volker Neuschmelting, David Reinecke, Niklas von Spreckelsen, Mitchel S. Berger, Shawn L. Hervey-Jumper, John G. Golfinos, Matija Snuderl, Sandra Camelo-Piragua, Christian Freudiger, Honglak Lee, Daniel A. Orringer

Abstract: Molecular classification has transformed the management of brain tumors by enabling more accurate prognostication and personalized treatment. However, timely molecular diagnostic testing for patients with brain tumors is limited, complicating surgical and adjuvant treatment and obstructing clinical trial enrollment. In this study, we developed DeepGlioma, a rapid ($< 90$ seconds), artificial-intelligence-based diagnostic screening system to streamline the molecular diagnosis of diffuse gliomas. DeepGlioma is trained using a multimodal dataset that includes stimulated Raman histology (SRH); a rapid, label-free, non-consumptive, optical imaging method; and large-scale, public genomic data. In a prospective, multicenter, international testing cohort of patients with diffuse glioma ($n=153$) who underwent real-time SRH imaging, we demonstrate that DeepGlioma can predict the molecular alterations used by the World Health Organization to define the adult-type diffuse glioma taxonomy (IDH mutation, 1p19q co-deletion and ATRX mutation), achieving a mean molecular classification accuracy of $93.3\pm 1.6\%$. Our results represent how artificial intelligence and optical histology can be used to provide a rapid and scalable adjunct to wet lab methods for the molecular screening of patients with diffuse glioma.

Submitted 23 March, 2023; originally announced March 2023.

Comments: Paper published in Nature Medicine

arXiv:2303.10766 [pdf, other]

Multi-modal reward for visual relationships-based image captioning

Authors: Ali Abedi, Hossein Karshenas, Peyman Adibi

Abstract: Deep neural networks have achieved promising results in automatic image captioning due to their effective representation learning and context-based content generation capabilities. As a prominent type of deep features used in many of the recent image captioning methods, the well-known bottom-up features provide a detailed representation of different objects of the image in comparison with the feature maps directly extracted from the raw image. However, the lack of high-level semantic information about the relationships between these objects is an important drawback of bottom-up features, despite their expensive and resource-demanding extraction procedure. To take advantage of visual relationships in caption generation, this paper proposes a deep neural network architecture for image captioning based on fusing the visual relationships information extracted from an image's scene graph with the spatial feature maps of the image. A multi-modal reward function is then introduced for deep reinforcement learning of the proposed network using a combination of language and vision similarities in a common embedding space. The results of extensive experimentation on the MSCOCO dataset show the effectiveness of using visual relationships in the proposed captioning method. Moreover, the results clearly indicate that the proposed multi-modal reward in deep reinforcement learning leads to better model optimization, outperforming several state-of-the-art image captioning algorithms, while using light and easy-to-extract image features. A detailed experimental study of the components constituting the proposed method is also presented.

Submitted 21 March, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

Comments: 18 pages, 10 figures

arXiv:2301.07202 [pdf, other]

Are Home Security Systems Reliable?

Authors: Christopher Vattheuer, Charlie Liu, Ali Abedi, Omid Abari

Abstract: Home security systems have become increasingly popular since they provide an additional layer of protection and peace of mind. These systems typically include battery-powered motion sensors, contact sensors, and smart locks. Z-Wave is a very popular wireless communication technology for these low-power systems. In this paper, we demonstrate two new attacks targeting Z-Wave devices. First, we show how an attacker can remotely attack Z-Wave security devices to increase their power consumption by three orders of magnitude, reducing their battery life from a few years to just a few hours. Second, we show multiple Denial of Service (DoS) attacks that enable an attacker to interrupt the operation of security systems in just a few seconds. Our experiments show that these attacks are effective even when the attacker device is in a car 100 meters away from the targeted house.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2301.06730 [pdf, other]

Bag of States: A Non-sequential Approach to Video-based Engagement Measurement

Authors: Ali Abedi, Chinchu Thomas, Dinesh Babu Jayagopi, Shehroz S. Khan

Abstract: Automatic measurement of student engagement provides helpful information for instructors to meet learning program objectives and individualize program delivery. Students' behavioral and emotional states need to be analyzed at fine-grained time scales in order to measure their level of engagement. Many existing approaches have developed sequential and spatiotemporal models, such as recurrent neural networks, temporal convolutional networks, and three-dimensional convolutional neural networks, for measuring student engagement from videos. These models are trained to incorporate the order of behavioral and emotional states of students into video analysis and output their level of engagement. In this paper, backed by educational psychology, we question the necessity of modeling the order of behavioral and emotional states of students in measuring their engagement. We develop bag-of-words-based models in which only the occurrence of behavioral and emotional states of students is modeled and analyzed and not the order in which they occur. Behavioral and affective features are extracted from videos and analyzed by the proposed models to determine the level of engagement in an ordinal-output classification setting. Compared to the existing sequential and spatiotemporal approaches for engagement measurement, the proposed non-sequential approach improves the state-of-the-art results. According to experimental results, our method significantly improved engagement level classification accuracy on the IIITB Online SE dataset by 26% compared to sequential models and achieved engagement level classification accuracy as high as 66.58% on the DAiSEE student engagement dataset.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2301.00269 [pdf, other]

WiFi Physical Layer Stays Awake and Responds When it Should Not

Authors: Ali Abedi, Haofan Lu, Alex Chen, Charlie Liu, Omid Abari

Abstract: WiFi communication should be possible only between devices inside the same network. However, we find that all existing WiFi devices send back acknowledgments (ACK) to even fake packets received from unauthorized WiFi devices outside of their network. Moreover, we find that an unauthorized device can manipulate the power-saving mechanism of WiFi radios and keep them continuously awake by sending specific fake beacon frames to them. Our evaluation of over 5,000 devices from 186 vendors confirms that these are widespread issues. We believe these loopholes cannot be prevented, and hence they create privacy and security concerns. Finally, to show the importance of these issues and their consequences, we implement and demonstrate two attacks where an adversary performs battery drain and WiFi sensing attacks just using a tiny WiFi module which costs less than ten dollars.

Submitted 24 March, 2023; v1 submitted 31 December, 2022; originally announced January 2023.

Comments: 12 pages

arXiv:2211.13114 [pdf, other]

Step Counting with Attention-based LSTM

Authors: Shehroz S. Khan, Ali Abedi

Abstract: Physical activity is recognized as an essential component of overall health. One measure of physical activity, the step count, is well known as a predictor of long-term morbidity and mortality. Step Counting (SC) is the automated counting of the number of steps an individual takes over a specified period of time and space. Due to the ubiquity of smartphones and smartwatches, most current SC approaches rely on the built-in accelerometer sensors on these devices. The sensor signals are analyzed as multivariate time series, and the number of steps is calculated through a variety of approaches, such as time-domain, frequency-domain, machine-learning, and deep-learning approaches. Most of the existing approaches rely on dividing the input signal into windows, detecting steps in each window, and summing the detected steps. However, these approaches require the determination of multiple parameters, including the window size. Furthermore, most of the existing deep-learning SC approaches require ground-truth labels for every single step, which can be arduous and time-consuming to annotate. To circumvent these requirements, we present a novel SC approach utilizing many-to-one attention-based LSTM. With the proposed LSTM network, SC is solved as a regression problem, taking the entire sensor signal as input and the step count as the output. The analysis shows that the attention-based LSTM automatically learned the pattern of steps even in the absence of ground-truth labels. The experimental results on three publicly available SC datasets demonstrate that the proposed method successfully counts the number of steps with low values of mean absolute error and high values of SC accuracy.

Submitted 17 November, 2022; originally announced November 2022.

Report number: EFI-94-11

arXiv:2211.06870 [pdf, other]

Detecting Disengagement in Virtual Learning as an Anomaly using Temporal Convolutional Network Autoencoder

Authors: Ali Abedi, Shehroz S. Khan

Abstract: Student engagement is an important factor in meeting the goals of virtual learning programs. Automatic measurement of student engagement provides helpful information for instructors to meet learning program objectives and individualize program delivery. Many existing approaches solve video-based engagement measurement using the traditional frameworks of binary classification (classifying video snippets into engaged or disengaged classes), multi-class classification (classifying video snippets into multiple classes corresponding to different levels of engagement), or regression (estimating a continuous value corresponding to the level of engagement). However, we observe that while the engagement behaviour is mostly well-defined (e.g., focused, not distracted), disengagement can be expressed in various ways. In addition, in some cases, the data for disengaged classes may not be sufficient to train generalizable binary or multi-class classifiers. To handle this situation, in this paper, for the first time, we formulate detecting disengagement in virtual learning as an anomaly detection problem. We design various autoencoders, including temporal convolutional network autoencoder, long short-term memory autoencoder, and feedforward autoencoder using different behavioral and affect features for video-based student disengagement detection. The result of our experiments on two publicly available student engagement datasets, DAiSEE and EmotiW, shows the superiority of the proposed approach for disengagement detection as an anomaly compared to binary classifiers for classifying videos into engaged versus disengaged classes (with an average improvement of 9% on the area under the curve of the receiver operating characteristic curve and 22% on the area under the curve of the precision-recall curve).

Submitted 4 February, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

arXiv:2211.03615 [pdf, other]

MAISON -- Multimodal AI-based Sensor platform for Older Individuals

Authors: Ali Abedi, Faranak Dayyani, Charlene Chu, Shehroz S. Khan

Abstract: The global population is aging, creating a need for tools that can enable older adults' greater independence and the ability to age at home, as well as assist healthcare workers. It is feasible to achieve this objective by building predictive models that assist healthcare workers in monitoring and analyzing older adults' behavioral, functional, and psychological data. To develop such models, a large amount of multimodal sensor data is typically required. In this paper, we propose MAISON, a scalable cloud-based platform of commercially available smart devices capable of collecting desired multimodal sensor data from older adults and patients living in their own homes. The MAISON platform is novel due to its ability to collect a greater variety of data modalities than the existing platforms, as well as its new features that result in seamless data collection and ease of use for older adults who may not be digitally literate. We demonstrated the feasibility of the MAISON platform with two older adults discharged home from a large rehabilitation center. The results indicate that the MAISON platform was able to collect and store sensor data in a cloud without functional glitches or performance degradation. This paper will also discuss the challenges faced during the development of the platform and data collection in the homes of older adults. MAISON is a novel platform designed to collect multimodal data and facilitate the development of predictive models for detecting key health indicators, including social isolation, depression, and functional decline, and is feasible to use with older adults in the community.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2208.04548 [pdf, other]

Inconsistencies in the Definition and Annotation of Student Engagement in Virtual Learning Datasets: A Critical Review

Authors: Shehroz S. Khan, Ali Abedi, Tracey Colella

Abstract: Background: Student engagement (SE) in virtual learning can have a major impact on meeting learning objectives and program dropout risks. Developing Artificial Intelligence (AI) models for automatic SE measurement requires annotated datasets. However, existing SE datasets suffer from inconsistent definitions and annotation protocols mostly unaligned with the definition of SE in educational psychology. This issue could be misleading in developing generalizable AI models and make it hard to compare the performance of these models developed on different datasets. The objective of this critical review was to explore the existing SE datasets and highlight inconsistencies in terms of differing engagement definitions and annotation protocols. Methods: Several academic databases were searched for publications introducing new SE datasets. The datasets containing students' single- or multi-modal data in online or offline computer-based virtual learning sessions were included. The definition and annotation of SE in the existing datasets were analyzed based on our defined seven dimensions of engagement annotation: sources, data modalities, timing, temporal resolution, level of abstraction, combination, and quantification. Results: Thirty SE measurement datasets met the inclusion criteria. The reviewed SE datasets used very diverse and inconsistent definitions and annotation protocols. Unexpectedly, very few of the reviewed datasets used existing psychometrically validated scales in their definition of SE. Discussion: The inconsistent definition and annotation of SE are problematic for research on developing comparable AI models for automatic SE measurement. Some of the existing SE definitions and protocols in settings other than virtual learning that have the potential to be used in virtual learning are introduced.

Submitted 16 January, 2023; v1 submitted 9 August, 2022; originally announced August 2022.

arXiv:2109.04021 [pdf, other]

Supervised Contrastive Learning for Detecting Anomalous Driving Behaviours from Multimodal Videos

Authors: Shehroz S. Khan, Ziting Shen, Haoying Sun, Ax Patel, Ali Abedi

Abstract: Distracted driving is one of the major reasons for vehicle accidents. Therefore, detecting distracted driving behaviors is of paramount importance to reduce the millions of deaths and injuries occurring worldwide. Distracted or anomalous driving behaviors are deviations from 'normal' driving that need to be identified correctly to alert the driver. However, these driving behaviors do not comprise one specific type of driving style and their distribution can be different during the training and test phases of a classifier. We formulate this problem as a supervised contrastive learning approach to learn a visual representation to detect normal, and seen and unseen anomalous driving behaviors. We made a change to the standard contrastive loss function to adjust the similarity of negative pairs to aid the optimization. Normally, in a (self) supervised contrastive framework, the projection head layers are omitted during the test phase as the encoding layers are considered to contain general visual representative information. However, we assert that for a video-based supervised contrastive learning task, including a projection head can be beneficial. We showed our results on a driver anomaly detection dataset that contains 783 minutes of video recordings of normal and anomalous driving behaviors of 31 drivers from the various top and front cameras (both depth and infrared). Out of 9 video modalities combinations, our proposed contrastive approach improved the ROC AUC on 6 in comparison to the baseline models (from 4.23% to 8.91% for different modalities). We performed statistical tests that showed evidence that our proposed method performs better than the baseline contrastive learning setup. Finally, the results showed that the fusion of depth and infrared modalities from the top and front views achieved the best AUC ROC of 0.9738 and AUC PR of 0.9772.

Submitted 29 April, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

Comments: 8 pages, 2 figures, 5 tables

arXiv:2106.10882 [pdf, other]

Affect-driven Ordinal Engagement Measurement from Video

Authors: Ali Abedi, Shehroz Khan

Abstract: In education and intervention programs, user engagement has been identified as a major factor in successful program completion. Automatic measurement of user engagement provides helpful information for instructors to meet program objectives and individualize program delivery. In this paper, we present a novel approach for video-based engagement measurement in virtual learning programs. We propose to use affect states, continuous values of valence and arousal extracted from consecutive video frames, along with a new latent affective feature vector and behavioral features for engagement measurement. Deep-learning sequential models are trained and validated on the extracted frame-level features. In addition, due to the fact that engagement is an ordinal variable, we develop the ordinal versions of the above models in order to address the problem of engagement measurement as an ordinal classification problem. We evaluated the performance of the proposed method on the only two publicly available video engagement measurement datasets, DAiSEE and EmotiW-EW, containing videos of students in online learning programs. Our experiments show a state-of-the-art engagement level classification accuracy of 67.4% on the DAiSEE dataset, and a regression mean squared error of 0.0508 on the EmotiW-EW dataset. Our ablation study shows the effectiveness of incorporating affect states and ordinality of engagement in engagement measurement.

Submitted 6 November, 2022; v1 submitted 21 June, 2021; originally announced June 2021.

Comments: 13 pages, 8 figures, 7 tables

arXiv:2106.07088 [pdf]

A new soft computing method for integration of expert's knowledge in reinforcement learning problems

Authors: Mohsen Annabestani, Ali Abedi, Mohammad Reza Nematollahi, Mohammad Bagher Naghibi Sistani

Abstract: This paper proposes a novel fuzzy action selection method to leverage human knowledge in reinforcement learning problems. Based on the estimates of the most current action-state values, the proposed fuzzy nonlinear mapping assigns each member of the action set to its probability of being chosen in the next step. A user-tunable parameter is introduced to control the action selection policy, which determines the agent's greedy behavior throughout the learning process. This parameter resembles the role of the temperature parameter in the softmax action selection policy, but its tuning process can be more knowledge-oriented since this parameter reflects the human knowledge into the learning agent by making modifications in the fuzzy rule base. Simulation results indicate that including fuzzy logic within the reinforcement learning in the proposed manner improves the learning algorithm's convergence rate, and provides superior performance.

Submitted 13 June, 2021; originally announced June 2021.

arXiv:2104.10122 [pdf, other]

Improving state-of-the-art in Detecting Student Engagement with Resnet and TCN Hybrid Network

Authors: Ali Abedi, Shehroz S. Khan

Abstract: Automatic detection of students' engagement in online learning settings is a key element to improve the quality of learning and to deliver personalized learning materials to them. Varying levels of engagement exhibited by students in an online classroom is an affective behavior that takes place over space and time. Therefore, we formulate detecting levels of students' engagement from videos as a spatio-temporal classification problem. In this paper, we present a novel end-to-end Residual Network (ResNet) and Temporal Convolutional Network (TCN) hybrid neural network architecture for students' engagement level detection in videos. The 2D ResNet extracts spatial features from consecutive video frames, and the TCN analyzes the temporal changes in video frames to detect the level of engagement. The spatial and temporal arms of the hybrid network are jointly trained on raw video frames of a large publicly available students' engagement detection dataset, DAiSEE. We compared our method with several competing students' engagement detection methods on this dataset. The ResNet+TCN architecture outperforms all other studied methods, improves the state-of-the-art engagement level detection accuracy, and sets a new baseline for future research.

Submitted 16 October, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: 7 pages, 3 figures, 1 table

arXiv:2011.03180 [pdf, other]

FedSL: Federated Split Learning on Distributed Sequential Data in Recurrent Neural Networks

Authors: Ali Abedi, Shehroz S. Khan

Abstract: Federated Learning (FL) and Split Learning (SL) are privacy-preserving Machine-Learning (ML) techniques that enable training ML models over data distributed among clients without requiring direct access to their raw data. Existing FL and SL approaches work on horizontally or vertically partitioned data and cannot handle sequentially partitioned data where segments of multiple-segment sequential data are distributed across clients. In this paper, we propose a novel federated split learning framework, FedSL, to train models on distributed sequential data. The most common ML models to train on sequential data are Recurrent Neural Networks (RNNs). Since the proposed framework is privacy-preserving, segments of multiple-segment sequential data cannot be shared between clients or between clients and server. To circumvent this limitation, we propose a novel SL approach tailored for RNNs. An RNN is split into sub-networks, and each sub-network is trained on one client containing single segments of multiple-segment training sequences. During local training, the sub-networks on different clients communicate with each other to capture latent dependencies between consecutive segments of multiple-segment sequential data on different clients, but without sharing raw data or complete model parameters. After training local sub-networks with local sequential data segments, all clients send their sub-networks to a federated server where sub-networks are aggregated to generate a global model. The experimental results on simulated and real-world datasets demonstrate that the proposed method successfully trains models on distributed sequential data, while preserving privacy, and outperforms previous FL and centralized learning approaches in terms of achieving higher accuracy in fewer communication rounds.

Submitted 6 November, 2022; v1 submitted 5 November, 2020; originally announced November 2020.

arXiv:1511.06344 [pdf, other]

doi 10.1109/TWC.2015.2503750

Channel-Adaptive Packetization Policy for Minimal Latency and Maximal Energy Efficiency

Authors: Abolfazl Razi, Fatemeh Afghah, Ali Abedi

Abstract: This article considers the problem of delay optimal bundling of the input symbols into transmit packets in the entry point of a wireless sensor network such that the link delay is minimized under an arbitrary arrival rate and a given channel error rate. The proposed policy exploits the variable packet length feature of contemporary communications protocols in order to minimize the link delay via packet length regularization. This is performed through concrete characterization of the end-to-end link delay for a zero error tolerance system with First Come First Serve (FCFS) queuing discipline and Automatic Repeat Request (ARQ) retransmission mechanism. The derivations are provided for an uncoded system as well as a coded system with a given bit error rate. The proposed packetization policy provides an optimal packetization interval that minimizes the end-to-end delay for a given channel with certain bit error probability. This algorithm can also be used for near-optimal bundling of input symbols for dynamic channel conditions provided that the channel condition varies slowly over time with respect to symbol arrival rate. This algorithm complements the current network-based delay-optimal routing and scheduling algorithms in order to further reduce the end-to-end delivery time. Moreover, the proposed method is employed to solve the problem of energy efficiency maximization under an average delay constraint by recasting it as a convex optimization problem.

Submitted 19 November, 2015; originally announced November 2015.

Comments: IEEE Transactions on Wireless Communications, to appear 2015

Showing 1–21 of 21 results for author: Aabedi, A