
Enhancing Robotic Arm Activity Recognition with Vision Transformers and Wavelet-Transformed Channel State Information

Authors: Rojin Zandi, Kian Behzad, Elaheh Motamedi, Hojjat Salehinejad, Milad Siami

Abstract: Vision-based methods are commonly used in robotic arm activity recognition. These approaches typically rely on line-of-sight (LoS) and raise privacy concerns, particularly in smart home applications. Passive Wi-Fi sensing represents a new paradigm for recognizing human and robotic arm activities, utilizing channel state information (CSI) measurements to identify activities in indoor environments. In this paper, a novel machine learning approach based on the discrete wavelet transform and vision transformers for robotic arm activity recognition from CSI measurements in indoor settings is proposed. This method outperforms convolutional neural network (CNN) and long short-term memory (LSTM) models in robotic arm activity recognition, particularly when LoS is obstructed by barriers, without relying on external or internal sensors or visual aids. Experiments are conducted using four different data collection scenarios and four different robotic arm activities. Performance results demonstrate that the wavelet transform can significantly enhance the accuracy of vision transformer networks in robotic arm activity recognition.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted at 2024 IEEE International Symposium on Personal, Indoor and Mobile Radio Communications

arXiv:2401.09606 [pdf, other]

Robustness Evaluation of Machine Learning Models for Robot Arm Action Recognition in Noisy Environments

Authors: Elaheh Motamedi, Kian Behzad, Rojin Zandi, Hojjat Salehinejad, Milad Siami

Abstract: In the realm of robot action recognition, identifying distinct but spatially proximate arm movements using vision systems in noisy environments poses a significant challenge. This paper studies robot arm action recognition in noisy environments using machine learning techniques. Specifically, a vision system is used to track the robot's movements followed by a deep learning model to extract the arm's key points. Through a comparative analysis of machine learning methods, the effectiveness and robustness of this model are assessed in noisy environments. A case study was conducted using the Tic-Tac-Toe game in a 3-by-3 grid environment, where the focus is to accurately identify the actions of the arms in selecting specific locations within this constrained environment. Experimental results show that our approach can achieve precise key point detection and action classification despite the addition of noise and uncertainties to the dataset.

Submitted 17 January, 2024; originally announced January 2024.

Comments: Accepted at ICASSP

arXiv:2312.15345 [pdf, other]

RoboFiSense: Attention-Based Robotic Arm Activity Recognition with WiFi Sensing

Authors: Rojin Zandi, Kian Behzad, Elaheh Motamedi, Hojjat Salehinejad, Milad Siami

Abstract: Despite the current surge of interest in autonomous robotic systems, robot activity recognition within restricted indoor environments remains a formidable challenge. Conventional methods for detecting and recognizing robotic arms' activities often rely on vision-based or light detection and ranging (LiDAR) sensors, which require line-of-sight (LoS) access and may raise privacy concerns, for example, in nursing facilities. This research pioneers an innovative approach harnessing channel state information (CSI) measured from WiFi signals, subtly influenced by the activity of robotic arms. We developed an attention-based network to classify eight distinct activities performed by a Franka Emika robotic arm in different situations. Our proposed bidirectional vision transformer-concatenated (BiVTC) methodology aspires to predict robotic arm activities accurately, even when trained on activities with different velocities, all without dependency on external or internal sensors or visual aids. The high dependency of CSI data on the environment motivated us to study the problem of sniffer location selection by systematically changing the sniffer's location and collecting different sets of data. Finally, this paper also marks the first publication of the CSI data of eight distinct robotic arm activities, collectively referred to as RoboFiSense. This initiative aims to provide a benchmark dataset and baselines to the research community, fostering advancements in the field of robotics sensing.

Submitted 6 May, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

Comments: 11 pages, 11 figures

arXiv:2308.02425 [pdf, other]

Hypertension Detection From High-Dimensional Representation of Photoplethysmogram Signals

Authors: Navid Hasanzadeh, Shahrokh Valaee, Hojjat Salehinejad

Abstract: Hypertension is commonly referred to as the "silent killer", since it can lead to severe health complications without any visible symptoms. Early detection of hypertension is crucial in preventing significant health issues. Although some studies suggest a relationship between blood pressure and certain vital signals, such as Photoplethysmogram (PPG), reliable generalization of the proposed blood pressure estimation methods is not yet guaranteed. This lack of certainty has resulted in some studies doubting the existence of such relationships, or considering them weak and limited to heart rate and blood pressure. In this paper, a high-dimensional representation technique based on random convolution kernels is proposed for hypertension detection using PPG signals. The results show that this relationship extends beyond heart rate and blood pressure, demonstrating the feasibility of hypertension detection with generalization. Additionally, the utilized transform using convolution kernels, as an end-to-end time-series feature extractor, outperforms the methods proposed in the previous studies and state-of-the-art deep learning models.

Submitted 30 July, 2023; originally announced August 2023.

Comments: 4 pages, 2 figures, 1 table, Accepted at IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI 23), Oct. 15--18, 2023, Pittsburgh, Pennsylvania, USA

arXiv:2307.03829 [pdf, other]

Robot Motion Prediction by Channel State Information

Authors: Rojin Zandi, Hojjat Salehinejad, Kian Behzad, Elaheh Motamedi, Milad Siami

Abstract: Autonomous robotic systems have gained a lot of attention in recent years. However, accurate prediction of robot motion in indoor environments with limited visibility is challenging. While vision-based and light detection and ranging (LiDAR) sensors are commonly used for motion detection and localization of robotic arms, they are privacy-invasive and depend on a clear line-of-sight (LOS) for precise measurements. In cases where additional sensors are not available or LOS is not possible, these technologies may not be the best option. This paper proposes a novel method that employs channel state information (CSI) from WiFi signals affected by robotic arm motion. We developed a convolutional neural network (CNN) model to classify four different activities of a Franka Emika robotic arm. The implemented method seeks to accurately predict robot motion even in scenarios in which the robot is obscured by obstacles, without relying on any attached or internal sensors.

Submitted 7 July, 2023; originally announced July 2023.

Comments: 6 pages, 10 figures, 2 tables, MLSP Conference

arXiv:2210.05078 [pdf, other]

Joint Human Orientation-Activity Recognition Using WiFi Signals for Human-Machine Interaction

Authors: Hojjat Salehinejad, Navid Hasanzadeh, Radomir Djogo, Shahrokh Valaee

Abstract: WiFi sensing is an important part of the new WiFi 802.11bf standard, which can detect motion and measure distances. In recent years, some machine learning methods have been proposed for human activity recognition from WiFi signals. However, to the best of our knowledge, none of these methods have explored orientation prediction of the user using WiFi signals. Orientation prediction is particularly critical for human-machine interaction in an environment with multiple smart devices. In this paper, we propose a data collection setup and machine learning models for joint human orientation and activity recognition using WiFi signals from a single access point (AP) or multiple APs. The results show the feasibility of joint orientation-activity recognition in an indoor environment with high accuracy.

Submitted 21 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

arXiv:2203.03445 [pdf, other]

S-Rocket: Selective Random Convolution Kernels for Time Series Classification

Authors: Hojjat Salehinejad, Yang Wang, Yuanhao Yu, Tang Jin, Shahrokh Valaee

Abstract: Random convolution kernel transform (Rocket) is a fast, efficient, and novel approach for time series feature extraction using a large number of independent randomly initialized 1-D convolution kernels of different configurations. The output of the convolution operation on each time series is represented by a partial positive value (PPV). A concatenation of PPVs from all kernels is the input feature vector to a Ridge regression classifier. Unlike typical deep learning models, the kernels are not trained and there is no weighted/trainable connection between kernels or concatenated features and the classifier. Since these kernels are generated randomly, a portion of these kernels may not contribute positively to the performance of the model. Hence, selection of the most important kernels and pruning the redundant and less important ones is necessary to reduce computational complexity and accelerate inference of Rocket for applications on edge devices. Selection of these kernels is a combinatorial optimization problem. In this paper, we propose a scheme for selecting these kernels while maintaining the classification performance. First, the original model is pre-trained at full capacity. Then, a population of binary candidate state vectors is initialized where each element of a vector represents the active/inactive status of a kernel. A population-based optimization algorithm evolves the population in order to find a best state vector which minimizes the number of active kernels while maximizing the accuracy of the classifier. This activation function is a linear combination of the total number of active kernels and the classification accuracy of the pre-trained classifier with the active kernels. Finally, the selected kernels in the best state vector are used to train the Ridge regression classifier.

Submitted 20 September, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

arXiv:2201.09310 [pdf, other]

doi 10.1109/ICASSP43922.2022.9746803

LiteHAR: Lightweight Human Activity Recognition from WiFi Signals with Random Convolution Kernels

Authors: Hojjat Salehinejad, Shahrokh Valaee

Abstract: Anatomical movements of the human body can change the channel state information (CSI) of wireless signals in an indoor environment. These changes in the CSI signals can be used for human activity recognition (HAR), which is a predominant and unique approach due to preserving privacy and flexibility of capturing motions in non-line-of-sight environments. Existing models for HAR generally have a high computational complexity, contain a very large number of trainable parameters, and require extensive computational resources. This issue is particularly important for implementation of these solutions on devices with limited resources, such as edge devices. In this paper, we propose a lightweight human activity recognition (LiteHAR) approach which, unlike the state-of-the-art deep learning models, does not require extensive training of a large number of parameters. This approach uses randomly initialized convolution kernels for feature extraction from CSI signals without training the kernels. The extracted features are then classified using a Ridge regression classifier, which has a linear computational complexity and is very fast. LiteHAR is evaluated on a public benchmark dataset and the results show its high classification performance in comparison with the complex deep learning models with a much lower computational complexity.

Submitted 23 January, 2022; originally announced January 2022.

arXiv:2102.13188 [pdf, ps, other]

A Framework For Pruning Deep Neural Networks Using Energy-Based Models

Authors: Hojjat Salehinejad, Shahrokh Valaee

Abstract: A typical deep neural network (DNN) has a large number of trainable parameters. Choosing a network with proper capacity is challenging and generally a larger network with excessive capacity is trained. Pruning is an established approach to reducing the number of parameters in a DNN. In this paper, we propose a framework for pruning DNNs based on a population-based global optimization method. This framework can use any pruning objective function. As a case study, we propose a simple but efficient objective function based on the concept of energy-based models. Our experiments on ResNets, AlexNet, and SqueezeNet for the CIFAR-10 and CIFAR-100 datasets show a pruning rate of more than $50\%$ of the trainable parameters with approximately $<5\%$ and $<1\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

Submitted 25 February, 2021; originally announced February 2021.

Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2021. arXiv admin note: text overlap with arXiv:2006.04270, arXiv:2102.05437

arXiv:2102.05437 [pdf, ps, other]

Pruning of Convolutional Neural Networks Using Ising Energy Model

Authors: Hojjat Salehinejad, Shahrokh Valaee

Abstract: Pruning is one of the major methods to compress deep neural networks. In this paper, we propose an Ising energy model within an optimization framework for pruning convolutional kernels and hidden units. This model is designed to reduce redundancy between weight kernels and detect inactive kernels/hidden units. Our experiments using ResNets, AlexNet, and SqueezeNet on CIFAR-10 and CIFAR-100 datasets show that the proposed method on average can achieve a pruning rate of more than $50\%$ of the trainable parameters with approximately $<10\%$ and $<5\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

Submitted 10 February, 2021; originally announced February 2021.

Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2021

arXiv:2102.04869 [pdf]

doi 10.1038/s41598-021-95533-2

A Real-World Demonstration of Machine Learning Generalizability: Intracranial Hemorrhage Detection on Head CT

Authors: Hojjat Salehinejad, Jumpei Kitamura, Noah Ditkofsky, Amy Lin, Aditya Bharatha, Suradech Suthiphosuwan, Hui-Ming Lin, Jefferson R. Wilson, Muhammad Mamdani, Errol Colak

Abstract: Machine learning (ML) holds great promise in transforming healthcare. While published studies have shown the utility of ML models in interpreting medical imaging examinations, these are often evaluated under laboratory settings. The importance of real world evaluation is best illustrated by case studies that have documented successes and failures in the translation of these models into clinical environments. A key prerequisite for the clinical adoption of these technologies is demonstrating generalizable ML model performance under real world circumstances. The purpose of this study was to demonstrate that ML model generalizability is achievable in medical imaging with the detection of intracranial hemorrhage (ICH) on non-contrast computed tomography (CT) scans serving as the use case. An ML model was trained using 21,784 scans from the RSNA Intracranial Hemorrhage CT dataset while generalizability was evaluated using an external validation dataset obtained from our busy trauma and neurosurgical center. This real world external validation dataset consisted of every unenhanced head CT scan (n = 5,965) performed in our emergency department in 2019 without exclusion. The model demonstrated an AUC of 98.4%, sensitivity of 98.8%, and specificity of 98.0%, on the test dataset. On external validation, the model demonstrated an AUC of 95.4%, sensitivity of 91.3%, and specificity of 94.1%. Evaluating the ML model using a real world external validation dataset that is temporally and geographically distinct from the training dataset indicates that ML generalizability is achievable in medical imaging applications.

Submitted 9 February, 2021; originally announced February 2021.

Comments: This paper is under review

arXiv:2010.13336 [pdf, other]

Deep Sequential Learning for Cervical Spine Fracture Detection on Computed Tomography Imaging

Authors: Hojjat Salehinejad, Edward Ho, Hui-Ming Lin, Priscila Crivellaro, Oleksandra Samorodova, Monica Tafur Arciniegas, Zamir Merali, Suradech Suthiphosuwan, Aditya Bharatha, Kristen Yeom, Muhammad Mamdani, Jefferson Wilson, Errol Colak

Abstract: Fractures of the cervical spine are a medical emergency and may lead to permanent paralysis and even death. Accurate diagnosis in patients with suspected fractures by computed tomography (CT) is critical to patient management. In this paper, we propose a deep convolutional neural network (DCNN) with a bidirectional long short-term memory (BLSTM) layer for the automated detection of cervical spine fractures in CT axial images. We used an annotated dataset of 3,666 CT scans (729 positive and 2,937 negative cases) to train and validate the model. The validation results show a classification accuracy of 70.92% and 79.18% on the balanced (104 positive and 104 negative cases) and imbalanced (104 positive and 419 negative cases) test datasets, respectively.

Submitted 5 February, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

Comments: This paper is accepted for presentation at the IEEE International Symposium on Biomedical Imaging (ISBI) 2021

arXiv:2006.04270 [pdf, ps, other]

doi 10.1109/TNNLS.2021.3069970

EDropout: Energy-Based Dropout and Pruning of Deep Neural Networks

Authors: Hojjat Salehinejad, Shahrokh Valaee

Abstract: Dropout is a well-known regularization method that samples a sub-network from a larger deep neural network and trains different sub-networks on different subsets of the data. Inspired by the dropout concept, we propose EDropout as an energy-based framework for pruning neural networks in classification tasks. In this approach, a set of binary pruning state vectors (population) represents a set of corresponding sub-networks from an arbitrary provided original neural network. An energy loss function assigns a scalar energy loss value to each pruning state. The energy-based model stochastically evolves the population to find states with lower energy loss. The best pruning state is then selected and applied to the original network. Similar to dropout, the kept weights are updated using backpropagation in a probabilistic model. The energy-based model again searches for better pruning states and the cycle continues. In effect, this procedure switches between the energy model, which manages the pruning states, and the probabilistic model, which updates the temporarily unpruned weights, in each iteration. The population can dynamically converge to a pruning state. This can be interpreted as dropout leading to pruning the network. From an implementation perspective, EDropout can prune typical neural networks without modification of the network architecture. We evaluated the proposed method on different flavours of ResNets, AlexNet, and SqueezeNet on the Kuzushiji, Fashion, CIFAR-10, CIFAR-100, and Flowers datasets, and compared the pruning rate and classification performance of the models. On average the networks trained with EDropout achieved a pruning rate of more than $50\%$ of the trainable parameters with approximately $<5\%$ and $<1\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

Submitted 7 March, 2022; v1 submitted 7 June, 2020; originally announced June 2020.

arXiv:1904.13310 [pdf, other]

Survey of Dropout Methods for Deep Neural Networks

Authors: Alex Labach, Hojjat Salehinejad, Shahrokh Valaee

Abstract: Dropout methods are a family of stochastic techniques used in neural network training or inference that have generated significant research interest and are widely used in practice. They have been successfully applied in neural network regularization, model compression, and in measuring the uncertainty of neural network outputs. While originally formulated for dense neural network layers, recent advances have made dropout methods also applicable to convolutional and recurrent neural network layers. This paper summarizes the history of dropout methods, their various applications, and current areas of research interest. Important proposed methods are described in additional detail.

Submitted 25 October, 2019; v1 submitted 25 April, 2019; originally announced April 2019.

arXiv:1902.08673 [pdf, other]

Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks

Authors: Hojjat Salehinejad, Shahrokh Valaee

Abstract: Overfitting is a major problem in training machine learning models, specifically deep neural networks. This problem may be caused by imbalanced datasets and initialization of the model parameters, which conforms the model too closely to the training data and negatively affects the generalization performance of the model for unseen data. The original dropout is a regularization technique to drop hidden units randomly during training. In this paper, we propose an adaptive technique to wisely drop the visible and hidden units in a deep neural network using Ising energy of the network. The preliminary results show that the proposed approach can keep the classification performance competitive to the original network while eliminating optimization of unnecessary network parameters in each training cycle. The dropout state of units can also be applied to the trained (inference) model. This technique could compress the network in terms of number of parameters up to 41.18% and 55.86% for the classification task on the MNIST and Fashion-MNIST datasets, respectively.

Submitted 6 February, 2019; originally announced February 2019.

Comments: This paper is accepted at 44th IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2019

arXiv:1809.10245 [pdf, other]

Cylindrical Transform: 3D Semantic Segmentation of Kidneys With Limited Annotated Images

Authors: Hojjat Salehinejad, Sumeya Naqvi, Errol Colak, Joseph Barfett, Shahrokh Valaee

Abstract: In this paper, we propose a novel technique for sampling sequential images using a cylindrical transform in a cylindrical coordinate system for kidney semantic segmentation in abdominal computed tomography (CT). The images generated from a cylindrical transform augment a limited annotated set of images in three dimensions. This approach enables us to train contemporary classification deep convolutional neural networks (DCNNs) instead of fully convolutional networks (FCNs) for semantic segmentation. Typical semantic segmentation models segment a sequential set of images (e.g. CT or video) by segmenting each image independently. However, the proposed method not only considers the spatial dependency in the x-y plane, but also the spatial sequential dependency along the z-axis. The results show that classification DCNNs, trained on cylindrical transformed images, can achieve a higher segmentation performance value than FCNs using a limited number of annotated images.

Submitted 24 September, 2018; originally announced September 2018.

Comments: This paper is accepted for presentation at IEEE Global Conference on Signal and Information Processing (IEEE GlobalSIP), California, USA, 2018

arXiv:1801.01078 [pdf, ps, other]

Recent Advances in Recurrent Neural Networks

Authors: Hojjat Salehinejad, Sharan Sankar, Joseph Barfett, Errol Colak, Shahrokh Valaee

Abstract: Recurrent neural networks (RNNs) are capable of learning features and long term dependencies from sequential and time-series data. The RNNs have a stack of non-linear units where at least one connection between units forms a directed cycle. A well-trained RNN can model any dynamical system; however, training RNNs is mostly plagued by issues in learning long-term dependencies. In this paper, we present a survey on RNNs and several new advances for newcomers and professionals in the field. The fundamentals and recent advances are explained and the research challenges are introduced.

Submitted 22 February, 2018; v1 submitted 28 December, 2017; originally announced January 2018.

Comments: arXiv admin note: text overlap with arXiv:1602.04335

arXiv:1712.01636 [pdf, other]

Generalization of Deep Neural Networks for Chest Pathology Classification in X-Rays Using Generative Adversarial Networks

Authors: Hojjat Salehinejad, Shahrokh Valaee, Tim Dowdell, Errol Colak, Joseph Barfett

Abstract: Medical datasets are often highly imbalanced with over-representation of common medical problems and a paucity of data from rare conditions. We propose simulation of pathology in images to overcome the above limitations. Using chest X-rays as a model medical image, we implement a generative adversarial network (GAN) to create artificial images based upon a modest sized labeled dataset. We employ a combination of real and artificial images to train a deep convolutional neural network (DCNN) to detect pathology across five classes of chest X-rays. Furthermore, we demonstrate that augmenting the original imbalanced dataset with GAN generated images improves performance of chest pathology classification using the proposed DCNN in comparison to the same DCNN trained with the original dataset alone. This improved performance is largely attributed to balancing of the dataset using GAN generated images, where image classes that are lacking in example images are preferentially augmented.

Submitted 12 February, 2018; v1 submitted 8 November, 2017; originally announced December 2017.

Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2018

arXiv:1709.06909 [pdf, ps, other]

Opposition based Ensemble Micro Differential Evolution

Authors: Hojjat Salehinejad, Shahryar Rahnamayan, Hamid R. Tizhoosh

Abstract: Differential evolution (DE) algorithm with a small population size is called Micro-DE (MDE). A small population size decreases the computational complexity but also reduces the exploration ability of DE by limiting the population diversity. In this paper, we propose the idea of combining ensemble mutation scheme selection and opposition-based learning concepts to enhance the diversity of population in MDE at mutation and selection stages. The proposed algorithm enhances the diversity of population by generating a random mutation scale factor per individual and per dimension, randomly assigning a mutation scheme to each individual in each generation, and diversifying individuals selection using opposition-based learning. This approach is easy to implement and does not require the setting of mutation scheme selection and mutation scale factor. Experiments are conducted for a variety of objective functions with low and high dimensionality on the CEC Black-Box Optimization Benchmarking 2015 (CEC-BBOB 2015). The results show superior performance of the proposed algorithm compared to the other micro-DE algorithms.

Submitted 20 September, 2017; v1 submitted 7 September, 2017; originally announced September 2017.

Comments: This paper is accepted for presentation at IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2017), Hawaii, USA, 2017

arXiv:1708.09254 [pdf, other]

Interpretation of Mammogram and Chest X-Ray Reports Using Deep Neural Networks - Preliminary Results

Authors: Hojjat Salehinejad, Shahrokh Valaee, Aren Mnatzakanian, Tim Dowdell, Joseph Barfett, Errol Colak

Abstract: Radiology reports are an important means of communication between radiologists and other physicians. These reports express a radiologist's interpretation of a medical imaging examination and are critical in establishing a diagnosis and formulating a treatment plan. In this paper, we propose a Bi-directional convolutional neural network (Bi-CNN) model for the interpretation and classification of mammograms based on breast density and chest radiographic radiology reports based on chest pathology. The proposed approach helps to organize databases of radiology reports, retrieve them expeditiously, and evaluate the radiology report that could be used in an auditing system to decrease incorrect diagnoses. Our study revealed that the proposed Bi-CNN outperforms the random forest and the support vector machine methods.

Submitted 12 September, 2017; v1 submitted 24 August, 2017; originally announced August 2017.

Comments: This paper is submitted for peer-review

arXiv:1708.04347 [pdf, other]

Image Augmentation using Radial Transform for Training Deep Neural Networks

Authors: Hojjat Salehinejad, Shahrokh Valaee, Timothy Dowdell, Joseph Barfett

Abstract: Deep learning models have a large number of free parameters that must be estimated by efficient training of the models on a large number of training data samples to increase their generalization performance. In real-world applications, the data available to train these networks is often limited or imbalanced. We propose a sampling method based on the radial transform in a polar coordinate system for image augmentation to facilitate the training of deep learning models from limited source data. This pixel-wise transform provides representations of the original image in the polar coordinate system by generating a new image from each pixel. This technique can generate radial transformed images up to the number of pixels in the original image to increase the diversity of poorly represented image classes. Our experiments show improved generalization performance in training deep convolutional neural networks with radial transformed images.

Submitted 14 February, 2018; v1 submitted 14 August, 2017; originally announced August 2017.

Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2018

arXiv:1708.02238 [pdf, other]

A Convolutional Neural Network for Search Term Detection

Authors: Hojjat Salehinejad, Joseph Barfett, Parham Aarabi, Shahrokh Valaee, Errol Colak, Bruce Gray, Tim Dowdell

Abstract: Pathfinding in hospitals is challenging for patients, visitors, and even employees. Many people have experienced getting lost due to lack of clear guidance, large footprint of hospitals, and confusing array of hospital wings. In this paper, we propose Halo, an indoor navigation application based on voice-user interaction to help provide directions for users without assistance of a localization system. The main challenge is accurate detection of origin and destination search terms. A custom convolutional neural network (CNN) is proposed to detect origin and destination search terms from transcription of a submitted speech query. The CNN is trained based on a set of queries tailored specifically for hospital and clinic environments. Performance of the proposed model is studied and compared with Levenshtein distance-based word matching.

Submitted 7 November, 2017; v1 submitted 6 August, 2017; originally announced August 2017.

Comments: This paper is accepted for presentation at 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications

arXiv:1602.04335 [pdf, ps, other]

Learning Over Long Time Lags

Authors: Hojjat Salehinejad

Abstract: The advantage of recurrent neural networks (RNNs) in learning dependencies between time-series data has distinguished RNNs from other deep learning models. Recently, many advances have been proposed in this emerging field. However, there is a lack of comprehensive review on memory models in RNNs in the literature. This paper provides a fundamental review on RNNs and the long short term memory (LSTM) model. Then, it provides a survey of recent advances in different memory enhancements and learning techniques for capturing long term dependencies in RNNs.

Submitted 13 February, 2016; originally announced February 2016.

Comments: This is a draft article, in preparation to submit for peer-review

arXiv:1601.00360 [pdf]

Internet of Things for Residential Areas: Toward Personalized Energy Management Using Big Data

Authors: Hojjat Salehinejad

Abstract: Intelligent management of machines, particularly in a residence area, has been of interest for many years. However, such system design has always been limited to simple control of machines from a local area or remotely from the Internet. In this report, for the first time, an intelligent system is proposed that not only provides the user with intelligent control of machines, but also utilizes big data and optimization techniques to provide promotional offers to the user to optimize energy consumption of machines. Since a high traffic communication is involved among the machines and the optimization-big data core of system, the communication core of the proposed system is designed based on cloud, where many challenging issues such as spectrum assignment and resource management are involved. To deal with that, the communication network in the home area network (HAN) is designed based on the cognitive radio system, where a new spectrum assignment method based on the ant colony optimization (ACO) algorithm is proposed to perform spectrum assignment to the machines in the HAN. Performance evaluation of the proposed spectrum assignment method shows its performance in fair spectrum assignment among machines.

Submitted 29 December, 2015; originally announced January 2016.

Comments: Draft of technical report. Limited version under preparation for submission

arXiv:1512.07980 [pdf, ps, other]

Diversity Enhancement for Micro-Differential Evolution

Authors: Hojjat Salehinejad, Shahryar Rahnamayan, Hamid R. Tizhoosh

Abstract: The differential evolution (DE) algorithm suffers from high computational time due to slow nature of evaluation. In contrast, micro-DE (MDE) algorithms employ a very small population size, which can converge faster to a reasonable solution. However, these algorithms are vulnerable to a premature convergence as well as to high risk of stagnation. In this paper, MDE algorithm with vectorized random mutation factor (MDEVM) is proposed, which utilizes the small size population benefit while empowering the exploration ability of mutation factor through randomizing it in the decision variable level. The idea is supported by analyzing mutation factor using Monte-Carlo based simulations. To facilitate the usage of MDE algorithms with very-small population sizes, new mutation schemes for population sizes less than four are also proposed. Furthermore, comprehensive comparative simulations and analysis on performance of the MDE algorithms over various mutation schemes, population sizes, problem types (i.e. uni-modal, multi-modal, and composite), problem dimensionalities, and mutation factor ranges are conducted by considering population diversity analysis for stagnation and trapping in local optimum situations. The studies are conducted on 28 benchmark functions provided for the IEEE CEC-2013 competition. Experimental results demonstrate high performance and convergence speed of the proposed MDEVM algorithm.

Submitted 26 September, 2016; v1 submitted 25 December, 2015; originally announced December 2015.

Comments: Developed version is submitted for review to Applied soft computing

arXiv:1511.02554 [pdf, ps, other]

Deep Recurrent Neural Networks for Sequential Phenotype Prediction in Genomics

Authors: Farhad Pouladi, Hojjat Salehinejad, Amir Mohammad Gilani

Abstract: In analyzing modern biological data, we are often dealing with ill-posed problems and missing data, mostly due to high dimensionality and multicollinearity of the dataset. In this paper, we have proposed a system based on matrix factorization (MF) and deep recurrent neural networks (DRNNs) for genotype imputation and phenotype sequences prediction. In order to model the long-term dependencies of phenotype data, the new Recurrent Linear Units (ReLU) learning strategy is utilized for the first time. The proposed model is implemented for parallel processing on central processing units (CPUs) and graphic processing units (GPUs). Performance of the proposed model is compared with other training algorithms for learning long-term dependencies as well as the sparse partial least square (SPLS) method on a set of genotype and phenotype data with 604 samples, 1980 single-nucleotide polymorphisms (SNPs), and two traits. The results demonstrate performance of the ReLU training algorithm in learning long-term dependencies in RNNs.

Submitted 16 January, 2016; v1 submitted 8 November, 2015; originally announced November 2015.

Comments: The article is accepted at DeSE 2015

arXiv:1504.07329 [pdf]

Combined A*-Ants Algorithm: A New Multi-Parameter Vehicle Navigation Scheme

Authors: Hojjat Salehinejad, Hossein Nezamabadi-pour, Saeid Saryazdi, Fereydoun Farrahi-Moghaddam

Abstract: In this paper a multi-parameter A* (A-star)-ants based algorithm is proposed in order to find the best optimized multi-parameter path between two desired points in regions. This algorithm recognizes paths, according to user desired parameters using electronic maps. The proposed algorithm is a combination of A* and ants algorithm in which the proposed A* algorithm is the prologue to the suggested ant based algorithm. In fact, this A* algorithm invigorates some paths pheromones in ants algorithm. As one of implementations of this method, this algorithm was applied on a part of Kerman city, Iran as a multi-parameter vehicle navigator. It finds the best optimized multi-parameter direction between two desired junctions based on city traveler parameters. Comparison results between the proposed method and ants algorithm demonstrate efficiency and lower cost function results of the proposed method versus ants algorithm.

Submitted 27 April, 2015; originally announced April 2015.

Comments: This paper has been presented at the 16th Iranian Conference on Electrical Engineering in 2008

arXiv:1504.07327 [pdf]

Toward Smart Power Grids: Communication Network Design for Power Grids Synchronization

Authors: Hojjat Salehinejad, Farhad Pouladi, Siamak Talebi

Abstract: In smart power grids, keeping the synchronicity of generators and the corresponding controls is of great importance. To do so, a simple model is employed in terms of swing equation to represent the interactions among dynamics of generators and feedback control. In case of having a communication network available, the control can be done based on the transmitted measurements by the communication network. The stability of system is denoted by the largest eigenvalue of the weighted sum of the Laplacian matrices of the communication infrastructure and power network. In this work, we use graph theory to model the communication network as a graph problem. Then, Ant Colony System (ACS) is employed for optimum design of above graph for synchronization of power grids. Performance evaluation of the proposed method for the 39-bus New England power system versus methods such as exhaustive search and Rayleigh quotient approximation indicates feasibility and effectiveness of our method for even large scale smart power grids.

Submitted 27 April, 2015; originally announced April 2015.

Comments: This paper has been presented at the 27th International Power System Conference in 2012

Showing 1–28 of 28 results for author: Salehinejad, H