Search | arXiv e-print repository

Open-Source Conversational AI with SpeechBrain 1.0

Authors: Mirco Ravanelli, Titouan Parcollet, Adel Moumen, Sylvain de Langen, Cem Subakan, Peter Plantinga, Yingzhi Wang, Pooneh Mousavi, Luca Della Libera, Artem Ploujnikov, Francesco Paissan, Davide Borra, Salah Zaiem, Zeyu Zhao, Shucong Zhang, Georgios Karakasidis, Sung-Lin Yeh, Aku Rouhe, Rudolf Braun, Florian Mai, Juan Zuluaga-Gomez, Seyed Mahed Mousavi, Andreas Nautsch, Xuechen Liu, Sangeet Sagar, et al. (5 additional authors not shown)

Abstract: SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face. SpeechBrain 1.0 introduces new technologies to support diverse learning modalities, Large Language Model (LLM) integration, and advanced decoding strategies, along with novel models, tasks, and modalities. It also includes a new benchmark repository, offering researchers a unified platform for evaluating models across diverse tasks.

Submitted 2 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

Comments: Submitted to JMLR (Machine Learning Open Source Software)

arXiv:2404.12498 [pdf]

A Configurable Pythonic Data Center Model for Sustainable Cooling and ML Integration

Authors: Avisek Naug, Antonio Guillen, Ricardo Luna Gutierrez, Vineet Gundecha, Sahand Ghorbanpour, Sajad Mousavi, Ashwin Ramesh Babu, Soumyendu Sarkar

Abstract: There have been growing discussions on estimating and subsequently reducing the operational carbon footprint of enterprise data centers. The design and intelligent control for data centers have an important impact on data center carbon footprint. In this paper, we showcase PyDCM, a Python library that enables extremely fast prototyping of data center design and applies reinforcement learning-enabled control with the purpose of evaluating key sustainability metrics including carbon footprint, energy consumption, and observing temperature hotspots. We demonstrate these capabilities of PyDCM and compare them to existing works in EnergyPlus for modeling data centers. PyDCM can also be used as a standalone Gymnasium environment for demonstrating sustainability-focused data center control.

Submitted 18 April, 2024; originally announced April 2024.

Comments: NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning https://www.climatechange.ai/papers/neurips2023/15. arXiv admin note: substantial text overlap with arXiv:2310.03906

arXiv:2404.10786 [pdf]

doi 10.1609/aaai.v38i20.30238

Sustainability of Data Center Digital Twins with Reinforcement Learning

Authors: Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Ashwin Ramesh Babu, Sajad Mousavi

Abstract: The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. To address this issue and reduce carbon emissions, intelligent design and control of DC components such as IT servers, cabinets, HVAC cooling, flexible load shifting, and battery energy storage are essential. However, the complexity of designing and controlling them in tandem presents a significant challenge. While some individual components like CFD-based design and Reinforcement Learning (RL) based HVAC control have been researched, there's a gap in the holistic design and optimization covering all elements simultaneously. To tackle this, we've developed DCRL-Green, a multi-agent RL environment that empowers the ML community to design data centers and research, develop, and refine RL controllers for carbon footprint reduction in DCs. It is a flexible, modular, scalable, and configurable platform that can handle large High Performance Computing (HPC) clusters. Furthermore, in its default setup, DCRL-Green provides a benchmark for evaluating single as well as multi-agent RL algorithms. It easily allows users to subclass the default implementations and design their own control approaches, encouraging community development for sustainable data centers. Open Source Link: https://github.com/HewlettPackard/dc-rl

Submitted 16 April, 2024; originally announced April 2024.

Comments: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 20, pp. 22322-22330, Mar. 2024

arXiv:2403.14092 [pdf]

Carbon Footprint Reduction for Sustainable Data Centers in Real-Time

Authors: Soumyendu Sarkar, Avisek Naug, Ricardo Luna, Antonio Guillen, Vineet Gundecha, Sahand Ghorbanpour, Sajad Mousavi, Dejan Markovikj, Ashwin Ramesh Babu

Abstract: As machine learning workloads significantly increase energy consumption, sustainable data centers with low carbon emissions are becoming a top priority for governments and corporations worldwide. This requires a paradigm shift in optimizing power consumption in cooling and IT loads, shifting flexible loads based on the availability of renewable energy in the power grid, and leveraging battery storage from the uninterrupted power supply in data centers, using collaborative agents. The complex association between these optimization strategies and their dependencies on variable external factors like weather and the power grid carbon intensity makes this a hard problem. Currently, a real-time controller to optimize all these goals simultaneously in a dynamic real-world setting is lacking. We propose a Data Center Carbon Footprint Reduction (DC-CFR) multi-agent Reinforcement Learning (MARL) framework that optimizes data centers for the multiple objectives of carbon footprint reduction, energy consumption, and energy cost. The results show that the DC-CFR MARL agents effectively resolved the complex interdependencies in optimizing cooling, load shifting, and energy storage in real-time for various locations under real-world dynamic weather and grid carbon intensity conditions. DC-CFR significantly outperformed the industry standard ASHRAE controller with a considerable reduction in carbon emissions (14.5%), energy usage (14.4%), and energy cost (13.7%) when evaluated over one year across multiple geographical regions.

Submitted 25 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

Journal ref: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

arXiv:2308.15633 [pdf, other]

The Impact of Reference-Command Preview on Human-in-the-Loop Control Behavior

Authors: Pedram Rabiee, S. Alireza Seyyed Mousavi, Amelia J. S. Sheffler, Erik Hellström, Mrdjan Jankovic, Mario A. Santillo, T. M. Seigler, Jesse B. Hoagg

Abstract: This article presents results from an experiment in which 44 human subjects interact with a dynamic system to perform 40 trials of a command-following task. The reference command is unpredictable and different on each trial, but all subjects have the same sequence of reference commands for the 40 trials. The subjects are divided into 4 groups of 11 subjects. One group performs the command-following task without preview of the reference command, and the other 3 groups are given preview of the reference command for different time lengths into the future (0.5 s, 1 s, 1.5 s). A subsystem identification algorithm is used to obtain best-fit models of each subject's control behavior on each trial. The time- and frequency-domain performance, as well as the identified models of the control behavior for the 4 groups are examined to investigate the effects of reference-command preview. The results suggest that preview tends to improve performance by allowing the subjects to compensate for sensory time delay and approximate the inverse dynamics in feedforward. However, too much preview may decrease performance by degrading the ability to use the correct phase lead in feedforward.

Submitted 29 August, 2023; originally announced August 2023.

Comments: Preprint submitted to IEEE Transactions on Cybernetics

arXiv:2306.10071 [pdf, other]

Joint Path planning and Power Allocation of a Cellular-Connected UAV using Apprenticeship Learning via Deep Inverse Reinforcement Learning

Authors: Alireza Shamsoshoara, Fatemeh Lotfi, Sajad Mousavi, Fatemeh Afghah, Ismail Guvenc

Abstract: This paper investigates an interference-aware joint path planning and power allocation mechanism for a cellular-connected unmanned aerial vehicle (UAV) in a sparse suburban environment. The UAV's goal is to fly from an initial point and reach a destination point by moving along the cells to guarantee the required quality of service (QoS). In particular, the UAV aims to maximize its uplink throughput and minimize the level of interference to the ground user equipment (UEs) connected to the neighbor cellular BSs, considering the shortest path and flight resource limitation. Expert knowledge is used to experience the scenario and define the desired behavior for the sake of the agent (i.e., UAV) training. To solve the problem, an apprenticeship learning method is utilized via inverse reinforcement learning (IRL) based on both Q-learning and deep reinforcement learning (DRL). The performance of this method is compared to learning from a demonstration technique called behavioral cloning (BC) using a supervised learning approach. Simulation and numerical results show that the proposed approach can achieve expert-level performance. We also demonstrate that, unlike the BC technique, the performance of our proposed approach does not degrade in unseen situations.

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2306.06340 [pdf, other]

ECGBERT: Understanding Hidden Language of ECGs with Self-Supervised Representation Learning

Authors: Seokmin Choi, Sajad Mousavi, Phillip Si, Haben G. Yhdego, Fatemeh Khadem, Fatemeh Afghah

Abstract: In the medical field, current ECG signal analysis approaches rely on supervised deep neural networks trained for specific tasks that require substantial amounts of labeled data. However, our paper introduces ECGBERT, a self-supervised representation learning approach that unlocks the underlying language of ECGs. By unsupervised pre-training of the model, we mitigate challenges posed by the lack of well-labeled and curated medical data. ECGBERT, inspired by advances in the area of natural language processing and large language models, can be fine-tuned with minimal additional layers for various ECG-based problems. Through four tasks, including Atrial Fibrillation arrhythmia detection, heartbeat classification, sleep apnea detection, and user authentication, we demonstrate ECGBERT's potential to achieve state-of-the-art results on a wide variety of tasks.

Submitted 10 June, 2023; originally announced June 2023.

arXiv:2301.12176 [pdf]

Neural Gas Network Image Features and Segmentation for Brain Tumor Detection Using Magnetic Resonance Imaging Data

Authors: S. Muhammad Hossein Mousavi

Abstract: Accurate detection of brain tumors could save many lives, and increasing the accuracy of this binary classification by even a few percent is highly important. The Neural Gas Network (NGN) is a fast, unsupervised algorithm that can be used in data clustering, image pattern recognition, and image segmentation. In this research, we used the metaheuristic Firefly Algorithm (FA) for image contrast enhancement as pre-processing and NGN weights for feature extraction and segmentation of Magnetic Resonance Imaging (MRI) data on two brain tumor datasets from the Kaggle platform. Also, tumor classification is conducted by Support Vector Machine (SVM) classification algorithms and compared with a deep learning technique plus other features in train and test phases. Additionally, NGN tumor segmentation is evaluated against ground truth data by well-known performance metrics such as Accuracy, F-measure, Jaccard, and more, and compared with traditional segmentation techniques. The proposed method is fast and precise in both tasks of tumor classification and segmentation compared with other methods. A classification accuracy of 95.14% and a segmentation accuracy of 0.977 are achieved by the proposed method.

Submitted 28 January, 2023; originally announced January 2023.

Comments: 7 pages

arXiv:2208.10540 [pdf]

Fast Updating the STBC Decoder Matrices in the Uplink of a Massive MIMO System

Authors: Seyed Hosein Mousavi, Jafar Pourrostam

Abstract: Reducing the computational complexity of modern wireless communication systems such as massive Multiple-Input Multiple-Output (MIMO) configurations is of utmost interest. In this paper, we propose a new algorithm that can be used to accelerate matrix inversion in the decoding of space-time block codes (STBC) in the uplink of dynamic massive MIMO systems. A multi-user system in which the base station is equipped with a large number of antennas and each user has two antennas is considered. In addition, users can enter or exit the system dynamically. For a given space-time block coding/decoding scheme, the computational complexity of the receiver will be significantly reduced when a user is added to or removed from the system by employing the proposed method. In the proposed scheme, the matrix inversion for zero-forcing (ZF) as well as minimum mean square error (MMSE) decoding is derived from the inverse of a partitioned matrix and the Woodbury matrix identity. Furthermore, the suggested technique can be utilized when the number of users is fixed but the channel estimate changes for a particular user. The mathematical equations for updating the inverse of the decoding matrices are derived, and their complexity is compared to that of computing the inverse directly. Evaluations confirm the effectiveness of the proposed approach.

Submitted 22 August, 2022; originally announced August 2022.

Comments: 5 pages, 1 figure

arXiv:2108.12171 [pdf, other]

Modal Strong Structural Controllability for Networks with Dynamical Nodes

Authors: Shima Sadat Mousavi, Anastasios Kouvelas, Karl H. Johansson

Abstract: In this article, a new notion of modal strong structural controllability is introduced and examined for a family of LTI networks. These networks include structured LTI subsystems, whose system matrices have the same zero/nonzero/arbitrary pattern. An eigenvalue associated with a system matrix is controllable if it can be directly influenced by the control inputs. We consider an arbitrary set Δ, and we refer to a network as modal strongly structurally controllable with respect to Δ if, for all systems in a specific family of LTI networks, every λ ∈ Δ is a controllable eigenvalue. For this family of LTI networks, not only is the zero/nonzero/arbitrary pattern of system matrices available, but also for a given Δ, there might be extra information about the intersection of the spectrum associated with some subsystems and Δ. Given a set Δ, we first define a Δ-network graph, and by introducing a coloring process of this graph, we establish a correspondence between the set of control subsystems and the so-called zero forcing sets. We also demonstrate how with Δ = {0} or Δ = C \ {0}, existing results on strong structural controllability can be derived through our approach. Compared to relevant literature, a more restricted family of LTI networks is considered in this work, and then, the derived condition is less conservative.

Submitted 27 August, 2021; originally announced August 2021.

arXiv:2107.13216 [pdf, ps, other]

Synthesis of Output-Feedback Controllers for Mixed Traffic Systems in Presence of Disturbances and Uncertainties

Authors: Shima Sadat Mousavi, Somayeh Bahrami, Anastasios Kouvelas

Abstract: In this paper, we study mixed traffic systems that move along a single-lane ring-road or open-road. The traffic flow forms a platoon, which includes a number of heterogeneous human-driven vehicles (HDVs) together with only one connected and automated vehicle (CAV) that receives information from several neighbors. The dynamics of HDVs are assumed to follow the optimal velocity model (OVM), and the acceleration of the single CAV is directly controlled by a dynamical output-feedback controller. The ultimate goal of this work is to present a robust control strategy that can smoothen the traffic flow in the presence of undesired disturbances (e.g. abrupt deceleration) and parametric uncertainties. A prerequisite for synthesizing a dynamical output controller is the stabilizability and detectability of the underlying system. Accordingly, a theoretical analysis is presented first to prove the stabilizability and detectability of the mixed traffic flow system. Then, two H-infinity control strategies, with and without considering uncertainties in the system dynamics, are designed. The efficiency of the two control methods is subsequently illustrated through numerical simulations, and various experimental results are presented to demonstrate the effectiveness of the proposed controller to mitigate disturbance amplification and achieve platoon stability.

Submitted 28 July, 2021; originally announced July 2021.

arXiv:2011.00121 [pdf, other]

An Uncertainty Estimation Framework for Risk Assessment in Deep Learning-based Atrial Fibrillation Classification

Authors: James Belen, Sajad Mousavi, Alireza Shamsoshoara, Fatemeh Afghah

Abstract: Atrial Fibrillation (AF) is one of the most common types of heart arrhythmia, afflicting more than 3 million people in the U.S. alone. AF is estimated to be the cause of death of 1 in 4 individuals. Recent advancements in Artificial Intelligence (AI) algorithms have led to the capability of reliably detecting AF from ECG signals. While these algorithms can accurately detect AF with high precision, the discrete and deterministic classifications mean that these networks are likely to erroneously classify the given ECG signal. This paper proposes a variational autoencoder classifier network that provides an uncertainty estimation of the network's output in addition to reliable classification accuracy. This framework can increase physicians' trust in using AI-based AF detection algorithms by providing them with a confidence score which reflects how uncertain the algorithm is about a case and recommending that they pay more attention to the cases with a lower confidence score. The uncertainty is estimated by conducting multiple passes of the input through the network to build a distribution; the mean of the standard deviations is reported as the network's uncertainty. Our proposed network obtains 97.64% accuracy in addition to reporting the uncertainty.

Submitted 30 October, 2020; originally announced November 2020.

arXiv:2008.10099 [pdf]

ECG-Based Blood Pressure Estimation Using Mechano-Electric Coupling Concept

Authors: Seyedeh Somayyeh Mousavi, Mostafa Charmi, Mohammad Firouzmand, Mohammad Hemmati, Maryam Moghadam, Yadollah Ghorbani

Abstract: The Electrocardiograph signal represents the heart's electrical activity while blood pressure results from the heart's mechanical activity. Previous studies have investigated how the heart's electrical and mechanical activities are related and have referred to their relationship as the Mechano-Electric Coupling term. A new method to estimate blood pressure is proposed which uses only the Electrocardiograph signal. In contrast to studies that perform feature extraction based on the signals' physiological parameters (Parameter-based), in this work the feature vectors are formed with samples of the Electrocardiograph signal in a particular time frame (Whole-based), and these vectors are input into Adaptive Boosting Regression to estimate blood pressure. The results of this study indicate a nonlinear relationship between blood pressure and the Electrocardiograph signal. According to the results, the used algorithms, for estimating both diastolic blood pressure and mean arterial pressure, are in compliance with the standards of the Association for the Advancement of Medical Instrumentation. Also, according to the British Hypertension Society standard, estimating diastolic blood pressure and mean arterial pressure with the proposed method attains an A grade, while it achieves B for systolic blood pressure. The results indicate that using the introduced method, blood pressure can be estimated continuously, noninvasively, without cuff, calibration-free and by using only the Electrocardiograph signal.

Submitted 23 August, 2020; originally announced August 2020.

Comments: 8 pages, 7 figures

arXiv:2006.08841 [pdf, other]

ECG Language Processing (ELP): a New Technique to Analyze ECG Signals

Authors: Sajad Mousavi, Fatemeh Afghah, Fatemeh Khadem, U. Rajendra Acharya

Abstract: A language is constructed of a finite/infinite set of sentences composed of words. Similar to natural languages, the Electrocardiogram (ECG) signal, the most common noninvasive tool to study the functionality of the heart and diagnose several abnormal arrhythmias, is made up of sequences of three or four distinct waves including the P-wave, QRS complex, T-wave and U-wave. An ECG signal may contain several different varieties of each wave (e.g., the QRS complex can have various appearances). For this reason, the ECG signal is a sequence of heartbeats (similar to sentences in natural languages) and each heartbeat is composed of a set of waves (similar to words in a sentence) of different morphologies. Analogous to natural language processing (NLP), which is used to help computers understand and interpret human natural language, it is possible to develop methods inspired by NLP to help computers gain a deeper understanding of Electrocardiogram signals. In this work, our goal is to propose a novel ECG analysis technique, ECG language processing (ELP), focusing on empowering computers to understand ECG signals in the way physicians do. We evaluated the proposed method on two tasks including the classification of heartbeats and the detection of atrial fibrillation in the ECG signals. Experimental results on three databases (i.e., PhysioNet's MIT-BIH, MIT-BIH AFIB and PhysioNet Challenge 2017 AFIB Dataset databases) reveal that the proposed method is a general idea that can be applied to a variety of biomedical applications and is able to achieve remarkable performance.

Submitted 12 June, 2020; originally announced June 2020.

arXiv:2005.03059 [pdf]

CovidCTNet: An Open-Source Deep Learning Approach to Identify Covid-19 Using CT Image

Authors: Tahereh Javaheri, Morteza Homayounfar, Zohreh Amoozgar, Reza Reiazi, Fatemeh Homayounieh, Engy Abbas, Azadeh Laali, Amir Reza Radmard, Mohammad Hadi Gharib, Seyed Ali Javad Mousavi, Omid Ghaemi, Rosa Babaei, Hadi Karimi Mobin, Mehdi Hosseinzadeh, Rana Jahanban-Esfahlan, Khaled Seidi, Mannudeep K. Kalra, Guanglan Zhang, L. T. Chitkushev, Benjamin Haibe-Kains, Reza Malekzadeh, Reza Rawassizadeh

Abstract: Coronavirus disease 2019 (Covid-19) is highly contagious with limited treatment options. Early and accurate diagnosis of Covid-19 is crucial in reducing the spread of the disease and its accompanied mortality. Currently, detection by reverse transcriptase polymerase chain reaction (RT-PCR) is the gold standard of outpatient and inpatient detection of Covid-19. RT-PCR is a rapid method; however, its accuracy in detection is only ~70-75%. Another approved strategy is computed tomography (CT) imaging. CT imaging has a much higher sensitivity of ~80-98%, but similar accuracy of 70%. To enhance the accuracy of CT imaging detection, we developed an open-source set of algorithms called CovidCTNet that successfully differentiates Covid-19 from community-acquired pneumonia (CAP) and other lung diseases. CovidCTNet increases the accuracy of CT imaging detection to 90% compared to radiologists (70%). The model is designed to work with heterogeneous and small sample sizes independent of the CT imaging hardware. In order to facilitate the detection of Covid-19 globally and assist radiologists and physicians in the screening process, we are releasing all algorithms and parametric details in an open-source format. Open-source sharing of our CovidCTNet enables developers to rapidly improve and optimize services, while preserving user privacy and data ownership.

Submitted 15 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 5 figures

arXiv:2004.00405 [pdf, other]

doi 10.1109/ICEE50131.2020.9260979

A Low Complexity Space-Time Block Codes Detection for Cell-Free Massive MIMO Systems

Authors: A. Mazhari Saray, J. Pourrostam, S. H. Mousavi, M. Mohassel Feghhi

Abstract: The new generation of telecommunication systems must provide acceptable data rates and spectral efficiency for new applications. Recently, massive MIMO has been introduced as a key technique for the new generation of telecommunication systems. A cell-free massive MIMO system is not segmented into cells. The BS antennas are distributed throughout the environment and each user is served by all BSs simultaneously. In this paper, the performance of a multiuser cell-free massive MIMO system employing space-time block codes in the uplink, with linear decoders, is studied. An inverse matrix approximation using the Neumann series is proposed to reduce the computational and hardware complexity of the decoding in the receiver. For this purpose, each user has two antennas, and to improve the diversity gain performance, space-time block codes are used in the uplink. Then, the Neumann series is used to approximate the inverse matrix in ZF and MMSE decoders, and its performance is evaluated in terms of BER and spectral efficiency. In addition, we derive a lower bound for the throughput of the ZF decoder. The simulation results show that the performance of the system, in terms of BER and spectral efficiency, is better than with single-antenna users in the same system. Also, the BER performance in a given system with the proposed method will be close to that of the exact method.

Submitted 1 April, 2020; originally announced April 2020.

Comments: 5 pages, 4 figures, Accepted for ICEE2020
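
Note: the Neumann-series inverse approximation mentioned in the abstract above can be illustrated with a minimal sketch. The diagonal splitting used here (A = D(I - Theta), with D the diagonal of A) is a common choice and an assumption on our part; the abstract does not state which splitting, number of terms, or matrix sizes the authors use. The function names are hypothetical, and real-valued matrices are used for brevity.

```python
# Illustrative sketch only (not the paper's implementation).
# Approximates A^{-1} by a truncated Neumann series:
#   A^{-1} ~= sum_{k=0}^{K-1} Theta^k D^{-1},  Theta = I - D^{-1} A,
# where D is the diagonal of A (an assumed splitting). The series converges
# when the spectral radius of Theta is below 1, e.g. for strictly
# diagonally dominant matrices.

def mat_mul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def mat_add(x, y):
    """Add two matrices of equal shape."""
    return [[a + b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(x, y)]


def identity(n):
    """Return the n-by-n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_inverse(a, terms):
    """Approximate the inverse of square matrix `a` with `terms` series terms."""
    n = len(a)
    # D^{-1}: inverse of the diagonal part of A.
    d_inv = [[1.0 / a[i][i] if i == j else 0.0 for j in range(n)]
             for i in range(n)]
    # Theta = I - D^{-1} A
    d_inv_a = mat_mul(d_inv, a)
    eye = identity(n)
    theta = [[eye[i][j] - d_inv_a[i][j] for j in range(n)] for i in range(n)]
    power = identity(n)   # Theta^0
    total = identity(n)   # running sum of Theta^k
    for _ in range(terms - 1):
        power = mat_mul(power, theta)
        total = mat_add(total, power)
    return mat_mul(total, d_inv)


if __name__ == "__main__":
    a = [[4.0, 1.0], [1.0, 3.0]]      # small diagonally dominant example
    print(neumann_inverse(a, 3))      # coarse approximation
    print(neumann_inverse(a, 30))     # close to the exact inverse
```

With a single term the approximation reduces to D^{-1}, which avoids any full matrix inversion; adding terms trades extra matrix products for accuracy.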

arXiv:2002.05262 [pdf, other]

HAN-ECG: An Interpretable Atrial Fibrillation Detection Model Using Hierarchical Attention Networks

Authors: Sajad Mousavi, Fatemeh Afghah, U. Rajendra Acharya

Abstract: Atrial fibrillation (AF) is one of the most prevalent cardiac arrhythmias; it affects the lives of more than 3 million people in the U.S. and over 33 million people around the world and is associated with a five-fold increased risk of stroke and mortality. As with other problems in the healthcare domain, artificial intelligence (AI)-based algorithms have been used to reliably detect AF from patients' physiological signals. Cardiologist-level performance in detecting this arrhythmia is often achieved by deep learning-based methods; however, they suffer from a lack of interpretability. In other words, these approaches are unable to explain the reasons behind their decisions. The lack of interpretability is a common challenge to the wide application of machine learning-based approaches in healthcare, and it limits the trust of clinicians in such methods. To address this challenge, we propose HAN-ECG, an interpretable bidirectional-recurrent-neural-network-based approach for the AF detection task. HAN-ECG employs three attention mechanism levels to provide a multi-resolution analysis of the patterns in ECG leading to AF. The first level, the wave level, computes the wave weights; the second level, the heartbeat level, calculates the heartbeat weights; and the third level, the window (i.e., multiple heartbeats) level, produces the window weights in triggering a class of interest. The patterns detected by this hierarchical attention model facilitate the interpretation of the neural network decision process by identifying the patterns in the signal that contributed the most to the final prediction. Experimental results on two AF databases demonstrate that our proposed model performs significantly better than the existing algorithms. Visualization of these attention layers illustrates that our model decides upon the important waves and heartbeats that are clinically meaningful in the detection task.

Submitted 12 February, 2020; originally announced February 2020.

arXiv:1912.01144 [pdf, other]

doi 10.1109/TGRS.2020.2988770

Bayesian-Deep-Learning Estimation of Earthquake Location from Single-Station Observations

Authors: S. Mostafa Mousavi, Gregory C. Beroza

Abstract: We present a deep learning method for single-station earthquake location, which we approach as a regression problem using two separate Bayesian neural networks. We use a multi-task temporal-convolutional neural network to learn epicentral distance and P travel time from 1-minute seismograms. The network estimates epicentral distance and P travel time with absolute mean errors of 0.23 km and 0.03 s respectively, along with their epistemic and aleatory uncertainties. We design a separate multi-input network using standard convolutional layers to estimate the back-azimuth angle and its epistemic uncertainty. This network estimates the direction from which seismic waves arrive at the station with a mean error of 1 degree. Using this information, we estimate the epicenter, origin time, and depth along with their confidence intervals. We use a global dataset of earthquake signals recorded within 1 degree (~112 km) from the event to build the model and to demonstrate its performance. Our model can predict epicenter, origin time, and depth with mean errors of 7.3 km, 0.4 s, and 6.7 km respectively, at different locations around the world. Our approach can be used for fast earthquake source characterization with a limited number of observations, and also for estimating the location of earthquakes that are sparsely recorded, either because they are small or because stations are widely separated.

Submitted 2 December, 2019; originally announced December 2019.

arXiv:1911.11343 [pdf, other]

An Autonomous Spectrum Management Scheme for Unmanned Aerial Vehicle Networks in Disaster Relief Operations

Authors: Alireza Shamsoshoara, Fatemeh Afghah, Abolfazl Razi, Sajad Mousavi, Jonathan Ashdown, Kurt Turk

Abstract: This paper studies the problem of spectrum shortage in an unmanned aerial vehicle (UAV) network during critical missions such as wildfire monitoring, search and rescue, and disaster monitoring. Such applications involve a high demand for high-throughput data transmissions such as real-time video, image, and voice streaming, where the spectrum assigned to the UAV network may not be adequate to provide the desired Quality of Service (QoS). In these scenarios, the aerial network can borrow additional spectrum from the available terrestrial networks in exchange for providing a relaying service to them. We propose a spectrum sharing model in which the UAVs are grouped into two classes: relaying UAVs that serve the spectrum owner, and sensing UAVs that perform the disaster relief mission using the obtained spectrum. The operation of the UAV network is managed by a hierarchical mechanism in which a central controller assigns the tasks of the UAVs based on their resources and determines their operation region based on the priority level of the impacted areas. The UAVs then autonomously fine-tune their positions using a model-free reinforcement learning algorithm to maximize their individual throughput and prolong their lifetime. We analyze the performance and the convergence of the proposed method analytically and with extensive simulations in different scenarios.

Submitted 26 November, 2019; originally announced November 2019.

Comments: 14 pages, 14 figures, 1 table

arXiv:1911.05975 [pdf, other]

doi 10.1029/2019GL085976

A Machine-Learning Approach for Earthquake Magnitude Estimation

Authors: S. Mostafa Mousavi, Gregory C. Beroza

Abstract: In this study we develop a single-station deep-learning approach for fast and reliable estimation of earthquake magnitude directly from raw waveforms. We design a regressor composed of convolutional and recurrent neural networks that is not sensitive to data normalization; hence, waveform amplitude information can be utilized during training. Our network can predict earthquake magnitudes with an average error close to zero and a standard deviation of ~0.2 based on single-station waveforms without instrument response correction. We test the network for both local and duration magnitude scales and show that station-based learning can be an effective approach for improving performance. The proposed approach has a variety of potential applications, from routine earthquake monitoring to early warning systems.

Submitted 14 November, 2019; originally announced November 2019.

arXiv:1909.11791 [pdf, other]

doi 10.1371/journal.pone.0226990

Single-modal and Multi-modal False Arrhythmia Alarm Reduction using Attention-based Convolutional and Recurrent Neural Networks

Authors: Sajad Mousavi, Atiyeh Fotoohinasab, Fatemeh Afghah

Abstract: This study proposes a deep learning model that effectively suppresses false alarms in intensive care units (ICUs) without ignoring true alarms, using single- and multimodal biosignals. Most current methods in the literature are either rule-based, requiring prior knowledge of arrhythmia analysis to build rules, or classical machine learning approaches that depend on hand-engineered features. In this work, we apply convolutional neural networks to automatically extract time-invariant features, an attention mechanism to put more emphasis on the important regions of the segmented input signal(s) that are more likely to contribute to an alarm, and long short-term memory units to capture the temporal information present in the signal segments. We trained our method efficiently using a two-step training algorithm (i.e., pre-training and fine-tuning the proposed network) on the dataset provided by the PhysioNet Computing in Cardiology Challenge 2015. The evaluation results demonstrate that the proposed method obtains better results than other existing algorithms for the false alarm reduction task in ICUs. The proposed method achieves a sensitivity of 93.88% and a specificity of 92.05% for alarm classification, considering three different signals. In addition, our experiments on 5 separate alarm types yield significant results when only a single-lead ECG is considered (e.g., a sensitivity of 90.71%, a specificity of 88.30%, and an AUC of 89.51 for the Ventricular Tachycardia alarm type).

Submitted 25 September, 2019; originally announced September 2019.

arXiv:1903.02108 [pdf, other]

doi 10.1371/journal.pone.0216456

SleepEEGNet: Automated Sleep Stage Scoring with Sequence to Sequence Deep Learning Approach

Authors: Sajad Mousavi, Fatemeh Afghah, U. Rajendra Acharya

Abstract: Electroencephalogram (EEG) is a common base signal used to monitor brain activity and diagnose sleep disorders. Manual sleep stage scoring is a time-consuming task for sleep experts and is limited by inter-rater reliability. In this paper, we propose an automatic sleep stage annotation method called SleepEEGNet using a single-channel EEG signal. SleepEEGNet is composed of deep convolutional neural networks (CNNs) to extract time-invariant features and frequency information, and a sequence-to-sequence model to capture the complex and long short-term context dependencies between sleep epochs and scores. In addition, to reduce the effect of the class imbalance problem present in the available sleep datasets, we applied novel loss functions to obtain an equal misclassification error for each sleep stage while training the network. We evaluated the proposed method on different single EEG channels (i.e., the Fpz-Cz and Pz-Oz EEG channels) from the Physionet Sleep-EDF datasets published in 2013 and 2018. The evaluation results demonstrate that the proposed method achieved the best annotation performance compared to the current literature, with an overall accuracy of 84.26%, a macro F1-score of 79.66%, and a Cohen's Kappa coefficient of 0.79. Our developed model is ready to be tested with more sleep EEG signals and to aid sleep specialists in arriving at an accurate diagnosis. The source code is available at https://github.com/SajadMo/SleepEEGNet.

Submitted 5 March, 2019; originally announced March 2019.

arXiv:1812.07422 [pdf, other]

ECGNET: Learning where to attend for detection of atrial fibrillation with deep visual attention

Authors: Sajad Mousavi, Fatemeh Afghah, Abolfazl Razi, U. Rajendra Acharya

Abstract: The complexity of the patterns associated with Atrial Fibrillation (AF) and the high level of noise affecting these patterns have significantly limited the ability of current signal processing and shallow machine learning approaches to obtain accurate AF detection results. Deep neural networks have been shown to be very powerful at learning the non-linear patterns in the data. While deep learning approaches attempt to learn complex patterns related to the presence of AF in the ECG, they can benefit from knowing which parts of the signal are more important to focus on during learning. In this paper, we introduce a two-channel deep neural network to more accurately detect AF present in the ECG signal. The first channel takes in a preprocessed ECG signal and automatically learns where to attend for detection of AF. The second channel simultaneously takes in the preprocessed ECG signal to consider all features of the entire signal. The model shows via visualization which parts of the given ECG signal are important to attend to while trying to detect atrial fibrillation. In addition, this combination significantly improves the performance of atrial fibrillation detection (a sensitivity of 99.53%, a specificity of 99.26%, and an accuracy of 99.40% on the MIT-BIH atrial fibrillation database with 5-s ECG segments).

Submitted 14 February, 2019; v1 submitted 8 December, 2018; originally announced December 2018.

arXiv:1812.07421 [pdf, other]

Inter- and intra- patient ECG heartbeat classification for arrhythmia detection: a sequence to sequence deep learning approach

Authors: Sajad Mousavi, Fatemeh Afghah, U. Rajendra Acharya

Abstract: The electrocardiogram (ECG) signal is a common and powerful tool to study heart function and diagnose several abnormal arrhythmias. While there have been remarkable improvements in cardiac arrhythmia classification methods, they still cannot offer acceptable performance in detecting different heart conditions, especially when dealing with imbalanced datasets. In this paper, we propose a solution to address this limitation of current classification approaches by developing an automatic heartbeat classification method using deep convolutional neural networks and sequence-to-sequence models. We evaluated the proposed method on the MIT-BIH arrhythmia database, considering the intra-patient and inter-patient paradigms and the AAMI EC57 standard. The evaluation results for both paradigms show that our method achieves the best performance in the literature (a positive predictive value of 96.46% and sensitivity of 100% for the category S, and a positive predictive value of 98.68% and sensitivity of 97.40% for the category F for the intra-patient scheme; a positive predictive value of 92.57% and sensitivity of 88.94% for the category S, and a positive predictive value of 99.50% and sensitivity of 99.94% for the category V for the inter-patient scheme). The source code is available at https://github.com/SajadMo/ECG-Heartbeat-Classification-seq2seq-model.

Submitted 12 March, 2019; v1 submitted 8 December, 2018; originally announced December 2018.

arXiv:1811.02695 [pdf, other]

doi 10.1109/TGRS.2019.2926772

Seismic Signal Denoising and Decomposition Using Deep Neural Networks

Authors: Weiqiang Zhu, S. Mostafa Mousavi, Gregory C. Beroza

Abstract: Denoising and filtering are widely used in routine seismic data processing to improve the signal-to-noise ratio (SNR) of recorded signals and, by doing so, to improve subsequent analyses. In this paper we develop a new denoising/decomposition method, DeepDenoiser, based on a deep neural network. This network is able to learn simultaneously a sparse representation of data in the time-frequency domain and a non-linear function that maps this representation into masks that decompose input data into a signal of interest and noise (defined as any non-seismic signal). We show that DeepDenoiser achieves impressive denoising of seismic signals even when the signal and noise share a common frequency band. Our method properly handles a variety of colored noise and non-earthquake signals. DeepDenoiser can significantly improve the SNR with minimal changes in the waveform shape of interest, even in the presence of high noise levels. We demonstrate the effect of our method on improving earthquake detection. There are clear applications of DeepDenoiser to seismic imaging, micro-seismic monitoring, and preprocessing of ambient noise data. We also note that potential applications of our approach are not limited to these applications or even to earthquake data, and that our approach can be adapted to diverse signals and applications in other settings.

Submitted 6 November, 2018; originally announced November 2018.

arXiv:1810.02898 [pdf, ps, other]

Stability analysis of networked control systems with not necessarily UGES protocols

Authors: Seyed Hossein Mousavi, Navid Noroozi, Anton H. J. de Ruiter, Roman Geiselhart

Abstract: This note studies (practical) asymptotic stability of nonlinear networked control systems whose protocols are not necessarily uniformly globally exponentially stable. In particular, we propose a Lyapunov-based approach to establish (practical) asymptotic stability of the networked control systems. Considering the so-called modified Round Robin and Try-Once-Discard protocols, which are only uniformly globally asymptotically stable, we explicitly construct Lyapunov functions for these two protocols that fit our proposed setting. In order to optimize the usage of communication resources, we exploit the following transmission policy: wait for a certain minimum amount of time after the last sampling instant and then check a state-dependent criterion. When the latter condition is violated, a transmission occurs. In that way, the existence of a minimum amount of time between two consecutive transmissions is established and the so-called Zeno phenomenon is therefore avoided. Finally, illustrative examples are given to verify the effectiveness of our results.

Submitted 9 October, 2018; v1 submitted 5 October, 2018; originally announced October 2018.

arXiv:1804.11196 [pdf, other]

A Feature Selection Method Based on Shapley Value to False Alarm Reduction in ICUs, A Genetic-Algorithm Approach

Authors: Mohammad Zaeri-Amirani, Fatemeh Afghah, Sajad Mousavi

Abstract: The high false alarm rate in intensive care units (ICUs) has been identified as one of the most critical medical challenges in recent years. It often overwhelms the clinical staff with numerous false or non-urgent alarms and decreases the quality of care by increasing the probability of missing true alarms, as well as causing delirium, stress, sleep deprivation, and depressed immune systems in patients. One major cause of false alarms in clinical practice is that the signals collected from different devices are processed individually to trigger an alarm, while there is a considerable chance that the signal collected from one device is corrupted by noise or motion artifacts. In this paper, we propose a low-computational-complexity yet accurate game-theoretic feature selection method based on a genetic algorithm that identifies the most informative biomarkers across the signals collected from various monitoring devices and can considerably reduce the rate of false alarms.

Submitted 25 April, 2018; originally announced April 2018.

Showing 1–27 of 27 results for author: Mousavi, S