Search | arXiv e-print repository

arXiv:2402.01931 [pdf, other]

Digits micro-model for accurate and secure transactions

Authors: Chirag Chhablani, Nikhita Sharma, Jordan Hosier, Vijay K. Gurbani

Abstract: Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance the caller experience by enabling natural language understanding and facilitating efficient and intuitive interactions. Increasing use of ASR systems requires that such systems exhibit very low error rates. The predominant ASR models to collect numeric data are large, general-purpose commercial models -- Google Speech-to-text (STT), or Amazon Transcribe -- or open source (OpenAI's Whisper). Such ASR models are trained on hundreds of thousands of hours of audio data and require considerable resources to run. Despite recent progress in large speech recognition models, we highlight the potential of smaller, specialized "micro" models. Such light models can be trained to perform well on number recognition specific tasks, competing with general models like Whisper or Google STT while using less than 80 minutes of training time and occupying at least an order of magnitude less memory resources. Also, unlike larger speech recognition models, micro-models are trained on carefully selected and curated datasets, which makes them highly accurate, agile, and easy to retrain, while using low compute resources. We present our work on creating micro models for multi-digit number recognition that handle diverse speaking styles reflecting real-world pronunciation patterns. Our work contributes to domain-specific ASR models, improving digit recognition accuracy, and privacy of data. As an added advantage, their low resource consumption allows them to be hosted on-premise, keeping private data local instead of uploading to an external cloud. Our results indicate that our micro-model makes fewer errors than the best-of-breed commercial or open-source ASRs in recognizing digits (1.8% error rate of our best micro-model versus 5.8% error rate of Whisper), and has a low memory footprint (0.66 GB VRAM for our model versus 11 GB VRAM for Whisper).

Submitted 2 February, 2024; originally announced February 2024.

Comments: 7 pages, 1 figure, 5 tables

arXiv:2305.12741 [pdf, other]

Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection

Authors: Debarpan Bhattacharya, Neeraj Kumar Sharma, Debottam Dutta, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan

Abstract: This paper presents the Coswara dataset, a dataset containing a diverse set of respiratory sounds and rich metadata, recorded between April 2020 and February 2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds contained nine sound categories associated with variants of breathing, cough and speech. The rich metadata contained demographic information associated with age, gender and geographic location, as well as the health information relating to the symptoms, pre-existing respiratory ailments, comorbidity and SARS-CoV-2 test status. Our study is the first of its kind to manually annotate the audio quality of the entire dataset (amounting to 65 hours) through manual listening. The paper summarizes the data collection procedure, demographic, symptoms and audio data information. A COVID-19 classifier based on a bi-directional long short-term memory (BLSTM) architecture is trained and evaluated on the different population sub-groups contained in the dataset to understand the bias/fairness of the model. This enabled the analysis of the impact of gender, geographic location, date of recording, and language proficiency on the COVID-19 detection performance.

Submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted for publication in Nature Scientific Data

arXiv:2206.12309 [pdf, other]

Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals

Authors: Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan

Abstract: The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported differential impact of the variants on respiratory health of patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns suggesting a possibility to predict the underlying virus variant. We analyze the Coswara dataset which is collected from three subject pools, namely, i) healthy, ii) COVID-19 subjects recorded during the delta variant dominant period, and iii) data from COVID-19 subjects recorded during the omicron surge. Our findings suggest that multiple sound categories, such as cough, breathing, and speech, indicate significant acoustic feature differences when comparing COVID-19 subjects with omicron and delta variants. The classification areas-under-the-curve are significantly above chance for differentiating subjects infected by omicron from those infected by delta. Using a score fusion from multiple sound categories, we obtained an area-under-the-curve of 89% and 52.4% sensitivity at 95% specificity. Additionally, a hierarchical three-class approach was used to classify the acoustic data into healthy and COVID-19 positive, and further COVID-19 subjects into delta and omicron variants, providing a high level of 3-class classification accuracy. These results suggest new ways for designing sound based COVID-19 diagnosis approaches.

Submitted 24 June, 2022; originally announced June 2022.

Journal ref: Interspeech, 2022

arXiv:2206.05053 [pdf, other]

Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms

Authors: Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan

Abstract: The COVID-19 pandemic has accelerated research on design of alternative, quick and effective COVID-19 diagnosis approaches. In this paper, we describe the Coswara tool, a website application designed to enable COVID-19 detection by analysing respiratory sound samples and health symptoms. A user using this service can log into a website using any device connected to the internet, provide their current health symptom information and record a few sound samples corresponding to breathing, cough, and speech. Within a minute of analysis of this information on a cloud server the website tool will output a COVID-19 probability score to the user. As the COVID-19 pandemic continues to demand massive and scalable population level testing, we hypothesize that the proposed tool provides a potential solution towards this.

Submitted 9 June, 2022; originally announced June 2022.

Journal ref: Interspeech, 2022

arXiv:2110.01177 [pdf, other]

The Second DiCOVA Challenge: Dataset and performance analysis for COVID-19 diagnosis using acoustics

Authors: Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Debarpan Bhattacharya, Debottam Dutta, Pravin Mote, Sriram Ganapathy

Abstract: The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating the research in acoustics based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough and speech signals. This data was collected from individuals with and without COVID-19 infection, and the task in the challenge was a two-class classification. The development set audio recordings were collected from 965 (172 COVID-19 positive) individuals, while the evaluation set contained data from 471 individuals (71 COVID-19 positive). The challenge featured four tracks, one associated with each sound category of cough, speech and breathing, and a fourth fusion track. A baseline system was also released to benchmark the participants. In this paper, we present an overview of the challenge, the rationale for the data collection and the baseline system. Further, a performance analysis for the systems submitted by the 16 participating teams in the leaderboard is also presented.

Submitted 11 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

arXiv:2106.10997 [pdf, other]

Towards sound based testing of COVID-19 -- Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge

Authors: Neeraj Kumar Sharma, Ananya Muguli, Prashant Krishnan, Rohit Kumar, Srikanth Raj Chetupalli, Sriram Ganapathy

Abstract: The technology development for point-of-care tests (POCTs) targeting respiratory diseases has witnessed a growing demand in the recent past. Investigating the presence of acoustic biomarkers in modalities such as cough, breathing and speech sounds, and using them for building POCTs can offer fast, contactless and inexpensive testing. In view of this, over the past year, we launched the "Coswara" project to collect cough, breathing and speech sound recordings via worldwide crowdsourcing. With this data, a call for development of diagnostic tools was announced in the Interspeech 2021 as a special session titled "Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge". The goal was to bring together researchers and practitioners interested in developing acoustics-based COVID-19 POCTs by enabling them to work on the same set of development and test datasets. As part of the challenge, datasets with breathing, cough, and speech sound samples from COVID-19 and non-COVID-19 individuals were released to the participants. The challenge consisted of two tracks. The Track-1 focused only on cough sounds, and participants competed in a leaderboard setting. In Track-2, breathing and speech samples were provided for the participants, without a competitive leaderboard. The challenge attracted 85 plus registrations with 29 final submissions for Track-1. This paper describes the challenge (datasets, tasks, baseline system), and presents a focused summary of the various systems submitted by the participating teams. An analysis of the results from the top four teams showed that a fusion of the scores from these teams yields an area-under-the-curve of 95.1% on the blind test data. By summarizing the lessons learned, we foresee the challenge overview in this paper to help accelerate technology for acoustic-based POCTs.

Submitted 21 June, 2021; originally announced June 2021.

Comments: Manuscript in review in the Elsevier Computer Speech and Language journal

arXiv:2106.00639 [pdf, other]

Multi-modal Point-of-Care Diagnostics for COVID-19 Based On Acoustics and Symptoms

Authors: Srikanth Raj Chetupalli, Prashant Krishnan, Neeraj Sharma, Ananya Muguli, Rohit Kumar, Viral Nanda, Lancelot Mark Pinto, Prasanta Kumar Ghosh, Sriram Ganapathy

Abstract: The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of COVID-19 pandemic. In this paper, we design an approach to COVID-19 diagnostic using crowd-sourced multi-modal data. The data resource, consisting of acoustic signals like cough, breathing, and speech signals, along with the data of symptoms, are recorded using a web-application over a period of ten months. We investigate the use of statistical descriptors of simple time-frequency features for acoustic signals and binary features for the presence of symptoms. Unlike previous works, we primarily focus on the application of simple linear classifiers like logistic regression and support vector machines for acoustic data while decision tree models are employed on the symptoms data. We show that a multi-modal integration of acoustics and symptoms classifiers achieves an area-under-curve (AUC) of 92.40, a significant improvement over any individual modality. Several ablation experiments are also provided which highlight the acoustic and symptom dimensions that are important for the task of COVID-19 diagnostics.

Submitted 5 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: The Manuscript is submitted to IEEE-EMBS Journal of Biomedical and Health Informatics on June 1, 2021

arXiv:2104.12862 [pdf]

A digital score of tumour-associated stroma infiltrating lymphocytes predicts survival in head and neck squamous cell carcinoma

Authors: Muhammad Shaban, Shan E Ahmed Raza, Mariam Hassan, Arif Jamshed, Sajid Mushtaq, Asif Loya, Nikolaos Batis, Jill Brooks, Paul Nankivell, Neil Sharma, Max Robinson, Hisham Mehanna, Syed Ali Khurram, Nasir Rajpoot

Abstract: The infiltration of T-lymphocytes in the stroma and tumour is an indication of an effective immune response against the tumour, resulting in better survival. In this study, our aim is to explore the prognostic significance of tumour-associated stroma infiltrating lymphocytes (TASILs) in head and neck squamous cell carcinoma (HNSCC) through an AI based automated method. A deep learning based automated method was employed to segment tumour, stroma and lymphocytes in digitally scanned whole slide images of HNSCC tissue slides. The spatial patterns of lymphocytes and tumour-associated stroma were digitally quantified to compute the TASIL-score. Finally, prognostic significance of the TASIL-score for disease-specific and disease-free survival was investigated with the Cox proportional hazard analysis. Three different cohorts of Haematoxylin & Eosin (H&E) stained tissue slides of HNSCC cases (n=537 in total) were studied, including publicly available TCGA head and neck cancer cases. The TASIL-score carries prognostic significance (p=0.002) for disease-specific survival of HNSCC patients. The TASIL-score also shows a better separation between low- and high-risk patients as compared to the manual TIL scoring by pathologists for both disease-specific and disease-free survival. A positive correlation of TASIL-score with molecular estimates of CD8+ T cells was also found, which is in line with existing findings. To the best of our knowledge, this is the first study to automate the quantification of TASIL from routine H&E slides of head and neck cancer. Our TASIL-score based findings are aligned with the clinical knowledge with the added advantages of objectivity, reproducibility and strong prognostic value. A comprehensive evaluation on large multicentric cohorts is required before the proposed digital score can be adopted in clinical practice.

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2103.09148 [pdf, other]

DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics

Authors: Ananya Muguli, Lancelot Pinto, Nirmala R., Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda

Abstract: The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This challenge is an open call for researchers to analyze a dataset of sound recordings collected from COVID-19 infected and non-COVID-19 individuals for a two-class classification. These recordings were collected via crowdsourcing from multiple countries, through a website application. The challenge features two tracks, one focusing on cough sounds, and the other on using a collection of breath, sustained vowel phonation, and number counting speech recordings. In this paper, we introduce the challenge and provide a detailed description of the task, and present a baseline system for the task.

Submitted 17 June, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

Comments: To appear in Proceedings of Interspeech, 2021

arXiv:2011.07124 [pdf, other]

doi 10.1093/mnras/stab294

Survey2Survey: A deep learning generative model approach for cross-survey image mapping

Authors: Brandon Buncher, Awshesh Nath Sharma, Matias Carrasco Kind

Abstract: During the last decade, there has been an explosive growth in survey data and deep learning techniques, both of which have enabled great advances for astronomy. The amount of data from various surveys from multiple epochs with a wide range of wavelengths, albeit with varying brightness and quality, is overwhelming, and leveraging information from overlapping observations from different surveys has limitless potential in understanding galaxy formation and evolution. Synthetic galaxy image generation using physical models has been an important tool for survey data analysis, while deep learning generative models show great promise. In this paper, we present a novel approach for robustly expanding and improving survey data through cross survey feature translation. We trained two types of neural networks to map images from the Sloan Digital Sky Survey (SDSS) to corresponding images from the Dark Energy Survey (DES). This map was used to generate false DES representations of SDSS images, increasing the brightness and S/N while retaining important morphological information. We substantiate the robustness of our method by generating DES representations of SDSS images from outside the overlapping region, showing that the brightness and quality are improved even when the source images are of lower quality than the training images. Finally, we highlight several images in which the reconstruction process appears to have removed large artifacts from SDSS images. While only an initial application, our method shows promise as a method for robustly expanding and improving the quality of optical survey data and provides a potential avenue for cross-band reconstruction.

Submitted 5 February, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

Comments: 24 pages, 19 figures. Accepted by MNRAS

arXiv:2008.07899 [pdf, ps, other]

doi 10.1109/TIM.2021.3122182

Accelerometric Method for Cuffless Continuous Blood Pressure Measurement

Authors: Mousumi Das, Tilendra Choudhary, L. N. Sharma, M. K. Bhuyan

Abstract: Pulse transit time (PTT) has been widely used for cuffless blood pressure (BP) measurement. However, it requires more than one cardiovascular signal, involving more than one sensing device. In this paper, we propose a method for continuous cuffless blood pressure measurement with the help of left ventricular ejection time (LVET). The LVET is estimated using a signal obtained through a micro-electromechanical system (MEMS)-based accelerometric sensor. The sensor acquires a seismocardiogram (SCG) signal at the chest surface, and the LVET information is extracted. Both systolic blood pressure (SBP) and diastolic blood pressure (DBP) are estimated by calibrating the system with the original arterial blood pressure values of the subjects. The proposed method is evaluated using different quantitative measures on the signals collected from ten subjects under the supine position. The performance of the proposed method is also compared with two earlier approaches, where PTT intervals are estimated from electrocardiogram (ECG)-photoplethysmogram (PPG) and SCG-PPG, respectively. The performance results clearly show that the proposed method is comparable with the state-of-the-art methods. Also, the computed blood pressure is compared with the original one, measured through a CNAP system. It gives the mean errors of the estimated systolic BP and diastolic BP within the range of -0.19 +/- 3.3 mmHg and -1.29 +/- 2.6 mmHg, respectively. The mean absolute errors for systolic BP and diastolic BP are 3.2 mmHg and 2.6 mmHg, respectively. The accuracy of BPs estimated from the proposed method satisfies the requirements of the IEEE standard of 5 +/- 8 mmHg deviation, and thus, it may be used for ubiquitous long term blood pressure monitoring.

Submitted 18 August, 2020; originally announced August 2020.

Journal ref: Noninvasive Accelerometric Approach for Cuffless Continuous Blood Pressure Measurement, IEEE Transactions on Instrumentation and Measurement, vol. 70, pp. 1-9, 2021, Art no. 4008109

arXiv:2005.10548 [pdf, other]

doi 10.21437/Interspeech.2020-2768

Coswara -- A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis

Authors: Neeraj Sharma, Prashant Krishnan, Rohit Kumar, Shreyas Ramoji, Srikanth Raj Chetupalli, Nirmala R., Prasanta Kumar Ghosh, Sriram Ganapathy

Abstract: The COVID-19 pandemic presents global challenges transcending boundaries of country, race, religion, and economy. The current gold standard method for COVID-19 detection is the reverse transcription polymerase chain reaction (RT-PCR) testing. However, this method is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for an alternate diagnosis tool which overcomes these limitations, and is deployable at a large scale. The prominent symptoms of COVID-19 include cough and breathing difficulties. We foresee that respiratory sounds, when analyzed using machine learning techniques, can provide useful insights, enabling the design of a diagnostic tool. Towards this, the paper presents an early effort in creating (and analyzing) a database, called Coswara, of respiratory sounds, namely, cough, breath, and voice. The sound samples are collected via worldwide crowdsourcing using a website application. The curated dataset is released as open access. As the pandemic is evolving, the data collection and analysis is a work in progress. We believe that insights from analysis of Coswara can be effective in enabling sound based technology solutions for point-of-care diagnosis of respiratory infection, and in the near future this can help to diagnose COVID-19.

Submitted 11 August, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

Comments: A description of Coswara dataset to evaluate COVID-19 diagnosis using respiratory sounds

arXiv:2002.10510 [pdf, ps, other]

doi 10.1109/JSEN.2020.3025384

Design of Breathing-states Detector for m-Health Platform using Seismocardiographic Signal

Authors: Tilendra Choudhary, L. N. Sharma, M. K. Bhuyan, Kangkana Bora

Abstract: In this work, a seismocardiogram (SCG) based breathing-state measuring method is proposed for m-health applications. The aim of the proposed framework is to assess the human respiratory system by identifying degree-of-breathings, such as breathlessness, normal breathing, and long and labored breathing. For this, it is needed to measure cardiac-induced chest-wall vibrations, reflected in the SCG signal. Orthogonal subspace projection is employed to extract the SCG cycles with the help of a concurrent ECG signal. Subsequently, fifteen statistically significant morphological-features are extracted from each of the SCG cycles. These features can efficiently characterize physiological changes due to varying respiratory rates. Stacked autoencoder (SAE) based architecture is employed for the identification of different respiratory-effort levels. The performance of the proposed method is evaluated and compared with other standard classifiers for 1147 analyzed SCG-beats. The proposed method gives an overall average accuracy of 91.45% in recognizing three different breathing states. The quantitative analysis of the performance results clearly shows the effectiveness of the proposed framework. It may be employed in various healthcare applications, such as pre-screening medical sensors and IoT based remote health-monitoring systems.

Submitted 5 April, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

Journal ref: Identification of Human Breathing-States Using Cardiac-Vibrational Signal for m-Health Applications, IEEE Sensors Journal, vol. 21, no. 3, pp. 3463-3470, 1 Feb. 2021

arXiv:2002.10405 [pdf, ps, other]

doi 10.1109/TIM.2020.3007295

Delineation and Analysis of Seismocardiographic Systole and Diastole Profiles

Authors: Tilendra Choudhary, M. K. Bhuyan, L. N. Sharma

Abstract: Precise estimation of fiducial points of a seismocardiogram (SCG) signal is a challenging problem for its clinical usage. Delineation techniques proposed in the existing literature do not estimate all the clinically significant points of an SCG signal, simultaneously. The aim of this research work is to propose a delineation framework to identify IM, AO, IC, AC, pAC and MO fiducial points with the help of a PPG signal. The proposed delineation method processes a wavelet-based scalographic PPG and an envelope construction scheme is proposed to estimate the prominent peaks of the PPG signal. A set of amplitude histogram based decision rules is developed for estimation of SCG diastole phases, namely AC, pAC and MO. Subsequently, the systolic phases, IM, AO and IC are detected by applying diastole masking on SCG and decision rules. Experimental results on real-time SCG signals acquired from our designed data acquisition-circuitry and their analysis show the effectiveness of the proposed scheme. Additionally, these estimated parameters are analyzed to show the discrimination between normal breathing and breathlessness conditions.

Submitted 24 February, 2020; originally announced February 2020.

Comments: IEEE Transactions on Instrumentation and Measurement, 2020

arXiv:2002.02357 [pdf, other]

Computationally efficient algorithm for eco-driving over long look-ahead horizons

Authors: Ahad Hamednia, Nalin Kumar Sharma, Nikolce Murgovski, Jonas Fredriksson

Abstract: This paper presents a computationally efficient algorithm for eco-driving over long prediction horizons. The eco-driving problem is formulated as a bi-level program, where the bottom level is solved offline, pre-optimizing gear as a function of longitudinal velocity and acceleration. The top level is solved online, optimizing a nonlinear dynamic program with travel time, kinetic energy and acceleration as state variables. To further reduce computational effort, the travel time is adjoined to the objective by applying necessary Pontryagin Maximum Principle conditions, and the nonlinear program is solved using a real-time iteration sequential quadratic programming scheme in a model predictive control framework. Compared to standard cruise control, the energy savings of using the proposed algorithm are up to 15.71%.

Submitted 6 February, 2020; originally announced February 2020.

arXiv:1910.01461 [pdf]

RNGA for non-square multivariable control systems: properties and application

Authors: Shaival Hemant Nagarsheth, Shambhu Nath Sharma

Abstract: The Relative Gain Array (RGA) and Relative Normalized Gain Array (RNGA) have received considerable attention for square systems. In this paper RNGA with the column-major, for non-square multivariable systems is introduced. RNGA of the paper has a row-column inequality, i.e. the number of rows is less than the number of columns. Unlike the conventional RGA, the RNGA loop pairing criteria of the paper considers both steady-state as well as transient information for the assessment of control-loop interactions. The RNGA for square systems is extended for non-square multivariable systems by thoroughly deriving its supporting properties. The RNGA method is applied to a non-square multivariable radiator laboratory test setup for loop pairing. Closed-loop results arising from the RNGA-based loop pairing are depicted in the paper. The lacuna of the conventional RGA loop pairing has been overcome by the application of the developed RNGA of this paper. The results unfold the effectiveness of RNGA over RGA for non-square multivariable systems to have minimum interactions and better control.

Submitted 25 November, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

Comments: 16 pages, 5 figures, 3 tables

MSC Class: 93A14-Decentralized system; 15A09-Matrix inversion; generalized inverses; 93C35-Multivariable systems

arXiv:1907.07564 [pdf, other]

Conversational Help for Task Completion and Feature Discovery in Personal Assistants

Authors: Madan Gopal Jhawar, Vipindeep Vangala, Nishchay Sharma, Ankur Hayatnagarkar, Mansi Saxena, Swati Valecha

Abstract: Intelligent Personal Assistants (IPAs) have become widely popular in recent times. Most of the commercial IPAs today support a wide range of skills including Alarms, Reminders, Weather Updates, Music, News, Factual Questioning-Answering, etc. The list grows every day, making it difficult to remember the command structures needed to execute various tasks. An IPA must have the ability to communicate information about supported skills and direct users towards the right commands needed to execute them. Users interact with personal assistants in natural language. A query is defined to be a Help Query if it seeks information about a personal assistant's capabilities, or asks for instructions to execute a task. In this paper, we propose an interactive system which identifies help queries and retrieves appropriate responses. Our system comprises a C-BiLSTM based classifier, which is a fusion of Convolutional Neural Networks (CNN) and Bidirectional LSTM (BiLSTM) architectures, to detect help queries and a semantic Approximate Nearest Neighbours (ANN) module to map the query to an appropriate predefined response. Evaluation of our system on real-world queries from a commercial IPA and a detailed comparison with popular traditional machine learning and deep learning based models reveal that our system outperforms other approaches and returns relevant responses for help queries.

Submitted 16 July, 2019; originally announced July 2019.

arXiv:1903.03725 [pdf, other]

Control over Skies: Survivability, Coverage, and Mobility Laws for Hierarchical Aerial Base Stations

Authors: Vishal Sharma, Navuday Sharma, Mubashir Husain Rehmani, Haris Pervaiz

Abstract: Aerial Base Stations (ABSs) have gained significant importance in the next generation of wireless networks for accommodating mobile ground users and flash crowds with high convenience and quality. However, to achieve an efficient ABS network, many factors pertaining to ABS flight, governing laws and information transmissions must be studied. In this article, multi-drone communications are studied in three major aspects, survivability, coverage, and mobility laws, which optimize the multi-tier ABS network to avoid issues related to inter-cell interference, deficient energy, frequent handovers, and lifetime. The article includes simulation results of hierarchical ABS allocations for handling a set of users over a defined geographical area. Several open issues and challenges are presented to provide deep insights into the ABS network management and its utility framework.

Submitted 16 April, 2021; v1 submitted 8 March, 2019; originally announced March 2019.

Comments: 7 pages, 6 figures

arXiv:1807.08315 [pdf, other]

Accelerated Structure-Aware Reinforcement Learning for Delay-Sensitive Energy Harvesting Wireless Sensors

Authors: Nikhilesh Sharma, Nicholas Mastronarde, Jacob Chakareski

Abstract: We investigate an energy-harvesting wireless sensor transmitting latency-sensitive data over a fading channel. The sensor injects captured data packets into its transmission queue and relies on ambient energy harvested from the environment to transmit them. We aim to find the optimal scheduling policy that decides whether or not to transmit the queue's head-of-line packet at each transmission opportunity such that the expected packet queuing delay is minimized given the available harvested energy. No prior knowledge of the stochastic processes that govern the channel, captured data, or harvested energy dynamics is assumed, thereby necessitating the use of online learning to optimize the scheduling policy. We formulate this scheduling problem as a Markov decision process (MDP) and analyze the structural properties of its optimal value function. In particular, we show that it is non-decreasing and has increasing differences in the queue backlog and that it is non-increasing and has increasing differences in the battery state. We exploit this structure to formulate a novel accelerated reinforcement learning (RL) algorithm to solve the scheduling problem online at a much faster learning rate, while limiting the induced computational complexity. Our experiments demonstrate that the proposed algorithm closely approximates the performance of an optimal offline solution that requires a priori knowledge of the channel, captured data, and harvested energy dynamics. Simultaneously, by leveraging the value function's structure, our approach achieves competitive performance relative to a state-of-the-art RL algorithm, at potentially orders of magnitude lower complexity. Finally, considerable performance gains are demonstrated over the well-known and widely used Q-learning algorithm.

Submitted 5 May, 2019; v1 submitted 22 July, 2018; originally announced July 2018.

Comments: arXiv admin note: text overlap with arXiv:1803.09778

Showing 1–19 of 19 results for author: Sharma, N