Search | arXiv e-print repository

Bone Fracture Classification using Transfer Learning

Abstract: The manual examination of X-ray images for fractures is a time-consuming process that is prone to human error. In this work, we introduce a robust yet simple training loop for the classification of fractures, which significantly outperforms existing methods. Our method achieves superior performance in fewer than ten epochs and utilizes the latest dataset to deliver the best-performing model for this task. We emphasize the importance of training deep learning models responsibly and efficiently, as well as the critical role of selecting high-quality datasets.

Submitted 22 June, 2024; originally announced June 2024.

Comments: Code is publicly available at https://github.com/shyamgupta196/Bone-Fracture-Classification

arXiv:2406.15520 [pdf]

doi 10.1039/D3LC00982C

Miniature fluorescence sensor for quantitative detection of brain tumour

Authors: Jean Pierre Ndabakuranye, James Belcourt, Deepak Sharma, Cathal D. O'Connell, Victor Mondal, Sanjay K. Srivastava, Alastair Stacey, Sam Long, Bobbi Fleiss, Arman Ahnood

Abstract: Fluorescence-guided surgery has emerged as a vital tool for tumour resection procedures. As well as intraoperative tumour visualisation, 5-ALA-induced PpIX provides an avenue for quantitative tumour identification based on ratiometric fluorescence measurement. To this end, fluorescence imaging and fibre-based probes have enabled more precise demarcation between the cancerous and healthy tissues. These sensing approaches, which rely on collecting the fluorescence light from the tumour resection site and its remote spectral sensing, introduce challenges associated with optical losses. In this work, we demonstrate the viability of tumour detection at the resection site using a miniature fluorescence measurement system. Unlike the current bulky systems, which necessitate remote measurement, we have adopted a millimetre-sized spectral sensor chip for quantitative fluorescence measurements. A reliable measurement at the resection site requires a stable optical window between the tissue and the optoelectronic system. This is achieved using an antifouling diamond window, which provides stable optical transparency. The system achieved a sensitivity of 92.3% and specificity of 98.3% in detecting a surrogate tumour at a resolution of 1 × 1 mm². As well as addressing losses associated with collecting and coupling fluorescence light in the current remote sensing approaches, the small size of the system introduced in this work paves the way for its direct integration with the tumour resection tools with the aim of more accurate intraoperative tumour identification.

Submitted 20 June, 2024; originally announced June 2024.

Journal ref: Lab on a Chip 24.4 (2024): 946-954

arXiv:2406.05199 [pdf, other]

XANE: eXplainable Acoustic Neural Embeddings

Authors: Sri Harsha Dumpala, Dushyant Sharma, Chandramouli Shama Sastri, Stanislav Kruchinin, James Fosburgh, Patrick A. Naylor

Abstract: We present a novel method for extracting neural embeddings that model the background acoustics of a speech signal. The extracted embeddings are used to estimate specific parameters related to the background acoustic properties of the signal in a non-intrusive manner, which allows the embeddings to be explainable in terms of those parameters. We illustrate the value of these embeddings by performing clustering experiments on unseen test data and show that the proposed embeddings achieve a mean F1 score of 95.2% for three different tasks, significantly outperforming the WavLM-based signal embeddings. We also show that the proposed method can explain the embeddings by estimating 14 acoustic parameters characterizing the background acoustics, including reverberation and noise levels, overlapped speech detection, CODEC type detection and noise type detection with high accuracy and a real-time factor 17 times lower than an external baseline method.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2405.07100 [pdf, ps, other]

Analysis of Decentralized Stochastic Successive Convex Approximation for composite non-convex problems

Authors: Basil M. Idrees, Shivangi Dubey Sharma, Ketan Rajawat

Abstract: This work considers the decentralized successive convex approximation (SCA) method for minimizing stochastic non-convex objectives subject to convex constraints, along with possibly non-smooth convex regularizers. Although SCA has been widely applied in decentralized settings, its stochastic first order (SFO) complexity is unknown, and it is thought to be slower than the centralized momentum-enhanced SCA variants. In this work, we advance the state-of-the-art for SCA methods by proposing an accelerated variant, namely the Decentralized Momentum-based Stochastic SCA (D-MSSCA), and analyze its SFO complexity. The proposed algorithm entails creating a stochastic surrogate of the objective at every iteration, which is minimized at each node separately. Remarkably, the D-MSSCA achieves an SFO complexity of $\mathcal{O}(\epsilon^{-3/2})$ to reach an $\epsilon$-stationary point, which is on par with the SFO complexity lower bound for unconstrained stochastic non-convex optimization in the centralized setting.

Submitted 27 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

arXiv:2401.00180 [pdf, other]

Auxiliary Network-Enabled Attack Detection and Resilient Control of Islanded AC Microgrid

Authors: Vaibhav Vaishnav, Anoop Jain, Dushyant Sharma

Abstract: This paper proposes a cyber-resilient distributed control strategy equipped with attack detection capabilities for islanded AC microgrids in the presence of bounded stealthy cyber attacks affecting both frequency and power information exchanged among neighboring distributed generators (DGs). The proposed control methodology relies on the construction of an auxiliary layer and the establishment of effective inter-layer cooperation between the actual DGs in the control layer and the virtual DGs in the auxiliary layer. This cooperation aims to achieve robust frequency restoration and proportional active power-sharing. It is shown that the in situ presence of a concealed auxiliary layer not only guarantees resilience against stealthy bounded attacks on both frequency and power-sharing but also facilitates a network-enabled attack identification mechanism. The paper provides rigorous proof of the stability of the closed-loop system and derives bounds for frequency and power deviations under attack conditions, offering insights into the impact of the attack signal, control and pinning gains, and network connectivity on the system's convergence properties. The performance of the proposed controllers is illustrated by simulating a networked islanded AC microgrid in a Simulink environment showcasing both attributes of attack resilience and attack detection.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2310.18494 [pdf, other]

Knowledge-based in silico models and dataset for the comparative evaluation of mammography AI for a range of breast characteristics, lesion conspicuities and doses

Authors: Elena Sizikova, Niloufar Saharkhiz, Diksha Sharma, Miguel Lago, Berkman Sahiner, Jana G. Delfino, Aldo Badano

Abstract: To generate evidence regarding the safety and efficacy of artificial intelligence (AI) enabled medical devices, AI models need to be evaluated on a diverse population of patient cases, some of which may not be readily available. We propose an evaluation approach for testing medical imaging AI models that relies on in silico imaging pipelines in which stochastic digital models of human anatomy (in object space) with and without pathology are imaged using a digital replica imaging acquisition system to generate realistic synthetic image datasets. Here, we release M-SYNTH, a dataset of cohorts with four breast fibroglandular density distributions imaged at different exposure levels using Monte Carlo x-ray simulations with the publicly available Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE) toolkit. We utilize the synthetic dataset to analyze AI model performance and find that model performance decreases with increasing breast density and increases with higher mass density, as expected. As exposure levels decrease, AI model performance drops with the highest performance achieved at exposure levels lower than the nominal recommended dose for the breast type.

Submitted 27 October, 2023; originally announced October 2023.

Comments: NeurIPS 2023 Datasets and Benchmarks Track

arXiv:2306.06375 [pdf, ps, other]

Optimized Gradient Tracking for Decentralized Online Learning

Authors: Shivangi Dubey Sharma, Ketan Rajawat

Abstract: This work considers the problem of decentralized online learning, where the goal is to track the optimum of the sum of time-varying functions, distributed across several nodes in a network. The local availability of the functions and their gradients necessitates coordination and consensus among the nodes. We put forth the Generalized Gradient Tracking (GGT) framework that unifies a number of existing approaches, including the state-of-the-art ones. The performance of the proposed GGT algorithm is theoretically analyzed using a novel semidefinite programming-based analysis that yields the desired regret bounds under very general conditions and without requiring the gradient boundedness assumption. The results are applicable to the special cases of GGT, which include various state-of-the-art algorithms as well as new dynamic versions of various classical decentralized algorithms. To further minimize the regret, we consider a condensed version of GGT with only four free parameters. A procedure for offline tuning of these parameters using only the problem parameters is also detailed. The resulting optimized GGT (oGGT) algorithm not only achieves improved dynamic regret bounds, but also outperforms all state-of-the-art algorithms on both synthetic and real-world datasets.

Submitted 13 February, 2024; v1 submitted 10 June, 2023; originally announced June 2023.

Comments: 30 pages, 6 Figures

arXiv:2304.11238 [pdf, ps, other]

Adapting model-based deep learning to multiple acquisition conditions: Ada-MoDL

Authors: Aniket Pramanik, Sampada Bhave, Saurav Sajib, Samir D. Sharma, Mathews Jacob

Abstract: Purpose: The aim of this work is to introduce a single model-based deep network that can provide high-quality reconstructions from undersampled parallel MRI data acquired with multiple sequences, acquisition settings and field strengths. Methods: A single unrolled architecture, which offers good reconstructions for multiple acquisition settings, is introduced. The proposed scheme adapts the model to each setting by scaling the CNN features and the regularization parameter with appropriate weights. The scaling weights and regularization parameter are derived using a multi-layer perceptron model from conditional vectors, which represent the specific acquisition setting. The perceptron parameters and the CNN weights are jointly trained using data from multiple acquisition settings, including differences in field strengths, acceleration, and contrasts. The conditional network is validated using datasets acquired with different acquisition settings. Results: The comparison of the adaptive framework, which trains a single model using the data from all the settings, shows that it can offer consistently improved performance for each acquisition condition. The comparison of the proposed scheme with networks that are trained independently for each acquisition setting shows that it requires less training data per acquisition setting to offer good performance. Conclusion: The Ada-MoDL framework enables the use of a single model-based unrolled network for multiple acquisition settings. In addition to eliminating the need to train and store multiple networks for different acquisition settings, this approach reduces the training data needed for each acquisition setting.

Submitted 21 April, 2023; originally announced April 2023.

arXiv:2211.12072 [pdf, other]

Design and Performance Analysis of Hardware Realization of 3GPP Physical Layer for 5G Cell Search

Authors: Khalid Lodhi, Jayant Chhillar, Sumit J. Darak, Divisha Sharma

Abstract: 5G Cell Search (CS) is the first step for user equipment (UE) to initiate the communication with the 5G node B (gNB) every time it is powered ON. In cellular networks, CS is accomplished via synchronization signals (SS) broadcast by gNB. 5G 3rd generation partnership project (3GPP) specifications offer a detailed discussion on the SS generation at gNB, but only a limited understanding of their blind search and detection is available. Unlike 4G, 5G SS may not be transmitted at the center of carrier frequency and their frequency location is unknown to UE. In this work, we demonstrate the 5G CS by designing 3GPP compatible hardware realization of the physical layer (PHY) of the gNB transmitter and UE receiver. The proposed SS detection explores a novel down-sampling approach resulting in a significant reduction in complexity and latency. Via detailed performance analysis, we analyze the functional correctness, computational complexity, and latency of the proposed approach for different word lengths, signal-to-noise ratio (SNR), and down-sampling factors. We demonstrate the complete CS functionality on GNU Radio-based RFNoC framework and USRP-FPGA platform. The 3GPP compatibility and demonstration on hardware strengthen the commercial significance of the proposed work.

Submitted 22 November, 2022; originally announced November 2022.

arXiv:2211.01338 [pdf, other]

Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages

Authors: Anusha Prakash, Arun Kumar, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K V Vikram, Mano Ranjith Kumar M, Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda Sukhadia, Dipti Sharma, Hema Murthy, Pushpak Bhattacharya, et al. (2 additional authors not shown)

Abstract: Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean-opinion-score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation with scores of 4.09 and 3.74, respectively. Human effort is also reduced by 75%.

Submitted 1 November, 2022; originally announced November 2022.

arXiv:2203.13919 [pdf]

Spatial Processing Front-End For Distant ASR Exploiting Self-Attention Channel Combinator

Authors: Dushyant Sharma, Rong Gong, James Fosburgh, Stanislav Yu. Kruchinin, Patrick A. Naylor, Ljubomir Milanovic

Abstract: We present a novel multi-channel front-end based on channel shortening with the Weighted Prediction Error (WPE) method followed by a fixed MVDR beamformer used in combination with a recently proposed self-attention-based channel combination (SACC) scheme, for tackling the distant ASR problem. We show that the proposed system used as part of a ContextNet based end-to-end (E2E) ASR system outperforms leading ASR systems as demonstrated by a 21.6% reduction in relative WER on a multi-channel LibriSpeech playback dataset. We also show how dereverberation prior to beamforming is beneficial and compare the WPE method with a modified neural channel shortening approach. An analysis of the non-intrusive estimate of the signal C50 confirms that the 8 channel WPE method provides significant dereverberation of the signals (13.6 dB improvement). We also show how the weights of the SACC system allow the extraction of accurate spatial information which can be beneficial for other speech processing applications like diarization.

Submitted 25 March, 2022; originally announced March 2022.

Comments: to be presented at ICASSP 2022

arXiv:2109.11225 [pdf, other]

ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization

Authors: Marco Gaudesi, Felix Weninger, Dushyant Sharma, Puming Zhan

Abstract: End-to-end (E2E) multi-channel ASR systems show state-of-the-art performance in far-field ASR tasks by joint training of a multi-channel front-end along with the ASR model. The main limitation of such systems is that they are usually trained with data from a fixed array geometry, which can lead to degradation in accuracy when a different array is used in testing. This makes it challenging to deploy these systems in practice, as it is costly to retrain and deploy different models for various array configurations. To address this, we present a simple and effective data augmentation technique, which is based on randomly dropping channels in the multi-channel audio input during training, in order to improve the robustness to various array configurations at test time. We call this technique ChannelAugment, in contrast to SpecAugment (SA) which drops time and/or frequency components of a single channel input audio. We apply ChannelAugment to the Spatial Filtering (SF) and Minimum Variance Distortionless Response (MVDR) neural beamforming approaches. For SF, we observe 10.6% WER improvement across various array configurations employing different numbers of microphones. For MVDR, we achieve a 74% reduction in training time without causing degradation of recognition accuracy.

Submitted 23 September, 2021; originally announced September 2021.

Comments: To appear in ASRU 2021

arXiv:2109.04783 [pdf, other]

Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-field Speech Recognition

Authors: Rong Gong, Carl Quillen, Dushyant Sharma, Andrew Goderre, José Laínez, Ljubomir Milanović

Abstract: When sufficiently large far-field training data is available, jointly optimizing a multichannel frontend and an end-to-end (E2E) Automatic Speech Recognition (ASR) backend shows promising results. Recent literature has shown that traditional beamformer designs, such as MVDR (Minimum Variance Distortionless Response) or fixed beamformers, can be successfully integrated as the frontend into an E2E ASR system with learnable parameters. In this work, we propose the self-attention channel combinator (SACC) ASR frontend, which leverages the self-attention mechanism to combine multichannel audio signals in the magnitude spectral domain. Experiments conducted on multichannel playback test data show that the SACC achieved a 9.3% WERR compared to a state-of-the-art fixed beamformer-based frontend, both jointly optimized with a ContextNet-based ASR backend. We also demonstrate the connection between the SACC and the traditional beamformers, and analyze the intermediate outputs of the SACC.

Submitted 10 September, 2021; originally announced September 2021.

Comments: In Proceedings of Interspeech 2021

arXiv:2011.00052 [pdf, other]

(Un)Masked COVID-19 Trends from Social Media

Authors: Asmit Kumar Singh, Paras Mehan, Divyanshu Sharma, Rohan Pandey, Tavpritesh Sethi, Ponnurangam Kumaraguru

Abstract: Wearing masks is a useful protection method against COVID-19, which has caused widespread economic and social impact worldwide. Across the globe, governments have put mandates for the use of face masks, which have received both positive and negative reactions. Online social media provides an exciting platform to study the use of masks and analyze underlying mask-wearing patterns. In this article, we analyze 2.04 million social media images for six US cities. An increase in masks worn in images is seen as the COVID-19 cases rose, particularly when their respective states imposed strict regulations. We also found a decrease in the posting of group pictures as stay-at-home laws were put into place. Furthermore, mask compliance in the Black Lives Matter protest was analyzed, revealing that 40% of the people in group photos wore masks, and 45% of them wore the masks with a fit score of greater than 80%. We introduce two new datasets, VAriety MAsks - Classification (VAMA-C) and VAriety MAsks - Segmentation (VAMA-S), for mask detection and mask fit analysis tasks, respectively. For the analysis, we create two frameworks, face mask detector (for classifying masked and unmasked faces) and mask fit analyzer (a semantic segmentation based model to calculate a mask-fit score). The face mask detector achieved a classification accuracy of 98%, and the semantic segmentation model for the mask fit analyzer achieved an Intersection Over Union (IOU) score of 98%. We conclude that such a framework can be used to evaluate the effectiveness of such public health strategies using social media platforms in times of pandemic.

Submitted 9 July, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

arXiv:2001.03553 [pdf, ps, other]

Impact of Sampler Offset on Jitter Transfer in Clock and Data Recovery Circuits

Authors: Naveen Kadayinti, Maryam Shojaei Baghini, Dinesh K. Sharma

Abstract: This paper shows how the input offset of sampling flip-flops in the Alexander phase detector affects the jitter transfer from data to the recovered clock in a clock data recovery circuit. The Alexander phase detector samples the data at both the edges of the clock in order to recover the data, as well as the clock timing information. The timing information is used in a clock recovery circuit, which is basically a PLL or a DLL. Once the PLL (or DLL) is locked, the phase detector samples the data at the center of the eye as well as at the data transitions. It is shown how the offset of the sampling flip-flop that samples the data at its transitions influences the jitter transfer from data to the recovered clock. Importantly, it is shown that zero offset is not always the best case. The effect is studied for different levels of data dependent jitter. The mechanism of this phenomenon is explained and the predictions are supported with simulations. The paper also discusses a tracking circuit that keeps the offset at the minimum jitter point.

Submitted 10 January, 2020; originally announced January 2020.

Comments: 5 pages, 10 figures

arXiv:1808.01039 [pdf, other]

An Energy Efficient Routing Protocol for Wireless Internet-of-Things Sensor Networks

Authors: Vidushi Vashishth, Anshuman Chhabra, Anirudh Khanna, Deepak Kumar Sharma, Jyotsna Singh

Abstract: Internet of Things (IoT) devices are increasingly being adopted into practical applications such as security systems, smart infrastructure, traffic management, and weather systems, among others. While the scale of these applications is enormous, device capabilities, particularly in terms of battery life and energy efficiency, are limited. Despite research being done to ameliorate these shortcomings, wireless IoT networks still cannot guarantee satisfactory network lifetimes and prolonged sensing coverage. Moreover, proposed schemes in literature are convoluted and cannot be easily implemented in real-world scenarios. This necessitates the development of a simple yet energy efficient routing scheme for wireless IoT sensor networks. This paper models the energy constraint problem of devices in IoT applications as an optimization problem. To conserve the energy of device nodes, the routing protocol first aggregates devices into clusters based on a number of different features such as distance from base station, data/message length and data sensed from the environment in the current epoch. Then, a cluster head is elected for each cluster and a directed acyclic graph (DAG) is generated with all the cluster heads as nodes. Edges represent communication intent from transmitter to receiver and the edge weights are computed using a formulated equation. The minimum cost path to the base station is computed to allow for efficient real-time routing. Sleep scheduling is also optionally used to further boost network energy efficiency. The proposed routing protocol has been simulated and outperforms existing routing protocols in terms of metrics such as number of active nodes, energy dynamics and network coverage.

Submitted 8 March, 2019; v1 submitted 2 August, 2018; originally announced August 2018.

Journal ref: International Journal of Communication Systems (applied for review in 2019)

arXiv:1807.05331 [pdf]

doi 10.1109/TIM.2018.2829488

Improving Photoplethysmographic Measurements under Motion Artifacts using Artificial Neural Network for Personal Healthcare

Authors: Monalisa Singha Roy, Rajarshi Gupta, Jayanta K. Chandra, Kaushik Das Sharma, Arunansu Talukdar

Abstract: Photoplethysmographic (PPG) measurements are susceptible to motion artifacts (MA) due to movement of the peripheral body parts. In this paper, we present a new approach to identify the MA corrupted PPG beats and then rectify the beat morphology using artificial neural network (ANN). Initially, beat quality assessment was done to identify the clean PPG beats by a pre-trained feedback ANN to generate a reference beat template for each person. The PPG data was decomposed using principal component analysis (PCA) and reconstructed using fixed energy retention. A weight coefficient was assigned for each PPG sample in such a way that when they are multiplied, the modified beat morphology matches the reference template. A particle swarm optimization (PSO) based technique was utilized to select the best weight vector coefficients to tune another feedback ANN, fed with a set of significant features generated by an auto encoder from PCA reconstructed data. For real time implementation, this pre-trained ANN was operated in feed-forward mode to directly generate the weight vectors for any subsequent measurements of PPG. The method was validated with PPG data collected from 55 human subjects. An average RMSE of 0.28 and SNR improvement of 14.54 dB was obtained, with an average improvement of 36% and 47% measurement accuracy on crest time and systolic to diastolic peak height ratio respectively. With IEEE Signal Processing Cup 2015 Challenge database, Pearson's correlation coefficient between PPG estimated and ECG derived heart rate was 0.990. The proposed method can be useful for personal health monitoring applications.

Submitted 14 July, 2018; originally announced July 2018.

Journal ref: IEEE Transactions on Instrumentation & Measurement 2018

Showing 1–17 of 17 results for author: Sharma, D