-
TABSurfer: a Hybrid Deep Learning Architecture for Subcortical Segmentation
Authors:
Aaron Cao,
Vishwanatha M. Rao,
Kejia Liu,
Xinru Liu,
Andrew F. Laine,
Jia Guo
Abstract:
Subcortical segmentation remains challenging despite its important applications in quantitative structural analysis of brain MRI scans. The most accurate method, manual segmentation, is highly labor intensive, so automated tools like FreeSurfer have been adopted to handle this task. However, these traditional pipelines are slow and inefficient for processing large datasets. In this study, we propose TABSurfer, a novel 3D patch-based CNN-Transformer hybrid deep learning model designed for superior subcortical segmentation compared to existing state-of-the-art tools. To evaluate, we first demonstrate TABSurfer's consistent performance across various T1w MRI datasets with significantly shorter processing times compared to FreeSurfer. Then, we validate against manual segmentations, where TABSurfer outperforms FreeSurfer based on the manual ground truth. In each test, we also establish TABSurfer's advantage over a leading deep learning benchmark, FastSurferVINN. Together, these studies highlight TABSurfer's utility as a powerful tool for fully automated subcortical segmentation with high fidelity.
Submitted 13 December, 2023;
originally announced December 2023.
-
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
Authors:
Helin Wang,
Venkatesh Ravichandran,
Milind Rao,
Becky Lammers,
Myra Sydnor,
Nicholas Maragakis,
Ankur A. Butala,
Jayne Zhang,
Lora Clawson,
Victoria Chovaz,
Laureano Moro-Velazquez
Abstract:
Spoken language understanding (SLU) systems often exhibit suboptimal performance in processing atypical speech, typically caused by neurological conditions and motor impairments. Recent advancements in Text-to-Speech (TTS) synthesis-based augmentation for more fair SLU have struggled to accurately capture the unique vocal characteristics of atypical speakers, largely due to insufficient data. To address this issue, we present a novel data augmentation method for atypical speakers by finetuning a TTS model, called Aty-TTS. Aty-TTS models speaker and atypical characteristics via knowledge transferring from a voice conversion model. Then, we use the augmented data to train SLU models adapted to atypical speech. To train these data augmentation models and evaluate the resulting SLU systems, we have collected a new atypical speech dataset containing intent annotation. Both objective and subjective assessments validate that Aty-TTS is capable of generating high-quality atypical speech. Furthermore, it serves as an effective data augmentation strategy, contributing to more fair SLU systems that can better accommodate individuals with atypical speech patterns.
Submitted 16 November, 2023;
originally announced November 2023.
-
Federated Representation Learning for Automatic Speech Recognition
Authors:
Guruprasad V Ramesh,
Gopinath Chennupati,
Milind Rao,
Anit Kumar Sahu,
Ariya Rastrow,
Jasha Droppo
Abstract:
Federated Learning (FL) is a privacy-preserving paradigm, allowing edge devices to learn collaboratively without sharing data. Edge devices like Alexa and Siri are prospective sources of unlabeled audio data that can be tapped to learn robust audio representations. In this work, we bring Self-supervised Learning (SSL) and FL together to learn representations for Automatic Speech Recognition respecting data privacy constraints. We use the speaker and chapter information in the unlabeled speech dataset, Libri-Light, to simulate non-IID speaker-siloed data distributions and pre-train an LSTM encoder with the Contrastive Predictive Coding framework with FedSGD. We show that the pre-trained ASR encoder in FL performs as well as a centrally pre-trained model and produces an improvement of 12-15% (WER) compared to no pre-training. We further adapt the federated pre-trained models to a new language, French, and show a 20% (WER) improvement over no pre-training.
Submitted 7 August, 2023; v1 submitted 3 August, 2023;
originally announced August 2023.
-
Distributed Radar Imaging Based on Accelerated ADMM
Authors:
Ahmed Murtada,
Bhavani Shankar Mysore Rama Rao,
Udo Schroeder
Abstract:
The ability of widely distributed radar systems to capture diverse spatial scattering properties substantially improves radar imaging performance. Traditional imaging methods leverage regularized optimization techniques to reconstruct sparse images from local sensors and later combine them to create a global image. Alternatively, we proposed in an earlier work a joint reconstruction technique based on two problem formulations according to the optimization framework of the Alternating Direction Method of Multipliers (ADMM). The joint reconstruction of the global image offers faster convergence, flexible implementation, and a general distributed reconstruction framework. However, despite its benefits, the ADMM framework still exhibits a slow convergence rate, making its use impractical in some contexts. In this paper, we introduce a heuristic method to accelerate the convergence of the previously proposed ADMM formulations based on the gradual elimination of already converged pixels in accordance with a predetermined criterion. In addition to reducing running time, the accelerated implementation offers reduced computational complexity and lower communication cost between the sensors during iterative updates.
Submitted 17 July, 2023;
originally announced July 2023.
-
Federated Self-Learning with Weak Supervision for Speech Recognition
Authors:
Milind Rao,
Gopinath Chennupati,
Gautam Tiwari,
Anit Kumar Sahu,
Anirudh Raju,
Ariya Rastrow,
Jasha Droppo
Abstract:
Low-footprint automatic speech recognition (ASR) models are increasingly being deployed on edge devices for conversational agents, which enhances privacy. We study the problem of federated continual incremental learning for recurrent neural network-transducer (RNN-T) ASR models in the privacy-enhancing scheme of learning on-device, without access to ground truth human transcripts or machine transcriptions from a stronger ASR model. In particular, we study the performance of a self-learning based scheme, with a paired teacher model updated through an exponential moving average of ASR models. Further, we propose using possibly noisy weak-supervision signals such as feedback scores and natural language understanding semantics determined from user behavior across multiple turns in a session of interactions with the conversational agent. These signals are leveraged in a multi-task policy-gradient training approach to improve the performance of self-learning for ASR. Finally, we show how catastrophic forgetting can be mitigated by combining on-device learning with a memory-replay approach using selected historical datasets. These innovations allow for a 10% relative improvement in WER on new use cases with minimal degradation on other test sets in the absence of strong-supervision signals such as ground-truth transcriptions.
Submitted 21 June, 2023;
originally announced June 2023.
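The teacher update mentioned in the abstract above (an exponential moving average of ASR model weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ema_update`, the dictionary-of-floats weight representation, and the `decay` value are assumptions chosen only for illustration.

```python
def ema_update(teacher, student, decay=0.999):
    """Return new teacher weights as an exponential moving average.

    `teacher` and `student` are hypothetical dicts mapping parameter
    names to floats; real models would hold tensors instead.
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    # Each teacher weight moves a small step toward the student weight.
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Illustrative values only; they do not come from the paper.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
updated = ema_update(teacher, student, decay=0.9)
```

A decay close to 1 makes the teacher change slowly, which is the usual reason for pairing a student with an averaged teacher in self-learning setups.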
-
Learning When to Trust Which Teacher for Weakly Supervised ASR
Authors:
Aakriti Agrawal,
Milind Rao,
Anit Kumar Sahu,
Gopinath Chennupati,
Andreas Stolcke
Abstract:
Automatic speech recognition (ASR) training can utilize multiple experts as teacher models, each trained on a specific domain or accent. Teacher models may be opaque in nature since their architecture may not be known or their training cadence is different from that of the student ASR model. Still, the student models are updated incrementally using the pseudo-labels generated independently by the expert teachers. In this paper, we exploit supervision from multiple domain experts in training student ASR models. This training strategy is especially useful in scenarios where few or no human transcriptions are available. To that end, we propose a Smart-Weighter mechanism that selects an appropriate expert based on the input audio, and then trains the student model in an unsupervised setting. We show the efficacy of our approach using LibriSpeech and LibriLight benchmarks and find an improvement of 4 to 25% over baselines that uniformly weight all the experts, use a single expert model, or combine experts using ROVER.
Submitted 21 June, 2023;
originally announced June 2023.
-
Adaptive Super-Twisting Controller Design for Accurate Trajectory Tracking Performance of Unmanned Aerial Vehicles
Authors:
D. M. K. K. Venkateswara Rao,
Hamed Habibi,
Jose Luis Sanchez-Lopez,
Prathyush P. Menon,
Christopher Edwards,
Holger Voos
Abstract:
In this paper, an adaptive super-twisting controller is designed for an agile maneuvering quadrotor unmanned aerial vehicle to achieve accurate trajectory tracking in the presence of external disturbances. A cascaded control architecture is designed to determine the desired accelerations using the proposed controller and subsequently used to compute the desired orientation and angular rates. The finite-time convergence of sliding functions and closed-loop system stability are analytically proven. Furthermore, the restrictive assumption on the maximum variation of the disturbance is relaxed by designing a gain adaptation law and low-pass filtering of the estimated equivalent control. The proper selection of design parameters is discussed in detail. Finally, the effectiveness of the proposed method is evaluated by high-fidelity software-in-the-loop simulations and validated by experimental studies.
Submitted 14 September, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
On SORA for High-Risk UAV Operations under New EU Regulations: Perspectives for Automated Approach
Authors:
Hamed Habibi,
D. M. K. K. Venkateswara Rao,
Jose Luis Sanchez-Lopez,
Holger Voos
Abstract:
In this paper, we investigate the requirements for preparing an application for Specific Operations Risk Assessment (SORA), regulated by the European Union Aviation Safety Agency (EASA), to obtain flight authorization for Unmanned Aerial Vehicle (UAV) operations, and propose some perspectives to automate the approach based on our successful application. Preparation of SORA requires expert knowledge as it contains technicalities. Also, the whole process is iterative and time-consuming. It is even more challenging for higher-risk operations, such as those in urban environments, near airports, and multi- and customized models for research activities. The SORA process limits the potential socio-economic impacts of innovative UAV capabilities. Therefore, in this paper, we present a SORA example, review the steps and highlight challenges. Accordingly, we propose an alternative workflow, considering the same steps, while addressing the challenges and pitfalls, to shorten the whole process. Furthermore, we present a comprehensive list of preliminary technical procedures, including the pre/during/post-flight checklists, design and installation appraisal, flight logbook, operational manual, training manual, and General Data Protection Regulation (GDPR), which are not explicitly instructed in the SORA manual. Moreover, we propose the initial idea to create an automated SORA workflow to facilitate obtaining authorization, which is significantly helpful for operators, especially the scientific community, to conduct experimental operations.
Submitted 3 March, 2023;
originally announced March 2023.
-
An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance
Authors:
D. M. K. K. Venkateswara Rao,
Hamed Habibi,
Jose Luis Sanchez-Lopez,
Holger Voos
Abstract:
This paper presents an integrated approach that combines trajectory optimization and the Artificial Potential Field (APF) method for real-time optimal Unmanned Aerial Vehicle (UAV) trajectory planning and dynamic collision avoidance. A minimum-time trajectory optimization problem is formulated with initial and final positions as boundary conditions and collision avoidance as constraints. It is transcribed into a nonlinear programming problem using the Chebyshev pseudospectral method. The state and control histories are approximated by using Lagrange polynomials and the collocation points are used to satisfy constraints. A novel sigmoid-type collision avoidance constraint is proposed to overcome the drawback of Lagrange polynomial approximation in pseudospectral methods, which guarantees inequality constraint satisfaction only at nodal points. Automatic differentiation of cost function and constraints is used to quickly determine their gradient and Jacobian, respectively. An APF method is used to update the optimal control inputs for guaranteeing collision avoidance. The trajectory optimization and APF method are implemented in a closed-loop fashion continuously, but in parallel at moderate and high frequencies, respectively. The initial guess for the optimization is provided based on the previous solution. The proposed approach is tested and validated through indoor experiments.
Submitted 3 March, 2023;
originally announced March 2023.
-
Iterative RNDOP-Optimal Anchor Placement for Beyond Convex Hull ToA-based Localization: Performance Bounds and Heuristic Algorithms
Authors:
Raghunandan M. Rao,
Don-Roberts Emenonye
Abstract:
Localizing targets outside the anchors' convex hull is an understudied but prevalent scenario in vehicle-centric, UAV-based, and self-localization applications. Considering such scenarios, this paper studies the optimal anchor placement problem for Time-of-Arrival (ToA)-based localization schemes such that the worst-case Dilution of Precision (DOP) is minimized. Building on prior results on DOP scaling laws for beyond convex hull ToA-based localization, we propose a novel metric termed the Range-Normalized DOP (RNDOP). We show that the worst-case DOP-optimal anchor placement problem simplifies to a min-max RNDOP-optimal anchor placement problem. Unfortunately, this formulation results in a non-convex and intractable problem under realistic constraints. To overcome this, we propose iterative anchor addition schemes, which result in a tractable albeit non-convex problem. By exploiting the structure arising from the resultant rank-1 update, we devise three heuristic schemes with varying performance-complexity tradeoffs. In addition, we also derive the upper and lower bounds for scenarios where we are placing anchors to optimize the worst-case (a) 3D positioning error and (b) 2D positioning error. We build on these results to design a cohesive iterative algorithmic framework for robust anchor placement, characterize the impact of anchor position uncertainty, and then discuss the computational complexity of the proposed schemes. Using numerical results, we validate the accuracy of our theoretical results. We also present comprehensive Monte-Carlo simulation results to compare the positioning error and execution time performance of each iterative scheme, discuss the tradeoffs, and provide valuable system design insights for beyond convex hull localization scenarios.
Submitted 17 February, 2024; v1 submitted 16 December, 2022;
originally announced December 2022.
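For readers unfamiliar with Dilution of Precision, the quantity discussed in the abstract above can be illustrated with the textbook 2D ToA definition, DOP = sqrt(trace((HᵀH)⁻¹)), where each row of H is the unit vector from the target to an anchor. The sketch below is not the paper's RNDOP metric or placement algorithm; the function name `toa_dop_2d` and the anchor layout are assumptions for illustration.

```python
import math


def toa_dop_2d(anchors, target):
    """Textbook 2D ToA DOP: sqrt(trace((H^T H)^-1)).

    `anchors` is a list of (x, y) tuples and `target` an (x, y) tuple.
    """
    a = b = c = 0.0  # entries of H^T H = [[a, b], [b, c]]
    for ax, ay in anchors:
        dx, dy = ax - target[0], ay - target[1]
        r = math.hypot(dx, dy)
        if r == 0.0:
            raise ValueError("target coincides with an anchor")
        ux, uy = dx / r, dy / r  # unit line-of-sight vector
        a += ux * ux
        b += ux * uy
        c += uy * uy
    det = a * c - b * b
    if det <= 1e-12:
        raise ValueError("degenerate geometry: H^T H is singular")
    # The trace of the inverse of a 2x2 matrix is (a + c) / det.
    return math.sqrt((a + c) / det)


# Hypothetical layout: anchors on the corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dop_inside = toa_dop_2d(square, (0.5, 0.5))    # target inside the hull
dop_outside = toa_dop_2d(square, (10.0, 0.5))  # target far outside the hull
```

With this layout the DOP is noticeably larger for the target outside the convex hull, because the line-of-sight vectors become nearly parallel. That geometric effect is what makes beyond-convex-hull localization harder.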
-
Detecting Schizophrenia with 3D Structural Brain MRI Using Deep Learning
Authors:
Junhao Zhang,
Vishwanatha M. Rao,
Ye Tian,
Yanting Yang,
Nicolas Acosta,
Zihan Wan,
Pin-Yu Lee,
Chloe Zhang,
Lawrence S. Kegeles,
Scott A. Small,
Jia Guo
Abstract:
Schizophrenia is a chronic neuropsychiatric disorder that causes distinct structural alterations within the brain. We hypothesize that deep learning applied to a structural neuroimaging dataset could detect disease-related alteration and improve classification and diagnostic accuracy. We tested this hypothesis using a single, widely available, and conventional T1-weighted MRI scan, from which we extracted the 3D whole-brain structure using standard post-processing methods. A deep learning model was then developed, optimized, and evaluated on three open datasets with T1-weighted MRI scans of patients with schizophrenia. Our proposed model outperformed the benchmark model, which was also trained with structural MR images using a 3D CNN architecture. Our model is capable of almost perfectly (area under the ROC curve = 0.987) distinguishing schizophrenia patients from healthy controls on unseen structural MRI scans. Regional analysis localized subcortical regions and ventricles as the most predictive brain regions. Subcortical structures serve a pivotal role in cognitive, affective, and social functions in humans, and structural abnormalities of these regions have been associated with schizophrenia. Our finding corroborates that schizophrenia is associated with widespread alterations in subcortical brain structure and the subcortical structural information provides prominent features in diagnostic classification. Together, these results further demonstrate the potential of deep learning to improve schizophrenia diagnosis and identify its structural neuroimaging signatures from a single, standard T1-weighted brain MRI.
Submitted 7 July, 2022; v1 submitted 26 June, 2022;
originally announced June 2022.
-
Bi-Sampling Approach to Classify Music Mood leveraging Raga-Rasa Association in Indian Classical Music
Authors:
Mohan Rao B C,
Vinayak Arkachaari,
Harsha M N,
Sushmitha M N,
Gayathri Ramesh K K,
Ullas M S,
Pathi Mohan Rao,
Sudha G,
Narayana Darapaneni
Abstract:
The impact of music on the mood or emotion of the listener is a well-researched area in human psychology and behavioral science. In Indian classical music, ragas are the melodic structures that define the various styles and forms of the music. Each raga has been found to evoke a specific emotion in the listener. With the advent of advanced audio signal processing capabilities and the application of machine learning, intelligent music classifiers and recommenders have received increased attention, especially in 'Music as a service' cloud applications. This paper explores a novel framework that leverages the raga-rasa association in Indian classical music to build an intelligent classifier, and its application in a music recommendation system based on the user's current mood and the mood they aspire to be in.
Submitted 13 March, 2022;
originally announced March 2022.
-
Widely Distributed Radar Imaging: Unmediated ADMM Based Approach
Authors:
Ahmed Murtada,
Ruizhi Hu,
Bhavani Shankar Mysore Rama Rao,
Udo Schroeder
Abstract:
In this paper, we present a novel approach to reconstruct a unique image of an observed scene with widely distributed radar sensors. The problem is posed as a constrained optimization problem in which the global image, which represents the aggregate view of the sensors, is a decision variable. While the problem is designed to promote a sparse solution for the global image, it is constrained such that a relationship with local images that can be reconstructed using the measurements at each sensor is respected. Two problem formulations are introduced by stipulating two different forms of that relationship. The proposed formulations are designed according to consensus ADMM (CADMM) and sharing ADMM (SADMM), and their solutions are provided accordingly as iterative algorithms. We derive the explicit variable updates for each algorithm in addition to the recommended scheme for hybrid parallel implementation on the distributed sensors and a central processing unit. Our algorithms are validated and their performance is evaluated using the Civilian Vehicles Dome data set to realize different scenarios of practical relevance. Experimental results show the effectiveness of the proposed algorithms, especially in cases with limited measurements.
Submitted 10 March, 2022;
originally announced March 2022.
-
Improving Across-Dataset Brain Tissue Segmentation Using Transformer
Authors:
Vishwanatha M. Rao,
Zihan Wan,
Soroush Arabshahi,
David J. Ma,
Pin-Yu Lee,
Ye Tian,
Xuzhe Zhang,
Andrew F. Laine,
Jia Guo
Abstract:
Brain tissue segmentation has demonstrated great utility in quantifying MRI data through Voxel-Based Morphometry and highlighting subtle structural changes associated with various conditions within the brain. However, manual segmentation is highly labor-intensive, and automated approaches have struggled due to properties inherent to MRI acquisition, leaving a great need for an effective segmentation tool. Despite the recent success of deep convolutional neural networks (CNNs) for brain tissue segmentation, many such solutions do not generalize well to new datasets, which is critical for a reliable solution. Transformers have demonstrated success in natural image segmentation and have recently been applied to 3D medical image segmentation tasks due to their ability to capture long-distance relationships in the input where the local receptive fields of CNNs struggle. This study introduces a novel CNN-Transformer hybrid architecture designed for brain tissue segmentation. We validate our model's performance across four multi-site T1w MRI datasets, covering different vendors, field strengths, scan parameters, time points, and neuropsychiatric conditions. In all situations, our model achieved the greatest generality and reliability. Our method is inherently robust and can serve as a valuable tool for brain-related T1w MRI studies. The code for the TABS network is available at: https://github.com/raovish6/TABS.
Submitted 31 January, 2023; v1 submitted 21 January, 2022;
originally announced January 2022.
-
Novel Local Radiomic Bayesian Classifiers for Non-Invasive Prediction of MGMT Methylation Status in Glioblastoma
Authors:
Mihir Rao
Abstract:
Glioblastoma, an aggressive brain cancer, is amongst the most lethal of all cancers. Expression of the O6-methylguanine-DNA-methyltransferase (MGMT) gene in glioblastoma tumor tissue is of clinical importance as it has a significant effect on the efficacy of Temozolomide, the primary chemotherapy treatment administered to glioblastoma patients. Currently, MGMT methylation is determined through an invasive brain biopsy and subsequent genetic analysis of the extracted tumor tissue. In this work, we present novel Bayesian classifiers that make probabilistic predictions of MGMT methylation status based on radiomic features extracted from FLAIR-sequence magnetic resonance images (MRIs). We implement local radiomic techniques to produce radiomic activation maps and analyze MRIs for the MGMT biomarker based on statistical features of raw voxel intensities. We demonstrate the ability of simple Bayesian classifiers to provide a boost in predictive performance when modelling local radiomic data rather than global features. The presented techniques provide a non-invasive MRI-based approach to determining MGMT methylation status in glioblastoma patients.
Submitted 29 November, 2021;
originally announced December 2021.
-
On joint training with interfaces for spoken language understanding
Authors:
Anirudh Raju,
Milind Rao,
Gautam Tiwari,
Pranav Dheram,
Bryan Anderson,
Zhe Zhang,
Chul Lee,
Bach Bui,
Ariya Rastrow
Abstract:
Spoken language understanding (SLU) systems extract both text transcripts and semantics associated with intents and slots from input speech utterances. SLU systems usually consist of (1) an automatic speech recognition (ASR) module, (2) an interface module that exposes relevant outputs from ASR, and (3) a natural language understanding (NLU) module. Interfaces in SLU systems carry information on text transcriptions or richer information like neural embeddings from ASR to NLU. In this paper, we study how interfaces affect joint training for spoken language understanding. Most notably, we obtain state-of-the-art results on the publicly available 50-hr SLURP dataset. We first leverage large pretrained ASR and NLU models that are connected by a text interface, and then jointly train both models via a sequence loss function. For scenarios where pretrained models are not utilized, the best results are obtained through joint sequence loss training using richer neural interfaces. Finally, we show the overall diminishing impact of leveraging pretrained models with increased training data size.
Submitted 25 July, 2022; v1 submitted 30 June, 2021;
originally announced June 2021.
-
Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End
Authors:
Swayambhu Nath Ray,
Minhua Wu,
Anirudh Raju,
Pegah Ghahremani,
Raghavendra Bilgi,
Milind Rao,
Harish Arsikere,
Ariya Rastrow,
Andreas Stolcke,
Jasha Droppo
Abstract:
Comprehending the overall intent of an utterance helps a listener recognize the individual words spoken. Inspired by this fact, we perform a novel study of the impact of explicitly incorporating intent representations as additional information to improve a recurrent neural network-transducer (RNN-T) based automatic speech recognition (ASR) system. An audio-to-intent (A2I) model encodes the intent of the utterance in the form of embeddings or posteriors, and these are used as auxiliary inputs for RNN-T training and inference. Experimenting with a 50k-hour far-field English speech corpus, this study shows that when running the system in non-streaming mode, where intent representation is extracted from the entire utterance and then used to bias streaming RNN-T search from the start, it provides a 5.56% relative word error rate reduction (WERR). On the other hand, a streaming system using per-frame intent posteriors as extra inputs for the RNN-T ASR system yields a 3.33% relative WERR. A further detailed analysis of the streaming system indicates that our proposed method brings especially good gain on media-playing related intents (e.g. 9.12% relative WERR on PlayMusicIntent).
Submitted 16 June, 2021; v1 submitted 14 May, 2021;
originally announced May 2021.
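The relative word error rate reduction (WERR) figures quoted in the abstract above are conventionally computed as the drop in WER divided by the baseline WER. A minimal sketch of that arithmetic follows; the helper name `relative_werr` and the example WER values are illustrative assumptions, not numbers or code from the paper.

```python
def relative_werr(baseline_wer: float, new_wer: float) -> float:
    """Relative word error rate reduction, in percent.

    Positive values mean the new system makes fewer word errors
    than the baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs (in percent) chosen only to illustrate the arithmetic.
example_werr = relative_werr(baseline_wer=10.0, new_wer=9.0)
```

Because the measure is relative, the same absolute WER drop yields a larger WERR on a system whose baseline WER is already low.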
-
Coexistence of Communications and Cognitive MIMO Radar: Waveform Design and Prototype
Authors:
Mohammad Alaee-Kerahroodi,
Ehsan Raei,
Sumit Kumar,
Bhavani Shankar Mysore Rama Rao
Abstract:
A new generation of radar systems will need to coexist with other radio frequency (RF) systems, anticipating their behavior and reacting appropriately to avoid interference. In light of this requirement, this paper designs, implements, and evaluates the performance of phase-only sequences (with constant power) for intelligent spectrum utilization using a custom-built cognitive Multiple Input Multiple Output (MIMO) radar prototype. The proposed transmit waveforms avoid the frequency bands occupied by narrowband interferers or communication links, while simultaneously having a small cross-correlation with each other to enable their separability at the MIMO radar receiver. The performance of the optimized set of sequences, obtained by solving a non-convex bi-objective optimization problem, is compared with state-of-the-art counterparts, and its applicability is illustrated by the developed prototype. A realistic Long Term Evolution (LTE) downlink is used for the communications, and the real-time system implementation is validated and evaluated through throughput calculations for communications and detection performance measurements for the radar system.
Submitted 8 March, 2021;
originally announced March 2021.
-
Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding
Authors:
Milind Rao,
Pranav Dheram,
Gautam Tiwari,
Anirudh Raju,
Jasha Droppo,
Ariya Rastrow,
Andreas Stolcke
Abstract:
Spoken language understanding (SLU) systems extract transcriptions, as well as semantics of intent or named entities from speech, and are essential components of voice activated systems. SLU models, which either directly extract semantics from audio or are composed of pipelined automatic speech recognition (ASR) and natural language understanding (NLU) models, are typically trained via differentiable cross-entropy losses, even when the relevant performance metrics of interest are word or semantic error rates. In this work, we propose non-differentiable sequence losses based on SLU metrics as a proxy for semantic error and use the REINFORCE trick to train ASR and SLU models with this loss. We show that custom sequence loss training is the state-of-the-art on open SLU datasets and leads to 6% relative improvement in both ASR and NLU performance metrics on large proprietary datasets. We also demonstrate how the semantic sequence loss training paradigm can be used to update ASR and SLU models without transcripts, using semantic feedback alone.
Submitted 12 February, 2021;
originally announced February 2021.
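The REINFORCE trick mentioned in the abstract above lets a model be trained on a loss that has no gradient, because only the log-probability of the sampled output is differentiated. The toy sketch below illustrates the idea with a softmax distribution over four fixed hypotheses, each carrying a non-differentiable error count. Everything here (function names, loss values, sample count, learning rate, and the use of the exact expected loss as a baseline) is an assumption made for illustration; it is not the authors' training setup.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, losses, rng, n_samples=64, lr=0.5):
    """One gradient-descent step on E[loss] using the score-function estimator.

    For a softmax distribution, d log p(k) / d logit_j = 1[j == k] - p(j),
    so the loss itself never needs to be differentiated.
    """
    probs = softmax(logits)
    # Baseline for variance reduction. It does not depend on the sample,
    # so the estimator stays unbiased. (Computed exactly here because the
    # toy problem is tiny; in practice it would be estimated.)
    baseline = sum(p * l for p, l in zip(probs, losses))
    grad = [0.0] * len(logits)
    for _ in range(n_samples):
        k = rng.choices(range(len(logits)), weights=probs)[0]
        advantage = losses[k] - baseline
        for j in range(len(logits)):
            indicator = 1.0 if j == k else 0.0
            grad[j] += advantage * (indicator - probs[j]) / n_samples
    return [x - lr * g for x, g in zip(logits, grad)]


def train(losses, steps=200, seed=0):
    """Run repeated REINFORCE steps from uniform logits; return final probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(losses)
    for _ in range(steps):
        logits = reinforce_step(logits, losses, rng)
    return softmax(logits)


# Hypothetical per-hypothesis error counts; hypothesis 2 has the lowest loss.
final_probs = train(losses=[3.0, 2.0, 0.0, 4.0])
```

After training, most of the probability mass should sit on the lowest-loss hypothesis, even though the loss was only ever evaluated, never differentiated.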
-
Widely-distributed Radar Imaging Based on Consensus ADMM
Authors:
Ruizhi Hu,
Bhavani Shankar Mysore Rama Rao,
Ahmed Murtada,
Mohammad Alaee-Kerahroodi,
Björn Ottersten
Abstract:
A widely-distributed radar system is a promising architecture to enhance radar imaging performance. However, most existing algorithms rely on isotropic scattering assumption, which is only satisfied in collocated radar systems. Moreover, due to noise and imaging model imperfections, artifacts such as layovers are common in radar images. In this paper, a novel $l_1$-regularized, consensus alternating direction method of multipliers (CADMM) based algorithm is proposed to mitigate artifacts by exploiting a widely-distributed radar system's spatial diversity. By imposing the consensus constraints on the local images formed by distributed antenna clusters and solving the resulting distributed optimization problem, the scenario's spatial-invariant common features are retained. Simultaneously, the spatial-variant artifacts are mitigated, and it will finally converge to a high-quality global image in the consensus of all distributed measurements. The proposed algorithm outperforms the joint sparsity-based composite imaging (JSC) algorithm in terms of artifacts mitigation. It can also reduce the computation and storage burden of large-scale imaging problems through its distributed and parallelizable optimization scheme.
Submitted 8 November, 2020; v1 submitted 4 November, 2020;
originally announced November 2020.
-
Decentralized optimization over noisy, rate-constrained networks: Achieving consensus by communicating differences
Authors:
Rajarshi Saha,
Stefano Rini,
Milind Rao,
Andrea Goldsmith
Abstract:
In decentralized optimization, multiple nodes in a network collaborate to minimize the sum of their local loss functions. The information exchange between nodes required for this task is often limited by network connectivity. We consider a setting in which communication between nodes is hindered by both (i) a finite rate-constraint on the signal transmitted by any node, and (ii) additive noise corrupting the signal received by any node. We propose a novel algorithm for this scenario: Decentralized Lazy Mirror Descent with Differential Exchanges (DLMD-DiffEx), which guarantees convergence of the local estimates to the optimal solution under the given communication constraints. A salient feature of DLMD-DiffEx is the introduction of additional proxy variables that are maintained by the nodes to account for the disagreement in their estimates due to channel noise and rate-constraints. Convergence to the optimal solution is attained by having nodes iteratively exchange these disagreement terms until consensus is achieved. In order to prevent noise accumulation during this exchange, DLMD-DiffEx relies on two sequences: one controlling the power of the transmitted signal, and the other determining the consensus rate. We provide clear insights on the design of these two sequences, which highlight the interplay between consensus rate and noise amplification. We investigate the performance of DLMD-DiffEx both from a theoretical perspective and through numerical evaluations.
Submitted 6 October, 2021; v1 submitted 21 October, 2020;
originally announced October 2020.
-
Speech To Semantics: Improve ASR and NLU Jointly via All-Neural Interfaces
Authors:
Milind Rao,
Anirudh Raju,
Pranav Dheram,
Bach Bui,
Ariya Rastrow
Abstract:
We consider the problem of spoken language understanding (SLU) of extracting natural language intents and associated slot arguments or named entities from speech that is primarily directed at voice assistants. Such a system subsumes both automatic speech recognition (ASR) as well as natural language understanding (NLU). An end-to-end joint SLU model can be built to a required specification opening up the opportunity to deploy on hardware constrained scenarios like devices enabling voice assistants to work offline, in a privacy preserving manner, whilst also reducing server costs.
We first present models that extract utterance intent directly from speech without intermediate text output. We then present a compositional model, which generates the transcript using the Listen Attend Spell ASR system and then extracts interpretation using a neural NLU model. Finally, we contrast these methods to a jointly trained end-to-end joint SLU model, consisting of ASR and NLU subsystems which are connected by a neural network based interface instead of text, that produces transcripts as well as NLU interpretation. We show that the jointly trained model shows improvements to ASR incorporating semantic information from NLU and also improves NLU by exposing it to ASR confusion encoded in the hidden layer.
Submitted 13 August, 2020;
originally announced August 2020.
-
Reconfigurable and Intelligent Ultra-Wideband Angular Sensing: Prototype Design and Validation
Authors:
Himani Joshi,
Sumit J. Darak,
Mohammad Alaee-Kerahroodi,
Bhavani Shankar Mysore Rama Rao
Abstract:
The emergence of beyond-licensed spectrum sharing in FR1 (0.45-6 GHz) and FR2 (24-52 GHz), along with multi-antenna narrow-beam based directional transmissions, demands wideband spectrum sensing in the temporal as well as spatial domains. We refer to it as ultra-wideband angular spectrum sensing (UWAS), and it consists of digitization followed by characterization of the wideband spectrum. In this paper, we design and develop a state-of-the-art UWAS prototype using USRPs and LabVIEW NXG for validation in a real radio environment. Since 5G is expected to co-exist with LTE, the transmitter generates the multi-directional multi-user wideband traffic via the LTE-specific single carrier frequency division multiple access (SC-FDMA) approach. At the receiver, the first step of wideband spectrum digitization is accomplished using a novel approach of integrating a sparse antenna array with reconfigurable sub-Nyquist sampling (SNS). The reconfigurable SNS allows the digitization of non-contiguous spectrum via low-rate analog-to-digital converters, but it needs intelligence to choose the frequency bands for digitization. We explore a multi-play multi-armed bandit based learning algorithm to embed this intelligence. Compared to previous works, the proposed characterization approach (frequency band status and direction-of-arrival estimation) does not need prior knowledge of the received signal distribution. The detailed experimental results for various spectrum statistics, power gains and antenna array arrangements, along with lower complexity, validate the functional correctness, superiority and feasibility of the proposed UWAS over state-of-the-art approaches.
Submitted 13 August, 2020;
originally announced August 2020.
-
Underlay Radar-Massive MIMO Spectrum Sharing: Modeling Fundamentals and Performance Analysis
Authors:
Raghunandan M. Rao,
Harpreet S. Dhillon,
Vuk Marojevic,
Jeffrey H. Reed
Abstract:
In this work, we study underlay radar-massive MIMO cellular coexistence in LoS/near-LoS channels, where both systems have 3D beamforming capabilities. Using mathematical tools from stochastic geometry, we derive an upper bound on the average interference power at the radar due to the 3D massive MIMO cellular downlink under the worst-case `cell-edge beamforming' conditions. To overcome the technical challenges imposed by asymmetric and arbitrarily large cells, we devise a novel construction in which each Poisson Voronoi (PV) cell is bounded by its circumcircle to bound the effect of the random cell shapes on average interference. Since this model is intractable for further analysis due to the correlation between adjacent PV cells' shapes and sizes, we propose a tractable nominal interference model, where we model each PV cell as a circular disk with an area equal to the average area of the typical cell. We quantify the gap in the average interference power between these two models and show that the upper bound is tight for realistic deployment parameters. We also compare them with a more practical but intractable MU-MIMO scheduling model to show that our worst-case interference models show the same trends and do not deviate significantly from realistic scheduler models. Under the nominal interference model, we characterize the interference distribution using the dominant interferer approximation by deriving the equi-interference contour expression when the typical receiver uses 3D beamforming. Finally, we use tractable expressions for the interference distribution to characterize radar's spatial probability of false alarm/detection in a quasi-static target tracking scenario. Our results reveal useful trends in the average interference as a function of the deployment parameters (BS density, exclusion zone radius, antenna height, transmit power of each BS, etc.).
Submitted 16 May, 2021; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Semi-Blind Post-Equalizer SINR Estimation and Dual CSI Feedback for Radar-Cellular Coexistence
Authors:
Raghunandan M. Rao,
Vuk Marojevic,
Jeffrey H. Reed
Abstract:
Current cellular systems use pilot-aided statistical-channel state information (S-CSI) estimation and limited feedback schemes to aid in link adaptation and scheduling decisions. However, in the presence of pulsed radar signals, pilot-aided S-CSI is inaccurate since interference statistics on pilot and non-pilot resources can be different. Moreover, the channel will be bimodal as a result of the periodic interference. In this paper, we propose a max-min heuristic to estimate the post-equalizer SINR in the case of non-pilot pulsed radar interference, and characterize its distribution as a function of noise variance and interference power. We observe that the proposed heuristic incurs low computational complexity, and is robust beyond a certain SINR threshold for different modulation schemes, especially for QPSK. This enables us to develop a comprehensive semi-blind framework to estimate the wideband SINR metric that is commonly used for S-CSI quantization in 3GPP Long-Term Evolution (LTE) and New Radio (NR) networks. Finally, we propose dual CSI feedback for practical radar-cellular spectrum sharing, to enable accurate CSI acquisition in the bimodal channel. We demonstrate significant improvements in throughput, block error rate and retransmission-induced latency for LTE-Advanced Pro when compared to conventional pilot-aided S-CSI estimation and limited feedback schemes.
Submitted 1 June, 2020;
originally announced June 2020.
-
Probability of Pilot Interference in Pulsed Radar-Cellular Coexistence: Fundamental Insights on Demodulation and Limited CSI Feedback
Authors:
Raghunandan M. Rao,
Vuk Marojevic,
Jeffrey H. Reed
Abstract:
This paper considers an underlay pulsed radar-cellular spectrum sharing scenario, where the cellular system uses pilot-aided demodulation, statistical channel state information (S-CSI) estimation and limited feedback schemes. Under a realistic system model, upper and lower bounds are derived on the probability that at least a specified number of pilot signals are interfered by a radar pulse train in a finite CSI estimation window. Exact probabilities are also derived for important special cases which reveal operational regimes where the lower bound is achieved. Using these results, this paper (a) provides insights on pilot interference-minimizing schemes for accurate coherent symbol demodulation, and (b) demonstrates that pilot-aided methods fail to accurately estimate S-CSI of the pulsed radar interference channel for a wide range of radar repetition intervals.
Submitted 30 April, 2020;
originally announced May 2020.
-
Analysis of Worst-Case Interference in Underlay Radar-Massive MIMO Spectrum Sharing Scenarios
Authors:
Raghunandan M. Rao,
Harpreet S. Dhillon,
Vuk Marojevic,
Jeffrey H. Reed
Abstract:
In this paper, we consider an underlay radar-massive MIMO spectrum sharing scenario in which massive MIMO base stations (BSs) are allowed to operate outside a circular exclusion zone centered at the radar. Modeling the locations of the massive MIMO BSs as a homogeneous Poisson point process (PPP), we derive an analytical expression for a tight upper bound on the average interference at the radar due to cellular transmissions. The technical novelty is in bounding the worst-case elevation angle for each massive MIMO BS for which we devise a novel construction based on the circumradius distribution of a typical Poisson-Voronoi (PV) cell. While these worst-case elevation angles are correlated for neighboring BSs due to the structure of the PV tessellation, it does not explicitly appear in our analysis because of our focus on the average interference. We also provide an estimate of the nominal average interference by approximating each cell as a circle with area equal to the average area of the typical cell. Using these results, we demonstrate that the gap between the two results remains approximately constant with respect to the exclusion zone radius. Our analysis reveals useful trends in average interference power, as a function of key deployment parameters such as radar/BS antenna heights, number of antenna elements per radar/BS, BS density, and exclusion zone radius.
Submitted 22 July, 2019;
originally announced July 2019.
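The nominal model in the abstract above replaces each cell by a circle whose area equals the average area of the typical cell. As a worked illustration, for a homogeneous PPP of base stations the mean area of the typical Poisson-Voronoi cell is the reciprocal of the density, which fixes the radius of the equal-area circle. The symbols $\lambda$ (BS density) and $r_{\mathrm{nom}}$ (nominal cell radius) are notation assumed here, not taken from the paper.

$$
\mathbb{E}[A_{\mathrm{cell}}] = \frac{1}{\lambda}
\quad\Longrightarrow\quad
\pi r_{\mathrm{nom}}^{2} = \frac{1}{\lambda}
\quad\Longrightarrow\quad
r_{\mathrm{nom}} = \frac{1}{\sqrt{\pi \lambda}}
$$

For example, a density of $\lambda = 1$ BS per km$^2$ gives $r_{\mathrm{nom}} = 1/\sqrt{\pi} \approx 0.564$ km.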
-
Planar Geometry and Image Recovery from Motion-Blur
Authors:
Kuldeep Purohit,
Subeesh Vasu,
M. Purnachandra Rao,
A. N. Rajagopalan
Abstract:
Existing works on motion deblurring either ignore the effects of depth-dependent blur or work with the assumption of a multi-layered scene wherein each layer is modeled in the form of fronto-parallel plane. In this work, we consider the case of 3D scenes with piecewise planar structure i.e., a scene that can be modeled as a combination of multiple planes with arbitrary orientations. We first propose an approach for estimation of normal of a planar scene from a single motion blurred observation. We then develop an algorithm for automatic recovery of number of planes, the parameters corresponding to each plane, and camera motion from a single motion blurred image of a multiplanar 3D scene. Finally, we propose a first-of-its-kind approach to recover the planar geometry and latent image of the scene by adopting an alternating minimization framework built on our findings. Experiments on synthetic and real data reveal that our proposed method achieves state-of-the-art results.
Submitted 6 February, 2022; v1 submitted 7 April, 2019;
originally announced April 2019.
-
Distributed Convex Optimization With Limited Communications
Authors:
Milind Rao,
Stefano Rini,
Andrea Goldsmith
Abstract:
In this paper, a distributed convex optimization algorithm, termed \emph{distributed coordinate dual averaging} (DCDA) algorithm, is proposed. The DCDA algorithm addresses the scenario of a large distributed optimization problem with limited communication among nodes in the network. Currently known distributed subgradient methods, such as the distributed dual averaging or the distributed alternating direction method of multipliers algorithms, assume that nodes can exchange messages of large cardinality. Such network communication capabilities are not valid in many scenarios of practical relevance. In the DCDA algorithm, on the other hand, communication of each coordinate of the optimization variable is restricted over time. For the proposed algorithm, we bound the rate of convergence under different communication protocols and network architectures. We also consider the extensions to the case of imperfect gradient knowledge and the case in which transmitted messages are corrupted by additive noise or are quantized. Relevant numerical simulations are also provided.
Submitted 29 October, 2018;
originally announced October 2018.
-
Measuring Hardware Impairments with Software-Defined Radios
Authors:
Vuk Marojevic,
Aditya V. Padaki,
Raghunandan M. Rao,
Jeffrey H. Reed
Abstract:
This Innovative Practice Full Paper introduces a novel tool for educating electrical engineering students about hardware impairments in wireless communications. A radio frequency (RF) front end is an essential part of a wireless transmitter or receiver. It features analog processing components and data converters which are driven by today's digital communication systems. Advancements in computing and software-defined radio (SDR) technology have enabled shaping waveforms in software and using experimental and easily accessible plug-and-play RF front ends for education, research and development. We use this same technology to teach nonlinear effects of RF front ends and their implications. It uses widely available RF instruments and components and SDR technology--well-established affordable hardware and free open source software--to teach students how to characterize the nonlinearity of RF receivers while providing hands-on experience with SDR tools. We present the hardware, software and procedures of our laboratory session that enable easy reproducibility in other classrooms. We discuss different forms of evaluating the suitability of the new class modules and conclude that it provides a valuable learning experience that bolsters the theory that is typically provided in lectures only.
Submitted 10 October, 2018;
originally announced October 2018.
-
FastTrack: Minimizing Stalls for CDN-based Over-the-top Video Streaming Systems
Authors:
Abubakr Alabbasi,
Vaneet Aggarwal,
Tian Lan,
Yu Xiang,
Moo-Ryong Ra,
Yih-Farn R. Chen
Abstract:
Traffic for internet video streaming has been rapidly increasing and is expected to increase further with higher definition videos and IoT applications, such as 360 degree videos and augmented virtual reality applications. While efficient management of heterogeneous cloud resources to optimize the quality of experience is important, existing work in this problem space often leaves out important factors. In this paper, we present a model describing a representative present-day system architecture for video streaming applications, typically composed of a centralized origin server and several CDN sites. Our model comprehensively considers the following factors: limited caching spaces at the CDN sites, allocation of CDN for a video request, choice of different ports from the CDN, and the central storage and bandwidth allocation. With the model, we focus on minimizing a performance metric, stall duration tail probability (SDTP), and present a novel, yet efficient, algorithm to solve the formulated optimization problem. The theoretical bounds with respect to the SDTP metric are also analyzed and presented. Our extensive simulation results demonstrate that the proposed algorithms can significantly improve the SDTP metric, compared to the baseline strategies. A small-scale video streaming system implementation in a real cloud environment further validates our results.
Submitted 30 June, 2018;
originally announced July 2018.
-
Energy Efficient Task Assignment in Virtualized Wireless Sensor Networks
Authors:
Vahid Maleki Raee,
Diala Naboulsi,
Roch Glitho
Abstract:
Wireless Sensor Networks (WSNs) are being used extensively today in various domains. However, they are traditionally deployed with applications embedded in them, which precludes their re-use for new applications. Nowadays, virtualization enables several applications on the same WSN by abstracting the physical resources (i.e. sensing capabilities) into logical ones. However, this comes at a cost, including an energy cost. It is therefore critical to ensure the efficient allocation of these resources. In this paper, we study the problem of assigning application sensing tasks to sensor devices in virtualized WSNs. Our goal is to minimize the overall energy consumption resulting from the assignment. We focus on the static version of the problem and formulate it using Integer Linear Programming (ILP), while accounting for sensor nodes' available energy and virtualization overhead. We solve the problem over different scenarios and compare the obtained solution to the case of a traditional WSN, i.e. one with no support for virtualization. Our results show that significant energy can be saved when tasks are appropriately assigned in a WSN that supports virtualization.
Submitted 7 May, 2018;
originally announced May 2018.
-
Characterization of maximum hands-off control
Authors:
Debasish Chatterjee,
Masaaki Nagahara,
Daniel Quevedo,
K. S. Mallikarjuna Rao
Abstract:
Maximum hands-off control aims to maximize the length of time over which zero actuator values are applied to a system when executing specified control tasks. To tackle such problems, recent literature has investigated optimal control problems which penalize the size of the support of the control function and thereby lead to desired sparsity properties. This article gives the exact set of necessary conditions for a maximum hands-off optimal control problem using an $L_0$-(semi)norm, and also provides sufficient conditions for the optimality of such controls. A numerical example illustrates that adopting an $L_0$ cost leads to a sparse control, whereas an $L_1$-relaxation in singular problems leads to a non-sparse solution.
Submitted 29 February, 2016;
originally announced February 2016.
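To make the two costs in the abstract above concrete: the $L_0$ cost of a control measures how long the actuator is non-zero (the size of its support), while the $L_1$ cost integrates its magnitude. The sketch below evaluates both on uniformly sampled signals. The function names, the sampling step, and the two example controls are assumptions chosen for illustration; the controls are not solutions of any problem in the paper and are not claimed to accomplish the same control task.

```python
def l0_cost(samples, dt, tol=1e-12):
    """Approximate L0 cost: total time during which the control is non-zero."""
    return dt * sum(1 for u in samples if abs(u) > tol)


def l1_cost(samples, dt):
    """Approximate L1 cost: integral of |u(t)| by a Riemann sum."""
    return dt * sum(abs(u) for u in samples)


dt = 0.01  # sampling step in seconds (assumed)

# Hypothetical bang-off-bang control on [0, 1]:
# +1 for 0.2 s, zero for 0.6 s, then -1 for 0.2 s.
sparse_u = [1.0] * 20 + [0.0] * 60 + [-1.0] * 20

# Hypothetical always-on control with half the amplitude.
dense_u = [0.5] * 50 + [-0.5] * 50

sparse_costs = (l0_cost(sparse_u, dt), l1_cost(sparse_u, dt))
dense_costs = (l0_cost(dense_u, dt), l1_cost(dense_u, dt))
```

Here the sparse control has the smaller $L_0$ cost (about 0.4 versus 1.0), while the $L_1$ costs (about 0.4 versus 0.5) are much closer, showing that the two measures rank controls differently in general.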