-
Joint User and Beam Selection in Millimeter Wave Networks
Authors:
Santosh Kumar Singh,
Satyabrata Sahu,
Ayushi Thawait,
Prasanna Chaporkar,
Gaurav S. Kasbekar
Abstract:
We study the problem of selecting a user equipment (UE) and a beam for each access point (AP) for concurrent transmissions in a millimeter wave (mmWave) network, such that the sum of weighted rates of UEs is maximized. We prove that this problem is NP-complete. We propose two algorithms -- Markov Chain Monte Carlo (MCMC) based and local interaction game (LIG) based UE and beam selection -- and pro…
▽ More
We study the problem of selecting a user equipment (UE) and a beam for each access point (AP) for concurrent transmissions in a millimeter wave (mmWave) network, such that the sum of weighted rates of UEs is maximized. We prove that this problem is NP-complete. We propose two algorithms -- Markov Chain Monte Carlo (MCMC) based and local interaction game (LIG) based UE and beam selection -- and prove that both of them asymptotically achieve the optimal solution. Also, we propose two fast greedy algorithms -- NGUB1 and NGUB2 -- for UE and beam selection. Through extensive simulations, we show that our proposed greedy algorithms outperform the most relevant algorithms proposed in prior work and perform close to the asymptotically optimal algorithms.
△ Less
Submitted 15 March, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
TaskMix: Data Augmentation for Meta-Learning of Spoken Intent Understanding
Authors:
Surya Kant Sahu
Abstract:
Meta-Learning has emerged as a research direction to better transfer knowledge from related tasks to unseen but related tasks. However, Meta-Learning requires many training tasks to learn representations that transfer well to unseen tasks; otherwise, it leads to overfitting, and the performance degenerates to worse than Multi-task Learning. We show that a state-of-the-art data augmentation method…
▽ More
Meta-Learning has emerged as a research direction to better transfer knowledge from related tasks to unseen but related tasks. However, Meta-Learning requires many training tasks to learn representations that transfer well to unseen tasks; otherwise, it leads to overfitting, and the performance degenerates to worse than Multi-task Learning. We show that a state-of-the-art data augmentation method worsens this problem of overfitting when the task diversity is low. We propose a simple method, TaskMix, which synthesizes new tasks by linearly interpolating existing tasks. We compare TaskMix against many baselines on an in-house multilingual intent classification dataset of N-Best ASR hypotheses derived from real-life human-machine telephony utterances and two datasets derived from MTOP. We show that TaskMix outperforms baselines, alleviates overfitting when task diversity is low, and does not degrade performance even when it is high.
△ Less
Submitted 25 September, 2022;
originally announced October 2022.
-
Audiomer: A Convolutional Transformer For Keyword Spotting
Authors:
Surya Kant Sahu,
Sai Mitheran,
Juhi Kamdar,
Meet Gandhi
Abstract:
Transformers have seen an unprecedented rise in Natural Language Processing and Computer Vision tasks. However, in audio tasks, they are either infeasible to train due to extremely large sequence length of audio waveforms or incur a performance penalty when trained on Fourier-based features. In this work, we introduce an architecture, Audiomer, where we combine 1D Residual Networks with Performer…
▽ More
Transformers have seen an unprecedented rise in Natural Language Processing and Computer Vision tasks. However, in audio tasks, they are either infeasible to train due to extremely large sequence length of audio waveforms or incur a performance penalty when trained on Fourier-based features. In this work, we introduce an architecture, Audiomer, where we combine 1D Residual Networks with Performer Attention to achieve state-of-the-art performance in keyword spotting with raw audio waveforms, outperforming all previous methods while being computationally cheaper and parameter-efficient. Additionally, our model has practical advantages for speech processing, such as inference on arbitrarily long audio clips owing to the absence of positional encoding. The code is available at https://github.com/The-Learning-Machines/Audiomer-PyTorch.
△ Less
Submitted 1 February, 2022; v1 submitted 21 September, 2021;
originally announced September 2021.
-
A One-Shot Learning Framework for Assessment of Fibrillar Collagen from Second Harmonic Generation Images of an Infarcted Myocardium
Authors:
Qun Liu,
Supratik Mukhopadhyay,
Maria Ximena Bastidas Rodriguez,
Xing Fu,
Sushant Sahu,
David Burk,
Manas Gartia
Abstract:
Myocardial infarction (MI) is a scientific term that refers to heart attack. In this study, we infer highly relevant second harmonic generation (SHG) cues from collagen fibers exhibiting highly non-centrosymmetric assembly together with two-photon excited cellular autofluorescence in infarcted mouse heart to quantitatively probe fibrosis, especially targeted at an early stage after MI. We present…
▽ More
Myocardial infarction (MI) is a scientific term that refers to heart attack. In this study, we infer highly relevant second harmonic generation (SHG) cues from collagen fibers exhibiting highly non-centrosymmetric assembly together with two-photon excited cellular autofluorescence in infarcted mouse heart to quantitatively probe fibrosis, especially targeted at an early stage after MI. We present a robust one-shot machine learning algorithm that enables determination of 2D assembly of collagen with high spatial resolution along with its structural arrangement in heart tissues post-MI with spectral specificity and sensitivity. Detection, evaluation, and precise quantification of fibrosis extent at early stage would guide one to develop treatment therapies that may prevent further progression and determine heart transplant needs for patient survival.
△ Less
Submitted 29 January, 2020; v1 submitted 23 January, 2020;
originally announced January 2020.
-
Modeling Feature Representations for Affective Speech using Generative Adversarial Networks
Authors:
Saurabh Sahu,
Rahul Gupta,
Carol Espy-Wilson
Abstract:
Emotion recognition is a classic field of research with a typical setup extracting features and feeding them through a classifier for prediction. On the other hand, generative models jointly capture the distributional relationship between emotions and the feature profiles. Relatively recently, Generative Adversarial Networks (GANs) have surfaced as a new class of generative models and have shown c…
▽ More
Emotion recognition is a classic field of research with a typical setup extracting features and feeding them through a classifier for prediction. On the other hand, generative models jointly capture the distributional relationship between emotions and the feature profiles. Relatively recently, Generative Adversarial Networks (GANs) have surfaced as a new class of generative models and have shown considerable success in modeling distributions in the fields of computer vision and natural language understanding. In this work, we experiment with variants of GAN architectures to generate feature vectors corresponding to an emotion in two ways: (i) A generator is trained with samples from a mixture prior. Each mixture component corresponds to an emotional class and can be sampled to generate features from the corresponding emotion. (ii) A one-hot vector corresponding to an emotion can be explicitly used to generate the features. We perform analysis on such models and also propose different metrics used to measure the performance of the GAN models in their ability to generate realistic synthetic samples. Apart from evaluation on a given dataset of interest, we perform a cross-corpus study where we study the utility of the synthetic samples as additional training data in low resource conditions.
△ Less
Submitted 31 October, 2019;
originally announced November 2019.
-
Blind Deblurring using Deep Learning: A Survey
Authors:
Siddhant Sahu,
Manoj Kumar Lenka,
Pankaj Kumar Sa
Abstract:
We inspect all the deep learning based solutions and provide holistic understanding of various architectures that have evolved over the past few years to solve blind deblurring. The introductory work used deep learning to estimate some features of the blur kernel and then moved onto predicting the blur kernel entirely, which converts the problem into non-blind deblurring. The recent state of the a…
▽ More
We inspect all the deep learning based solutions and provide holistic understanding of various architectures that have evolved over the past few years to solve blind deblurring. The introductory work used deep learning to estimate some features of the blur kernel and then moved onto predicting the blur kernel entirely, which converts the problem into non-blind deblurring. The recent state of the art techniques are end to end, i.e., they don't estimate the blur kernel rather try to estimate the latent sharp image directly from the blurred image. The benchmarking PSNR and SSIM values on standard datasets of GOPRO and Kohler using various architectures are also provided.
△ Less
Submitted 23 July, 2019;
originally announced July 2019.