-
Bias Neutralization Framework: Measuring Fairness in Large Language Models with Bias Intelligence Quotient (BiQ)
Authors:
Malur Narayan,
John Pasmore,
Elton Sampaio,
Vijay Raghavan,
Gabriella Waters
Abstract:
The burgeoning influence of Large Language Models (LLMs) in shaping public discourse and decision-making underscores the imperative to address inherent biases within these AI systems. In the wake of AI's expansive integration across sectors, addressing racial bias in LLMs has never been more critical. This paper introduces a novel framework called Comprehensive Bias Neutralization Framework (CBNF) which embodies an innovative approach to quantifying and mitigating biases within LLMs. Our framework combines the Large Language Model Bias Index (LLMBI) [Oketunji, A., Anas, M., Saina, D., (2023)] and Bias removaL with No Demographics (BLIND) [Orgad, H., Belinkov, Y. (2023)] methodologies to create a new metric called Bias Intelligence Quotient (BiQ), which detects, measures, and mitigates racial bias in LLMs without reliance on demographic annotations.
By introducing a new metric called BiQ that enhances LLMBI with additional fairness metrics, CBNF offers a multi-dimensional metric for bias assessment, underscoring the necessity of a nuanced approach to fairness in AI [Mehrabi et al., 2021]. This paper presents a detailed analysis of Latimer AI (a language model incrementally trained on black history and culture) in comparison to ChatGPT 3.5, illustrating Latimer AI's efficacy in detecting racial, cultural, and gender biases through targeted training and refined bias mitigation strategies [Latimer & Bender, 2023].
Submitted 28 April, 2024;
originally announced April 2024.
-
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Authors:
Yinghao Aaron Li,
Cong Han,
Vinay S. Raghavan,
Gavin Mischler,
Nima Mesgarani
Abstract:
In this paper, we present StyleTTS 2, a text-to-speech (TTS) model that leverages style diffusion and adversarial training with large speech language models (SLMs) to achieve human-level TTS synthesis. StyleTTS 2 differs from its predecessor by modeling styles as a latent random variable through diffusion models to generate the most suitable style for the text without requiring reference speech, achieving efficient latent diffusion while benefiting from the diverse speech synthesis offered by diffusion models. Furthermore, we employ large pre-trained SLMs, such as WavLM, as discriminators with our novel differentiable duration modeling for end-to-end training, resulting in improved speech naturalness. StyleTTS 2 surpasses human recordings on the single-speaker LJSpeech dataset and matches it on the multispeaker VCTK dataset as judged by native English speakers. Moreover, when trained on the LibriTTS dataset, our model outperforms previous publicly available models for zero-shot speaker adaptation. This work achieves the first human-level TTS on both single and multispeaker datasets, showcasing the potential of style diffusion and adversarial training with large SLMs. The audio demos and source code are available at https://styletts2.github.io/.
Submitted 19 November, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages
Authors:
Jay Gala,
Pranjal A. Chitale,
Raghavan AK,
Varun Gumma,
Sumanth Doddapaneni,
Aswanth Kumar,
Janki Nawale,
Anupama Sujatha,
Ratish Puduppully,
Vivek Raghavan,
Pratyush Kumar,
Mitesh M. Khapra,
Raj Dabre,
Anoop Kunchukuttan
Abstract:
India has a rich linguistic landscape with languages from 4 major language families spoken by over a billion people. The 22 of these languages listed in the Constitution of India (referred to as scheduled languages) are the focus of this work. Given the linguistic diversity, high-quality and accessible Machine Translation (MT) systems are essential in a country like India. Prior to this work, there was (i) no parallel training data spanning all 22 languages, (ii) no robust benchmarks covering all these languages and containing content relevant to India, and (iii) no existing translation models which support all the 22 scheduled languages of India. In this work, we aim to address this gap by focusing on the missing pieces required for enabling wide, easy, and open access to good machine translation systems for all 22 scheduled Indian languages. We identify four key areas of improvement: curating and creating larger training datasets, creating diverse and high-quality benchmarks, training multilingual models, and releasing models with open access. Our first contribution is the release of the Bharat Parallel Corpus Collection (BPCC), the largest publicly available parallel corpora for Indic languages. BPCC contains a total of 230M bitext pairs, of which a total of 126M were newly added, including 644K manually translated sentence pairs created as part of this work. Our second contribution is the release of the first n-way parallel benchmark covering all 22 Indian languages, featuring diverse domains, Indian-origin content, and source-original test sets. Next, we present IndicTrans2, the first model to support all 22 languages, surpassing existing models on multiple existing and new benchmarks created as a part of this work. Lastly, to promote accessibility and collaboration, we release our models and associated data with permissive licenses at https://github.com/AI4Bharat/IndicTrans2.
Submitted 20 December, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
SemEval 2023 Task 6: LegalEval - Understanding Legal Texts
Authors:
Ashutosh Modi,
Prathamesh Kalamkar,
Saurabh Karn,
Aman Tiwari,
Abhinav Joshi,
Sai Kiran Tanikella,
Shouvik Kumar Guha,
Sachin Malhan,
Vivek Raghavan
Abstract:
In populous countries, pending legal cases have been growing exponentially. There is a need for developing NLP-based techniques for processing and automatically understanding legal documents. To promote research in the area of Legal NLP, we organized the shared task LegalEval - Understanding Legal Texts at SemEval 2023. The LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units, Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document, and Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case along with providing an explanation for the prediction. In total, 26 teams (approx. 100 participants spread across the world) submitted system papers. In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for improvement. This paper describes the tasks and analyzes the techniques proposed by various teams.
Submitted 1 May, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
naplib-python: Neural Acoustic Data Processing and Analysis Tools in Python
Authors:
Gavin Mischler,
Vinay Raghavan,
Menoua Keshishian,
Nima Mesgarani
Abstract:
Recently, the computational neuroscience community has pushed for more transparent and reproducible methods across the field. In the interest of unifying the domain of auditory neuroscience, naplib-python provides an intuitive and general data structure for handling all neural recordings and stimuli, as well as extensive preprocessing, feature extraction, and analysis tools which operate on that data structure. The package removes many of the complications associated with this domain, such as varying trial durations and multi-modal stimuli, and provides a general-purpose analysis framework that interfaces easily with existing toolboxes used in the field.
Submitted 4 April, 2023;
originally announced April 2023.
-
Named Entity Recognition in Indian court judgments
Authors:
Prathamesh Kalamkar,
Astha Agarwal,
Aman Tiwari,
Smita Gupta,
Saurabh Karn,
Vivek Raghavan
Abstract:
Identification of named entities from legal texts is an essential building block for developing other legal Artificial Intelligence applications. Named entities in legal texts are slightly different and more fine-grained than commonly used named entities like Person, Organization, Location etc. In this paper, we introduce a new corpus of 46545 annotated legal named entities mapped to 14 legal entity types. A baseline model for extracting legal named entities from judgment text is also developed.
Submitted 7 November, 2022;
originally announced November 2022.
-
SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching
Authors:
Scott Gigante,
Varsha G. Raghavan,
Amanda M. Robinson,
Robert A. Barton,
Adeeb H. Rahman,
Drausin F. Wulsin,
Jacques Banchereau,
Noam Solomon,
Luis F. Voloch,
Fabian J. Theis
Abstract:
Translating the relevance of preclinical models ($\textit{in vitro}$, animal models, or organoids) to their relevance in humans presents an important challenge during drug development. The rising abundance of single-cell genomic data from human tumors and tissue offers a new opportunity to optimize model systems by their similarity to targeted human cell types in disease. In this work, we introduce SystemMatch to assess the fit of preclinical model systems to an $\textit{in sapiens}$ target population and to recommend experimental changes to further optimize these systems. We demonstrate this through an application to developing $\textit{in vitro}$ systems to model human tumor-derived suppressive macrophages. We show with held-out $\textit{in vivo}$ controls that our pipeline successfully ranks macrophage subpopulations by their biological similarity to the target population, and apply this analysis to rank a series of 18 $\textit{in vitro}$ macrophage systems perturbed with a variety of cytokine stimulations. We extend this analysis to predict the behavior of 66 $\textit{in silico}$ model systems generated using a perturbational autoencoder and apply a $k$-medoids approach to recommend a subset of these model systems for further experimental development in order to fully explore the space of possible perturbations. Through this use case, we demonstrate a novel approach to model system development to generate a system more similar to human biology.
Submitted 14 May, 2022;
originally announced May 2022.
-
Speaker Recognition in the Wild
Authors:
Neeraj Chhimwal,
Anirudh Gupta,
Rishabh Gaur,
Harveen Singh Chadha,
Priyanshi Shah,
Ankur Dhuriya,
Vivek Raghavan
Abstract:
In this paper, we propose a pipeline to find the number of speakers, as well as the audio belonging to each of these identified speakers, in a source of audio data where the number of speakers or the speaker labels are not known a priori. We used this approach as a part of our Data Preparation pipeline for Speech Recognition in Indic Languages (https://github.com/Open-Speech-EkStep/vakyansh-wav2vec2-experimentation). To understand and evaluate the accuracy of our proposed pipeline, we introduce two metrics: Cluster Purity and Cluster Uniqueness. Cluster Purity quantifies how "pure" a cluster is. Cluster Uniqueness, on the other hand, quantifies what percentage of clusters belong only to a single dominant speaker. We discuss these metrics further in the metrics section of the paper. Since we develop this utility to aid us in identifying data based on speaker IDs before training an Automatic Speech Recognition (ASR) model, and since most of this data takes considerable effort to scrape, we also conclude that 98% of data gets mapped to the top 80% of clusters in the chosen test set (computed after removing any clusters with fewer than a fixed number of utterances; we use a threshold of 30 to get rid of some very small clusters).
Submitted 5 May, 2022;
originally announced May 2022.
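The abstract above names Cluster Purity and a 30-utterance cluster threshold but does not give formulas. As a minimal sketch, assuming purity is the share of a cluster's utterances that belong to its most frequent speaker (a common definition, not confirmed by the abstract), the computation might look like the following. The function names and the toy data are hypothetical.

```python
from collections import Counter


def cluster_purity(speaker_labels):
    """Share of utterances in one cluster that belong to its dominant speaker.

    Assumed definition; the abstract only says the metric measures how
    "pure" a cluster is.
    """
    if not speaker_labels:
        raise ValueError("cluster must contain at least one utterance")
    dominant_count = Counter(speaker_labels).most_common(1)[0][1]
    return dominant_count / len(speaker_labels)


def drop_small_clusters(clusters, min_utterances=30):
    """Keep only clusters with at least `min_utterances` utterances.

    The default of 30 mirrors the threshold mentioned in the abstract.
    """
    return {cid: labels for cid, labels in clusters.items()
            if len(labels) >= min_utterances}


# Toy data (hypothetical): cluster id -> true speaker label per utterance.
clusters = {
    "c0": ["spk1"] * 38 + ["spk2"] * 2,
    "c1": ["spk3"] * 30,
    "c2": ["spk1"] * 5,  # below the threshold, so it is dropped
}
kept = drop_small_clusters(clusters)
purities = {cid: cluster_purity(labels) for cid, labels in kept.items()}
```

Computing purity requires ground-truth speaker labels for the evaluated utterances, so a sketch like this would only apply to a labelled test set.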
-
indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages
Authors:
Anirudh Gupta,
Neeraj Chhimwal,
Ankur Dhuriya,
Rishabh Gaur,
Priyanshi Shah,
Harveen Singh Chadha,
Vivek Raghavan
Abstract:
Automatic Speech Recognition (ASR) generates text which is most of the time devoid of any punctuation. Absence of punctuation in text can affect readability. Also, downstream NLP tasks such as sentiment analysis and machine translation greatly benefit from having punctuation and sentence boundary information. We present an approach for automatic punctuation of text using a pretrained IndicBERT model. Inverse text normalization is done by hand-writing weighted finite state transducer (WFST) grammars. We have developed this tool for 11 Indic languages, namely Hindi, Tamil, Telugu, Kannada, Gujarati, Marathi, Odia, Bengali, Assamese, Malayalam and Punjabi. All code and data are publicly available.
Submitted 31 March, 2022;
originally announced March 2022.
-
Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition
Authors:
Anirudh Gupta,
Rishabh Gaur,
Ankur Dhuriya,
Harveen Singh Chadha,
Neeraj Chhimwal,
Priyanshi Shah,
Vivek Raghavan
Abstract:
In recent years, end to end (E2E) automatic speech recognition (ASR) systems have achieved promising results given sufficient resources. Even for languages where not a lot of labelled data is available, state of the art E2E ASR systems can be developed by pretraining on huge amounts of data from high resource languages and finetuning on low resource languages. For a lot of low resource languages the current approaches are still challenging, since in many cases labelled data is not available in the open domain. In this paper we present an approach to create labelled data for Maithili, Bhojpuri and Dogri by utilising pseudo labels from text to speech for forced alignment. The created data was inspected for quality and then further used to train a transformer based wav2vec 2.0 ASR model. All data and models are available in the open domain.
Submitted 31 March, 2022;
originally announced March 2022.
-
Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?
Authors:
Priyanshi Shah,
Harveen Singh Chadha,
Anirudh Gupta,
Ankur Dhuriya,
Neeraj Chhimwal,
Rishabh Gaur,
Vivek Raghavan
Abstract:
We propose a new method for the calculation of error rates in Automatic Speech Recognition (ASR). This new metric is for languages that contain half characters and where the same character can be written in different forms. We implement our methodology in Hindi which is one of the main languages from Indic context and we think this approach is scalable to other similar languages containing a large character set. We call our metrics Alternate Word Error Rate (AWER) and Alternate Character Error Rate (ACER).
We train our ASR models using wav2vec 2.0 (Baevski et al., 2020) for Indic languages. Additionally, we use language models to improve our model performance. Our results show a significant improvement in analyzing the error rates at word and character level, and the interpretability of the ASR system is improved by up to 3% in AWER and 7% in ACER for Hindi. Our experiments suggest that in languages which have complex pronunciation, there are multiple ways of writing words without changing their meaning. In such cases AWER and ACER will be more useful than WER and CER as metrics. Further, we open source a new benchmarking dataset of 21 hours for Hindi with the new metric scripts.
Submitted 15 June, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
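For context on the baseline metrics named in the abstract above, here is a minimal sketch of the conventional Word Error Rate (WER) and Character Error Rate (CER), computed as edit distance divided by reference length. This is not the AWER/ACER method proposed in the paper, whose handling of alternate written forms is not described in the abstract. The function names are hypothetical, and the CER variant assumes that spaces count as characters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character-level edit distance divided by the reference length.

    Assumption: spaces are counted as characters, and each Unicode code
    point is treated as one character.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Because Python strings iterate over code points, a Devanagari conjunct or a consonant with a vowel sign spans several "characters" in this sketch. That kind of mismatch between written forms and scoring units is part of what motivates alternate metrics for scripts such as Hindi.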
-
Improving Speech Recognition for Indic Languages using Language Model
Authors:
Ankur Dhuriya,
Harveen Singh Chadha,
Anirudh Gupta,
Priyanshi Shah,
Neeraj Chhimwal,
Rishabh Gaur,
Vivek Raghavan
Abstract:
We study the effect of applying a language model (LM) on the output of Automatic Speech Recognition (ASR) systems for Indic languages. We fine-tune wav2vec 2.0 models for 18 Indic languages and adjust the results with language models trained on text derived from a variety of sources. Our findings demonstrate that the average Character Error Rate (CER) decreases by over 28% and the average Word Error Rate (WER) decreases by about 36% after decoding with LM. We show that a large LM may not provide a substantial improvement as compared to a diverse one. We also demonstrate that high quality transcriptions can be obtained on domain-specific data without retraining the ASR model and show results on the biomedical domain.
Submitted 15 June, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Code Switched and Code Mixed Speech Recognition for Indic languages
Authors:
Harveen Singh Chadha,
Priyanshi Shah,
Ankur Dhuriya,
Neeraj Chhimwal,
Anirudh Gupta,
Vivek Raghavan
Abstract:
Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language specific. Training a multilingual system for Indic languages is even tougher due to the lack of open source datasets and of results on different approaches. We compare the performance of an end to end multilingual speech recognition system to the performance of monolingual models conditioned on language identification (LID). The decoding information from a multilingual model is used for language identification and then combined with monolingual models to get an improvement of 50% WER across languages. We also propose a similar technique to solve the Code Switched problem and achieve a WER of 21.77 and 28.27 over Hindi-English and Bengali-English respectively. Our work discusses how transformer based ASR, especially wav2vec 2.0, can be applied in developing multilingual ASR and code switched ASR for Indic languages.
Submitted 13 June, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Vakyansh: ASR Toolkit for Low Resource Indic languages
Authors:
Harveen Singh Chadha,
Anirudh Gupta,
Priyanshi Shah,
Neeraj Chhimwal,
Ankur Dhuriya,
Rishabh Gaur,
Vivek Raghavan
Abstract:
We present Vakyansh, an end to end toolkit for Speech Recognition in Indic languages. India is home to almost 121 languages and around 125 crore speakers. Yet most of the languages are low resource in terms of data and pretrained models. Through Vakyansh, we introduce automatic data pipelines for data creation, model training, model evaluation and deployment. We create 14,000 hours of speech data in 23 Indic languages and train wav2vec 2.0 based pretrained models. These pretrained models are then finetuned to create state of the art speech recognition models for 18 Indic languages which are followed by language models and punctuation restoration models. We open source all these resources with a mission that this will inspire the speech community to develop speech first applications using our ASR models in Indic languages.
Submitted 15 June, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Corpus for Automatic Structuring of Legal Documents
Authors:
Prathamesh Kalamkar,
Aman Tiwari,
Astha Agarwal,
Saurabh Karn,
Smita Gupta,
Vivek Raghavan,
Ashutosh Modi
Abstract:
In populous countries, pending legal cases have been growing exponentially. There is a need for developing techniques for processing and organizing legal documents. In this paper, we introduce a new corpus for structuring legal documents. In particular, we introduce a corpus of legal judgment documents in English that are segmented into topical and coherent parts. Each of these parts is annotated with a label coming from a list of pre-defined Rhetorical Roles. We develop baseline models for automatically predicting rhetorical roles in a legal document based on the annotated corpus. Further, we show the application of rhetorical roles to improve performance on the tasks of summarization and legal judgment prediction. We release the corpus and baseline model code along with the paper.
Submitted 19 September, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Model-Agnostic Hybrid Numerical Weather Prediction and Machine Learning Paradigm for Solar Forecasting in the Tropics
Authors:
Nigel Yuan Yun Ng,
Harish Gopalan,
Venugopalan S. G. Raghavan,
Chin Chun Ooi
Abstract:
Numerical weather prediction (NWP) and machine learning (ML) methods are popular for solar forecasting. However, NWP models have multiple possible physical parameterizations, which requires site-specific NWP optimization. This is further complicated when regional NWP models are used with global climate models with different possible parameterizations. In this study, an alternative approach is proposed and evaluated for four radiation models. The Weather Research and Forecasting (WRF) model is run in both global and regional mode to provide an estimate for solar irradiance. This estimate is then post-processed using ML to provide a final prediction. Normalized root-mean-square error from WRF is reduced by up to 40-50% with this ML error correction model. Results obtained using the CAM, GFDL, New Goddard and RRTMG radiation models were comparable after this correction, negating the need for WRF parameterization tuning. Other models incorporating nearby locations and sensor data are also evaluated, with the latter being particularly promising.
Submitted 9 December, 2021;
originally announced December 2021.
-
CLSRIL-23: Cross Lingual Speech Representations for Indic Languages
Authors:
Anirudh Gupta,
Harveen Singh Chadha,
Priyanshi Shah,
Neeraj Chhimwal,
Ankur Dhuriya,
Rishabh Gaur,
Vivek Raghavan
Abstract:
We present CLSRIL-23, a self supervised learning based audio pre-trained model which learns cross lingual speech representations from raw audio across 23 Indic languages. It is built on top of wav2vec 2.0, which is trained by solving a contrastive task over masked latent speech representations and jointly learns the quantization of latents shared across all languages. We compare the language wise loss during pretraining to compare the effects of monolingual and multilingual pretraining. Performance on some downstream fine-tuning tasks for speech recognition is also compared, and our experiments show that multilingual pretraining outperforms monolingual training, both in terms of learning speech representations which encode the phonetic similarity of languages and in terms of performance on downstream tasks. A decrease of 5% is observed in WER and 9.5% in CER when a multilingual pretrained model is used for finetuning in Hindi. All the code and models are also open sourced. CLSRIL-23 is a model trained on 23 languages and almost 10,000 hours of audio data to facilitate research in speech recognition for Indic languages. We hope that new state of the art systems will be created using the self supervised approach, especially for low resource Indic languages.
Submitted 13 January, 2022; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Mitigating Hand Blockage with Non-Directional Beamforming Codebooks
Authors:
Vasanthan Raghavan,
Ricardo A. Motos,
M. Ali Tassoudji,
Yu-Chin Ou,
Ozge H. Koymen,
Junyi Li
Abstract:
Hand blockage leads to significant performance impairments at millimeter wave carrier frequencies. A number of prior works have characterized the loss in signal strength with the hand using studies with horn antennas and form-factor user equipments (UEs). However, the impact of the hand on the effective phase response seen by the antenna elements has not been studied so far. Towards this goal, we consider a measurement framework that uses a hand phantom holding the UE in relaxed positions reflective of talk mode, watching videos, and playing games. We first study the impact of blockage on a directional beam steering codebook. The tight phase relationship across antenna elements needed to steer beams leads to a significant performance degradation as the hand surface can distort the observed amplitudes and phases across the antenna elements, which cannot be matched by this codebook. To overcome this loss, we propose a non-directional beamforming codebook made of amplitudes and/or quantized phases with both these quantities estimated as necessary. Theoretical as well as numerical studies show that the proposed codebook can de-randomize the phase distortions induced by the hand and coherently combine the energy across antenna elements and thus help in mitigating hand blockage losses.
Submitted 13 April, 2021;
originally announced April 2021.
-
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
Authors:
Gowtham Ramesh,
Sumanth Doddapaneni,
Aravinth Bheemaraj,
Mayank Jobanputra,
Raghavan AK,
Ajitesh Sharma,
Sujit Sahoo,
Harshita Diddee,
Mahalakshmi J,
Divyanshu Kakwani,
Navneet Kumar,
Aswin Pradeep,
Srihari Nagaraj,
Kumar Deepak,
Vivek Raghavan,
Anoop Kunchukuttan,
Pratyush Kumar,
Mitesh Shantadevi Khapra
Abstract:
We present Samanantar, the largest publicly available parallel corpora collection for Indic languages. The collection contains a total of 49.7 million sentence pairs between English and 11 Indic languages (from two language families). Specifically, we compile 12.4 million sentence pairs from existing, publicly-available parallel corpora, and additionally mine 37.4 million sentence pairs from the web, resulting in a 4x increase. We mine the parallel sentences from the web by combining many corpora, tools, and methods: (a) web-crawled monolingual corpora, (b) document OCR for extracting sentences from scanned documents, (c) multilingual representation models for aligning sentences, and (d) approximate nearest neighbor search for searching in a large collection of sentences. Human evaluation of samples from the newly mined corpora validate the high quality of the parallel sentences across 11 languages. Further, we extract 83.4 million sentence pairs between all 55 Indic language pairs from the English-centric parallel corpus using English as the pivot language. We trained multilingual NMT models spanning all these languages on Samanantar, which outperform existing models and baselines on publicly available benchmarks, such as FLORES, establishing the utility of Samanantar. Our data and models are available publicly at https://ai4bharat.iitm.ac.in/samanantar and we hope they will help advance research in NMT and multilingual NLP for Indic languages.
Submitted 12 June, 2023; v1 submitted 12 April, 2021;
originally announced April 2021.
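The pivot step mentioned in the abstract above (pairing two Indic languages through a shared English sentence) can be illustrated with a minimal sketch. The function name `pivot_pairs` and the toy data are hypothetical and are not taken from the Samanantar code; exact string matching on the English side is an assumption made for simplicity. The figure 55 is simply the number of unordered pairs among 11 languages.

```python
from collections import defaultdict
from math import comb


def pivot_pairs(en_x, en_y):
    """Join two English-centric corpora on identical English sentences.

    en_x, en_y: lists of (english, translation) tuples.
    Returns a list of (x_sentence, y_sentence) pairs.
    """
    by_english = defaultdict(list)
    for english, x_sentence in en_x:
        by_english[english].append(x_sentence)
    pairs = []
    for english, y_sentence in en_y:
        # Every X translation of this English sentence pairs with y_sentence.
        for x_sentence in by_english.get(english, []):
            pairs.append((x_sentence, y_sentence))
    return pairs


# Toy example; the placeholder strings stand in for real sentences.
en_hi = [("good morning", "hi_1"), ("thank you", "hi_2")]
en_ta = [("thank you", "ta_1"), ("see you soon", "ta_2")]
print(pivot_pairs(en_hi, en_ta))  # [('hi_2', 'ta_1')]
print(comb(11, 2))  # 55 unordered language pairs among 11 languages
```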
-
Greenplum: A Hybrid Database for Transactional and Analytical Workloads
Authors:
Zhenghua Lyu,
Huan Hubert Zhang,
Gang Xiong,
Haozhou Wang,
Gang Guo,
Jinbao Chen,
Asim Praveen,
Yu Yang,
Xiaoming Gao,
Ashwin Agrawal,
Alexandra Wang,
Wen Lin,
Junfeng Yang,
Hao Wu,
Xiaoliang Li,
Feng Guo,
Jiang Wu,
Jesse Zhang,
Venkatesh Raghavan
Abstract:
Demand for enterprise data warehouse solutions to support real-time Online Transaction Processing (OLTP) queries as well as long-running Online Analytical Processing (OLAP) workloads is growing. Greenplum database is traditionally known as an OLAP data warehouse system with limited ability to process OLTP workloads. In this paper, we augment Greenplum into a hybrid system to serve both OLTP and OLAP workloads. The challenge we address here is to achieve this goal while maintaining the ACID properties with minimal performance overhead. In this effort, we identify the engineering and performance bottlenecks such as the under-performing restrictive locking and the two-phase commit protocol. Next we solve the resource contention issues between transactional and analytical queries. We propose a global deadlock detector to increase the concurrency of query processing. When transactions that update data are guaranteed to reside on exactly one segment we introduce one-phase commit to speed up query processing. Our resource group model introduces the capability to separate OLAP and OLTP workloads into more suitable query processing mode. Our experimental evaluation on the TPC-B and CH-benCHmark benchmarks demonstrates the effectiveness of our approach in boosting the OLTP performance without sacrificing the OLAP performance.
Submitted 13 May, 2021; v1 submitted 19 March, 2021;
originally announced March 2021.
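The one-phase commit condition stated in the abstract above (updates confined to exactly one segment) can be sketched as a simple dispatch rule. This is an illustrative assumption only: the function name, the string labels, and the separate read-only case are hypothetical and do not reflect Greenplum's actual interfaces.

```python
def choose_commit_protocol(written_segments):
    """Pick a commit protocol from the set of segment ids a transaction wrote to.

    written_segments: iterable of segment identifiers touched by writes.
    The "read-only" label is an assumption added for completeness.
    """
    segments = set(written_segments)
    if not segments:
        return "read-only"
    if len(segments) == 1:
        # All updates live on one segment, so a single commit round suffices.
        return "one-phase"
    # Updates span several segments, so atomicity needs two-phase commit.
    return "two-phase"


print(choose_commit_protocol([3]))        # one-phase
print(choose_commit_protocol([1, 2, 2]))  # two-phase
```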
-
Device-aware inference operations in SONOS nonvolatile memory arrays
Authors:
Christopher H. Bennett,
T. Patrick Xiao,
Ryan Dellana,
Vineet Agrawal,
Ben Feinberg,
Venkatraman Prabhakar,
Krishnaswamy Ramkumar,
Long Hinh,
Swatilekha Saha,
Vijay Raghavan,
Ramesh Chettuvetty,
Sapan Agarwal,
Matthew J. Marinella
Abstract:
Non-volatile memory arrays can deploy pre-trained neural network models for edge inference. However, these systems are affected by device-level noise and retention issues. Here, we examine damage caused by these effects, introduce a mitigation strategy, and demonstrate its use in a fabricated array of SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) devices. On MNIST, fashion-MNIST, and CIFAR-10 tasks, our approach increases resilience to synaptic noise and drift. We also show that strong performance can be realized with ADCs of 5-8 bits of precision.
Submitted 2 April, 2020;
originally announced April 2020.
-
Hand and Body Blockage Measurements with Form-Factor User Equipment at 28 GHz
Authors:
Vasanthan Raghavan,
Sonsay Noimanivone,
Sung Kil Rho,
Bernie Farin,
Patrick Connor,
Ricardo A. Motos,
Yu-Chin Ou,
Kobi Ravid,
M. Ali Tassoudji,
Ozge H. Koymen,
Junyi Li
Abstract:
Blockage by the human hand/body is an important impairment in realizing practical millimeter wave wireless systems. Prior works on blockage modeling are either based on theoretical studies of double knife edge diffraction or its modifications, high-frequency simulations of electromagnetic effects, or measurements with experimental millimeter wave prototypes. While such studies are useful, they do not capture the form-factor constraints of user equipments (UEs). In this work, we study the impact of hand/body blockage with a UE at $28$ GHz built on Qualcomm's millimeter wave modem, antenna modules and beamforming solutions. We report five exhaustive and controlled studies with different types of hand holdings/grips, antenna types, and with directional/narrow beams. For both hard as well as loose hand grips, we report considerably lower blockage loss estimates than prior works. Critical in estimating the loss is the definition of a "region of interest" (RoI) around the UE where the impact of the hand/body is seen. Towards this goal, we define a RoI that includes the spatial area where significant energy is seen in either the no blockage or blockage modes. Our studies show that significant spatial area coverage improvement can be seen with loose hand grip due to hand reflections.
Submitted 8 December, 2019;
originally announced December 2019.
-
Overview of Guidance, Navigation and Control System of the TeamIndus lunar lander
Authors:
Vishesh Vatsal,
C. Barath,
J. Yogeshwaran,
Deepana Gandhi,
Chhavilata Sahu,
Karthic Balasubramanian,
Shyam Mohan,
Midhun S. Menon,
P. Natarajan,
Vivek Raghavan
Abstract:
TeamIndus' lunar logistics vision includes multiple lunar missions to meet requirements of science, commercial and efforts towards global exploration. The first mission is slated for launch in 2020. The prime objective is to demonstrate autonomous precision lunar landing, and Surface Exploration Rover to collect data on the vicinity of the landing site. TeamIndus has developed various technologies towards lowering the access barrier to the lunar surface. This paper shall provide an overview of design of lander GNC system. The design of the GNC system has been described after concluding studies on sensor and actuator configurations. Frugal design approach is followed in the selection of GNC hardware. The paper describes the constraints for the orbital maneuvers and the lunar descent strategy. Various aspects of the GNC design of autonomous lunar descent maneuver: timeline of events, guidance, inertial and optical terrain-relative navigation schemes are described. The GNC software description focuses on system architecture, modes of operation, and core elements of the GNC software. The GNC algorithms have been tested using Monte-Carlo simulations and Processor-in-Loop runs. The paper concludes with a summary of key risk-mitigation studies for soft landing.
Submitted 25 July, 2019;
originally announced July 2019.
-
Evolution of Physical-Layer Communications Research in the Post-5G Era
Authors:
Vasanthan Raghavan,
Junyi Li
Abstract:
The evolving Fifth Generation New Radio (5G-NR) cellular standardization efforts at the Third Generation Partnership Project (3GPP) bring into focus a number of questions on relevant research problems in physical-layer communications for study by both academia and industry. To address these questions, we show that the peak download data rates for both WiFi and cellular systems have been scaling exponentially with time over the last twenty-five years. While keeping up with the historic cellular trends will be possible in the near-term with a modest bandwidth and hardware complexity expansion, even a reasonable stretching of this road-map into the far future would require significant bandwidth accretion, perhaps possible at the millimeter wave, sub-millimeter wave, or Terahertz (THz) regimes. The consequent increase in focus on systems at higher carrier frequencies necessitates a paradigm shift from the reuse of over-simplified (yet mathematically elegant) models, often inherited from sub-6 GHz systems, to a more holistic view where real measurements guide, motivate and refine the building of relevant but possibly complicated models, solution space(s), and good solutions. To motivate the need for this shift, we illustrate how the traditional abstraction fails to correctly estimate the delay spread of millimeter wave wireless channels and hand blockage losses at higher carrier frequencies. We conclude this paper with a broad set of implications for future research prospects at the physical-layer including key use-cases, possible research policy initiatives, and structural changes needed in telecommunications departments at universities.
Submitted 3 January, 2019;
originally announced January 2019.
-
Antenna Placement and Performance Tradeoffs with Hand Blockage in Millimeter Wave Systems
Authors:
Vasanthan Raghavan,
Mei-Li Chi,
M. Ali Tassoudji,
Ozge H. Koymen,
Junyi Li
Abstract:
The ongoing commercial deployment of millimeter wave systems brings into focus a number of practical issues in form factor user equipment (UE) design. With wavelengths becoming smaller, antenna gain patterns becoming directional, and link budgets critically dependent on beamforming, it becomes imperative to use a number of antenna modules at different locations of the UE for good performance. While more antennas/modules can enhance beamforming array gains, it comes with the tradeoff of higher component cost, power consumption of the associated radio frequency circuitry, and a beam management overhead in learning the appropriate beam weights. Thus, the goal of a good UE design is to provide robust spherical coverage corresponding to good array gains over the entire sphere around the UE with a low beam management overhead, complexity, and cost. The scope of this paper is to study the implications of two popular commercial millimeter wave UE designs (a face and an edge design) on spherical coverage. We show that analog beam codebooks can result in good performance for both the designs, and the edge design provides a better tradeoff in terms of robust performance (with hand blockage), beam management overhead, implementation complexity from an antenna placement standpoint and cost.
Submitted 3 January, 2019;
originally announced January 2019.
-
Channel Reconstruction-Based Hybrid Precoding for Millimeter Wave Multi-User MIMO Systems
Authors:
Miguel R. Castellanos,
Vasanthan Raghavan,
Jung H. Ryu,
Ozge H. Koymen,
Junyi Li,
David J. Love,
Borja Peleato
Abstract:
The focus of this paper is on multi-user MIMO transmissions for millimeter wave systems with a hybrid precoding architecture at the base-station. To enable multi-user transmissions, the base-station uses a cell-specific codebook of beamforming vectors over an initial beam alignment phase. Each user uses a user-specific codebook of beamforming vectors to learn the top-P (where P >= 1) beam pairs in terms of the observed SNR in a single-user setting. The top-P beam indices along with their SNRs are fed back from each user and the base-station leverages this information to generate beam weights for simultaneous transmissions. A typical method to generate the beam weights is to use only the best beam for each user and either steer energy along this beam, or to utilize this information to reduce multi-user interference. The other beams are used as fall back options to address blockage or mobility. Such an approach completely discards information learned about the channel condition(s) even though each user feeds back this information. With this background, this work develops an advanced directional precoding structure for simultaneous transmissions at the cost of an additional marginal feedback overhead. This construction relies on three main innovations: 1) Additional feedback to allow the base-station to reconstruct a rank-P approximation of the channel matrix between it and each user, 2) A zeroforcing structure that leverages this information to combat multi-user interference by remaining agnostic of the receiver beam knowledge in the precoder design, and 3) A hybrid precoding architecture that allows both amplitude and phase control at low-complexity and cost to allow the implementation of the zeroforcing structure. Numerical studies show that the proposed scheme results in a significant sum rate performance improvement over naive schemes even with a coarse initial beam alignment codebook.
Submitted 14 February, 2018;
originally announced February 2018.
-
Deep Multi-view Learning to Rank
Authors:
Guanqun Cao,
Alexandros Iosifidis,
Moncef Gabbouj,
Vijay Raghavan,
Raju Gottumukkala
Abstract:
We study the problem of learning to rank from multiple information sources. Though multi-view learning and learning to rank have been studied extensively leading to a wide range of applications, multi-view learning to rank as a synergy of both topics has received little attention. The aim of the paper is to propose a composite ranking method while keeping a close correlation with the individual rankings simultaneously. We present a generic framework for multi-view subspace learning to rank (MvSL2R), and two novel solutions are introduced under the framework. The first solution captures information of feature mappings from within each view as well as across views using autoencoder-like networks. Novel feature embedding methods are formulated in the optimization of multi-view unsupervised and discriminant autoencoders. Moreover, we introduce an end-to-end solution to learning towards both the joint ranking objective and the individual rankings. The proposed solution enhances the joint ranking with minimum view-specific ranking loss, so that it can achieve the maximum global view agreements in a single optimization process. The proposed method is evaluated on three different ranking problems, i.e. university ranking, multi-view lingual text ranking and image data ranking, providing superior results compared to related methods.
Submitted 23 September, 2019; v1 submitted 31 January, 2018;
originally announced January 2018.
-
Statistical Blockage Modeling and Robustness of Beamforming in Millimeter Wave Systems
Authors:
Vasanthan Raghavan,
Lida Akhoondzadeh-asl,
Vladimir Podshivalov,
Joakim Hulten,
M. Ali Tassoudji,
Ozge Hizir Koymen,
Ashwin Sampath,
Junyi Li
Abstract:
There has been a growing interest in the commercialization of millimeter wave (mmW) technology as a part of the Fifth-Generation New Radio (5G-NR) wireless standardization efforts. In this direction, many sets of independent measurement campaigns show that wireless propagation at mmW carrier frequencies is only marginally worse than propagation at sub-6 GHz carrier frequencies for small-cell coverage --- one of the most important use-cases for 5G-NR. On the other hand, the biggest determinants of viability of mmW systems in practice are penetration and blockage of mmW signals through different materials in the scattering environment. With this background, the focus of this paper is on understanding the impact of blockage of mmW signals and reduced spatial coverage due to penetration through the human hand, body, vehicles, etc. Leveraging measurements with a 28 GHz mmW experimental prototype and electromagnetic simulation studies, we first propose statistical blockage models to capture the impact of the hand, human body and vehicles. We then study the time-scales at which mmW signals are disrupted by blockage (hand and human body). Our results show that these events can be attributed to physical movements and the time-scales corresponding to blockage are hence on the order of a few 100 ms or more. Building on this fundamental understanding, we finally consider the broader question of robustness of mmW beamforming to handle blockage. Network densification, subarray switching in a user equipment (UE) designed with multiple subarrays, fall back mechanisms such as codebook enhancements and switching to legacy carriers in non-standalone deployments, etc. can address blockage before it leads to a deleterious impact on the mmW link margin.
Submitted 10 January, 2018;
originally announced January 2018.
-
Millimeter Wave MIMO Prototype: Measurements and Experimental Results
Authors:
Vasanthan Raghavan,
Andrzej Partyka,
Ashwin Sampath,
Sundar Subramanian,
Ozge Hizir Koymen,
Kobi Ravid,
Juergen Cezanne,
Kiran Mukkavilli,
Junyi Li
Abstract:
Millimeter-wave multi-input multi-output (mm-Wave MIMO) systems are one of the candidate schemes for 5G wireless standardization efforts. In this context, the main contributions of this article are three-fold. 1) We describe parallel sets of measurements at identical transmit-receive location pairs with 2.9, 29 and 61 GHz carrier frequencies in indoor office, shopping mall, and outdoor settings. These measurements provide insights on propagation, blockage and material penetration losses, and the key elements necessary in system design to make mm-Wave systems viable in practice. 2) One of these elements is hybrid beamforming necessary for better link margins by reaping the array gain with large antenna dimensions. From the class of fully-flexible hybrid beamformers, we describe a robust class of directional beamformers towards meeting the high data-rate requirements of mm-Wave systems. 3) Leveraging these design insights, we then describe an experimental prototype system at 28 GHz that realizes high data-rates on both the downlink and uplink and robustly maintains these rates in outdoor and indoor mobility scenarios. In addition to maintaining large signal constellation sizes in spite of radio frequency challenges, this prototype leverages the directional nature of the mm-Wave channel to perform seamless beam switching and handover across mm-Wave base-stations thereby overcoming the path losses in non-line-of-sight links and blockages encountered at mm-Wave frequencies.
Submitted 25 October, 2017;
originally announced October 2017.
-
Millimeter Wave Channel Measurements and Implications for PHY Layer Design
Authors:
Vasanthan Raghavan,
Andrzej Partyka,
Lida Akhoondzadehasl,
Ali Tassoudji,
Ozge Koymen,
John Sanelli
Abstract:
There has been an increasing interest in the millimeter wave (mmW) frequency regime in the design of next-generation wireless systems. The focus of this work is on understanding mmW channel properties that have an important bearing on the feasibility of mmW systems in practice and have a significant impact on physical (PHY) layer design. In this direction, simultaneous channel sounding measurements at 2.9, 29 and 61 GHz are performed at a number of transmit-receive location pairs in indoor office, shopping mall and outdoor environments. Based on these measurements, this paper first studies large-scale properties such as path loss and delay spread across different carrier frequencies in these scenarios. Towards the goal of understanding the feasibility of outdoor-to-indoor coverage, material measurements corresponding to mmW reflection and penetration are studied and significant notches in signal reception spread over a few GHz are reported. Finally, implications of these measurements on system design are discussed and multiple solutions are proposed to overcome these impairments.
Submitted 16 September, 2017;
originally announced September 2017.
-
Multiple Kernel Learning and Automatic Subspace Relevance Determination for High-dimensional Neuroimaging Data
Authors:
Murat Seckin Ayhan,
Vijay Raghavan,
Alzheimer's disease Neuroimaging Initiative
Abstract:
Alzheimer's disease is a major cause of dementia. Its diagnosis requires accurate biomarkers that are sensitive to disease stages. In this respect, we regard probabilistic classification as a method of designing a probabilistic biomarker for disease staging. Probabilistic biomarkers naturally support the interpretation of decisions and evaluation of uncertainty associated with them. In this paper, we obtain probabilistic biomarkers via Gaussian Processes. Gaussian Processes enable probabilistic kernel machines that offer flexible means to accomplish Multiple Kernel Learning. Exploiting this flexibility, we propose a new variation of Automatic Relevance Determination and tackle the challenges of high dimensionality through multiple kernels. Our research results demonstrate that the Gaussian Process models are competitive with or better than the well-known Support Vector Machine in terms of classification performance even in the cases of single kernel learning. Extending the basic scheme towards the Multiple Kernel Learning, we improve the efficacy of the Gaussian Process models and their interpretability in terms of the known anatomical correlates of the disease. For instance, the disease pathology starts in and around the hippocampus and entorhinal cortex. Through the use of Gaussian Processes and Multiple Kernel Learning, we have automatically and efficiently determined those portions of neuroimaging data. In addition to their interpretability, our Gaussian Process models are competitive with recent deep learning solutions under similar settings.
Submitted 2 June, 2017;
originally announced June 2017.
-
Comparative Performance Analysis of the Cumulative Sum Chart and the Shiryaev-Roberts Procedure for Detecting Changes in Autocorrelated Data
Authors:
Aleksey S. Polunchenko,
Vasanthan Raghavan
Abstract:
We consider the problem of quickest change-point detection where the observations form a first-order autoregressive (AR) process driven by temporally independent standard Gaussian noise. Subject to possible change are both the drift of the AR(1) process ($μ$) as well as its correlation coefficient ($λ$), both known. The change is abrupt and persistent, and is of known magnitude, with $\vertλ\vert<1$ throughout. For this scenario, we carry out a comparative performance analysis of the popular Cumulative Sum (CUSUM) chart and its less well-known but worthy competitor -- the Shiryaev-Roberts (SR) procedure. Specifically, the performance is measured through Pollak's Supremum (conditional) Average Delay to Detection (SADD) constrained to a pre-specified level of the Average Run Length (ARL) to false alarm. Particular attention is drawn to the sensitivity of each procedure's SADD and ARL with respect to the value of $λ$ before and after the change. The performance is studied through the solution of the respective integral renewal equations obtained via Monte Carlo simulations. The simulations are designed to estimate the sought performance metrics in an unbiased and asymptotically strongly consistent manner, and to within a prescribed proportional closeness (also asymptotically). Our extensive numerical studies suggest that both the CUSUM chart and the SR procedure are asymptotically second-order optimal, even though the CUSUM chart is found to be slightly better than the SR procedure, irrespective of the model parameters. Moreover, the existence of a worst-case post-change correlation parameter corresponding to the poorest detectability of the change for a given ARL to false alarm is established as well. To the best of our knowledge, this is the first time the performance of the SR procedure is studied for autocorrelated data.
Submitted 2 June, 2017;
originally announced June 2017.
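The two detection statistics compared in the abstract above follow standard recursions, which can be sketched for an AR(1) model. The parameterization assumed here, X_n = μ + λ X_{n-1} + ε_n with standard Gaussian ε_n, is one plausible reading of the abstract and is not confirmed by it; the function names and the toy inputs are hypothetical. This sketch only computes the statistics and does not reproduce the paper's integral-equation or Monte Carlo analysis.

```python
import math


def log_lr(x, x_prev, pre, post):
    """Log-likelihood ratio (post-change vs pre-change) of x given x_prev.

    pre and post are (mu, lam) tuples; the noise is assumed N(0, 1).
    """
    mu0, lam0 = pre
    mu1, lam1 = post
    r0 = x - (mu0 + lam0 * x_prev)  # residual under the pre-change model
    r1 = x - (mu1 + lam1 * x_prev)  # residual under the post-change model
    return 0.5 * (r0 * r0 - r1 * r1)


def run_detectors(xs, pre, post, x0=0.0):
    """Return the CUSUM and Shiryaev-Roberts statistic paths for xs."""
    w, r = 0.0, 0.0
    cusum, sr = [], []
    x_prev = x0
    for x in xs:
        llr = log_lr(x, x_prev, pre, post)
        w = max(0.0, w + llr)          # CUSUM: W_n = max(0, W_{n-1} + llr_n)
        r = (1.0 + r) * math.exp(llr)  # SR: R_n = (1 + R_{n-1}) * LR_n
        cusum.append(w)
        sr.append(r)
        x_prev = x
    return cusum, sr


def first_alarm(path, threshold):
    """Index (1-based) of the first statistic at or above threshold, else None."""
    for n, value in enumerate(path, start=1):
        if value >= threshold:
            return n
    return None


# Toy check with lam = 0 before and after, and a mean shift from 0 to 1,
# in which case the log-likelihood ratio reduces to x - 0.5.
cusum, sr = run_detectors([0.5, 1.5], pre=(0.0, 0.0), post=(1.0, 0.0))
print(cusum)  # [0.0, 1.0]
```

In practice each procedure stops at the first time its statistic crosses a threshold chosen to meet the ARL-to-false-alarm constraint; `first_alarm` shows that stopping rule for a fixed, user-supplied threshold.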
-
Noisy Beam Alignment Techniques for Reciprocal MIMO Channels
Authors:
Dennis Ogbe,
David J. Love,
Vasanthan Raghavan
Abstract:
Future multi-input multi-output (MIMO) wireless communications systems will use beamforming as a first-step towards realizing the capacity requirements necessitated by the exponential increase in data demands. The focus of this work is on beam alignment for time-division duplexing (TDD) systems, for which we propose a number of novel algorithms. These algorithms seek to obtain good estimates of the optimal beamformer/combiner pair (which are the dominant singular vectors of the channel matrix). They are motivated by the power method, an iterative algorithm to determine eigenvalues and eigenvectors through repeated matrix multiplication. In contrast to the basic power method which considers only the most recent iteration and assumes noiseless links, the proposed techniques consider information from all the previous iterations of the algorithm and combine them in different ways. The first technique (Sequential Least-Squares method) sequentially constructs a least-squares estimate of the channel matrix, which is then used to calculate the beamformer/combiner pair estimate. The second technique (Summed Power method) aims to mitigate the effect of noise by using a linear combination of the previously tried beams to calculate the next beam, providing improved performance in the low-SNR regime (typical for mmWave systems) with minimal complexity/feedback overhead. A third technique (Least-Squares Initialized Summed Power method) combines the good performance of the first technique at the high-SNR regime with the low-complexity advantage of the second technique by priming the summed power method with initial estimates from the sequential method.
Submitted 26 July, 2017; v1 submitted 12 September, 2016;
originally announced September 2016.
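The basic power method that the abstract above takes as its starting point can be sketched as follows. This is a minimal illustration only, assuming a small real-valued channel matrix and noiseless links; it is not the Sequential Least-Squares or Summed Power technique proposed in the paper, and the names `power_method_beams`, `matvec`, `transpose`, and `normalize` are hypothetical helpers introduced here.

```python
import math
import random


def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def normalize(x):
    """Scale x to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def power_method_beams(H, iters=50, seed=0):
    """Alternate between the two link ends to approach the dominant
    right/left singular vectors of H (assumed real and noiseless)."""
    rng = random.Random(seed)
    f = normalize([rng.gauss(0.0, 1.0) for _ in range(len(H[0]))])
    Ht = transpose(H)
    for _ in range(iters):
        g = normalize(matvec(H, f))    # combiner from the forward link
        f = normalize(matvec(Ht, g))   # beamformer from the reverse link
    gain = abs(sum(gi * yi for gi, yi in zip(g, matvec(H, f))))
    return f, g, gain


# Toy 2x2 channel whose singular values are 3 and 1.
H = [[3.0, 0.0], [0.0, 1.0]]
f, g, gain = power_method_beams(H)
print(f, g, gain)
```

Each iteration uses only the most recent beam pair, which is exactly the limitation the abstract says the proposed techniques address by combining information from all previous iterations.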
-
Tracking Changes in Resilience and Level of Coordination in Terrorist Groups
Authors:
Vasanthan Raghavan,
Alexander G. Tartakovsky
Abstract:
Activity profiles of terrorist groups show frequent spurts and downfalls corresponding to changes in the underlying organizational dynamics. In particular, it is of interest to understand changes in attributes such as intentions/ideology, tactics/strategies, capabilities/resources, etc., that influence and impact the activity. The goal of this work is the quick detection of such changes and, in general, the tracking of macroscopic as well as microscopic trends in group dynamics. Prior work in this area is based on parametric approaches and relies on time-series analysis techniques, self-exciting hurdle models (SEHM), or hidden Markov models (HMM). While these approaches detect spurts and downfalls reasonably accurately, they are all based on model learning, a task that is difficult in practice because of the "rare" nature of terrorist attacks from a model learning perspective. In this paper, we pursue an alternate non-parametric approach for spurt detection in activity profiles. Our approach is based on binning the count data of terrorist activity to form observation vectors that can be compared with each other. Motivated by a majorization theory framework, these vectors are then transformed via certain functionals and used in spurt classification. While the parametric approaches often result in either a large number of missed detections of real changes or false alarms of changes that did not occur, the proposed approach is shown to result in a small number of missed detections and false alarms. Further, the non-parametric nature of the approach makes it attractive for ready application in a practical context.
Submitted 7 April, 2016;
originally announced April 2016.
-
Beamforming Tradeoffs for Initial UE Discovery in Millimeter-Wave MIMO Systems
Authors:
Vasanthan Raghavan,
Juergen Cezanne,
Sundar Subramanian,
Ashwin Sampath,
Ozge Koymen
Abstract:
Millimeter-wave MIMO systems have gained increasing traction towards the goal of meeting the high data-rate requirements in next-generation wireless systems. The focus of this work is on low-complexity beamforming approaches for initial UE discovery in such systems. Towards this goal, we first note the structure of the optimal beamformer with per-antenna gain and phase control and the structure of good beamformers with per-antenna phase-only control. Learning these beamforming structures in mmW systems is fraught with considerable complexities such as the need for a non-broadcast system design, the sensitivity of the beamformer approximants to small path length changes, etc. To overcome these issues, we establish a physical interpretation between these beamformer structures and the angles of departure/arrival of the dominant path(s). This physical interpretation provides a theoretical underpinning to the emerging interest on directional beamforming approaches that are less sensitive to small path length changes. While classical approaches for direction learning such as MUSIC have been well-understood, they suffer from many practical difficulties in a mmW context such as a non-broadcast system design and high computational complexity. A simpler broadcast solution for mmW systems is the adaptation of directional codebooks for beamforming at the two ends. We establish fundamental limits for the best beam broadening codebooks and propose a construction motivated by a virtual subarray architecture that is within a couple of dB of the best tradeoff curve at all useful beam broadening factors. We finally provide the received SNR loss-UE discovery latency tradeoff with the proposed constructions. Our results show that users with a reasonable link margin can be quickly discovered by the proposed design with a smooth roll-off in performance as the link margin deteriorates.
Submitted 12 January, 2016;
originally announced January 2016.
-
Directional Beamforming for Millimeter-Wave MIMO Systems
Authors:
Vasanthan Raghavan,
Sundar Subramanian,
Juergen Cezanne,
Ashwin Sampath
Abstract:
The focus of this paper is on beamforming in a millimeter-wave (mmW) multi-input multi-output (MIMO) setup that has gained increasing traction in meeting the high data-rate requirements of next-generation wireless systems. For a given MIMO channel matrix, the optimality of beamforming with the dominant right-singular vector (RSV) at the transmit end and with the matched filter to the RSV at the receive end has been well-understood. When the channel matrix can be accurately captured by a physical (geometric) scattering model across multiple clusters/paths as is the case in mmW MIMO systems, we provide a physical interpretation for this optimal structure: beam steering across the different paths with appropriate power allocation and phase compensation. While such an explicit physical interpretation has not been provided hitherto, practical implementation of such a structure in a mmW system is fraught with considerable difficulties (complexity as well as cost) as it requires the use of per-antenna gain and phase control. This paper characterizes the loss in received SNR with an alternate low-complexity beamforming solution that needs only per-antenna phase control and corresponds to steering the beam to the dominant path at the transmit and receive ends. While the loss in received SNR can be arbitrarily large (theoretically), this loss is minimal in a large fraction of the channel realizations reinforcing the utility of directional beamforming as a good candidate solution for mmW MIMO systems.
Submitted 11 January, 2016;
originally announced January 2016.
-
Internet Control Plane Event Identification using Model Based Change Point Detection Techniques
Authors:
S. P. Meenakshi,
S. V. Raghavan
Abstract:
With the rise of many global organizations deploying their data centers and content services in India, the study of prefix reachability performance from global destinations garners our attention. Events such as failures and attacks occurring in the Internet topology have an impact on the Autonomous System (AS) paths announced in the control plane and on the reachability of prefixes from spatially distributed ASes. As a consequence, customers experience degraded reachability to the services in the form of increased latency and outages of short or long duration. Control plane event detection is challenging when the data plane traffic is still able to reach the intended destinations correctly. However, detection of such events is crucial for the operations of the content and data center industries. By monitoring spatially distributed routing table features such as AS path length distributions, the spatial prefix reachability distribution and the covering to overlap route ratio, we can detect control plane events. In our work, we study prefix AS paths from the publicly available route-view data and analyze the global reachability as well as the reachability to the Indian AS topology. To capture the spatial events in a single temporal pattern, we propose a counting based measure using prefixes announced by x % of spatial peers. Employing statistical characteristics change point detection and a temporal aberration algorithm on the time series of the proposed measure, we identify the occurrence of long and stochastic control plane events. The impact and duration of the events are also quantified. We validate the mechanisms over the proposed measure using the SEA-Me-We4 cable cut event manifestations in the control plane of the Indian AS topology. The cable cut events that occurred on 6th June 2012 (long term event) and 17th April 2012 (stochastic event) are considered for validation.
Submitted 27 June, 2013;
originally announced June 2013.
-
Forecasting and Event Detection in Internet Resource Dynamics using Time Series Models
Authors:
S. P. Meenakshi,
S. V. Raghavan
Abstract:
At present, the Internet has emerged as a country's predominant and viable data communication infrastructure. The Autonomous System (AS) resources, which are the building blocks of the Internet, are AS numbers and IPv4 and IPv6 prefixes. AS number growth is one of the indicators of Internet infrastructure development. Hence, understanding its long term trend and stochastic variation behaviour is essential to detect significant events during the growth. In this work, a time series based approximation is considered for mathematical modelling and forecasting of the yearly AS growth. The AS data of five countries, namely India, China, Japan, South Korea and Taiwan, are extracted from the APNIC archive. ARIMA models with different Auto Regressive and Moving Average parameters are identified for forecasting. Model validation, parameter estimation, point forecasts and prediction intervals with 95 % confidence levels for the five countries are reported in the paper. The significant level change in variations, the positive growth percentage in Inter Annual Absolute Variations (IAAV) and the higher percentage of advertised ASes when compared to other countries indicate India's fast growth and wider global reachability of Internet infrastructure from 2007 onwards. The correlation between the IAAV change point and the GDP growth period indicates that service sector industry growth is the driving force behind significant yearly changes.
Submitted 27 June, 2013;
originally announced June 2013.
-
Modeling Temporal Activity Patterns in Dynamic Social Networks
Authors:
Vasanthan Raghavan,
Greg Ver Steeg,
Aram Galstyan,
Alexander G. Tartakovsky
Abstract:
The focus of this work is on developing probabilistic models for user activity in social networks by incorporating the social network influence as perceived by the user. For this, we propose a coupled Hidden Markov Model, where each user's activity evolves according to a Markov chain with a hidden state that is influenced by the collective activity of the friends of the user. We develop generalized Baum-Welch and Viterbi algorithms for model parameter learning and state estimation for the proposed framework. We then validate the proposed model using a significant corpus of user activity on Twitter. Our numerical studies show that with sufficient observations to ensure accurate model learning, the proposed framework explains the observed data better than either a renewal process-based model or a conventional uncoupled Hidden Markov Model. We also demonstrate the utility of the proposed approach in predicting the time to the next tweet. Finally, clustering in the model parameter space is shown to result in distinct natural clusters of users characterized by the interaction dynamic between a user and his network.
Submitted 8 May, 2013;
originally announced May 2013.
-
Ensemble Properties of RVQ-Based Limited-Feedback Beamforming Codebooks
Authors:
Vasanthan Raghavan,
Venugopal V. Veeravalli
Abstract:
The ensemble properties of Random Vector Quantization (RVQ) codebooks for limited-feedback beamforming in multi-input multi-output (MIMO) systems are studied with the metrics of interest being the received SNR loss and mutual information loss, both relative to a perfect channel state information (CSI) benchmark. The simplest case of unskewed codebooks is studied in the correlated MIMO setting and these loss metrics are computed as a function of the number of bits of feedback ($B$), transmit antenna dimension ($N_t$), and spatial correlation. In particular, it is established that: i) the loss metrics are a product of two components -- a quantization component and a channel-dependent component; ii) the quantization component, which is also common to analysis of channels with independent and identically distributed (i.i.d.) fading, decays as $B$ increases at the rate $2^{-B/(N_t-1)}$; iii) the channel-dependent component reflects the condition number of the channel. Further, the precise connection between the received SNR loss and the squared singular values of the channel is shown to be a Schur-convex majorization relationship. Finally, the ensemble properties of skewed codebooks that are generated by skewing RVQ codebooks with an appropriately designed fixed skewing matrix are studied. Based on an estimate of the loss expression for skewed codebooks, it is established that the optimal skewing matrix is critically dependent on the condition numbers of the effective channel (product of the true channel and the skewing matrix) and the skewing matrix.
Submitted 6 July, 2012;
originally announced July 2012.
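As a small worked illustration of the decay rate $2^{-B/(N_t-1)}$ quoted in the abstract above, the snippet below evaluates only that factor. It is a sketch under stated assumptions: any multiplicative constants and the channel-dependent component are omitted, and `quantization_component` is a hypothetical name introduced here.

```python
def quantization_component(num_bits, num_tx_antennas):
    """Return the decay factor 2^{-B/(N_t-1)}; constants are omitted."""
    if num_tx_antennas < 2:
        raise ValueError("need at least two transmit antennas")
    return 2.0 ** (-num_bits / (num_tx_antennas - 1))


# With N_t = 4, every additional N_t - 1 = 3 feedback bits halves the factor.
for bits in (0, 3, 6, 9):
    print(bits, quantization_component(bits, 4))
```

The same bit budget buys less as the array grows: with $N_t = 8$, seven extra bits are needed for each halving.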
-
Hidden Markov models for the activity profile of terrorist groups
Authors:
Vasanthan Raghavan,
Aram Galstyan,
Alexander G. Tartakovsky
Abstract:
The main focus of this work is on developing models for the activity profile of a terrorist group, detecting sudden spurts and downfalls in this profile, and, in general, tracking it over a period of time. Toward this goal, a $d$-state hidden Markov model (HMM) that captures the latent states underlying the dynamics of the group and thus its activity profile is developed. The simplest setting of $d=2$ corresponds to the case where the dynamics are coarsely quantized as Active and Inactive, respectively. A state estimation strategy that exploits the underlying HMM structure is then developed for spurt detection and tracking. This strategy is shown to track even nonpersistent changes that last only for a short duration at the cost of learning the underlying model. Case studies with real terrorism data from open-source databases are provided to illustrate the performance of the proposed methodology.
Submitted 15 January, 2014; v1 submitted 5 July, 2012;
originally announced July 2012.
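The $d=2$ Active/Inactive setting described in the abstract above can be illustrated with a standard HMM forward filter. This is a sketch only, not the paper's state estimation strategy: the Poisson emission model, the transition probabilities, the rates, and the count sequence are all assumptions made up for illustration, and `forward_filter` and `poisson_pmf` are hypothetical names.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def forward_filter(counts, trans, rates, prior):
    """Return P(state | counts so far) after each observation."""
    belief = list(prior)
    n = len(belief)
    history = []
    for k in counts:
        # Predict: propagate the belief through the Markov chain.
        pred = [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the likelihood of the observed count, then normalize.
        post = [pred[j] * poisson_pmf(k, rates[j]) for j in range(n)]
        total = sum(post)
        belief = [p / total for p in post]
        history.append(belief)
    return history


# State 0 = Inactive, state 1 = Active (all numbers are illustrative assumptions).
trans = [[0.9, 0.1], [0.2, 0.8]]
rates = [0.2, 3.0]                     # assumed mean attacks per time bin
counts = [0, 0, 0, 4, 5, 3, 0, 0]      # assumed activity profile with one spurt
history = forward_filter(counts, trans, rates, [0.5, 0.5])
for k, belief in zip(counts, history):
    print(k, round(belief[1], 3))
```

Thresholding the Active-state probability at, say, 0.5 flags the bins with counts 4, 5, and 3 as a spurt in this toy sequence.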
-
Statistical Beamforming on the Grassmann Manifold for the Two-User Broadcast Channel
Authors:
Vasanthan Raghavan,
Stephen Hanly,
Venugopal Veeravalli
Abstract:
A Rayleigh fading spatially correlated broadcast setting with M = 2 antennas at the transmitter and two users (each with a single antenna) is considered. It is assumed that the users have perfect channel information about their links whereas the transmitter has only statistical information of each user's link (covariance matrix of the vector channel). A low-complexity linear beamforming strategy that allocates equal power and one spatial eigen-mode to each user is employed at the transmitter. Beamforming vectors on the Grassmann manifold that depend only on statistical information are to be designed at the transmitter to maximize the ergodic sum-rate delivered to the two users. Towards this goal, the beamforming vectors are first fixed and a closed-form expression is obtained for the ergodic sum-rate in terms of the covariance matrices of the links. This expression is non-convex in the beamforming vectors, which means that the classical Lagrange multiplier technique is not applicable. Despite this difficulty, the optimal solution to this problem is shown to be the solution to the maximization of an appropriately-defined average signal-to-interference and noise ratio (SINR) metric for each user. This solution is the dominant generalized eigenvector of a pair of positive-definite matrices where the first matrix is the covariance matrix of the forward link and the second is an appropriately-designed "effective" interference covariance matrix. In this sense, our work is a generalization of optimal signalling along the dominant eigen-mode of the transmit covariance matrix in the single-user case. Finally, the ergodic sum-rate for the general broadcast setting with M antennas at the transmitter and M users (each with a single antenna) is obtained in terms of the covariance matrices of the links and the beamforming vectors.
Submitted 12 April, 2011;
originally announced April 2011.
-
Linear Beamforming for the Spatially Correlated MISO broadcast channel
Authors:
Vasanthan Raghavan,
Venu Veeravalli,
Stephen Hanly
Abstract:
A spatially correlated broadcast setting with M antennas at the base station and M users (each with a single antenna) is considered. We assume that the users have perfect channel information about their links and the base station has only statistical information about each user's link. The base station employs a linear beamforming strategy with one spatial eigen-mode allocated to each user. The goal of this work is to understand the structure of the beamforming vectors that maximize the ergodic sum-rate achieved by treating interference as noise. In the M = 2 case, we first fix the beamforming vectors and compute the ergodic sum-rate in closed-form as a function of the channel statistics. We then show that the optimal beamforming vectors are the dominant generalized eigenvectors of the covariance matrices of the two links. It is difficult to obtain intuition on the structure of the optimal beamforming vectors for M > 2 due to the complicated nature of the sum-rate expression. Nevertheless, in the case of asymptotic M, we show that the optimal beamforming vectors have to satisfy a set of fixed-point equations.
Submitted 6 August, 2010;
originally announced August 2010.
-
Quickest Change Detection of a Markov Process Across a Sensor Array
Authors:
Vasanthan Raghavan,
Venugopal V. Veeravalli
Abstract:
Recent attention in quickest change detection in the multi-sensor setting has been on the case where the densities of the observations change at the same instant at all the sensors due to the disruption. In this work, a more general scenario is considered where the change propagates across the sensors, and its propagation can be modeled as a Markov process. A centralized, Bayesian version of this problem, with a fusion center that has perfect information about the observations and a priori knowledge of the statistics of the change process, is considered. The problem of minimizing the average detection delay subject to false alarm constraints is formulated as a partially observable Markov decision process (POMDP). Insights into the structure of the optimal stopping rule are presented. In the limiting case of rare disruptions, we show that the structure of the optimal test reduces to thresholding the a posteriori probability of the hypothesis that no change has happened. We establish the asymptotic optimality (in the vanishing false alarm probability regime) of this threshold test under a certain condition on the Kullback-Leibler (K-L) divergence between the post- and the pre-change densities. In the special case of near-instantaneous change propagation across the sensors, this condition reduces to the mild condition that the K-L divergence be positive. Numerical studies show that this low complexity threshold test results in a substantial improvement in performance over naive tests such as a single-sensor test or a test that wrongly assumes that the change propagates instantaneously.
Submitted 19 December, 2008;
originally announced December 2008.
-
Why Does a Kronecker Model Result in Misleading Capacity Estimates?
Authors:
Vasanthan Raghavan,
Jayesh H. Kotecha,
Akbar M. Sayeed
Abstract:
Many recent works that study the performance of multi-input multi-output (MIMO) systems in practice assume a Kronecker model where the variances of the channel entries, upon decomposition on to the transmit and the receive eigen-bases, admit a separable form. Measurement campaigns, however, show that the Kronecker model results in poor estimates for capacity. Motivated by these observations, a channel model that does not impose a separable structure has been recently proposed and shown to fit the capacity of measured channels better. In this work, we show that this recently proposed modeling framework can be viewed as a natural consequence of channel decomposition on to its canonical coordinates, the transmit and/or the receive eigen-bases. Using tools from random matrix theory, we then establish the theoretical basis behind the Kronecker mismatch at the low- and the high-SNR extremes: 1) Sparsity of the dominant statistical degrees of freedom (DoF) in the true channel at the low-SNR extreme, and 2) Non-regularity of the sparsity structure (disparities in the distribution of the DoF across the rows and the columns) at the high-SNR extreme.
Submitted 31 July, 2008;
originally announced August 2008.
-
Low-Complexity Structured Precoding for Spatially Correlated MIMO Channels
Authors:
Vasanthan Raghavan,
Akbar Sayeed,
Venu Veeravalli
Abstract:
The focus of this paper is on spatial precoding in correlated multi-antenna channels, where the number of independent data-streams is adapted to trade-off the data-rate with the transmitter complexity. Towards the goal of a low-complexity implementation, a structured precoder is proposed, where the precoder matrix evolves fairly slowly at a rate comparable with the statistical evolution of the channel. Here, the eigenvectors of the precoder matrix correspond to the dominant eigenvectors of the transmit covariance matrix, whereas the power allocation across the modes is fixed, known at both the ends, and is of low-complexity. A particular case of the proposed scheme (semiunitary precoding), where the spatial modes are excited with equal power, is shown to be near-optimal in matched channels. A matched channel is one where the dominant eigenvalues of the transmit covariance matrix are well-conditioned and their number equals the number of independent data-streams, and the receive covariance matrix is also well-conditioned. In mismatched channels, where the above conditions are not met, it is shown that the loss in performance with semiunitary precoding when compared with a perfect channel information benchmark is substantial. This loss needs to be mitigated via limited feedback techniques that provide partial channel information to the transmitter. More importantly, we develop matching metrics that capture the degree of matching of a channel to the precoder structure continuously, and allow ordering two matrix channels in terms of their mutual information or error probability performance.
Submitted 28 May, 2008;
originally announced May 2008.
-
Quantized Multimode Precoding in Spatially Correlated Multi-Antenna Channels
Authors:
Vasanthan Raghavan,
Venu Veeravalli,
Akbar Sayeed
Abstract:
Multimode precoding, where the number of independent data-streams is adapted optimally, can be used to maximize the achievable throughput in multi-antenna communication systems. Motivated by standardization efforts embraced by the industry, the focus of this work is on systematic precoder design with realistic assumptions on the spatial correlation, channel state information (CSI) at the transmitter and the receiver, and implementation complexity. For spatial correlation of the channel matrix, we assume a general channel model, based on physical principles, that has been verified by many recent measurement campaigns. We also assume a coherent receiver and knowledge of the spatial statistics at the transmitter along with the presence of an ideal, low-rate feedback link from the receiver to the transmitter. The reverse link is used for codebook-index feedback and the goal of this work is to construct precoder codebooks, adaptable in response to the statistical information, such that the achievable throughput is significantly enhanced over that of a fixed, non-adaptive, i.i.d. codebook design. We illustrate how a codebook of semiunitary precoder matrices localized around some fixed center on the Grassmann manifold can be skewed in response to the spatial correlation via low-complexity maps that can rotate and scale submanifolds on the Grassmann manifold. The skewed codebook in combination with a low-complexity statistical power allocation scheme is then shown to bridge the gap in performance between a perfect CSI benchmark and an i.i.d. codebook design.
Submitted 23 January, 2008;
originally announced January 2008.
-
Capacity of Sparse Wideband Channels with Partial Channel Feedback
Authors:
Gautham Hariharan,
Vasanthan Raghavan,
Akbar M. Sayeed
Abstract:
This paper studies the ergodic capacity of wideband multipath channels with limited feedback. Our work builds on recent results that have established the possibility of significant capacity gains in the wideband/low-SNR regime when there is perfect channel state information (CSI) at the transmitter. Furthermore, the perfect CSI benchmark gain can be obtained with the feedback of just one bit per channel coefficient. However, the input signals used in these methods are peaky, that is, they have large peak-to-average power ratios. Signal peakiness is related to channel coherence and many recent measurement campaigns show that, in contrast to previous assumptions, wideband channels exhibit a sparse multipath structure that naturally leads to coherence in time and frequency. In this work, we first show that even an instantaneous power constraint is sufficient to achieve the benchmark gain when perfect CSI is available at the receiver. In the more realistic non-coherent setting, we study the performance of a training-based signaling scheme. We show that multipath sparsity can be leveraged to achieve the benchmark gain under both average as well as instantaneous power constraints as long as the channel coherence scales at a sufficiently fast rate with signal space dimensions. We also present rules of thumb on choosing signaling parameters as a function of the channel parameters so that the full benefits of sparsity can be realized.
Submitted 23 January, 2008;
originally announced January 2008.
-
To Code or Not to Code Across Time: Space-Time Coding with Feedback
Authors:
Che Lin,
Vasanthan Raghavan,
Venu Veeravalli
Abstract:
Space-time codes leverage the availability of multiple antennas to enhance the reliability of communication over wireless channels. While space-time codes were initially designed with a focus on open-loop systems, recent technological advances have enabled the possibility of low-rate feedback from the receiver to the transmitter. The focus of this paper is on the implications of this feedback in a single-user multi-antenna system with a general model for spatial correlation. We assume a limited feedback model, that is, a coherent receiver and statistics along with B bits of quantized channel information at the transmitter. We study space-time coding with a family of linear dispersion (LD) codes that meet an additional orthogonality constraint so as to ensure low-complexity decoding. Our results show that, when the number of bits of feedback (B) is small, a space-time coding scheme that is equivalent to beamforming and does not code across time is optimal in a weak sense in that it maximizes the average received SNR. As B increases, this weak optimality transitions to optimality in a strong sense, which is characterized by the maximization of the average mutual information. Thus, from a system designer's perspective, our work suggests that beamforming may be attractive not only from a low-complexity viewpoint, but also from an information-theoretic viewpoint.
Submitted 22 November, 2007;
originally announced November 2007.
-
Capacity of Sparse Multipath Channels in the Ultra-Wideband Regime
Authors:
Vasanthan Raghavan,
Gautham Hariharan,
Akbar Sayeed
Abstract:
This paper studies the ergodic capacity of time- and frequency-selective multipath fading channels in the ultrawideband (UWB) regime when training signals are used for channel estimation at the receiver. Motivated by recent measurement results on UWB channels, we propose a model for sparse multipath channels. A key implication of sparsity is that the independent degrees of freedom (DoF) in the channel scale sub-linearly with the signal space dimension (product of signaling duration and bandwidth). Sparsity is captured by the number of resolvable paths in delay and Doppler. Our analysis is based on a training and communication scheme that employs signaling over orthogonal short-time Fourier (STF) basis functions. STF signaling naturally relates sparsity in delay-Doppler to coherence in time-frequency. We study the impact of multipath sparsity on two fundamental metrics of spectral efficiency in the wideband/low-SNR limit introduced by Verdú: first- and second-order optimality conditions. Recent results by Zheng et al. have underscored the large gap in spectral efficiency between coherent and non-coherent extremes and the importance of channel learning in bridging the gap. Building on these results, our results lead to the following implications of multipath sparsity: 1) The coherence requirements are shared in both time and frequency, thereby significantly relaxing the required scaling in coherence time with SNR; 2) Sparse multipath channels are asymptotically coherent -- for a given but large bandwidth, the channel can be learned perfectly and the coherence requirements for first- and second-order optimality met through sufficiently large signaling duration; and 3) The requirement of peaky signals in attaining capacity is eliminated or relaxed in sparse environments.
Submitted 19 May, 2007;
originally announced May 2007.