-
SUGARCREPE++ Dataset: Vision-Language Model Sensitivity to Semantic and Lexical Alterations
Authors:
Sri Harsha Dumpala,
Aman Jaiswal,
Chandramouli Sastry,
Evangelos Milios,
Sageev Oore,
Hassan Sajjad
Abstract:
Despite their remarkable successes, state-of-the-art large language models (LLMs), including vision-and-language models (VLMs) and unimodal language models (ULMs), fail to understand precise semantics. For example, semantically equivalent sentences expressed using different lexical compositions elicit diverging representations. The degree of this divergence and its impact on encoded semantics is not very well understood. In this paper, we introduce the SUGARCREPE++ dataset to analyze the sensitivity of VLMs and ULMs to lexical and semantic alterations. Each sample in the SUGARCREPE++ dataset consists of an image and a corresponding triplet of captions: a pair of semantically equivalent but lexically different positive captions and one hard negative caption. This poses a 3-way semantic (in)equivalence problem to the language models. We comprehensively evaluate VLMs and ULMs that differ in architecture, pre-training objectives and datasets to benchmark performance on the SUGARCREPE++ dataset. Experimental results highlight the difficulties of VLMs in distinguishing between lexical and semantic variations, particularly in object attributes and spatial relations. Although VLMs with larger pre-training datasets, model sizes, and multiple pre-training objectives achieve better performance on SUGARCREPE++, there is a significant opportunity for improvement. We show that models that achieve better performance on compositionality datasets do not necessarily perform equally well on SUGARCREPE++, signifying that compositionality alone may not be sufficient for understanding semantic and lexical alterations. Given the importance of the property that the SUGARCREPE++ dataset targets, it serves as a new challenge to the vision-and-language community.
Submitted 18 June, 2024; v1 submitted 16 June, 2024;
originally announced June 2024.
-
VISLA Benchmark: Evaluating Embedding Sensitivity to Semantic and Lexical Alterations
Authors:
Sri Harsha Dumpala,
Aman Jaiswal,
Chandramouli Sastry,
Evangelos Milios,
Sageev Oore,
Hassan Sajjad
Abstract:
Despite their remarkable successes, state-of-the-art language models face challenges in grasping certain important semantic details. This paper introduces the VISLA (Variance and Invariance to Semantic and Lexical Alterations) benchmark, designed to evaluate the semantic and lexical understanding of language models. VISLA presents a 3-way semantic (in)equivalence task with a triplet of sentences associated with an image, to evaluate both vision-language models (VLMs) and unimodal language models (ULMs). An evaluation involving 34 VLMs and 20 ULMs reveals surprising difficulties in distinguishing between lexical and semantic variations. Spatial semantics encoded by language models also appear to be highly sensitive to lexical information. Notably, text encoders of VLMs demonstrate greater sensitivity to semantic and lexical variations than unimodal text encoders. Our contributions include the unification of image-to-text and text-to-text retrieval tasks, an off-the-shelf evaluation without fine-tuning, and assessing LMs' semantic (in)variance in the presence of lexical alterations. The results highlight strengths and weaknesses across diverse vision and unimodal language models, contributing to a deeper understanding of their capabilities. Data and code will be made available at https://github.com/Sri-Harsha/visla_benchmark.
Submitted 25 April, 2024;
originally announced April 2024.
-
Test-Time Training for Depression Detection
Authors:
Sri Harsha Dumpala,
Chandramouli Shama Sastry,
Rudolf Uher,
Sageev Oore
Abstract:
Previous works on depression detection use datasets collected in similar environments to train and test the models. In practice, however, the train and test distributions cannot be guaranteed to be identical. Distribution shifts can be introduced due to variations such as recording environment (e.g., background noise) and demographics (e.g., gender, age, etc). Such distributional shifts can surprisingly lead to severe performance degradation of the depression detection models. In this paper, we analyze the application of test-time training (TTT) to improve robustness of models trained for depression detection. When compared to regular testing of the models, we find TTT can significantly improve the robustness of the model under a variety of distributional shifts introduced due to: (a) background-noise, (b) gender-bias, and (c) data collection and curation procedure (i.e., train and test samples are from separate datasets).
Submitted 7 April, 2024;
originally announced April 2024.
-
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
Authors:
Yujia Huang,
Adishree Ghatare,
Yuanzhe Liu,
Ziniu Hu,
Qinsheng Zhang,
Chandramouli S Sastry,
Siddharth Gururani,
Sageev Oore,
Yisong Yue
Abstract:
We study the problem of symbolic music generation (e.g., generating piano rolls), with a technical focus on non-differentiable rule guidance. Musical rules are often expressed in symbolic form on note characteristics, such as note density or chord progression, many of which are non-differentiable, which poses a challenge when using them for guided diffusion. We propose Stochastic Control Guidance (SCG), a novel guidance method that only requires forward evaluation of rule functions that can work with pre-trained diffusion models in a plug-and-play way, thus achieving training-free guidance for non-differentiable rules for the first time. Additionally, we introduce a latent diffusion architecture for symbolic music generation with high time resolution, which can be composed with SCG in a plug-and-play fashion. Compared to standard strong baselines in symbolic music generation, this framework demonstrates marked advancements in music quality and rule-based controllability, outperforming current state-of-the-art generators in a variety of settings. For detailed demonstrations, code and model checkpoints, please visit our project website: https://scg-rule-guided-music.github.io/.
Submitted 2 June, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Test-Time Training for Speech
Authors:
Sri Harsha Dumpala,
Chandramouli Sastry,
Sageev Oore
Abstract:
In this paper, we study the application of Test-Time Training (TTT) as a solution to handling distribution shifts in speech applications. In particular, we introduce distribution-shifts to the test datasets of standard speech-classification tasks -- for example, speaker-identification and emotion-detection -- and explore how Test-Time Training (TTT) can help adjust to the distribution-shift. In our experiments that include distribution shifts due to background noise and natural variations in speech such as gender and age, we identify some key-challenges with TTT including sensitivity to optimization hyperparameters (e.g., number of optimization steps and subset of parameters chosen for TTT) and scalability (e.g., as each example gets its own set of parameters, TTT is not scalable). Finally, we propose using BitFit -- a parameter-efficient fine-tuning algorithm proposed for text applications that only considers the bias parameters for fine-tuning -- as a solution to the aforementioned challenges and demonstrate that it is consistently more stable than fine-tuning all the parameters of the model.
Submitted 28 September, 2023; v1 submitted 19 September, 2023;
originally announced September 2023.
-
DiffAug: A Diffuse-and-Denoise Augmentation for Training Robust Classifiers
Authors:
Chandramouli Sastry,
Sri Harsha Dumpala,
Sageev Oore
Abstract:
We introduce DiffAug, a simple and efficient diffusion-based augmentation technique to train image classifiers for the crucial yet challenging goal of improved classifier robustness. Applying DiffAug to a given example consists of one forward-diffusion step followed by one reverse-diffusion step. Using both ResNet-50 and Vision Transformer architectures, we comprehensively evaluate classifiers trained with DiffAug and demonstrate the surprising effectiveness of single-step reverse diffusion in improving robustness to covariate shifts, certified adversarial accuracy and out of distribution detection. When we combine DiffAug with other augmentations such as AugMix and DeepAugment we demonstrate further improved robustness. Finally, building on this approach, we also improve classifier-guided diffusion wherein we observe improvements in: (i) classifier-generalization, (ii) gradient quality (i.e., improved perceptual alignment) and (iii) image generation performance. We thus introduce a computationally efficient technique for training with improved robustness that does not require any additional data, and effectively complements existing augmentation approaches.
Submitted 28 May, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Benchmarking Neural Network Training Algorithms
Authors:
George E. Dahl,
Frank Schneider,
Zachary Nado,
Naman Agarwal,
Chandramouli Shama Sastry,
Philipp Hennig,
Sourabh Medapati,
Runa Eschenhagen,
Priya Kasimbeg,
Daniel Suo,
Juhan Bae,
Justin Gilmer,
Abel L. Peirson,
Bilal Khan,
Rohan Anil,
Mike Rabbat,
Shankar Krishnan,
Daniel Snider,
Ehsan Amid,
Kongtao Chen,
Chris J. Maddison,
Rakshith Vasudev,
Michal Badura,
Ankush Garg,
Peter Mattson
Abstract:
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a community, we are currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. In this work, using concrete experiments, we argue that real progress in speeding up training requires new benchmarks that resolve three basic challenges faced by empirical comparisons of training algorithms: (1) how to decide when training is complete and precisely measure training time, (2) how to handle the sensitivity of measurements to exact workload details, and (3) how to fairly compare algorithms that require hyperparameter tuning. In order to address these challenges, we introduce a new, competitive, time-to-result benchmark using multiple workloads running on fixed hardware, the AlgoPerf: Training Algorithms benchmark. Our benchmark includes a set of workload variants that make it possible to detect benchmark submissions that are more robust to workload changes than current widely-used methods. Finally, we evaluate baseline submissions constructed using various optimizers that represent current practice, as well as other optimizers that have recently received attention in the literature. These baseline results collectively demonstrate the feasibility of our benchmark, show that non-trivial gaps between methods exist, and set a provisional state-of-the-art for future benchmark submissions to try and surpass.
Submitted 12 June, 2023;
originally announced June 2023.
-
Efficient CDF Approximations for Normalizing Flows
Authors:
Chandramouli Shama Sastry,
Andreas Lehrmann,
Marcus Brubaker,
Alexander Radovic
Abstract:
Normalizing flows model a complex target distribution in terms of a bijective transform operating on a simple base distribution. As such, they enable tractable computation of a number of important statistical quantities, particularly likelihoods and samples. Despite these appealing properties, the computation of more complex inference tasks, such as the cumulative distribution function (CDF) over a complex region (e.g., a polytope) remains challenging. Traditional CDF approximations using Monte-Carlo techniques are unbiased but have unbounded variance and low sample efficiency. Instead, we build upon the diffeomorphic properties of normalizing flows and leverage the divergence theorem to estimate the CDF over a closed region in target space in terms of the flux across its boundary, as induced by the normalizing flow. We describe both deterministic and stochastic instances of this estimator: while the deterministic variant iteratively improves the estimate by strategically subdividing the boundary, the stochastic variant provides unbiased estimates. Our experiments on popular flow architectures and UCI benchmark datasets show a marked improvement in sample efficiency as compared to traditional estimators.
Submitted 31 August, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
Musical Speech: A Transformer-based Composition Tool
Authors:
Jason d'Eon,
Sri Harsha Dumpala,
Chandramouli Shama Sastry,
Dani Oore,
Sageev Oore
Abstract:
In this paper, we propose a new compositional tool that will generate a musical outline of speech recorded/provided by the user for use as a musical building block in their compositions. The tool allows any user to use their own speech to generate musical material, while still being able to hear the direct connection between their recorded speech and the resulting music. The tool is built on our proposed pipeline. This pipeline begins with speech-based signal processing, after which some simple musical heuristics are applied, and finally these pre-processed signals are passed through Transformer models trained on new musical tasks. We illustrate the effectiveness of our pipeline -- which does not require a paired dataset for training -- through examples of music created by musicians making use of our tool.
Submitted 2 August, 2021;
originally announced August 2021.
-
On compactly supported discrete radial wavelets in $L^2(\mathbb{R}^2)$ and application in Tomography
Authors:
K. Z. Najiya,
Akshaya Ravichandran,
C. S. Sastry
Abstract:
Radially symmetric wavelets possessing multiresolution framework are found to be useful in different fields like Pattern recognition, Computed Tomography (CT) etc. The compactly supported wavelets are known to be useful for localized operations in applications such as reconstruction, enhancement etc. In this work we introduce a novel way of designing compactly supported radial wavelets in $L^2(\mathbb{R}^2)$ from a 1D Daubechies wavelets and obtain a reconstruction formula possessing multiresolution framework. Further, we demonstrate the usefulness of our radial wavelets in Tomography.
Submitted 10 September, 2020;
originally announced September 2020.
-
Local recovery bounds for prior support constrained Compressed Sensing
Authors:
K. Z. Najiya,
Munnu Sonkar,
C. S. Sastry
Abstract:
Prior support constrained compressed sensing has of late become popular due to its potential for applications. The existing results on recovery guarantees provide global recovery bounds in the sense that they deal with full support. However, in some applications, one might be interested in the recovery guarantees limited to the given prior support, such bounds may be termed as local recovery bounds. The present work proposes the local recovery guarantees and analyzes the conditions on associated parameters that make recovery error small.
Submitted 2 July, 2020;
originally announced July 2020.
-
Detecting Out-of-Distribution Examples with In-distribution Examples and Gram Matrices
Authors:
Chandramouli Shama Sastry,
Sageev Oore
Abstract:
When presented with Out-of-Distribution (OOD) examples, deep neural networks yield confident, incorrect predictions. Detecting OOD examples is challenging, and the potential risks are high. In this paper, we propose to detect OOD examples by identifying inconsistencies between activity patterns and class predicted. We find that characterizing activity patterns by Gram matrices and identifying anomalies in gram matrix values can yield high OOD detection rates. We identify anomalies in the gram matrices by simply comparing each value with its respective range observed over the training data. Unlike many approaches, this can be used with any pre-trained softmax classifier and does not require access to OOD data for fine-tuning hyperparameters, nor does it require OOD access for inferring parameters. The method is applicable across a variety of architectures and vision datasets and, for the important and surprisingly hard task of detecting far-from-distribution out-of-distribution examples, it generally performs better than or equal to state-of-the-art OOD detection methods (including those that do assume access to OOD examples).
Submitted 9 January, 2020; v1 submitted 28 December, 2019;
originally announced December 2019.
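
The range check described in the abstract above can be illustrated with a heavily simplified sketch. It assumes a single layer, plain (first-order) Gram matrices, and a relative out-of-range penalty; the function names `gram_matrix`, `fit_ranges`, and `deviation` and the `eps` guard are hypothetical choices made for this illustration, not the authors' implementation.

```python
def gram_matrix(features):
    """Pairwise inner products between channel vectors of one layer."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def fit_ranges(train_grams):
    """Per-entry (min, max) observed over Gram matrices of training examples."""
    n = len(train_grams[0])
    return [[(min(g[i][j] for g in train_grams),
              max(g[i][j] for g in train_grams))
             for j in range(n)]
            for i in range(n)]


def deviation(gram, ranges, eps=1e-12):
    """Sum of relative amounts by which entries fall outside their range.

    Assumption for this sketch: entries inside the range contribute zero,
    and eps only guards against division by zero.
    """
    total = 0.0
    for i, row in enumerate(gram):
        for j, value in enumerate(row):
            low, high = ranges[i][j]
            if value < low:
                total += (low - value) / (abs(low) + eps)
            elif value > high:
                total += (value - high) / (abs(high) + eps)
    return total


# Toy usage: two training examples, then one in-range and one out-of-range test.
train = [gram_matrix([[1.0, 0.0], [0.0, 1.0]]),
         gram_matrix([[2.0, 0.0], [0.0, 1.0]])]
ranges = fit_ranges(train)
in_range_score = deviation(gram_matrix([[1.5, 0.0], [0.0, 1.0]]), ranges)
out_of_range_score = deviation(gram_matrix([[3.0, 0.0], [0.0, 1.0]]), ranges)
```

A larger score marks an example whose activity pattern falls further outside what was seen during training; a detection threshold would still have to be chosen separately.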
-
Sufficient conditions for the uniqueness of solution of the weighted norm minimization problem
Authors:
K. Z. Najiya,
Munnu Sonkar,
C. S. Sastry
Abstract:
Prior support constrained compressed sensing, achieved via the weighted norm minimization, has of late become popular due to its potential for applications. For the weighted norm minimization problem, $$ \min \|x\|_{p,w} \text{ subject to } y=Ax, \; p=0,1, \text{ and } w \in [0,1], $$ uniqueness results are known when $w=0,1$. Here, $\|x\|_{p,w}=w\|x_T\|_p+\|x_{T^c}\|_p, \; p=0,1$ with $T$ representing the partial support information. The work reported in this paper presents the conditions that ensure the uniqueness of the solution of this problem for general $w \in [0,1]$.
Submitted 20 November, 2019;
originally announced November 2019.
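
The weighted norm $\|x\|_{p,w}=w\|x_T\|_p+\|x_{T^c}\|_p$ from the abstract above can be evaluated directly. The following is a minimal sketch; the function name `weighted_norm` and its argument names are hypothetical, and it only evaluates the objective rather than solving the minimization problem.

```python
def weighted_norm(x, prior_support, w, p=1):
    """Return w * ||x_T||_p + ||x_{T^c}||_p for p in {0, 1}.

    x             -- sequence of numbers
    prior_support -- indices forming the prior support T
    w             -- weight in [0, 1] applied to the entries on T
    """
    if p not in (0, 1):
        raise ValueError("p must be 0 or 1")
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    support = set(prior_support)

    def size(value):
        # p = 0 counts non-zero entries; p = 1 sums absolute values.
        if p == 0:
            return 1.0 if value != 0 else 0.0
        return abs(value)

    on_support = sum(size(v) for i, v in enumerate(x) if i in support)
    off_support = sum(size(v) for i, v in enumerate(x) if i not in support)
    return w * on_support + off_support


x = [1.0, -2.0, 0.0, 3.0]
l1_value = weighted_norm(x, prior_support=[0, 1], w=0.5, p=1)
l0_value = weighted_norm(x, prior_support=[0, 1], w=0.5, p=0)
```

With $w=1$ the expression reduces to the ordinary norm, and with $w=0$ the entries on $T$ are not penalized at all, which matches the two endpoint cases the abstract mentions.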
-
Sparse recovery guarantees for block orthogonal binary matrices constructed via Generalized Euler Squares
Authors:
Pradip Sasmal,
Phanindra Jampana,
C. S. Sastry
Abstract:
In recent times, the construction of deterministic matrices has gained popularity as an alternative of random matrices as they provide guarantees for recovery of sparse signals. In particular, the construction of binary matrices has attained significance due to their potential for hardware-friendly implementation and appealing applications. Our present work aims at constructing incoherent binary matrices consisting of orthogonal blocks with small block coherence. We show that the binary matrices constructed from Euler squares exhibit block orthogonality and possess low block coherence. With a goal of obtaining better aspect ratios, the present work generalizes the notion of Euler Squares and obtains a new class of deterministic binary matrices of more general size. For realizing the stated objectives, to begin with, the paper revisits the connection of finite field theory to Euler Squares and their construction. Using the stated connection, the work proposes Generalized Euler Squares (GES) and then presents a construction procedure. Binary matrices with low coherence and general row-sizes are obtained, whose column size is in the maximum possible order. Finally, the paper shows that the special structure possessed by GES is helpful in resulting in block orthogonal structure with small block coherence, which supports the recovery of block sparse signals.
Submitted 17 July, 2019;
originally announced July 2019.
-
Robust Heartbeat Detection from Multimodal Data via CNN-based Generalizable Information Fusion
Authors:
B S Chandra,
C S Sastry,
S Jana
Abstract:
Objective: Heartbeat detection remains central to cardiac disease diagnosis and management, and is traditionally performed based on electrocardiogram (ECG). To improve robustness and accuracy of detection, especially in certain critical-care scenarios, the use of additional physiological signals such as arterial blood pressure (BP) has recently been suggested. There, estimation of heartbeat location requires information fusion from multiple signals. However, reported efforts in this direction often obtain multimodal estimates somewhat indirectly, by voting among separately obtained signal-specific intermediate estimates. In contrast, we propose to directly fuse information from multiple signals without requiring intermediate estimates, and thence estimate heartbeat location in a robust manner. Method: We propose as a heartbeat detector, a convolutional neural network (CNN) that learns fused features from multiple physiological signals. This method eliminates the need for hand-picked signal-specific features and ad hoc fusion schemes. Further, being data-driven, the same algorithm learns suitable features from an arbitrary set of signals. Results: Using ECG and BP signals of PhysioNet 2014 Challenge database, we obtained a score of 94%. Further, using two ECG channels of MIT-BIH arrhythmia database, we scored 99.92%. Both those scores compare favourably with previously reported database-specific results. Also, our detector achieved high accuracy in a variety of clinical conditions. Conclusion: The proposed CNN-based information fusion (CIF) algorithm is generalizable, robust and efficient in detecting heartbeat location from multiple signals. Significance: In medical signal monitoring systems, our technique would accurately estimate heartbeat locations even when only a subset of channels is reliable.
Submitted 29 June, 2018;
originally announced July 2018.
-
Novel Light Weight Compressed Data Aggregation Using Sparse Measurements for IoT Networks
Authors:
Amarlingam M,
Pradeep Kumar Mishra,
P Rajalakshmi,
Sumohana S. Channappayya,
C. S. Sastry
Abstract:
Optimal data aggregation aimed at maximizing IoT network lifetime by minimizing constrained on-board resource utilization continues to be a challenging task. The existing data aggregation methods have proven that compressed sensing is promising for data aggregation. However, they compromise either on energy efficiency or recovery fidelity and require complex on-node computations. In this paper, we propose a novel Light Weight Compressed Data Aggregation (LWCDA) algorithm that randomly divides the entire network into non-overlapping clusters for data aggregation. The random non-overlapping clustering offers two important advantages: 1) energy efficiency, as each node has to send its measurement only to its cluster head, 2) highly sparse measurement matrix, which leads to a practically implementable framework with low complexity. We analyze the properties of our measurement matrix using restricted isometry property, the associated coherence and phase transition. Through extensive simulations on practical data, we show that the measurement matrix can reconstruct data with high fidelity. Further, we demonstrate that the LWCDA algorithm reduces transmission cost significantly against baseline approaches, implying thereby the enhancement of the network lifetime.
Submitted 13 June, 2018;
originally announced June 2018.
-
Construction of Structured Incoherent Unit Norm Tight Frames
Authors:
Pradip Sasmal,
Phanindra Jampana,
C. S. Sastry
Abstract:
The exact recovery property of Basis pursuit (BP) and Orthogonal Matching Pursuit (OMP) has a relation with the coherence of the underlying frame. A frame with low coherence provides better guarantees for exact recovery. In particular, Incoherent Unit Norm Tight Frames (IUNTFs) play a significant role in sparse representations. IUNTFs with special structure, in particular those given by a union of several orthonormal bases, are known to satisfy better theoretical guarantees for recovering sparse signals. In the present work, we propose to construct structured IUNTFs consisting of large number of orthonormal bases. For a given $r, k, m$ with $k$ being less than or equal to the smallest prime power factor of $m$ and $r<k,$ we construct a CS matrix of size $mk \times (mk\times m^{r})$ with coherence at most $\frac{r}{k},$ which consists of $m^{r}$ number of orthonormal bases and with density $\frac{1}{m}$. We also present numerical results of recovery performance of union of orthonormal bases as against their Gaussian counterparts.
Submitted 2 July, 2017;
originally announced July 2017.
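
The coherence that the abstract above bounds by $\frac{r}{k}$ is, in the usual definition, the largest absolute normalized inner product between two distinct columns of the frame. The sketch below computes it for a small union of two orthonormal bases of $\mathbb{R}^2$; the function name `coherence` and the toy frame are assumptions made for illustration and are unrelated to the paper's construction.

```python
import math


def coherence(columns):
    """Largest |<a_i, a_j>| / (||a_i|| * ||a_j||) over distinct columns."""
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    worst = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            inner = sum(a * b for a, b in zip(columns[i], columns[j]))
            worst = max(worst, abs(inner) / (norms[i] * norms[j]))
    return worst


s = 1.0 / math.sqrt(2.0)
standard_basis = [[1.0, 0.0], [0.0, 1.0]]
rotated_basis = [[s, s], [s, -s]]

# Columns within each basis are orthogonal, so the coherence of the union
# comes entirely from pairs taken across the two bases.
union_coherence = coherence(standard_basis + rotated_basis)
single_basis_coherence = coherence(standard_basis)
```

A single orthonormal basis has coherence zero; adding further bases increases the number of columns at the price of a non-zero coherence, which is the trade-off such constructions aim to control.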
-
Dictionary-based Monitoring of Premature Ventricular Contractions: An Ultra-Low-Cost Point-of-Care Service
Authors:
Bollepalli S. Chandra,
Challa S. Sastry,
Laxminarayana Anumandla,
Soumya Jana
Abstract:
While cardiovascular diseases (CVDs) are prevalent across economic strata, the economically disadvantaged population is disproportionately affected due to the high cost of traditional CVD management. Accordingly, developing an ultra-low-cost alternative, affordable even to groups at the bottom of the economic pyramid, has emerged as a societal imperative. Against this backdrop, we propose an inexpensive yet accurate home-based electrocardiogram (ECG) monitoring service. Specifically, we seek to provide point-of-care monitoring of premature ventricular contractions (PVCs), high frequency of which could indicate the onset of potentially fatal arrhythmia. Note that a traditional telecardiology system acquires the ECG, transmits it to a professional diagnostic centre without processing, and nearly achieves the diagnostic accuracy of a bedside setup, albeit at high bandwidth cost. In this context, we aim at reducing cost without significantly sacrificing reliability. To this end, we develop a dictionary-based algorithm that detects with high sensitivity only the anomalous beats, which are then transmitted. We further compress those transmitted beats using class-specific dictionaries subject to suitable reconstruction/diagnostic fidelity. Such a scheme would not only reduce the overall bandwidth requirement, but also localise anomalous beats, thereby reducing physicians' burden. Finally, using Monte Carlo cross validation on MIT/BIH arrhythmia database, we evaluate the performance of the proposed system. In particular, with a sensitivity target of at most one undetected PVC in one hundred beats, and a percentage root mean squared difference less than 9% (a clinically acceptable level of fidelity), we achieved about 99.15% reduction in bandwidth cost, equivalent to 118-fold savings over traditional telecardiology.
Submitted 24 May, 2017;
originally announced May 2017.
-
Nullspace Property for Optimality of Minimum Frame Angle Under Invertible Linear Operators
Authors:
Pradip Sasmal,
Prasad Theeda,
Phanindra Jampana,
C. S. Sastry
Abstract:
Orthogonal Matching Pursuit and Basis Pursuit are popular reconstruction algorithms for recovery of sparse signals. The exact recovery property of both methods is related to the coherence of the underlying redundant dictionary, i.e. a frame. A frame with low coherence provides better guarantees for exact recovery. An equivalent formulation of the associated linear system is obtained via premultiplication by a non-singular matrix. In view of bounds that guarantee sparse recovery, it is very useful to generate the preconditioner in such a way that the preconditioned frame has lower coherence than the original. In this paper, we discuss the impact of preconditioning on sparse recovery. Further, we formulate a convex optimization problem for designing the preconditioner that yields a frame with improved coherence. In addition to reducing coherence, we focus on designing well conditioned frames and numerically study the relationship between the condition number of the preconditioner and the coherence of the new frame. Alongside theoretical justifications, we demonstrate through simulations the efficacy of the preconditioner in reducing coherence as well as recovering sparse signals.
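To make the notion of coherence under preconditioning concrete, here is a minimal sketch that computes the mutual coherence of a small frame before and after premultiplication by a non-singular matrix. The frame and the matrix `P` are hand-picked toy values, and the helper names are hypothetical; this is not the convex design procedure described in the abstract.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def coherence(columns):
    """Largest absolute normalised inner product between distinct columns."""
    norms = [math.sqrt(dot(c, c)) for c in columns]
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            value = abs(dot(columns[i], columns[j])) / (norms[i] * norms[j])
            best = max(best, value)
    return best


def precondition(P, columns):
    """Apply the matrix P (given as a list of rows) to every column."""
    return [[dot(row, col) for row in P] for col in columns]


# Toy frame in R^2, stored column by column (illustrative values only).
frame = [[2.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
# Hand-picked non-singular preconditioner that rescales the first coordinate.
P = [[0.5, 0.0], [0.0, 1.0]]

print(coherence(frame))                   # about 0.894
print(coherence(precondition(P, frame)))  # about 0.707
```

In this toy case the rescaling lowers the coherence from 2/sqrt(5) to 1/sqrt(2), which illustrates why the choice of preconditioner matters for coherence-based recovery guarantees.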
Submitted 9 June, 2021; v1 submitted 22 April, 2016;
originally announced April 2016.
-
Deterministic construction of sparse binary and ternary matrices from existing binary sensing matrices
Authors:
Pradip Sasmal,
R. Ramu Naidu,
C. S. Sastry,
P. V. Jampana
Abstract:
In the present work, we discuss a procedure for constructing sparse binary and ternary matrices from two existing binary sensing matrices. The matrices that we construct have several attractive properties, such as smaller density, which supports algorithms with low computational complexity. As an application of our method, we show that a CS matrix of general row size different from $p, p^2, pq$ (for distinct primes $p,q$) can be constructed.
Submitted 4 March, 2015;
originally announced March 2015.
-
Deterministic compressed sensing matrices: Construction via Euler Squares and applications
Authors:
R. Ramu Naidu,
C. S. Sastry,
Phanindra Jampana
Abstract:
In Compressed Sensing, matrices that satisfy the Restricted Isometry Property (RIP) play an important role. To date, however, very few results for designing such matrices are available. For applications such as multiplier-less data compression, binary sensing matrices are of interest. The present work constructs deterministic binary sensing matrices using Euler Squares. In particular, given a positive integer $m$ different from $p, p^2$ for a prime $p$, we show that it is possible to construct a binary sensing matrix of size $m \times c (mμ)^2$, where $μ$ is the coherence parameter of the matrix and $c \in [1,2)$. The matrices that we construct have smaller density (that is, the percentage of nonzero entries in the matrix is small) and require no function evaluation in their construction, which supports algorithms with low computational complexity. Through experimental work, we show that our binary sensing matrices can be used for applications such as content-based image retrieval. Our simulation results demonstrate that the Euler Square based CS matrices give better performance than their Gaussian counterparts.
Submitted 26 March, 2016; v1 submitted 27 January, 2015;
originally announced January 2015.
-
A low frequency radio telescope at Mauritius for a Southern sky survey
Authors:
K. Golap,
N. Udaya Shankar,
S. Sachdev,
R. Dodson,
Ch. V. Sastry
Abstract:
A new, meter-wave radio telescope has been built in the North-East of Mauritius, an island in the Indian Ocean, at a latitude of -20.14 deg. The Mauritius Radio Telescope (MRT) is a Fourier Synthesis T-shaped array, consisting of a 2048 m long East-West arm and an 880 m long South arm. In the East-West arm, 1024 fixed helices are arranged in 32 groups; in the South arm, 16 trolleys that move on a rail, each carrying four helices, are used. A 512 channel digital complex correlation receiver is used to measure the visibility function. At least 60 days of observing are required for obtaining the visibilities up to 880 m spacing. The Fourier transform of the calibrated visibilities produces a map of the area of the sky under observation with a synthesized beam width of 4' x 4.6' sec(dec+20.14) at 151.5 MHz.
The primary objective of the telescope is to produce a sky survey in the declination range -70 deg to -10 deg with a point source sensitivity of about 200 mJy (3-sigma level). This will be the southern sky equivalent of the Cambridge 6C survey. In this paper we describe the telescope, discuss the array design and the calibration techniques used, and present a map made using the telescope.
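The synthesized beam expression quoted above, 4' x 4.6' sec(dec+20.14), can be evaluated across the survey's declination range with a few lines of code. This is a minimal sketch; the function name `mrt_beam_arcmin` is hypothetical, and it assumes that the sec term scales only the 4.6' dimension, which is how the expression reads.

```python
import math


def mrt_beam_arcmin(dec_deg):
    """Return the synthesized beam (width_1, width_2) in arcminutes at 151.5 MHz.

    Assumption: the beam is 4' by 4.6' * sec(dec + 20.14 deg), with dec in degrees.
    """
    offset = math.radians(dec_deg + 20.14)
    return 4.0, 4.6 / math.cos(offset)


# Evaluate at the edges of the survey range and at the site latitude.
for dec in (-70.0, -20.14, -10.0):
    w1, w2 = mrt_beam_arcmin(dec)
    print(f"dec = {dec:7.2f} deg: {w1:.1f}' x {w2:.2f}'")
```

The beam is narrowest at a declination equal to the site latitude and broadens towards both ends of the survey range.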
Submitted 7 August, 1998;
originally announced August 1998.