Search | arXiv e-print repository

CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss

Authors: Rakshith Sharma Srinivasa, Jae** Cho, Chouchang Yang, Yashas Malur Saidutta, Ching-Hua Lee, Yilin Shen, Hongxia **

Abstract: This paper considers contrastive training for cross-modal 0-shot transfer wherein a pre-trained model in one modality is used for representation learning in another domain using pairwise data. The learnt models in the latter domain can then be used for a diverse set of tasks in a zero-shot way, similar to "Contrastive Language-Image Pre-training (CLIP)" and "Locked-image Tuning (LiT)" that have recently gained considerable attention. Most existing works for cross-modal representation alignment (including CLIP and LiT) use the standard contrastive training objective, which employs sets of positive and negative examples to align similar and repel dissimilar training data samples. However, similarity amongst training examples has a more continuous nature, thus calling for a more 'non-binary' treatment. To address this, we propose a novel loss function called Continuously Weighted Contrastive Loss (CWCL) that employs a continuous measure of similarity. With CWCL, we seek to align the embedding space of one modality with another. Owing to the continuous nature of similarity in the proposed loss function, these models outperform existing methods for 0-shot transfer across multiple models, datasets and modalities. Particularly, we consider the modality pairs of image-text and speech-text and our models achieve 5-8% (absolute) improvement over previous state-of-the-art methods in 0-shot image classification and 20-30% (absolute) improvement in 0-shot speech-to-intent classification and keyword classification.

Submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2023 conference

arXiv:2304.03416 [pdf, other]

To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement

Authors: Yashas Malur Saidutta, Rakshith Sharma Srinivasa, Ching-Hua Lee, Chouchang Yang, Yilin Shen, Hongxia **

Abstract: Keyword spotting systems continuously process audio streams to detect keywords. One of the most challenging tasks in designing such systems is to reduce False Alarm (FA) which happens when the system falsely registers a keyword despite the keyword not being uttered. In this paper, we propose a simple yet elegant solution to this problem that follows from the law of total probability. We show that existing deep keyword spotting mechanisms can be improved by Successive Refinement, where the system first classifies whether the input audio is speech or not, followed by whether the input is keyword-like or not, and finally classifies which keyword was uttered. We show across multiple models with size ranging from 13K parameters to 2.41M parameters, the successive refinement technique reduces FA by up to a factor of 8 on in-domain held-out FA data, and up to a factor of 7 on out-of-domain (OOD) FA data. Further, our proposed approach is "plug-and-play" and can be applied to any deep keyword spotting model.

Submitted 6 April, 2023; originally announced April 2023.

Comments: Accepted for publication in ICASSP 2023
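
The successive-refinement idea in the abstract above can be illustrated with a minimal sketch. It assumes three stage outputs (speech vs. non-speech, keyword-like vs. not, and which keyword) and assumes that non-speech or non-keyword-like audio contributes zero probability to any keyword, so the chain reduces to a product of conditionals. The function names, the threshold, and the example probabilities are hypothetical and are not taken from the paper.

```python
def refine_keyword_scores(p_speech, p_keyword_like, p_keyword_given_like):
    """Combine three stage outputs into per-keyword probabilities.

    p_speech:             P(speech | audio)
    p_keyword_like:       P(keyword-like | speech, audio)
    p_keyword_given_like: dict keyword -> P(keyword | keyword-like, speech, audio)

    Assumption: audio that is not speech, or not keyword-like, cannot be a
    keyword, so P(keyword | audio) is the product of the three terms.
    """
    gate = p_speech * p_keyword_like
    return {kw: gate * p for kw, p in p_keyword_given_like.items()}


def detect(scores, threshold=0.5):
    """Return the best keyword if its refined score clears the threshold, else None."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Hypothetical stage outputs for a clear utterance and for background noise.
clear = refine_keyword_scores(0.95, 0.9, {"yes": 0.8, "no": 0.2})
noise = refine_keyword_scores(0.10, 0.9, {"yes": 0.8, "no": 0.2})
```

In this sketch the final classifier is equally confident in both cases, but the low speech probability for the noise input scales every keyword score down, which is how the earlier stages can suppress a false alarm.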

arXiv:2210.07077 [pdf, ps, other]

Sketching low-rank matrices with a shared column space by convex programming

Authors: Rakshith S Srinivasa, Seonho Kim, Kiryung Lee

Abstract: In many practical applications including remote sensing, multi-task learning, and multi-spectrum imaging, data are described as a set of matrices sharing a common column space. We consider the joint estimation of such matrices from their noisy linear measurements. We study a convex estimator regularized by a pair of matrix norms. The measurement model corresponds to block-wise sensing and the reconstruction is possible only when the total energy is well distributed over blocks. The first norm, which is the maximum-block-Frobenius norm, favors such a solution. This condition is analogous to the notion of low-spikiness in matrix completion or column-wise sensing. The second norm, which is a tensor norm on a pair of suitable Banach spaces, induces low-rankness in the solution together with the first norm. We demonstrate that the joint estimation provides a significant gain over the individual recovery of each matrix when the number of matrices sharing a column space and the ambient dimension of the shared column space are large relative to the number of columns in each matrix. The convex estimator is cast as a semidefinite program and an efficient ADMM algorithm is derived. The empirical behavior of the convex estimator is illustrated using Monte Carlo simulations and recovery performance is compared to existing methods in the literature.

Submitted 5 June, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

arXiv:2204.06501 [pdf, other]

Clinical trial site matching with improved diversity using fair policy learning

Authors: Rakshith S Srinivasa, Cheng Qian, Brandon Theodorou, Jeffrey Spaeder, Cao Xiao, Lucas Glass, Jimeng Sun

Abstract: The ongoing pandemic has highlighted the importance of reliable and efficient clinical trials in healthcare. Trial sites, where the trials are conducted, are chosen mainly based on feasibility in terms of medical expertise and access to a large group of patients. More recently, the issue of diversity and inclusion in clinical trials is gaining importance. Different patient groups may experience the effects of a medical drug/treatment differently and hence need to be included in the clinical trials. These groups could be based on ethnicity, co-morbidities, age, or economic factors. Thus, designing a method for trial site selection that accounts for both feasibility and diversity is a crucial and urgent goal. In this paper, we formulate this problem as a ranking problem with fairness constraints. Using principles of fairness in machine learning, we learn a model that maps a clinical trial description to a ranked list of potential trial sites. Unlike existing fairness frameworks, the group membership of each trial site is non-binary: each trial site may have access to patients from multiple groups. We propose fairness criteria based on demographic parity to address such a multi-group membership scenario. We test our method on 480 real-world clinical trials and show that our model results in a list of potential trial sites that provides access to a diverse set of patients while also ensuring a high number of enrolled patients.

Submitted 13 April, 2022; originally announced April 2022.

ACM Class: J.3; I.2.1

arXiv:2202.00071 [pdf, other]

JULIA: Joint Multi-linear and Nonlinear Identification for Tensor Completion

Authors: Cheng Qian, Kejun Huang, Lucas Glass, Rakshith S. Srinivasa, Jimeng Sun

Abstract: Tensor completion aims at imputing missing entries from a partially observed tensor. Existing tensor completion methods often assume either multi-linear or nonlinear relationships between latent components. However, real-world tensors have much more complex patterns where both multi-linear and nonlinear relationships may coexist. In such cases, the existing methods are insufficient to describe the data structure. This paper proposes a Joint mUlti-linear and nonLinear IdentificAtion (JULIA) framework for large-scale tensor completion. JULIA unifies the multi-linear and nonlinear tensor completion models with several advantages over the existing methods: 1) Flexible model selection, i.e., it fits a tensor by assigning its values as a combination of multi-linear and nonlinear components; 2) Compatible with existing nonlinear tensor completion methods; 3) Efficient training based on a well-designed alternating optimization approach. Experiments on six real large-scale tensors demonstrate that JULIA outperforms many existing tensor completion algorithms. Furthermore, JULIA can improve the performance of a class of nonlinear tensor completion methods. The results show that in some large-scale tensor completion scenarios, baseline methods with JULIA are able to obtain up to 55% lower root mean-squared-error and save 67% computational complexity.

Submitted 31 January, 2022; originally announced February 2022.

arXiv:2110.15205 [pdf, ps, other]

Approximately low-rank recovery from noisy and local measurements by convex program

Authors: Kiryung Lee, Rakshith Sharma Srinivasa, Marius Junge, Justin Romberg

Abstract: Low-rank matrix models have been universally useful for numerous applications, from classical system identification to more modern matrix completion in signal processing and statistics. The nuclear norm has been employed as a convex surrogate of the low-rankness since it induces a low-rank solution to inverse problems. While the nuclear norm for low rankness has an excellent analogy with the $\ell_1$ norm for sparsity through the singular value decomposition, other matrix norms also induce low-rankness. Particularly as one interprets a matrix as a linear operator between Banach spaces, various tensor product norms generalize the role of the nuclear norm. We provide a tensor-norm-constrained estimator for the recovery of approximately low-rank matrices from local measurements corrupted with noise. A tensor-norm regularizer is designed to adapt to the local structure. We derive statistical analysis of the estimator over matrix completion and decentralized sketching by applying Maurey's empirical method to tensor products of Banach spaces. The estimator provides a near-optimal error bound in a minimax sense and admits a polynomial-time algorithm for these applications.

Submitted 3 March, 2023; v1 submitted 28 October, 2021; originally announced October 2021.

arXiv:2006.08796 [pdf, other]

Fast Graph Attention Networks Using Effective Resistance Based Graph Sparsification

Authors: Rakshith S Srinivasa, Cao Xiao, Lucas Glass, Justin Romberg, Jimeng Sun

Abstract: The attention mechanism has demonstrated superior performance for inference over nodes in graph neural networks (GNNs); however, it results in a high computational burden during both training and inference. We propose FastGAT, a method to make attention-based GNNs lightweight by using spectral sparsification to generate an optimal pruning of the input graph. This results in a per-epoch time that is almost linear in the number of graph nodes as opposed to quadratic. We theoretically prove that spectral sparsification preserves the features computed by the GAT model, thereby justifying our algorithm. We experimentally evaluate FastGAT on several large real-world graph datasets for node classification tasks under both inductive and transductive settings. FastGAT can dramatically reduce (up to 10x) the computational time and memory requirements, allowing the usage of attention-based GNNs on large graphs.

Submitted 5 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

MSC Class: 05C50; 68T07

arXiv:2003.09097 [pdf, other]

Localized sketching for matrix multiplication and ridge regression

Authors: Rakshith S Srinivasa, Mark A Davenport, Justin Romberg

Abstract: We consider sketched approximate matrix multiplication and ridge regression in the novel setting of localized sketching, where at any given point, only part of the data matrix is available. This corresponds to a block diagonal structure on the sketching matrix. We show that, under mild conditions, block diagonal sketching matrices require only $O(\text{stable rank}/\epsilon^2)$ and $O(\text{stat. dim.}/\epsilon)$ total sample complexity for matrix multiplication and ridge regression, respectively. This matches the state-of-the-art bounds that are obtained using global sketching matrices. The localized nature of sketching considered allows for different parts of the data matrix to be sketched independently and hence is more amenable to computation in distributed and streaming settings and results in a smaller memory and computational footprint.

Submitted 20 March, 2020; originally announced March 2020.

Comments: Accepted to AISTATS 2020
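
The block diagonal structure described in the abstract above can be sketched in a few lines of plain Python. The example assumes Gaussian sketching blocks and arbitrary block sizes chosen for illustration; the paper's actual sketch distributions and conditions are not reproduced here, and all function names are hypothetical. It shows that sketching each row block of the data matrix independently gives the same result as applying one large block diagonal matrix.

```python
import random


def gaussian_block(m, n, rng):
    """m x n sketching block with i.i.d. N(0, 1/m) entries (an assumed choice)."""
    scale = m ** -0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def localized_sketch(blocks_S, A, block_rows):
    """Apply S = diag(S_1, ..., S_J) to A one row block at a time."""
    out, start = [], 0
    for S_j, n_j in zip(blocks_S, block_rows):
        out.extend(matmul(S_j, A[start:start + n_j]))  # only this block of A is needed
        start += n_j
    return out


def dense_block_diagonal(blocks_S, block_rows):
    """Materialize the full block diagonal matrix, for comparison only."""
    total_cols = sum(block_rows)
    dense, offset = [], 0
    for S_j, n_j in zip(blocks_S, block_rows):
        for row in S_j:
            dense.append([0.0] * offset + list(row) + [0.0] * (total_cols - offset - n_j))
        offset += n_j
    return dense


rng = random.Random(0)
block_rows = [4, 6, 2]    # rows of A held at each location (illustrative sizes)
sketch_rows = [2, 3, 1]   # sketch size used at each location (illustrative sizes)
A = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(sum(block_rows))]
blocks_S = [gaussian_block(m, n, rng) for m, n in zip(sketch_rows, block_rows)]

SA_local = localized_sketch(blocks_S, A, block_rows)
SA_dense = matmul(dense_block_diagonal(blocks_S, block_rows), A)
```

Because each block of the sketch touches only its own rows of the data matrix, the blocks can be computed on separate machines or as the rows stream in, and the dense matrix never has to be formed.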

arXiv:1902.00075 [pdf, other]

Trading beams for bandwidth: Imaging with randomized beamforming

Authors: Rakshith Sharma Srinivasa, Mark A. Davenport, Justin Romberg

Abstract: We study the problem of actively imaging a range-limited far-field scene using an antenna array. We describe how the range limit imposes structure in the measurements across multiple wavelengths. This structure allows us to introduce a novel trade-off: the number of spatial array measurements (i.e., beams that have to be formed) can be reduced to significantly fewer than the number of array elements if the scene is illuminated with a broadband source. To take advantage of this trade-off, we use a small number of "generic" linear combinations of the array outputs, instead of the phase offsets used in conventional beamforming. We provide theoretical justification for the proposed trade-off without making any strong structural assumptions on the target scene (such as sparsity) except that it is range limited. In proving our theoretical results, we take inspiration from the sketching literature. We also provide simulation results to establish the merit of the proposed signal acquisition strategy. Our proposed method results in a reduction in the number of required spatial measurements in an array imaging system and hence can directly impact their speed and cost of operation.

Submitted 31 January, 2019; originally announced February 2019.

Showing 1–9 of 9 results for author: Srinivasa, R S