Showing 1–25 of 25 results for author: Vaze, S

Searching in archive cs.
  1. arXiv:2403.13684  [pdf, other]

    cs.CV cs.AI

    SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning

    Authors: Hongjun Wang, Sagar Vaze, Kai Han

    Abstract: Generalized Category Discovery (GCD) aims to classify unlabelled images from both `seen' and `unseen' classes by transferring knowledge from a set of labelled `seen' class images. A key theme in existing GCD approaches is adapting large-scale pre-trained models for the GCD task. An alternate perspective, however, is to adapt the data representation itself for better alignment with the pre-trained…

    Submitted 20 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted as a conference paper at ICLR 2024; Project page: https://visual-ai.github.io/sptnet

  2. arXiv:2311.17055  [pdf, other]

    cs.CV cs.AI cs.IT cs.LG

    No Representation Rules Them All in Category Discovery

    Authors: Sagar Vaze, Andrea Vedaldi, Andrew Zisserman

    Abstract: In this paper we tackle the problem of Generalized Category Discovery (GCD). Specifically, given a dataset with labelled and unlabelled images, the task is to cluster all images in the unlabelled subset, whether or not they belong to the labelled categories. Our first contribution is to recognize that most existing GCD benchmarks only contain labels for a single clustering of the data, making it d…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023

  3. arXiv:2306.07969  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    GeneCIS: A Benchmark for General Conditional Image Similarity

    Authors: Sagar Vaze, Nicolas Carion, Ishan Misra

    Abstract: We argue that there are many notions of 'similarity' and that models, like humans, should be able to adapt to these dynamically. This contrasts with most representation learning methods, supervised or self-supervised, which learn a fixed embedding function and hence implicitly assume a single notion of similarity. For instance, models trained on ImageNet are biased towards object categories, while…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: CVPR 2023 (Highlighted Paper). Project page at https://sgvaze.github.io/genecis/

  4. arXiv:2304.02364  [pdf, other]

    cs.CV

    What's in a Name? Beyond Class Indices for Image Recognition

    Authors: Kai Han, Yandong Li, Sagar Vaze, Jie Li, Xuhui Jia

    Abstract: Existing machine learning models demonstrate excellent performance in image object recognition after training on a large-scale dataset under full supervision. However, these models only learn to map an image to a predefined class index, without revealing the actual semantic meaning of the object in the image. In contrast, vision-language models like CLIP are able to assign semantic class names to…

    Submitted 5 April, 2023; originally announced April 2023.

  5. arXiv:2204.03635  [pdf, other]

    cs.CV cs.RO

    Zero-Shot Category-Level Object Pose Estimation

    Authors: Walter Goodwin, Sagar Vaze, Ioannis Havoutis, Ingmar Posner

    Abstract: Object pose estimation is an important component of most vision pipelines for embodied agents, as well as in 3D vision more generally. In this paper we tackle the problem of estimating the pose of novel object categories in a zero-shot manner. This extends much of the existing literature by removing the need for pose-labelled datasets or category-specific CAD models for training or inference. Spec…

    Submitted 2 October, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: 28 pages, 6 figures

    Journal ref: ECCV 2022

  6. arXiv:2201.02609  [pdf, other]

    cs.CV cs.LG

    Generalized Category Discovery

    Authors: Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman

    Abstract: In this paper, we consider a highly general image recognition setting wherein, given a labelled and unlabelled set of images, the task is to categorize all images in the unlabelled set. Here, the unlabelled images may come from labelled classes or from novel ones. Existing recognition methods are not able to deal with this setting, because they make several restrictive assumptions, such as the unl…

    Submitted 18 June, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

    Comments: CVPR 22. Changes from pre-print highlighted in GitHub repo

  7. arXiv:2111.07975  [pdf, other]

    cs.RO cs.CV

    Semantically Grounded Object Matching for Robust Robotic Scene Rearrangement

    Authors: Walter Goodwin, Sagar Vaze, Ioannis Havoutis, Ingmar Posner

    Abstract: Object rearrangement has recently emerged as a key competency in robot manipulation, with practical solutions generally involving object detection, recognition, grasping and high-level planning. Goal-images describing a desired scene configuration are a promising and increasingly used mode of instruction. A key outstanding challenge is the accurate inference of matches between objects in front of…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: 8 pages, 5 figures

  8. arXiv:2110.06207  [pdf, other]

    cs.CV cs.LG

    Open-Set Recognition: a Good Closed-Set Classifier is All You Need?

    Authors: Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman

    Abstract: The ability to identify whether or not a test sample belongs to one of the semantic classes in a classifier's training set is critical to practical deployment of the model. This task is termed open-set recognition (OSR) and has received significant attention in recent years. In this paper, we first demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlat…

    Submitted 13 April, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: ICLR 22 Oral. Changes from pre-print highlighted on GitHub page

  9. arXiv:2009.07000  [pdf, other]

    cs.CV cs.LG eess.IV

    Optimal Use of Multi-spectral Satellite Data with Convolutional Neural Networks

    Authors: Sagar Vaze, James Foley, Mohamed Seddiq, Alexey Unagaev, Natalia Efremova

    Abstract: The analysis of satellite imagery will prove a crucial tool in the pursuit of sustainable development. While Convolutional Neural Networks (CNNs) have made large gains in natural image analysis, their application to multi-spectral satellite images (wherein input images have a large number of channels) remains relatively unexplored. In this paper, we compare different methods of leveraging multi-ba…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: AI for Social Good workshop - Harvard CRCS

  10. arXiv:2003.10823  [pdf, other]

    eess.IV cs.CV

    SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework

    Authors: Conrad James Foley, Sagar Vaze, Mohamed El Amine Seddiq, Alexey Unagaev, Natalia Efremova

    Abstract: Soil moisture is a critical component of crop health, and monitoring it can enable further actions for increasing yield or preventing catastrophic die-off. As climate change increases the likelihood of extreme weather events and reduces the predictability of weather, non-optimal soil moistures for crops may become more likely. In this work, we use a series of LSTM architectures to analyze measurement…

    Submitted 24 April, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: Climate change AI workshop

    Journal ref: ICLR 2020

  11. arXiv:1209.1291  [pdf, other]

    cs.IT

    The degrees of freedom of MIMO networks with full-duplex receiver cooperation but no CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The question of whether the degrees of freedom (DoF) of multi-user networks can be enhanced even under isotropic fading and no channel state information (or output feedback) at the transmitters (CSIT) is investigated. Toward this end, the two-user MIMO (multiple-input, multiple-output) broadcast and interference channels are studied with no side-information whatsoever at the transmitters and with…

    Submitted 6 September, 2012; originally announced September 2012.

    Comments: This work was presented at the Workshop on Interference in Wireless Networks, Boston University, June 2012

  12. arXiv:1209.0047  [pdf, other]

    cs.IT

    The Degrees of Freedom Region of the MIMO Interference Channel with Hybrid CSIT

    Authors: Kaniska Mohanty, Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The degrees of freedom (DoF) region of the two-user MIMO (multiple-input multiple-output) interference channel is established under a new model termed as hybrid CSIT. In this model, one transmitter has delayed channel state information (CSI) and the other transmitter has instantaneous CSIT, of incoming channel matrices at the respective unpaired receivers, and neither transmitter has any knowledge…

    Submitted 2 December, 2013; v1 submitted 31 August, 2012; originally announced September 2012.

  13. arXiv:1202.6658  [pdf, other]

    cs.IT

    Independent signaling achieves the capacity region of the Gaussian interference channel with common information to within one bit

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The interference channel with common information (IC-CI) consists of two transmit-receive pairs that communicate over a common noisy medium. Each transmitter has an individual message for its paired receiver, and additionally, both transmitters have a common message to deliver to both receivers. In this paper, through explicit inner and outer bounds on the capacity region, we establish the capacit…

    Submitted 29 February, 2012; originally announced February 2012.

    Comments: Submitted to IEEE Trans. on Information Theory, Feb. 2012

  14. arXiv:1109.5790  [pdf, other]

    cs.IT

    The Degrees of Freedom of the 2-Hop, 2-User Interference Channel with Feedback

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The layered two-hop, two-flow interference network is considered that consists of two sources, two relays and two destinations with the first hop network between the sources and the relays and the second hop network between relays and destinations both being i.i.d. Rayleigh fading Gaussian interference channels. Two feedback models are studied. In the first one, called the delayed channel state inf…

    Submitted 27 September, 2011; originally announced September 2011.

    Comments: Submitted, July 10, 2011 to the 2011 Allerton Conf. Commun., Control, Comput., Monticello, IL; accepted August 04, 2011

  15. arXiv:1109.5779  [pdf, other]

    cs.IT

    The Degrees of Freedom Region of the MIMO Interference Channel with Shannon Feedback

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The two-user multiple-input multiple-output (MIMO) fast-fading interference channel (IC) with an arbitrary number of antennas at each of the four terminals is studied under the settings of Shannon feedback, limited Shannon feedback, and output feedback, wherein all or certain channel matrices and outputs, or just the channel outputs, respectively, are available to the transmitters with a finite de…

    Submitted 31 October, 2011; v1 submitted 27 September, 2011; originally announced September 2011.

    Comments: 30 pages, 3 tables, 9 figures. This paper was submitted to the IEEE Trans. Inform. Th. Oct. 2011. It was presented in part at the 49th Annual Allerton Conference on Communications, Control and Computing in Sept. 2011

  16. arXiv:1105.6033  [pdf, other]

    cs.IT

    A New Outer-Bound via Interference Localization and the Degrees of Freedom Regions of MIMO Interference Networks with no CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The two-user multi-input, multi-output (MIMO) interference and cognitive radio channels are studied under the assumption of no channel state information at the transmitter (CSIT) from the degrees of freedom (DoF) region perspective. With $M_i$ and $N_i$ denoting the number of antennas at transmitter $i$ and receiver $i$ respectively, the DoF regions of the MIMO interference channel were recently c…

    Submitted 30 May, 2011; originally announced May 2011.

    Comments: Submitted to IEEE Trans. Information Theory, May 2011. Material in this paper will be presented in part at the IEEE Intern. Symp. Information Theory (ISIT), Aug. 2011

  17. arXiv:1101.5809  [pdf, other]

    cs.IT

    The Degrees of Freedom Region and Interference Alignment for the MIMO Interference Channel with Delayed CSI

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The degrees of freedom (DoF) region of the 2-user multiple-antenna or MIMO (multiple-input, multiple-output) interference channel (IC) is studied under fast fading and the assumption of delayed channel state information (CSI) wherein all terminals know all (or certain) channel matrices perfectly, but with a delay, and each receiver in addition knows its own incoming channels instantaneously.…

    Submitted 11 March, 2011; v1 submitted 30 January, 2011; originally announced January 2011.

    Comments: New results are added. 57 pages, 6 figures, 2 tables, submitted to IEEE Trans. Inform. Theory

  18. arXiv:1101.0306  [pdf, other]

    cs.IT

    The Degrees of Freedom Regions of Two-User and Certain Three-User MIMO Broadcast Channels with Delayed CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The degrees of freedom (DoF) region of the fast-fading MIMO (multiple-input multiple-output) Gaussian broadcast channel (BC) is studied when there is delayed channel state information at the transmitter (CSIT). In this setting, the channel matrices are assumed to vary independently across time and the transmitter is assumed to know the channel matrices with some arbitrary finite delay. An outer-bo…

    Submitted 22 December, 2011; v1 submitted 31 December, 2010; originally announced January 2011.

    Comments: 27 pages, 6 figures; submitted to IEEE Trans. on Information Theory, Dec. 2011

  19. arXiv:1002.1532   

    cs.IT

    On the scaling of feedback bits to achieve the full multiplexing gain over the Gaussian broadcast channel using DPC

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: This paper has been withdrawn by the author(s) for revision.

    Submitted 9 February, 2010; v1 submitted 8 February, 2010; originally announced February 2010.

    Comments: This paper has been withdrawn

  20. arXiv:1002.1531  [pdf, ps, other]

    cs.IT

    A Large-System Analysis of the Imperfect-CSIT Gaussian Broadcast Channel with a DPC-based Transmission Strategy

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The Gaussian broadcast channel (GBC) with $K$ transmit antennas and $K$ single-antenna users is considered for the case in which the channel state information is obtained at the transmitter via a finite-rate feedback link of capacity $r$ bits per user. The throughput (i.e., the sum-rate normalized by $K$) of the GBC is analyzed in the limit as $K \to \infty$ with $\frac{r}{K} \to \bar{r}$. Consi…

    Submitted 18 February, 2010; v1 submitted 8 February, 2010; originally announced February 2010.

    Comments: Submitted to ISIT 2010

  21. arXiv:1002.1530   

    cs.IT

    The Degrees of Freedom Region of the MIMO Cognitive Interference Channel with No CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: This paper has been withdrawn by the author(s) for revision.

    Submitted 9 February, 2010; v1 submitted 8 February, 2010; originally announced February 2010.

    Comments: This paper has been withdrawn

  22. arXiv:0909.5424  [pdf, ps, other]

    cs.IT

    The Degrees of Freedom Regions of MIMO Broadcast, Interference, and Cognitive Radio Channels with No CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The degrees of freedom (DoF) regions are characterized for the multiple-input multiple-output (MIMO) broadcast channel (BC), interference channels (IC) (including X and multi-hop interference channels) and the cognitive radio channel (CRC), when there is perfect and no channel state information at the receivers and the transmitter(s) (CSIR and CSIT), respectively. For the K-user MIMO BC, the exact…

    Submitted 23 January, 2011; v1 submitted 29 September, 2009; originally announced September 2009.

    Comments: 49 pages, 11 figures, under review, IEEE Trans. Inform. Th. Submitted Sept. 2009, Revised Jan. 2011

  23. arXiv:0906.2252  [pdf, ps, other]

    cs.IT

    Dirty Paper Coding for the MIMO Cognitive Radio Channel with Imperfect CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: A Dirty Paper Coding (DPC) based transmission scheme for the Gaussian multiple-input multiple-output (MIMO) cognitive radio channel (CRC) is studied when there is imperfect and perfect channel knowledge at the transmitters (CSIT) and the receivers, respectively. In particular, the problem of optimizing the sum-rate of the MIMO CRC over the transmit covariance matrices is dealt with. Such an opti…

    Submitted 12 June, 2009; originally announced June 2009.

    Comments: To be presented at ISIT 2009, Seoul, S. Korea

  24. arXiv:0903.4526  [pdf, ps, other]

    cs.IT

    On the Achievable Rate of the Fading Dirty Paper Channel with Imperfect CSIT

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The problem of dirty paper coding (DPC) over the (multi-antenna) fading dirty paper channel (FDPC) Y = H(X + S) + Z is considered when there is imperfect knowledge of the channel state information H at the transmitter (CSIT). The case of FDPC with positive definite (p.d.) input covariance matrix was studied by the authors in a recent paper, and here the more general case of positive semi-definit…

    Submitted 26 March, 2009; originally announced March 2009.

    Comments: Presented at the 43rd Annual Conference on Information Sciences and Systems, Johns Hopkins University, March 2009

  25. arXiv:0901.2764  [pdf, ps, other]

    cs.IT

    Dirty Paper Coding for Fading Channels with Partial Transmitter Side Information

    Authors: Chinmay S. Vaze, Mahesh K. Varanasi

    Abstract: The problem of Dirty Paper Coding (DPC) over the Fading Dirty Paper Channel (FDPC) Y = H(X + S) + Z, a more general version of Costa's channel, is studied for the case in which there is partial and perfect knowledge of the fading process H at the transmitter (CSIT) and the receiver (CSIR), respectively. A key step in this problem is to determine the optimal inflation factor (under Costa's choice o…

    Submitted 19 January, 2009; originally announced January 2009.

    Comments: 5 pages with 2 figures, presented at 42nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, USA, Oct. 2008