Skip to main content

Showing 1–2 of 2 results for author: Saravanan, D

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.10889  [pdf, other

    cs.CV cs.AI cs.LG

    VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

    Authors: Darshana Saravanan, Darshan Singh, Varun Gupta, Zeeshan Khan, Vineet Gandhi, Makarand Tapaswi

    Abstract: Compositionality is a fundamental aspect of vision-language understanding and is especially required for videos since they contain multiple entities (e.g. persons, actions, and scenes) interacting dynamically over time. Existing benchmarks focus primarily on perception capabilities. However, they do not study binding, the ability of a model to associate entities through appropriate relationships.… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 26 pages, 17 figures, 3 tables

  2. arXiv:2402.04835  [pdf, other

    cs.CV cs.LG

    Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning

    Authors: Darshana Saravanan, Naresh Manwani, Vineet Gandhi

    Abstract: Partial label learning (PLL) is a weakly-supervised learning paradigm where each training instance is paired with a set of candidate labels (partial label), one of which is the true label. Noisy PLL (NPLL) relaxes this constraint by allowing some partial labels to not contain the true label, enhancing the practicality of the problem. Our work centres on NPLL and presents a minimalistic framework t… ▽ More

    Submitted 28 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 7 tables, 2 figures