Showing 1–7 of 7 results for author: Hegde, S B

  1. arXiv:2310.05304

    cs.CV

    GestSync: Determining who is speaking without a talking head

    Authors: Sindhu B Hegde, Andrew Zisserman

    Abstract: In this paper we introduce a new synchronisation task, Gesture-Sync: determining if a person's gestures are correlated with their speech or not. In comparison to Lip-Sync, Gesture-Sync is far more challenging as there is a far looser relationship between the voice and body movement than there is between voice and lip motion. We introduce a dual-encoder model for this task, and compare a number of…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: Accepted in BMVC 2023, 10 pages paper, 7 pages supplementary, 7 Figures

  2. arXiv:2209.00642

    cs.CV cs.CL cs.SD eess.AS

    Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild

    Authors: Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay P Namboodiri, C. V. Jawahar

    Abstract: In this work, we address the problem of generating speech from silent lip videos for any speaker in the wild. In stark contrast to previous works, our method (i) is not restricted to a fixed number of speakers, (ii) does not explicitly impose constraints on the domain or the vocabulary and (iii) deals with videos that are recorded in the wild as opposed to within laboratory settings. The task pres…

    Submitted 1 September, 2022; originally announced September 2022.

    Comments: Accepted in ACM-MM 2022, 9 pages, 2 pages supplementary, 7 Figures

  3. Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors

    Authors: Sindhu B Hegde, Rudrabha Mukhopadhyay, Vinay P Namboodiri, C. V. Jawahar

    Abstract: In this paper, we explore an interesting question of what can be obtained from an $8\times8$ pixel video sequence. Surprisingly, it turns out to be quite a lot. We show that when we process this $8\times8$ video with the right set of audio and image priors, we can obtain a full-length, $256\times256$ video. We achieve this $32\times$ scaling of an extremely low-resolution input using our novel aud…

    Submitted 17 August, 2022; originally announced August 2022.

    Comments: Accepted in ACM-MM 2022, 10 pages, 6 pages supplementary, 18 Figures

  4. arXiv:2106.12790

    cs.CV

    Towards Automatic Speech to Sign Language Generation

    Authors: Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B Hegde, Vinay Namboodiri, C V Jawahar

    Abstract: We aim to solve the highly challenging task of generating continuous sign language videos solely from speech segments for the first time. Recent efforts in this space have focused on generating such videos from human-annotated text transcripts without considering other modalities. However, replacing speech with sign language proves to be a practical solution while communicating with people sufferi…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: 5 pages (including references), 5 figures, Accepted in Interspeech 2021

  5. arXiv:2012.10852

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Visual Speech Enhancement Without A Real Visual Stream

    Authors: Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

    Abstract: In this work, we re-think the task of speech enhancement in unconstrained real-world environments. Current state-of-the-art methods use only the audio stream and are limited in their performance in a wide range of real-world noises. Recent works using lip movements as additional cues improve the quality of generated speech over "audio-only" methods. But, these methods cannot be used for several ap…

    Submitted 20 December, 2020; originally announced December 2020.

    Comments: 10 pages, 4 figures, Accepted in WACV 2021

  6. A Cognitive Theory-based Opportunistic Resource-Pooling Scheme for Ad hoc Networks

    Authors: Seema B Hegde, B. Sathish Babu, Pallapa Venkatram

    Abstract: Resource pooling in ad hoc networks deals with accumulating computing and network resources to implement network control schemes such as routing, congestion, traffic management, and so on. Pooling of resources can be accomplished using the distributed and dynamic nature of ad hoc networks to achieve collaboration between the devices. Ad hoc networks need a resource-pooling technique that offers qu…

    Submitted 11 July, 2017; originally announced July 2017.

    Comments: 22 pages, 16 figures

    Journal ref: Journal of Intelligent Systems 2017 Volume 26 issue 1 pp 47-68

  7. An Opportunistic AODV Routing Scheme: A Cognitive Mobile Agents Approach

    Authors: Seema B Hegde, Sathish Babu, Pallapa Venkatram

    Abstract: In MANETs, dynamics and robustness are the key features of the nodes and are governed by several routing protocols such as AODV, DSR and so on. However, the growing resource demand in the network leads to resource scarcity. Node mobility often leads to link breakages and high routing overhead, decreasing the stability and reliability of the network connectivity. In this context, the paper pro…

    Submitted 11 July, 2017; originally announced July 2017.

    Comments: 21 pages, 16 figures, International Journal of Ad hoc, Sensor & Ubiquitous Computing

    ACM Class: C.2.2; I.2.0