Showing 1–1 of 1 results for author: Valverde, F R

  1. arXiv:2103.01353

    cs.CV cs.LG cs.RO

    There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

    Authors: Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada

    Abstract: Attributes of sound inherent to objects can provide valuable cues to learn rich representations for object detection and tracking. Furthermore, the co-occurrence of audiovisual events in videos can be exploited to localize objects over the image field by solely monitoring the sound in the environment. Thus far, this has only been feasible in scenarios where the camera is static and for single obje…

    Submitted 1 March, 2021; originally announced March 2021.

    Comments: Accepted at CVPR 2021. Dataset, code and models are available at http://rl.uni-freiburg.de/research/multimodal-distill

    Journal ref: IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11612-11621, 2021