Showing 1–3 of 3 results for author: Gari, S V A

  1. arXiv:2403.19217  [pdf, other]

    eess.AS

    Blind Identification of Binaural Room Impulse Responses from Smart Glasses

    Authors: Thomas Deppisch, Nils Meyer-Kahlen, Sebastià V. Amengual Garí

    Abstract: Smart glasses are increasingly recognized as a key medium for augmented reality, offering a hands-free platform with integrated microphones and non-ear-occluding loudspeakers to seamlessly mix virtual sound sources into the real-world acoustic scene. To convincingly integrate virtual sound sources, the room acoustic rendering of the virtual sources must match the real-world acoustics. Information…

    Submitted 28 March, 2024; originally announced March 2024.

  2. Direct and Residual Subspace Decomposition of Spatial Room Impulse Responses

    Authors: Thomas Deppisch, Sebastià V. Amengual Garí, Paul Calamia, Jens Ahrens

    Abstract: Psychoacoustic experiments have shown that directional properties of the direct sound, salient reflections, and the late reverberation of an acoustic room response can have a distinct influence on the auditory perception of a given room. Spatial room impulse responses (SRIRs) capture those properties and thus are used for direction-dependent room acoustic analysis and virtual acoustic rendering. T…

    Submitted 31 January, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: This article has been accepted for publication in the IEEE/ACM Transactions on Audio, Speech, and Language Processing. (c) 2023 IEEE

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 927-942, 2023

  3. arXiv:1912.11474  [pdf, other]

    cs.CV cs.HC cs.SD eess.AS

    SoundSpaces: Audio-Visual Navigation in 3D Environments

    Authors: Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman

    Abstract: Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf---restricted to solely their visual perception of the environment. We introduce audio-visual navigation for complex, acoustically and visually realistic 3D environments. By both seeing and hearing, the agent must learn to navigate to a sounding object. We propose a multi-modal deep reinforcement…

    Submitted 21 August, 2020; v1 submitted 24 December, 2019; originally announced December 2019.

    Comments: Accepted to ECCV 2020 (Spotlight). Project page: http://vision.cs.utexas.edu/projects/audio_visual_navigation/