Showing 1–7 of 7 results for author: Cámara, M

  1. arXiv:2406.14379  [pdf, other]

    eess.AS

    Decoding Vocal Articulations from Acoustic Latent Representations

    Authors: Mateo Cámara, Fernando Marcos, José Luis Blanco

    Abstract: We present a novel neural encoder system for acoustic-to-articulatory inversion. We leverage the Pink Trombone voice synthesizer, which reveals articulatory parameters (e.g., tongue position and vocal cord configuration). Our system is designed to identify the articulatory features responsible for producing specific acoustic characteristics contained in a neural latent representation. To generate the…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Presented at AES Europe 2024 in Madrid

  2. Perceptually Equivalent Resolution in Handheld Devices for Streaming Bandwidth Saving

    Authors: Mateo Cámara, César Díaz, Juan Casal, Jorge Ruano, Narciso García

    Abstract: We present the description, results, and analysis of the experiments conducted to find the equivalent resolution associated with handheld devices. That is, the resolution from which users stop perceiving quality improvements if better resolutions are presented to them in such devices. Thus, it is the maximum resolution that it is worth considering for generating and delivering video, as long as se…

    Submitted 7 February, 2024; originally announced February 2024.

    Journal ref: IEEE Signal Processing Letters, vol. 26, no. 6, pp. 878-882, Jun. 2019

  3. arXiv:2402.01385  [pdf]

    eess.AS cs.SD

    Del Visual al Auditivo: Sonorización de Escenas Guiada por Imagen

    Authors: María Sánchez, Laura Fernández, Julián Arias, Mateo Cámara, Giulia Comini, Adam Gabrys, José Luis Blanco, Juan Ignacio Godino, Luis Alfonso Hernández

    Abstract: Recent advances in image, video, text and audio generative techniques, and their use by the general public, are leading to new forms of content generation. Usually, each modality was approached separately, which poses limitations. The automatic sound recording of visual sequences is one of the greatest challenges for the automatic generation of multimodal content. We present a processing flow that…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 10 pages, in Spanish, Tecniacústica

  4. arXiv:2310.16140  [pdf]

    eess.AS cs.SD

    IA Para el Mantenimiento Predictivo en Canteras: Modelado

    Authors: Fernando Marcos, Rodrigo Tamaki, Mateo Cámara, Virginia Yagüe, José Luis Blanco

    Abstract: Dependence on raw materials, especially in the mining sector, is a key part of today's economy. Aggregates are vital, being the second most used raw material after water. Digitally transforming this sector is key to optimizing operations. However, supervision and maintenance (predictive and corrective) are challenges little explored in this sector, due to the particularities of the sector, machine…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 10 pages, in Spanish, 5 figures. Presented at the Tecniacústica 2023 conference (Cuenca, Spain)

  5. arXiv:2310.15663  [pdf]

    eess.AS cs.SD

    FOLEY-VAE: Generación de efectos de audio para cine con inteligencia artificial

    Authors: Mateo Cámara, José Luis Blanco

    Abstract: In this research, we present an interface based on Variational Autoencoders trained with a wide range of natural sounds for the innovative creation of Foley effects. The model can transfer new sound features to prerecorded audio or microphone-captured speech in real time. In addition, it allows interactive modification of latent variables, facilitating precise and customized artistic adjustments.…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 9 pages, in Spanish, Tecniacústica

  6. arXiv:2309.14761  [pdf, other]

    eess.AS cs.SD

    Optimization Techniques for a Physical Model of Human Vocalisation

    Authors: Mateo Cámara, Zhiyuan Xu, Yisu Zong, José Luis Blanco, Joshua D. Reiss

    Abstract: We present a non-supervised approach to optimize and evaluate the synthesis of non-speech audio effects from a speech production model. We use the Pink Trombone synthesizer as a case study of a simplified production model of the vocal tract to target non-speech human audio signals, specifically yawns. We selected and optimized the control parameters of the synthesizer to minimize the difference between rea…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to DAFx 2023

  7. arXiv:2307.04702  [pdf, other]

    cs.SD eess.AS

    Vocal Tract Area Estimation by Gradient Descent

    Authors: David Südholt, Mateo Cámara, Zhiyuan Xu, Joshua D. Reiss

    Abstract: Articulatory features can provide interpretable and flexible controls for the synthesis of human vocalizations by allowing the user to directly modify parameters like vocal strain or lip position. To make this manipulation through resynthesis possible, we need to estimate the features that result in a desired vocalization directly from audio recordings. In this work, we propose a white-box optimiz…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Accepted to DAFx 2023