
Showing 1–4 of 4 results for author: Velimirović, M

Searching in archive eess.

  1. arXiv:2404.17490  [pdf, other]

    eess.AS cs.SD eess.SP

    The CARFAC v2 Cochlear Model in Matlab, NumPy, and JAX

    Authors: Richard F. Lyon, Rob Schonberger, Malcolm Slaney, Mihajlo Velimirović, Honglin Yu

    Abstract: The open-source CARFAC (Cascade of Asymmetric Resonators with Fast-Acting Compression) cochlear model is upgraded to version 2, with improvements to the Matlab implementation, and with new Python/NumPy and JAX implementations -- but C++ version changes are still pending. One change addresses the DC (direct current, or zero frequency) quadratic distortion anomaly previously reported; another reduce…

    Submitted 26 April, 2024; originally announced April 2024.

  2. arXiv:2306.12925  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS stat.ML

    AudioPaLM: A Large Language Model That Can Speak and Listen

    Authors: Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, et al. (5 additional authors not shown)

    Abstract: We introduce AudioPaLM, a large language model for speech understanding and generation. AudioPaLM fuses text-based and speech-based language models, PaLM-2 [Anil et al., 2023] and AudioLM [Borsos et al., 2022], into a unified multimodal architecture that can process and generate text and speech with applications including speech recognition and speech-to-speech translation. AudioPaLM inherits the…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: Technical report

  3. arXiv:1910.11664  [pdf, other]

    eess.AS cs.LG cs.SD

    SPICE: Self-supervised Pitch Estimation

    Authors: Beat Gfeller, Christian Frank, Dominik Roblek, Matt Sharifi, Marco Tagliasacchi, Mihajlo Velimirović

    Abstract: We propose a model to estimate the fundamental frequency in monophonic audio, often referred to as pitch estimation. We acknowledge the fact that obtaining ground truth annotations at the required temporal and frequency resolution is a particularly daunting task. Therefore, we propose to adopt a self-supervised learning technique, which is able to estimate pitch without any form of supervision. Th…

    Submitted 4 September, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

    Comments: Accepted to IEEE Transactions on Audio, Speech and Language Processing

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 1118-1128, 2020

  4. arXiv:1711.10958  [pdf, other]

    cs.SD cs.AI eess.AS

    Now Playing: Continuous low-power music recognition

    Authors: Blaise Agüera y Arcas, Beat Gfeller, Ruiqi Guo, Kevin Kilgour, Sanjiv Kumar, James Lyon, Julian Odell, Marvin Ritter, Dominik Roblek, Matthew Sharifi, Mihajlo Velimirović

    Abstract: Existing music recognition applications require a connection to a server that performs the actual recognition. In this paper we present a low-power music recognizer that runs entirely on a mobile device and automatically recognizes music without user interaction. To reduce battery consumption, a small music detector runs continuously on the mobile device's DSP chip and wakes up the main applicatio…

    Submitted 29 November, 2017; originally announced November 2017.

    Comments: Authors are listed in alphabetical order by last name