Showing 1–2 of 2 results for author: Hogg, A O T

  1. arXiv:2312.16763  [pdf, other]

    eess.AS cs.SD

    Uncertainty Quantification in Machine Learning for Joint Speaker Diarization and Identification

    Authors: Simon W. McKnight, Aidan O. T. Hogg, Vincent W. Neo, Patrick A. Naylor

    Abstract: This paper studies modulation spectrum features ($Φ$) and mel-frequency cepstral coefficients ($Ψ$) in joint speaker diarization and identification (JSID). JSID is important because speaker diarization on its own, which only distinguishes speakers, is insufficient for many applications; it is often necessary to identify the speakers as well. Machine learning models are set up using convolutional neural networks (CNNs)…

    Submitted 30 December, 2023; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: 12 pages, 7 figures

  2. arXiv:2306.05812  [pdf, other]

    eess.AS cs.CV cs.HC cs.LG cs.SD eess.SP

    HRTF upsampling with a generative adversarial network using a gnomonic equiangular projection

    Authors: Aidan O. T. Hogg, Mads Jenkins, He Liu, Isaac Squires, Samuel J. Cooper, Lorenzo Picinali

    Abstract: An individualised head-related transfer function (HRTF) is very important for creating realistic virtual reality (VR) and augmented reality (AR) environments. However, acoustically measuring high-quality HRTFs requires expensive equipment and an acoustic lab setting. To overcome these limitations and to make this measurement more efficient, HRTF upsampling has been exploited in the past where a hig…

    Submitted 27 February, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: 15 pages, 9 figures, Preprint (accepted to IEEE/ACM Transactions on Audio, Speech, and Language Processing on 15 February 2024)