Showing 1–5 of 5 results for author: Ollerenshaw, A

Searching in archive cs.
  1. arXiv:2306.17500

    cs.SD cs.AI cs.LG eess.AS

    Empirical Interpretation of the Relationship Between Speech Acoustic Context and Emotion Recognition

    Authors: Anna Ollerenshaw, Md Asif Jalal, Rosanna Milner, Thomas Hain

    Abstract: Speech emotion recognition (SER) is vital for obtaining emotional intelligence and understanding the contextual meaning of speech. Variations of consonant-vowel (CV) phonemic boundaries can enrich acoustic context with linguistic cues, which impacts SER. In practice, speech emotions are treated as single labels over an acoustic segment for a given time duration. However, phone boundaries within sp…

    Submitted 30 June, 2023; originally announced June 2023.

  2. arXiv:2303.00550

    eess.AS cs.SD

    Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation

    Authors: Rehan Ahmad, Md Asif Jalal, Muhammad Umar Farooq, Anna Ollerenshaw, Thomas Hain

    Abstract: Knowledge distillation has widely been used for model compression and domain adaptation for speech applications. In the presence of multiple teachers, knowledge can easily be transferred to the student by averaging the models' output. However, previous research shows that the student does not adapt well with such a combination. This paper proposes to use an elitist sampling strategy at the output of ens…

    Submitted 1 March, 2023; originally announced March 2023.

  3. arXiv:2211.02000

    cs.SD cs.CL eess.AS

    Dynamic Kernels and Channel Attention for Low Resource Speaker Verification

    Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

    Abstract: State-of-the-art speaker verification frameworks have typically focused on developing models with increasingly deeper (more layers) and wider (number of channels) models to improve their verification performance. Instead, this paper proposes an approach to increase the model resolution capability using attention-based dynamic kernels in a convolutional neural network to adapt the model parameters…

    Submitted 27 February, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

  4. arXiv:2211.01993

    cs.CL cs.SD eess.AS

    Probing Statistical Representations For End-To-End ASR

    Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

    Abstract: End-to-End automatic speech recognition (ASR) models aim to learn a generalised speech representation to perform recognition. In this domain there is little research to analyse internal representation dependencies and their relationship to modelling approaches. This paper investigates cross-domain language model dependencies within transformer architectures using SVCCA and uses these insights to e…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP 2023

  5. Insights on Neural Representations for End-to-End Speech Recognition

    Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain

    Abstract: End-to-end automatic speech recognition (ASR) models aim to learn a generalised speech representation. However, there are limited tools available to understand the internal functions and the effect of hierarchical dependencies within the model architecture. It is crucial to understand the correlations between the layer-wise representations, to derive insights on the relationship between neural rep…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Submitted to Interspeech 2021

    Journal ref: Proc. Interspeech 2021, 4079-4083