Showing 1–5 of 5 results for author: Ng, R W M

  1. A cross-corpus study on speech emotion recognition

    Authors: Rosanna Milner, Md Asif Jalal, Raymond W. M. Ng, Thomas Hain

    Abstract: For speech emotion datasets, it has been difficult to acquire large quantities of reliable data and acted emotions may be over the top compared to less expressive emotions displayed in everyday life. Lately, larger datasets with natural emotions have been created. Instead of ignoring smaller, acted datasets, this study investigates whether information learnt from acted emotions is useful for detec…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: ASRU 2019

    Journal ref: IEEE Workshop on Automatic Speech Recognition and Understanding 2019

  2. arXiv:1606.03333

    cs.MM cs.CL cs.IR

    Automatic Genre and Show Identification of Broadcast Media

    Authors: Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain

    Abstract: Huge amounts of digital videos are being produced and broadcast every day, leading to giant media archives. Effective techniques are needed to make such data accessible further. Automatic meta-data labelling of broadcast media is an essential task for multimedia indexing, where it is standard to use multi-modal input for such purposes. This paper describes a novel method for automatic detection of…

    Submitted 10 June, 2016; originally announced June 2016.

    Comments: Proc. of 17th Interspeech (2016), San Francisco, California, USA

  3. The 2015 Sheffield System for Transcription of Multi-Genre Broadcast Media

    Authors: Oscar Saz, Mortaza Doulaty, Salil Deena, Rosanna Milner, Raymond W. M. Ng, Madina Hasan, Yulan Liu, Thomas Hain

    Abstract: We describe the University of Sheffield system for participation in the 2015 Multi-Genre Broadcast (MGB) challenge task of transcribing multi-genre broadcast shows. Transcription was one of four tasks proposed in the MGB challenge, with the aim of advancing the state of the art of automatic speech recognition, speaker diarisation and automatic alignment of subtitles for broadcast media. Four topic…

    Submitted 21 December, 2015; originally announced December 2015.

    Comments: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), 13-17 Dec 2015, Scottsdale, Arizona, USA

  4. Latent Dirichlet Allocation Based Organisation of Broadcast Media Archives for Deep Neural Network Adaptation

    Authors: Mortaza Doulaty, Oscar Saz, Raymond W. M. Ng, Thomas Hain

    Abstract: This paper presents a new method for the discovery of latent domains in diverse speech data, for the use of adaptation of Deep Neural Networks (DNNs) for Automatic Speech Recognition. Our work focuses on transcription of multi-genre broadcast media, which is often only categorised broadly in terms of high level genres such as sports, news, documentary, etc. However, in terms of acoustic modelling…

    Submitted 16 November, 2015; originally announced November 2015.

    Comments: IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2015), 13-17 Dec 2015, Scottsdale, Arizona, USA

  5. arXiv:1509.03870

    cs.CL

    The USFD Spoken Language Translation System for IWSLT 2014

    Authors: Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada AlHarbi, Lucia Specia, Thomas Hain

    Abstract: The University of Sheffield (USFD) participated in the International Workshop for Spoken Language Translation (IWSLT) in 2014. In this paper, we will introduce the USFD SLT system for IWSLT. Automatic speech recognition (ASR) is achieved by two multi-pass deep neural network systems with adaptation and rescoring techniques. Machine translation (MT) is achieved by a phrase-based system. The USFD pr…

    Submitted 13 September, 2015; originally announced September 2015.

    Journal ref: Proc. of 11th International Workshop on Spoken Language Translation (SLT 2014) 86-91, Lake Tahoe, USA, December 4th and 5th, 2014