SpliceOut: A Simple and Efficient Audio Augmentation Method
Authors:
Arjit Jain,
Pranay Reddy Samala,
Deepak Mittal,
Preethi Jyoti,
Maneesh Singh
Abstract:
Time masking has become a de facto augmentation technique for speech and audio tasks, including automatic speech recognition (ASR) and audio classification, most notably as a part of SpecAugment. In this work, we propose SpliceOut, a simple modification to time masking which makes it computationally more efficient. SpliceOut performs comparably to (and sometimes outperforms) SpecAugment on a wide variety of speech and audio tasks, including ASR for seven different languages using varying amounts of training data, as well as on speech translation, sound and music classification, thus establishing itself as a broadly applicable audio augmentation method. SpliceOut also provides additional gains when used in conjunction with other augmentation techniques. Apart from the fully-supervised setting, we also demonstrate that SpliceOut can complement unsupervised representation learning with performance gains in the semi-supervised and self-supervised settings.
Submitted 13 October, 2021; v1 submitted 30 September, 2021;
originally announced October 2021.
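For readers unfamiliar with the baseline named in the SpliceOut abstract, plain time masking can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `time_mask`, the parameters `max_width` and `fill`, and the list-of-frames input layout are all assumptions. The abstract does not describe how SpliceOut modifies this step, so that modification is not shown.

```python
import random


def time_mask(frames, max_width, rng=None, fill=0.0):
    """Return a copy of `frames` with one random run of time steps set to `fill`.

    `frames` is a list of per-time-step feature lists (hypothetical layout).
    Also returns the (start, width) of the masked interval.
    """
    if rng is None:
        rng = random.Random()
    n = len(frames)
    # Mask width is drawn uniformly from 0..max_width, capped at the input length.
    width = rng.randint(0, min(max_width, n))
    # The start is chosen so that the masked interval stays inside the input.
    start = rng.randint(0, n - width)
    masked = [list(frame) for frame in frames]
    for t in range(start, start + width):
        masked[t] = [fill] * len(masked[t])
    return masked, (start, width)
```

Note that the masked output keeps the same number of time steps as the input; only the values inside the selected interval change.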
Evaluation of some Information Retrieval models for Gujarati Ad hoc Monolingual Tasks
Authors:
Hardik J. Joshi,
Pareek Jyoti
Abstract:
This paper describes work on the Gujarati ad hoc monolingual retrieval task using widely used Information Retrieval (IR) models. We present an indexing baseline for the Gujarati language, reported as Mean Average Precision (MAP) values. Our objective is to obtain a relative picture of which IR model works better for the Gujarati language. Results show that classical IR models such as Term Frequency Inverse Document Frequency (TF_IDF) perform better than a few recent probabilistic IR models. The experiments helped identify the best-performing IR models for the Gujarati language.
Submitted 1 September, 2012;
originally announced September 2012.
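The Mean Average Precision (MAP) metric used in the abstract above can be sketched as follows, using its standard definition. This is an illustrative sketch, not the paper's evaluation code: the function names and the input layout (a ranked list of document ids plus a set of relevant ids per query) are assumptions, and each ranked list is assumed to contain no duplicate ids.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: mean of precision@k at each relevant hit."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """MAP over a list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

For example, a ranking `["d1", "d2", "d3"]` with relevant set `{"d1", "d3"}` scores (1/1 + 2/3) / 2, and MAP averages such per-query scores across all queries.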