Deepfake audio as a data augmentation technique for training automatic speech to text transcription models
Authors:
Alexandre R. Ferreira,
Cláudio E. C. Campelo
Abstract:
To train transcription models that produce robust results, a large and diverse labeled dataset is required. Finding data with the necessary characteristics is challenging, especially for languages less popular than English. Moreover, producing such data requires significant effort and is often costly. One strategy to mitigate this problem is data augmentation. In this work, we propose a framework that performs data augmentation using deepfake audio. To validate the framework, experiments were conducted using existing deepfake and transcription models. A voice cloner and a dataset of English speech recorded by Indian speakers were selected, ensuring that the dataset contains a single accent. The augmented data was then used to train speech-to-text models in various scenarios.
Submitted 22 September, 2023;
originally announced September 2023.
An Evolutionary Approach for Creating of Diverse Classifier Ensembles
Authors:
Alvaro R. Ferreira Jr,
Fabio A. Faria,
Gustavo Carneiro,
Vinicius V. de Melo
Abstract:
Classification is one of the most studied tasks in data mining and machine learning, and many works in the literature address classification problems in fields such as medicine, biology, security, and remote sensing. Since no single classifier achieves the best results for all kinds of applications, a good alternative is to adopt classifier fusion strategies. A key to the success of classifier fusion is combining diversity and accuracy among the classifiers in an ensemble. With a large number of classification models available in the literature, one challenge is choosing the most suitable classifiers to compose the final classification system, which creates the need for classifier selection strategies. We address this point by proposing a framework for classifier selection and fusion based on a four-step protocol called CIF-E (Classifiers, Initialization, Fitness function, and Evolutionary algorithm). We implement and evaluate 24 varied ensemble approaches following the proposed CIF-E protocol and identify the most accurate one. A comparative analysis has also been performed between the best approaches and many other baselines from the literature. The experiments show that the proposed evolutionary approach based on the Univariate Marginal Distribution Algorithm (UMDA) can outperform state-of-the-art approaches from the literature on many well-known UCI datasets.
Submitted 23 August, 2022;
originally announced August 2022.
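As a rough illustration of the idea in the abstract above, classifier selection with a UMDA can be sketched as a search over binary masks, one bit per candidate classifier. This is a minimal sketch under stated assumptions, not the authors' CIF-E implementation: the fitness function (majority-vote accuracy on held-out binary labels), the uniform 0.5 initialization, the truncation selection, the probability clamping, and all names (`majority_vote_accuracy`, `umda_select`, `pop_size`, `n_keep`) are hypothetical choices made for illustration.

```python
import random


def majority_vote_accuracy(mask, predictions, labels):
    """Accuracy of a majority vote over the classifiers selected by `mask`.

    `predictions[k][i]` is classifier k's 0/1 prediction for sample i.
    Assumption: binary labels; ties are resolved in favor of class 0.
    """
    chosen = [p for keep, p in zip(mask, predictions) if keep]
    if not chosen:
        return 0.0  # an empty ensemble is treated as useless
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in chosen)
        pred = 1 if 2 * votes > len(chosen) else 0
        correct += int(pred == y)
    return correct / len(labels)


def umda_select(predictions, labels, pop_size=40, n_keep=10,
                generations=30, seed=0):
    """Search for a good classifier subset with a simple UMDA.

    Each generation samples masks from independent per-bit probabilities,
    keeps the `n_keep` fittest masks, and re-estimates each probability
    as the frequency of that bit among the kept masks.
    """
    rng = random.Random(seed)
    n = len(predictions)
    probs = [0.5] * n  # assumed initialization: every classifier equally likely
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        population = [[int(rng.random() < p) for p in probs]
                      for _ in range(pop_size)]
        scored = sorted(
            ((majority_vote_accuracy(m, predictions, labels), m)
             for m in population),
            key=lambda pair: pair[0],
            reverse=True,
        )
        elite = [m for _, m in scored[:n_keep]]
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0][0], list(scored[0][1])
        # Univariate marginal update, clamped so no bit becomes fixed
        # (the clamping is an assumption, not part of the basic algorithm).
        probs = [min(0.95, max(0.05, sum(m[i] for m in elite) / len(elite)))
                 for i in range(n)]
    return best_mask, best_fit


if __name__ == "__main__":
    toy_labels = [0, 1, 0, 1, 1, 0, 1, 0]
    toy_predictions = [
        [0, 1, 0, 1, 1, 0, 1, 0],  # always right
        [1, 1, 0, 1, 0, 0, 1, 0],  # two mistakes
        [1, 0, 1, 0, 0, 1, 0, 1],  # always wrong
        [0, 1, 1, 1, 1, 0, 0, 0],  # two mistakes
    ]
    mask, fitness = umda_select(toy_predictions, toy_labels)
    print("selected classifiers:", mask, "validation accuracy:", fitness)
```

In practice the fitness would be computed on a validation split that is kept separate from the final test data, since selecting the ensemble on the same data used to report results would overstate its accuracy.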