Showing 1–2 of 2 results for author: Ferreira, A R

Search v0.5.6 released 2020-02-24

arXiv:2309.12802

cs.SD cs.LG eess.AS

doi 10.21528/CBIC2023-169

Deepfake audio as a data augmentation technique for training automatic speech to text transcription models

Authors: Alexandre R. Ferreira, Cláudio E. C. Campelo

Abstract: To train transcriptor models that produce robust results, a large and diverse labeled dataset is required. Finding such data with the necessary characteristics is a challenging task, especially for languages less popular than English. Moreover, producing such data requires significant effort and often money. Therefore, a strategy to mitigate this problem is the use of data augmentation techniques. In this work, we propose a framework that approaches data augmentation based on deepfake audio. To validate the produced framework, experiments were conducted using existing deepfake and transcription models. A voice cloner and a dataset produced by Indians (in English) were selected, ensuring the presence of a single accent in the dataset. Subsequently, the augmented data was used to train speech to text models in various scenarios.

Submitted 22 September, 2023; originally announced September 2023.

Comments: 9 pages, 6 figures, 7 tables

ACM Class: I.2.6; I.2.0; E.0

arXiv:2208.10996

cs.CV

An Evolutionary Approach for Creating of Diverse Classifier Ensembles

Authors: Alvaro R. Ferreira Jr, Fabio A. Faria, Gustavo Carneiro, Vinicius V. de Melo

Abstract: Classification is one of the most studied tasks in data mining and machine learning areas and many works in the literature have been presented to solve classification problems for multiple fields of knowledge such as medicine, biology, security, and remote sensing. Since there is no single classifier that achieves the best results for all kinds of applications, a good alternative is to adopt classifier fusion strategies. A key point in the success of classifier fusion approaches is the combination of diversity and accuracy among classifiers belonging to an ensemble. With a large amount of classification models available in the literature, one challenge is the choice of the most suitable classifiers to compose the final classification system, which generates the need of classifier selection strategies. We address this point by proposing a framework for classifier selection and fusion based on a four-step protocol called CIF-E (Classifiers, Initialization, Fitness function, and Evolutionary algorithm). We implement and evaluate 24 varied ensemble approaches following the proposed CIF-E protocol and we are able to find the most accurate approach. A comparative analysis has also been performed among the best approaches and many other baselines from the literature. The experiments show that the proposed evolutionary approach based on Univariate Marginal Distribution Algorithm (UMDA) can outperform the state-of-the-art literature approaches in many well-known UCI datasets.

Submitted 23 August, 2022; originally announced August 2022.
