Showing 1–20 of 20 results for author: Aluísio, S

  1. arXiv:2305.14580  [pdf, other]

    cs.CL cs.AI

    Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person

    Authors: Lucas Rafael Stefanel Gris, Ricardo Marcacini, Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Sandra Maria Aluísio

    Abstract: Automatic speech recognition (ASR) systems play a key role in applications involving human-machine interactions. Despite their importance, ASR models for the Portuguese language proposed in the last decade have limitations in relation to the correct identification of punctuation marks in automatic transcriptions, which hinder the use of transcriptions by other systems, models, and even by humans.…

    Submitted 26 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  2. arXiv:2211.14372  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Interpretability Analysis of Deep Models for COVID-19 Detection

    Authors: Daniel Peixoto Pinto da Silva, Edresson Casanova, Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Marcelo Finger, Flaviane Svartman, Beatriz Raposo, Marcus Vinícius Moreira Martins, Sandra Maria Aluísio, Larissa Cristina Berti, João Paulo Teixeira

    Abstract: During the outbreak of COVID-19 pandemic, several research areas joined efforts to mitigate the damages caused by SARS-CoV-2. In this paper we present an interpretability analysis of a convolutional neural network based model for COVID-19 detection in audios. We investigate which features are important for model decision process, investigating spectrograms, F0, F0 standard deviation, sex and age.…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 14 pages, 4 figures

  3. arXiv:2210.07852  [pdf, other]

    cs.CL cs.SD eess.AS

    Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models

    Authors: Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Vinícius G. dos Santos, Bruno A. Papa Dias, Marli Quadros Leite, Flaviane Romani Fernandes Svartman, Sandra Aluísio

    Abstract: The NURC Project that started in 1969 to study the cultured linguistic urban norm spoken in five Brazilian capitals, was responsible for compiling a large corpus for each capital. The digitized NURC/SP comprises 375 inquiries in 334 hours of recordings taken in São Paulo capital. Although 47 inquiries have transcripts, there was no alignment between the audio-transcription, and 328 inquiries were…

    Submitted 14 October, 2022; originally announced October 2022.

  4. arXiv:2204.00618  [pdf, other]

    eess.AS cs.CL cs.SD

    ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion

    Authors: Edresson Casanova, Christopher Shulby, Alexander Korolev, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Aluísio, Moacir Antonelli Ponti

    Abstract: We explore cross-lingual multi-speaker speech synthesis and cross-lingual voice conversion applied to data augmentation for automatic speech recognition (ASR) systems in low/medium-resource scenarios. Through extensive experiments, we show that our approach permits the application of speech synthesis and voice conversion to improve ASR systems using only one target-language speaker during model tr…

    Submitted 20 May, 2023; v1 submitted 29 March, 2022; originally announced April 2022.

    Comments: This paper was accepted at INTERSPEECH 2023

  5. arXiv:2201.03445  [pdf, other]

    cs.CL

    NILC-Metrix: assessing the complexity of written and spoken language in Brazilian Portuguese

    Authors: Sidney Evaldo Leal, Magali Sanches Duran, Carolina Evaristo Scarton, Nathan Siegle Hartmann, Sandra Maria Aluísio

    Abstract: This paper presents and makes publicly available the NILC-Metrix, a computational system comprising 200 metrics proposed in studies on discourse, psycholinguistics, cognitive and computational linguistics, to assess textual complexity in Brazilian Portuguese (BP). These metrics are relevant for descriptive analysis and the creation of computational models and can be used to extract information fro…

    Submitted 17 December, 2021; originally announced January 2022.

    Comments: 26 pages

  6. arXiv:2110.15731  [pdf, other]

    cs.CL cs.SD eess.AS

    CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

    Authors: Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio

    Abstract: Automatic Speech recognition (ASR) is a complex and challenging task. In recent years, there have been significant advances in the area. In particular, for the Brazilian Portuguese (BP) language, there were about 376 hours public available for ASR task until the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however,…

    Submitted 18 November, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: This paper is under consideration at Language Resources and Evaluation (LREV)

  7. arXiv:2104.05557  [pdf, other]

    eess.AS cs.SD

    SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model

    Authors: Edresson Casanova, Christopher Shulby, Eren Gölge, Nicolas Michael Müller, Frederico Santos de Oliveira, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Maria Aluisio, Moacir Antonelli Ponti

    Abstract: In this paper, we propose SC-GlowTTS: an efficient zero-shot multi-speaker text-to-speech model that improves similarity for speakers unseen during training. We propose a speaker-conditional architecture that explores a flow-based decoder that works in a zero-shot scenario. As text encoders, we explore a dilated residual convolutional-based encoder, gated convolutional-based encoder, and transform…

    Submitted 15 June, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

    Comments: Accepted on Interspeech 2021

  8. arXiv:2005.05144  [pdf, other]

    eess.AS cs.CL cs.LG

    TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese

    Authors: Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, João Paulo Teixeira, Moacir Antonelli Ponti, Sandra Maria Aluisio

    Abstract: Speech provides a natural way for human-computer interaction. In particular, speech synthesis systems are popular in different applications, such as personal assistants, GPS applications, screen readers and accessibility tools. However, not all languages are on the same level when in terms of resources and systems for speech synthesis. This work consists of creating publicly available resources fo…

    Submitted 29 January, 2022; v1 submitted 11 May, 2020; originally announced May 2020.

  9. arXiv:2002.11213  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models

    Authors: Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, Lucas Rafael Stefanel Gris, Hamilton Pereira da Silva, Sandra Maria Aluisio, Moacir Antonelli Ponti

    Abstract: In this paper we present an efficient method for training models for speaker recognition using small or under-resourced datasets. This method requires less data than other SOTA (State-Of-The-Art) methods, e.g. the Angular Prototypical and GE2E loss functions, while achieving similar results to those methods. This is done using the knowledge of the reconstruction of a phoneme in the speaker's voice…

    Submitted 18 June, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Submitted to BRACIS

  10. MilkQA: a Dataset of Consumer Questions for the Task of Answer Selection

    Authors: Marcelo Criscuolo, Erick Rocha Fonseca, Sandra Maria Aluísio, Ana Carolina Sperança-Criscuolo

    Abstract: We introduce MilkQA, a question answering dataset from the dairy domain dedicated to the study of consumer questions. The dataset contains 2,657 pairs of questions and answers, written in the Portuguese language and originally collected by the Brazilian Agricultural Research Corporation (Embrapa). All questions were motivated by real situations and written by thousands of authors with very differe…

    Submitted 10 January, 2018; originally announced January 2018.

    Comments: 6 pages

    Journal ref: Intelligent Systems (BRACIS), 2017 Brazilian Conference on

  11. arXiv:1708.06025  [pdf, ps, other]

    cs.CL

    Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

    Authors: Nathan Hartmann, Erick Fonseca, Christopher Shulby, Marcos Treviso, Jessica Rodrigues, Sandra Aluisio

    Abstract: Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants. We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Word…

    Submitted 20 August, 2017; originally announced August 2017.

    Comments: 7 pages, STIL 2017 Full paper

  12. arXiv:1708.04704  [pdf, other]

    cs.CL

    Evaluating Word Embeddings for Sentence Boundary Detection in Speech Transcripts

    Authors: Marcos V. Treviso, Christopher D. Shulby, Sandra M. Aluisio

    Abstract: This paper is motivated by the automation of neuropsychological tests involving discourse analysis in the retellings of narratives by patients with potential cognitive impairment. In this scenario the task of sentence boundary detection in speech transcripts is important as discourse analysis involves the application of Natural Language Processing tools, such as taggers and parsers, which depend o…

    Submitted 15 August, 2017; originally announced August 2017.

    Comments: Accepted on STIL 2017

  13. arXiv:1706.09055  [pdf, other]

    cs.SD cs.CL

    Acoustic Modeling Using a Shallow CNN-HTSVM Architecture

    Authors: Christopher Dane Shulby, Martha Dais Ferreira, Rodrigo F. de Mello, Sandra Maria Aluisio

    Abstract: High-accuracy speech recognition is especially challenging when large datasets are not available. It is possible to bridge this gap with careful and knowledge-driven parsing combined with the biologically inspired CNN and the learning guarantees of the Vapnik Chervonenkis (VC) theory. This work presents a Shallow-CNN-HTSVM (Hierarchical Tree Support Vector Machine classifier) architecture which us…

    Submitted 27 June, 2017; originally announced June 2017.

    Comments: Pre-review version of Bracis 2017

  14. arXiv:1705.07008  [pdf, ps, other]

    cs.CL

    A Lightweight Regression Method to Infer Psycholinguistic Properties for Brazilian Portuguese

    Authors: Leandro B. dos Santos, Magali S. Duran, Nathan S. Hartmann, Arnaldo Candido Jr., Gustavo H. Paetzold, Sandra M. Aluisio

    Abstract: Psycholinguistic properties of words have been used in various approaches to Natural Language Processing tasks, such as text simplification and readability assessment. Most of these properties are subjective, involving costly and time-consuming surveys to be gathered. Recent approaches use the limited datasets of psycholinguistic properties to extend them automatically to large lexicons. However,…

    Submitted 19 May, 2017; originally announced May 2017.

    Comments: Paper accepted for TSD2017

  15. arXiv:1704.08088  [pdf, other]

    cs.CL

    Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts

    Authors: Leandro B. dos Santos, Edilson A. Corrêa Jr, Osvaldo N. Oliveira Jr, Diego R. Amancio, Letícia L. Mansur, Sandra M. Aluísio

    Abstract: Mild Cognitive Impairment (MCI) is a mental disorder difficult to diagnose. Linguistic features, mainly from parsers, have been used to detect MCI, but this is not suitable for large-scale assessments. MCI disfluencies produce non-grammatical speech that requires manual or high precision automatic correction of transcripts. In this paper, we modeled transcripts into complex networks and enriched t…

    Submitted 26 April, 2017; originally announced April 2017.

    Comments: Published in Annual Meeting of the Association for Computational Linguistics 2017

  16. Automatic semantic role labeling on non-revised syntactic trees of journalistic texts

    Authors: Nathan Siegle Hartmann, Magali Sanches Duran, Sandra Maria Aluísio

    Abstract: Semantic Role Labeling (SRL) is a Natural Language Processing task that enables the detection of events described in sentences and the participants of these events. For Brazilian Portuguese (BP), there are two studies recently concluded that perform SRL in journalistic texts. [1] obtained F1-measure scores of 79.6, using the PropBank.Br corpus, which has syntactic trees manually revised, [8], with…

    Submitted 10 April, 2017; originally announced April 2017.

    Comments: PROPOR International Conference on the Computational Processing of Portuguese, 2016, 8 pages

    Journal ref: PROPOR 2016. Springer. Lecture Notes in Computer Science volume 9727 (2016) pgs. 202-212

  17. Automatic Classification of the Complexity of Nonfiction Texts in Portuguese for Early School Years

    Authors: Nathan Siegle Hartmann, Livia Cucatto, Danielle Brants, Sandra Aluísio

    Abstract: Recent research shows that most Brazilian students have serious problems regarding their reading skills. The full development of this skill is key for the academic and professional future of every citizen. Tools for classifying the complexity of reading materials for children aim to improve the quality of the model of teaching reading and text comprehension. For English, Feng's work [11] is conside…

    Submitted 10 April, 2017; originally announced April 2017.

    Comments: PROPOR International Conference on the Computational Processing of Portuguese, 2016, 9 pages

    Journal ref: Hartmann N., Cucatto L., Brants D., Aluísio S. (2016) Automatic Classification of the Complexity of Nonfiction Texts in Portuguese for Early School Years. In: Computational Processing of the Portuguese Language. PROPOR 2016. Springer

  18. arXiv:1610.00211  [pdf, other]

    cs.CL

    Sentence Segmentation in Narrative Transcripts from Neuropsychological Tests using Recurrent Convolutional Neural Networks

    Authors: Marcos Vinícius Treviso, Christopher Shulby, Sandra Maria Aluísio

    Abstract: Automated discourse analysis tools based on Natural Language Processing (NLP) aiming at the diagnosis of language-impairing dementias generally extract several textual metrics of narrative transcripts. However, the absence of sentence boundary segmentation in the transcripts prevents the direct application of NLP methods which rely on these marks to function properly, such as taggers and parsers.…

    Submitted 15 August, 2017; v1 submitted 1 October, 2016; originally announced October 2016.

    Comments: EACL 2017

    MSC Class: 68T50

  19. arXiv:1302.4490  [pdf, other]

    physics.soc-ph cs.CL cs.SI physics.data-an

    Complex networks analysis of language complexity

    Authors: Diego R. Amancio, Sandra M. Aluisio, Osvaldo N. Oliveira Jr., Luciano da F. Costa

    Abstract: Methods from statistical physics, such as those involving complex networks, have been increasingly used in quantitative analysis of linguistic phenomena. In this paper, we represented pieces of text with different levels of simplification in co-occurrence networks and found that topological regularity correlated negatively with textual complexity. Furthermore, in less complex texts the distance be…

    Submitted 18 February, 2013; originally announced February 2013.

    Comments: The Supplementary Information (SI) is available from https://dl.dropbox.com/u/2740286/supplementary.pdf

    Journal ref: Europhysics Letters (2012) 100 58002

  20. arXiv:cs/0611013  [pdf]

    cs.OH

    Developing strategies to produce better scientific papers: a Recipe for non-native users of English

    Authors: Osvaldo N. Oliveira Jr., Valtencir Zucolotto, Sandra M. Aluisio

    Abstract: In this paper we introduce the AMADEUS strategy, which has been used to produce scientific writing tools for non-native users of English for 15 years, and emphasize a learn-by-doing approach through which students and novice writers can improve their scientific writing. More specifically, we provide a 9-step recipe for the students to compile writing material according to a procedure that has pr…

    Submitted 3 November, 2006; originally announced November 2006.

    Comments: 10 pages, 1 figure