Showing 1–50 of 54 results for author: Koerich, A L

  1. arXiv:2404.12260  [pdf, other]

    cs.CV cs.LG

    Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models

    Authors: Israel A. Laurensi, Alceu de Souza Britto Jr., Jean Paul Barddal, Alessandro Lameiras Koerich

    Abstract: Facial expression recognition is a pivotal component in machine learning, facilitating various applications. However, convolutional neural networks (CNNs) are often plagued by catastrophic forgetting, impeding their adaptability. The proposed method, emotion-centered generative replay (ECgr), tackles this challenge by integrating synthetic images from generative adversarial networks. Moreover, ECg…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 15 pages

  2. arXiv:2404.12251  [pdf, other]

    cs.LG cs.CV cs.SD eess.AS

    Dynamic Modality and View Selection for Multimodal Emotion Recognition with Missing Modalities

    Authors: Luciana Trinkaus Menon, Luiz Carlos Ribeiro Neduziak, Jean Paul Barddal, Alessandro Lameiras Koerich, Alceu de Souza Britto Jr

    Abstract: The study of human emotions, traditionally a cornerstone in fields like psychology and neuroscience, has been profoundly impacted by the advent of artificial intelligence (AI). Multiple channels, such as speech (voice) and facial expressions (image), are crucial in understanding human emotions. However, AI's journey in multimodal emotion recognition (MER) is marked by substantial technical challen…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 15 pages

  3. arXiv:2403.15455  [pdf, other]

    cs.CL cs.LG

    Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams

    Authors: Cristiano Mesquita Garcia, Alessandro Lameiras Koerich, Alceu de Souza Britto Jr, Jean Paul Barddal

    Abstract: The proliferation of textual data on the Internet presents a unique opportunity for institutions and companies to monitor public opinion about their services and products. Given the rapid generation of such data, the text stream mining setting, which handles sequentially arriving, potentially infinite text streams, is often more suitable than traditional batch learning. While pre-trained language…

    Submitted 18 March, 2024; originally announced March 2024.

  4. arXiv:2403.12328  [pdf, other]

    cs.LG cs.CL cs.IR

    Methods for Generating Drift in Text Streams

    Authors: Cristiano Mesquita Garcia, Alessandro Lameiras Koerich, Alceu de Souza Britto Jr, Jean Paul Barddal

    Abstract: Systems and individuals produce data continuously. On the Internet, people share their knowledge, sentiments, and opinions, provide reviews about services and products, and so on. Automatically learning from these textual data can provide insights to organizations and institutions, thus preventing financial impacts, for example. To learn from textual data over time, the machine learning system mus…

    Submitted 18 March, 2024; originally announced March 2024.

  5. arXiv:2403.10488  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    Joint Multimodal Transformer for Emotion Recognition in the Wild

    Authors: Paul Waligora, Haseeb Aslam, Osama Zeeshan, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

    Abstract: Multimodal emotion recognition (MMER) systems typically outperform unimodal systems by leveraging the inter- and intra-modal relationships between, e.g., visual, textual, physiological, and auditory modalities. This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention. This framework can exploit the complementary nature of dive…

    Submitted 20 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 10 pages, 4 figures, 6 tables, CVPRw 2024

  6. arXiv:2402.00281  [pdf, other]

    cs.CV

    Guided Interpretable Facial Expression Recognition via Spatial Action Unit Cues

    Authors: Soufiane Belharbi, Marco Pedersoli, Alessandro Lameiras Koerich, Simon Bacon, Eric Granger

    Abstract: Although state-of-the-art classifiers for facial expression recognition (FER) can achieve a high level of accuracy, they lack interpretability, an important feature for end-users. Experts typically associate spatial action units (AUs) from a codebook to facial regions for the visual interpretation of expressions. In this paper, the same expert steps are followed. A new learning strategy is propos…

    Submitted 14 May, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 15 pages, 11 figures, 3 tables, International Conference on Automatic Face and Gesture Recognition (FG 2024)

  7. arXiv:2312.05632  [pdf, other]

    cs.CV

    Subject-Based Domain Adaptation for Facial Expression Recognition

    Authors: Muhammad Osama Zeeshan, Muhammad Haseeb Aslam, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

    Abstract: Adapting a deep learning model to a specific target individual is a challenging facial expression recognition (FER) task that may be achieved using unsupervised domain adaptation (UDA) methods. Although several UDA methods have been proposed to adapt deep FER models across source and target data sets, multiple subject-specific source domains are needed to accurately represent the intra- and inter-…

    Submitted 26 April, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

  8. arXiv:2312.02901  [pdf, other]

    cs.LG cs.CL cs.IR

    Concept Drift Adaptation in Text Stream Mining Settings: A Comprehensive Review

    Authors: Cristiano Mesquita Garcia, Ramon Simoes Abilio, Alessandro Lameiras Koerich, Alceu de Souza Britto Jr., Jean Paul Barddal

    Abstract: Due to the advent and increase in the popularity of the Internet, people have been producing and disseminating textual data in several ways, such as reviews, social media posts, and news articles. As a result, numerous researchers have been working on discovering patterns in textual data, especially because social media posts function as social sensors, indicating people's opinions, interests, etc…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 49 pages

  9. arXiv:2208.02397  [pdf, other]

    cs.CV cs.IR cs.LG

    Pattern Spotting and Image Retrieval in Historical Documents using Deep Hashing

    Authors: Caio da S. Dias, Alceu de S. Britto Jr., Jean P. Barddal, Laurent Heutte, Alessandro L. Koerich

    Abstract: This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents. First, a region proposal algorithm detects object candidates in the document page images. Next, deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations. Finally, candida…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: 7 pages

  10. arXiv:2206.08537  [pdf, ps, other]

    cs.CV cs.LG

    Large-Margin Representation Learning for Texture Classification

    Authors: Jonathan de Matos, Luiz Eduardo Soares de Oliveira, Alceu de Souza Britto Junior, Alessandro Lameiras Koerich

    Abstract: This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification. The core of such an approach is a loss function that computes the distances between instances of interest and support vectors. The objective is to update the weights of CLs iteratively to learn a representation with…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: 7 pages

  11. arXiv:2204.12624  [pdf, other]

    cs.CV cs.LG

    Evaluation of Self-taught Learning-based Representations for Facial Emotion Recognition

    Authors: Bruna Delazeri, Leonardo L. Veras, Alceu de S. Britto Jr., Jean Paul Barddal, Alessandro L. Koerich

    Abstract: This work describes different strategies to generate unsupervised representations obtained through the concept of self-taught learning for facial emotion recognition (FER). The idea is to create complementary representations promoting diversity by varying the autoencoders' initialization, architecture, and training data. SVM, Bagging, Random Forest, and a dynamic ensemble selection method are eval…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: 8 pages

  12. arXiv:2204.12622  [pdf, other]

    cs.SD cs.CR eess.AS

    Named Entity Recognition for Audio De-Identification

    Authors: Guillaume Baril, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: Data anonymization is often a task carried out by humans. Automating it would reduce the cost and time required to complete this task. This paper presents a pipeline to automate the anonymization of audio data in French. We propose a pipeline, which takes audio files with their transcriptions and removes the named entities (NEs) present in the audio. Our pipeline is made up of a forced aligner, wh…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: 8 pages

  13. arXiv:2204.09841  [pdf, other]

    cs.CV

    Multiscale Analysis for Improving Texture Classification

    Authors: Steve T. M. Ataky, Diego Saqui, Jonathan de Matos, Alceu S. Britto Jr., Alessandro L. Koerich

    Abstract: Information from an image occurs over multiple and distinct spatial scales. Image pyramid multiresolution representations are a useful data structure for image analysis and manipulation over a spectrum of spatial scales. This paper employs the Gaussian-Laplacian pyramid to treat different spatial frequency bands of a texture separately. First, we generate three images corresponding to three levels…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 6 pages

  14. arXiv:2204.07018  [pdf, other]

    cs.SD cs.CR cs.CV cs.LG eess.AS

    From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network, namely ResNet-18. Our main motivation for focusing on such a front-end classifier rather than other complex architectures is balancing recognition accuracy and the total number…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: 32 pages, Preprint Submitted to Journal of Applied Acoustics. arXiv admin note: substantial text overlap with arXiv:2007.13703

  15. arXiv:2202.13270  [pdf, other]

    cs.CV cs.LG eess.IV

    Texture Characterization of Histopathologic Images Using Ecological Diversity Measures and Discrete Wavelet Transform

    Authors: Steve Tsham Mpinda Ataky, Alessandro Lameiras Koerich

    Abstract: Breast cancer is a health problem that affects mainly the female population. Early detection increases the chances of effective treatment, improving the prognosis of the disease. In this regard, computational tools have been proposed to assist the specialist in interpreting the breast digital image exam, providing features for detecting and diagnosing tumors and cancerous cells. Nonetheless, de…

    Submitted 26 February, 2022; originally announced February 2022.

    Comments: 14 pages

  16. arXiv:2105.07302  [pdf, other]

    cs.SD eess.AS

    1D CNN Architectures for Music Genre Classification

    Authors: Safaa Allamy, Alessandro Lameiras Koerich

    Abstract: This paper proposes a 1D residual convolutional neural network (CNN) architecture for music genre classification and compares it with other recent 1D CNN architectures. The 1D CNNs learn a representation and a discriminant directly from the raw audio signal. Several convolutional layers capture the time-frequency characteristics of the audio signal and learn various filters relevant to the music g…

    Submitted 15 May, 2021; originally announced May 2021.

    Comments: 6 pages

  17. arXiv:2103.14717  [pdf, other]

    cs.SD cs.CR eess.AS

    Cyclic Defense GAN Against Speech Adversarial Attacks

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: This paper proposes a new defense approach for counteracting state-of-the-art white and black-box adversarial attack algorithms. Our approach fits into the implicit reactive defense algorithm category since it does not directly manipulate the potentially malicious input signals. Instead, it reconstructs a similar signal with a synthesized spectrogram using a cyclic generative adversarial network.…

    Submitted 22 August, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: 5

    Journal ref: IEEE Signal Processing Letters (2021) 1-5

  18. arXiv:2103.10166  [pdf, other]

    q-bio.QM cs.LG cs.SD eess.AS

    Discriminative Singular Spectrum Classifier with Applications on Bioacoustic Signal Recognition

    Authors: Bernardo B. Gatto, Juan G. Colonna, Eulanda M. dos Santos, Alessandro L. Koerich, Kazuhiro Fukui

    Abstract: Automatic analysis of bioacoustic signals is a fundamental tool to evaluate the vitality of our planet. Frogs and bees, for instance, may act like biological sensors providing information about environmental changes. This task, fundamental for ecological monitoring, still includes many challenges such as nonuniform signal length processing, degraded target signal due to environmental noise, and t…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: 15 pages

  19. arXiv:2103.08095  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Towards Robust Speech-to-Text Adversarial Attack

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: This paper introduces a novel adversarial algorithm for attacking the state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo. Our approach is based on developing an extension for the conventional distortion condition of the adversarial optimization formulation using the Cramér integral probability metric. Minimizing over this metric, which measures the discrepancies between…

    Submitted 14 March, 2021; originally announced March 2021.

    Comments: 5 pages

  20. arXiv:2103.08086  [pdf, other]

    cs.SD cs.LG eess.AS

    Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems. The proposed defense algorithm has four major steps. First, we represent speech signals with 2D spectrograms using the short-time Fourier transform. Second, we iteratively find a safe vector using a spectrogram subspace projection operation. This operation minimizes th…

    Submitted 14 March, 2021; originally announced March 2021.

    Comments: 10 pages

  21. arXiv:2102.06997  [pdf, other]

    cs.CV

    A Novel Bio-Inspired Texture Descriptor based on Biodiversity and Taxonomic Measures

    Authors: Steve Tsham Mpinda Ataky, Alessandro Lameiras Koerich

    Abstract: Texture can be defined as the change of image intensity that forms repetitive patterns, resulting from physical properties of the object's roughness or differences in a reflection on the surface. Considering that texture forms a complex system of patterns in a non-deterministic way, biodiversity concepts can help texture characterization in images. This paper proposes a novel approach capable of q…

    Submitted 4 October, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: 34 pages

  22. arXiv:2102.03889  [pdf, other]

    cs.CV

    Machine Learning Methods for Histopathological Image Analysis: A Review

    Authors: Jonathan de Matos, Steve Tsham Mpinda Ataky, Alceu de Souza Britto Jr., Luiz Eduardo Soares de Oliveira, Alessandro Lameiras Koerich

    Abstract: Histopathological images (HIs) are the gold standard for evaluating some types of tumors for cancer diagnosis. The analysis of such images is not only time- and resource-consuming, but also very challenging even for experienced pathologists, resulting in inter- and intra-observer disagreements. One of the ways of accelerating such an analysis is to use computer-aided diagnosis (CAD) systems. In thi…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 45 pages. arXiv admin note: text overlap with arXiv:1904.07900

  23. arXiv:2011.09280  [pdf, other]

    cs.CV cs.LG

    Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks

    Authors: Thomas Teixeira, Eric Granger, Alessandro Lameiras Koerich

    Abstract: Facial expressions are one of the most powerful ways for depicting specific patterns in human behavior and describing human emotional state. Despite the impressive advances of affective computing over the last decade, automatic video-based systems for facial expression recognition still cannot properly handle variations in facial expression among individuals as well as cross-cultural and demograph…

    Submitted 15 January, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: 33 pages

  24. arXiv:2010.11352  [pdf, other]

    cs.SD cs.CR cs.CV cs.LG eess.AS

    Class-Conditional Defense GAN Against End-to-End Speech Attacks

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: In this paper we propose a novel defense approach against end-to-end adversarial attacks developed to fool advanced speech-to-text systems such as DeepSpeech and Lingvo. Unlike conventional defense approaches, the proposed approach does not directly employ low-level transformations such as autoencoding a given input signal aiming at removing potential adversarial perturbation. Instead of that, we…

    Submitted 19 February, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: 5 pages

    Journal ref: 46th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP), 2021

  25. arXiv:2010.05844  [pdf, other]

    cs.SD cs.LG eess.AS

    Conditioning Trick for Training Stable GANs

    Authors: Mohammad Esmaeilpour, Raymel Alfonso Sallo, Olivier St-Georges, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: In this paper we propose a conditioning trick, called difference departure from normality, applied on the generator network in response to instability issues during GAN training. We force the generator to get closer to the departure from normality function of real samples computed in the spectral domain of Schur decomposition. This binding makes the generator amenable to truncation and does not li…

    Submitted 12 October, 2020; originally announced October 2020.

  26. arXiv:2008.05454  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Improving Stability of LS-GANs for Audio and Speech Signals

    Authors: Mohammad Esmaeilpour, Raymel Alfonso Sallo, Olivier St-Georges, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: In this paper we address the instability issue of generative adversarial networks (GANs) by proposing a new similarity metric in unitary space of Schur decomposition for 2D representations of audio and speech signals. We show that encoding departure from normality computed in this vector space into the generator optimization formulation helps to craft more comprehensive spectrograms. We demonstrate…

    Submitted 12 August, 2020; originally announced August 2020.

    Comments: 10 pages

  27. arXiv:2007.13703  [pdf, other]

    eess.AS cs.LG cs.SD

    From Sound Representation to Model Robustness

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: In this paper, we investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. Averaged over various experiments on three benchmarking environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures such as G…

    Submitted 17 January, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

    Comments: 12 pages

  28. arXiv:2007.00779  [pdf, other]

    cs.CV

    Self-supervised Deep Reconstruction of Mixed Strip-shredded Text Documents

    Authors: Thiago M. Paixão, Rodrigo F. Berriel, Maria C. S. Boeres, Alessandro L. Koerich, Claudine Badue, Alberto F. de Souza, Thiago Oliveira-Santos

    Abstract: The reconstruction of shredded documents consists of coherently arranging fragments of paper (shreds) to recover the original document(s). A great challenge in computational reconstruction is to properly evaluate the compatibility between the shreds. While traditional pixel-based approaches are not robust to real shredding, more sophisticated solutions significantly compromise time performance. Th…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in Pattern Recognition

  29. arXiv:2005.09110  [pdf, other]

    cs.CV cs.LG

    Two-View Fine-grained Classification of Plant Species

    Authors: Voncarlos M. Araujo, Alceu S. Britto Jr., Luiz E. S. Oliveira, Alessandro L. Koerich

    Abstract: Automatic plant classification is a challenging problem due to the wide biodiversity of the existing plant species in a fine-grained scenario. Powerful deep learning architectures have been used to improve the classification performance in such a fine-grained problem, but they usually yield models that are highly dependent on a large training dataset and that are not scalable. In this paper, we pro…

    Submitted 4 October, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

  30. arXiv:2003.10063  [pdf, other]

    cs.CV cs.LG eess.IV

    Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning

    Authors: Thiago M. Paixão, Rodrigo F. Berriel, Maria C. S. Boeres, Alessandro L. Koerich, Claudine Badue, Alberto F. De Souza, Thiago Oliveira-Santos

    Abstract: The reconstruction of shredded documents consists in arranging the pieces of paper (shreds) in order to reassemble the original aspect of such documents. This task is particularly relevant for supporting forensic investigation as documents may contain criminal evidence. As an alternative to the laborious and time-consuming manual process, several researchers have been investigating ways to perform…

    Submitted 28 April, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020. Main Paper (9 pages, 10 figures) and Supplementary Material (5 pages, 9 figures)

  31. arXiv:2002.00072  [pdf, other]

    eess.IV cs.LG stat.ML

    Data Augmentation for Histopathological Images Based on Gaussian-Laplacian Pyramid Blending

    Authors: Steve Tsham Mpinda Ataky, Jonathan de Matos, Alceu de S. Britto Jr., Luiz E. S. Oliveira, Alessandro L. Koerich

    Abstract: Data imbalance is a major problem that affects several machine learning (ML) algorithms. Such a problem is troublesome because most of the ML algorithms attempt to optimize a loss function that does not take into account the data imbalance. Accordingly, the ML algorithm simply generates a trivial model that is biased toward predicting the most frequent class in the training data. In the case of hi…

    Submitted 16 May, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

    Comments: 8 pages

    Journal ref: IEEE International Joint Conference on Neural Networks (IJCNN 2020), Glasgow, UK

  32. arXiv:2001.11976  [pdf, other]

    cs.CV cs.LG

    Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor

    Authors: Sevegni Odilon Clement Allognon, Alessandro L. Koerich, Alceu de S. Britto Jr

    Abstract: Automatic facial expression recognition is an important research area in emotion recognition and computer vision. Applications can be found in several domains such as medical treatment, driver fatigue surveillance, sociable robotics, and several other human-computer interaction systems. Therefore, it is crucial that the machine should be able to recognize the emotional state of the user with h…

    Submitted 31 January, 2020; originally announced January 2020.

  33. arXiv:1910.12084  [pdf, ps, other]

    cs.LG cs.CR cs.SD eess.AS stat.ML

    Detection of Adversarial Attacks and Characterization of Adversarial Subspace

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: Adversarial attacks have always been a serious threat for any data-driven model. In this paper, we explore subspaces of adversarial examples in unitary vector domain, and we propose a novel detector for defending our models trained for environmental sound classification. We measure chordal distance between legitimate and malicious representation of sounds in unitary space of generalized Schur deco…

    Submitted 26 October, 2019; originally announced October 2019.

    Comments: Submitted to ICASSP 2020

  34. arXiv:1910.10106  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS stat.ML

    Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms

    Authors: Karl Michel Koerich, Mohammad Esmaeilpour, Sajjad Abdoli, Alceu de Souza Britto Jr., Alessandro Lameiras Koerich

    Abstract: This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Some commonly used adversarial attacks on images have been applied to Mel-frequency and short-time Fourier transform spectrograms, and such perturbed spectrograms are able to fool a 2D convolutional neural network (CNN). Such attacks produce…

    Submitted 29 July, 2020; v1 submitted 22 October, 2019; originally announced October 2019.

    Comments: 8 pages

    Journal ref: IEEE International Joint Conference on Neural Networks (IJCNN 2020), Glasgow, UK

  35. arXiv:1909.01954  [pdf, other]

    cs.LG stat.ML

    Tensor Analysis with n-Mode Generalized Difference Subspace

    Authors: Bernardo B. Gatto, Eulanda M. dos Santos, Alessandro L. Koerich, Kazuhiro Fukui, Waldir S. S. Junior

    Abstract: The increasing use of multiple sensors, which produce a large amount of multi-dimensional data, requires efficient representation and classification methods. In this paper, we present a new method for multi-dimensional data classification that relies on two premises: 1) multi-dimensional data are usually represented by tensors, since this brings benefits from multilinear algebra and established te…

    Submitted 29 November, 2020; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: Submitted to Expert Systems with Applications

  36. arXiv:1908.03173  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Universal Adversarial Audio Perturbations

    Authors: Sajjad Abdoli, Luiz G. Hafemann, Jerome Rony, Ismail Ben Ayed, Patrick Cardinal, Alessandro L. Koerich

    Abstract: We demonstrate the existence of universal adversarial perturbations, which can fool a family of audio classification architectures, for both targeted and untargeted attack scenarios. We propose two methods for finding such perturbations. The first method is based on an iterative, greedy approach that is well-known in computer vision: it aggregates small perturbations to the input so as to push it…

    Submitted 16 November, 2020; v1 submitted 8 August, 2019; originally announced August 2019.

  37. arXiv:1907.09613  [pdf, other]

    cs.LG stat.ML

    Incremental and Decremental Fuzzy Bounded Twin Support Vector Machine

    Authors: Alexandre Reeberg de Mello, Marcelo Ricardo Stemmer, Alessandro Lameiras Koerich

    Abstract: In this paper we present an incremental variant of the Twin Support Vector Machine (TWSVM) called Fuzzy Bounded Twin Support Vector Machine (FBTWSVM) to deal with large datasets and learning from data streams. We combine the TWSVM with a fuzzy membership function, so that each input has a different contribution to each hyperplane in a binary classifier. To solve the pair of quadratic programming p…

    Submitted 23 March, 2020; v1 submitted 22 July, 2019; originally announced July 2019.

    Comments: 23 pages

    Journal ref: Information Sciences, 2020

  38. arXiv:1907.09404  [pdf, other]

    cs.CV cs.LG cs.MM

    Deep Learning Approaches for Image Retrieval and Pattern Spotting in Ancient Documents

    Authors: Kelly Lais Wiggers, Alceu de Souza Britto Junior, Alessandro Lameiras Koerich, Laurent Heutte, Luiz Eduardo Soares de Oliveira

    Abstract: This paper describes two approaches for content-based image retrieval and pattern spotting in document images using deep learning. The first approach uses a pre-trained CNN model to cope with the lack of training data, which is fine-tuned to achieve a compact yet discriminant representation of queries and image candidates. The second approach uses a Siamese Convolution Neural Network trained on a…

    Submitted 22 July, 2019; originally announced July 2019.

    Comments: The paper is under consideration at Pattern Recognition Letters

  39. arXiv:1907.07270  [pdf, other]

    cs.CV

    Style Transfer Applied to Face Liveness Detection with User-Centered Models

    Authors: Israel A. Laurensi R., Luciana T. Menon, Manoel Camillo O. Penna N., Alessandro L. Koerich, Alceu S. Britto Jr

    Abstract: This paper proposes a face anti-spoofing user-centered model (FAS-UCM). The major difficulty, in this case, is obtaining fraudulent images from all users to train the models. To overcome this problem, the proposed method is divided into three main parts: generation of new spoof images, based on style transfer and spoof image representation models; training of a Convolutional Neural Network (CNN) for…

    Submitted 16 July, 2019; originally announced July 2019.

  40. arXiv:1907.04928  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Bag-of-Audio-Words based on Autoencoder Codebook for Continuous Emotion Prediction

    Authors: Mohammed Senoussaoui, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: In this paper we present a novel approach for extracting a Bag-of-Words (BoW) representation based on a Neural Network codebook. The conventional BoW model is based on a dictionary (codebook) built from elementary representations which are selected randomly or by using a clustering algorithm on a training dataset. A metric is then used to assign unseen elementary representations to the closest dic…

    Submitted 6 July, 2019; originally announced July 2019.

  41. arXiv:1907.03196  [pdf, other]

    cs.CV eess.AS eess.IV

    Multimodal Fusion with Deep Neural Networks for Audio-Video Emotion Recognition

    Authors: Juan D. S. Ortega, Mohammed Senoussaoui, Eric Granger, Marco Pedersoli, Patrick Cardinal, Alessandro L. Koerich

    Abstract: This paper presents a novel deep neural network (DNN) for multimodal fusion of audio, video and text modalities for emotion recognition. The proposed DNN architecture has independent and shared layers which aim to learn the representation for each modality, as well as the best combined representation to achieve the best prediction. Experimental results on the AVEC Sentiment Analysis in the Wild da…

    Submitted 6 July, 2019; originally announced July 2019.

  42. arXiv:1906.10623  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Emotion Recognition Using Fusion of Audio and Video Features

    Authors: Juan D. S. Ortega, Patrick Cardinal, Alessandro L. Koerich

    Abstract: In this paper we propose a fusion approach to continuous emotion recognition that combines visual and auditory modalities in their representation spaces to predict the arousal and valence levels. The proposed approach employs a pre-trained convolution neural network and transfer learning to extract features from video frames that capture the emotional content. For the auditory content, a minimalis…

    Submitted 25 June, 2019; originally announced June 2019.

  43. arXiv:1906.09513  [pdf]

    cs.CV

    Image Retrieval and Pattern Spotting using Siamese Neural Network

    Authors: Kelly L. Wiggers, Alceu S. Britto Jr., Laurent Heutte, Alessandro L. Koerich, Luiz S. Oliveira

    Abstract: This paper presents a novel approach for image retrieval and pattern spotting in document image collections. The manual feature engineering is avoided by learning a similarity-based representation using a Siamese Neural Network trained on a previously prepared subset of image pairs from the ImageNet dataset. The learned representation is used to provide the similarity-based feature maps used to fi…

    Submitted 22 June, 2019; originally announced June 2019.

    Comments: Accepted for IJCNN 2019

  44. arXiv:1905.12082  [pdf, other]

    cs.CV

    Memory Integrity of CNNs for Cross-Dataset Facial Expression Recognition

    Authors: Dylan C. Tannugi, Alceu S. Britto Jr., Alessandro L. Koerich

    Abstract: Facial expression recognition is a major problem in the domain of artificial intelligence. One of the best ways to solve this problem is the use of convolutional neural networks (CNNs). However, a large amount of data is required to properly train these networks, but most of the datasets available for facial expression recognition are relatively small. A common way to circumvent the lack of data is…

    Submitted 28 May, 2019; originally announced May 2019.

  45. arXiv:1905.12005  [pdf, other]

    cs.CV eess.IV

    Texture CNN for Histopathological Image Classification

    Authors: Jonathan de Matos, Alceu de S. Britto Jr., Luiz E. S. de Oliveira, Alessandro L. Koerich

    Abstract: Biopsies are the gold standard for breast cancer diagnosis. This task can be improved by the use of Computer Aided Diagnosis (CAD) systems, reducing the time of diagnosis and reducing the inter and intra-observer variability. The advances in computing have brought this type of system closer to reality. However, datasets of Histopathological Images (HI) from biopsies are quite small and unbalanced…

    Submitted 28 May, 2019; originally announced May 2019.

  46. arXiv:1905.12003  [pdf, other]

    cs.CV

    Texture CNN for Thermoelectric Metal Pipe Image Classification

    Authors: Daniel Vriesman, Alessandro Zimmer, Alceu S. Britto Jr., Alessandro L. Koerich

    Abstract: In this paper, the concept of representation learning based on deep neural networks is applied as an alternative to the use of handcrafted features in a method for automatic visual inspection of corroded thermoelectric metallic pipes. A texture convolutional neural network (TCNN) replaces handcrafted features based on Local Phase Quantization (LPQ) and Haralick descriptors (HD) with the advantage…

    Submitted 28 May, 2019; originally announced May 2019.

  47. arXiv:1904.11649  [pdf, other]

    cs.LG stat.ML

    A Novel Orthogonal Direction Mesh Adaptive Direct Search Approach for SVM Hyperparameter Tuning

    Authors: Alexandre Reeberg Mello, Jonathan de Matos, Marcelo R. Stemmer, Alceu de Souza Britto Jr., Alessandro Lameiras Koerich

    Abstract: In this paper, we propose the use of a black-box optimization method called deterministic Mesh Adaptive Direct Search (MADS) algorithm with orthogonal directions (Ortho-MADS) for the selection of hyperparameters of Support Vector Machines with a Gaussian kernel. Different from most of the methods in the literature that exploit the properties of the data or attempt to minimize the accuracy of a val…

    Submitted 25 April, 2019; originally announced April 2019.

  48. arXiv:1904.11641  [pdf, other]

    cs.SD cs.CL eess.AS

    Speaker Sincerity Detection based on Covariance Feature Vectors and Ensemble Methods

    Authors: Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro Lameiras Koerich

    Abstract: Automatic measuring of speaker sincerity degree is a novel research problem in computational paralinguistics. This paper proposes covariance-based feature vectors to model speech and ensembles of support vector regressors to estimate the degree of sincerity of a speaker. The elements of each covariance vector are pairwise statistics between the short-term feature components. These features are use…

    Submitted 25 April, 2019; originally announced April 2019.

  49. arXiv:1904.10990  [pdf, other]

    cs.LG cs.CR cs.SD eess.AS stat.ML

    A Robust Approach for Securing Audio Classification Against Adversarial Attacks

    Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: Adversarial audio attacks can be considered as a small perturbation unperceptive to human ears that is intentionally added to the audio signal and causes a machine learning model to make mistakes. This poses a security concern about the safety of machine learning models since the adversarial attacks can fool such models toward the wrong predictions. In this paper we first review some strong advers…

    Submitted 25 November, 2019; v1 submitted 24 April, 2019; originally announced April 2019.

    Comments: Paper Accepted for Publication in IEEE Transactions on Information Forensics and Security

  50. arXiv:1904.08990  [pdf, other]

    cs.SD cs.LG stat.ML

    End-to-End Environmental Sound Classification using a 1D Convolutional Neural Network

    Authors: Sajjad Abdoli, Patrick Cardinal, Alessandro Lameiras Koerich

    Abstract: In this paper, we present an end-to-end approach for environmental sound classification based on a 1D Convolution Neural Network (CNN) that learns a representation directly from the audio signal. Several convolutional layers are used to capture the signal's fine time structure and learn diverse filters that are relevant to the classification task. The proposed approach can deal with audio signals…

    Submitted 18 April, 2019; originally announced April 2019.