Search | arXiv e-print repository

doi 10.1007/s00521-022-07645-z

Feature transforms for image data augmentation

Authors: Loris Nanni, Michelangelo Paci, Sheryl Brahnam, Alessandra Lumini

Abstract: A problem with Convolutional Neural Networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation techni… ▽ More A problem with Convolutional Neural Networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, three of which are proposed here for the first time. These novel methods are based on the Fourier Transform (FT), the Radon Transform (RT) and the Discrete Cosine Transform (DCT). Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produce classifiers that not only compete competitively against the state-of-the-art but often surpass the best approaches reported in the literature. △ Less

Submitted 22 August, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

arXiv:2112.12955 [pdf]

Deep ensembles in bioimage segmentation

Authors: Loris Nanni, Daniela Cuza, Alessandra Lumini, Andrea Loreggia, Sheryl Brahnam

Abstract: Semantic segmentation consists in classifying each pixel of an image by assigning it to a specific label chosen from a set of all the available ones. During the last few years, a lot of attention shifted to this kind of task. Many computer vision researchers tried to apply autoencoder structures to develop models that can learn the semantics of the image as well as a low-level representation of it… ▽ More Semantic segmentation consists in classifying each pixel of an image by assigning it to a specific label chosen from a set of all the available ones. During the last few years, a lot of attention shifted to this kind of task. Many computer vision researchers tried to apply autoencoder structures to develop models that can learn the semantics of the image as well as a low-level representation of it. In an autoencoder architecture, given an input, an encoder computes a low dimensional representation of the input that is then used by a decoder to reconstruct the original data. In this work, we propose an ensemble of convolutional neural networks (CNNs). In ensemble methods, many different models are trained and then used for classification, the ensemble aggregates the outputs of the single classifiers. The approach leverages on differences of various classifiers to improve the performance of the whole system. Diversity among the single classifiers is enforced by using different loss functions. In particular, we present a new loss function that results from the combination of Dice and Structural Similarity Index. The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet environment. The proposal is evaluated through an extensive empirical evaluation on two real-world scenarios: polyp and skin segmentation. All the code is available online at https://github.com/LorisNanni. △ Less

Submitted 24 December, 2021; originally announced December 2021.

arXiv:2110.04414 [pdf]

Gated recurrent units and temporal convolutional network for multilabel classification

Authors: Loris Nanni, Alessandra Lumini, Alessandro Manfe, Riccardo Rampon, Sheryl Brahnam, Giorgio Venturin

Abstract: Multilabel learning tackles the problem of associating a sample with multiple class labels. This work proposes a new ensemble method for managing multilabel classification: the core of the proposed approach combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam optimization approach. Multiple Adam variants, including novel one proposed… ▽ More Multilabel learning tackles the problem of associating a sample with multiple class labels. This work proposes a new ensemble method for managing multilabel classification: the core of the proposed approach combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam optimization approach. Multiple Adam variants, including novel one proposed here, are compared and tested; these variants are based on the difference between present and past gradients, with step size adjusted for each parameter. The proposed neural network approach is also combined with Incorporating Multiple Clustering Centers (IMCC), which further boosts classification performance. Multiple experiments on nine data sets representing a wide variety of multilabel tasks demonstrate the robustness of our best ensemble, which is shown to outperform the state-of-the-art. The MATLAB code for generating the best ensembles in the experimental section will be available at https://github.com/LorisNanni. △ Less

Submitted 22 August, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

arXiv:2108.12539 [pdf]

High performing ensemble of convolutional neural networks for insect pest image detection

Authors: Loris Nanni, Alessandro Manfe, Gianluca Maguolo, Alessandra Lumini, Sheryl Brahnam

Abstract: Pest infestation is a major cause of crop damage and lost revenues worldwide. Automatic identification of invasive insects would greatly speedup the identification of pests and expedite their removal. In this paper, we generate ensembles of CNNs based on different topologies (ResNet50, GoogleNet, ShuffleNet, MobileNetv2, and DenseNet201) altered by random selection from a simple set of data augmen… ▽ More Pest infestation is a major cause of crop damage and lost revenues worldwide. Automatic identification of invasive insects would greatly speedup the identification of pests and expedite their removal. In this paper, we generate ensembles of CNNs based on different topologies (ResNet50, GoogleNet, ShuffleNet, MobileNetv2, and DenseNet201) altered by random selection from a simple set of data augmentation methods or optimized with different Adam variants for pest identification. Two new Adam algorithms for deep network optimization based on DGrad are proposed that introduce a scaling factor in the learning rate. Sets of the five CNNs that vary in either data augmentation or the type of Adam optimization were trained on both the Deng (SMALL) and the large IP102 pest data sets. Ensembles were compared and evaluated using three performance indicators. The best performing ensemble, which combined the CNNs using the different augmentation methods and the two new Adam variants proposed here, achieved state of the art on both insect data sets: 95.52% on Deng and 73.46% on IP102, a score on Deng that competed with human expert classifications. Additional tests were performed on data sets for medical imagery classification that further validated the robustness and power of the proposed Adam optimization variants. All MATLAB source code is available at https://github.com/LorisNanni/. △ Less

Submitted 27 August, 2021; originally announced August 2021.

arXiv:2104.03488 [pdf]

doi 10.3390/jimaging6120143

Deep Features for training Support Vector Machine

Authors: Loris Nanni, Stefano Ghidoni, Sheryl Brahnam

Abstract: Features play a crucial role in computer vision. Initially designed to detect salient elements by means of handcrafted algorithms, features are now often learned by different layers in Convolutional Neural Networks (CNNs). This paper develops a generic computer vision system based on features extracted from trained CNNs. Multiple learned features are combined into a single structure to work on dif… ▽ More Features play a crucial role in computer vision. Initially designed to detect salient elements by means of handcrafted algorithms, features are now often learned by different layers in Convolutional Neural Networks (CNNs). This paper develops a generic computer vision system based on features extracted from trained CNNs. Multiple learned features are combined into a single structure to work on different image classification tasks. The proposed system was experimentally derived by testing several approaches for extracting features from the inner layers of CNNs and using them as inputs to SVMs that are then combined by sum rule. Dimensionality reduction techniques are used to reduce the high dimensionality of inner layers. The resulting vision system is shown to significantly boost the performance of standard CNNs across a large and diverse collection of image data sets. An ensemble of different topologies using the same approach obtains state-of-the-art results on a virus data set. △ Less

Submitted 28 June, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

arXiv:2104.00850 [pdf]

Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation

Authors: Alessandra Lumini, Loris Nanni, Gianluca Maguolo

Abstract: Semantic segmentation has a wide array of applications ranging from medical-image analysis, scene understanding, autonomous driving and robotic navigation. This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations. Several convolutional neural network architectures have been proposed to effectively deal with thi… ▽ More Semantic segmentation has a wide array of applications ranging from medical-image analysis, scene understanding, autonomous driving and robotic navigation. This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations. Several convolutional neural network architectures have been proposed to effectively deal with this task and with the problem of segmenting objects at different scale input. The basic architecture in image segmentation consists of an encoder and a decoder: the first uses convolutional filters to extract features from the image, the second is responsible for generating the final output. In this work, we compare some variant of the DeepLab architecture obtained by varying the decoder backbone. We compare several decoder architectures, including ResNet, Xception, EfficentNet, MobileNet and we perturb their layers by substituting ReLU activation layers with other functions. The resulting methods are used to create deep ensembles which are shown to be very effective. Our experimental evaluations show that our best ensemble produces good segmentation results by achieving high evaluation scores with a dice coefficient of 0.884, and a mean Intersection over Union (mIoU) of 0.818 for the Kvasir-SEG dataset. To improve reproducibility and research efficiency the MATLAB source code used for this research is available at GitHub: https://github.com/LorisNanni. △ Less

Submitted 7 April, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

arXiv:2103.15898 [pdf]

Comparison of different convolutional neural network activation functions and methods for building ensembles

Authors: Loris Nanni, Gianluca Maguolo, Sheryl Brahnam, Michelangelo Paci

Abstract: Recently, much attention has been devoted to finding highly efficient and powerful activation functions for CNN layers. Because activation functions inject different nonlinearities between layers that affect performance, varying them is one method for building robust ensembles of CNNs. The objective of this study is to examine the performance of CNN ensembles made with different activation functio… ▽ More Recently, much attention has been devoted to finding highly efficient and powerful activation functions for CNN layers. Because activation functions inject different nonlinearities between layers that affect performance, varying them is one method for building robust ensembles of CNNs. The objective of this study is to examine the performance of CNN ensembles made with different activation functions, including six new ones presented here: 2D Mexican ReLU, TanELU, MeLU+GaLU, Symmetric MeLU, Symmetric GaLU, and Flexible MeLU. The highest performing ensemble was built with CNNs having different activation layers that randomly replaced the standard ReLU. A comprehensive evaluation of the proposed approach was conducted across fifteen biomedical data sets representing various classification tasks. The proposed method was tested on two basic CNN architectures: Vgg16 and ResNet50. Results demonstrate the superiority in performance of this approach. The MATLAB source code for this study will be available at https://github.com/LorisNanni. △ Less

Submitted 1 April, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

arXiv:2103.14689 [pdf]

Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks

Authors: Loris Nanni, Gianluca Maguolo, Alessandra Lumini

Abstract: Stochastic gradient descent (SGD) is the main approach for training deep networks: it moves towards the optimum of the cost function by iteratively updating the parameters of a model in the direction of the gradient of the loss evaluated on a minibatch. Several variants of SGD have been proposed to make adaptive step sizes for each parameter (adaptive gradient) and take into account the previous u… ▽ More Stochastic gradient descent (SGD) is the main approach for training deep networks: it moves towards the optimum of the cost function by iteratively updating the parameters of a model in the direction of the gradient of the loss evaluated on a minibatch. Several variants of SGD have been proposed to make adaptive step sizes for each parameter (adaptive gradient) and take into account the previous updates (momentum). Among several alternative of SGD the most popular are AdaGrad, AdaDelta, RMSProp and Adam which scale coordinates of the gradient by square roots of some form of averaging of the squared coordinates in the past gradients and automatically adjust the learning rate on a parameter basis. In this work, we compare Adam based variants based on the difference between the present and the past gradients, the step size is adjusted for each parameter. We run several tests benchmarking proposed methods using medical image data. The experiments are performed using ResNet50 architecture neural network. Moreover, we have tested ensemble of networks and the fusion with ResNet50 trained with stochastic gradient descent. To combine the set of ResNet50 the simple sum rule has been applied. Proposed ensemble obtains very high performance, it obtains accuracy comparable or better than actual state of the art. To improve reproducibility and research efficiency the MATLAB source code used for this research is available at GitHub: https://github.com/LorisNanni. △ Less

Submitted 26 March, 2021; originally announced March 2021.

arXiv:2101.11713 [pdf]

Neural networks for Anatomical Therapeutic Chemical (ATC) classification

Authors: Loris Nanni, Alessandra Lumini, Sheryl Brahnam

Abstract: Motivation: Automatic Anatomical Therapeutic Chemical (ATC) classification is a critical and highly competitive area of research in bioinformatics because of its potential for expediting drug develop-ment and research. Predicting an unknown compound's therapeutic and chemical characteristics ac-cording to how these characteristics affect multiple organs/systems makes automatic ATC classifica-tion… ▽ More Motivation: Automatic Anatomical Therapeutic Chemical (ATC) classification is a critical and highly competitive area of research in bioinformatics because of its potential for expediting drug develop-ment and research. Predicting an unknown compound's therapeutic and chemical characteristics ac-cording to how these characteristics affect multiple organs/systems makes automatic ATC classifica-tion a challenging multi-label problem. Results: In this work, we propose combining multiple multi-label classifiers trained on distinct sets of features, including sets extracted from a Bidirectional Long Short-Term Memory Network (BiLSTM). Experiments demonstrate the power of this approach, which is shown to outperform the best methods reported in the literature, including the state-of-the-art developed by the fast.ai research group. Availability: All source code developed for this study is available at https://github.com/LorisNanni. Contact: [email protected] △ Less

Submitted 5 August, 2021; v1 submitted 22 January, 2021; originally announced January 2021.

arXiv:2011.11834 [pdf]

Comparisons among different stochastic selection of activation layers for convolutional neural networks for healthcare

Authors: Loris Nanni, Alessandra Lumini, Stefano Ghidoni, Gianluca Maguolo

Abstract: Classification of biological images is an important task with crucial application in many fields, such as cell phenotypes recognition, detection of cell organelles and histopathological classification, and it might help in early medical diagnosis, allowing automatic disease classification without the need of a human expert. In this paper we classify biomedical images using ensembles of neural netw… ▽ More Classification of biological images is an important task with crucial application in many fields, such as cell phenotypes recognition, detection of cell organelles and histopathological classification, and it might help in early medical diagnosis, allowing automatic disease classification without the need of a human expert. In this paper we classify biomedical images using ensembles of neural networks. We create this ensemble using a ResNet50 architecture and modifying its activation layers by substituting ReLUs with other functions. We select our activations among the following ones: ReLU, leaky ReLU, Parametric ReLU, ELU, Adaptive Piecewice Linear Unit, S-Shaped ReLU, Swish , Mish, Mexican Linear Unit, Gaussian Linear Unit, Parametric Deformable Linear Unit, Soft Root Sign (SRS) and others. As a baseline, we used an ensemble of neural networks that only use ReLU activations. We tested our networks on several small and medium sized biomedical image datasets. Our results prove that our best ensemble obtains a better performance than the ones of the naive approaches. In order to encourage the reproducibility of this work, the MATLAB code of all the experiments will be shared at https://github.com/LorisNanni. △ Less

Submitted 23 November, 2020; originally announced November 2020.

arXiv:2011.06123 [pdf]

doi 10.3390/jimaging6120143

Deep learning and hand-crafted features for virus image classification

Authors: Loris Nanni, Eugenio De Luca, Marco Ludovico Facin, Gianluca Maguolo

Abstract: In this work, we present an ensemble of descriptors for the classification of transmission electron microscopy images of viruses. We propose to combine handcrafted and deep learning approaches for virus image classification. The set of handcrafted is mainly based on Local Binary Pattern variants, for each descriptor a different Support Vector Machine is trained, then the set of classifiers is comb… ▽ More In this work, we present an ensemble of descriptors for the classification of transmission electron microscopy images of viruses. We propose to combine handcrafted and deep learning approaches for virus image classification. The set of handcrafted is mainly based on Local Binary Pattern variants, for each descriptor a different Support Vector Machine is trained, then the set of classifiers is combined by sum rule. The deep learning approach is a densenet201 pretrained on ImageNet and then tuned in the virus dataset, the net is used as features extractor for feeding another Support Vector Machine, in particular the last average pooling layer is used as feature extractor. Finally, classifiers trained on handcrafted features and classifier trained on deep learning features are combined by sum rule. The proposed fusion strongly boosts the performance obtained by each stand-alone approach, obtaining state of the art performance. △ Less

Submitted 12 December, 2020; v1 submitted 11 November, 2020; originally announced November 2020.

Journal ref: Journal of Imaging 2020, 6(12), 143

arXiv:2009.09780 [pdf, other]

Impact of lung segmentation on the diagnosis and explanation of COVID-19 in chest X-ray images

Authors: Lucas O. Teixeira, Rodolfo M. Pereira, Diego Bertolini, Luiz S. Oliveira, Loris Nanni, George D. C. Cavalcanti, Yandre M. G. Costa

Abstract: COVID-19 frequently provokes pneumonia, which can be diagnosed using imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widespread, and uses less radiation. Here, we demonstrate the impact of lung segmentation in COVID-19 identification using CXR images and evaluate which contents of the image influenced the most. Semantic segmentation was performed using a U-Net CNN archi… ▽ More COVID-19 frequently provokes pneumonia, which can be diagnosed using imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widespread, and uses less radiation. Here, we demonstrate the impact of lung segmentation in COVID-19 identification using CXR images and evaluate which contents of the image influenced the most. Semantic segmentation was performed using a U-Net CNN architecture, and the classification using three CNN architectures (VGG, ResNet, and Inception). Explainable Artificial Intelligence techniques were employed to estimate the impact of segmentation. A three-classes database was composed: lung opacity (pneumonia), COVID-19, and normal. We assessed the impact of creating a CXR image database from different sources, and the COVID-19 generalization from one source to another. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. The classification using segmented images achieved an F1-Score of 0.88 for the multi-class setup, and 0.83 for COVID-19 identification. In the cross-dataset scenario, we obtained an F1-Score of 0.74 and an area under the ROC curve of 0.9 for COVID-19 identification using segmented images. Experiments support the conclusion that even after segmentation, there is a strong bias introduced by underlying factors from different sources. △ Less

Submitted 13 September, 2021; v1 submitted 21 September, 2020; originally announced September 2020.

Comments: Submitted to Sensors

arXiv:2007.07966 [pdf]

doi 10.3390/app11135796

An Ensemble of Convolutional Neural Networks for Audio Classification

Authors: Loris Nanni, Gianluca Maguolo, Sheryl Brahnam, Michelangelo Paci

Abstract: In this paper, ensembles of classifiers that exploit several data augmentation techniques and four signal representations for training Convolutional Neural Networks (CNNs) for audio classification are presented and tested on three freely available audio classification datasets: i) bird calls, ii) cat sounds, and iii) the Environmental Sound Classification dataset. The best performing ensembles com… ▽ More In this paper, ensembles of classifiers that exploit several data augmentation techniques and four signal representations for training Convolutional Neural Networks (CNNs) for audio classification are presented and tested on three freely available audio classification datasets: i) bird calls, ii) cat sounds, and iii) the Environmental Sound Classification dataset. The best performing ensembles combining data augmentation techniques with different signal representations are compared and shown to outperform the best methods reported in the literature on these datasets. The approach proposed here obtains state-of-the-art results in the widely used ESC-50 dataset. To the best of our knowledge, this is the most extensive study investigating ensembles of CNNs for audio classification. Results demonstrate not only that CNNs can be trained for audio classification but also that their fusion using different techniques works better than the stand-alone classifiers. △ Less

Submitted 27 April, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

Journal ref: Appl. Sci. 2021, 11(13), 5796

arXiv:2006.02536 [pdf]

Phasic dopamine release identification using ensemble of AlexNet

Authors: Luca Patarnello, Marco Celin, Loris Nanni

Abstract: Dopamine (DA) is an organic chemical that influences several parts of behaviour and physical functions. Fast-scan cyclic voltammetry (FSCV) is a technique used for in vivo phasic dopamine release measurements. The analysis of such measurements, though, requires notable effort. In this paper, we present the use of convolutional neural networks (CNNs) for the identification of phasic dopamine releas… ▽ More Dopamine (DA) is an organic chemical that influences several parts of behaviour and physical functions. Fast-scan cyclic voltammetry (FSCV) is a technique used for in vivo phasic dopamine release measurements. The analysis of such measurements, though, requires notable effort. In this paper, we present the use of convolutional neural networks (CNNs) for the identification of phasic dopamine releases. △ Less

Submitted 3 June, 2020; originally announced June 2020.

arXiv:2005.02863 [pdf]

doi 10.1007/s42979-020-00286-w

The computerization of archaeology: survey on AI techniques

Authors: Lorenzo Mantovan, Loris Nanni

Abstract: This paper analyses the application of artificial intelligence techniques to various areas of archaeology and more specifically: a) The use of software tools as a creative stimulus for the organization of exhibitions; the use of humanoid robots and holographic displays as guides that interact and involve museum visitors; b) The analysis of methods for the classification of fragments found in archa… ▽ More This paper analyses the application of artificial intelligence techniques to various areas of archaeology and more specifically: a) The use of software tools as a creative stimulus for the organization of exhibitions; the use of humanoid robots and holographic displays as guides that interact and involve museum visitors; b) The analysis of methods for the classification of fragments found in archaeological excavations and for the reconstruction of ceramics, with the recomposition of the parts of text missing from historical documents and epigraphs; c) The cataloguing and study of human remains to understand the social and historical context of belonging with the demonstration of the effectiveness of the AI techniques used; d) The detection of particularly difficult terrestrial archaeological sites with the analysis of the architectures of the Artificial Neural Networks most suitable for solving the problems presented by the site; the design of a study for the exploration of marine archaeological sites, located at depths that cannot be reached by man, through the construction of a freely explorable 3D version. △ Less

Submitted 30 June, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Journal ref: SN Computer Science, 2020

arXiv:2004.12823 [pdf]

A Critic Evaluation of Methods for COVID-19 Automatic Detection from X-Ray Images

Authors: Gianluca Maguolo, Loris Nanni

Abstract: In this paper, we compare and evaluate different testing protocols used for automatic COVID-19 diagnosis from X-Ray images in the recent literature. We show that similar results can be obtained using X-Ray images that do not contain most of the lungs. We are able to remove the lungs from the images by turning to black the center of the X-Ray scan and training our classifiers only on the outer part… ▽ More In this paper, we compare and evaluate different testing protocols used for automatic COVID-19 diagnosis from X-Ray images in the recent literature. We show that similar results can be obtained using X-Ray images that do not contain most of the lungs. We are able to remove the lungs from the images by turning to black the center of the X-Ray scan and training our classifiers only on the outer part of the images. Hence, we deduce that several testing protocols for the recognition are not fair and that the neural networks are learning patterns in the dataset that are not correlated to the presence of COVID-19. Finally, we show that creating a fair testing protocol is a challenging task, and we provide a method to measure how fair a specific testing protocol is. In the future research we suggest to check the fairness of a testing protocol using our tools and we encourage researchers to look for better techniques than the ones that we propose. △ Less

Submitted 19 September, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

arXiv:1912.07756 [pdf]

Data augmentation approaches for improving animal audio classification

Authors: Loris Nanni, Gianluca Maguolo, Michelangelo Paci

Abstract: In this paper we present ensembles of classifiers for automated animal audio classification, exploiting different data augmentation techniques for training Convolutional Neural Networks (CNNs). The specific animal audio classification problems are i) birds and ii) cat sounds, whose datasets are freely available. We train five different CNNs on the original datasets and on their versions augmented… ▽ More In this paper we present ensembles of classifiers for automated animal audio classification, exploiting different data augmentation techniques for training Convolutional Neural Networks (CNNs). The specific animal audio classification problems are i) birds and ii) cat sounds, whose datasets are freely available. We train five different CNNs on the original datasets and on their versions augmented by four augmentation protocols, working on the raw audio signals or their representations as spectrograms. We compared our best approaches with the state of the art, showing that we obtain the best recognition rate on the same datasets, without ad hoc parameter optimization. Our study shows that different CNNs can be trained for the purpose of animal audio classification and that their fusion works better than the stand-alone classifiers. To the best of our knowledge this is the largest study on data augmentation for CNNs in animal audio classification audio datasets using the same set of classifiers and parameters. Our MATLAB code is available at https://github.com/LorisNanni. △ Less

Submitted 15 March, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

arXiv:1912.05472 [pdf]

Audiogmenter: a MATLAB Toolbox for Audio Data Augmentation

Authors: Gianluca Maguolo, Michelangelo Paci, Loris Nanni, Ludovico Bonan

Abstract: Audio data augmentation is a key step in training deep neural networks for solving audio classification tasks. In this paper, we introduce Audiogmenter, a novel audio data augmentation library in MATLAB. We provide 15 different augmentation algorithms for raw audio data and 8 for spectrograms. We efficiently implemented several augmentation techniques whose usefulness has been extensively proved i… ▽ More Audio data augmentation is a key step in training deep neural networks for solving audio classification tasks. In this paper, we introduce Audiogmenter, a novel audio data augmentation library in MATLAB. We provide 15 different augmentation algorithms for raw audio data and 8 for spectrograms. We efficiently implemented several augmentation techniques whose usefulness has been extensively proved in the literature. To the best of our knowledge, this is the largest MATLAB audio data augmentation library freely available. We validate the efficiency of our algorithms evaluating them on the ESC-50 dataset. The toolbox and its documentation can be downloaded at https://github.com/LorisNanni/Audiogmenter. △ Less

Submitted 1 January, 2022; v1 submitted 11 December, 2019; originally announced December 2019.

arXiv:1910.00296 [pdf]

doi 10.1016/j.ecoinf.2020.101089

Insect pest image detection and recognition based on bio-inspired methods

Authors: Loris Nanni, Gianluca Maguolo, Fabio Pancino

Abstract: Insect pests recognition is necessary for crop protection in many areas of the world. In this paper we propose an automatic classifier based on the fusion between saliency methods and convolutional neural networks. Saliency methods are famous image processing algorithms that highlight the most relevant pixels of an image. In this paper, we use three different saliency methods as image preprocessin… ▽ More Insect pests recognition is necessary for crop protection in many areas of the world. In this paper we propose an automatic classifier based on the fusion between saliency methods and convolutional neural networks. Saliency methods are famous image processing algorithms that highlight the most relevant pixels of an image. In this paper, we use three different saliency methods as image preprocessing and create three different images for every saliency method. Hence, we create 3x3=9 new images for every original image to train different convolutional neural networks. We evaluate the performance of every preprocessing/network couple and we also evaluate the performance of their ensemble. We test our approach on both a small dataset and the large IP102 dataset. Our best ensembles reaches the state of the art accuracy on both the smaller dataset (92.43%) and the IP102 dataset (61.93%), approaching the performance of human experts on the smaller one. Besides, we share our MATLAB code at: https://github.com/LorisNanni/. △ Less

Submitted 27 March, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

arXiv:1908.05489 [pdf]

Deep learning for Plankton and Coral Classification

Authors: Alessandra Lumini, Loris Nanni, Gianluca Maguolo

Abstract: Oceans are the essential lifeblood of the Earth: they provide over 70% of the oxygen and over 97% of the water. Plankton and corals are two of the most fundamental components of ocean ecosystems, the former due to their function at many levels of the oceans food chain, the latter because they provide spawning and nursery grounds to many fish populations. Studying and monitoring plankton distributi… ▽ More Oceans are the essential lifeblood of the Earth: they provide over 70% of the oxygen and over 97% of the water. Plankton and corals are two of the most fundamental components of ocean ecosystems, the former due to their function at many levels of the oceans food chain, the latter because they provide spawning and nursery grounds to many fish populations. Studying and monitoring plankton distribution and coral reefs is vital for environment protection. In the last years there has been a massive proliferation of digital imagery for the monitoring of underwater ecosystems and much research is concentrated on the automated recognition of plankton and corals. In this paper, we present a study about an automated system for monitoring of underwater ecosystems. The system here proposed is based on the fusion of different deep learning methods. We study how to create an ensemble based of different CNN models, fine tuned on several datasets with the aim of exploiting their diversity. The aim of our study is to experiment the possibility of fine-tuning pretrained CNN for underwater imagery analysis, the opportunity of using different datasets for pretraining models, the possibility to design an ensemble using the same architecture with small variations in the training procedure. The experimental results are very encouraging, our experiments performed on 5 well-knowns datasets (3 plankton and 2 coral datasets) show that the fusion of such different CNN models in a heterogeneous ensemble grants a substantial performance improvement with respect to other state-of-the-art approaches in all the tested problems. One of the main contributions of this work is a wide experimental evaluation of famous CNN architectures to report performance of both single CNN and ensemble of CNNs in different problems. Moreover, we show how to create an ensemble which improves the performance of the best single model. △ Less

Submitted 9 December, 2019; v1 submitted 15 August, 2019; originally announced August 2019.

arXiv:1908.03630 [pdf]

doi 10.33969/AIS.2019.11004

Learning morphological operators for skin detection

Authors: Alessandra Lumini, Loris Nanni, Alice Codogno, Filippo Berno

Abstract: In this work we propose a novel post processing approach for skin detectors based on trained morphological operators. The first step, consisting in skin segmentation is performed according to an existing skin detection approach is performed for skin segmentation, then a second step is carried out consisting in the application of a set of morphological operators to refine the resulting mask. Extens… ▽ More In this work we propose a novel post processing approach for skin detectors based on trained morphological operators. The first step, consisting in skin segmentation is performed according to an existing skin detection approach is performed for skin segmentation, then a second step is carried out consisting in the application of a set of morphological operators to refine the resulting mask. Extensive experimental evaluation performed considering two different detection approaches (one based on deep learning and a handcrafted one) carried on 10 different datasets confirms the quality of the proposed method. △ Less

Submitted 23 September, 2019; v1 submitted 9 August, 2019; originally announced August 2019.

Journal ref: Journal of Artificial Intelligence and Systems (2019)

arXiv:1906.04407 [pdf]

doi 10.1016/j.eswa.2019.113019

iProStruct2D: Identifying protein structural classes by deep learning via 2D representations

Authors: Loris Nanni, Alessandra Lumini, Federica Pasquali, Sheryl Brahnam

Abstract: In this paper we address the problem of protein classification starting from a multi-view 2D representation of proteins. From each 3D protein structure, a large set of 2D projections is generated using the protein visualization software Jmol. This set of multi-view 2D representations includes 13 different types of protein visualizations that emphasize specific properties of protein structure (e.g.… ▽ More In this paper we address the problem of protein classification starting from a multi-view 2D representation of proteins. From each 3D protein structure, a large set of 2D projections is generated using the protein visualization software Jmol. This set of multi-view 2D representations includes 13 different types of protein visualizations that emphasize specific properties of protein structure (e.g., a backbone visualization that displays the backbone structure of the protein as a trace of the Cα atom). Each type of representation is used to train a different Convolutional Neural Network (CNN), and the fusion of these CNNs is shown to be able to exploit the diversity of different types of representations to improve classification performance. In addition, several multi-view projections are obtained by uniformly rotating the protein structure around its central X, Y, and Z viewing axes to produce 125 images. This approach can be considered a data augmentation method for improving the performance of the classifier and can be used in both the training and the testing phases. Experimental evaluation of the proposed approach on two datasets demonstrates the strength of the proposed method with respect to the other state-of-the-art approaches. The MATLAB code used in this paper is available at https://github.com/LorisNanni. △ Less

Submitted 11 June, 2019; originally announced June 2019.

Comments: 9 pages, 3 figures, 4 tables

Journal ref: Expert Systems With Applications 2020, 142, (March), 113019

arXiv:1905.02473 [pdf]

Ensemble of Convolutional Neural Networks Trained with Different Activation Functions

Authors: Gianluca Maguolo, Loris Nanni, Stefano Ghidoni

Abstract: Activation functions play a vital role in the training of Convolutional Neural Networks. For this reason, to develop efficient and performing functions is a crucial problem in the deep learning community. Key to these approaches is to permit a reliable parameter learning, avoiding vanishing gradient problems. The goal of this work is to propose an ensemble of Convolutional Neural Networks trained… ▽ More Activation functions play a vital role in the training of Convolutional Neural Networks. For this reason, to develop efficient and performing functions is a crucial problem in the deep learning community. Key to these approaches is to permit a reliable parameter learning, avoiding vanishing gradient problems. The goal of this work is to propose an ensemble of Convolutional Neural Networks trained using several different activation functions. Moreover, a novel activation function is here proposed for the first time. Our aim is to improve the performance of Convolutional Neural Networks in small/medium size biomedical datasets. Our results clearly show that the proposed ensemble outperforms Convolutional Neural Networks trained with standard ReLU as activation function. The proposed ensemble outperforms with a p-value of 0.01 each tested stand-alone activation function; for reliable performance comparison we have tested our approach in more than 10 datasets, using two well-known Convolutional Neural Network: Vgg16 and ResNet50. MATLAB code used here will be available at https://github.com/LorisNanni. △ Less

Submitted 21 September, 2020; v1 submitted 7 May, 2019; originally announced May 2019.

arXiv:1904.08084 [pdf]

General Purpose (GenP) Bioimage Ensemble of Handcrafted and Learned Features with Data Augmentation

Authors: L. Nanni, S. Brahnam, S. Ghidoni, G. Maguolo

Abstract: Bioimage classification plays a crucial role in many biological problems. In this work, we present a new General Purpose (GenP) ensemble that boosts performance by combining local features, dense sampling features, and deep learning approaches. First, we introduce three new methods for data augmentation based on PCA/DCT; second, we show that different data augmentation approaches can boost the per… ▽ More Bioimage classification plays a crucial role in many biological problems. In this work, we present a new General Purpose (GenP) ensemble that boosts performance by combining local features, dense sampling features, and deep learning approaches. First, we introduce three new methods for data augmentation based on PCA/DCT; second, we show that different data augmentation approaches can boost the performance of an ensemble of CNNs; and, finally, we propose a set of handcrafted/learned descriptors that are highly generalizable. Each handcrafted descriptor is used to train a different Support Vector Machine (SVM), and the different SVMs are combined with the ensemble of CNNs. Our method is evaluated on a diverse set of bioimage classification problems. Results demonstrate that the proposed GenP bioimage ensemble obtains state-of-the-art performance without any ad-hoc dataset tuning of parameters (thus avoiding the risk of overfitting/overtraining). △ Less

Submitted 6 July, 2021; v1 submitted 17 April, 2019; originally announced April 2019.

Comments: 27 pages, 1 figure, 5 tables, manuscript

arXiv:1807.08008 [pdf]

Ensemble of Deep Learned Features for Melanoma Classification

Authors: Loris Nanni, Alessandra Lumini, Stefano Ghidoni

Abstract: The aim of this work is to propose an ensemble of descriptors for Melanoma Classification, whose performance has been evaluated on validation and test datasets of the melanoma challenge 2018. The system proposed here achieves a strong discriminative power thanks to the combination of multiple descriptors. The proposed system represents a very simple yet effective way of boosting the performance of… ▽ More The aim of this work is to propose an ensemble of descriptors for Melanoma Classification, whose performance has been evaluated on validation and test datasets of the melanoma challenge 2018. The system proposed here achieves a strong discriminative power thanks to the combination of multiple descriptors. The proposed system represents a very simple yet effective way of boosting the performance of trained CNNs by composing multiple CNNs into an ensemble and combining scores by sum rule. Several types of ensembles are considered, with different CNN architectures along with different learning parameter sets. Moreover CNN are used as feature extractors: an input image is processed by a trained CNN and the response of a particular layer (usually the classification layer, but also internal layers can be employed) is treated as a descriptor for the image and used for training a set of Support Vector Machines (SVM). △ Less

Submitted 20 July, 2018; originally announced July 2018.

arXiv:1802.02531 [pdf]

doi 10.1016/j.eswa.2020.113677

Fair comparison of skin detection approaches on publicly available datasets

Authors: Alessandra Lumini, Loris Nanni

Abstract: Skin detection is the process of discriminating skin and non-skin regions in a digital image and it is widely used in several applications ranging from hand gesture analysis to track body parts and face detection. Skin detection is a challenging problem which has drawn extensive attention from the research community, nevertheless a fair comparison among approaches is very difficult due to the lack… ▽ More Skin detection is the process of discriminating skin and non-skin regions in a digital image and it is widely used in several applications ranging from hand gesture analysis to track body parts and face detection. Skin detection is a challenging problem which has drawn extensive attention from the research community, nevertheless a fair comparison among approaches is very difficult due to the lack of a common benchmark and a unified testing protocol. In this work, we investigate the most recent researches in this field and we propose a fair comparison among approaches using several different datasets. The major contributions of this work are an exhaustive literature review of skin color detection approaches, a framework to evaluate and combine different skin detector approaches, whose source code is made freely available for future research, and an extensive experimental comparison among several recent methods which have also been used to define an ensemble that works well in many different problems. Experiments are carried out in 10 different datasets including more than 10000 labelled images: experimental results confirm that the best method here proposed obtains a very good performance with respect to other stand-alone approaches, without requiring ad hoc parameter tuning. A MATLAB version of the framework for testing and of the methods proposed in this paper will be freely available from https://github.com/LorisNanni △ Less

Submitted 10 July, 2020; v1 submitted 7 February, 2018; originally announced February 2018.

Journal ref: Expert Systems with Applications Volume 160, 1 December 2020, 113677

Showing 1–26 of 26 results for author: Nanni, L