-
A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
Authors:
Leonardo Scabini,
Andre Sacilotti,
Kallil M. Zielinski,
Lucas C. Ribas,
Bernard De Baets,
Odemir M. Bruno
Abstract:
Texture, a significant visual attribute in images, has been extensively investigated across various image recognition applications. Convolutional Neural Networks (CNNs), which have been successful in many computer vision tasks, are currently among the best texture analysis approaches. On the other hand, Vision Transformers (ViTs) have been surpassing the performance of CNNs on tasks such as object recognition, causing a paradigm shift in the field. However, ViTs have so far not been scrutinized for texture recognition, hindering a proper appreciation of their potential in this specific setting. For this reason, this work explores various pre-trained ViT architectures when transferred to tasks that rely on textures. We review 21 different ViT variants and perform an extensive evaluation and comparison with CNNs and hand-engineered models on several tasks, such as assessing robustness to changes in texture rotation, scale, and illumination, and distinguishing color textures, material textures, and texture attributes. The goal is to understand the potential and differences among these models when directly applied to texture recognition, using pre-trained ViTs primarily for feature extraction and employing linear classifiers for evaluation. We also evaluate their efficiency, which is one of the main drawbacks in contrast to other methods. Our results show that ViTs generally outperform both CNNs and hand-engineered models, especially when using stronger pre-training and tasks involving in-the-wild textures (images from the internet). We highlight the following promising models: ViT-B with DINO pre-training, BeiTv2, and the Swin architecture, as well as the EfficientFormer as a low-cost alternative. In terms of efficiency, although having a higher number of GFLOPs and parameters, ViT-B and BeiT(v2) can achieve a lower feature extraction time on GPUs compared to ResNet50.
Submitted 10 June, 2024;
originally announced June 2024.
-
Advanced wood species identification based on multiple anatomical sections and using deep feature transfer and fusion
Authors:
Kallil M. Zielinski,
Leonardo Scabini,
Lucas C. Ribas,
Núbia R. da Silva,
Hans Beeckman,
Jan Verwaeren,
Odemir M. Bruno,
Bernard De Baets
Abstract:
In recent years, we have seen many advancements in wood species identification. Methods like DNA analysis, Near Infrared (NIR) spectroscopy, and Direct Analysis in Real Time (DART) mass spectrometry complement the long-established wood anatomical assessment of cell and tissue morphology. However, most of these methods have some limitations such as high costs, the need for skilled experts for data interpretation, and the lack of good datasets for professional reference. Therefore, most of these methods, and certainly the wood anatomical assessment, may benefit from tools based on Artificial Intelligence. In this paper, we apply two transfer learning techniques with Convolutional Neural Networks (CNNs) to a multi-view Congolese wood species dataset including sections from different orientations and viewed at different microscopic magnifications. We explore two feature extraction methods in detail, namely Global Average Pooling (GAP) and Random Encoding of Aggregated Deep Activation Maps (RADAM), for efficient and accurate wood species identification. Our results indicate superior accuracy on diverse datasets and anatomical sections, surpassing the results of other methods. Our proposal represents a significant advancement in wood species identification, offering a robust tool to support the conservation of forest ecosystems and promote sustainable forestry practices.
Submitted 12 April, 2024;
originally announced April 2024.
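The abstract above names Global Average Pooling (GAP) as one of the two feature extraction methods. As a minimal sketch of what GAP computes, assuming a single activation map stored as a NumPy array in (channels, height, width) order (the function name and layout are illustrative assumptions, not taken from the paper):

```python
import numpy as np


def global_average_pooling(activation_map):
    """Collapse a (channels, height, width) activation map to one value per channel.

    The array layout is an assumption made for this sketch.
    """
    a = np.asarray(activation_map, dtype=float)
    if a.ndim != 3:
        raise ValueError("expected an array of shape (channels, height, width)")
    # Average over the two spatial axes, keeping the channel axis.
    return a.mean(axis=(1, 2))


# Toy activation map with 2 channels of size 3x4.
toy_map = np.arange(24).reshape(2, 3, 4)
features = global_average_pooling(toy_map)
print(features.shape)
```

The resulting vector has one entry per channel and could then be passed to any classifier; the choice of backbone and classifier is left open here.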
-
Prediction of Activated Sludge Settling Characteristics from Microscopy Images with Deep Convolutional Neural Networks and Transfer Learning
Authors:
Sina Borzooei,
Leonardo Scabini,
Gisele Miranda,
Saba Daneshgar,
Lukas Deblieck,
Piet De Langhe,
Odemir Bruno,
Bernard De Baets,
Ingmar Nopens,
Elena Torfs
Abstract:
Microbial communities play a key role in biological wastewater treatment processes. Activated sludge settling characteristics, for example, are affected by microbial community composition, varying by changes in operating conditions and influent characteristics of wastewater treatment plants (WWTPs). Timely assessment and prediction of changes in microbial composition leading to settling problems, such as filamentous bulking (FB), can prevent operational challenges, reductions in treatment efficiency, and adverse environmental impacts. This study presents an innovative computer vision-based approach to assess activated sludge-settling characteristics based on the morphological properties of flocs and filaments in microscopy images. Implementing the transfer learning of deep convolutional neural network (CNN) models, this approach aims to overcome the limitations of existing quantitative image analysis techniques. The offline microscopy image dataset was collected over two years, with weekly sampling at a full-scale industrial WWTP in Belgium. Multiple data augmentation techniques were employed to enhance the generalizability of the CNN models. Various CNN architectures, including Inception v3, ResNet18, ResNet152, ConvNeXt-nano, and ConvNeXt-S, were tested to evaluate their performance in predicting sludge settling characteristics. The sludge volume index was used as the final prediction variable, but the method can easily be adjusted to predict any other settling metric of choice. The results showed that the suggested CNN-based approach provides less labour-intensive, objective, and consistent assessments, while transfer learning notably minimises the training phase, resulting in a generalizable system that can be employed in real-time applications.
Submitted 1 March, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
RADAM: Texture Recognition through Randomized Aggregated Encoding of Deep Activation Maps
Authors:
Leonardo Scabini,
Kallil M. Zielinski,
Lucas C. Ribas,
Wesley N. Gonçalves,
Bernard De Baets,
Odemir M. Bruno
Abstract:
Texture analysis is a classical yet challenging task in computer vision for which deep neural networks are actively being applied. Most approaches are based on building feature aggregation modules around a pre-trained backbone and then fine-tuning the new architecture on specific texture recognition tasks. Here we propose a new method named Random encoding of Aggregated Deep Activation Maps (RADAM), which extracts rich texture representations without ever changing the backbone. The technique consists of encoding the output at different depths of a pre-trained deep convolutional network using a Randomized Autoencoder (RAE). The RAE is trained locally to each image using a closed-form solution, and its decoder weights are used to compose a 1-dimensional texture representation that is fed into a linear SVM. This means that no fine-tuning or backpropagation is needed. We explore RADAM on several texture benchmarks and achieve state-of-the-art results with different computational budgets. Our results suggest that pre-trained backbones may not require additional fine-tuning for texture recognition if their learned representations are better encoded.
Submitted 8 March, 2023;
originally announced March 2023.
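The RADAM abstract above describes an autoencoder that is fitted per image with a closed-form solution, with the decoder weights used as the descriptor. The following is a rough sketch of that general idea only, not the authors' implementation: it assumes a Gaussian random encoder, a sigmoid activation, and a ridge-regularized least-squares decoder, and the hidden size, regularization strength, and seed are all hypothetical choices.

```python
import numpy as np


def rae_descriptor(activations, hidden=4, reg=1e-3, seed=0):
    """Fit a randomized autoencoder to one image's activations and
    return its flattened decoder weights.

    activations: array of shape (n_positions, n_channels).
    hidden, reg, seed: illustrative assumptions, not values from the paper.
    """
    x = np.asarray(activations, dtype=float)
    rng = np.random.default_rng(seed)
    # Fixed random encoder: it is never trained.
    w = rng.standard_normal((x.shape[1], hidden))
    h = 1.0 / (1.0 + np.exp(-(x @ w)))  # hidden codes, shape (n, hidden)
    # Closed-form ridge solution of h @ beta ~= x (no backpropagation).
    gram = h.T @ h + reg * np.eye(hidden)
    beta = np.linalg.solve(gram, h.T @ x)  # decoder, shape (hidden, n_channels)
    return beta.ravel()


# Toy input: 50 spatial positions with 6 channels each.
toy = np.random.default_rng(1).standard_normal((50, 6))
print(rae_descriptor(toy).shape)
```

Because the encoder is random and fixed, the only fitted quantity is the decoder, which is why a single linear solve suffices in this sketch.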
-
Improving Deep Neural Network Random Initialization Through Neuronal Rewiring
Authors:
Leonardo Scabini,
Bernard De Baets,
Odemir M. Bruno
Abstract:
The deep learning literature is continuously updated with new architectures and training techniques. However, weight initialization is overlooked by most recent research, despite some intriguing findings regarding random weights. On the other hand, recent works have been approaching Network Science to understand the structure and dynamics of Artificial Neural Networks (ANNs) after training. Therefore, in this work, we analyze the centrality of neurons in randomly initialized networks. We show that a higher neuronal strength variance may decrease performance, while a lower neuronal strength variance usually improves it. A new method is then proposed to rewire neuronal connections according to a preferential attachment (PA) rule based on their strength, which significantly reduces the strength variance of layers initialized by common methods. In this sense, PA rewiring only reorganizes connections, while preserving the magnitude and distribution of the weights. We show through an extensive statistical analysis in image classification that performance is improved in most cases, both during training and testing, when using both simple and complex architectures and learning schedules. Our results show that, aside from the magnitude, the organization of the weights is also relevant for better initialization of deep ANNs.
Submitted 17 July, 2022;
originally announced July 2022.
-
Structure and Performance of Fully Connected Neural Networks: Emerging Complex Network Properties
Authors:
Leonardo F. S. Scabini,
Odemir M. Bruno
Abstract:
Understanding the behavior of Artificial Neural Networks has recently become one of the main topics in the field, as black-box approaches have become common since the widespread adoption of deep learning. Such high-dimensional models may manifest instabilities and unusual properties that resemble complex systems. Therefore, we propose Complex Network (CN) techniques to analyze the structure and performance of fully connected neural networks. For that, we build a dataset with 4 thousand models and their respective CN properties. They are employed in a supervised classification setup considering four vision benchmarks. Each neural network is approached as a weighted and undirected graph of neurons and synapses, and centrality measures are computed after training. Results show that these measures are highly related to the network classification performance. We also propose the concept of Bag-Of-Neurons (BoN), a CN-based approach for finding topological signatures linking similar neurons. Results suggest that six neuronal types emerge in such networks, independently of the target domain, and are distributed differently according to classification accuracy. We also tackle specific CN properties related to performance, such as higher subgraph centrality on lower-performing models. Our findings suggest that CN properties play a critical role in the performance of fully connected neural networks, with topological patterns emerging independently on a wide range of models.
Submitted 29 July, 2021;
originally announced July 2021.
-
Learning Local Complex Features using Randomized Neural Networks for Texture Analysis
Authors:
Lucas C. Ribas,
Leonardo F. S. Scabini,
Jarbas Joaci de Mesquita Sá Junior,
Odemir M. Bruno
Abstract:
Texture is a visual attribute widely used in many image analysis problems. Currently, many methods that use learning techniques have been proposed for texture discrimination, achieving improved performance over previous handcrafted methods. In this paper, we present a new approach that combines a learning technique and Complex Network (CN) theory for texture analysis. This method takes advantage of the representation capacity of CN to model a texture image as a directed network and uses the topological information of vertices to train a randomized neural network. This neural network has a single hidden layer and uses a fast learning algorithm, which is able to learn local CN patterns for texture characterization. Thus, we use the weights of the trained neural network to compose a feature vector. These feature vectors are evaluated in a classification experiment on four widely used image databases. Experimental results show a high classification performance of the proposed method when compared to other methods, indicating that our approach can be used in many image analysis problems.
Submitted 17 August, 2020; v1 submitted 10 July, 2020;
originally announced July 2020.
-
Social Interaction Layers in Complex Networks for the Dynamical Epidemic Modeling of COVID-19 in Brazil
Authors:
Leonardo F. S. Scabini,
Lucas C. Ribas,
Mariane B. Neiva,
Altamir G. B. Junior,
Alex J. F. Farfán,
Odemir M. Bruno
Abstract:
We are currently living in a state of uncertainty due to the pandemic caused by the SARS-CoV-2 virus. Several factors are involved in the epidemic spreading, such as the individual characteristics of each city/country. The true shape of the epidemic dynamics is a large, complex system, like most social systems. In this context, complex networks are a great candidate to analyze these systems due to their ability to tackle structural and dynamical properties. Therefore, this study presents a new approach to model the COVID-19 epidemic using a multi-layer complex network, where nodes represent people, edges are social contacts, and layers represent different social activities. The model improves on the traditional SIR model and is applied to study the Brazilian epidemic by analyzing possible future actions and their consequences. The network is characterized using statistics of infection, death, and hospitalization time. To simulate isolation, social distancing, or precautionary measures, we remove layers and/or reduce the intensity of social contacts. Results show that, even under various optimistic assumptions, the current isolation levels in Brazil may still lead to a critical scenario for the healthcare system and a considerable death toll (average of 149,000). If all activities return to normal, the epidemic growth may suffer a steep increase, and the demand for ICU beds may surpass 3 times the country's capacity. This would surely lead to a catastrophic scenario, as our estimation reaches an average of 212,000 deaths even considering that all cases are effectively treated. Increasing isolation (up to a lockdown) proves to be the best option to keep the situation under the healthcare system capacity, while also ensuring a faster decrease of new case occurrences (months of difference) and a significantly smaller death toll (average of 87,000).
Submitted 20 May, 2020; v1 submitted 16 May, 2020;
originally announced May 2020.
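The abstract above positions its multi-layer network model as an improvement on the traditional SIR model. For context, here is a minimal sketch of that classical baseline only (not the paper's network model), using a simple one-day Euler step; the function name, the transmission rate `beta`, the recovery rate `gamma`, and the toy population values are illustrative assumptions.

```python
def simulate_sir(population, infected0, beta, gamma, days):
    """Discrete-time SIR compartment model with a one-day step.

    beta: transmission rate per day; gamma: recovery rate per day.
    Returns a list of (susceptible, infected, recovered) tuples, one per day.
    """
    s = float(population - infected0)
    i = float(infected0)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Toy run: 1000 people, 1 initially infected, 60 simulated days.
trajectory = simulate_sir(1000, 1, beta=0.3, gamma=0.1, days=60)
peak_infected = max(i for _, i, _ in trajectory)
print(round(peak_infected))
```

This baseline treats the population as fully mixed, which is the simplification a contact network with separate social layers is meant to relax.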
-
Spatio-spectral networks for color-texture analysis
Authors:
Leonardo F. S. Scabini,
Lucas C. Ribas,
Odemir M. Bruno
Abstract:
Texture has been one of the most-studied visual attributes for image characterization since the 1960s. However, most hand-crafted descriptors are monochromatic, focusing on gray-scale images and discarding the color information. In this context, this work focuses on a new method for color texture analysis considering all color channels in a more intrinsic approach. Our proposal consists of modeling color images as directed complex networks that we named Spatio-Spectral Network (SSN). Its topology includes within-channel edges that cover spatial patterns throughout individual image color channels, while between-channel edges tackle spectral properties of channel pairs in an opponent fashion. Image descriptors are obtained through a concise topological characterization of the modeled network in a multiscale approach with radially symmetric neighborhoods. Experiments with four datasets cover several aspects of color-texture analysis, and results demonstrate that SSN outperforms all the compared literature methods, including known deep convolutional networks, and also has the most stable performance across datasets, achieving an average accuracy of $98.5(\pm1.1)$ against $97.1(\pm1.3)$ for MCND and $96.8(\pm3.2)$ for AlexNet. Additionally, an experiment verifies the performance of the methods under different color spaces, where results show that SSN also has higher performance and robustness.
Submitted 13 September, 2019;
originally announced September 2019.
-
Fusion of complex networks and randomized neural networks for texture analysis
Authors:
Lucas C. Ribas,
Jarbas J. M. Sa Junior,
Leonardo F. S. Scabini,
Odemir M. Bruno
Abstract:
This paper presents a highly discriminative texture analysis method based on the fusion of complex networks and randomized neural networks. In this approach, the input image is modeled as a complex network, and its topological properties as well as the image pixels are used to train randomized neural networks in order to create a signature that represents the deep characteristics of the texture. The results obtained surpassed the accuracies of many methods available in the literature. This performance demonstrates that our proposed approach opens a promising line of research, which consists of exploring the synergy of neural networks and complex networks in the texture analysis field.
Submitted 17 August, 2020; v1 submitted 24 June, 2018;
originally announced June 2018.
-
Multilayer Complex Network Descriptors for Color-Texture Characterization
Authors:
Leonardo F S Scabini,
Rayner H M Condori,
Wesley N Gonçalves,
Odemir M Bruno
Abstract:
A new method based on complex networks is proposed for color-texture analysis. The proposal consists of modeling the image as a multilayer complex network where each color channel is a layer, and each pixel (in each color channel) is represented as a network vertex. The network dynamic evolution is assessed using a set of modeling parameters (radii and thresholds), and new characterization techniques are introduced to capture information regarding within and between color channel spatial interaction. An automatic and adaptive approach for threshold selection is also proposed. We conduct classification experiments on 5 well-known datasets: Vistex, Usptex, Outex13, CURet and MBT. Results are compared against various methods from the literature, including deep convolutional neural networks with pre-trained architectures. The proposed method presented the highest overall performance over the 5 datasets, with a mean accuracy of 97.7 against 97.0 achieved by the ResNet convolutional neural network with 50 layers.
Submitted 2 April, 2018;
originally announced April 2018.