Search | arXiv e-print repository

A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis

Authors: Leonardo Scabini, Andre Sacilotti, Kallil M. Zielinski, Lucas C. Ribas, Bernard De Baets, Odemir M. Bruno

Abstract: Texture, a significant visual attribute in images, has been extensively investigated across various image recognition applications. Convolutional Neural Networks (CNNs), which have been successful in many computer vision tasks, are currently among the best texture analysis approaches. On the other hand, Vision Transformers (ViTs) have been surpassing the performance of CNNs on tasks such as object recognition, causing a paradigm shift in the field. However, ViTs have so far not been scrutinized for texture recognition, hindering a proper appreciation of their potential in this specific setting. For this reason, this work explores various pre-trained ViT architectures when transferred to tasks that rely on textures. We review 21 different ViT variants and perform an extensive evaluation and comparison with CNNs and hand-engineered models on several tasks, such as assessing robustness to changes in texture rotation, scale, and illumination, and distinguishing color textures, material textures, and texture attributes. The goal is to understand the potential and differences among these models when directly applied to texture recognition, using pre-trained ViTs primarily for feature extraction and employing linear classifiers for evaluation. We also evaluate their efficiency, which is one of the main drawbacks in contrast to other methods. Our results show that ViTs generally outperform both CNNs and hand-engineered models, especially when using stronger pre-training and tasks involving in-the-wild textures (images from the internet). We highlight the following promising models: ViT-B with DINO pre-training, BeiTv2, and the Swin architecture, as well as the EfficientFormer as a low-cost alternative. In terms of efficiency, although having a higher number of GFLOPs and parameters, ViT-B and BeiT(v2) can achieve a lower feature extraction time on GPUs compared to ResNet50.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2404.08585 [pdf, ps, other]

Advanced wood species identification based on multiple anatomical sections and using deep feature transfer and fusion

Authors: Kallil M. Zielinski, Leonardo Scabini, Lucas C. Ribas, Núbia R. da Silva, Hans Beeckman, Jan Verwaeren, Odemir M. Bruno, Bernard De Baets

Abstract: In recent years, we have seen many advancements in wood species identification. Methods like DNA analysis, Near Infrared (NIR) spectroscopy, and Direct Analysis in Real Time (DART) mass spectrometry complement the long-established wood anatomical assessment of cell and tissue morphology. However, most of these methods have some limitations such as high costs, the need for skilled experts for data interpretation, and the lack of good datasets for professional reference. Therefore, most of these methods, and certainly the wood anatomical assessment, may benefit from tools based on Artificial Intelligence. In this paper, we apply two transfer learning techniques with Convolutional Neural Networks (CNNs) to a multi-view Congolese wood species dataset including sections from different orientations and viewed at different microscopic magnifications. We explore two feature extraction methods in detail, namely Global Average Pooling (GAP) and Random Encoding of Aggregated Deep Activation Maps (RADAM), for efficient and accurate wood species identification. Our results indicate superior accuracy on diverse datasets and anatomical sections, surpassing the results of other methods. Our proposal represents a significant advancement in wood species identification, offering a robust tool to support the conservation of forest ecosystems and promote sustainable forestry practices.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 33 pages, 7 tables, 9 figures

arXiv:2311.03056 [pdf]

LitSumm: Large language models for literature summarisation of non-coding RNAs

Authors: Andrew Green, Carlos Ribas, Nancy Ontiveros-Palacios, Sam Griffiths-Jones, Anton I. Petrov, Alex Bateman, Blake Sweeney

Abstract: Motivation: Curation of literature in life sciences is a growing challenge. The continued increase in the rate of publication, coupled with the relatively fixed number of curators worldwide presents a major challenge to developers of biomedical knowledgebases. Very few knowledgebases have resources to scale to the whole relevant literature and all have to prioritise their efforts. Results: In this work, we take a first step to alleviating the lack of curator time in RNA science by generating summaries of literature for non-coding RNAs using large language models (LLMs). We demonstrate that high-quality, factually accurate summaries with accurate references can be automatically generated from the literature using a commercial LLM and a chain of prompts and checks. Manual assessment was carried out for a subset of summaries, with the majority being rated extremely high quality. We also applied the most commonly used automated evaluation approaches, finding that they do not correlate with human assessment. Finally, we apply our tool to a selection of over 4,600 ncRNAs and make the generated summaries available via the RNAcentral resource. We conclude that automated literature summarization is feasible with the current generation of LLMs, provided careful prompting and automated checking are applied. Availability: Code used to produce these summaries can be found here: https://github.com/RNAcentral/litscan-summarization and the dataset of contexts and summaries can be found here: https://huggingface.co/datasets/RNAcentral/litsumm-v1. Summaries are also displayed on the RNA report pages in RNAcentral (https://rnacentral.org/)

Submitted 19 April, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

arXiv:2303.04554 [pdf, other]

RADAM: Texture Recognition through Randomized Aggregated Encoding of Deep Activation Maps

Authors: Leonardo Scabini, Kallil M. Zielinski, Lucas C. Ribas, Wesley N. Gonçalves, Bernard De Baets, Odemir M. Bruno

Abstract: Texture analysis is a classical yet challenging task in computer vision for which deep neural networks are actively being applied. Most approaches are based on building feature aggregation modules around a pre-trained backbone and then fine-tuning the new architecture on specific texture recognition tasks. Here we propose a new method named Random encoding of Aggregated Deep Activation Maps (RADAM) which extracts rich texture representations without ever changing the backbone. The technique consists of encoding the output at different depths of a pre-trained deep convolutional network using a Randomized Autoencoder (RAE). The RAE is trained locally to each image using a closed-form solution, and its decoder weights are used to compose a 1-dimensional texture representation that is fed into a linear SVM. This means that no fine-tuning or backpropagation is needed. We explore RADAM on several texture benchmarks and achieve state-of-the-art results with different computational budgets. Our results suggest that pre-trained backbones may not require additional fine-tuning for texture recognition if their learned representations are better encoded.

Submitted 8 March, 2023; originally announced March 2023.

Comments: 17 pages, 3 figures, submitted to peer-review journal

ACM Class: I.2.10; I.2.6; I.4.10; I.4.7; I.5.4; I.5.2; I.5.1

arXiv:2211.13000 [pdf, ps, other]

A Network Classification Method based on Density Time Evolution Patterns Extracted from Network Automata

Authors: Kallil M. C. Zielinski, Lucas C. Ribas, Jeaneth Machicao, Odemir M. Bruno

Abstract: Network modeling has proven to be an efficient tool for many interdisciplinary areas, including social, biological, transport, and many other real-world complex systems. In addition, cellular automata (CA) are a formalism that has been studied in the last decades as a model for exploring patterns in the dynamic spatio-temporal behavior of these systems based on local rules. Some studies explore the use of cellular automata to analyze the dynamic behavior of networks, denominating them as network automata (NA). Recently, NA proved to be efficient for network classification, since it uses a time-evolution pattern (TEP) for the feature extraction. However, the TEPs explored by previous studies are composed of binary values, which do not represent detailed information on the network analyzed. Therefore, in this paper, we propose alternate sources of information to use as descriptors for the classification task, which we denominate as density time-evolution pattern (D-TEP) and state density time-evolution pattern (SD-TEP). We explore the density of alive neighbors of each node, which is a continuous value, and compute feature vectors based on histograms of the TEPs. Our results show a significant improvement compared to previous studies on five synthetic network databases and also seven real-world databases. Our proposed method demonstrates not only a good approach for pattern recognition in networks, but also shows great potential for other kinds of data, such as images.

Submitted 18 November, 2022; originally announced November 2022.

arXiv:2204.03583 [pdf]

Risk-based regulation for all: The need and a method for a wide adoption solution for data-driven inspection targeting

Authors: Celso H. H. Ribas, José C. M. Bermudez

Abstract: Access to data and data processing, including the use of machine learning techniques, has become significantly easier and cheaper in recent years. Nevertheless, solutions that can be widely adopted by regulators for market monitoring and inspection targeting in a data-driven way have not been frequently discussed by the scientific community. This article discusses the need for and the difficulties in developing such solutions, presents an effective method to address regulation planning, and illustrates its use to account for the most important and common subject for the majority of regulators: the consumer. This article hopes to contribute to increasing the awareness of the regulatory community of the need for data processing methods that are objective, impartial, transparent, explainable, simple to implement and with low computational cost, aiming at the implementation of risk-based regulation in the world.

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2007.05643 [pdf, other]

Learning Local Complex Features using Randomized Neural Networks for Texture Analysis

Authors: Lucas C. Ribas, Leonardo F. S. Scabini, Jarbas Joaci de Mesquita Sá Junior, Odemir M. Bruno

Abstract: Texture is a visual attribute largely used in many problems of image analysis. Currently, many methods that use learning techniques have been proposed for texture discrimination, achieving improved performance over previous handcrafted methods. In this paper, we present a new approach that combines a learning technique and the Complex Network (CN) theory for texture analysis. This method takes advantage of the representation capacity of CN to model a texture image as a directed network and uses the topological information of vertices to train a randomized neural network. This neural network has a single hidden layer and uses a fast learning algorithm, which is able to learn local CN patterns for texture characterization. Thus, we use the weights of the trained neural network to compose a feature vector. These feature vectors are evaluated in a classification experiment in four widely used image databases. Experimental results show a high classification performance of the proposed method when compared to other methods, indicating that our approach can be used in many image analysis problems.

Submitted 17 August, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

arXiv:2005.08125 [pdf, other]

doi 10.1016/j.physa.2020.125498

Social Interaction Layers in Complex Networks for the Dynamical Epidemic Modeling of COVID-19 in Brazil

Authors: Leonardo F. S. Scabini, Lucas C. Ribas, Mariane B. Neiva, Altamir G. B. Junior, Alex J. F. Farfán, Odemir M. Bruno

Abstract: We are currently living in a state of uncertainty due to the pandemic caused by the SARS-CoV-2 virus. There are several factors involved in the epidemic spreading such as the individual characteristics of each city/country. The true shape of the epidemic dynamics is a large, complex system, like most social systems. In this context, complex networks are a great candidate to analyze these systems due to their ability to tackle structural and dynamical properties. Therefore this study presents a new approach to model the COVID-19 epidemic using a multi-layer complex network, where nodes represent people, edges are social contacts, and layers represent different social activities. The model improves the traditional SIR and it is applied to study the Brazilian epidemic by analyzing possible future actions and their consequences. The network is characterized using statistics of infection, death, and hospitalization time. To simulate isolation, social distancing, or precautionary measures we remove layers and/or reduce the intensity of social contacts. Results show that even taking various optimistic assumptions, the current isolation levels in Brazil still may lead to a critical scenario for the healthcare system and a considerable death toll (average of 149,000). If all activities return to normal, the epidemic growth may suffer a steep increase, and the demand for ICU beds may surpass 3 times the country's capacity. This would surely lead to a catastrophic scenario, as our estimation reaches an average of 212,000 deaths even considering that all cases are effectively treated. The increase of isolation (up to a lockdown) shows to be the best option to keep the situation under the healthcare system capacity, aside from ensuring a faster decrease of new case occurrences (months of difference), and a significantly smaller death toll (average of 87,000).

Submitted 20 May, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

Comments: 16 pages, 7 figures, 2 tables

MSC Class: 05C82 (Primary) 05C81; 92C60; 37M05 (Secondary)

arXiv:1909.06446 [pdf, other]

doi 10.1016/j.ins.2019.11.042

Spatio-spectral networks for color-texture analysis

Authors: Leonardo F. S. Scabini, Lucas C. Ribas, Odemir M. Bruno

Abstract: Texture has been one of the most-studied visual attributes for image characterization since the 1960s. However, most hand-crafted descriptors are monochromatic, focusing on gray-scale images and discarding the color information. In this context, this work focuses on a new method for color texture analysis considering all color channels in a more intrinsic approach. Our proposal consists of modeling color images as directed complex networks that we named Spatio-Spectral Network (SSN). Its topology includes within-channel edges that cover spatial patterns throughout individual image color channels, while between-channel edges tackle spectral properties of channel pairs in an opponent fashion. Image descriptors are obtained through a concise topological characterization of the modeled network in a multiscale approach with radially symmetric neighborhoods. Experiments with four datasets cover several aspects of color-texture analysis, and results demonstrate that SSN outperforms all the compared literature methods, including known deep convolutional networks, and also has the most stable performance between datasets, achieving $98.5(\pm1.1)$ of average accuracy against $97.1(\pm1.3)$ of MCND and $96.8(\pm3.2)$ of AlexNet. Additionally, an experiment verifies the performance of the methods under different color spaces, where results show that SSN also has higher performance and robustness.

Submitted 13 September, 2019; originally announced September 2019.

arXiv:1806.10681 [pdf, other]

Dynamic texture analysis with diffusion in networks

Authors: Lucas C. Ribas, Wesley N. Goncalves, Odemir M. Bruno

Abstract: Dynamic texture is a field of research that has gained considerable interest from the computer vision community due to the explosive growth of multimedia databases. In addition, dynamic texture is present in a wide range of videos, which makes it very important in expert systems based on videos such as medical systems, traffic monitoring systems, and forest fire detection systems, among others. In this paper, a new method for dynamic texture characterization based on diffusion in directed networks is proposed. The dynamic texture is modeled as a directed network. The method consists of analyzing the dynamics of this network after a series of graph cut transformations based on the edge weights. For each network transformation, the activity for each vertex is estimated. The activity is the relative frequency that one vertex is visited by random walks in balance. Then, a texture descriptor is constructed by concatenating the activity histograms. The main contributions of this paper are the use of directed network modeling and diffusion in networks for dynamic texture characterization. These tend to provide better performance in dynamic texture classification. Experiments with rotation and interference of the motion pattern were conducted in order to demonstrate the robustness of the method. The proposed approach is compared to other dynamic texture methods on two well-known dynamic texture databases and on traffic condition classification, and outperforms them in most cases.

Submitted 27 June, 2018; originally announced June 2018.

Comments: 30 pages, 20 figures

arXiv:1806.09170 [pdf, other]

Fusion of complex networks and randomized neural networks for texture analysis

Authors: Lucas C. Ribas, Jarbas J. M. Sa Junior, Leonardo F. S. Scabini, Odemir M. Bruno

Abstract: This paper presents a highly discriminative texture analysis method based on the fusion of complex networks and randomized neural networks. In this approach, the input image is modeled as a complex network and its topological properties as well as the image pixels are used to train randomized neural networks in order to create a signature that represents the deep characteristics of the texture. The results obtained surpassed the accuracies of many methods available in the literature. This performance demonstrates that our proposed approach opens a promising source of research, which consists of exploring the synergy of neural networks and complex networks in the texture analysis field.

Submitted 17 August, 2020; v1 submitted 24 June, 2018; originally announced June 2018.

Comments: 13 pages, 4 figures

arXiv:1611.08629 [pdf, other]

Texture analysis using deterministic partially self-avoiding walk with thresholds

Authors: Lucas Correia Ribas, Wesley Nunes Gonçalves, Odemir Martinez Bruno

Abstract: In this paper, we propose a new texture analysis method using the deterministic partially self-avoiding walk performed on maps modified with thresholds. In this method, two pixels of the map are neighbors if the Euclidean distance between them is less than $\sqrt{2}$ and the weight (difference between their intensities) is less than a given threshold. The maps obtained by using different thresholds highlight several properties of the image that are extracted by the deterministic walk. To compose the feature vector, deterministic walks are performed with different thresholds and their statistics are concatenated. Thus, this approach can be considered as a multi-scale analysis. We validate our method on the Brodatz database, which is a well-known public image database widely used by texture analysis methods. Experimental results indicate that the proposed method presents a good texture discrimination, outperforming traditional texture methods.

Submitted 25 November, 2016; originally announced November 2016.

Comments: 8 pages, 3 figures

Journal ref: WVC proceedings 2016, pages 39-44

arXiv:1611.08624 [pdf, other]

Fast deterministic tourist walk for texture analysis

Authors: Lucas Correia Ribas, Odemir Martinez Bruno

Abstract: Deterministic tourist walk (DTW) has attracted increasing interest in computer vision. In the last years, different methods for analysis of dynamic and static textures were proposed. So far, all works based on the DTW for texture analysis use all image pixels as initial points of walks. However, this requires much runtime. In this paper, we conducted a study to verify the performance of the DTW method according to the number of initial points to start a walk. The proposed method assigns a unique code to each image pixel; then, the pixels whose code is not divisible by a given $k$ value are ignored as initial points of walks. Feature vectors were extracted and a classification process was performed for different percentages of initial points. Experimental results on the Brodatz and Vistex datasets indicate that using fewer pixels as initial points significantly improves the runtime compared to using all image pixels. In addition, the correct classification rate decreases very little.

Submitted 25 November, 2016; originally announced November 2016.

Comments: 7 pages, 7 figures

Journal ref: WVC 2016 proceedings p45-50

arXiv:1502.06810 [pdf]

White Paper: Radio y Redes Cognitivas

Authors: Carles Anton Haro, Luis Castedo Ribas, Javier del Ser Lorente, Armin Dekorsy, Miguel Egido Cortes, Xavier Gelabert, Lorenza Giupponi, Xavier Mestre, Jose Monserrat, Carlos Mosquera, Miquel Soriano, Liesbet van der Perre, Jon Arambarri, Juan Antonio Romo

Abstract: Traditionally, two different policies to access the radio spectrum have coexisted: licensed regulation, whereby the rights to use specific spectral bands are granted in exclusivity to an individual operator; or unlicensed regulation, according to which certain spectral bands are declared open for free use by any operator or individual following specific rules. While these paradigms have allowed the wireless communications sector to blossom in the past, in recent years they have evidenced shortcomings and given signs of exhaustion. For instance, it is quite usual to encounter fully overloaded mobile communication systems coexisting with unused contiguous spectral bands. This clearly advocates for a more flexible and dynamic allocation of the spectrum resources which can only be achieved with the advent of the so-called cognitive radios and networks. This whitepaper provides an accurate description of priority research activities and open challenges related to the different functionalities of cognitive radios and networks. First, we outline the main open problems related to the theoretical characterization of cognitive radios, spectrum sensing techniques as well as the optimization of physical layer functionalities in these networks. Second, we provide a description of the main research challenges that arise from a system point of view: MAC protocol optimization, traffic modelling, RRM strategies, routing paradigms or security issues. Next, we point out other problems related to the practical hardware implementation of cognitive radios, giving special emphasis to sensing capabilities, reconfigurability and cognitive control and management. Finally, we succinctly report on a number of current activities related to the standardization of cognitive radio systems.

Submitted 24 February, 2015; originally announced February 2015.

Comments: AEI eMOV White Paper, published 04/11/2011, In Spanish

Showing 1–14 of 14 results for author: Ribas, C