Search | arXiv e-print repository

MTLSegFormer: Multi-task Learning with Transformers for Semantic Segmentation in Precision Agriculture

Authors: Diogo Nunes Goncalves, Jose Marcato Junior, Pedro Zamboni, Hemerson Pistori, Jonathan Li, Keiller Nogueira, Wesley Nunes Goncalves

Abstract: Multi-task learning has proven to be effective in improving the performance of correlated tasks. Most of the existing methods use a backbone to extract initial features with independent branches for each task, and the exchange of information between the branches usually occurs through the concatenation or sum of the feature maps of the branches. However, this type of information exchange does not… ▽ More Multi-task learning has proven to be effective in improving the performance of correlated tasks. Most of the existing methods use a backbone to extract initial features with independent branches for each task, and the exchange of information between the branches usually occurs through the concatenation or sum of the feature maps of the branches. However, this type of information exchange does not directly consider the local characteristics of the image nor the level of importance or correlation between the tasks. In this paper, we propose a semantic segmentation method, MTLSegFormer, which combines multi-task learning and attention mechanisms. After the backbone feature extraction, two feature maps are learned for each task. The first map is proposed to learn features related to its task, while the second map is obtained by applying learned visual attention to locally re-weigh the feature maps of the other tasks. In this way, weights are assigned to local regions of the image of other tasks that have greater importance for the specific task. Finally, the two maps are combined and used to solve a task. We tested the performance in two challenging problems with correlated tasks and observed a significant improvement in accuracy, mainly in tasks with high dependence on the others. △ Less

Submitted 4 May, 2023; originally announced May 2023.

Comments: Accepted 4th Agriculture-Vision Workshop - CVPRW

arXiv:2102.04566 [pdf, other]

doi 10.1016/j.jag.2022.102690

Semantic Segmentation with Labeling Uncertainty and Class Imbalance

Authors: Patrik Olã Bressan, José Marcato Junior, José Augusto Correa Martins, Diogo Nunes Gonçalves, Daniel Matte Freitas, Lucas Prado Osco, Jonathan de Andrade Silva, Zhipeng Luo, Jonathan Li, Raymundo Cordero Garcia, Wesley Nunes Gonçalves

Abstract: Recently, methods based on Convolutional Neural Networks (CNN) achieved impressive success in semantic segmentation tasks. However, challenges such as the class imbalance and the uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pix… ▽ More Recently, methods based on Convolutional Neural Networks (CNN) achieved impressive success in semantic segmentation tasks. However, challenges such as the class imbalance and the uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pixel-wise weights are used during training to increase or decrease the importance of the pixels. Experimental results show that the proposed approach leads to significant improvements in three challenging segmentation tasks in comparison to baseline methods. It was also proved to be more invariant to noise. The approach presented here may be used within a wide range of semantic segmentation methods to improve their robustness. △ Less

Submitted 8 February, 2021; originally announced February 2021.

Comments: 15 pages, 9 figures, 3 tables

MSC Class: 68T07 ACM Class: I.2.1

Journal ref: International Journal of Applied Earth Observation and Geoinformation, 2022

arXiv:2102.04366 [pdf, other]

doi 10.1016/j.eswa.2022.116555

Counting and Locating High-Density Objects Using Convolutional Neural Network

Authors: Mauro dos Santos de Arruda, Lucas Prado Osco, Plabiany Rodrigo Acosta, Diogo Nunes Gonçalves, José Marcato Junior, Ana Paula Marques Ramos, Edson Takashi Matsubara, Zhipeng Luo, Jonathan Li, Jonathan de Andrade Silva, Wesley Nunes Gonçalves

Abstract: This paper presents a Convolutional Neural Network (CNN) approach for counting and locating objects in high-density imagery. To the best of our knowledge, this is the first object counting and locating method based on a feature map enhancement and a Multi-Stage Refinement of the confidence map. The proposed method was evaluated in two counting datasets: tree and car. For the tree dataset, our meth… ▽ More This paper presents a Convolutional Neural Network (CNN) approach for counting and locating objects in high-density imagery. To the best of our knowledge, this is the first object counting and locating method based on a feature map enhancement and a Multi-Stage Refinement of the confidence map. The proposed method was evaluated in two counting datasets: tree and car. For the tree dataset, our method returned a mean absolute error (MAE) of 2.05, a root-mean-squared error (RMSE) of 2.87 and a coefficient of determination (R$^2$) of 0.986. For the car dataset (CARPK and PUCPR+), our method was superior to state-of-the-art methods. In the these datasets, our approach achieved an MAE of 4.45 and 3.16, an RMSE of 6.18 and 4.39, and an R$^2$ of 0.975 and 0.999, respectively. The proposed method is suitable for dealing with high object-density, returning a state-of-the-art performance for counting and locating objects. △ Less

Submitted 8 February, 2021; originally announced February 2021.

Comments: 15 pages, 10 figures, 8 tables

MSC Class: 68T07 ACM Class: I.2.1

Journal ref: Expert Systems with Applications, 2022

arXiv:2102.03213 [pdf, other]

A Deep Learning Approach Based on Graphs to Detect Plantation Lines

Authors: Diogo Nunes Gonçalves, Mauro dos Santos de Arruda, Hemerson Pistori, Vanessa Jordão Marcato Fernandes, Ana Paula Marques Ramos, Danielle Elis Garcia Furuya, Lucas Prado Osco, Hongjie He, Jonathan Li, José Marcato Junior, Wesley Nunes Gonçalves

Abstract: Deep learning-based networks are among the most prominent methods to learn linear patterns and extract this type of information from diverse imagery conditions. Here, we propose a deep learning approach based on graphs to detect plantation lines in UAV-based RGB imagery presenting a challenging scenario containing spaced plants. The first module of our method extracts a feature map throughout the… ▽ More Deep learning-based networks are among the most prominent methods to learn linear patterns and extract this type of information from diverse imagery conditions. Here, we propose a deep learning approach based on graphs to detect plantation lines in UAV-based RGB imagery presenting a challenging scenario containing spaced plants. The first module of our method extracts a feature map throughout the backbone, which consists of the initial layers of the VGG16. This feature map is used as an input to the Knowledge Estimation Module (KEM), organized in three concatenated branches for detecting 1) the plant positions, 2) the plantation lines, and 3) for the displacement vectors between the plants. A graph modeling is applied considering each plant position on the image as vertices, and edges are formed between two vertices (i.e. plants). Finally, the edge is classified as pertaining to a certain plantation line based on three probabilities (higher than 0.5): i) in visual features obtained from the backbone; ii) a chance that the edge pixels belong to a line, from the KEM step; and iii) an alignment of the displacement vectors with the edge, also from KEM. Experiments were conducted in corn plantations with different growth stages and patterns with aerial RGB imagery. A total of 564 patches with 256 x 256 pixels were used and randomly divided into training, validation, and testing sets in a proportion of 60\%, 20\%, and 20\%, respectively. The proposed method was compared against state-of-the-art deep learning methods, and achieved superior performance with a significant margin, returning precision, recall, and F1-score of 98.7\%, 91.9\%, and 95.1\%, respectively. This approach is useful in extracting lines with spaced plantation patterns and could be implemented in scenarios where plantation gaps occur, generating lines with few-to-none interruptions. △ Less

Submitted 5 February, 2021; originally announced February 2021.

Comments: 19 pages, 11 figures, 4 tables

MSC Class: 68Txx

arXiv:2012.15827 [pdf, other]

doi 10.1016/j.isprsjprs.2021.01.024

A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery

Authors: Lucas Prado Osco, Mauro dos Santos de Arruda, Diogo Nunes Gonçalves, Alexandre Dias, Juliana Batistoti, Mauricio de Souza, Felipe David Georges Gomes, Ana Paula Marques Ramos, Lúcio André de Castro Jorge, Veraldo Liesenberg, Jonathan Li, Lingfei Ma, José Marcato Junior, Wesley Nunes Gonçalves

Abstract: In this paper, we propose a novel deep learning method based on a Convolutional Neural Network (CNN) that simultaneously detects and geolocates plantation-rows while counting its plants considering highly-dense plantation configurations. The experimental setup was evaluated in a cornfield with different growth stages and in a Citrus orchard. Both datasets characterize different plant density scena… ▽ More In this paper, we propose a novel deep learning method based on a Convolutional Neural Network (CNN) that simultaneously detects and geolocates plantation-rows while counting its plants considering highly-dense plantation configurations. The experimental setup was evaluated in a cornfield with different growth stages and in a Citrus orchard. Both datasets characterize different plant density scenarios, locations, types of crops, sensors, and dates. A two-branch architecture was implemented in our CNN method, where the information obtained within the plantation-row is updated into the plant detection branch and retro-feed to the row branch; which are then refined by a Multi-Stage Refinement method. In the corn plantation datasets (with both growth phases, young and mature), our approach returned a mean absolute error (MAE) of 6.224 plants per image patch, a mean relative error (MRE) of 0.1038, precision and recall values of 0.856, and 0.905, respectively, and an F-measure equal to 0.876. These results were superior to the results from other deep networks (HRNet, Faster R-CNN, and RetinaNet) evaluated with the same task and dataset. For the plantation-row detection, our approach returned precision, recall, and F-measure scores of 0.913, 0.941, and 0.925, respectively. To test the robustness of our model with a different type of agriculture, we performed the same task in the citrus orchard dataset. It returned an MAE equal to 1.409 citrus-trees per patch, MRE of 0.0615, precision of 0.922, recall of 0.911, and F-measure of 0.965. For citrus plantation-row detection, our approach resulted in precision, recall, and F-measure scores equal to 0.965, 0.970, and 0.964, respectively. The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops. △ Less

Submitted 14 February, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: 27 pages, 12 figures, 9 tables

ACM Class: J.2

Journal ref: ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING 174 (2021) 1-17

Showing 1–5 of 5 results for author: Goncalves, D N