Search | arXiv e-print repository

Strength in Diversity: Multi-Branch Representation Learning for Vehicle Re-Identification

Authors: Eurico Almeida, Bruno Silva, Jorge Batista

Abstract: This paper presents an efficient and lightweight multi-branch deep architecture to improve vehicle re-identification (V-ReID). While most V-ReID work uses a combination of complex multi-branch architectures to extract robust and diversified embeddings towards re-identification, we advocate that simple and lightweight architectures can be designed to fulfill the Re-ID task without compromising perf… ▽ More This paper presents an efficient and lightweight multi-branch deep architecture to improve vehicle re-identification (V-ReID). While most V-ReID work uses a combination of complex multi-branch architectures to extract robust and diversified embeddings towards re-identification, we advocate that simple and lightweight architectures can be designed to fulfill the Re-ID task without compromising performance. We propose a combination of Grouped-convolution and Loss-Branch-Split strategies to design a multi-branch architecture that improve feature diversity and feature discriminability. We combine a ResNet50 global branch architecture with a BotNet self-attention branch architecture, both designed within a Loss-Branch-Split (LBS) strategy. We argue that specialized loss-branch-splitting helps to improve re-identification tasks by generating specialized re-identification features. A lightweight solution using grouped convolution is also proposed to mimic the learning of loss-splitting into multiple embeddings while significantly reducing the model size. In addition, we designed an improved solution to leverage additional metadata, such as camera ID and pose information, that uses 97% less parameters, further improving re-identification performance. In comparison to state-of-the-art (SoTA) methods, our approach outperforms competing solutions in Veri-776 by achieving 85.6% mAP and 97.7% CMC1 and obtains competitive results in Veri-Wild with 88.1% mAP and 96.3% CMC1. Overall, our work provides important insights into improving vehicle re-identification and presents a strong basis for other retrieval tasks. Our code is available at the https://github.com/videturfortuna/vehicle_reid_itsc2023. △ Less

Submitted 2 October, 2023; originally announced October 2023.

Comments: Paper accepted in ITSC2023

arXiv:2102.04179 [pdf, other]

Plotting time: On the usage of CNNs for time series classification

Authors: Nuno M. Rodrigues, João E. Batista, Leonardo Trujillo, Bernardo Duarte, Mario Giacobini, Leonardo Vanneschi, Sara Silva

Abstract: We present a novel approach for time series classification where we represent time series data as plot images and feed them to a simple CNN, outperforming several state-of-the-art methods. We propose a simple and highly replicable way of plotting the time series, and feed these images as input to a non-optimized shallow CNN, without any normalization or residual connections. These representations… ▽ More We present a novel approach for time series classification where we represent time series data as plot images and feed them to a simple CNN, outperforming several state-of-the-art methods. We propose a simple and highly replicable way of plotting the time series, and feed these images as input to a non-optimized shallow CNN, without any normalization or residual connections. These representations are no more than default line plots using the time series data, where the only pre-processing applied is to reduce the number of white pixels in the image. We compare our method with different state-of-the-art methods specialized in time series classification on two real-world non public datasets, as well as 98 datasets of the UCR dataset collection. The results show that our approach is very promising, achieving the best results on both real-world datasets and matching / beating the best state-of-the-art methods in six UCR datasets. We argue that, if a simple naive design like ours can obtain such good results, it is worth further exploring the capabilities of using image representation of time series data, along with more powerful CNNs, for classification and other related tasks. △ Less

Submitted 8 February, 2021; originally announced February 2021.

arXiv:2002.00053 [pdf, other]

Improving the Detection of Burnt Areas in Remote Sensing using Hyper-features Evolved by M3GP

Authors: João E. Batista, Sara Silva

Abstract: One problem found when working with satellite images is the radiometric variations across the image and different images. Intending to improve remote sensing models for the classification of burnt areas, we set two objectives. The first is to understand the relationship between feature spaces and the predictive ability of the models, allowing us to explain the differences between learning and gene… ▽ More One problem found when working with satellite images is the radiometric variations across the image and different images. Intending to improve remote sensing models for the classification of burnt areas, we set two objectives. The first is to understand the relationship between feature spaces and the predictive ability of the models, allowing us to explain the differences between learning and generalization when training and testing in different datasets. We find that training on datasets built from more than one image provides models that generalize better. These results are explained by visualizing the dispersion of values on the feature space. The second objective is to evolve hyper-features that improve the performance of different classifiers on a variety of test sets. We find the hyper-features to be beneficial, and obtain the best models with XGBoost, even if the hyper-features are optimized for a different method. △ Less

Submitted 31 January, 2020; originally announced February 2020.

arXiv:2001.07553 [pdf, other]

Ensemble Genetic Programming

Authors: Nuno M. Rodrigues, João E. Batista, Sara Silva

Abstract: Ensemble learning is a powerful paradigm that has been usedin the top state-of-the-art machine learning methods like Random Forestsand XGBoost. Inspired by the success of such methods, we have devel-oped a new Genetic Programming method called Ensemble GP. The evo-lutionary cycle of Ensemble GP follows the same steps as other GeneticProgramming systems, but with differences in the population struc… ▽ More Ensemble learning is a powerful paradigm that has been usedin the top state-of-the-art machine learning methods like Random Forestsand XGBoost. Inspired by the success of such methods, we have devel-oped a new Genetic Programming method called Ensemble GP. The evo-lutionary cycle of Ensemble GP follows the same steps as other GeneticProgramming systems, but with differences in the population structure,fitness evaluation and genetic operators. We have tested this method oneight binary classification problems, achieving results significantly betterthan standard GP, with much smaller models. Although other methodslike M3GP and XGBoost were the best overall, Ensemble GP was able toachieve exceptionally good generalization results on a particularly hardproblem where none of the other methods was able to succeed. △ Less

Submitted 21 January, 2020; originally announced January 2020.

Comments: eurogp 2020 submission

arXiv:1503.06465 [pdf, other]

Lifting Object Detection Datasets into 3D

Authors: Joao Carreira, Sara Vicente, Lourdes Agapito, Jorge Batista

Abstract: While data has certainly taken the center stage in computer vision in recent years, it can still be difficult to obtain in certain scenarios. In particular, acquiring ground truth 3D shapes of objects pictured in 2D images remains a challenging feat and this has hampered progress in recognition-based object reconstruction from a single image. Here we propose to bypass previous solutions such as 3D… ▽ More While data has certainly taken the center stage in computer vision in recent years, it can still be difficult to obtain in certain scenarios. In particular, acquiring ground truth 3D shapes of objects pictured in 2D images remains a challenging feat and this has hampered progress in recognition-based object reconstruction from a single image. Here we propose to bypass previous solutions such as 3D scanning or manual design, that scale poorly, and instead populate object category detection datasets semi-automatically with dense, per-object 3D reconstructions, bootstrapped from:(i) class labels, (ii) ground truth figure-ground segmentations and (iii) a small set of keypoint annotations. Our proposed algorithm first estimates camera viewpoint using rigid structure-from-motion and then reconstructs object shapes by optimizing over visual hull proposals guided by loose within-class shape similarity assumptions. The visual hull sampling process attempts to intersect an object's projection cone with the cones of minimal subsets of other similar objects among those pictured from certain vantage points. We show that our method is able to produce convincing per-object 3D reconstructions and to accurately estimate cameras viewpoints on one of the most challenging existing object-category detection datasets, PASCAL VOC. We hope that our results will re-stimulate interest on joint object recognition and 3D reconstruction from a single image. △ Less

Submitted 31 July, 2016; v1 submitted 22 March, 2015; originally announced March 2015.

arXiv:1404.7584 [pdf, other]

doi 10.1109/TPAMI.2014.2345390

High-Speed Tracking with Kernelized Correlation Filters

Authors: João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista

Abstract: The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies -- any overlap** pixels are constrained to be the same. Based on this simple… ▽ More The core component of most modern trackers is a discriminative classifier, tasked with distinguishing between the target and the surrounding environment. To cope with natural image changes, this classifier is typically trained with translated and scaled sample patches. Such sets of samples are riddled with redundancies -- any overlap** pixels are constrained to be the same. Based on this simple observation, we propose an analytic model for datasets of thousands of translated patches. By showing that the resulting data matrix is circulant, we can diagonalize it with the Discrete Fourier Transform, reducing both storage and computation by several orders of magnitude. Interestingly, for linear regression our formulation is equivalent to a correlation filter, used by some of the fastest competitive trackers. For kernel regression, however, we derive a new Kernelized Correlation Filter (KCF), that unlike other kernel algorithms has the exact same complexity as its linear counterpart. Building on it, we also propose a fast multi-channel extension of linear correlation filters, via a linear kernel, which we call Dual Correlation Filter (DCF). Both KCF and DCF outperform top-ranking trackers such as Struck or TLD on a 50 videos benchmark, despite running at hundreds of frames-per-second, and being implemented in a few lines of code (Algorithm 1). To encourage further developments, our tracking framework was made open-source. △ Less

Submitted 4 November, 2014; v1 submitted 30 April, 2014; originally announced April 2014.

arXiv:1101.5379 [pdf, ps, other]

doi 10.1103/PhysRevE.85.036105

How Many Nodes are Effectively Accessed in Complex Networks?

Authors: Matheus P. Viana, João L. B. Batista, Luciano da F. Costa

Abstract: The measurement called accessibility has been proposed as a means to quantify the efficiency of the communication between nodes in complex networks. This article reports important results regarding the properties of the accessibility, including its relationship with the average minimal time to visit all nodes reachable after $h$ steps along a random walk starting from a source, as well as the numb… ▽ More The measurement called accessibility has been proposed as a means to quantify the efficiency of the communication between nodes in complex networks. This article reports important results regarding the properties of the accessibility, including its relationship with the average minimal time to visit all nodes reachable after $h$ steps along a random walk starting from a source, as well as the number of nodes that are visited after a finite period of time. We characterize the relationship between accessibility and the average number of walks required in order to visit all reachable nodes (the exploration time), conjecture that the maximum accessibility implies the minimal exploration time, and confirm the relationship between the accessibility values and the number of nodes visited after a basic time unit. The latter relationship is investigated with respect to three types of dynamics, namely: traditional random walks, self-avoiding random walks, and preferential random walks. △ Less

Submitted 1 October, 2011; v1 submitted 27 January, 2011; originally announced January 2011.

Comments: 8 pages and 7 figures

MSC Class: 05C82

Showing 1–7 of 7 results for author: Batista, J