-
Wild Berry image dataset collected in Finnish forests and peatlands using drones
Authors:
Luigi Riz,
Sergio Povoli,
Andrea Caraffa,
Davide Boscaini,
Mohamed Lamine Mekhalfi,
Paul Chippendale,
Marjut Turtiainen,
Birgitta Partanen,
Laura Smith Ballester,
Francisco Blanes Noguera,
Alessio Franchi,
Elisa Castelli,
Giacomo Piccinini,
Luca Marchesotti,
Micael Santos Couceiro,
Fabio Poiesi
Abstract:
Berry picking has long-standing traditions in Finland, yet it is challenging and can potentially be dangerous. The integration of drones equipped with advanced imaging techniques represents a transformative leap forward, optimising harvests and promising sustainable practices. We propose WildBe, the first image dataset of wild berries captured in peatlands and under the canopy of Finnish forests using drones. Unlike previous and related datasets, WildBe includes new varieties of berries, such as bilberries, cloudberries, lingonberries, and crowberries, captured under severe light variations and in cluttered environments. WildBe features 3,516 images, including a total of 18,468 annotated bounding boxes. We carry out a comprehensive analysis of WildBe using six popular object detectors, assessing their effectiveness in berry detection across different forest regions and camera types. We will release WildBe publicly.
Submitted 15 May, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Detect, Augment, Compose, and Adapt: Four Steps for Unsupervised Domain Adaptation in Object Detection
Authors:
Mohamed L. Mekhalfi,
Davide Boscaini,
Fabio Poiesi
Abstract:
Unsupervised domain adaptation (UDA) plays a crucial role in object detection when adapting a source-trained detector to a target domain without annotated data. In this paper, we propose a novel and effective four-step UDA approach that leverages self-supervision and trains on source and target data concurrently. We harness self-supervised learning to mitigate the lack of ground truth in the target domain. Our method consists of the following steps: (1) identify the region with the highest-confidence set of detections in each target image, which serve as our pseudo-labels; (2) crop the identified region and generate a collection of its augmented versions; (3) combine these augmented versions into a composite image; (4) adapt the network to the target domain using the composed image. Through extensive experiments under cross-camera, cross-weather, and synthetic-to-real scenarios, our approach achieves state-of-the-art performance, improving upon the nearest competitor by more than 2% in terms of mean Average Precision (mAP). The code is available at https://github.com/MohamedTEV/DACA.
Submitted 29 August, 2023;
originally announced August 2023.
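The first three steps listed in the abstract above (detect, augment, compose) can be illustrated with a toy sketch. This is not the authors' implementation: the quadrant-based region selection, the use of flips as the only augmentations, the 2x2 tiling, and all function names are assumptions made for illustration. Step 4 (adapting the detector on the composite) is a training step and is left out.

```python
from statistics import mean

def best_region(dets, width, height):
    """Step 1 (assumed variant): pick the image quadrant whose detections
    have the highest mean confidence. dets are (x1, y1, x2, y2, score)."""
    half_w, half_h = width // 2, height // 2
    best, best_score = None, -1.0
    for row in range(2):
        for col in range(2):
            x0, y0 = col * half_w, row * half_h
            # A detection belongs to the quadrant that contains its centre.
            inside = [d for d in dets
                      if x0 <= (d[0] + d[2]) / 2 < x0 + half_w
                      and y0 <= (d[1] + d[3]) / 2 < y0 + half_h]
            if inside and mean(d[4] for d in inside) > best_score:
                best_score = mean(d[4] for d in inside)
                best = (x0, y0, half_w, half_h, inside)
    return best

def crop(image, region):
    """Step 2a: crop the pixels and move the boxes into crop coordinates."""
    x0, y0, w, h, inside = region
    pixels = [row[x0:x0 + w] for row in image[y0:y0 + h]]
    boxes = [(max(x1 - x0, 0), max(y1 - y0, 0), min(x2 - x0, w), min(y2 - y0, h))
             for x1, y1, x2, y2, _ in inside]
    return pixels, boxes

def augment(pixels, boxes):
    """Step 2b: four views in the order identity, vertical flip,
    horizontal flip, both flips; boxes are flipped consistently."""
    h, w = len(pixels), len(pixels[0])
    views = []
    for flip_x in (False, True):
        for flip_y in (False, True):
            rows = pixels[::-1] if flip_y else pixels
            rows = [r[::-1] if flip_x else r[:] for r in rows]
            new_boxes = []
            for x1, y1, x2, y2 in boxes:
                if flip_x:
                    x1, x2 = w - x2, w - x1
                if flip_y:
                    y1, y2 = h - y2, h - y1
                new_boxes.append((x1, y1, x2, y2))
            views.append((rows, new_boxes))
    return views

def compose(views):
    """Step 3: tile the four views into a 2x2 composite and shift the
    pseudo-label boxes into composite coordinates."""
    h, w = len(views[0][0]), len(views[0][0][0])
    top = [a + b for a, b in zip(views[0][0], views[1][0])]
    bottom = [a + b for a, b in zip(views[2][0], views[3][0])]
    boxes = []
    for i, (_, view_boxes) in enumerate(views):
        dx, dy = (i % 2) * w, (i // 2) * h
        boxes += [(x1 + dx, y1 + dy, x2 + dx, y2 + dy)
                  for x1, y1, x2, y2 in view_boxes]
    return top + bottom, boxes

# Toy 4x4 "image" and two hypothetical detections on a target-domain frame.
image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
dets = [(0, 0, 1, 2, 0.9), (2, 2, 4, 4, 0.5)]
region = best_region(dets, width=4, height=4)
composite, pseudo_labels = compose(augment(*crop(image, region)))
```

In this sketch the composite has the same size as the original image, so the pair `(composite, pseudo_labels)` could be passed to an ordinary supervised detection loss during adaptation.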
-
The MONET dataset: Multimodal drone thermal dataset recorded in rural scenarios
Authors:
Luigi Riz,
Andrea Caraffa,
Matteo Bortolon,
Mohamed Lamine Mekhalfi,
Davide Boscaini,
André Moura,
José Antunes,
André Dias,
Hugo Silva,
Andreas Leonidou,
Christos Constantinides,
Christos Keleshis,
Dante Abate,
Fabio Poiesi
Abstract:
We present MONET, a new multimodal dataset captured using a thermal camera mounted on a drone that flew over rural areas, and recorded human and vehicle activities. We captured MONET to study the problem of object localisation and behaviour understanding of targets undergoing large-scale variations and being recorded from different and moving viewpoints. Target activities occur in two different land sites, each with unique scene structures and cluttered backgrounds. MONET consists of approximately 53K images featuring 162K manually annotated bounding boxes. Each image is timestamp-aligned with drone metadata that includes information about attitudes, speed, altitude, and GPS coordinates. MONET is different from previous thermal drone datasets because it features multimodal data, including rural scenes captured with thermal cameras containing both person and vehicle targets, along with trajectory information and metadata. We assessed the difficulty of the dataset in terms of transfer learning between the two sites and evaluated nine object detection algorithms to identify the open challenges associated with this type of data. Project page: https://github.com/fabiopoiesi/monet_dataset.
Submitted 19 July, 2023; v1 submitted 11 April, 2023;
originally announced April 2023.
-
A Sparse Representation of Complete Local Binary Pattern Histogram for Human Face Recognition
Authors:
Mawloud Guermoui,
Mohamed L. Mekhalfi
Abstract:
Human face recognition has been a long-standing problem in computer vision and pattern recognition. Facial analysis can be viewed as a two-fold problem, namely (i) facial representation, and (ii) classification. So far, many face representations have been proposed; a well-known method is the Local Binary Pattern (LBP), which has attracted growing interest. In this paper, we address both face representation and classification in a novel manner. On the one hand, we use a variant of LBP, the so-called Complete Local Binary Pattern (CLBP), which differs from the basic LBP by coding a given local region using the central pixel together with the sign and magnitude of the local differences. Moreover, most LBP-based descriptors use a fixed grid to code a given facial image, a technique that is, in most cases, not robust to pose variation and misalignment. To cope with this issue, a representative Multi-Resolution Histogram (MH) decomposition is adopted in our work. On the other hand, once the histograms of the considered images have been extracted, we exploit their sparsity to construct a so-called Sparse Representation Classifier (SRC) for face classification. Experiments conducted on the ORL face database point to the superiority of our scheme over other popular state-of-the-art techniques.
Submitted 31 May, 2016;
originally announced May 2016.
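The difference between the basic LBP code and the sign and magnitude components used by CLBP, as described in the abstract above, can be shown on a single 3x3 neighbourhood. This is a minimal sketch rather than the paper's pipeline: the neighbour ordering, the function names, and the choice of threshold are assumptions, and the multi-resolution histogram and sparse classifier stages are not covered.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel.
# The starting point and direction are an arbitrary choice for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def local_differences(image, r, c):
    """Differences between each neighbour and the centre pixel at (r, c)."""
    center = image[r][c]
    return [image[r + dr][c + dc] - center for dr, dc in NEIGHBOURS]

def sign_code(diffs):
    """Sign component (equivalent to the basic LBP code):
    bit p is set when neighbour p is at least as bright as the centre."""
    return sum(1 << p for p, d in enumerate(diffs) if d >= 0)

def magnitude_code(diffs, threshold):
    """Magnitude component: bit p is set when |difference| >= threshold.
    The threshold is a free parameter here (assumed: a mean magnitude)."""
    return sum(1 << p for p, d in enumerate(diffs) if abs(d) >= threshold)

def code_histogram(image, code_fn):
    """256-bin histogram of codes over all interior pixels of the image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[code_fn(local_differences(image, r, c))] += 1
    return hist

# Toy 3x3 patch with centre value 6.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
diffs = local_differences(patch, 1, 1)
threshold = sum(abs(d) for d in diffs) / len(diffs)  # mean magnitude of this patch
s_code = sign_code(diffs)
m_code = magnitude_code(diffs, threshold)
s_hist = code_histogram(patch, sign_code)
```

The sign code alone discards how large each difference is; the magnitude code keeps part of that information, which is why the two are combined in CLBP-style descriptors.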