Showing 1–20 of 20 results for author: Boscaini, D

  1. arXiv:2406.16384

    cs.CV

    High-resolution open-vocabulary object 6D pose estimation

    Authors: Jaime Corsetti, Davide Boscaini, Francesco Giuliari, Changjae Oh, Andrea Cavallaro, Fabio Poiesi

    Abstract: The generalisation to unseen objects in the 6D pose estimation task is very challenging. While Vision-Language Models (VLMs) enable using natural language descriptions to support 6D pose estimation of unseen objects, these solutions underperform compared to model-based methods. In this work we present Horyon, an open-vocabulary VLM-based architecture that addresses relative pose estimation between…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Technical report. Extension of CVPR paper "Open-vocabulary object 6D pose estimation". Project page: https://jcorsetti.github.io/oryon

  2. arXiv:2405.07550

    cs.CV

    Wild Berry image dataset collected in Finnish forests and peatlands using drones

    Authors: Luigi Riz, Sergio Povoli, Andrea Caraffa, Davide Boscaini, Mohamed Lamine Mekhalfi, Paul Chippendale, Marjut Turtiainen, Birgitta Partanen, Laura Smith Ballester, Francisco Blanes Noguera, Alessio Franchi, Elisa Castelli, Giacomo Piccinini, Luca Marchesotti, Micael Santos Couceiro, Fabio Poiesi

    Abstract: Berry picking has long-standing traditions in Finland, yet it is challenging and can potentially be dangerous. The integration of drones equipped with advanced imaging techniques represents a transformative leap forward, optimising harvests and promising sustainable practices. We propose WildBe, the first image dataset of wild berries captured in peatlands and under the canopy of Finnish forests u…

    Submitted 15 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  3. arXiv:2312.00947

    cs.CV

    FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models

    Authors: Andrea Caraffa, Davide Boscaini, Amir Hamza, Fabio Poiesi

    Abstract: Estimating the 6D pose of objects unseen during training is highly desirable yet challenging. Zero-shot object 6D pose estimation methods address this challenge by leveraging additional task-specific supervision provided by large-scale, photo-realistic synthetic datasets. However, their performance heavily depends on the quality and diversity of rendered data and they require extensive training. I…

    Submitted 3 April, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

  4. arXiv:2312.00690

    cs.CV

    Open-vocabulary object 6D pose estimation

    Authors: Jaime Corsetti, Davide Boscaini, Changjae Oh, Andrea Cavallaro, Fabio Poiesi

    Abstract: We introduce the new setting of open-vocabulary object 6D pose estimation, in which a textual prompt is used to specify the object of interest. In contrast to existing approaches, in our setting (i) the object of interest is specified solely through the textual prompt, (ii) no object model (e.g., CAD or video sequence) is required at inference, and (iii) the object is imaged from two RGBD viewpoin…

    Submitted 25 June, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Camera ready version (CVPR 2024, poster highlight). New Oryon version: arXiv:2406.16384

  5. arXiv:2308.15353

    cs.CV

    Detect, Augment, Compose, and Adapt: Four Steps for Unsupervised Domain Adaptation in Object Detection

    Authors: Mohamed L. Mekhalfi, Davide Boscaini, Fabio Poiesi

    Abstract: Unsupervised domain adaptation (UDA) plays a crucial role in object detection when adapting a source-trained detector to a target domain without annotated data. In this paper, we propose a novel and effective four-step UDA approach that leverages self-supervision and trains source and target data concurrently. We harness self-supervised learning to mitigate the lack of ground truth in the target d…

    Submitted 29 August, 2023; originally announced August 2023.

  6. arXiv:2307.15692

    cs.CV

    PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding

    Authors: Davide Boscaini, Fabio Poiesi

    Abstract: The recent trend in deep learning methods for 3D point cloud understanding is to propose increasingly sophisticated architectures either to better capture 3D geometries or by introducing possibly undesired inductive biases. Moreover, prior works introducing novel architectures compared their performance on the same domain, devoting less attention to their generalization to other domains. We argue…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: Published in the Image and Vision Computing journal

  7. arXiv:2307.15514

    cs.CV cs.AI

    Revisiting Fully Convolutional Geometric Features for Object 6D Pose Estimation

    Authors: Jaime Corsetti, Davide Boscaini, Fabio Poiesi

    Abstract: Recent works on 6D object pose estimation focus on learning keypoint correspondences between images and object models, and then determine the object pose through RANSAC-based algorithms or by directly regressing the pose with end-to-end optimisations. We argue that learning point-level discriminative features is overlooked in the literature. To this end, we revisit Fully Convolutional Geometric Fe…

    Submitted 3 October, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Camera ready version, 18 pages and 13 figures. Published at the 8th International Workshop on Recovering 6D Object Pose

  8. arXiv:2304.05417

    cs.CV

    The MONET dataset: Multimodal drone thermal dataset recorded in rural scenarios

    Authors: Luigi Riz, Andrea Caraffa, Matteo Bortolon, Mohamed Lamine Mekhalfi, Davide Boscaini, André Moura, José Antunes, André Dias, Hugo Silva, Andreas Leonidou, Christos Constantinides, Christos Keleshis, Dante Abate, Fabio Poiesi

    Abstract: We present MONET, a new multimodal dataset captured using a thermal camera mounted on a drone that flew over rural areas, and recorded human and vehicle activities. We captured MONET to study the problem of object localisation and behaviour understanding of targets undergoing large-scale variations and being recorded from different and moving viewpoints. Target activities occur in two different la…

    Submitted 19 July, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Published in Computer Vision and Pattern Recognition (CVPR) Workshops 2023 - 6th Multimodal Learning and Applications Workshop

  9. arXiv:2212.03300

    cs.CV q-bio.NC

    Supervised Tractogram Filtering using Geometric Deep Learning

    Authors: Pietro Astolfi, Ruben Verhagen, Laurent Petit, Emanuele Olivetti, Silvio Sarubbo, Jonathan Masci, Davide Boscaini, Paolo Avesani

    Abstract: A tractogram is a virtual representation of the brain white matter. It is composed of millions of virtual fibers, encoded as 3D polylines, which approximate the white matter axonal pathways. To date, tractograms are the most accurate white matter representation and thus are used for tasks like presurgical planning and investigations of neuroplasticity, brain disorders, or brain networks. However,…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: Pre-print. Under review at journal

  10. Learning general and distinctive 3D local deep descriptors for point cloud registration

    Authors: Fabio Poiesi, Davide Boscaini

    Abstract: An effective 3D descriptor should be invariant to different geometric transformations, such as scale and rotation, robust to occlusions and clutter, and capable of generalising to different application domains. We present a simple yet effective method to learn general and distinctive 3D local descriptors that can be used to register point clouds that are captured in different domains. Point cloud…

    Submitted 12 May, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted in IEEE Transactions on Pattern Analysis and Machine Intelligence

  11. arXiv:2009.00258

    cs.CV

    Distinctive 3D local deep descriptors

    Authors: Fabio Poiesi, Davide Boscaini

    Abstract: We present a simple yet effective method for learning distinctive 3D local deep descriptors (DIPs) that can be used to register point clouds without requiring an initial alignment. Point cloud patches are extracted, canonicalised with respect to their estimated local reference frame and encoded into rotation-invariant compact descriptors by a PointNet-based deep neural network. DIPs can effect…

    Submitted 28 December, 2020; v1 submitted 1 September, 2020; originally announced September 2020.

    Comments: IEEE International Conference on Pattern Recognition 2020

  12. arXiv:2008.01589

    cs.CV

    Shape Consistent 2D Keypoint Estimation under Domain Shift

    Authors: Levi O. Vasconcelos, Massimiliano Mancini, Davide Boscaini, Samuel Rota Bulo, Barbara Caputo, Elisa Ricci

    Abstract: Recent unsupervised domain adaptation methods based on deep architectures have shown remarkable performance not only in traditional classification tasks but also in more complex problems involving structured predictions (e.g. semantic segmentation, depth estimation). Following this trend, in this paper we present a novel deep adaptation framework for estimating keypoints under domain shift, i.e.…

    Submitted 4 August, 2020; originally announced August 2020.

  13. arXiv:2007.02808

    cs.CV

    Novel-View Human Action Synthesis

    Authors: Mohamed Ilyes Lakhal, Davide Boscaini, Fabio Poiesi, Oswald Lanz, Andrea Cavallaro

    Abstract: Novel-View Human Action Synthesis aims to synthesize the movement of a body from a virtual viewpoint, given a video from a real viewpoint. We present a novel 3D reasoning to synthesize the target viewpoint. We first estimate the 3D mesh of the target body and transfer the rough textures from the 2D images to the mesh. As this transfer may generate sparse textures on the mesh due to frame resolutio…

    Submitted 8 October, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: Asian Conference on Computer Vision (ACCV) 2020

  14. arXiv:2004.07392

    cs.CV

    Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges

    Authors: Antonio Alliegro, Davide Boscaini, Tatiana Tommasi

    Abstract: Point cloud processing and 3D shape understanding are very challenging tasks for which deep learning techniques have demonstrated great potential. Still, further progress is essential to allow artificial intelligent agents to interact with the real world, where the amount of annotated data may be limited and integrating new sources of knowledge becomes crucial to support autonomous learning. He…

    Submitted 15 April, 2020; originally announced April 2020.

  15. arXiv:2003.11013

    q-bio.NC cs.CV eess.IV

    Tractogram filtering of anatomically non-plausible fibers with geometric deep learning

    Authors: Pietro Astolfi, Ruben Verhagen, Laurent Petit, Emanuele Olivetti, Jonathan Masci, Davide Boscaini, Paolo Avesani

    Abstract: Tractograms are virtual representations of the white matter fibers of the brain. They are of primary interest for tasks like presurgical planning, and investigation of neuroplasticity or brain disorders. Each tractogram is composed of millions of fibers encoded as 3D polylines. Unfortunately, a large portion of those fibers are not anatomically plausible and can be considered artifacts of the trac…

    Submitted 9 July, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

    Comments: Accepted at MICCAI2020

  16. 3D Shape Segmentation with Geometric Deep Learning

    Authors: Davide Boscaini, Fabio Poiesi

    Abstract: The semantic segmentation of 3D shapes with a high-density of vertices could be impractical due to large memory requirements. To make this problem computationally tractable, we propose a neural-network based approach that produces 3D augmented views of the 3D shape to solve the whole segmentation as sub-segmentation problems. 3D augmented views are obtained by projecting vertices and normals of a…

    Submitted 2 February, 2020; originally announced February 2020.

  17. arXiv:1611.08402

    cs.CV

    Geometric deep learning on graphs and manifolds using mixture model CNNs

    Authors: Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein

    Abstract: Deep learning has achieved a remarkable performance breakthrough in several fields, most notably in speech recognition, natural language processing, and computer vision. In particular, convolutional neural network (CNN) architectures currently produce state-of-the-art performance on a variety of image analysis tasks such as object detection and recognition. Most of deep learning research has so fa…

    Submitted 6 December, 2016; v1 submitted 25 November, 2016; originally announced November 2016.

  18. arXiv:1605.06437

    cs.CV

    Learning shape correspondence with anisotropic convolutional neural networks

    Authors: Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Michael M. Bronstein

    Abstract: Establishing correspondence between shapes is a fundamental problem in geometry processing, arising in a wide variety of applications. The problem is especially difficult in the setting of non-isometric deformations, as well as in the presence of topological noise and missing parts, mainly due to the limited capability to model such deformations axiomatically. Several recent works showed that inva…

    Submitted 20 May, 2016; originally announced May 2016.

  19. arXiv:1501.06297

    cs.CV

    Geodesic convolutional neural networks on Riemannian manifolds

    Authors: Jonathan Masci, Davide Boscaini, Michael M. Bronstein, Pierre Vandergheynst

    Abstract: Feature descriptors play a crucial role in a wide range of geometry analysis and processing applications, including shape correspondence, retrieval, and segmentation. In this paper, we introduce Geodesic Convolutional Neural Networks (GCNN), a generalization of the convolutional networks (CNN) paradigm to non-Euclidean manifolds. Our construction is based on a local geodesic system of polar coordi…

    Submitted 8 June, 2018; v1 submitted 26 January, 2015; originally announced January 2015.

  20. arXiv:1406.1925

    cs.CV

    Shape-from-intrinsic operator

    Authors: Davide Boscaini, Davide Eynard, Michael M. Bronstein

    Abstract: Shape-from-X is an important class of problems in the fields of geometry processing, computer graphics, and vision, attempting to recover the structure of a shape from some observations. In this paper, we formulate the problem of shape-from-operator (SfO), recovering an embedding of a mesh from intrinsic differential operators defined on the mesh. Particularly interesting instances of our SfO prob…

    Submitted 7 June, 2014; originally announced June 2014.