Showing 1–15 of 15 results for author: Carrara, F

  1. arXiv:2404.03539  [pdf, other]

    cs.CV

    Is CLIP the main roadblock for fine-grained open-world perception?

    Authors: Lorenzo Bianchi, Fabio Carrara, Nicola Messina, Fabrizio Falchi

    Abstract: Modern applications increasingly demand flexible computer vision models that adapt to novel concepts not encountered during training. This necessity is pivotal in emerging domains like extended reality, robotics, and autonomous driving, which require the ability to respond to open-world stimuli. A key ingredient is the ability to identify objects based on free-form textual queries defined at infer…

    Submitted 4 April, 2024; originally announced April 2024.

  2. arXiv:2311.17518  [pdf, other]

    cs.CV cs.AI cs.LG

    The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding

    Authors: Lorenzo Bianchi, Fabio Carrara, Nicola Messina, Claudio Gennaro, Fabrizio Falchi

    Abstract: Recent advancements in large vision-language models enabled visual object detection in open-vocabulary scenarios, where object classes are defined in free-text formats during inference. In this paper, we aim to probe the state-of-the-art methods for open-vocabulary object detection to determine to what extent they understand fine-grained properties of objects and their parts. To this end, we intro…

    Submitted 5 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted as Highlight at CVPR2024

  3. arXiv:2304.14942  [pdf, other]

    cs.CV

    The Emotions of the Crowd: Learning Image Sentiment from Tweets via Cross-modal Distillation

    Authors: Alessio Serra, Fabio Carrara, Maurizio Tesconi, Fabrizio Falchi

    Abstract: Trends and opinion mining in social media increasingly focus on novel interactions involving visual media, like images and short videos, in addition to text. In this work, we tackle the problem of visual sentiment analysis of social media images -- specifically, the prediction of image sentiment polarity. While previous work relied on manually labeled training sets, we propose an automated approac…

    Submitted 28 April, 2023; originally announced April 2023.

  4. arXiv:2211.10351  [pdf]

    eess.SP cs.LG

    Deep learning for structural health monitoring: An application to heritage structures

    Authors: Fabio Carrara, Fabrizio Falchi, Maria Girardi, Nicola Messina, Cristina Padovani, Daniele Pellegrini

    Abstract: Thanks to recent advancements in numerical methods, computer power, and monitoring technology, seismic ambient noise provides precious information about the structural behavior of old buildings. The measurement of the vibrations produced by anthropic and environmental sources and their use for dynamic identification and structural health monitoring of buildings initiated an emerging, cross-discipl…

    Submitted 4 November, 2022; originally announced November 2022.

  5. arXiv:2111.14576  [pdf, other]

    cs.CV

    Recurrent Vision Transformer for Solving Visual Reasoning Problems

    Authors: Nicola Messina, Giuseppe Amato, Fabio Carrara, Claudio Gennaro, Fabrizio Falchi

    Abstract: Although convolutional neural networks (CNNs) showed remarkable results in many vision tasks, they are still strained by simple yet challenging visual reasoning problems. Inspired by the recent success of the Transformer network in computer vision, in this paper, we introduce the Recurrent Vision Transformer (RViT) model. Thanks to the impact of recurrent connections and spatial attention in reaso…

    Submitted 29 November, 2021; originally announced November 2021.

  6. arXiv:2106.02842  [pdf, other]

    cs.CV

    Multi-Camera Vehicle Counting Using Edge-AI

    Authors: Luca Ciampi, Claudio Gennaro, Fabio Carrara, Fabrizio Falchi, Claudio Vairo, Giuseppe Amato

    Abstract: This paper presents a novel solution to automatically count vehicles in a parking lot using images captured by smart cameras. Unlike most of the literature on this task, which focuses on the analysis of single images, this paper proposes the use of multiple visual sources to monitor a wider parking area from different perspectives. The proposed multi-camera system is capable of automatically estim…

    Submitted 5 June, 2021; originally announced June 2021.

    Comments: Submitted to Expert Systems With Applications

  7. Solving the Same-Different Task with Convolutional Neural Networks

    Authors: Nicola Messina, Giuseppe Amato, Fabio Carrara, Claudio Gennaro, Fabrizio Falchi

    Abstract: Deep learning demonstrated major abilities in solving many kinds of different real-world problems in computer vision literature. However, they are still strained by simple reasoning tasks that humans consider easy to solve. In this work, we probe current state-of-the-art convolutional neural networks on a difficult set of tasks known as the same-different problems. All the problems require the sam…

    Submitted 22 January, 2021; originally announced January 2021.

    Comments: Preprint of the paper published in Pattern Recognition Letters (Elsevier)

    Journal ref: Pattern Recognition Letters, Volume 143, March 2021, Pages 75-80

  8. arXiv:2011.08102  [pdf, other]

    cs.CV cs.LG stat.ML

    Combining GANs and AutoEncoders for Efficient Anomaly Detection

    Authors: Fabio Carrara, Giuseppe Amato, Luca Brombin, Fabrizio Falchi, Claudio Gennaro

    Abstract: In this work, we propose CBiGAN -- a novel method for anomaly detection in images, where a consistency constraint is introduced as a regularization term in both the encoder and decoder of a BiGAN. Our model exhibits fairly good modeling power and reconstruction consistency capability. We evaluate the proposed method on MVTec AD -- a real-world benchmark for unsupervised anomaly detection on high-r…

    Submitted 26 November, 2020; v1 submitted 16 November, 2020; originally announced November 2020.

    Comments: 8 pages, 5 figures, 3 tables, pre-print, to be published in the proceedings of the 25th International Conference on Pattern Recognition (ICPR2020)

  9. arXiv:2008.02749  [pdf, other]

    cs.CV cs.MM

    The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval

    Authors: Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Franca Debole, Fabrizio Falchi, Claudio Gennaro, Lucia Vadicamo, Claudio Vairo

    Abstract: In this paper, we describe in detail VISIONE, a video search system that allows users to search for videos using textual keywords, occurrence of objects and their spatial relationships, occurrence of colors and their spatial relationships, and image similarity. These modalities can be combined together to express complex queries and satisfy user needs. The peculiarity of our approach is that we e…

    Submitted 18 March, 2021; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: 22 pages, 12 figures

  10. arXiv:2007.06475  [pdf, other]

    cs.CV cs.LG

    Automatic Pass Annotation from Soccer VideoStreams Based on Object Detection and LSTM

    Authors: Danilo Sorano, Fabio Carrara, Paolo Cintia, Fabrizio Falchi, Luca Pappalardo

    Abstract: Soccer analytics is attracting increasing interest in academia and industry, thanks to the availability of data that describe all the spatio-temporal events that occur in each match. These events (e.g., passes, shots, fouls) are collected by human operators manually, constituting a considerable cost for data providers in terms of time and economic resources. In this paper, we describe PassNet, a m…

    Submitted 13 July, 2020; originally announced July 2020.

  11. Detection of Face Recognition Adversarial Attacks

    Authors: Fabio Valerio Massoli, Fabio Carrara, Giuseppe Amato, Fabrizio Falchi

    Abstract: Deep Learning methods have become state-of-the-art for solving tasks such as Face Recognition (FR). Unfortunately, despite their success, it has been pointed out that these learning models are exposed to adversarial inputs - images to which an imperceptible amount of noise for humans is added to maliciously fool a neural network - thus limiting their adoption in real-world applications. While it i…

    Submitted 5 December, 2019; originally announced December 2019.

    MSC Class: I.2.0; I.2.6 ACM Class: I.2.0; I.2.6

    Journal ref: Computer Vision and Image Understanding Volume 202, January 2021, 103103

  12. arXiv:1905.07774  [pdf, other]

    physics.bio-ph cond-mat.stat-mech q-bio.CB

    Bacteria push the limits of chemotactic precision to navigate dynamic chemical gradients

    Authors: Douglas R. Brumley, Francesco Carrara, Andrew M. Hein, Yutaka Yawata, Simon A. Levin, Roman Stocker

    Abstract: Ephemeral aggregations of bacteria are ubiquitous in the environment, where they serve as hotbeds of metabolic activity, nutrient cycling, and horizontal gene transfer. In many cases, these regions of high bacterial concentration are thought to form when motile cells use chemotaxis to navigate to chemical hotspots. However, what governs the dynamics of bacterial aggregations is unclear. Here, we u…

    Submitted 19 May, 2019; originally announced May 2019.

    Comments: 6 pages, 5 figures. PNAS first published May 16, 2019 https://doi.org/10.1073/pnas.1816621116

  13. arXiv:1704.06178  [pdf, other]

    cs.CV

    Exploring epoch-dependent stochastic residual networks

    Authors: Fabio Carrara, Andrea Esuli, Fabrizio Falchi, Alejandro Moreo Fernández

    Abstract: The recently proposed stochastic residual networks selectively activate or bypass the layers during training, based on independent stochastic choices, each of which following a probability distribution that is fixed in advance. In this paper we present a first exploration on the use of an epoch-dependent distribution, starting with a higher probability of bypassing deeper layers and then activatin…

    Submitted 20 April, 2017; originally announced April 2017.

    Comments: Preliminary report

  14. arXiv:1606.07287  [pdf, other]

    cs.IR cs.CL cs.CV cs.NE

    Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions

    Authors: Fabio Carrara, Andrea Esuli, Tiziano Fagni, Fabrizio Falchi, Alejandro Moreo Fernández

    Abstract: In this paper we tackle the problem of image search when the query is a short textual description of the image the user is looking for. We choose to implement the actual search process as a similarity search in a visual feature space, by learning to translate a textual query into a visual representation. Searching in the visual feature space has the advantage that any update to the translation mod…

    Submitted 23 June, 2016; originally announced June 2016.

    Comments: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval, July 21, 2016, Pisa, Italy

  15. arXiv:1512.04217  [pdf, other]

    physics.bio-ph cond-mat.soft q-bio.CB

    Physical Limits on Bacterial Navigation in Dynamic Environments

    Authors: Andrew M. Hein, Douglas R. Brumley, Francesco Carrara, Roman Stocker, Simon A. Levin

    Abstract: Many chemotactic bacteria inhabit environments in which chemicals appear as localized pulses and evolve by processes such as diffusion and mixing. We show that, in such environments, physical limits on the accuracy of temporal gradient sensing govern when and where bacteria can accurately measure the cues they use to navigate. Chemical pulses are surrounded by a predictable dynamic region, outside…

    Submitted 14 December, 2015; originally announced December 2015.

    Comments: 19 pages, 5 figures (including Supplementary Text). Journal of The Royal Society Interface, in press