
Showing 1–18 of 18 results for author: Cascianelli, S

  1. arXiv:2404.17243

    cs.CV

    Binarizing Documents by Leveraging both Space and Frequency

    Authors: Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara

    Abstract: Document Image Binarization is a well-known problem in Document Analysis and Computer Vision, although it is far from being solved. One of the main challenges of this task is that documents generally exhibit degradations and acquisition artifacts that can greatly vary throughout the page. Nonetheless, even when dealing with a local patch of the document, taking into account the overall appearance…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted at ICDAR2024

  2. arXiv:2402.10798

    cs.CV cs.AI

    VATr++: Choose Your Words Wisely for Handwritten Text Generation

    Authors: Bram Vanherle, Vittorio Pippi, Silvia Cascianelli, Nick Michiels, Frank Van Reeth, Rita Cucchiara

    Abstract: Styled Handwritten Text Generation (HTG) has received significant attention in recent years, propelled by the success of learning-based solutions employing GANs, Transformers, and, preliminarily, Diffusion Models. Despite this surge in interest, there remains a critical yet understudied aspect - the impact of the input, both visual and textual, on the HTG model training and its subsequent influenc…

    Submitted 16 February, 2024; originally announced February 2024.

  3. arXiv:2310.20316

    cs.CV cs.DL

    HWD: A Novel Evaluation Score for Styled Handwritten Text Generation

    Authors: Vittorio Pippi, Fabio Quattrini, Silvia Cascianelli, Rita Cucchiara

    Abstract: Styled Handwritten Text Generation (Styled HTG) is an important task in document analysis, aiming to generate text images with the handwriting of given reference images. In recent years, there has been significant progress in the development of deep learning models for tackling this task. Being able to measure the performance of HTG models via a meaningful and representative criterion is key for f…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted at BMVC2023

  4. arXiv:2308.05070

    cs.CV cs.DL

    Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri

    Authors: Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara

    Abstract: Recent advancements in Digital Document Restoration (DDR) have led to significant breakthroughs in analyzing highly damaged written artifacts. Among those, there has been an increasing interest in applying Artificial Intelligence techniques for virtually unwrapping and automatically detecting ink on the Herculaneum papyri collection. This collection consists of carbonized scrolls and fragments of…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted at the 4th ICCV Workshop on e-Heritage (in conjunction with ICCV 2023)

  5. arXiv:2305.02593

    cs.CV cs.DL

    How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning

    Authors: Vittorio Pippi, Silvia Cascianelli, Christopher Kermorvant, Rita Cucchiara

    Abstract: Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on both modern and historical manuscripts in large benchmark datasets. Nonetheless, those models struggle to obtain the same performance when applied to manuscripts with peculiar characteristics, such as language, paper support, ink, and author handwriting. This issue is ver…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted at ICDAR2023

  6. arXiv:2304.01842

    cs.CV

    Evaluating Synthetic Pre-Training for Handwriting Processing Tasks

    Authors: Vittorio Pippi, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

    Abstract: In this work, we explore massive pre-training on synthetic word images for enhancing the performance on four benchmark downstream handwriting analysis tasks. To this end, we build a large synthetic dataset of word images rendered in several handwriting fonts, which offers a complete supervision signal. We use it to train a simple convolutional neural network (ConvNet) with a fully supervised objec…

    Submitted 4 April, 2023; originally announced April 2023.

  7. arXiv:2303.15269

    cs.CV

    Handwritten Text Generation from Visual Archetypes

    Authors: Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara

    Abstract: Generating synthetic images of handwritten text in a writer-specific style is a challenging task, especially in the case of unseen styles and new words, and even more when these latter contain characters that are rarely encountered during training. While emulating a writer's style has been recently addressed by generative models, the generalization towards rare characters has been disregarded. In…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted at CVPR2023

  8. arXiv:2301.07150

    cs.RO cs.AI cs.CL cs.CV

    Embodied Agents for Efficient Exploration and Smart Scene Description

    Authors: Roberto Bigazzi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

    Abstract: The development of embodied agents that can communicate with humans in natural language has gained increasing interest over the last years, as it facilitates the diffusion of robotic platforms in human-populated environments. As a step towards this objective, in this work, we tackle a setting for visual navigation in which an autonomous agent needs to explore and map an unseen indoor environment w…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA 2023)

  9. arXiv:2208.08109

    cs.CV

    Boosting Modern and Historical Handwritten Text Recognition with Deformable Convolutions

    Authors: Silvia Cascianelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

    Abstract: Handwritten Text Recognition (HTR) in free-layout pages is a challenging image understanding task that can provide a relevant boost to the digitization of handwritten documents and reuse of their content. The task becomes even more challenging when dealing with historical documents due to the variability of the writing style and degradation of the page quality. State-of-the-art HTR approaches typi…

    Submitted 17 August, 2022; originally announced August 2022.

    Journal ref: International Journal on Document Analysis and Recognition (IJDAR), 2022, 1-11

  10. arXiv:2208.07682

    cs.CV cs.DL

    The LAM Dataset: A Novel Benchmark for Line-Level Handwritten Text Recognition

    Authors: Silvia Cascianelli, Vittorio Pippi, Martin Maarand, Marcella Cornia, Lorenzo Baraldi, Christopher Kermorvant, Rita Cucchiara

    Abstract: Handwritten Text Recognition (HTR) is an open problem at the intersection of Computer Vision and Natural Language Processing. The main challenges, when dealing with historical manuscripts, are due to the preservation of the paper support, the variability of the handwriting -- even of the same author over a wide time-span -- and the scarcity of data from ancient, poorly represented languages. With…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: Accepted at ICPR 2022

  11. Embodied Navigation at the Art Gallery

    Authors: Roberto Bigazzi, Federico Landi, Silvia Cascianelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

    Abstract: Embodied agents, trained to explore and navigate indoor photorealistic environments, have achieved impressive results on standard datasets and benchmarks. So far, experiments and evaluations have involved domestic and working scenes like offices, flats, and houses. In this paper, we build and release a new 3D space with unique characteristics: the one of a complete art museum. We name this environ…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted by 21st International Conference on Image Analysis and Processing (ICIAP 2021)

  12. Spot the Difference: A Novel Task for Embodied Agents in Changing Environments

    Authors: Federico Landi, Roberto Bigazzi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

    Abstract: Embodied AI is a recent research area that aims at creating intelligent agents that can move and operate inside an environment. Existing approaches in this field demand the agents to act in completely new and unexplored scenes. However, this setting is far from realistic use cases that instead require executing multiple tasks in the same environment. Even if the environment changes over time, the…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: Accepted by 26th International Conference on Pattern Recognition (ICPR 2022)

  13. arXiv:2202.10492

    cs.CV cs.AI cs.CL cs.MM

    CaMEL: Mean Teacher Learning for Image Captioning

    Authors: Manuele Barraco, Matteo Stefanini, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

    Abstract: Describing images in natural language is a fundamental step towards the automatic modeling of connections between the visual and textual modalities. In this paper we present CaMEL, a novel Transformer-based architecture for image captioning. Our proposed approach leverages the interaction of two interconnected language models that learn from each other during the training phase. The interplay betw…

    Submitted 21 February, 2022; originally announced February 2022.

  14. arXiv:2109.08521

    cs.RO cs.AI cs.CV

    Focus on Impact: Indoor Exploration with Intrinsic Motivation

    Authors: Roberto Bigazzi, Federico Landi, Silvia Cascianelli, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara

    Abstract: Exploration of indoor environments has recently experienced a significant interest, also thanks to the introduction of deep neural agents built in a hierarchical fashion and trained with Deep Reinforcement Learning (DRL) on simulated environments. Current state-of-the-art methods employ a dense extrinsic reward that requires the complete a priori knowledge of the layout of the training environment…

    Submitted 4 February, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: Published in IEEE Robotics and Automation Letters. To appear in ICRA 2022

    Journal ref: IEEE Robotics and Automation Letters (Volume: 7, Issue: 2, April 2022)

  15. arXiv:2107.06912

    cs.CV cs.CL

    From Show to Tell: A Survey on Deep Learning-based Image Captioning

    Authors: Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Silvia Cascianelli, Giuseppe Fiameni, Rita Cucchiara

    Abstract: Connecting Vision and Language plays an essential role in Generative Intelligence. For this reason, large research efforts have been devoted to image captioning, i.e. describing images with syntactically and semantically meaningful sentences. Starting from 2015 the task has generally been addressed with pipelines composed of a visual encoder and a language model for text generation. During these y…

    Submitted 30 November, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

  16. Out of the Box: Embodied Navigation in the Real World

    Authors: Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

    Abstract: The research field of Embodied AI has witnessed substantial progress in visual navigation and exploration thanks to powerful simulating platforms and the availability of 3D data of indoor and photorealistic environments. These two factors have opened the doors to a new generation of intelligent agents capable of achieving nearly perfect PointGoal Navigation. However, such architectures are commonl…

    Submitted 12 May, 2021; originally announced May 2021.

  17. arXiv:2102.05067

    cs.CV cs.CL cs.MM

    The Role of the Input in Natural Language Video Description

    Authors: Silvia Cascianelli, Gabriele Costante, Alessandro Devo, Thomas A. Ciarfuglia, Paolo Valigi, Mario L. Fravolini

    Abstract: Natural Language Video Description (NLVD) has recently received strong interest in the Computer Vision, Natural Language Processing (NLP), Multimedia, and Autonomous Robotics communities. The State-of-the-Art (SotA) approaches obtained remarkable results when tested on the benchmark datasets. However, those approaches poorly generalize to new datasets. In addition, none of the existing works focus…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: In IEEE Transactions on Multimedia

    Journal ref: IEEE Transactions on Multimedia, 22(1), 271-283 (2019)

  18. arXiv:2007.07268

    cs.CV cs.AI cs.CL cs.RO

    Explore and Explain: Self-supervised Navigation and Recounting

    Authors: Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

    Abstract: Embodied AI has been recently gaining attention as it aims to foster the development of autonomous and intelligent agents. In this paper, we devise a novel embodied setting in which an agent needs to explore a previously unknown environment while recounting what it sees during the path. In this context, the agent needs to navigate the environment driven by an exploration goal, select proper moment…

    Submitted 14 July, 2020; originally announced July 2020.

    Comments: ICPR 2020