Search | arXiv e-print repository

Sinestesia as a model for HCI: a Systematic Review

Authors: Simona Corciulo, Mario Alessandro Bochicchio

Abstract: Synesthesia, conceived as a neuropsychological condition, may prove valuable in studying the interaction between humans and machines by analyzing the co-occurrence of sensory or cognitive responses triggered by a stimulus. In our approach, synesthesia is elevated beyond a mere perceptual-cognitive anomaly, offering insights into the reciprocal interaction between humans and the digital system, steering novel experimental design and enriching the interpretation of results. This review broadens the traditional scope, conventionally rooted in neuroscience and psychology, by considering how computer science can approach this condition. The interdisciplinary examination revolves around two primary viewpoints: one associating this condition with specific cognitive, perceptual, and behavioral anomalies, and the other acknowledging it as a prevalent human experience. Synesthesia, in this review, emerges as a significant model for Human-Computer Interaction (HCI). The exploration of this specific condition aims to decipher how atypical pathways of perception and cognition can be encoded, empowering machines to actively engage in processing information from both the body and the environment.
The authors attempt to amalgamate findings and insights from various disciplines, fostering collaboration between computer science, neuroscience, psychology, and philosophy. The overarching objective is to construct a comprehensive framework that elucidates how synesthesia and anomalies in information processing can be harnessed within HCI, with a particular emphasis on contributing to digital technologies for medical research and enhancing patient care and comfort. In this sense, the review also endeavors to fill the gap between theoretical understanding and practical application.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 12 pages

arXiv:2403.04562 [pdf, other]

Out of the Room: Generalizing Event-Based Dynamic Motion Segmentation for Complex Scenes

Authors: Stamatios Georgoulis, Weining Ren, Alfredo Bochicchio, Daniel Eckert, Yuanyou Li, Abel Gawel

Abstract: Rapid and reliable identification of dynamic scene parts, also known as motion segmentation, is a key challenge for mobile sensors. Contemporary RGB camera-based methods rely on modeling camera and scene properties; however, they are often under-constrained and fall short in unknown categories. Event cameras have the potential to overcome these limitations, but corresponding methods have only been demonstrated in smaller-scale indoor environments with simplified dynamic objects. This work presents an event-based method for class-agnostic motion segmentation that can also be deployed successfully across complex large-scale outdoor environments. To this end, we introduce a novel divide-and-conquer pipeline that combines: (a) ego-motion compensated events, computed via a scene understanding module that predicts monocular depth and camera pose as auxiliary tasks, and (b) optical flow from a dedicated optical flow module. These intermediate representations are then fed into a segmentation module that predicts motion segmentation masks. A novel transformer-based temporal attention module in the segmentation module builds correlations across adjacent 'frames' to get temporally consistent segmentation masks.
Our method sets the new state-of-the-art on the classic EV-IMO benchmark (indoors), where we achieve improvements of 2.19 moving object IoU (2.22 mIoU) and 4.52 point IoU, respectively, as well as on a newly-generated motion segmentation and tracking benchmark (outdoors) based on the DSEC event dataset, termed DSEC-MOTS, where we show an improvement of 12.91 moving object IoU.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 3DV 2024, the first two authors contributed equally

arXiv:2310.13766 [pdf, other]

U-BEV: Height-aware Bird's-Eye-View Segmentation and Neural Map-based Relocalization

Authors: Andrea Boscolo Camiletto, Alfredo Bochicchio, Alexander Liniger, Dengxin Dai, Abel Gawel

Abstract: Efficient relocalization is essential for intelligent vehicles when GPS reception is insufficient or sensor-based localization fails. Recent advances in Bird's-Eye-View (BEV) segmentation allow for accurate estimation of local scene appearance and, in turn, can benefit the relocalization of the vehicle. However, one downside of BEV methods is the heavy computation required to leverage the geometric constraints. This paper presents U-BEV, a U-Net inspired architecture that extends the current state-of-the-art by allowing the BEV to reason about the scene on multiple height layers before flattening the BEV features. We show that this extension boosts the performance of the U-BEV by up to 4.11 IoU. Additionally, we combine the encoded neural BEV with a differentiable template matcher to perform relocalization on neural SD-map data. The model is fully end-to-end trainable and outperforms transformer-based BEV methods of similar computational complexity by 1.7 to 2.8 mIoU and BEV-based relocalization by over 26% Recall Accuracy on the nuScenes dataset.

Submitted 20 October, 2023; originally announced October 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2208.11398 [pdf, other]

Event-based Image Deblurring with Dynamic Motion Awareness

Authors: Patricia Vitoria, Stamatios Georgoulis, Stepan Tulyakov, Alfredo Bochicchio, Julius Erbach, Yuanyou Li

Abstract: Non-uniform image deblurring is a challenging task due to the lack of temporal and textural information in the blurry image itself. Complementary information from auxiliary sensors such as event sensors is being explored to address these limitations. The latter can record changes in logarithmic intensity asynchronously, called events, with high temporal resolution and high dynamic range. Current event-based deblurring methods combine the blurry image with events to jointly estimate per-pixel motion and the deblur operator. In this paper, we argue that a divide-and-conquer approach is more suitable for this task. To this end, we propose to use modulated deformable convolutions, whose kernel offsets and modulation masks are dynamically estimated from events to encode the motion in the scene, while the deblur operator is learned from the combination of blurry image and corresponding events. Furthermore, we employ a coarse-to-fine multi-scale reconstruction approach to cope with the inherent sparsity of events in low contrast regions. Importantly, we introduce the first dataset containing pairs of real RGB blur images and related events during the exposure time. Our results show better overall robustness when using events, with improvements in PSNR by up to 1.57 dB on synthetic data and 1.08 dB on real event data.

Submitted 24 August, 2022; originally announced August 2022.

Journal ref: ECCVW 2022

arXiv:2203.17191 [pdf, other]

Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion

Authors: Stepan Tulyakov, Alfredo Bochicchio, Daniel Gehrig, Stamatios Georgoulis, Yuanyou Li, Davide Scaramuzza

Abstract: Recently, video frame interpolation using a combination of frame- and event-based cameras has surpassed traditional image-based methods both in terms of performance and memory efficiency. However, current methods still suffer from (i) brittle image-level fusion of complementary interpolation results, which fails in the presence of artifacts in the fused image, (ii) potentially temporally inconsistent and inefficient motion estimation procedures, which run for every inserted frame, and (iii) low contrast regions that do not trigger events, and thus cause events-only motion estimation to generate artifacts. Moreover, previous methods were only tested on datasets consisting of planar and faraway scenes, which do not capture the full complexity of the real world. In this work, we address the above problems by introducing multi-scale feature-level fusion and computing one-shot non-linear inter-frame motion from events and images, which can be efficiently sampled for image warping. We also collect the first large-scale events and frames dataset consisting of more than 100 challenging scenes with depth variations, captured with a new experimental setup based on a beamsplitter. We show that our method improves the reconstruction quality by up to 0.2 dB in terms of PSNR and up to 15% in LPIPS score.

Submitted 25 April, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 2022

arXiv:2203.06622 [pdf, other]

Multi-Bracket High Dynamic Range Imaging with Event Cameras

Authors: Nico Messikommer, Stamatios Georgoulis, Daniel Gehrig, Stepan Tulyakov, Julius Erbach, Alfredo Bochicchio, Yuanyou Li, Davide Scaramuzza

Abstract: Modern high dynamic range (HDR) imaging pipelines align and fuse multiple low dynamic range (LDR) images captured at different exposure times. While these methods work well in static scenes, dynamic scenes remain a challenge since the LDR images still suffer from saturation and noise. In such scenarios, event cameras would be a valid complement, thanks to their higher temporal resolution and dynamic range. In this paper, we propose the first multi-bracket HDR pipeline combining a standard camera with an event camera. Our results show better overall robustness when using events, with improvements in PSNR by up to 5 dB on synthetic data and up to 0.7 dB on real-world data. We also introduce a new dataset containing bracketed LDR images with aligned events and HDR ground truth.

Submitted 28 April, 2022; v1 submitted 13 March, 2022; originally announced March 2022.

Journal ref: IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), New Orleans, 2022

Showing 1–6 of 6 results for author: Bochicchio, A