-
For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives
Authors:
Lia Morra,
Antonio Santangelo,
Pietro Basci,
Luca Piano,
Fabio Garcea,
Fabrizio Lamberti,
Massimo Leone
Abstract:
Social networks are creating a digital world in which the cognitive, emotional, and pragmatic value of the imagery of human faces and bodies is arguably changing. However, researchers in the digital humanities are often ill-equipped to study these phenomena at scale. This work presents FRESCO (Face Representation in E-Societies through Computational Observation), a framework designed to explore the socio-cultural implications of images on social media platforms at scale. FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques, aligning with the principles of visual semiotics. The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer. These levels are analyzed to discern deeper narrative layers within the imagery. Experimental validation confirms the reliability and utility of FRESCO, and we assess its consistency and precision across two public datasets. Subsequently, we introduce the FRESCO score, a metric derived from the framework's output that serves as a reliable measure of similarity in image content.
Submitted 3 July, 2024;
originally announced July 2024.
-
Toward a Realistic Benchmark for Out-of-Distribution Detection
Authors:
Pietro Recalcati,
Fabio Garcea,
Luca Piano,
Fabrizio Lamberti,
Lia Morra
Abstract:
Deep neural networks are increasingly used in a wide range of technologies and services, but remain highly susceptible to out-of-distribution (OOD) samples, that is, samples drawn from a different distribution than the original training set. A common approach to address this issue is to endow deep neural networks with the ability to detect OOD samples. Several benchmarks have been proposed to design and validate OOD detection techniques. However, many of them are based on far-OOD samples drawn from very different distributions, and thus lack the complexity needed to capture the nuances of real-world scenarios. In this work, we introduce a comprehensive benchmark for OOD detection, based on ImageNet and Places365, that assigns individual classes as in-distribution or out-of-distribution depending on the semantic similarity with the training set. Several techniques can be used to determine which classes should be considered in-distribution, yielding benchmarks with varying properties. Experimental results on different OOD detection techniques show how their measured efficacy depends on the selected benchmark and how confidence-based techniques may outperform classifier-based ones on near-OOD samples.
Submitted 16 April, 2024;
originally announced April 2024.
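To make the notion of a confidence-based detector in the abstract above concrete, here is a minimal sketch of the generic maximum-softmax-probability baseline. It is an illustration only, not the benchmark's own code: the function names, the example logits, and the 0.5 threshold are all assumptions introduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_ood(logits, threshold=0.5):
    """Flag a sample as OOD when the classifier's top probability is low.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    return max(softmax(logits)) < threshold

# Hypothetical logits: one confident prediction, one ambiguous prediction.
confident = [8.0, 0.5, 0.1]
ambiguous = [1.0, 0.9, 1.1]

print(is_ood(confident))  # the top class dominates, so treated as in-distribution
print(is_ood(ambiguous))  # probabilities are nearly uniform, so flagged as OOD
```

Near-OOD samples are harder for a rule of this kind because semantically similar classes can still produce a confident top score.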
-
Latent Diffusion Models for Attribute-Preserving Image Anonymization
Authors:
Luca Piano,
Pietro Basci,
Fabrizio Lamberti,
Lia Morra
Abstract:
Generative techniques for image anonymization have great potential to generate datasets that protect the privacy of those depicted in the images, while achieving high data fidelity and utility. Existing methods have focused extensively on preserving facial attributes, but have failed to embrace a more comprehensive perspective that brings the scene and background into the anonymization process. This paper presents, to the best of our knowledge, the first approach to image anonymization based on Latent Diffusion Models (LDMs). Every element of a scene is maintained to convey the same meaning, yet manipulated in a way that makes re-identification difficult. We propose two LDMs for this purpose: CAMOUFLaGE-Base exploits a combination of pre-trained ControlNets and a new controlling mechanism designed to increase the distance between the real and anonymized images. CAMOUFLaGE-Light is based on the Adapter technique, coupled with an encoding designed to efficiently represent the attributes of different persons in a scene. The former solution achieves superior performance on most metrics and benchmarks, while the latter cuts the inference time in half at the cost of fine-tuning a lightweight module. We show through extensive experimental comparison that the proposed method is competitive with the state-of-the-art concerning identity obfuscation whilst better preserving the original content of the image and tackling unresolved challenges that current solutions fail to address.
Submitted 21 March, 2024;
originally announced March 2024.
-
Fuzzy Logic Visual Network (FLVN): A neuro-symbolic approach for visual features matching
Authors:
Francesco Manigrasso,
Lia Morra,
Fabrizio Lamberti
Abstract:
Neuro-symbolic integration aims at harnessing the power of symbolic knowledge representation combined with the learning capabilities of deep neural networks. In particular, Logic Tensor Networks (LTNs) allow background knowledge to be incorporated in the form of logical axioms by grounding a first-order logic language as differentiable operations between real tensors. Yet, few studies have investigated the potential benefits of this approach to improve zero-shot learning (ZSL) classification. In this study, we present the Fuzzy Logic Visual Network (FLVN) that formulates the task of learning a visual-semantic embedding space within a neuro-symbolic LTN framework. FLVN incorporates prior knowledge in the form of class hierarchies (classes and macro-classes) along with robust high-level inductive biases. The latter make it possible, for instance, to handle exceptions in class-level attributes and to enforce similarity between images of the same class, preventing premature overfitting to seen classes and improving overall performance. FLVN reaches state-of-the-art performance on the Generalized ZSL (GZSL) benchmarks AWA2 and CUB, improving by 1.3% and 3%, respectively. Overall, it achieves performance competitive with recent ZSL methods with less computational overhead. FLVN is available at https://gitlab.com/grains2/flvn.
Submitted 29 July, 2023;
originally announced July 2023.
-
Bent & Broken Bicycles: Leveraging synthetic data for damaged object re-identification
Authors:
Luca Piano,
Filippo Gabriele Pratticò,
Alessandro Sebastian Russo,
Lorenzo Lanari,
Lia Morra,
Fabrizio Lamberti
Abstract:
Instance-level object re-identification is a fundamental computer vision task, with applications from image retrieval to intelligent monitoring and fraud detection. In this work, we propose the novel task of damaged object re-identification, which aims at distinguishing changes in visual appearance due to deformations or missing parts from subtle intra-class variations. To explore this task, we leverage the power of computer-generated imagery to create, in a semi-automatic fashion, high-quality synthetic images of the same bike before and after damage occurs. The resulting dataset, Bent & Broken Bicycles (BBBicycles), contains 39,200 images and 2,800 unique bike instances spanning 20 different bike models. As a baseline for this task, we propose TransReI3D, a multi-task, transformer-based deep network unifying damage detection (framed as a multi-label classification task) with object re-identification. The BBBicycles dataset is available at https://huggingface.co/datasets/GrainsPolito/BBBicycles
Submitted 16 April, 2023;
originally announced April 2023.
-
Digital Twins for Industry 4.0 in the 6G Era
Authors:
Bin Han,
Mohammad Asif Habibi,
Bjoern Richerzhagen,
Kim Schindhelm,
Florian Zeiger,
Fabrizio Lamberti,
Filippo Gabriele Pratticò,
Karthik Upadhya,
Charalampos Korovesis,
Ioannis-Prodromos Belikaidis,
Panagiotis Demestichas,
Siyu Yuan,
Hans D. Schotten
Abstract:
With the Fifth Generation (5G) mobile communication system recently rolled out in many countries, the wireless community is now setting its eyes on the next era of Sixth Generation (6G). Inheriting from 5G its focus on industrial use cases, 6G is envisaged to become the infrastructural backbone of future intelligent industry. In particular, the combination of 6G and the emerging technologies of Digital Twins (DT) will give impetus to the next evolution of Industry 4.0 (I4.0) systems. This article provides a survey of research on 6G-empowered industrial DT systems. With a novel vision of a 6G industrial DT ecosystem, this survey discusses the ambitions and potential applications of industrial DT in the 6G era, identifying the emerging challenges as well as the key enabling technologies. The introduced ecosystem is supposed to bridge the gaps between humans, machines, and the data infrastructure, and thereby enable numerous novel application scenarios.
Submitted 15 October, 2023; v1 submitted 5 October, 2022;
originally announced October 2022.
-
Shape Proportions and Sphericity in n Dimensions
Authors:
William Franz Lamberti
Abstract:
Shape metrics for objects in high dimensions remain sparse. Those that do exist, such as hyper-volume, remain limited to objects that are better understood such as Platonic solids and $n$-Cubes. Further, understanding objects of ill-defined shapes in higher dimensions is ambiguous at best. Past work does not provide a single number to give a qualitative understanding of an object. For example, the eigenvalues from principal component analysis result in $n$ metrics to describe the shape of an object. Therefore, we need a single number which can discriminate objects with different shapes from one another. Previous work has developed shape metrics for specific dimensions such as two or three dimensions. However, there is an opportunity to develop metrics for any desired dimension. To that end, we present two new shape metrics for objects in a given number of dimensions: hyper-Sphericity and hyper-Shape Proportion (SP). We explore the properties of these metrics on a number of different shapes including $n$-balls. We then connect these metrics to applications of analyzing the shape of multidimensional data such as the popular Iris dataset.
Submitted 12 August, 2022;
originally announced August 2022.
-
Classification of White Blood Cell Leukemia with Low Number of Interpretable and Explainable Features
Authors:
William Franz Lamberti
Abstract:
White Blood Cell (WBC) Leukaemia is detected through image-based classification. Convolutional Neural Networks are used to learn the features needed to classify images of cells as malignant or healthy. However, this type of model requires learning a large number of parameters and is difficult to interpret and explain. Explainable AI (XAI) attempts to alleviate this issue by providing insights into how models make decisions. Therefore, we present an XAI model which uses only 24 explainable and interpretable features and is highly competitive with other approaches, outperforming them by about 4.38%. Further, our approach provides insight into which variables are the most important for the classification of the cells. This insight provides evidence that when labs treat the WBCs differently, the importance of various metrics changes substantially. Understanding the important features for classification is vital in medical imaging diagnosis and, by extension, understanding the AI models built in scientific pursuits.
Submitted 27 January, 2022;
originally announced January 2022.
-
Using Shape Metrics to Describe 2D Data Points
Authors:
William Franz Lamberti
Abstract:
Traditional machine learning (ML) algorithms, such as multiple regression, require human analysts to make decisions on how to treat the data. These decisions can make the model building process subjective and difficult to replicate for those who did not build the model. Deep learning approaches benefit by allowing the model to learn what features are important once the human analyst builds the architecture. Thus, a method for automating certain human decisions for traditional ML modeling would help to improve the reproducibility and remove subjective aspects of the model building process. To that end, we propose to use shape metrics to describe 2D data to help make analyses more explainable and interpretable. The proposed approach provides a foundation to help automate various aspects of model building in an interpretable and explainable fashion. This is particularly important in applications in the medical community where the 'right to explainability' is crucial. We provide various simulated data sets ranging from probability distributions, functions, and model quality control checks (such as QQ-Plots and residual analyses from ordinary least squares) to showcase the breadth of this approach.
Submitted 27 January, 2022;
originally announced January 2022.
-
Real-time sensing with multiplexed optomechanical resonators
Authors:
Fabrice Lamberti,
Ujwol Palanchoke,
Thijs Geurts,
Marc Gely,
Sebastien Regord,
Louise Banniard,
Marc Sansa,
Ivan Favero,
Guillaume Jourdan,
Sebastien Hentz
Abstract:
Nanoelectromechanical resonators have been successfully used for a variety of sensing applications. Their extreme resolution comes from their small size at the cost of low capture area, making the "needle in a haystack" issue acute. This leads to poor instrument sensitivity and long analysis time. Moreover, electrical transductions are limited in frequency, which limits the achievable mechanical bandwidth, again limiting throughput. Multiplexing a large number of high-frequency resonators appears as a solution, but this is complex with electrical transductions. We propose here a route to solve these issues, with a multiplexing scheme for very high frequency optomechanical resonators. We demonstrate the simultaneous frequency measurement of three silicon microdisk resonators fabricated through a Very Large Scale Integration process. The readout architecture is simple and does not degrade the sensing resolutions. This paves the way towards the realization of sensors for multi-parametric analysis, extremely low limit of detection and response time.
Submitted 9 July, 2021;
originally announced July 2021.
-
Faster-LTN: a neuro-symbolic, end-to-end object detection architecture
Authors:
Francesco Manigrasso,
Filomeno Davide Miro,
Lia Morra,
Fabrizio Lamberti
Abstract:
The detection of semantic relationships between objects represented in an image is one of the fundamental challenges in image interpretation. Neural-Symbolic techniques, such as Logic Tensor Networks (LTNs), allow the combination of semantic knowledge representation and reasoning with the ability to efficiently learn from examples typical of neural networks. We here propose Faster-LTN, an object detector composed of a convolutional backbone and an LTN. To the best of our knowledge, this is the first attempt to combine both frameworks in an end-to-end training setting. This architecture is trained by optimizing a grounded theory which combines labelled examples with prior knowledge, in the form of logical axioms. Experimental comparisons show competitive performance with respect to the traditional Faster R-CNN architecture.
Submitted 5 July, 2021;
originally announced July 2021.
-
Breast Mass Detection with Faster R-CNN: On the Feasibility of Learning from Noisy Annotations
Authors:
Sina Famouri,
Lia Morra,
Leonardo Mangia,
Fabrizio Lamberti
Abstract:
In this work we study the impact of noise on the training of object detection networks for the medical domain, and how it can be mitigated by improving the training procedure. Annotating large medical datasets for training data-hungry deep learning models is expensive and time consuming. Leveraging information that is already collected in clinical practice, in the form of text reports, bookmarks or lesion measurements would substantially reduce this cost. Obtaining precise lesion bounding boxes through automatic mining procedures, however, is difficult. We provide here a quantitative evaluation of the effect of bounding box coordinate noise on the performance of Faster R-CNN object detection networks for breast mass detection. Varying degrees of noise are simulated by randomly modifying the bounding boxes: in our experiments, bounding boxes could be enlarged up to six times the original size. The noise is injected in the CBIS-DDSM collection, a well curated public mammography dataset for which accurate lesion location is available. We show how, due to an imperfect matching between the ground truth and the network bounding box proposals, the noise is propagated during training and reduces the ability of the network to correctly classify lesions from background. When using the standard Intersection over Union criterion, the area under the FROC curve decreases by up to 9%. A novel matching criterion is proposed to improve tolerance to noise.
Submitted 25 April, 2021;
originally announced April 2021.
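The effect of box enlargement on the standard Intersection over Union (IoU) criterion mentioned in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: "six times the original size" is read here as a sixfold area increase, the enlargement is deterministic and centred rather than random, and the helper names and example box are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def enlarge(box, area_factor):
    """Scale a box about its centre so that its area grows by area_factor."""
    s = area_factor ** 0.5  # each side grows by the square root of the area factor
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) / 2 * s, (box[3] - box[1]) / 2 * s
    return (cx - hw, cy - hh, cx + hw, cy + hh)

# Hypothetical accurate annotation and its enlarged (noisy) counterpart.
truth = (10.0, 10.0, 30.0, 30.0)
noisy = enlarge(truth, 6.0)

# The true box lies entirely inside the noisy one, so IoU is area ratio 1/6.
print(round(iou(truth, noisy), 4))
```

Under these assumptions, a proposal that fits the lesion perfectly would overlap the noisy annotation with an IoU well below the 0.5 threshold commonly used for matching, which is one way to picture the imperfect matching the abstract describes.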
-
Training Medical Communication Skills with Virtual Patients: Literature Review and Directions for Future Research
Authors:
Edoardo Battegazzorre,
Andrea Bottino,
Fabrizio Lamberti
Abstract:
Effective communication is a crucial skill for healthcare providers since it leads to better patient health and satisfaction and helps avoid malpractice claims. In standard medical education, students' communication skills are trained with role-playing and Standardized Patients (SPs), i.e., actors. However, SPs are difficult to standardize, and are very resource consuming. Virtual Patients (VPs) are interactive computer-based systems that represent a valuable alternative to SPs. VPs are capable of portraying patients in realistic clinical scenarios and engaging learners in realistic conversations. Approaching medical communication skill training with VPs has been an active research area in the last ten years. As a result, the number of works in this field has grown significantly. The objective of this work is to survey the recent literature, assessing the state of the art of this technology with a specific focus on the instructional and technical design of VP simulations. After having classified and analysed the VPs selected for our research, we identified several areas that require further investigation, and we drafted practical recommendations for VP developers on design aspects that, based on our findings, are pivotal to create novel and effective VP simulations or improve existing ones.
Submitted 1 April, 2021;
originally announced April 2021.
-
Comparing State-of-the-Art and Emerging Augmented Reality Interfaces for Autonomous Vehicle-to-Pedestrian Communication
Authors:
F. Gabriele Pratticò,
Fabrizio Lamberti,
Alberto Cannavò,
Lia Morra,
Paolo Montuschi
Abstract:
Providing pedestrians and other vulnerable road users with a clear indication about a fully autonomous vehicle's status and intentions is crucial for their coexistence. In the last few years, a variety of external interfaces have been proposed, leveraging different paradigms and technologies including vehicle-mounted devices (like LED panels), short-range on-road projections, and road infrastructure interfaces (e.g., special asphalts with embedded displays). These designs were tested in different settings, using mockups, specially prepared vehicles, or virtual environments, with heterogeneous evaluation metrics. Promising interfaces based on Augmented Reality (AR) have been proposed too, but their usability and effectiveness have not been tested yet. This paper aims to complement this body of literature by presenting a comparison of state-of-the-art interfaces and new designs under common conditions. To this end, an immersive Virtual Reality-based simulation was developed, recreating a well-known scenario represented by pedestrians crossing in urban environments under non-regulated conditions. A user study was then performed to investigate the various dimensions of vehicle-to-pedestrian interaction leveraging objective and subjective metrics. Even though no interface clearly stood out over all the considered dimensions, one of the AR designs achieved state-of-the-art results in terms of safety and trust, at the cost of higher cognitive effort and lower intuitiveness compared to LED panels showing anthropomorphic features. Together with rankings on the various dimensions, indications about advantages and drawbacks of the various alternatives that emerged from this study could provide important information for future developments in the field.
Submitted 4 February, 2021;
originally announced February 2021.
-
An Evaluation Testbed for Locomotion in Virtual Reality
Authors:
Alberto Cannavò,
Davide Calandra,
F. Gabriele Pratticò,
Valentina Gatteschi,
Fabrizio Lamberti
Abstract:
A common operation performed in Virtual Reality (VR) environments is locomotion. Although real walking can represent a natural and intuitive way to manage displacements in such environments, its use is generally limited by the size of the area tracked by the VR system (typically, the size of a room) or requires expensive technologies to cover particularly extended settings. A number of approaches have been proposed to enable effective explorations in VR, each characterized by different hardware requirements and costs, and capable of providing different levels of usability and performance. However, the lack of a well-defined methodology for assessing and comparing available approaches makes it difficult to identify, among the various alternatives, the best solutions for selected application domains. To deal with this issue, this paper introduces a novel evaluation testbed which, by building on the outcomes of many separate works reported in the literature, aims to support a comprehensive analysis of the considered design space. An experimental protocol for collecting objective and subjective measures is proposed, together with a scoring system able to rank locomotion approaches based on a weighted set of requirements. Testbed usage is illustrated in a use case that requires selecting the technique to adopt in a given application scenario.
Submitted 20 October, 2020;
originally announced October 2020.
-
Mixed-Reality Robotic Games: Design Guidelines for Effective Entertainment with Consumer Robots
Authors:
F. Gabriele Pratticò,
Fabrizio Lamberti
Abstract:
In recent years, there has been an increasing interest in the use of robotic technology at home. A number of service robots appeared on the market, supporting customers in the execution of everyday tasks. Roughly at the same time, consumer level robots started to be used also as toys or gaming companions. However, gaming possibilities provided by current off-the-shelf robotic products are generally quite limited, and this fact makes them quickly lose their attractiveness. One approach proven capable of boosting robotic gaming and related devices consists in creating playful experiences in which physical and digital elements are combined using Mixed Reality technologies. However, these games differ significantly from digital-only or physical-only experiences, and new design principles are required to support developers in their creative work. This paper addresses this need by drafting a set of guidelines which summarize developments carried out by the research community and their findings.
Submitted 30 July, 2020;
originally announced July 2020.
-
Building Trust in Autonomous Vehicles: Role of Virtual Reality Driving Simulators in HMI Design
Authors:
Lia Morra,
Fabrizio Lamberti,
F. Gabriele Pratticó,
Salvatore La Rosa,
Paolo Montuschi
Abstract:
The investigation of factors that contribute to making humans trust Autonomous Vehicles (AVs) will play a fundamental role in the adoption of such technology. The user's ability to form a mental model of the AV, which is crucial to establish trust, depends on effective user-vehicle communication; thus, the importance of Human-Machine Interaction (HMI) is poised to increase. In this work, we propose a methodology to validate the user experience in AVs based on continuous, objective information gathered from physiological signals, while the user is immersed in a Virtual Reality-based driving simulation. We applied this methodology to the design of a head-up display interface delivering visual cues about the vehicle's sensory and planning systems. Through this approach, we obtained qualitative and quantitative evidence that a complete picture of the vehicle's surroundings, despite the higher cognitive load, is conducive to a less stressful experience. Moreover, after having been exposed to a more informative interface, users involved in the study were also more willing to test a real AV. The proposed methodology could be extended by adjusting the simulation environment, the HMI and/or the vehicle's Artificial Intelligence modules to dig into other aspects of the user experience.
Submitted 27 July, 2020;
originally announced July 2020.
-
Object Tracking through Residual and Dense LSTMs
Authors:
Fabio Garcea,
Alessandro Cucco,
Lia Morra,
Fabrizio Lamberti
Abstract:
The visual object tracking task is steadily gaining importance in several fields of application, such as traffic monitoring, robotics, and surveillance, to name a few. Dealing with changes in the appearance of the tracked object is paramount to achieving high tracking accuracy, and is usually addressed by continually learning features. Recently, deep learning-based trackers based on LSTM (Long Short-Term Memory) recurrent neural networks have emerged as a powerful alternative, bypassing the need to retrain the feature extraction in an online fashion. Inspired by the success of residual and dense networks in image recognition, we propose here to enhance the capabilities of hybrid trackers using residual and/or dense LSTMs. By introducing skip connections, it is possible to increase the depth of the architecture while ensuring a fast convergence. Experimental results on the Re3 tracker show that DenseLSTMs outperform Residual and regular LSTMs, and offer a higher resilience to nuisances such as occlusions and out-of-view objects. Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
Submitted 22 June, 2020;
originally announced June 2020.
-
Bridging the gap between Natural and Medical Images through Deep Colorization
Authors:
Lia Morra,
Luca Piano,
Fabrizio Lamberti,
Tatiana Tommasi
Abstract:
Deep learning has thrived by training on large-scale datasets. However, in many applications, such as medical image diagnosis, obtaining massive amounts of data is still prohibitive due to privacy, lack of acquisition homogeneity, and annotation cost. In this scenario, transfer learning from natural image collections is a standard practice that attempts to tackle shape, texture and color discrepancies all at once through pretrained model fine-tuning. In this work, we propose to disentangle those challenges and design a dedicated network module that focuses on color adaptation. We combine learning from scratch of the color module with transfer learning of different classification backbones, obtaining an end-to-end, easy-to-train architecture for diagnostic image recognition on X-ray images. Extensive experiments showed that our approach is particularly efficient in cases of data scarcity and provides a new path for further transferring the learned color information across multiple medical datasets.
Submitted 19 October, 2020; v1 submitted 21 May, 2020;
originally announced May 2020.
-
Turning bad into good: a water-splitting-active hole transporting material to preserve the performance of perovskite solar cells in humid environments
Authors:
Min Kim,
Antonio Alfano,
Giovanni Perotto,
Michele Serri,
Nicola Dengo,
Alessandro Mezzetti,
Silvia Gross,
Mirko Prato,
Marco Salerno,
Roberto Sorrentino,
Gaudenzio Meneghesso,
Fabio Di Fonzo,
Annamaria Petrozza,
Teresa Gatti,
Francesco Lamberti
Abstract:
Lead halide perovskite-based photoactive layers are nowadays employed for a large number of optoelectronic applications, from solar cells to photodetectors and light-emitting diodes, because of their excellent absorption, emission and charge-transport properties. Unfortunately, their commercialization is still hindered by an intrinsic instability towards classical environmental conditions. Water in particular promotes fast decomposition, leading to a drastic decrease in device performance. An innovative functional approach to overcome this major issue could derive from integrating water-splitting active species within charge extracting layers adjacent to the perovskite photoactive layer, converting incoming water molecules into molecular oxygen and hydrogen before they reach the latter, thus preserving device performance over time. In this work we report for the first time on a perovskite-ancillary layer based on CuSCN nanoplatelets dispersed in a p-type semiconducting polymeric matrix, combining hole extraction/transport properties with good water-oxidation activity, that transforms incoming water molecules and further triggers the in situ p-doping of the conjugated polymer by means of the produced dioxygen, further improving transport of photogenerated charges. This composite layer enables the long-term stabilization of a mixed cation lead halide perovskite within a direct solar cell architecture, maintaining a stable performance for 28 days in high-moisture simulated conditions. Our findings demonstrate that the engineering of a hole extraction layer with water-splitting active additives represents a valuable strategy to mitigate the degradation of perovskite solar cells exposed to atmospheric humidity. A similar approach could be employed in the future to improve the stability of other optoelectronic devices based on water-sensitive species.
Submitted 19 May, 2020;
originally announced May 2020.
-
Slicing and dicing soccer: automatic detection of complex events from spatio-temporal data
Authors:
Lia Morra,
Francesco Manigrasso,
Giuseppe Canto,
Claudio Gianfrate,
Enrico Guarino,
Fabrizio Lamberti
Abstract:
The automatic detection of events in sport videos has important applications for data analytics, as well as for broadcasting and media companies. This paper presents a comprehensive approach for detecting a wide range of complex events in soccer videos starting from positional data. The event detector is designed as a two-tier system that detects atomic and complex events. Atomic events are detected based on temporal and logical combinations of the detected objects, their relative distances, as well as spatio-temporal features such as velocity and acceleration. Complex events are defined as temporal and logical combinations of atomic and complex events, and are expressed by means of a declarative Interval Temporal Logic (ITL). The effectiveness of the proposed approach is demonstrated over 16 different events, including complex situations such as tackles and filtering passes. By formalizing events based on principled ITL, it is possible to easily perform reasoning tasks, such as understanding which passes or crosses result in a goal being scored. To counterbalance the lack of suitable, annotated public datasets, we built on an open source soccer simulation engine to release the synthetic SoccER (Soccer Event Recognition) dataset, which includes complete positional data and annotations for more than 1.6 million atomic events and 9,000 complex events. The dataset and code are available at https://gitlab.com/grains2/slicing-and-dicing-soccer
Submitted 10 April, 2020; v1 submitted 8 April, 2020;
originally announced April 2020.
-
Benchmarking unsupervised near-duplicate image detection
Authors:
Lia Morra,
Fabrizio Lamberti
Abstract:
Unsupervised near-duplicate detection has many practical applications ranging from social media analysis and web-scale retrieval to digital image forensics. It entails running a threshold-limited query on a set of descriptors extracted from the images, with the goal of identifying all possible near-duplicates, while limiting the false positives due to visually similar images. Since the rate of false alarms grows with the dataset size, a very high specificity is required, up to $1 - 10^{-9}$ for realistic use cases; this important requirement, however, is often overlooked in the literature. In recent years, descriptors based on deep convolutional neural networks have matched or surpassed traditional feature extraction methods in content-based image retrieval tasks. To the best of our knowledge, ours is the first attempt to establish the performance range of deep learning-based descriptors for unsupervised near-duplicate detection on a range of datasets, encompassing a broad spectrum of near-duplicate definitions. We leverage both established and new benchmarks, such as the Mir-Flick Near-Duplicate (MFND) dataset, in which a known ground truth is provided for all possible pairs over a general, large scale image collection. To compare the specificity of different descriptors, we reduce the problem of unsupervised detection to that of binary classification of near-duplicate vs. not-near-duplicate images. The latter can be conveniently characterized using the Receiver Operating Characteristic (ROC) curve. Our findings in general favor the choice of fine-tuning deep convolutional networks, as opposed to using off-the-shelf features, but differences at high specificity settings depend on the dataset and are often small. The best performance was observed on the MFND benchmark, achieving 96\% sensitivity at a false positive rate of $1.43 \times 10^{-6}$.
Submitted 3 July, 2019;
originally announced July 2019.
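The specificity requirement mentioned in the abstract above can be made concrete with a little arithmetic. The following is only an illustrative sketch, not the authors' evaluation code: it assumes an exhaustive comparison of all unordered image pairs and treats nearly all pairs as true non-duplicates, so the expected number of false alarms is roughly the pair count times the false positive rate. The function name `expected_false_alarms` is hypothetical.

```python
def expected_false_alarms(num_images: int, false_positive_rate: float) -> float:
    """Expected count of non-duplicate pairs wrongly flagged as near-duplicates.

    Assumption (not from the paper): every unordered pair of images is
    compared, and almost all pairs are true negatives.
    """
    num_pairs = num_images * (num_images - 1) // 2  # unordered pairs, N choose 2
    return num_pairs * false_positive_rate


# The pair count grows quadratically with collection size, so a fixed false
# positive rate produces more and more false alarms as the dataset grows.
small = expected_false_alarms(1_000, 1.43e-6)       # well below one false alarm
large = expected_false_alarms(1_000_000, 1.43e-6)   # hundreds of thousands
```

Under these assumptions, the false positive rate quoted in the abstract would yield less than one expected false alarm on a thousand images, but on the order of hundreds of thousands on a million images, which is why much stricter specificities are needed at web scale.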
-
Brillouin Scattering in Hybrid Optophononic Bragg Micropillar Resonators at 300 GHz
Authors:
M. Esmann,
F. R. Lamberti,
A. Harouri,
L. Lanco,
I. Sagnes,
I. Favero,
G. Aubin,
C. Gomez-Carbonell,
A. Lemaitre,
O. Krebs,
P. Senellart,
N. D. Lanzillotti-Kimura
Abstract:
We introduce a monolithic Brillouin generator based on a semiconductor micropillar cavity embedding a high frequency nanoacoustic resonator operating in the hundreds of GHz range. The concept of two nested resonators allows an independent design of the ultrahigh frequency Brillouin spectrum and of the optical device. We develop an optical free-space technique to characterize spontaneous Brillouin scattering in this monolithic device and propose a measurement protocol that maximizes the Brillouin generation efficiency in the presence of optically induced thermal effects. The compact and versatile Brillouin generator studied here could be readily integrated into fibered and on-chip architectures.
Submitted 12 December, 2018;
originally announced December 2018.
-
Topological acoustics in coupled nanocavity arrays
Authors:
M. Esmann,
F. R. Lamberti,
A. Lemaitre,
N. D. Lanzillotti-Kimura
Abstract:
The Su-Schrieffer-Heeger (SSH) model is likely the simplest one-dimensional concept to study non-trivial topological phases and topological excitations. Originally developed to explain the electric conductivity of polyacetylene, it has become a platform for the study of topological effects in electronics, photonics and ultra-cold atomic systems. Here, we propose an experimentally feasible implementation of the SSH model based on coupled one-dimensional acoustic nanoresonators working in the GHz-THz range. In this simulator it is possible to implement different signs in the nearest neighbor interaction terms, showing full tunability of all parameters in the SSH model. Based on this concept we construct topological transition points generating nanophononic edge and interface states and propose an easy scheme to experimentally probe their spatial complex amplitude distribution directly by well-established optical pump-probe techniques.
Submitted 11 May, 2018;
originally announced May 2018.
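For readers unfamiliar with the SSH model named in the abstract above, the sketch below shows the textbook tight-binding version of the idea; it is not the authors' acoustic nanoresonator simulator. It assumes a finite open chain with alternating real hopping amplitudes `v` (within a cell) and `w` (between cells), builds the standard left-edge trial state, and measures how close that state is to a zero-energy eigenstate. All function names are hypothetical.

```python
import math


def ssh_edge_state(num_cells: int, v: float, w: float) -> list[float]:
    """Normalized trial state living on sublattice A, built from the left edge.

    Amplitudes follow a_{n+1} = -(v / w) * a_n, which cancels the action of
    the Hamiltonian on every B site except the last one.
    """
    psi = [0.0] * (2 * num_cells)
    for n in range(num_cells):
        psi[2 * n] = (-v / w) ** n  # site 2n is A_n, site 2n+1 is B_n
    norm = math.sqrt(sum(x * x for x in psi))
    return [x / norm for x in psi]


def apply_ssh_hamiltonian(psi: list[float], v: float, w: float) -> list[float]:
    """Apply the open-chain SSH Hamiltonian (nearest-neighbor hopping only)."""
    out = [0.0] * len(psi)
    for i in range(len(psi) - 1):
        t = v if i % 2 == 0 else w  # even bonds are intracell, odd are intercell
        out[i] += t * psi[i + 1]
        out[i + 1] += t * psi[i]
    return out


def residual_norm(num_cells: int, v: float, w: float) -> float:
    """Norm of H|psi>; a value near zero signals a zero-energy edge state."""
    h_psi = apply_ssh_hamiltonian(ssh_edge_state(num_cells, v, w), v, w)
    return math.sqrt(sum(x * x for x in h_psi))
```

In this toy model the residual is exponentially small when `|v| < |w|`, where the trial state decays away from the left edge, and of order `|v|` when `|v| > |w|`, where no such edge state exists. Flipping the sign of `v` or `w` leaves the residual unchanged, since only the ratio's magnitude controls the decay.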
-
Topological nanophononic states by band inversion
Authors:
Martin Esmann,
Fabrice Roland Lamberti,
Pascale Senellart,
Ivan Favero,
Olivier Krebs,
Loic Lanco,
Carmen Gomez Carbonell,
Aristide Lemaitre,
Norberto Daniel Lanzillotti-Kimura
Abstract:
Nanophononics is essential for the engineering of thermal transport in nanostructured electronic devices; it greatly facilitates the manipulation of mechanical resonators in the quantum regime, and could unveil a new route in quantum communications using phonons as carriers of information. Acoustic phonons also constitute a versatile platform for the study of fundamental wave dynamics, including Bloch oscillations, Wannier-Stark ladders and other localization phenomena. Many of the phenomena studied in nanophononics were indeed inspired by their counterparts in optics and electronics. In these fields, the consideration of topological invariants to control wave dynamics has already had a great impact on the generation of robust confined states. Interestingly, the use of topological phases to engineer nanophononic devices remains an unexplored and promising field. Conversely, the use of acoustic phonons could constitute a rich platform to study topological states. Here, we introduce the concept of topological invariants to nanophononics and experimentally implement a nanophononic system supporting a robust topological interface state at 350 GHz. The state is constructed through band inversion, i.e. by concatenating two semiconductor superlattices with inverted spatial mode symmetries. The existence of this state is purely determined by the Zak phases of the constituent superlattices, i.e. the one-dimensional Berry phase. We experimentally evidenced the mode through Raman spectroscopy. The reported robust topological interface states could become part of nanophononic devices requiring resonant structures such as sensors or phonon lasers.
Submitted 24 February, 2018;
originally announced February 2018.
-
Nanomechanical resonators based on adiabatic periodicity-breaking in a superlattice
Authors:
F. R. Lamberti,
M. Esmann,
A. Lemaitre,
C. Gomez Carbonell,
O. Krebs,
I. Favero,
B. Jusserand,
P. Senellart,
L. Lanco,
N. D. Lanzillotti-Kimura
Abstract:
We propose a novel acoustic cavity design where we confine a mechanical mode by adiabatically changing the acoustic properties of a GaAs/AlAs superlattice. By means of high resolution Raman scattering measurements, we experimentally demonstrate the presence of a confined acoustic mode at a resonance frequency around 350 GHz. We observe an excellent agreement between the experimental data and numerical simulations based on a photoelastic model. We demonstrate that the spatial profile of the confined mode can be tuned by changing the magnitude of the adiabatic deformation, leading to strong variations of its mechanical quality factor and Raman scattering cross section. The reported alternative confinement method could lead to the development of a novel generation of nanophononic and optomechanical systems.
Submitted 18 August, 2017;
originally announced August 2017.
-
Optomechanical properties of GaAs/AlAs micropillar resonators operating in the 18 GHz range
Authors:
F. R. Lamberti,
Q. Yao,
L. Lanco,
D. T. Nguyen,
M. Esmann,
A. Fainstein,
P. Sesin,
S. Anguiano,
V. Villafañe,
A. Bruchhausen,
P. Senellart,
I. Favero,
N. D. Lanzillotti-Kimura
Abstract:
Recent experiments demonstrated that GaAs-AlAs based micropillar cavities are promising systems for quantum optomechanics, allowing the simultaneous three-dimensional confinement of near-infrared photons and acoustic phonons in the 18-100 GHz range. Here, we investigate through numerical simulations the optomechanical properties of this new platform. We evidence how the Poisson's ratio and semiconductor-vacuum boundary conditions lead to very distinct features in the mechanical and optical three dimensional confinement. We find a strong dependence of the mechanical quality factor and strain distribution on the micropillar radius, in great contrast to what is predicted and observed in the optical domain. The derived optomechanical coupling constants g_0 reach ultra-large values in the 10^6 rad/s range.
Submitted 26 July, 2017;
originally announced July 2017.