Search | arXiv e-print repository

doi 10.1109/TIFS.2018.2807791

Facial Soft Biometrics for Recognition in the Wild: Recent Works, Annotation, and COTS Evaluation

Authors: Ester Gonzalez-Sosa, Julian Fierrez, Ruben Vera-Rodriguez, Fernando Alonso-Fernandez

Abstract: The role of soft biometrics to enhance person recognition systems in unconstrained scenarios has not been extensively studied. Here, we explore the utility of the following modalities: gender, ethnicity, age, glasses, beard, and moustache. We consider two assumptions: 1) manual estimation of soft biometrics and 2) automatic estimation from two commercial off-the-shelf systems (COTS). All experimen… ▽ More The role of soft biometrics to enhance person recognition systems in unconstrained scenarios has not been extensively studied. Here, we explore the utility of the following modalities: gender, ethnicity, age, glasses, beard, and moustache. We consider two assumptions: 1) manual estimation of soft biometrics and 2) automatic estimation from two commercial off-the-shelf systems (COTS). All experiments are reported using the labeled faces in the wild (LFW) database. First, we study the discrimination capabilities of soft biometrics standalone. Then, experiments are carried out fusing soft biometrics with two state-of-the-art face recognition systems based on deep learning. We observe that soft biometrics is a valuable complement to the face modality in unconstrained scenarios, with relative improvements up to 40%/15% in the verification performance when using manual/automatic soft biometrics estimation. Results are reproducible as we make public our manual annotations and COTS outputs of soft biometrics over LFW, as well as the face recognition scores. △ Less

Submitted 24 October, 2022; originally announced October 2022.

Comments: Published at IEEE Transactions on Information Forensics and Security

arXiv:2208.12639 [pdf, other]

Full Body Video-Based Self-Avatars for Mixed Reality: from E2E System to User Study

Authors: Diego Gonzalez Morin, Ester Gonzalez-Sosa, Pablo Perez, Alvaro Villegas

Abstract: In this work we explore the creation of self-avatars through video pass-through in Mixed Reality (MR) applications. We present our end-to-end system, including: custom MR video pass-through implementation on a commercial head mounted display (HMD), our deep learning-based real-time egocentric body segmentation algorithm, and our optimized offloading architecture, to communicate the segmentation se… ▽ More In this work we explore the creation of self-avatars through video pass-through in Mixed Reality (MR) applications. We present our end-to-end system, including: custom MR video pass-through implementation on a commercial head mounted display (HMD), our deep learning-based real-time egocentric body segmentation algorithm, and our optimized offloading architecture, to communicate the segmentation server with the HMD. To validate this technology, we designed an immersive VR experience where the user has to walk through a narrow tiles path over an active volcano crater. The study was performed under three body representation conditions: virtual hands, video pass-through with color-based full-body segmentation and video pass-through with deep learning full-body segmentation. This immersive experience was carried out by 30 women and 28 men. To the best of our knowledge, this is the first user study focused on evaluating video-based self-avatars to represent the user in a MR scene. Results showed no significant differences between the different body representations in terms of presence, with moderate improvements in some Embodiment components between the virtual hands and full-body representations. Visual Quality results showed better results from the deep-learning algorithms in terms of the whole body perception and overall segmentation quality. We provide some discussion regarding the use of video-based self-avatars, and some reflections on the evaluation methodology. The proposed E2E solution is in the boundary of the state of the art, so there is still room for improvement before it reaches maturity. However, this solution serves as a crucial starting point for novel MR distributed solutions. △ Less

Submitted 24 August, 2022; originally announced August 2022.

Comments: Diego Gonzalez-Morin and Ester Gonzalez-Sosa contribute equally

arXiv:2207.01296 [pdf, other]

Real Time Egocentric Segmentation for Video-self Avatar in Mixed Reality

Authors: Ester Gonzalez-Sosa, Andrija Gajic, Diego Gonzalez-Morin, Guillermo Robledo, Pablo Perez, Alvaro Villegas

Abstract: In this work we present our real-time egocentric body segmentation algorithm. Our algorithm achieves a frame rate of 66 fps for an input resolution of 640x480, thanks to our shallow network inspired in Thundernet's architecture. Besides, we put a strong emphasis on the variability of the training data. More concretely, we describe the creation process of our Egocentric Bodies (EgoBodies) dataset,… ▽ More In this work we present our real-time egocentric body segmentation algorithm. Our algorithm achieves a frame rate of 66 fps for an input resolution of 640x480, thanks to our shallow network inspired in Thundernet's architecture. Besides, we put a strong emphasis on the variability of the training data. More concretely, we describe the creation process of our Egocentric Bodies (EgoBodies) dataset, composed of almost 10,000 images from three datasets, created both from synthetic methods and real capturing. We conduct experiments to understand the contribution of the individual datasets; compare Thundernet model trained with EgoBodies with simpler and more complex previous approaches and discuss their corresponding performance in a real-life setup in terms of segmentation quality and inference times. The described trained semantic segmentation algorithm is already integrated in an end-to-end system for Mixed Reality (MR), making it possible for users to see his/her own body while being immersed in a MR scene. △ Less

Submitted 4 July, 2022; originally announced July 2022.

Comments: 9 pages, 9 figures

arXiv:2205.05953 [pdf]

doi 10.3389/frsip.2022.917684

Emerging Immersive Communication Systems: Overview, Taxonomy, and Good Practises for QoE Assessment

Authors: Pablo Pérez, Ester Gonzalez-Sosa, Jesús Gutiérrez, Narciso García

Abstract: Several technological and scientific advances have been achieved recently in the fields of immersive systems, which are offering new possibilities to applications and services in different communication domains, such as entertainment, virtual conferencing, working meetings, social relations, healthcare, and industry. Users of these immersive technologies can explore and experience the stimuli in a… ▽ More Several technological and scientific advances have been achieved recently in the fields of immersive systems, which are offering new possibilities to applications and services in different communication domains, such as entertainment, virtual conferencing, working meetings, social relations, healthcare, and industry. Users of these immersive technologies can explore and experience the stimuli in a more interactive and personalized way than previous technologies. Thus, considering the new technological challenges related to these systems and the new perceptual dimensions and interaction behaviors involved, a deep understanding of the users' Quality of Experience is required to satisfy their demands and expectations. In this sense, it is essential to foster the research on evaluating the QoE of immersive communication systems, since this will provide useful outcomes to optimize them and to identify the factors that can deteriorate the user experience. With this aim, subjective tests are usually performed following standard methodologies, which are designed for specific technologies and services. Although numerous user studies have been already published, there are no recommendations or standards that define common testing methodologies to be applied to evaluate immersive communication systems, such as those developed for images and video. Therefore, a revision of the QoE evaluation methods designed for previous technologies is required to develop robust and reliable methodologies for immersive communication systems. Thus, the objective of this paper is to provide an overview of existing immersive communication systems and related user studies, which can help on the definition of basic guidelines and testing methodologies to be used when performing user tests of immersive communication systems, such as 360-degree video-based telepresence, avatar-based social VR, cooperative AR, etc. △ Less

Submitted 1 September, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

Comments: Frontiers in Signal Processing

Journal ref: Front. Signal Process. (2022)

arXiv:2203.15618 [pdf]

doi 10.1109/TBIOM.2019.2906367

Exploring Body Texture from mmW Images for Person Recognition

Authors: E. Gonzalez-Sosa, J. Fierrez, R. Vera-Rodriguez, F. Alonso-Fernandez, V. M. Patel

Abstract: Imaging using millimeter waves (mmWs) has many advantages including the ability to penetrate obscurants such as clothes and polymers. After having explored shape information retrieved from mmW images for person recognition, in this work we aim to gain some insight about the potential of using mmW texture information for the same task, considering not only the mmW face, but also mmW torso and mmW w… ▽ More Imaging using millimeter waves (mmWs) has many advantages including the ability to penetrate obscurants such as clothes and polymers. After having explored shape information retrieved from mmW images for person recognition, in this work we aim to gain some insight about the potential of using mmW texture information for the same task, considering not only the mmW face, but also mmW torso and mmW wholebody. We report experimental results using the mmW TNO database consisting of 50 individuals based on both hand-crafted and learned features from Alexnet and VGG-face pretrained Convolutional Neural Networks (CNN) models. First, we analyze the individual performance of three mmW body parts, concluding that: i) mmW torso region is more discriminative than mmW face and the whole body, ii) CNN features produce better results compared to hand-crafted features on mmW faces and the entire body, and iii) hand-crafted features slightly outperform CNN features on mmW torso. In the second part of this work, we analyze different multi-algorithmic and multi-modal techniques, including a novel CNN-based fusion technique, improving verification results to 2% EER and identification rank-1 results up to 99%. Comparative analyses with mmW body shape information and face recognition in the visible and NIR spectral bands are also reported. △ Less

Submitted 27 March, 2022; originally announced March 2022.

Comments: Published at IEEE Transactions on Biometrics, Behavior, and Identity Science

arXiv:2203.14203 [pdf, other]

doi 10.1109/ACCESS.2018.2889395

A Survey of Super-Resolution in Iris Biometrics with Evaluation of Dictionary-Learning

Authors: F. Alonso-Fernandez, R. A. Farrugia, J. Bigun, J. Fierrez, E. Gonzalez-Sosa

Abstract: The lack of resolution has a negative impact on the performance of image-based biometrics. While many generic super-resolution methods have been proposed to restore low-resolution images, they usually aim to enhance their visual appearance. However, a visual enhancement of biometric images does not necessarily correlate with a better recognition performance. Reconstruction approaches need thus to… ▽ More The lack of resolution has a negative impact on the performance of image-based biometrics. While many generic super-resolution methods have been proposed to restore low-resolution images, they usually aim to enhance their visual appearance. However, a visual enhancement of biometric images does not necessarily correlate with a better recognition performance. Reconstruction approaches need thus to incorporate specific information from the target biometric modality to effectively improve recognition. This paper presents a comprehensive survey of iris super-resolution approaches proposed in the literature. We have also adapted an Eigen-patches reconstruction method based on PCA Eigen-transformation of local image patches. The structure of the iris is exploited by building a patch-position dependent dictionary. In addition, image patches are restored separately, having their own reconstruction weights. This allows the solution to be locally optimized, hel** to preserve local information. To evaluate the algorithm, we degraded high-resolution images from the CASIA Interval V3 database. Different restorations were considered, with 15x15 pixels being the smallest resolution. To the best of our knowledge, this is among the smallest resolutions employed in the literature. The framework is complemented with six public iris comparators, which were used to carry out biometric verification and identification experiments. Experimental results show that the proposed method significantly outperforms both bilinear and bicubic interpolation at very low-resolution. The performance of a number of comparators attains an impressive Equal Error Rate as low as 5%, and a Top-1 accuracy of 77-84% when considering iris images of only 15x15 pixels. These results clearly demonstrate the benefit of using trained super-resolution techniques to improve the quality of iris images prior to matching. △ Less

Submitted 26 March, 2022; originally announced March 2022.

Comments: Published at IEEE Access

arXiv:2106.04957 [pdf, other]

Real Time Egocentric Object Segmentation: THU-READ Labeling and Benchmarking Results

Authors: E. Gonzalez-Sosa, G. Robledo, D. Gonzalez-Morin, P. Perez-Garcia, A. Villegas

Abstract: Egocentric segmentation has attracted recent interest in the computer vision community due to their potential in Mixed Reality (MR) applications. While most previous works have been focused on segmenting egocentric human body parts (mainly hands), little attention has been given to egocentric objects. Due to the lack of datasets of pixel-wise annotations of egocentric objects, in this paper we con… ▽ More Egocentric segmentation has attracted recent interest in the computer vision community due to their potential in Mixed Reality (MR) applications. While most previous works have been focused on segmenting egocentric human body parts (mainly hands), little attention has been given to egocentric objects. Due to the lack of datasets of pixel-wise annotations of egocentric objects, in this paper we contribute with a semantic-wise labeling of a subset of 2124 images from the RGB-D THU-READ Dataset. We also report benchmarking results using Thundernet, a real-time semantic segmentation network, that could allow future integration with end-to-end MR applications. △ Less

Submitted 9 June, 2021; originally announced June 2021.

Comments: Accepted for presentation at EPIC@CVPR2021 workshop

arXiv:2005.12074 [pdf, other]

Egocentric Human Segmentation for Mixed Reality

Authors: Andrija Gajic, Ester Gonzalez-Sosa, Diego Gonzalez-Morin, Marcos Escudero-Viñolo, Alvaro Villegas

Abstract: The objective of this work is to segment human body parts from egocentric video using semantic segmentation networks. Our contribution is two-fold: i) we create a semi-synthetic dataset composed of more than 15, 000 realistic images and associated pixel-wise labels of egocentric human body parts, such as arms or legs including different demographic factors; ii) building upon the ThunderNet archite… ▽ More The objective of this work is to segment human body parts from egocentric video using semantic segmentation networks. Our contribution is two-fold: i) we create a semi-synthetic dataset composed of more than 15, 000 realistic images and associated pixel-wise labels of egocentric human body parts, such as arms or legs including different demographic factors; ii) building upon the ThunderNet architecture, we implement a deep learning semantic segmentation algorithm that is able to perform beyond real-time requirements (16 ms for 720 x 720 images). It is believed that this method will enhance sense of presence of Virtual Environments and will constitute a more realistic solution to the standard virtual avatars. △ Less

Submitted 8 June, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

Comments: Accepted for presentation at EPIC@CVPR2020 workshop

arXiv:2003.12352 [pdf, other]

Enhanced Self-Perception in Mixed Reality: Egocentric Arm Segmentation and Database with Automatic Labelling

Authors: Ester Gonzalez-Sosa, Pablo Perez, Ruben Tolosana, Redouane Kachach, Alvaro Villegas

Abstract: In this study, we focus on the egocentric segmentation of arms to improve self-perception in Augmented Virtuality (AV). The main contributions of this work are: i) a comprehensive survey of segmentation algorithms for AV; ii) an Egocentric Arm Segmentation Dataset, composed of more than 10, 000 images, comprising variations of skin color, and gender, among others. We provide all details required f… ▽ More In this study, we focus on the egocentric segmentation of arms to improve self-perception in Augmented Virtuality (AV). The main contributions of this work are: i) a comprehensive survey of segmentation algorithms for AV; ii) an Egocentric Arm Segmentation Dataset, composed of more than 10, 000 images, comprising variations of skin color, and gender, among others. We provide all details required for the automated generation of groundtruth and semi-synthetic images; iii) the use of deep learning for the first time for segmenting arms in AV; iv) to showcase the usefulness of this database, we report results on different real egocentric hand datasets, including GTEA Gaze+, EDSH, EgoHands, Ego Youtube Hands, THU-Read, TEgO, FPAB, and Ego Gesture, which allow for direct comparisons with existing approaches utilizing color or depth. Results confirm the suitability of the EgoArm dataset for this task, achieving improvement up to 40% with respect to the original network, depending on the particular dataset. Results also suggest that, while approaches based on color or depth can work in controlled conditions (lack of occlusion, uniform lighting, only objects of interest in the near range, controlled background, etc.), egocentric segmentation based on deep learning is more robust in real AV applications. △ Less

Submitted 27 March, 2020; originally announced March 2020.

Showing 1–9 of 9 results for author: Gonzalez-Sosa, E