Search | arXiv e-print repository

Reinforcement-learning robotic sailboats: simulator and preliminary results

Authors: Eduardo Charles Vasconcellos, Ronald M Sampaio, André P D Araújo, Esteban Walter Gonzales Clua, Philippe Preux, Raphael Guerra, Luiz M G Gonçalves, Luis Martí, Hernan Lira, Nayat Sanchez-Pi

Abstract: This work focuses on the main challenges and problems in developing a virtual oceanic environment reproducing real experiments using Unmanned Surface Vehicles (USV) digital twins. We introduce the key features for building virtual worlds, considering using Reinforcement Learning (RL) agents for autonomous navigation and control. With this in mind, the main problems concern the definition of the simulation equations (physics and mathematics), their effective implementation, and how to include strategies for simulated control and perception (sensors) to be used with RL. We present the modeling, implementation steps, and challenges required to create a functional digital twin based on a real robotic sailing vessel. The application is immediate for developing navigation algorithms based on RL to be applied on real boats.

Submitted 16 January, 2024; originally announced February 2024.

Journal ref: NeurIPS 2023 Workshop on Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models, Dec 2023, New Orleans, United States

arXiv:2307.05760 [pdf, other]

doi 10.1109/sbgames56371.2022.9961078

Line Art Colorization of Fakemon using Generative Adversarial Neural Networks

Authors: Erick Oliveira Rodrigues, Esteban Clua, Giovani Bernardes Vitor

Abstract: This work proposes a complete methodology to colorize images of Fakemon, anime-style monster-like creatures. In addition, we propose algorithms to extract the line art from colorized images as well as to extract color hints. Our work is the first in the literature to use automatic color hint extraction, to train the networks specifically with anime-styled creatures and to combine the Pix2Pix and CycleGAN approaches, two different generative adversarial networks that create a single final result. Visual results of the colorizations are feasible but there is still room for improvement.

Submitted 11 July, 2023; originally announced July 2023.

Comments: art generation, lineart colorization, image colorization, generative adversarial networks, stable diffusion, pokemon, fakemon, digimon

Journal ref: 2022 21st Brazilian Symposium on Computer Games and Digital Entertainment (SBGames)

arXiv:2212.09981 [pdf, other]

Benchmarking person re-identification datasets and approaches for practical real-world implementations

Authors: Jose Huaman, Felix O. Sumari, Luigy Machaca, Esteban Clua, Joris Guerin

Abstract: Recently, Person Re-Identification (Re-ID) has received a lot of attention. Large datasets containing labeled images of various individuals have been released, allowing researchers to develop and test many successful approaches. However, when such Re-ID models are deployed in new cities or environments, the task of searching for people within a network of security cameras is likely to face an important domain shift, thus resulting in decreased performance. Indeed, while most public datasets were collected in a limited geographic area, images from a new city present different features (e.g., people's ethnicity and clothing style, weather, architecture, etc.). In addition, the whole frames of the video streams must be converted into cropped images of people using pedestrian detection models, which behave differently from the human annotators who created the dataset used for training. To better understand the extent of this issue, this paper introduces a complete methodology to evaluate Re-ID approaches and training datasets with respect to their suitability for unsupervised deployment for live operations. This method is used to benchmark four Re-ID approaches on three datasets, providing insight and guidelines that can help to design better Re-ID pipelines in the future.

Submitted 19 December, 2022; originally announced December 2022.

Comments: This paper is the extended version of our short paper accepted in VISAPP - 2023

arXiv:2209.06452 [pdf, other]

TrADe Re-ID -- Live Person Re-Identification using Tracking and Anomaly Detection

Authors: Luigy Machaca, F. Oliver Sumari H, Jose Huaman, Esteban Clua, Joris Guerin

Abstract: Person Re-Identification (Re-ID) aims to search for a person of interest (query) in a network of cameras. In the classic Re-ID setting the query is sought in a gallery containing properly cropped images of entire bodies. Recently, the live Re-ID setting was introduced to represent the practical application context of Re-ID better. It consists in searching for the query in short videos, containing whole scene frames. The initial live Re-ID baseline used a pedestrian detector to build a large search gallery and a classic Re-ID model to find the query in the gallery. However, the galleries generated were too large and contained low-quality images, which decreased the live Re-ID performance. Here, we present a new live Re-ID approach called TrADe, to generate lower high-quality galleries. TrADe first uses a Tracking algorithm to identify sequences of images of the same individual in the gallery. Following, an Anomaly Detection model is used to select a single good representative of each tracklet. TrADe is validated on the live Re-ID version of the PRID-2011 dataset and shows significant improvements over the baseline.

Submitted 14 September, 2022; originally announced September 2022.

Comments: 6 pages, 4 figures, Accepted on ICMLA 2022

arXiv:2207.06346 [pdf, other]

A guideline proposal for minimizing cybersickness in VR-based serious games and applications

Authors: Thiago Porcino, Derek Reilly, Esteban Clua, Daniela Trevisan

Abstract: Head-mounted displays (HMDs) are popular immersive tools in general, not limited to entertainment but also for education, military, and serious games for health. While these displays have strong popularity, they still have user experience issues, triggering possible symptoms of discomfort to users. This condition is known as cybersickness (CS) and is one of the most popular research topics tied to virtual reality (VR) issues. We first present the main strategies focused on minimizing cybersickness problems in virtual reality. Following this, we propose a guideline framework based on CS causes such as locomotion, acceleration, the field of view, depth of field, degree of freedom, exposition use time, latency-lag, static rest frame, and camera rotation. Additionally, serious games applications and broader categories of games can also adopt it. Additionally, we categorized the imminent challenges for CS minimization into four different items. Conclusively, this work contributes as a consulting reference to enable VR developers and designers to optimize their VR users' experience and VR serious games.

Submitted 13 July, 2022; originally announced July 2022.

Comments: 8 pages

Report number: Accepted at the IEEE 10th International Conference on Serious Games and Applications for Health (SeGAH) - SeGAH 2020

arXiv:2108.07903 [pdf, other]

Spatially and color consistent environment lighting estimation using deep neural networks for mixed reality

Authors: Bruno Augusto Dorta Marques, Esteban Walter Gonzalez Clua, Anselmo Antunes Montenegro, Cristina Nader Vasconcelos

Abstract: The representation of consistent mixed reality (XR) environments requires adequate real and virtual illumination composition in real-time. Estimating the lighting of a real scenario is still a challenge. Due to the ill-posed nature of the problem, classical inverse-rendering techniques tackle the problem for simple lighting setups. However, those assumptions do not satisfy the current state-of-art in computer graphics and XR applications. While many recent works solve the problem using machine learning techniques to estimate the environment light and scene's materials, most of them are limited to geometry or previous knowledge. This paper presents a CNN-based model to estimate complex lighting for mixed reality environments with no previous information about the scene. We model the environment illumination using a set of spherical harmonics (SH) environment lighting, capable of efficiently represent area lighting. We propose a new CNN architecture that inputs an RGB image and recognizes, in real-time, the environment lighting. Unlike previous CNN-based lighting estimation methods, we propose using a highly optimized deep neural network architecture, with a reduced number of parameters, that can learn high complex lighting scenarios from real-world high-dynamic-range (HDR) environment images. We show in the experiments that the CNN architecture can predict the environment lighting with an average mean squared error (MSE) of 7.85e-04 when comparing SH lighting coefficients. We validate our model in a variety of mixed reality scenarios. Furthermore, we present qualitative results comparing relights of real-world scenes.

Submitted 17 August, 2021; originally announced August 2021.

arXiv:2009.01377 [pdf, other]

doi 10.1016/j.patrec.2020.08.023

Towards Practical Implementations of Person Re-Identification from Full Video Frames

Authors: Felix O. Sumari, Luigy Machaca, Jose Huaman, Esteban W. G. Clua, Joris Guérin

Abstract: With the major adoption of automation for cities security, person re-identification (Re-ID) has been extensively studied recently. In this paper, we argue that the current way of studying person re-identification, i.e. by trying to re-identify a person within already detected and pre-cropped images of people, is not sufficient to implement practical security applications, where the inputs to the system are the full frames of the video streams. To support this claim, we introduce the Full Frame Person Re-ID setting (FF-PRID) and define specific metrics to evaluate FF-PRID implementations. To improve robustness, we also formalize the hybrid human-machine collaboration framework, which is inherent to any Re-ID security applications. To demonstrate the importance of considering the FF-PRID setting, we build an experiment showing that combining a good people detection network with a good Re-ID model does not necessarily produce good results for the final application. This underlines a failure of the current formulation in assessing the quality of a Re-ID model and justifies the use of different metrics. We hope that this work will motivate the research community to consider the full problem in order to develop algorithms that are better suited to real-world scenarios.

Submitted 2 September, 2020; originally announced September 2020.

Comments: 7 pages, 9 figures, This paper is under consideration at Pattern Recognition Letters

arXiv:2006.15432 [pdf, other]

Automatic Recommendation of Strategies for Minimizing Discomfort in Virtual Environments

Authors: Thiago Porcino, Esteban Clua, Daniela Trevisan, Érick Rodrigues, Alexandre Silva

Abstract: Virtual reality (VR) is an imminent trend in games, education, entertainment, military, and health applications, as the use of head-mounted displays is becoming accessible to the mass market. Virtual reality provides immersive experiences but still does not offer an entirely perfect situation, mainly due to Cybersickness (CS) issues. In this work, we first present a detailed review about possible causes of CS. Following, we propose a novel CS prediction solution. Our system is able to suggest if the user may be entering in the next moments of the application into an illness situation. We use Random Forest classifiers, based on a dataset we have produced. The CSPQ (Cybersickness Profile Questionnaire) is also proposed, which is used to identify the player's susceptibility to CS and the dataset construction. In addition, we designed two immersive environments for empirical studies where participants are asked to complete the questionnaire and describe (orally) the degree of discomfort during their gaming experience. Our data was achieved through 84 individuals on different days, using VR devices. Our proposal also allows us to identify which are the most frequent attributes (causes) in the observed discomfort situations.

Submitted 27 June, 2020; originally announced June 2020.

Comments: Accepted at the IEEE 8th International Conference on Serious Games and Applications for Health (SeGAH) - SeGAH 2020

arXiv:1611.06292 [pdf, other]

Minimizing cyber sickness in head mounted display systems: design guidelines and applications

Authors: Thiago M. Porcino, Esteban W. Clua, Cristina N. Vasconcelos, Daniela Trevisan, Luis Valente

Abstract: We are experiencing an upcoming trend of using head mounted display systems in games and serious games, which is likely to become an established practice in the near future. While these systems provide highly immersive experiences, many users have been reporting discomfort symptoms, such as nausea, sickness, and headaches, among others. When using VR for health applications, this is more critical, since the discomfort may interfere a lot in treatments. In this work we discuss possible causes of these issues, and present possible solutions as design guidelines that may mitigate them. In this context, we go deeper within a dynamic focus solution to reduce discomfort in immersive virtual environments, when using first-person navigation. This solution applies an heuristic model of visual attention that works in real time. This work also discusses a case study (as a first-person spatial shooter demo) that applies this solution and the proposed design guidelines.

Submitted 18 November, 2016; originally announced November 2016.

Comments: 11 pages, 3 figures, 3 tables

arXiv:1605.08035 [pdf]

Notes on Pervasive Virtuality

Authors: Luis Valente, Bruno Feijo, Alexandre Ribeiro Silva, Esteban Clua

Abstract: This paper summarizes current notes about a new mixed-reality paradigm that we named as "pervasive virtuality". This paradigm has emerged recently in industry and academia through different initiatives. In this paper we intend to explore this new area by proposing a set of features that we identified as important or helpful to realize pervasive virtuality in games and entertainment applications.

Submitted 25 May, 2016; originally announced May 2016.

Comments: Tech report MCC01/16 (Monografias em Ciência da Computação, May 2016, PUC-Rio, ISSN 0103-9741), discussion paper in progress to IFIP ICEC 2016

Report number: MCC01/16

arXiv:1601.01645 [pdf]

Live-action Virtual Reality Games

Authors: Luis Valente, Esteban Clua, Alexandre Ribeiro Silva, Bruno Feijó

Abstract: This paper proposes the concept of "live-action virtual reality games" as a new genre of digital games based on an innovative combination of live-action, mixed-reality, context-awareness, and interaction paradigms that comprise tangible objects, context-aware input devices, and embedded/embodied interactions. Live-action virtual reality games are "live-action games" because a player physically acts out (using his/her real body and senses) his/her "avatar" (his/her virtual representation) in the game stage, which is the mixed-reality environment where the game happens. The game stage is a kind of "augmented virtuality"; a mixed-reality where the virtual world is augmented with real-world information. In live-action virtual reality games, players wear HMD devices and see a virtual world that is constructed using the physical world architecture as the basic geometry and context information. Physical objects that reside in the physical world are also mapped to virtual elements. Live-action virtual reality games keeps the virtual and real-worlds superimposed, requiring players to physically move in the environment and to use different interaction paradigms (such as tangible and embodied interaction) to complete game activities. This setup enables the players to touch physical architectural elements (such as walls) and other objects, "feeling" the game stage. Players have free movement and may interact with physical objects placed in the game stage, implicitly and explicitly. Live-action virtual reality games differ from similar game concepts because they sense and use contextual information to create unpredictable game experiences, giving rise to emergent gameplay.

Submitted 7 January, 2016; originally announced January 2016.

Comments: 10 pages, technical report published at "Monografias em Ciência da Computação, PUC-Rio" (ISSN 0103-9741), MCC03/15, July 2015

Report number: MCC03/15

Showing 1–11 of 11 results for author: Clua, E