-
Large-Scale Evaluation of Mobility, Technology and Demand Scenarios in the Chicago Region Using POLARIS
Authors:
Joshua Auld,
Jamie Cook,
Krishna Murthy Gurumurthy,
Nazmul Khan,
Charbel Mansour,
Aymeric Rousseau,
Olcay Sahin,
Felipe de Souza,
Omer Verbas,
Natalia Zuniga-Garcia
Abstract:
Rapid technological progress and innovation in the areas of vehicle connectivity, automation and electrification, new modes of shared and alternative mobility, and advanced transportation system demand and supply management strategies have motivated numerous questions and studies regarding the potential impact on key performance and equity metrics. Several of these areas of development may or may not have a synergistic outcome on the overall benefits such as reduction in congestion and travel times. In this study, the use of an end-to-end modeling workflow centered around an activity-based agent-based travel demand forecasting tool called POLARIS is explored to provide insights on the effects of several different technology deployments and operational policies in combination for the Chicago region. The objective of the research was to explore the direct impacts and observe any interactions between the various policy and technology scenarios to help better characterize and evaluate their potential future benefits. We analyze system outcome metrics on mobility, energy and emissions, equity and environmental justice and overall efficiency for a scenario design of experiments that looks at combinations of supply interventions (congestion pricing, transit expansion, TNC policy, off-hours freight policy, connected signal optimization) for different potential demand scenarios defined by e-commerce and on-demand delivery engagement, and market penetration of electric vehicles. We found different combinations of strategies that can reduce overall travel times by up to 7% and increase system efficiency by up to 53% depending on how various metrics are prioritized. The results demonstrate the importance of considering various interventions jointly.
Submitted 4 March, 2024;
originally announced March 2024.
-
Smartphone region-wise image indoor localization using deep learning for indoor tourist attraction
Authors:
Gabriel Toshio Hirokawa Higa,
Rodrigo Stuqui Monzani,
Jorge Fernando da Silva Cecatto,
Maria Fernanda Balestieri Mariano de Souza,
Vanessa Aparecida de Moraes Weber,
Hemerson Pistori,
Edson Takashi Matsubara
Abstract:
Smart indoor tourist attractions, such as smart museums and aquariums, usually require a significant investment in indoor localization devices. The smartphone's Global Positioning System (GPS) is unsuitable for scenarios where dense materials such as concrete and metal block or weaken the GPS signals, which is the most common scenario in an indoor tourist attraction. Deep learning makes it possible to perform region-wise indoor localization using smartphone images. This approach does not require any investment in infrastructure, reducing the cost and time to turn museums and aquariums into smart museums or smart aquariums. This paper proposes using deep learning algorithms to classify locations using smartphone camera images for indoor tourism attractions. We evaluate our proposal in a real-world scenario in Brazil. We extensively collect images from ten different smartphones to classify biome-themed fish tanks inside the Pantanal Biopark, creating a new dataset of 3654 images. We tested seven state-of-the-art neural networks, three being transformer-based, achieving precision around 90% on average and recall and F-score around 89% on average. The results indicate good feasibility of the proposal in most indoor tourist attractions.
Submitted 12 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Discovering Local Binary Pattern Equation for Foreground Object Removal in Videos
Authors:
Caroline Pacheco do Espirito Silva,
Andrews Cordolino Sobral,
Antoine Vacavant,
Thierry Bouwmans,
Felippe De Souza
Abstract:
Designing a novel Local Binary Pattern (LBP) process usually relies heavily on human experts' knowledge and experience in the area. Even experts are often left with tedious episodes of trial and error until they identify an optimal LBP for a particular dataset. To address this problem, we present a novel symbolic regression able to automatically discover LBP formulas to remove the moving parts of a scene by segmenting it into a background and a foreground. Experimental results conducted on real videos of outdoor urban scenes under various conditions show that the LBPs discovered by the proposed approach significantly outperform the previous state-of-the-art LBP descriptors both qualitatively and quantitatively. Our source code and data will be available online.
Submitted 11 August, 2023;
originally announced August 2023.
-
ProWis: A Visual Approach for Building, Managing, and Analyzing Weather Simulation Ensembles at Runtime
Authors:
Carolina Veiga Ferreira de Souza,
Suzanna Maria Bonnet,
Daniel de Oliveira,
Marcio Cataldi,
Fabio Miranda,
Marcos Lage
Abstract:
Weather forecasting is essential for decision-making and is usually performed using numerical modeling. Numerical weather models, in turn, are complex tools that require specialized training and laborious setup and are challenging even for weather experts. Moreover, weather simulations are data-intensive computations and may take hours to days to complete. When the simulation is finished, the experts face challenges analyzing its outputs, a large mass of spatiotemporal and multivariate data. From the simulation setup to the analysis of results, working with weather simulations involves several manual and error-prone steps. The complexity of the problem increases exponentially when the experts must deal with ensembles of simulations, a frequent task in their daily duties. To tackle these challenges, we propose ProWis: an interactive and provenance-oriented system to help weather experts build, manage, and analyze simulation ensembles at runtime. Our system follows a human-in-the-loop approach to enable the exploration of multiple atmospheric variables and weather scenarios. ProWis was built in close collaboration with weather experts, and we demonstrate its effectiveness by presenting two case studies of rainfall events in Brazil.
Submitted 9 August, 2023;
originally announced August 2023.
-
AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring
Authors:
Juan Sebastián Cañas,
Maria Paula Toro-Gómez,
Larissa Sayuri Moreira Sugai,
Hernán Darío Benítez Restrepo,
Jorge Rudas,
Breyner Posso Bautista,
Luís Felipe Toledo,
Simone Dena,
Adão Henrique Rosa Domingos,
Franco Leandro de Souza,
Selvino Neckel-Oliveira,
Anderson da Rosa,
Vítor Carvalho-Rocha,
José Vinícius Bernardy,
José Luiz Massao Moreira Sugai,
Carolina Emília dos Santos,
Rogério Pereira Bastos,
Diego Llusia,
Juan Sebastián Ulloa
Abstract:
Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires the identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibian calls recorded by PAM that comprises 27 hours of expert annotations for 42 different species from two Brazilian biomes. We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model of the fine-grained categorization problem. Additionally, we highlight the challenges of the dataset to encourage machine learning researchers to solve the problem of anuran call identification towards conservation policy. All our experiments and resources can be found on our GitHub repository https://github.com/soundclim/anuraset.
Submitted 11 July, 2023;
originally announced July 2023.
-
Benefits and Drawbacks of a Graduate Course: An Experience Teaching Systematic Literature Review
Authors:
Anderson Yoshiaki Iwazaki,
Vinicius dos Santos,
Katia Romero Felizardo,
Érica Ferreira de Souza,
Natasha M. C. Valentim,
Elisa Yumi Nakagawa
Abstract:
Graduate courses can provide specialized knowledge for Ph.D. and Master's students and contribute to developing their hard and soft skills. At the same time, Systematic Literature Review (SLR) has been increasingly adopted in the computing area as a valuable technique to synthesize the state of the art of a given research topic. However, there is still a poor understanding of the real benefits and drawbacks of offering the SLR course for graduate students. This paper reports an experience that examines such benefits and drawbacks, the difficulties for professors (i.e., educators), and the essential SLR topics to be taught as well as a way to better teach them. We also surveyed computer science graduate students who attended the SLR course, which we have offered for almost ten years for Ph.D. and Master's students in our institution. We found that attending the SLR course is a valuable opportunity for graduate students to conduct the required deep literature review of their research topic, improve their research skills, and advance their training. Hence, we recommend that Ph.D. and Master's programs offer the SLR course to contribute to their academic achievement.
Submitted 16 May, 2022;
originally announced May 2022.
-
Standing Forest Coin (SFC)
Authors:
Marcelo de A. Borges,
Guido L. de S. Filho,
Cicero Inacio da Silva,
Anderson M. P. Barros,
Raul V. B. J. Britto,
Nivaldo M. de C. Junior,
Daniel F. L. de Souza
Abstract:
This article describes a proposal to create a digital currency that allows the decentralized collection of resources directed to initiatives and activities that aim to protect the Brazilian Amazon ecosystem by using blockchain and digital contracts. In addition to the digital currency, the goal is to design a smart contract based on oracles to ensure credibility and security for investors and donors of financial resources invested in projects within the Standing Forest Coin (SFC - standingforest.org).
Submitted 4 March, 2022;
originally announced March 2022.
-
Sequence-to-Sequence Models for Extracting Information from Registration and Legal Documents
Authors:
Ramon Pires,
Fábio C. de Souza,
Guilherme Rosa,
Roberto A. Lotufo,
Rodrigo Nogueira
Abstract:
A typical information extraction pipeline consists of token- or span-level classification models coupled with a series of pre- and post-processing scripts. In a production pipeline, requirements often change, with classes being added and removed, which leads to nontrivial modifications to the source code and the possible introduction of bugs. In this work, we evaluate sequence-to-sequence models as an alternative to token-level classification methods for information extraction of legal and registration documents. We finetune models that jointly extract the information and generate the output already in a structured format. Post-processing steps are learned during training, thus eliminating the need for rule-based methods and simplifying the pipeline. Furthermore, we propose a novel method to align the output with the input text, thus facilitating system inspection and auditing. Our experiments on four real-world datasets show that the proposed method is an alternative to classical pipelines.
Submitted 14 January, 2022;
originally announced January 2022.
-
${\tt simwave}$ -- A Finite Difference Simulator for Acoustic Waves Propagation
Authors:
Jaime Freire de Souza,
João Baptista Dias Moreira,
Keith Jared Roberts,
Roussian di Ramos Alves Gaioso,
Edson Satoshi Gomi,
Emílio Carlos Nelli Silva,
Hermes Senger
Abstract:
${\tt simwave}$ is an open-source Python package to perform wave simulations in 2D or 3D domains. It solves the constant and variable density acoustic wave equation with the finite difference method and has support for domain truncation techniques, several boundary conditions, and the modeling of sources and receivers given a user-defined acquisition geometry. The architecture of ${\tt simwave}$ is designed for applications with geophysical exploration in mind. Its Python front-end enables straightforward integration with many existing Python scientific libraries for the composition of more complex workflows and applications (e.g., migration and inversion problems). The back-end is implemented in C enabling performance portability across a range of computing hardware and compilers including both CPUs and GPUs.
Submitted 13 January, 2022;
originally announced January 2022.
-
Combining Embeddings and Fuzzy Time Series for High-Dimensional Time Series Forecasting in Internet of Energy Applications
Authors:
Hugo Vinicius Bitencourt,
Luiz Augusto Facury de Souza,
Matheus Cascalho dos Santos,
Petrônio Cândido de Lima e Silva,
Frederico Gadelha Guimarães
Abstract:
The prediction of residential power usage is essential in assisting a smart grid to manage and preserve energy to ensure efficient use. Accurate energy forecasting at the customer level translates directly into efficiency improvements across the power grid system; however, forecasting building energy use is a complex task due to many influencing factors, such as meteorological and occupancy patterns. In addition, high-dimensional time series increasingly arise in the Internet of Energy (IoE), given the emergence of multi-sensor environments and the two-way communication between energy consumers and the smart grid. Therefore, methods that are capable of computing high-dimensional time series are of great value in smart building and IoE applications. Fuzzy Time Series (FTS) models stand out as data-driven non-parametric models of easy implementation and high accuracy. Unfortunately, the existing FTS models can become infeasible if all features are used to train the model. We present a new methodology for handling high-dimensional time series, by projecting the original high-dimensional data into a low-dimensional embedding space and using a multivariate FTS approach in this low-dimensional representation. Combining these techniques enables a better representation of the complex content of multivariate time series and more accurate forecasts.
Submitted 3 December, 2021;
originally announced December 2021.
-
RandSolomon: Optimally Resilient Random Number Generator with Deterministic Termination
Authors:
Luciano Freitas de Souza,
Andrei Tonkikh,
Sara Tucci-Piergiovanni,
Renaud Sirdey,
Oana Stan,
Nicolas Quero,
Petr Kuznetsov
Abstract:
Multi-party random number generation is a key building block in many practical protocols. While straightforward to solve when all parties are trusted to behave correctly, the problem becomes much more difficult in the presence of faults. In this context, this paper presents RandSolomon, a protocol that allows a network of N processes to produce an unpredictable common random number among the non-faulty ones. We provide optimal resilience for partially synchronous systems where fewer than a third of the participants might behave arbitrarily and, contrary to many solutions, we do not require faulty processes to be responsive at any point.
Submitted 14 December, 2021; v1 submitted 10 September, 2021;
originally announced September 2021.
-
Towards Sustainability of Systematic Literature Reviews
Authors:
Vinicius dos Santos,
Anderson Yoshiaki Iwazaki,
Katia Romero Felizardo,
Érica Ferreira de Souza,
Elisa Yumi Nakagawa
Abstract:
Background: The software engineering community has increasingly conducted systematic literature reviews (SLR) as a means to summarize evidence from different studies and bring to light the state of the art of a given research topic. While SLR provide many benefits, they also present several problems, with isolated solutions for some of them. However, two main problems still remain: the high time-/effort-consumption nature of SLR and the lack of an effective impact of SLR results in the industry, as initially expected for SLR. Aims: The main goal of this paper is to introduce a new view - which we name Sustainability of SLR - on how to deal with SLR aiming at reducing those problems. Method: We analyzed six reference studies published in the last decade to identify, group, and analyze the SLR problems and their interconnections. Based on such analysis, we proposed the view of Sustainability of SLR that intends to address these problems. Results: The proposed view encompasses three dimensions (social, economic, and technical) that could make SLR more sustainable in the sense that the four major problems and 31 barriers (i.e., possible causes for those problems) that we identified could be mitigated. Conclusions: The view of Sustainability of SLR intends to change the researchers' mindset to mitigate the inherent SLR problems and, as a consequence, achieve sustainable SLR, i.e., those that consume less time/effort to be conducted and updated with useful results for the industry.
Submitted 31 August, 2021;
originally announced September 2021.
-
Deep Learning-based Type Identification of Volumetric MRI Sequences
Authors:
Jean Pablo Vieira de Mello,
Thiago M. Paixão,
Rodrigo Berriel,
Mauricio Reyes,
Claudine Badue,
Alberto F. De Souza,
Thiago Oliveira-Santos
Abstract:
The analysis of Magnetic Resonance Imaging (MRI) sequences enables clinical professionals to monitor the progression of a brain tumor. As the interest in automating brain volume MRI analysis increases, it becomes convenient to have each sequence well identified. However, the unstandardized naming of MRI sequences makes their identification difficult for automated systems, and makes it difficult for researchers to generate or use datasets for machine learning research. To address this, we propose a system for identifying types of brain MRI sequences based on deep learning. By training a Convolutional Neural Network (CNN) based on the 18-layer ResNet architecture, our system can classify a volumetric brain MRI as a FLAIR, T1, T1c or T2 sequence, or determine that it does not belong to any of these classes. The network was evaluated on publicly available datasets comprising both pre-processed (BraTS dataset) and non-pre-processed (TCGA-GBM dataset) image types with diverse acquisition protocols, requiring only a few slices of the volume for training. Our system can classify among sequence types with an accuracy of 96.81%.
Submitted 6 June, 2021;
originally announced June 2021.
-
Accountability and Reconfiguration: Self-Healing Lattice Agreement
Authors:
Luciano Freitas de Souza,
Petr Kuznetsov,
Thibault Rieutord,
Sara Tucci-Piergiovanni
Abstract:
An accountable distributed system provides means to detect deviations of system components from their expected behavior. It is natural to complement fault detection with a reconfiguration mechanism, so that the system could heal itself, by replacing malfunctioning parts with new ones. In this paper, we describe a framework that can be used to implement a large class of accountable and reconfigurable replicated services. We build atop the fundamental lattice agreement abstraction lying at the core of storage systems and cryptocurrencies.
Our asynchronous implementation of accountable lattice agreement ensures that every violation of consistency is followed by undeniable evidence of misbehavior of a faulty replica. The system can then be seamlessly reconfigured by evicting faulty replicas, adding new ones and merging inconsistent states. We believe that this paper opens a direction towards asynchronous "self-healing" systems that combine accountability and reconfiguration.
Submitted 14 December, 2021; v1 submitted 11 May, 2021;
originally announced May 2021.
-
Automated Mathematical Equation Structure Discovery for Visual Analysis
Authors:
Caroline Pacheco do Espírito Silva,
José A. M. Felippe De Souza,
Antoine Vacavant,
Thierry Bouwmans,
Andrews Cordolino Sobral
Abstract:
Finding the best mathematical equation to deal with the different challenges found in complex scenarios requires a thorough understanding of the scenario and a trial-and-error process carried out by experts. In recent years, most state-of-the-art equation discovery methods have been widely applied in modeling and identification systems. However, equation discovery approaches can be very useful in computer vision, particularly in the field of feature extraction. In this paper, we focus on recent AI advances to present a novel framework for automatically discovering equations from scratch with little human intervention to deal with the different challenges encountered in real-world scenarios. In addition, our proposal can reduce human bias by proposing a search space design through a generative network instead of a hand-designed one. As a proof of concept, the equations discovered by our framework are used to distinguish moving objects from the background in video sequences. Experimental results show the potential of the proposed approach and its effectiveness in discovering the best equation in video sequences. The code and data are available at: https://github.com/carolinepacheco/equation-discovery-scene-analysis
Submitted 17 April, 2021;
originally announced April 2021.
-
Intel HEXL: Accelerating Homomorphic Encryption with Intel AVX512-IFMA52
Authors:
Fabian Boemer,
Sejun Kim,
Gelila Seifu,
Fillipe D. M. de Souza,
Vinodh Gopal
Abstract:
Modern implementations of homomorphic encryption (HE) rely heavily on polynomial arithmetic over a finite field. This is particularly true of the CKKS, BFV, and BGV HE schemes. Two of the biggest performance bottlenecks in HE primitives and applications are polynomial modular multiplication and the forward and inverse number-theoretic transform (NTT). Here, we introduce Intel Homomorphic Encryption Acceleration Library (Intel HEXL), a C++ library which provides optimized implementations of polynomial arithmetic for Intel processors. Intel HEXL takes advantage of the recent Intel Advanced Vector Extensions 512 (Intel AVX512) instruction set to provide state-of-the-art implementations of the NTT and modular multiplication. On the forward and inverse NTT, Intel HEXL provides up to 7.2x and 6.7x speedup, respectively, over a native C++ implementation. Intel HEXL also provides up to 6.0x speedup on the element-wise vector-vector modular multiplication, and 1.7x speedup on the element-wise vector-scalar modular multiplication. Intel HEXL is available open-source at https://github.com/intel/hexl under the Apache 2.0 license and has been adopted by the Microsoft SEAL and PALISADE homomorphic encryption libraries.
Submitted 9 July, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
-
Copycat CNN: Are Random Non-Labeled Data Enough to Steal Knowledge from Black-box Models?
Authors:
Jacson Rodrigues Correia-Silva,
Rodrigo F. Berriel,
Claudine Badue,
Alberto F. De Souza,
Thiago Oliveira-Santos
Abstract:
Convolutional neural networks have been successful lately, enabling companies to develop neural-based products, which demand an expensive process involving data acquisition and annotation, and model generation, usually requiring experts. With all these costs, companies are concerned about the security of their models against copies and deliver them as black boxes accessed by APIs. Nonetheless, we argue that even black-box models still have some vulnerabilities. In a preliminary work, we presented a simple, yet powerful, method to copy black-box models by querying them with natural random images. In this work, we consolidate and extend the copycat method: (i) some constraints are waived; (ii) an extensive evaluation with several problems is performed; (iii) models are copied between different architectures; and (iv) a deeper analysis is performed by looking at the copycat behavior. Results show that natural random images are effective for generating copycats for several problems.
Submitted 21 January, 2021;
originally announced January 2021.
-
Deep traffic light detection by overlaying synthetic context on arbitrary natural images
Authors:
Jean Pablo Vieira de Mello,
Lucas Tabelini,
Rodrigo F. Berriel,
Thiago M. Paixão,
Alberto F. de Souza,
Claudine Badue,
Nicu Sebe,
Thiago Oliveira-Santos
Abstract:
Deep neural networks come as an effective solution to many problems associated with autonomous driving. By providing real image samples with traffic context to the network, the model learns to detect and classify elements of interest, such as pedestrians, traffic signs, and traffic lights. However, acquiring and annotating real data can be extremely costly in terms of time and effort. In this context, we propose a method to generate artificial traffic-related training data for deep traffic light detectors. This data is generated using basic non-realistic computer graphics to blend fake traffic scenes on top of arbitrary image backgrounds that are not related to the traffic domain. Thus, a large amount of training data can be generated without annotation efforts. Furthermore, the method also tackles the intrinsic data imbalance problem in traffic light datasets, caused mainly by the low number of samples of the yellow state. Experiments show that it is possible to achieve results comparable to those obtained with real training data from the problem domain, yielding an average mAP and an average F1-score which are each nearly 4 p.p. higher than the respective metrics obtained with a real-world reference model.
Submitted 10 December, 2020; v1 submitted 7 November, 2020;
originally announced November 2020.
-
Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection
Authors:
Lucas Tabelini,
Rodrigo Berriel,
Thiago M. Paixão,
Claudine Badue,
Alberto F. De Souza,
Thiago Oliveira-Santos
Abstract:
Modern lane detection methods have achieved remarkable performances in complex real-world scenarios, but many have issues maintaining real-time efficiency, which is important for autonomous vehicles. In this work, we propose LaneATT: an anchor-based deep lane detection model, which, akin to other generic deep object detectors, uses the anchors for the feature pooling step. Since lanes follow a regular pattern and are highly correlated, we hypothesize that in some cases global information may be crucial to infer their positions, especially in conditions such as occlusion, missing lane markers, and others. Thus, this work proposes a novel anchor-based attention mechanism that aggregates global information. The model was evaluated extensively on three of the most widely used datasets in the literature. The results show that our method outperforms the current state-of-the-art methods showing both higher efficacy and efficiency. Moreover, an ablation study is performed along with a discussion on efficiency trade-off options that are useful in practice.
Submitted 17 November, 2020; v1 submitted 22 October, 2020;
originally announced October 2020.
-
What is the Best Grid-Map for Self-Driving Cars Localization? An Evaluation under Diverse Types of Illumination, Traffic, and Environment
Authors:
Filipe Mutz,
Thiago Oliveira-Santos,
Avelino Forechi,
Karin S. Komati,
Claudine Badue,
Felipe M. G. França,
Alberto F. De Souza
Abstract:
The localization of self-driving cars is needed for several tasks such as keeping maps updated, tracking objects, and planning. Localization algorithms often take advantage of maps for estimating the car pose. Since maintaining and using several maps is computationally expensive, it is important to analyze which type of map is more adequate for each application. In this work, we provide data for such analysis by comparing the accuracy of a particle filter localization when using occupancy, reflectivity, color, or semantic grid maps. To the best of our knowledge, such evaluation is missing in the literature. For building semantic and colour grid maps, point clouds from a Light Detection and Ranging (LiDAR) sensor are fused with images captured by a front-facing camera. Semantic information is extracted from images with a deep neural network. Experiments are performed in varied environments, under diverse conditions of illumination and traffic. Results show that occupancy grid maps lead to more accurate localization, followed by reflectivity grid maps. In most scenarios, the localization with semantic grid maps kept the position tracking without catastrophic losses, but with errors from 2 to 3 times bigger than the previous. Colour grid maps led to inaccurate and unstable localization even using a robust metric, the entropy correlation coefficient, for comparing online data and the map.
Submitted 19 September, 2020;
originally announced September 2020.
-
Deep Traffic Sign Detection and Recognition Without Target Domain Real Images
Authors:
Lucas Tabelini,
Rodrigo Berriel,
Thiago M. Paixão,
Alberto F. De Souza,
Claudine Badue,
Nicu Sebe,
Thiago Oliveira-Santos
Abstract:
Deep learning has been successfully applied to several problems related to autonomous driving, often relying on large databases of real target-domain images for proper training. The acquisition of such real-world data is not always possible in the self-driving context, and sometimes their annotation is not feasible. Moreover, in many tasks, there is an intrinsic data imbalance that most learning-based methods struggle to cope with. Particularly, traffic sign detection is a challenging problem in which these three issues are seen altogether. To address these challenges, we propose a novel database generation method that requires only (i) arbitrary natural images, i.e., requires no real image from the target-domain, and (ii) templates of the traffic signs. The method does not aim at overcoming the training with real data, but to be a compatible alternative when the real data is not available. The effortlessly generated database is shown to be effective for the training of a deep detector on traffic signs from multiple countries. On large data sets, training with a fully synthetic data set almost matches the performance of training with a real one. When compared to training with a smaller data set of real images, training with synthetic images increased the accuracy by 12.25%. The proposed method also improves the performance of the detector when target-domain data are available.
Submitted 30 July, 2020;
originally announced August 2020.
-
Secondary Studies in the Academic Context: A Systematic Mapping and Survey
Authors:
Katia Romero Felizardo,
Érica Ferreira de Souza,
Bianca Minetto Napoleão,
Nandamudi Lankalapalli Vijaykumar,
Maria Teresa Baldassarre
Abstract:
Context: Several researchers have reported their experiences in applying secondary studies (Systematic Literature Reviews - SLRs and Systematic Mappings - SMs) in Software Engineering (SE). However, there is still a lack of studies discussing the value of performing secondary studies in an academic context. Goal: The main goal of this study is to provide an overview on the use of secondary studies in an academic context. Method: Two empirical research methods were used. Initially, we conducted an SM to identify the available and relevant studies on the use of secondary studies as a research methodology for conducting SE research projects. Secondly, a survey was performed with 64 SE researchers to identify their perception related to the value of performing secondary studies to support their research projects. Results: Our results show benefits of using secondary studies in the academic context, such as providing an overview of the literature, identifying relevant research literature in a research area, helping to explain why a research project should be approved for a grant, and/or supporting decisions made in a research project. Difficulties faced by SE graduate students with secondary studies are that they tend to be conducted by a team and that they demand more effort than a traditional review. Conclusions: Secondary studies are valuable to graduate students. They should consider conducting a secondary study for their research project due to the benefits and contributions provided to develop the overall project. However, the advice of an experienced supervisor is essential to avoid bias. In addition, the acquisition of skills can increase students' motivation to pursue their research projects and prepare them for either academic or industrial careers.
Submitted 10 July, 2020;
originally announced July 2020.
-
Self-supervised Deep Reconstruction of Mixed Strip-shredded Text Documents
Authors:
Thiago M. Paixão,
Rodrigo F. Berriel,
Maria C. S. Boeres,
Alessandro L. Koerich,
Claudine Badue,
Alberto F. de Souza,
Thiago Oliveira-Santos
Abstract:
The reconstruction of shredded documents consists of coherently arranging fragments of paper (shreds) to recover the original document(s). A great challenge in computational reconstruction is to properly evaluate the compatibility between the shreds. While traditional pixel-based approaches are not robust to real shredding, more sophisticated solutions compromise significantly time performance. The solution presented in this work extends our previous deep learning method for single-page reconstruction to a more realistic/complex scenario: the reconstruction of several mixed shredded documents at once. In our approach, the compatibility evaluation is modeled as a two-class (valid or invalid) pattern recognition problem. The model is trained in a self-supervised manner on samples extracted from simulated-shredded documents, which obviates manual annotation. Experimental results on three datasets -- including a new collection of 100 strip-shredded documents produced for this work -- have shown that the proposed method outperforms the competing ones on complex scenarios, achieving accuracy superior to 90%.
Submitted 1 July, 2020;
originally announced July 2020.
-
PolyLaneNet: Lane Estimation via Deep Polynomial Regression
Authors:
Lucas Tabelini,
Rodrigo Berriel,
Thiago M. Paixão,
Claudine Badue,
Alberto F. De Souza,
Thiago Oliveira-Santos
Abstract:
One of the main factors that contributed to the large advances in autonomous driving is the advent of deep learning. For safer self-driving vehicles, one of the problems that has yet to be solved completely is lane detection. Since methods for this task have to work in real-time (+30 FPS), they not only have to be effective (i.e., have high accuracy) but they also have to be efficient (i.e., fast). In this work, we present a novel method for lane detection that uses as input an image from a forward-looking camera mounted in the vehicle and outputs polynomials representing each lane marking in the image, via deep polynomial regression. The proposed method is shown to be competitive with existing state-of-the-art methods in the TuSimple dataset while maintaining its efficiency (115 FPS). Additionally, extensive qualitative results on two additional public datasets are presented, alongside with limitations in the evaluation metrics used by recent works for lane detection. Finally, we provide source code and trained models that allow others to replicate all the results shown in this paper, which is surprisingly rare in state-of-the-art lane detection methods. The full source code and pretrained models are available at https://github.com/lucastabelini/PolyLaneNet.
Submitted 14 July, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
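As a rough illustration of the PolyLaneNet output described above, each lane marking can be stored as polynomial coefficients and turned back into image points by evaluating the polynomial at chosen rows. This sketch assumes the horizontal coordinate is modeled as a polynomial in the vertical coordinate; the function name, coefficient order, and sample values are hypothetical and are not taken from the authors' code.

```python
def lane_points(coeffs, ys):
    """Evaluate x = c0 + c1*y + c2*y**2 + ... at each row y using Horner's rule.

    coeffs: polynomial coefficients, lowest degree first (an assumed convention).
    ys: image rows at which to sample the lane marking.
    """
    points = []
    for y in ys:
        x = 0.0
        for c in reversed(coeffs):
            x = x * y + c
        points.append((x, y))
    return points


# Hypothetical coefficients for a single lane marking.
example_coeffs = [100.0, 0.5, 0.001]
example_points = lane_points(example_coeffs, [0, 100, 200])
print(example_points)
```

A compact parametric output like this is what lets a regression-style detector avoid per-pixel post-processing, which is one plausible reason for the reported speed.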
-
Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning
Authors:
Thiago M. Paixão,
Rodrigo F. Berriel,
Maria C. S. Boeres,
Alessandro L. Koerich,
Claudine Badue,
Alberto F. De Souza,
Thiago Oliveira-Santos
Abstract:
The reconstruction of shredded documents consists in arranging the pieces of paper (shreds) in order to reassemble the original aspect of such documents. This task is particularly relevant for supporting forensic investigation as documents may contain criminal evidence. As an alternative to the laborious and time-consuming manual process, several researchers have been investigating ways to perform automatic digital reconstruction. A central problem in automatic reconstruction of shredded documents is the pairwise compatibility evaluation of the shreds, notably for binary text documents. In this context, deep learning has enabled great progress for accurate reconstructions in the domain of mechanically-shredded documents. A sensitive issue, however, is that current deep model solutions require an inference whenever a pair of shreds has to be evaluated. This work proposes a scalable deep learning approach for measuring pairwise compatibility in which the number of inferences scales linearly (rather than quadratically) with the number of shreds. Instead of predicting compatibility directly, deep models are leveraged to asymmetrically project the raw shred content onto a common metric space in which distance is proportional to the compatibility. Experimental results show that our method has accuracy comparable to the state-of-the-art with a speed-up of about 22 times for a test instance with 505 shreds (20 mixed shredded-pages from different documents).
Submitted 28 April, 2020; v1 submitted 22 March, 2020;
originally announced March 2020.
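The scaling argument in the abstract above can be sketched as follows: if each shred is embedded once per side, the number of model inferences grows linearly with the number of shreds, and all pairwise scores come from cheap distance computations. The embedding functions below are hypothetical stand-ins that merely read stored vectors and count calls; treating a smaller distance as higher compatibility is also an assumption of this sketch.

```python
import math

inference_calls = {"n": 0}


def embed_right(shred):
    """Stand-in for a learned model that embeds a shred's right border."""
    inference_calls["n"] += 1
    return shred["right"]


def embed_left(shred):
    """Stand-in for a second (asymmetric) model that embeds the left border."""
    inference_calls["n"] += 1
    return shred["left"]


def distance_matrix(shreds):
    """Embed every shred once per side, then compare all ordered pairs."""
    rights = [embed_right(s) for s in shreds]
    lefts = [embed_left(s) for s in shreds]
    # dist[i][j]: distance between the right border of i and the left border of j.
    return [[math.dist(r, l) for l in lefts] for r in rights]


# Toy shreds whose borders are already 2-D vectors (hypothetical data).
shreds = [
    {"left": (0.0, 0.0), "right": (1.0, 1.0)},
    {"left": (1.0, 1.1), "right": (2.0, 2.0)},
    {"left": (2.1, 2.0), "right": (3.0, 3.0)},
]
dist = distance_matrix(shreds)
print(inference_calls["n"], "inferences for", len(shreds), "shreds")
```

With n shreds this performs 2n inferences, whereas scoring every ordered pair directly with a model would need on the order of n² inferences.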
-
Establishing a Search String to Detect Secondary Studies in Software Engineering
Authors:
Bianca Minetto Napoleao,
Katia Romero Felizardo,
Erica Ferreira de Souza,
Fabio Petrillo,
Nandamudi L. Vijaykumar,
Elisa Yumi Nakagawa,
Sylvain Halle
Abstract:
Context: A tertiary study can be performed to identify related reviews on a topic of interest. However, the elaboration of an appropriate and effective search string to detect secondary studies is challenging for Software Engineering (SE) researchers. Objective: The main goal of this study is to propose a suitable search string to detect secondary studies in SE, addressing issues such as the quantity of applied terms, relevance, recall and precision. Method: We analyzed seven tertiary studies under two perspectives: (1) structure -- strings' terms to detect secondary studies; and (2) field: where searching -- titles alone or abstracts alone or titles and abstracts together, among others. We validate our string by performing a two-step validation process. Firstly, we evaluated the capability to retrieve secondary studies over a set of 1537 secondary studies included in 24 tertiary studies in SE. Secondly, we evaluated the general capacity of retrieving secondary studies over an automated search using the Scopus digital library. Results: Our string was capable of retrieving over 90% of the included secondary studies (recall) with a high general precision of almost 60%. Conclusion: The suitable search string for finding secondary studies in SE contains the terms "systematic review", "literature review", "systematic mapping", "mapping study" and "systematic map".
Submitted 8 June, 2022; v1 submitted 18 December, 2019;
originally announced December 2019.
-
Bio-Inspired Foveated Technique for Augmented-Range Vehicle Detection Using Deep Neural Networks
Authors:
Pedro Azevedo,
Sabrina S. Panceri,
Rânik Guidolini,
Vinicius B. Cardoso,
Claudine Badue,
Thiago Oliveira-Santos,
Alberto F. De Souza
Abstract:
We propose a bio-inspired foveated technique to detect cars in a long range camera view using a deep convolutional neural network (DCNN) for the IARA self-driving car. The DCNN receives as input (i) an image, which is captured by a camera installed on IARA's roof; and (ii) crops of the image, which are centered in the waypoints computed by IARA's path planner and whose sizes increase with the distance from IARA. We employ an overlap filter to discard detections of the same car in different crops of the same image based on the percentage of overlap of detections' bounding boxes. We evaluated the performance of the proposed augmented-range vehicle detection system (ARVDS) using the hardware and software infrastructure available in the IARA self-driving car. Using IARA, we captured thousands of images of real traffic situations containing cars in a long range. Experimental results show that ARVDS increases the Average Precision (AP) of long range car detection from 29.51% (using a single whole image) to 63.15%.
Submitted 2 October, 2019;
originally announced October 2019.
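The overlap filter mentioned in the abstract above can be sketched as a simple duplicate-suppression pass over bounding boxes. The abstract only says that detections are discarded based on the percentage of overlap; using intersection over union as the overlap measure, keeping the highest-scoring box, and the 0.5 threshold are all assumptions of this sketch, and it assumes all boxes are already expressed in the coordinates of the full image.

```python
def overlap_ratio(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_duplicates(detections, threshold=0.5):
    """Keep the highest-scoring detection among heavily overlapping ones."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(overlap_ratio(det["box"], k["box"]) < threshold for k in kept):
            kept.append(det)
    return kept


# Hypothetical detections: the first two come from different crops of one car.
detections = [
    {"box": (10, 10, 50, 50), "score": 0.9},
    {"box": (12, 12, 52, 52), "score": 0.8},
    {"box": (200, 200, 240, 240), "score": 0.7},
]
kept = filter_duplicates(detections)
print(len(kept), "detections kept")
```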
-
Performance of Devito on HPC-Optimised ARM Processors
Authors:
Hermes Senger,
Jaime F. de Souza,
Edson S. Gomi,
Fabio Luporini,
Gerard J. Gorman
Abstract:
We evaluate the performance of Devito, a domain specific language (DSL) for finite differences on Arm ThunderX2 processors. Experiments with two common seismic computational kernels demonstrate that Arm processors can deliver competitive performance compared to other Intel Xeon processors.
Submitted 19 August, 2019; v1 submitted 9 August, 2019;
originally announced August 2019.
-
Effortless Deep Training for Traffic Sign Detection Using Templates and Arbitrary Natural Images
Authors:
Lucas Tabelini Torres,
Thiago M. Paixão,
Rodrigo F. Berriel,
Alberto F. De Souza,
Claudine Badue,
Nicu Sebe,
Thiago Oliveira-Santos
Abstract:
Deep learning has been successfully applied to several problems related to autonomous driving. Often, these solutions rely on large networks that require databases of real image samples of the problem (i.e., real world) for proper training. The acquisition of such real-world data sets is not always possible in the autonomous driving context, and sometimes their annotation is not feasible (e.g., takes too long or is too expensive). Moreover, in many tasks, there is an intrinsic data imbalance that most learning-based methods struggle to cope with. It turns out that traffic sign detection is a problem in which these three issues are seen altogether. In this work, we propose a novel database generation method that requires only (i) arbitrary natural images, i.e., requires no real image from the domain of interest, and (ii) templates of the traffic signs, i.e., templates synthetically created to illustrate the appearance of the category of a traffic sign. The effortlessly generated training database is shown to be effective for the training of a deep detector (such as Faster R-CNN) on German traffic signs, achieving 95.66% of mAP on average. In addition, the proposed method is able to detect traffic signs with an average precision, recall and F1-score of about 94%, 91% and 93%, respectively. The experiments surprisingly show that detectors can be trained with simple data generation methods and without problem domain data for the background, which is in the opposite direction of the common sense for deep learning.
Submitted 22 July, 2019;
originally announced July 2019.
-
Cross-Domain Car Detection Using Unsupervised Image-to-Image Translation: From Day to Night
Authors:
Vinicius F. Arruda,
Thiago M. Paixão,
Rodrigo F. Berriel,
Alberto F. De Souza,
Claudine Badue,
Nicu Sebe,
Thiago Oliveira-Santos
Abstract:
Deep learning techniques have enabled the emergence of state-of-the-art models to address object detection tasks. However, these techniques are data-driven, delegating the accuracy to the training dataset, which must resemble the images in the target task. The acquisition of a dataset involves annotating images, an arduous and expensive process, generally requiring time and manual effort. Thus, a challenging scenario arises when the target domain of application has no annotated dataset available, forcing tasks in such situations to rely on a training dataset from a different domain. Sharing this issue, object detection is a vital task for autonomous vehicles, where the large number of driving scenarios yields several domains of application requiring annotated data for the training process. In this work, a method for training a car detection system with annotated data from a source domain (day images) without requiring the image annotations of the target domain (night images) is presented. For that, a model based on Generative Adversarial Networks (GANs) is explored to enable the generation of an artificial dataset with its respective annotations. The artificial dataset (fake dataset) is created by translating images from the day-time domain to the night-time domain. The fake dataset, which comprises annotated images of only the target domain (night images), is then used to train the car detector model. Experimental results showed that the proposed method achieved significant and consistent improvements, including an increase of more than 10% in detection performance when compared to training with only the available annotated data (i.e., day images).
Submitted 19 July, 2019;
originally announced July 2019.
-
Traffic Light Recognition Using Deep Learning and Prior Maps for Autonomous Cars
Authors:
Lucas C. Possatti,
Rânik Guidolini,
Vinicius B. Cardoso,
Rodrigo F. Berriel,
Thiago M. Paixão,
Claudine Badue,
Alberto F. De Souza,
Thiago Oliveira-Santos
Abstract:
Autonomous terrestrial vehicles must be capable of perceiving traffic lights and recognizing their current states to share the streets with human drivers. Most of the time, human drivers can easily identify the relevant traffic lights. To deal with this issue, a common solution for autonomous cars is to integrate recognition with prior maps. However, an additional solution is required for the detection and recognition of the traffic light. Deep learning techniques have shown great performance and power of generalization, including in traffic-related problems. Motivated by the advances in deep learning, some recent works leveraged state-of-the-art deep detectors to locate (and further recognize) traffic lights from 2D camera images. However, none of them combine the power of the deep learning-based detectors with prior maps to recognize the state of the relevant traffic lights. Based on that, this work proposes to integrate the power of deep learning-based detection with the prior maps used by our car platform IARA (acronym for Intelligent Autonomous Robotic Automobile) to recognize the relevant traffic lights of predefined routes. The process is divided into two phases: an offline phase for map construction and traffic lights annotation; and an online phase for traffic light recognition and identification of the relevant ones. The proposed system was evaluated on five test cases (routes) in the city of Vitória, each case being composed of a video sequence and a prior map with the relevant traffic lights for the route. Results showed that the proposed technique is able to correctly identify the relevant traffic light along the trajectory.
Submitted 4 June, 2019;
originally announced June 2019.
-
Self-Driving Cars: A Survey
Authors:
Claudine Badue,
Rânik Guidolini,
Raphael Vivacqua Carneiro,
Pedro Azevedo,
Vinicius Brito Cardoso,
Avelino Forechi,
Luan Jesus,
Rodrigo Berriel,
Thiago Paixão,
Filipe Mutz,
Lucas Veronese,
Thiago Oliveira-Santos,
Alberto Ferreira De Souza
Abstract:
We survey research on self-driving cars published in the literature focusing on autonomous cars developed since the DARPA challenges, which are equipped with an autonomy system that can be categorized as SAE level 3 or higher. The architecture of the autonomy system of self-driving cars is typically organized into the perception system and the decision-making system. The perception system is generally divided into many subsystems responsible for tasks such as self-driving-car localization, static obstacles mapping, moving obstacles detection and tracking, road mapping, traffic signalization detection and recognition, among others. The decision-making system is commonly partitioned as well into many subsystems responsible for tasks such as route planning, path planning, behavior selection, motion planning, and control. In this survey, we present the typical architecture of the autonomy system of self-driving cars. We also review research on relevant methods for perception and decision making. Furthermore, we present a detailed description of the architecture of the autonomy system of the self-driving car developed at the Universidade Federal do Espírito Santo (UFES), named Intelligent Autonomous Robotics Automobile (IARA). Finally, we list prominent self-driving car research platforms developed by academia and technology companies, and reported in the media.
Submitted 2 October, 2019; v1 submitted 14 January, 2019;
originally announced January 2019.
-
Memory-like Map Decay for Autonomous Vehicles based on Grid Maps
Authors:
Thomas Teixeira,
Filipe Mutz,
Karin Satie Komati,
Lucas Veronese,
Vinicius B. Cardoso,
Claudine Badue,
Thiago Oliveira-Santos,
Alberto F. De Souza
Abstract:
In this work, we present a novel strategy for correcting imperfections in occupancy grid maps called map decay. The objective of map decay is to correct invalid occupancy probabilities of map cells that are unobservable by sensors. The strategy was inspired by an analogy between the memory architecture believed to exist in the human brain and the maps maintained by an autonomous vehicle. It consists in merging sensory information obtained during runtime (online) with a priori data from a high-precision map constructed offline. In map decay, cells observed by sensors are updated using traditional occupancy grid mapping techniques and unobserved cells are adjusted so that their occupancy probabilities tend to the values found in the offline map. This strategy is grounded in the idea that the most precise information available about an unobservable cell is the value found in the high-precision offline map. Map decay was successfully tested and is still in use in the IARA autonomous vehicle from Universidade Federal do Espírito Santo.
Submitted 27 April, 2021; v1 submitted 4 October, 2018;
originally announced October 2018.
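The map decay idea summarized above, where unobserved cells drift toward the offline map, can be sketched as a per-cell relaxation. The abstract does not give the update rule, so the linear blend and the `rate` parameter below are assumptions; observed cells are simply left untouched here, standing in for the usual occupancy grid update.

```python
def decay_step(online, offline, observed, rate=0.1):
    """Move unobserved cells a fraction `rate` toward the offline map value.

    online, offline: occupancy probabilities per cell (flat lists).
    observed: booleans, True where a sensor saw the cell in this step.
    """
    new_map = []
    for p, p_offline, seen in zip(online, offline, observed):
        if seen:
            # Placeholder: a real system would apply its sensor update here.
            new_map.append(p)
        else:
            new_map.append(p + rate * (p_offline - p))
    return new_map


# Hypothetical two-cell map: only the first cell remains visible to sensors.
grid = [0.9, 0.9]
offline_map = [0.1, 0.1]
for _ in range(50):
    grid = decay_step(grid, offline_map, observed=[True, False])
print(grid)
```

After repeated steps the hidden cell approaches the offline value while the visible cell keeps its sensor-derived estimate.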
-
Ego-Lane Analysis System (ELAS): Dataset and Algorithms
Authors:
Rodrigo F. Berriel,
Edilson de Aguiar,
Alberto F. de Souza,
Thiago Oliveira-Santos
Abstract:
Decreasing costs of vision sensors and advances in embedded hardware boosted lane-related research (detection, estimation, and tracking) in the past two decades. The interest in this topic has increased even more with the demand for advanced driver assistance systems (ADAS) and self-driving cars. Although extensively studied independently, there is still need for studies that propose a combined solution for the multiple problems related to the ego-lane, such as lane departure warning (LDW), lane change detection, lane marking type (LMT) classification, road markings detection and classification, and detection of adjacent lanes (i.e., immediate left and right lanes) presence. In this paper, we propose a real-time Ego-Lane Analysis System (ELAS) capable of estimating ego-lane position, classifying LMTs and road markings, performing LDW and detecting lane change events. The proposed vision-based system works on a temporal sequence of images. Lane marking features are extracted in perspective and Inverse Perspective Mapping (IPM) images that are combined to increase robustness. The final estimated lane is modeled as a spline using a combination of methods (Hough lines with Kalman filter and spline with particle filter). Based on the estimated lane, all other events are detected. To validate ELAS and cover the lack of lane datasets in the literature, a new dataset with more than 20 different scenes (in more than 15,000 frames) and considering a variety of scenarios (urban road, highways, traffic, shadows, etc.) was created. The dataset was manually annotated and made publicly available to enable evaluation of several events that are of interest for the research community (i.e., lane estimation, change, and centering; road markings; intersections; LMTs; crosswalks and adjacent lanes). ELAS achieved high detection rates in all real-world events and proved to be ready for real-time applications.
Submitted 15 June, 2018;
originally announced June 2018.
-
Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data
Authors:
Jacson Rodrigues Correia-Silva,
Rodrigo F. Berriel,
Claudine Badue,
Alberto F. de Souza,
Thiago Oliveira-Santos
Abstract:
In the past few years, Convolutional Neural Networks (CNNs) have been achieving state-of-the-art performance on a variety of problems. Many companies employ resources and money to generate these models and provide them as an API, therefore it is in their best interest to protect them, i.e., to avoid that someone else copies them. Recent studies revealed that state-of-the-art CNNs are vulnerable to adversarial examples attacks, and this weakness indicates that CNNs do not need to operate in the problem domain (PD). Therefore, we hypothesize that they also do not need to be trained with examples of the PD in order to operate in it.
Given these facts, in this paper, we investigate if a target black-box CNN can be copied by persuading it to confess its knowledge through random non-labeled data. The copy is two-fold: i) the target network is queried with random data and its predictions are used to create a fake dataset with the knowledge of the network; and ii) a copycat network is trained with the fake dataset and should be able to achieve similar performance as the target network.
This hypothesis was evaluated locally in three problems (facial expression, object, and crosswalk classification) and against a cloud-based API. In the copy attacks, images from both non-problem domain and PD were used. All copycat networks achieved at least 93.7% of the performance of the original models with non-problem domain data, and at least 98.6% using additional data from the PD. Additionally, the copycat CNN successfully copied at least 97.3% of the performance of the Microsoft Azure Emotion API. Our results show that it is possible to create a copycat CNN by simply querying a target network as black-box with random non-labeled data.
Submitted 14 June, 2018;
originally announced June 2018.
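The two-step copy procedure described in the Copycat CNN abstract above (query the black box with random unlabeled data, then train a copy on the returned labels) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the target is a toy function with a hidden linear rule rather than a CNN or a cloud API, the copy is a perceptron rather than a CNN, and the names `target_predict`, `random_queries`, and `train_copycat` are hypothetical.

```python
import random


def target_predict(x):
    """Stand-in black box: the attacker only ever sees the returned label."""
    # Hidden decision rule (hypothetical); a real target would be a remote model.
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0


def random_queries(n, rng):
    """Random unlabeled inputs, drawn without any knowledge of the problem domain."""
    return [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]


def copycat_predict(model, x):
    """Label predicted by the copy, a linear threshold unit."""
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_copycat(samples, labels, epochs=100):
    """Step (ii): fit a perceptron to the stolen labels (stand-in for a CNN)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if copycat_predict((w, b), x) != y:
                step = 1.0 if y == 1 else -1.0
                w[0] += step * x[0]
                w[1] += step * x[1]
                b += step
                mistakes += 1
        if mistakes == 0:  # every stolen label is reproduced
            break
    return w, b


rng = random.Random(0)

# Step (i): query the target with random data and record its predictions.
queries = random_queries(500, rng)
fake_labels = [target_predict(x) for x in queries]

# Step (ii): train the copy on the fake dataset.
model = train_copycat(queries, fake_labels)

# Measure how often the copy agrees with the target on fresh inputs.
test_points = random_queries(1000, rng)
agreement = sum(
    copycat_predict(model, x) == target_predict(x) for x in test_points
) / len(test_points)
print(f"agreement with target: {agreement:.3f}")
```

The sketch reports agreement with the target rather than accuracy on ground truth, since in this setting the attacker has no true labels, only the target's answers.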
-
Automatic Large-Scale Data Acquisition via Crowdsourcing for Crosswalk Classification: A Deep Learning Approach
Authors:
Rodrigo F. Berriel,
Franco Schmidt Rossi,
Alberto F. de Souza,
Thiago Oliveira-Santos
Abstract:
Correctly identifying crosswalks is an essential task for the driving activity and mobility autonomy. Many crosswalk classification, detection and localization systems have been proposed in the literature over the years. These systems use different perspectives to tackle the crosswalk classification problem: satellite imagery, cockpit view (from the top of a car or behind the windshield), and pedestrian perspective. Most of the works in the literature are designed and evaluated using small and local datasets, i.e. datasets that present low diversity. Scaling to large datasets imposes a challenge for the annotation procedure. Moreover, there is still need for cross-database experiments in the literature because it is usually hard to collect the data in the same place and conditions of the final application. In this paper, we present a crosswalk classification system based on deep learning. For that, crowdsourcing platforms, such as OpenStreetMap and Google Street View, are exploited to enable automatic training via automatic acquisition and annotation of a large-scale database. Additionally, this work proposes a comparison study of models trained using fully-automatic data acquisition and annotation against models that were partially annotated. Cross-database experiments were also included in the experimentation to show that the proposed methods enable use with real world applications. Our results show that the model trained on the fully-automatic database achieved high overall accuracy (94.12%), and that a statistically significant improvement (to 96.30%) can be achieved by manually annotating a specific part of the database. Finally, the results of the cross-database experiments show that both models are robust to the many variations of image and scenarios, presenting a consistent behavior.
Submitted 30 May, 2018;
originally announced May 2018.
-
Visual Global Localization with a Hybrid WNN-CNN Approach
Authors:
Avelino Forechi,
Thiago Oliveira-Santos,
Claudine Badue,
Alberto F. De Souza
Abstract:
Currently, self-driving cars rely greatly on the Global Positioning System (GPS) infrastructure, although there is an increasing demand for alternative methods for GPS-denied environments. One of them is known as place recognition, which associates images of places with their corresponding positions. We previously proposed systems based on Weightless Neural Networks (WNN) to address this problem as a classification task. This encompasses solely one part of the global localization, which is not precise enough for driverless cars. Instead of just recognizing past places and outputting their poses, it is desired that a global localization system estimates the pose of current place images. In this paper, we propose to tackle this problem as follows. Firstly, given a live image, the place recognition system returns the most similar image and its pose. Then, given live and recollected images, a visual localization system outputs the relative camera pose represented by those images. To estimate the relative camera pose between the recollected and the current images, a Convolutional Neural Network (CNN) is trained with the two images as input and a relative pose vector as output. Together, these systems solve the global localization problem using the topological and metric information to approximate the current vehicle pose. The full approach is compared to a Real-Time Kinematic GPS system and a Simultaneous Localization and Mapping (SLAM) system. Experimental results show that the proposed approach correctly localizes a vehicle 90% of the time with a mean error of 1.20 m compared to 1.12 m of the SLAM system and 0.37 m of the GPS, 89% of the time.
Submitted 14 May, 2018; v1 submitted 8 May, 2018;
originally announced May 2018.
-
Mapping Road Lanes Using Laser Remission and Deep Neural Networks
Authors:
Raphael V. Carneiro,
Rafael C. Nascimento,
Rânik Guidolini,
Vinicius B. Cardoso,
Thiago Oliveira-Santos,
Claudine Badue,
Alberto F. De Souza
Abstract:
We propose the use of deep neural networks (DNN) for solving the problem of inferring the position and relevant properties of lanes of urban roads with poor or absent horizontal signalization, in order to allow the operation of autonomous cars in such situations. We take a segmentation approach to the problem and use the Efficient Neural Network (ENet) DNN for segmenting LiDAR remission grid maps into road maps. We represent road maps using what we called road grid maps. Road grid maps are square matrixes and each element of these matrixes represents a small square region of real-world space. The value of each element is a code associated with the semantics of the road map. Our road grid maps contain all information about the roads' lanes required for building the Road Definition Data Files (RDDFs) that are necessary for the operation of our autonomous car, IARA (Intelligent Autonomous Robotic Automobile). We have built a dataset of tens of kilometers of manually marked road lanes and used part of it to train ENet to segment road grid maps from remission grid maps. After being trained, ENet achieved an average segmentation accuracy of 83.7%. We have tested the use of inferred road grid maps in the real world using IARA on a stretch of 3.7 km of urban roads and it has shown performance equivalent to that of the previous IARA's subsystem that uses a manually generated RDDF.
Submitted 27 April, 2018;
originally announced April 2018.
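The road grid map representation described in the abstract above (a square matrix whose elements each cover a small square region of the world and hold a semantic code) can be sketched as a simple data structure. This is a minimal illustration under stated assumptions: the code values, the cell resolution, and the names `RoadGridMap`, `cell_index`, `set_code`, and `get_code` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical semantic codes; the paper's actual code table is not given here.
OFF_ROAD = 0
LANE_CENTER = 1
LANE_EDGE = 2


@dataclass
class RoadGridMap:
    size: int              # number of cells per side (the matrix is square)
    resolution: float      # side length of one cell in metres (assumed value)
    origin_x: float = 0.0  # world coordinates of the grid's lower corner
    origin_y: float = 0.0
    cells: list = field(init=False)

    def __post_init__(self):
        # Every cell starts as off-road until a code is written to it.
        self.cells = [[OFF_ROAD] * self.size for _ in range(self.size)]

    def cell_index(self, x, y):
        """Map a world position in metres to the (row, col) of its cell."""
        col = int((x - self.origin_x) // self.resolution)
        row = int((y - self.origin_y) // self.resolution)
        if not (0 <= row < self.size and 0 <= col < self.size):
            raise ValueError("position lies outside the grid")
        return row, col

    def set_code(self, x, y, code):
        row, col = self.cell_index(x, y)
        self.cells[row][col] = code

    def get_code(self, x, y):
        row, col = self.cell_index(x, y)
        return self.cells[row][col]


# A 10 x 10 grid with 0.2 m cells covers a 2 m x 2 m patch of road.
grid = RoadGridMap(size=10, resolution=0.2)
grid.set_code(1.05, 0.3, LANE_CENTER)
print(grid.cell_index(1.05, 0.3), grid.get_code(1.1, 0.35))
```

Positions that fall in the same cell read back the same code, which is what lets a segmentation network's per-cell output be queried by world coordinates.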
-
Going Deeper with Semantics: Video Activity Interpretation using Semantic Contextualization
Authors:
Sathyanarayanan N. Aakur,
Fillipe DM de Souza,
Sudeep Sarkar
Abstract:
A deeper understanding of video activities extends beyond recognition of underlying concepts such as actions and objects: constructing deep semantic representations requires reasoning about the semantic relationships among these concepts, often beyond what is directly observed in the data. To this end, we propose an energy minimization framework that leverages large-scale commonsense knowledge bases, such as ConceptNet, to provide contextual cues to establish semantic relationships among entities directly hypothesized from the video signal. We mathematically express this using the language of Grenander's canonical pattern generator theory. We show that the use of prior encoded commonsense knowledge alleviates the need for large annotated training datasets and helps tackle imbalance in training through prior knowledge. Using three different publicly available datasets (Charades, Microsoft Visual Description Corpus and Breakfast Actions), we show that the proposed model can generate video interpretations whose quality is better than those reported by state-of-the-art approaches, which have substantial training needs. Through extensive experiments, we show that the use of commonsense knowledge from ConceptNet allows the proposed approach to handle various challenges such as training data imbalance, weak features, and complex semantic relationships and visual scenes.
Submitted 15 November, 2018; v1 submitted 11 August, 2017;
originally announced August 2017.
-
Deep Learning Based Large-Scale Automatic Satellite Crosswalk Classification
Authors:
Rodrigo F. Berriel,
Andre Teixeira Lopes,
Alberto F. de Souza,
Thiago Oliveira-Santos
Abstract:
High-resolution satellite imagery has been increasingly used in remote sensing classification problems. One of the main factors is the availability of this kind of data. Even so, very little effort has been devoted to the zebra crossing classification problem. In this letter, crowdsourcing systems are exploited in order to enable the automatic acquisition and annotation of a large-scale satellite imagery database for crosswalk-related tasks. Then, this dataset is used to train deep-learning-based models in order to accurately classify satellite images according to whether or not they contain zebra crossings. A novel dataset with more than 240,000 images from 3 continents, 9 countries and more than 20 cities was used in the experiments. Experimental results showed that freely available crowdsourcing data can be used to accurately (97.11%) train robust models to perform crosswalk classification on a global scale.
Submitted 5 July, 2017; v1 submitted 28 June, 2017;
originally announced June 2017.
-
A Model-Predictive Motion Planner for the IARA Autonomous Car
Authors:
Vinicius Cardoso,
Josias Oliveira,
Thomas Teixeira,
Claudine Badue,
Filipe Mutz,
Thiago Oliveira-Santos,
Lucas Veronese,
Alberto F. De Souza
Abstract:
We present the Model-Predictive Motion Planner (MPMP) of the Intelligent Autonomous Robotic Automobile (IARA). IARA is a fully autonomous car that uses a path planner to compute a path from its current position to the desired destination. Using this path, the current position, a goal in the path and a map, IARA's MPMP is able to compute smooth trajectories from its current position to the goal in less than 50 ms. MPMP computes the poses of these trajectories so that they follow the path closely and, at the same time, keep a safe distance from any obstacles. Our experiments have shown that MPMP is able to compute trajectories that precisely follow a path produced by a human driver (distance of 0.15 m on average) while smoothly driving IARA at speeds of up to 32.4 km/h (9 m/s).
Submitted 9 November, 2017; v1 submitted 14 November, 2016;
originally announced November 2016.
-
A Dataset for Improved RGBD-based Object Detection and Pose Estimation for Warehouse Pick-and-Place
Authors:
Colin Rennie,
Rahul Shome,
Kostas E. Bekris,
Alberto F. De Souza
Abstract:
An important logistics application of robotics involves manipulators that pick-and-place objects placed in warehouse shelves. A critical aspect of this task corresponds to detecting the pose of a known object in the shelf using visual data. Solving this problem can be assisted by the use of an RGB-D sensor, which also provides depth information beyond visual data. Nevertheless, it remains a challenging problem since multiple issues need to be addressed, such as low illumination inside shelves, clutter, texture-less and reflective objects as well as the limitations of depth sensors. This paper provides a new rich data set for advancing the state-of-the-art in RGBD-based 3D object pose estimation, which is focused on the challenges that arise when solving warehouse pick-and-place tasks. The publicly available data set includes thousands of images and corresponding ground truth data for the objects used during the first Amazon Picking Challenge at different poses and clutter conditions. Each image is accompanied with ground truth information to assist in the evaluation of algorithms for object detection. To show the utility of the data set, a recent algorithm for RGBD-based pose estimation is evaluated in this paper. Based on the measured performance of the algorithm on the data set, various modifications and improvements are applied to increase the accuracy of detection. These steps can be easily applied to a variety of different methodologies for object pose detection and improve performance in the domain of warehouse pick-and-place.
Submitted 20 February, 2016; v1 submitted 3 September, 2015;
originally announced September 2015.
-
Wavelet Analysis as an Information Processing Technique
Authors:
H. M. de Oliveira,
D. F. de Souza
Abstract:
A new interpretation of wavelet analysis is reported, in which it can be viewed as an information processing technique. It was recently proposed that every basic wavelet could be associated with a proper probability density, allowing the entropy of a wavelet to be defined. Introducing now the concept of wavelet mutual information between a signal and an analysing wavelet completes the foundations of a wavelet information theory (WIT). Both continuous and discrete time signals are considered. Finally, we show how to compute the information provided by a multiresolution analysis by means of the inhomogeneous wavelet expansion. The key ideas behind the WIT are presented.
Submitted 20 February, 2015;
originally announced February 2015.
-
Content-Based Filtering for Video Sharing Social Networks
Authors:
Eduardo Valle,
Sandra de Avila,
Antonio da Luz Jr.,
Fillipe de Souza,
Marcelo Coelho,
Arnaldo Araújo
Abstract:
In this paper we compare the use of several features in the task of content filtering for video social networks, a very challenging task, not only because the unwanted content is related to very high-level semantic concepts (e.g., pornography, violence, etc.) but also because videos from social networks are extremely assorted, preventing the use of constrained a priori information. We propose a simple method, able to combine diverse evidence, coming from different features and various video elements (entire video, shots, frames, keyframes, etc.). We evaluate our method in three social network applications, related to the detection of unwanted content - pornographic videos, violent videos, and videos posted to artificially manipulate popularity scores. Using challenging test databases, we show that this simple scheme is able to obtain good results, provided that adequate features are chosen. Moreover, we establish a representation using codebooks of spatiotemporal local descriptors as critical to the success of the method in all three contexts. This is consequential, since the state-of-the-art still relies heavily on static features for the tasks addressed.
Submitted 12 January, 2011;
originally announced January 2011.
-
Semi-Automatic Indexing of Multilingual Documents
Authors:
Ulrich Schiel,
Ianna M. Sodre Ferreira de Souza,
Edberto Ferneda
Abstract:
With the growing significance of digital libraries and the Internet, more and more electronic texts become accessible to a wide and geographically dispersed public. This requires adequate tools to facilitate indexing, storage, and retrieval of documents written in different languages. We present a method for semi-automatic indexing of electronic documents and construction of a multilingual thesaurus, which can be used for query formulation and information retrieval. We use special dictionaries and user interaction in order to resolve ambiguities and find adequate canonical terms in the language and adequate abstract language-independent terms. The abstract thesaurus is updated incrementally by newly indexed documents and is used to search the document base for documents that concern the terms in a query.
△ Less
Submitted 10 February, 1999;
originally announced February 1999.