Search | arXiv e-print repository

arXiv:2010.06626 [pdf, other]

doi 10.1016/j.robot.2020.103701

On Deep Learning Techniques to Boost Monocular Depth Estimation for Autonomous Navigation

Authors: Raul de Queiroz Mendes, Eduardo Godinho Ribeiro, Nicolas dos Santos Rosa, Valdir Grassi Jr

Abstract: Inferring the depth of images is a fundamental inverse problem within the field of Computer Vision since depth information is obtained through 2D images, which can be generated from infinite possibilities of observed real scenes. Benefiting from the progress of Convolutional Neural Networks (CNNs) to explore structural features and spatial image information, Single Image Depth Estimation (SIDE) is often highlighted in scopes of scientific and technological innovation, as this concept provides advantages related to its low implementation cost and robustness to environmental conditions. In the context of autonomous vehicles, state-of-the-art CNNs optimize the SIDE task by producing high-quality depth maps, which are essential during the autonomous navigation process in different locations. However, such networks are usually supervised by sparse and noisy depth data, from Light Detection and Ranging (LiDAR) laser scans, and are carried out at high computational cost, requiring high-performance Graphic Processing Units (GPUs). Therefore, we propose a new lightweight and fast supervised CNN architecture combined with novel feature extraction models which are designed for real-world autonomous navigation. We also introduce an efficient surface normals module, jointly with a simple geometric 2.5D loss function, to solve SIDE problems. We also innovate by incorporating multiple Deep Learning techniques, such as the use of densification algorithms and additional semantic, surface normals and depth information to train our framework. The method introduced in this work focuses on robotic applications in indoor and outdoor environments and its results are evaluated on the competitive and publicly available NYU Depth V2 and KITTI Depth datasets.

Submitted 28 December, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: 29 pages, 16 figures. Preprint published in the Elsevier's Robotics and Autonomous Systems journal on November 23, 2020

Journal ref: Journal: Robotics and Autonomous Systems, publisher: Elsevier, volume number: 136, year: 2020, page number: 103701

arXiv:2010.06544 [pdf, other]

doi 10.1016/j.robot.2021.103757

Real-Time Deep Learning Approach to Visual Servo Control and Grasp Detection for Autonomous Robotic Manipulation

Authors: Eduardo Godinho Ribeiro, Raul de Queiroz Mendes, Valdir Grassi Jr

Abstract: In order to explore robotic grasping in unstructured and dynamic environments, this work addresses the visual perception phase involved in the task. This phase involves the processing of visual data to obtain the location of the object to be grasped, its pose and the points at which the robot's grippers must make contact to ensure a stable grasp. For this, the Cornell Grasping dataset is used to train a convolutional neural network that, having an image of the robot's workspace, with a certain object, is able to predict a grasp rectangle that symbolizes the position, orientation and opening of the robot's grippers before its closing. In addition to this network, which runs in real-time, another one is designed to deal with situations in which the object moves in the environment. Therefore, the second network is trained to perform a visual servo control, ensuring that the object remains in the robot's field of view. This network predicts the proportional values of the linear and angular velocities that the camera must have so that the object is always in the image processed by the grasp network. The dataset used for training was automatically generated by a Kinova Gen3 manipulator. The robot is also used to evaluate the applicability in real-time and obtain practical results from the designed algorithms. Moreover, the offline results obtained through validation sets are also analyzed and discussed regarding their efficiency and processing speed. The developed controller was able to achieve a millimeter accuracy in the final position considering a target object seen for the first time. To the best of our knowledge, we have not found in the literature other works that achieve such precision with a controller learned from scratch. Thus, this work presents a new system for autonomous robotic manipulation with high processing speed and the ability to generalize to several different objects.

Submitted 28 February, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: 27 pages, 24 figures. Preprint published in the Elsevier's Robotics and Autonomous Systems journal on February 24, 2021

Journal ref: Journal: Robotics and Autonomous Systems, publisher: Elsevier, volume number: 139, year: 2021, page number: 103757

arXiv:cs/0012020 [pdf]

Creativity and Delusions: A Neurocomputational Approach

Authors: Daniele Quintella Mendes, Luis Alfredo Vidal de Carvalho

Abstract: Thinking is one of the most interesting mental processes. Its complexity is sometimes simplified and its different manifestations are classified into normal and abnormal, like the delusional and disorganized thought or the creative one. The boundaries between these facets of thinking are fuzzy causing difficulties in medical, academic, and philosophical discussions. Considering the dopaminergic signal-to-noise neuronal modulation in the central nervous system, and the existence of semantic maps in human brain, a self-organizing neural network model was developed to unify the different thought processes into a single neurocomputational substrate. Simulations were performed varying the dopaminergic modulation and observing the different patterns that emerged at the semantic map. Assuming that the thought process is the total pattern elicited at the output layer of the neural network, the model shows how the normal and abnormal thinking are generated and that there are no borders between their different manifestations. Actually, a continuum of different qualitative reasoning, ranging from delusion to disorganization of thought, and passing through the normal and the creative thinking, seems to be more plausible. The model is far from explaining the complexities of human thinking but, at least, it seems to be a good metaphorical and unifying view of the many facets of this phenomenon usually studied in separated settings.

Submitted 22 December, 2000; originally announced December 2000.

Comments: 8 pages, 6 figures

ACM Class: I.5.1

Showing 1–3 of 3 results for author: Mendes, D Q