-
3D Scene Geometry Estimation from 360$^\circ$ Imagery: A Survey
Authors:
Thiago Lopes Trugillo da Silveira,
Paulo Gamarra Lessa Pinto,
Jeffri Erwin Murrugarra Llerena,
Claudio Rosito Jung
Abstract:
This paper provides a comprehensive survey on pioneer and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured under the omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360…
▽ More
This paper provides a comprehensive survey on pioneer and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured under the omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360$^\circ$, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting the recent advances in learning-based solutions suited for spherical data. The classical stereo matching is then revised on the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extrapolated for multiple view camera setups, categorizing them among light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and map**). We also compile and discuss commonly adopted datasets and figures of merit indicated for each purpose and list recent results for completeness. We conclude this paper by pointing out current and future trends.
△ Less
Submitted 17 January, 2024;
originally announced January 2024.
-
"A Nova Eletricidade: Aplicações, Riscos e Tendências da IA Moderna -- "The New Electricity": Applications, Risks, and Trends in Current AI
Authors:
Ana L. C. Bazzan,
Anderson R. Tavares,
André G. Pereira,
Cláudio R. Jung,
Jacob Scharcanski,
Joel Luis Carbonera,
Luís C. Lamb,
Mariana Recamonde-Mendoza,
Thiago L. T. da Silveira,
Viviane Moreira
Abstract:
The thought-provoking analogy between AI and electricity, made by computer scientist and entrepreneur Andrew Ng, summarizes the deep transformation that recent advances in Artificial Intelligence (AI) have triggered in the world. This chapter presents an overview of the ever-evolving landscape of AI, written in Portuguese. With no intent to exhaust the subject, we explore the AI applications that…
▽ More
The thought-provoking analogy between AI and electricity, made by computer scientist and entrepreneur Andrew Ng, summarizes the deep transformation that recent advances in Artificial Intelligence (AI) have triggered in the world. This chapter presents an overview of the ever-evolving landscape of AI, written in Portuguese. With no intent to exhaust the subject, we explore the AI applications that are redefining sectors of the economy, impacting society and humanity. We analyze the risks that may come along with rapid technological progress and future trends in AI, an area that is on the path to becoming a general-purpose technology, just like electricity, which revolutionized society in the 19th and 20th centuries.
A provocativa comparação entre IA e eletricidade, feita pelo cientista da computação e empreendedor Andrew Ng, resume a profunda transformação que os recentes avanços em Inteligência Artificial (IA) têm desencadeado no mundo. Este capítulo apresenta uma visão geral pela paisagem em constante evolução da IA. Sem pretensões de exaurir o assunto, exploramos as aplicações que estão redefinindo setores da economia, impactando a sociedade e a humanidade. Analisamos os riscos que acompanham o rápido progresso tecnológico e as tendências futuras da IA, área que trilha o caminho para se tornar uma tecnologia de propósito geral, assim como a eletricidade, que revolucionou a sociedade dos séculos XIX e XX.
△ Less
Submitted 8 October, 2023;
originally announced October 2023.
-
Cell Tracking-by-detection using Elliptical Bounding Boxes
Authors:
Lucas N. Kirsten,
Cláudio R. Jung
Abstract:
Cell detection and tracking are paramount for bio-analysis. Recent approaches rely on the tracking-by-model evolution paradigm, which usually consists of training end-to-end deep learning models to detect and track the cells on the frames with promising results. However, such methods require extensive amounts of annotated data, which is time-consuming to obtain and often requires specialized annot…
▽ More
Cell detection and tracking are paramount for bio-analysis. Recent approaches rely on the tracking-by-model evolution paradigm, which usually consists of training end-to-end deep learning models to detect and track the cells on the frames with promising results. However, such methods require extensive amounts of annotated data, which is time-consuming to obtain and often requires specialized annotators. This work proposes a new approach based on the classical tracking-by-detection paradigm that alleviates the requirement of annotated data. More precisely, it approximates the cell shapes as oriented ellipses and then uses generic-purpose oriented object detectors to identify the cells in each frame. We then rely on a global data association algorithm that explores temporal cell similarity using probability distance metrics, considering that the ellipses relate to two-dimensional Gaussian distributions. Our results show that our method can achieve detection and tracking results competitively with state-of-the-art techniques that require considerably more extensive data annotation. Our code is available at: https://github.com/LucasKirsten/Deep-Cell-Tracking-EBB.
△ Less
Submitted 11 October, 2023; v1 submitted 7 October, 2023;
originally announced October 2023.
-
SVBR-NET: A Non-Blind Spatially Varying Defocus Blur Removal Network
Authors:
Ali Karaali,
Claudio Rosito Jung
Abstract:
Defocus blur is a physical consequence of the optical sensors used in most cameras. Although it can be used as a photographic style, it is commonly viewed as an image degradation modeled as the convolution of a sharp image with a spatially-varying blur kernel. Motivated by the advance of blur estimation methods in the past years, we propose a non-blind approach for image deblurring that can deal w…
▽ More
Defocus blur is a physical consequence of the optical sensors used in most cameras. Although it can be used as a photographic style, it is commonly viewed as an image degradation modeled as the convolution of a sharp image with a spatially-varying blur kernel. Motivated by the advance of blur estimation methods in the past years, we propose a non-blind approach for image deblurring that can deal with spatially-varying kernels. We introduce two encoder-decoder sub-networks that are fed with the blurry image and the estimated blur map, respectively, and produce as output the deblurred (deconvolved) image. Each sub-network presents several skip connections that allow data propagation from layers spread apart, and also inter-subnetwork skip connections that ease the communication between the modules. The network is trained with synthetically blur kernels that are augmented to emulate blur maps produced by existing blur estimation methods, and our experimental results show that our method works well when combined with a variety of blur estimation methods.
△ Less
Submitted 26 June, 2022;
originally announced June 2022.
-
Improving the Classification of Rare Chords with Unlabeled Data
Authors:
Marcelo Bortolozzo,
Rodrigo Schramm,
Claudio R. Jung
Abstract:
In this work, we explore techniques to improve performance for rare classes in the task of Automatic Chord Recognition (ACR). We first explored the use of the focal loss in the context of ACR, which was originally proposed to improve the classification of hard samples. In parallel, we adapted a self-learning technique originally designed for image recognition to the musical domain. Our experiments…
▽ More
In this work, we explore techniques to improve performance for rare classes in the task of Automatic Chord Recognition (ACR). We first explored the use of the focal loss in the context of ACR, which was originally proposed to improve the classification of hard samples. In parallel, we adapted a self-learning technique originally designed for image recognition to the musical domain. Our experiments show that both approaches individually (and their combination) improve the recognition of rare chords, but using only self-learning with noise addition yields the best results.
△ Less
Submitted 10 February, 2021; v1 submitted 13 December, 2020;
originally announced December 2020.
-
Deep Multi-Scale Feature Learning for Defocus Blur Estimation
Authors:
Ali Karaali,
Naomi Harte,
Claudio Rosito Jung
Abstract:
This paper presents an edge-based defocus blur estimation method from a single defocused image. We first distinguish edges that lie at depth discontinuities (called depth edges, for which the blur estimate is ambiguous) from edges that lie at approximately constant depth regions (called pattern edges, for which the blur estimate is well-defined). Then, we estimate the defocus blur amount at patter…
▽ More
This paper presents an edge-based defocus blur estimation method from a single defocused image. We first distinguish edges that lie at depth discontinuities (called depth edges, for which the blur estimate is ambiguous) from edges that lie at approximately constant depth regions (called pattern edges, for which the blur estimate is well-defined). Then, we estimate the defocus blur amount at pattern edges only, and explore an interpolation scheme based on guided filters that prevents data propagation across the detected depth edges to obtain a dense blur map with well-defined object boundaries. Both tasks (edge classification and blur estimation) are performed by deep convolutional neural networks (CNNs) that share weights to learn meaningful local features from multi-scale patches centered at edge locations. Experiments on naturally defocused images show that the proposed method presents qualitative and quantitative results that outperform state-of-the-art (SOTA) methods, with a good compromise between running time and accuracy.
△ Less
Submitted 7 November, 2021; v1 submitted 24 September, 2020;
originally announced September 2020.
-
Superpixel Image Classification with Graph Attention Networks
Authors:
Pedro H. C. Avelar,
Anderson R. Tavares,
Thiago L. T. da Silveira,
Cláudio R. Jung,
Luís C. Lamb
Abstract:
This paper presents a methodology for image classification using Graph Neural Network (GNN) models. We transform the input images into region adjacency graphs (RAGs), in which regions are superpixels and edges connect neighboring superpixels. Our experiments suggest that Graph Attention Networks (GATs), which combine graph convolutions with self-attention mechanisms, outperforms other GNN models.…
▽ More
This paper presents a methodology for image classification using Graph Neural Network (GNN) models. We transform the input images into region adjacency graphs (RAGs), in which regions are superpixels and edges connect neighboring superpixels. Our experiments suggest that Graph Attention Networks (GATs), which combine graph convolutions with self-attention mechanisms, outperforms other GNN models. Although raw image classifiers perform better than GATs due to information loss during the RAG generation, our methodology opens an interesting avenue of research on deep learning beyond rectangular-gridded images, such as 360-degree field of view panoramas. Traditional convolutional kernels of current state-of-the-art methods cannot handle panoramas, whereas the adapted superpixel algorithms and the resulting region adjacency graphs can naturally feed a GNN, without topology issues.
△ Less
Submitted 15 November, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.
-
Collective behavior recognition using compact descriptors
Authors:
Gustavo Fuhr,
Claudio Rosito Jung
Abstract:
This paper presents a novel hierarchical approach for collective behavior recognition based solely on ground-plane trajectories. In the first layer of our classifier, we introduce a novel feature called Personal Interaction Descriptor (PID), which combines the spatial distribution of a pair of pedestrians within a temporal window with a pyramidal representation of the relative speed to detect pair…
▽ More
This paper presents a novel hierarchical approach for collective behavior recognition based solely on ground-plane trajectories. In the first layer of our classifier, we introduce a novel feature called Personal Interaction Descriptor (PID), which combines the spatial distribution of a pair of pedestrians within a temporal window with a pyramidal representation of the relative speed to detect pairwise interactions. These interactions are then combined with higher level features related to the mean speed and shape formed by the pedestrians in the scene, generating a Collective Behavior Descriptor (CBD) that is used to identify collective behaviors in a second stage. In both layers, Random Forests were used as classifiers, since they allow features of different natures to be combined seamlessly. Our experimental results indicate that the proposed method achieves results on par with state of the art techniques with a better balance of class errors. Moreover, we show that our method can generalize well across different camera setups through cross-dataset experiments.
△ Less
Submitted 27 September, 2018;
originally announced September 2018.