Search | arXiv e-print repository

Learning Relighting and Intrinsic Decomposition in Neural Radiance Fields

Authors: Yixiong Yang, Shilin Hu, Haoyu Wu, Ramon Baldrich, Dimitris Samaras, Maria Vanrell

Abstract: The task of extracting intrinsic components, such as reflectance and shading, from neural radiance fields is of growing interest. However, current methods largely focus on synthetic scenes and isolated objects, overlooking the complexities of real scenes with backgrounds. To address this gap, our research introduces a method that combines relighting with intrinsic decomposition. By leveraging ligh… ▽ More The task of extracting intrinsic components, such as reflectance and shading, from neural radiance fields is of growing interest. However, current methods largely focus on synthetic scenes and isolated objects, overlooking the complexities of real scenes with backgrounds. To address this gap, our research introduces a method that combines relighting with intrinsic decomposition. By leveraging light variations in scenes to generate pseudo labels, our method provides guidance for intrinsic decomposition without requiring ground truth data. Our method, grounded in physical constraints, ensures robustness across diverse scene types and reduces the reliance on pre-trained models or hand-crafted priors. We validate our method on both synthetic and real-world datasets, achieving convincing results. Furthermore, the applicability of our method to image editing tasks demonstrates promising outcomes. △ Less

Submitted 16 June, 2024; originally announced June 2024.

Comments: Accepted by CVPR 2024 Workshop Neural Rendering Intelligence(NRI)

arXiv:2011.14447 [pdf, other]

Intrinsic Decomposition of Document Images In-the-Wild

Authors: Sagnik Das, Hassan Ahmed Sial, Ke Ma, Ramon Baldrich, Maria Vanrell, Dimitris Samaras

Abstract: Automatic document content processing is affected by artifacts caused by the shape of the paper, non-uniform and diverse color of lighting conditions. Fully-supervised methods on real data are impossible due to the large amount of data needed. Hence, the current state of the art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal res… ▽ More Automatic document content processing is affected by artifacts caused by the shape of the paper, non-uniform and diverse color of lighting conditions. Fully-supervised methods on real data are impossible due to the large amount of data needed. Hence, the current state of the art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limit their application on real-scenarios with complex document shapes and textures and; (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves previous synthetic ones, by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in a self-supervised manner where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to a significant generalization of document reflectance estimation in real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 26% improvement of character error rate (CER), thus, proving the practical applicability. △ Less

Submitted 29 November, 2020; originally announced November 2020.

Comments: This a modified version of the BMVC 2020 accepted manuscript

arXiv:2009.08941 [pdf, other]

Light Direction and Color Estimation from Single Image with Deep Regression

Authors: Hassan A. Sial, Ramon Baldrich, Maria Vanrell, Dimitris Samaras

Abstract: We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing go… ▽ More We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions of the Multi-Illumination dataset, and, in this way, we also prove that our trained model achieves good performance when it is applied to real scenes. △ Less

Submitted 18 September, 2020; originally announced September 2020.

Comments: Conference: London Imaging Meeting 2020

arXiv:2009.06295 [pdf, other]

doi 10.1364/JOSAA.37.000001

Deep intrinsic decomposition trained on surreal scenes yet with realistic light effects

Authors: Hassan Sial, Ramon Baldrich, Maria Vanrell

Abstract: Estimation of intrinsic images still remains a challenging task due to weaknesses of ground-truth datasets, which either are too small or present non-realistic issues. On the other hand, end-to-end deep learning architectures start to achieve interesting results that we believe could be improved if important physical hints were not ignored. In this work, we present a twofold framework: (a) a flexi… ▽ More Estimation of intrinsic images still remains a challenging task due to weaknesses of ground-truth datasets, which either are too small or present non-realistic issues. On the other hand, end-to-end deep learning architectures start to achieve interesting results that we believe could be improved if important physical hints were not ignored. In this work, we present a twofold framework: (a) a flexible generation of images overcoming some classical dataset problems such as larger size jointly with coherent lighting appearance; and (b) a flexible architecture tying physical properties through intrinsic losses. Our proposal is versatile, presents low computation time, and achieves state-of-the-art results. △ Less

Submitted 14 September, 2020; originally announced September 2020.

Journal ref: JOSA A 2020

arXiv:1702.00382 [pdf, other]

doi 10.1016/j.patrec.2019.10.013

Understanding trained CNNs by indexing neuron selectivity

Authors: Ivet Rafegas, Maria Vanrell, Luis A. Alexandre, Guillem Arias

Abstract: The impressive performance of Convolutional Neural Networks (CNNs) when solving different vision problems is shadowed by their black-box nature and our consequent lack of understanding of the representations they build and how these representations are organized. To help understanding these issues, we propose to describe the activity of individual neurons by their Neuron Feature visualization and… ▽ More The impressive performance of Convolutional Neural Networks (CNNs) when solving different vision problems is shadowed by their black-box nature and our consequent lack of understanding of the representations they build and how these representations are organized. To help understanding these issues, we propose to describe the activity of individual neurons by their Neuron Feature visualization and quantify their inherent selectivity with two specific properties. We explore selectivity indexes for: an image feature (color); and an image label (class membership). Our contribution is a framework to seek or classify neurons by indexing on these selectivity properties. It helps to find color selective neurons, such as a red-mushroom neuron in layer Conv4 or class selective neurons such as dog-face neurons in layer Conv5 in VGG-M, and establishes a methodology to derive other selectivity properties. Indexing on neuron selectivity can statistically draw how features and classes are represented through layers in a moment when the size of trained nets is growing and automatic tools to index neurons can be helpful. △ Less

Submitted 21 May, 2019; v1 submitted 1 February, 2017; originally announced February 2017.

Comments: Under review on Pattern Recognition Letters

arXiv:1511.05084 [pdf, other]

Understanding learned CNN features through Filter Decoding with Substitution

Authors: Ivet Rafegas, Maria Vanrell

Abstract: In parallel with the success of CNNs to solve vision problems, there is a growing interest in develo** methodologies to understand and visualize the internal representations of these networks. How the responses of a trained CNN encode the visual information is a fundamental question both for computer and human vision research. Image representations provided by the first convolutional layer as we… ▽ More In parallel with the success of CNNs to solve vision problems, there is a growing interest in develo** methodologies to understand and visualize the internal representations of these networks. How the responses of a trained CNN encode the visual information is a fundamental question both for computer and human vision research. Image representations provided by the first convolutional layer as well as the resolution change provided by the max-polling operation are easy to understand, however, as soon as a second and further convolutional layers are added in the representation, any intuition is lost. A usual way to deal with this problem has been to define deconvolutional networks that somehow allow to explore the internal representations of the most important activations towards the image space, where deconvolution is assumed as a convolution with the transposed filter. However, this assumption is not the best approximation of an inverse convolution. In this paper we propose a new assumption based on filter substitution to reverse the encoding of a convolutional layer. This provides us with a new tool to directly visualize any CNN single neuron as a filter in the first layer, this is in terms of the image space. △ Less

Submitted 17 November, 2015; v1 submitted 16 November, 2015; originally announced November 2015.

Comments: 10 pages, 7 figures (including supplementary material). Submitted for review for CVPR 2016

Showing 1–6 of 6 results for author: Vanrell, M