Search | arXiv e-print repository

arXiv:2405.19876 [pdf, other]

IReNe: Instant Recoloring of Neural Radiance Fields

Authors: Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces, Fernando Rivas-Manzaneque, Francesc Moreno-Noguer, Adrian Penate-Sanchez

Abstract: Advances in NERFs have allowed for 3D scene reconstructions and novel view synthesis. Yet, efficiently editing these representations while retaining photorealism is an emerging challenge. Recent methods face three primary limitations: they're slow for interactive use, lack precision at object boundaries, and struggle to ensure multi-view consistency. We introduce IReNe to address these limitations… ▽ More Advances in NERFs have allowed for 3D scene reconstructions and novel view synthesis. Yet, efficiently editing these representations while retaining photorealism is an emerging challenge. Recent methods face three primary limitations: they're slow for interactive use, lack precision at object boundaries, and struggle to ensure multi-view consistency. We introduce IReNe to address these limitations, enabling swift, near real-time color editing in NeRF. Leveraging a pre-trained NeRF model and a single training image with user-applied color edits, IReNe swiftly adjusts network parameters in seconds. This adjustment allows the model to generate new scene views, accurately representing the color changes from the training image while also controlling object boundaries and view-specific effects. Object boundary control is achieved by integrating a trainable segmentation module into the model. The process gains efficiency by retraining only the weights of the last network layer. We observed that neurons in this layer can be classified into those responsible for view-dependent appearance and those contributing to diffuse appearance. We introduce an automated classification approach to identify these neuron types and exclusively fine-tune the weights of the diffuse neurons. This further accelerates training and ensures consistent color edits across different views. A thorough validation on a new dataset, with edited object colors, shows significant quantitative and qualitative advancements over competitors, accelerating speeds by 5x to 500x. △ Less

Submitted 10 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2403.12961 [pdf, other]

TexTile: A Differentiable Metric for Texture Tileability

Authors: Carlos Rodriguez-Pardo, Dan Casas, Elena Garces, Jorge Lopez-Moreno

Abstract: We introduce TexTile, a novel differentiable metric to quantify the degree upon which a texture image can be concatenated with itself without introducing repeating artifacts (i.e., the tileability). Existing methods for tileable texture synthesis focus on general texture quality, but lack explicit analysis of the intrinsic repeatability properties of a texture. In contrast, our TexTile metric effe… ▽ More We introduce TexTile, a novel differentiable metric to quantify the degree upon which a texture image can be concatenated with itself without introducing repeating artifacts (i.e., the tileability). Existing methods for tileable texture synthesis focus on general texture quality, but lack explicit analysis of the intrinsic repeatability properties of a texture. In contrast, our TexTile metric effectively evaluates the tileable properties of a texture, opening the door to more informed synthesis and analysis of tileable textures. Under the hood, TexTile is formulated as a binary classifier carefully built from a large dataset of textures of different styles, semantics, regularities, and human annotations.Key to our method is a set of architectural modifications to baseline pre-train image classifiers to overcome their shortcomings at measuring tileability, along with a custom data augmentation and training regime aimed at increasing robustness and accuracy. We demonstrate that TexTile can be plugged into different state-of-the-art texture synthesis methods, including diffusion-based strategies, and generate tileable textures while kee** or even improving the overall texture quality. Furthermore, we show that TexTile can objectively evaluate any tileable texture synthesis method, whereas the current mix of existing metrics produces uncorrelated scores which heavily hinders progress in the field. △ Less

Submitted 19 March, 2024; originally announced March 2024.

Comments: CVPR 2024. Project page: https://mslab.es/projects/TexTile/

MSC Class: 68T07 (Primary) 68T45; 68U05 (Secondary) ACM Class: I.2.6; I.4.10; I.3.3; I.5.4; I.5.1; I.3.7; I.3.8; I.2.10

arXiv:2307.01199 [pdf, other]

doi 10.1016/j.cag.2023.06.018

NeuBTF: Neural fields for BTF encoding and transfer

Authors: Carlos Rodriguez-Pardo, Konstantinos Kazatzis, Jorge Lopez-Moreno, Elena Garces

Abstract: Neural material representations are becoming a popular way to represent materials for rendering. They are more expressive than analytic models and occupy less memory than tabulated BTFs. However, existing neural materials are immutable, meaning that their output for a certain query of UVs, camera, and light vector is fixed once they are trained. While this is practical when there is no need to edi… ▽ More Neural material representations are becoming a popular way to represent materials for rendering. They are more expressive than analytic models and occupy less memory than tabulated BTFs. However, existing neural materials are immutable, meaning that their output for a certain query of UVs, camera, and light vector is fixed once they are trained. While this is practical when there is no need to edit the material, it can become very limiting when the fragment of the material used for training is too small or not tileable, which frequently happens when the material has been captured with a gonioreflectometer. In this paper, we propose a novel neural material representation which jointly tackles the problems of BTF compression, tiling, and extrapolation. At test time, our method uses a guidance image as input to condition the neural BTF to the structural features of this input image. Then, the neural BTF can be queried as a regular BTF using UVs, camera, and light vectors. Every component in our framework is purposefully designed to maximize BTF encoding quality at minimal parameter count and computational complexity, achieving competitive compression rates compared with previous work. We demonstrate the results of our method on a variety of synthetic and captured materials, showing its generality and capacity to learn to represent many optical properties. △ Less

Submitted 3 July, 2023; originally announced July 2023.

Comments: 9 pages, 7 figures. Accepted to Computers & Graphics (Special Section on CEIG 2023). Project Website: https://carlosrodriguezpardo.es/projects/NeuBTF/

MSC Class: 68T07 (Primary) 68T45; 68U10; 68U05 (Secondary) ACM Class: I.4.0; I.2.6; I.3.0

Journal ref: Computers & Graphics, Volume 114, 2023, Pages 239-246, ISSN 0097-8493

arXiv:2305.16312 [pdf, other]

doi 10.1109/CVPR52729.2023.00558

UMat: Uncertainty-Aware Single Image High Resolution Material Capture

Authors: Carlos Rodriguez-Pardo, Henar Dominguez-Elvira, David Pascual-Hernandez, Elena Garces

Abstract: We propose a learning-based method to recover normals, specularity, and roughness from a single diffuse image of a material, using microgeometry appearance as our primary cue. Previous methods that work on single images tend to produce over-smooth outputs with artifacts, operate at limited resolution, or train one model per class with little room for generalization. Previous methods that work on s… ▽ More We propose a learning-based method to recover normals, specularity, and roughness from a single diffuse image of a material, using microgeometry appearance as our primary cue. Previous methods that work on single images tend to produce over-smooth outputs with artifacts, operate at limited resolution, or train one model per class with little room for generalization. Previous methods that work on single images tend to produce over-smooth outputs with artifacts, operate at limited resolution, or train one model per class with little room for generalization. In contrast, in this work, we propose a novel capture approach that leverages a generative network with attention and a U-Net discriminator, which shows outstanding performance integrating global information at reduced computational complexity. We showcase the performance of our method with a real dataset of digitized textile materials and show that a commodity flatbed scanner can produce the type of diffuse illumination required as input to our method. Additionally, because the problem might be illposed -- more than a single diffuse image might be needed to disambiguate the specular reflection -- or because the training dataset is not representative enough of the real distribution, we propose a novel framework to quantify the model's confidence about its prediction at test time. Our method is the first one to deal with the problem of modeling uncertainty in material digitization, increasing the trustworthiness of the process and enabling more intelligent strategies for dataset creation, as we demonstrate with an active learning experiment. △ Less

Submitted 25 May, 2023; originally announced May 2023.

Comments: CVPR 2023. Project website: https://carlosrodriguezpardo.es/projects/UMat/

MSC Class: 68T07 (Primary) 68T45; 68U10; 68U05 (Secondary) ACM Class: I.4.0; I.2.6; I.3.0

Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 2023 pp. 5764-5774

arXiv:2304.06704 [pdf, other]

doi 10.1111/cgf.14750

How Will It Drape Like? Capturing Fabric Mechanics from Depth Images

Authors: Carlos Rodriguez-Pardo, Melania Prieto-Martin, Dan Casas, Elena Garces

Abstract: We propose a method to estimate the mechanical parameters of fabrics using a casual capture setup with a depth camera. Our approach enables to create mechanically-correct digital representations of real-world textile materials, which is a fundamental step for many interactive design and engineering applications. As opposed to existing capture methods, which typically require expensive setups, vide… ▽ More We propose a method to estimate the mechanical parameters of fabrics using a casual capture setup with a depth camera. Our approach enables to create mechanically-correct digital representations of real-world textile materials, which is a fundamental step for many interactive design and engineering applications. As opposed to existing capture methods, which typically require expensive setups, video sequences, or manual intervention, our solution can capture at scale, is agnostic to the optical appearance of the textile, and facilitates fabric arrangement by non-expert operators. To this end, we propose a sim-to-real strategy to train a learning-based framework that can take as input one or multiple images and outputs a full set of mechanical parameters. Thanks to carefully designed data augmentation and transfer learning protocols, our solution generalizes to real images despite being trained only on synthetic data, hence successfully closing the sim-to-real loop.Key in our work is to demonstrate that evaluating the regression accuracy based on the similarity at parameter space leads to an inaccurate distances that do not match the human perception. To overcome this, we propose a novel metric for fabric drape similarity that operates on the image domain instead on the parameter space, allowing us to evaluate our estimation within the context of a similarity rank. We show that out metric correlates with human judgments about the perception of drape similarity, and that our model predictions produce perceptually accurate results compared to the ground truth parameters. △ Less

Submitted 13 April, 2023; originally announced April 2023.

Comments: 12 pages, 12 figures. Accepted to EUROGRAPHICS 2023. Project website: https://carlosrodriguezpardo.es/projects/MechFromDepth/

MSC Class: 68T45 (Primary) 68U05 (Secondary) ACM Class: I.5.4; I.3.5; I.4.8; I.5.1

Journal ref: Computer Graphics Forum, 42: 149-160, 2023

arXiv:2201.05120 [pdf, other]

doi 10.1109/TVCG.2022.3143615

SeamlessGAN: Self-Supervised Synthesis of Tileable Texture Maps

Authors: Carlos Rodriguez-Pardo, Elena Garces

Abstract: We present SeamlessGAN, a method capable of automatically generating tileable texture maps from a single input exemplar. In contrast to most existing methods, focused solely on solving the synthesis problem, our work tackles both problems, synthesis and tileability, simultaneously. Our key idea is to realize that tiling a latent space within a generative network trained using adversarial expansion… ▽ More We present SeamlessGAN, a method capable of automatically generating tileable texture maps from a single input exemplar. In contrast to most existing methods, focused solely on solving the synthesis problem, our work tackles both problems, synthesis and tileability, simultaneously. Our key idea is to realize that tiling a latent space within a generative network trained using adversarial expansion techniques produces outputs with continuity at the seam intersection that can be then be turned into tileable images by crop** the central area. Since not every value of the latent space is valid to produce high-quality outputs, we leverage the discriminator as a perceptual error metric capable of identifying artifact-free textures during a sampling process. Further, in contrast to previous work on deep texture synthesis, our model is designed and optimized to work with multi-layered texture representations, enabling textures composed of multiple maps such as albedo, normals, etc. We extensively test our design choices for the network architecture, loss function and sampling parameters. We show qualitatively and quantitatively that our approach outperforms previous methods and works for textures of different types. △ Less

Submitted 13 January, 2022; originally announced January 2022.

Comments: 12 pages. To be published in Transactions on Visualizations and Computer Graphics. Project website: http://carlosrodriguezpardo.es/projects/SeamlessGAN/

MSC Class: 68T07 (Primary) 68T45; 68U05 (Secondary) ACM Class: I.2.6; I.4.10; I.3.3; I.5.4; I.5.1; I.3.7; I.3.8; I.2.10

arXiv:2112.03842 [pdf, other]

A Survey on Intrinsic Images: Delving Deep Into Lambert and Beyond

Authors: Elena Garces, Carlos Rodriguez-Pardo, Dan Casas, Jorge Lopez-Moreno

Abstract: Intrinsic imaging or intrinsic image decomposition has traditionally been described as the problem of decomposing an image into two layers: a reflectance, the albedo invariant color of the material; and a shading, produced by the interaction between light and geometry. Deep learning techniques have been broadly applied in recent years to increase the accuracy of those separations. In this survey,… ▽ More Intrinsic imaging or intrinsic image decomposition has traditionally been described as the problem of decomposing an image into two layers: a reflectance, the albedo invariant color of the material; and a shading, produced by the interaction between light and geometry. Deep learning techniques have been broadly applied in recent years to increase the accuracy of those separations. In this survey, we overview those results in context of well-known intrinsic image data sets and relevant metrics used in the literature, discussing their suitability to predict a desirable intrinsic image decomposition. Although the Lambertian assumption is still a foundational basis for many methods, we show that there is increasing awareness on the potential of more sophisticated physically-principled components of the image formation process, that is, optically accurate material models and geometry, and more complete inverse light transport estimations. We classify these methods in terms of the type of decomposition, considering the priors and models used, as well as the learning architecture and methodology driving the decomposition process. We also provide insights about future directions for research, given the recent advances in neural, inverse and differentiable rendering techniques. △ Less

Submitted 7 December, 2021; originally announced December 2021.

Comments: Accepted at International Journal of Computer Vision (to appear in 2022) http://www.elenagarces.es/projects/SurveyIntrinsicImages/

arXiv:2112.02520 [pdf, other]

doi 10.1109/TVCG.2021.3133081

Neural Photometry-guided Visual Attribute Transfer

Authors: Carlos Rodriguez-Pardo, Elena Garces

Abstract: We present a deep learning-based method for propagating spatially-varying visual material attributes (e.g. texture maps or image stylizations) to larger samples of the same or similar materials. For training, we leverage images of the material taken under multiple illuminations and a dedicated data augmentation policy, making the transfer robust to novel illumination conditions and affine deformat… ▽ More We present a deep learning-based method for propagating spatially-varying visual material attributes (e.g. texture maps or image stylizations) to larger samples of the same or similar materials. For training, we leverage images of the material taken under multiple illuminations and a dedicated data augmentation policy, making the transfer robust to novel illumination conditions and affine deformations. Our model relies on a supervised image-to-image translation framework and is agnostic to the transferred domain; we showcase a semantic segmentation, a normal map, and a stylization. Following an image analogies approach, the method only requires the training data to contain the same visual structures as the input guidance. Our approach works at interactive rates, making it suitable for material edit applications. We thoroughly evaluate our learning methodology in a controlled setup providing quantitative measures of performance. Last, we demonstrate that training the model on a single material is enough to generalize to materials of the same type without the need for massive datasets. △ Less

Submitted 5 December, 2021; originally announced December 2021.

Comments: 13 pages. To be published in Transactions on Visualizations and Computer Graphics. Project website: http://carlosrodriguezpardo.es/projects/NeuralPhotometricTransfer/

MSC Class: 68T07 (Primary) 68T45; 68U05 (Secondary) ACM Class: I.2.6; I.4.10; I.3.3; I.5.4

arXiv:2009.04592 [pdf, other]

Fully Convolutional Graph Neural Networks for Parametric Virtual Try-On

Authors: Raquel Vidaurre, Igor Santesteban, Elena Garces, Dan Casas

Abstract: We present a learning-based approach for virtual try-on applications based on a fully convolutional graph neural network. In contrast to existing data-driven models, which are trained for a specific garment or mesh topology, our fully convolutional model can cope with a large family of garments, represented as parametric predefined 2D panels with arbitrary mesh topology, including long dresses, sh… ▽ More We present a learning-based approach for virtual try-on applications based on a fully convolutional graph neural network. In contrast to existing data-driven models, which are trained for a specific garment or mesh topology, our fully convolutional model can cope with a large family of garments, represented as parametric predefined 2D panels with arbitrary mesh topology, including long dresses, shirts, and tight tops. Under the hood, our novel geometric deep learning approach learns to drape 3D garments by decoupling the three different sources of deformations that condition the fit of clothing: garment type, target body shape, and material. Specifically, we first learn a regressor that predicts the 3D drape of the input parametric garment when worn by a mean body shape. Then, after a mesh topology optimization step where we generate a sufficient level of detail for the input garment type, we further deform the mesh to reproduce deformations caused by the target body shape. Finally, we predict fine-scale details such as wrinkles that depend mostly on the garment material. We qualitatively and quantitatively demonstrate that our fully convolutional approach outperforms existing methods in terms of generalization capabilities and memory requirements, and therefore it opens the door to more general learning-based models for virtual try-on applications. △ Less

Submitted 9 September, 2020; originally announced September 2020.

Comments: Project website http://mslab.es/projects/FullyConvolutionalGraphVirtualTryOn . Accepted to ACM SIGGRAPH / Eurographics Symposium on Computer Animation, 2020

arXiv:2004.00326 [pdf, other]

doi 10.1111/cgf.13912

SoftSMPL: Data-driven Modeling of Nonlinear Soft-tissue Dynamics for Parametric Humans

Authors: Igor Santesteban, Elena Garces, Miguel A. Otaduy, Dan Casas

Abstract: We present SoftSMPL, a learning-based method to model realistic soft-tissue dynamics as a function of body shape and motion. Datasets to learn such task are scarce and expensive to generate, which makes training models prone to overfitting. At the core of our method there are three key contributions that enable us to model highly realistic dynamics and better generalization capabilities than state… ▽ More We present SoftSMPL, a learning-based method to model realistic soft-tissue dynamics as a function of body shape and motion. Datasets to learn such task are scarce and expensive to generate, which makes training models prone to overfitting. At the core of our method there are three key contributions that enable us to model highly realistic dynamics and better generalization capabilities than state-of-the-art methods, while training on the same data. First, a novel motion descriptor that disentangles the standard pose representation by removing subject-specific features; second, a neural-network-based recurrent regressor that generalizes to unseen shapes and motions; and third, a highly efficient nonlinear deformation subspace capable of representing soft-tissue deformations of arbitrary shapes. We demonstrate qualitative and quantitative improvements over existing methods and, additionally, we show the robustness of our method on a variety of motion capture databases. △ Less

Submitted 1 April, 2020; originally announced April 2020.

Comments: Accepted at Eurographics 2020. Project website: http://dancasas.github.io/projects/SoftSMPL

arXiv:1905.01562 [pdf, other]

doi 10.1145/3306346.3323036

A Similarity Measure for Material Appearance

Authors: Manuel Lagunas, Sandra Malpica, Ana Serrano, Elena Garces, Diego Gutierrez, Belen Masia

Abstract: We present a model to measure the similarity in appearance between different materials, which correlates with human similarity judgments. We first create a database of 9,000 rendered images depicting objects with varying materials, shape and illumination. We then gather data on perceived similarity from crowdsourced experiments; our analysis of over 114,840 answers suggests that indeed a shared pe… ▽ More We present a model to measure the similarity in appearance between different materials, which correlates with human similarity judgments. We first create a database of 9,000 rendered images depicting objects with varying materials, shape and illumination. We then gather data on perceived similarity from crowdsourced experiments; our analysis of over 114,840 answers suggests that indeed a shared perception of appearance similarity exists. We feed this data to a deep learning architecture with a novel loss function, which learns a feature space for materials that correlates with such perceived appearance similarity. Our evaluation shows that our model outperforms existing metrics. Last, we demonstrate several applications enabled by our metric, including appearance-based search for material suggestions, database visualization, clustering and summarization, and gamut map**. △ Less

Submitted 4 May, 2019; originally announced May 2019.

Comments: 12 pages, 17 figures

Journal ref: ACM Transactions on Graphics (SIGGRAPH 2019)

arXiv:1902.05378 [pdf, ps, other]

doi 10.1007/s11042-018-6628-7

Learning icons appearance similarity

Authors: Manuel Lagunas, Elena Garces, Diego Gutierrez

Abstract: Selecting an optimal set of icons is a crucial step in the pipeline of visual design to structure and navigate through content. However, designing the icons sets is usually a difficult task for which expert knowledge is required. In this work, to ease the process of icon set selection to the users, we propose a similarity metric which captures the properties of style and visual identity. We train… ▽ More Selecting an optimal set of icons is a crucial step in the pipeline of visual design to structure and navigate through content. However, designing the icons sets is usually a difficult task for which expert knowledge is required. In this work, to ease the process of icon set selection to the users, we propose a similarity metric which captures the properties of style and visual identity. We train a Siamese Neural Network with an online dataset of icons organized in visually coherent collections that are used to adaptively sample training data and optimize the training process. As the dataset contains noise, we further collect human-rated information on the perception of icon's similarity which will be used for evaluating and testing the proposed model. We present several results and applications based on searches, kernel visualizations and optimized set proposals that can be helpful for designers and non-expert users while exploring large collections of icons. △ Less

Submitted 1 February, 2019; originally announced February 2019.

Comments: 12 pages, 11 figures

Journal ref: Multimedia Tools and Applications, pages: 1-19, year: 2018, publisher: Springer

arXiv:1811.09131 [pdf, other]

BRDF Estimation of Complex Materials with Nested Learning

Authors: Raquel Vidaurre, Dan Casas, Elena Garces, Jorge Lopez-Moreno

Abstract: The estimation of the optical properties of a material from RGB-images is an important but extremely ill-posed problem in Computer Graphics. While recent works have successfully approached this problem even from just a single photograph, significant simplifications of the material model are assumed, limiting the usability of such methods. The detection of complex material properties such as anisot… ▽ More The estimation of the optical properties of a material from RGB-images is an important but extremely ill-posed problem in Computer Graphics. While recent works have successfully approached this problem even from just a single photograph, significant simplifications of the material model are assumed, limiting the usability of such methods. The detection of complex material properties such as anisotropy or Fresnel effect remains an unsolved challenge. We propose a novel method that predicts the model parameters of an artist-friendly, physically-based BRDF, from only two low-resolution shots of the material. Thanks to a novel combination of deep neural networks in a nested architecture, we are able to handle the ambiguities given by the non-orthogonality and non-convexity of the parameter space. To train the network, we generate a novel dataset of physically-based synthetic images. We prove that our model can recover new properties like anisotropy, index of refraction and a second reflectance color, for materials that have tinted specular reflections or whose albedo changes at glancing angles. △ Less

Submitted 22 November, 2018; originally announced November 2018.

Comments: Accepted to IEEE Winter Conference on Applications of Computer Vision 2019 (WACV 2019)

arXiv:1806.04935 [pdf, other]

doi 10.1111/cgf.13086

Convolutional sparse coding for capturing high speed video content

Authors: Ana Serrano, Elena Garces, Diego Gutierrez, Belen Masia

Abstract: Video capture is limited by the trade-off between spatial and temporal resolution: when capturing videos of high temporal resolution, the spatial resolution decreases due to bandwidth limitations in the capture system. Achieving both high spatial and temporal resolution is only possible with highly specialized and very expensive hardware, and even then the same basic trade-off remains. The recent… ▽ More Video capture is limited by the trade-off between spatial and temporal resolution: when capturing videos of high temporal resolution, the spatial resolution decreases due to bandwidth limitations in the capture system. Achieving both high spatial and temporal resolution is only possible with highly specialized and very expensive hardware, and even then the same basic trade-off remains. The recent introduction of compressive sensing and sparse reconstruction techniques allows for the capture of single-shot high-speed video, by coding the temporal information in a single frame, and then reconstructing the full video sequence from this single coded image and a trained dictionary of image patches. In this paper, we first analyze this approach, and find insights that help improve the quality of the reconstructed videos. We then introduce a novel technique, based on convolutional sparse coding (CSC), and show how it outperforms the state-of-the-art, patch-based approach in terms of flexibility and efficiency, due to the convolutional nature of its filter banks. The key idea for CSC high-speed video acquisition is extending the basic formulation by imposing an additional constraint in the temporal dimension, which enforces sparsity of the first-order derivatives over time. △ Less

Submitted 13 June, 2018; originally announced June 2018.

Journal ref: Computer Graphics Forum 36, 8, Pages 380-389 (February 2017)

arXiv:1806.02682 [pdf, other]

doi 10.2312/ceig.20171213

Transfer Learning for Illustration Classification

Authors: Manuel Lagunas, Elena Garces

Abstract: The field of image classification has shown an outstanding success thanks to the development of deep learning techniques. Despite the great performance obtained, most of the work has focused on natural images ignoring other domains like artistic depictions. In this paper, we use transfer learning techniques to propose a new classification network with better performance in illustration images. Sta… ▽ More The field of image classification has shown an outstanding success thanks to the development of deep learning techniques. Despite the great performance obtained, most of the work has focused on natural images ignoring other domains like artistic depictions. In this paper, we use transfer learning techniques to propose a new classification network with better performance in illustration images. Starting from the deep convolutional network VGG19, pre-trained with natural images, we propose two novel models which learn object representations in the new domain. Our optimized network will learn new low-level features of the images (colours, edges, textures) while kee** the knowledge of the objects and shapes that it already learned from the ImageNet dataset. Thus, requiring much less data for the training. We propose a novel dataset of illustration images labelled by content where our optimized architecture achieves $\textbf{86.61\%}$ of top-1 and $\textbf{97.21\%}$ of top-5 precision. We additionally demonstrate that our model is still able to recognize objects in photographs. △ Less

Submitted 23 May, 2018; originally announced June 2018.

Comments: 9 pages, 8 figures, 4 tables

Journal ref: 2017 Spanish Computer Graphics Conference (CEIG)

arXiv:1608.04342 [pdf, other]

doi 10.1111/cgf.13154

Intrinsic Light Field Images

Authors: Elena Garces, Jose I. Echevarria, Wen Zhang, Hongzhi Wu, Kun Zhou, Diego Gutierrez

Abstract: We present a method to automatically decompose a light field into its intrinsic shading and albedo components. Contrary to previous work targeted to 2D single images and videos, a light field is a 4D structure that captures non-integrated incoming radiance over a discrete angular domain. This higher dimensionality of the problem renders previous state-of-the-art algorithms impractical either due t… ▽ More We present a method to automatically decompose a light field into its intrinsic shading and albedo components. Contrary to previous work targeted to 2D single images and videos, a light field is a 4D structure that captures non-integrated incoming radiance over a discrete angular domain. This higher dimensionality of the problem renders previous state-of-the-art algorithms impractical either due to their cost of processing a single 2D slice, or their inability to enforce proper coherence in additional dimensions. We propose a new decomposition algorithm that jointly optimizes the whole light field data for proper angular coherence. For efficiency, we extend Retinex theory, working on the gradient domain, where new albedo and occlusion terms are introduced. Results show our method provides 4D intrinsic decompositions difficult to achieve with previous state-of-the-art algorithms. We further provide a comprehensive analysis and comparisons with existing intrinsic image/video decomposition methods on light field images. △ Less

Submitted 12 April, 2017; v1 submitted 15 August, 2016; originally announced August 2016.

Journal ref: Computer Graphics Forum 2017

Showing 1–16 of 16 results for author: Garces, E