Search | arXiv e-print repository

S2F2: Self-Supervised High Fidelity Face Reconstruction from Monocular Image

Authors: Abdallah Dib, Junghyun Ahn, Cedric Thebault, Philippe-Henri Gosselin, Louis Chevallier

Abstract: We present a novel face reconstruction method capable of reconstructing detailed face geometry, spatially varying face reflectance from a single monocular image. We build our work upon the recent advances of DNN-based auto-encoders with differentiable ray tracing image formation, trained in self-supervised manner. While providing the advantage of learning-based approaches and real-time reconstruct… ▽ More We present a novel face reconstruction method capable of reconstructing detailed face geometry, spatially varying face reflectance from a single monocular image. We build our work upon the recent advances of DNN-based auto-encoders with differentiable ray tracing image formation, trained in self-supervised manner. While providing the advantage of learning-based approaches and real-time reconstruction, the latter methods lacked fidelity. In this work, we achieve, for the first time, high fidelity face reconstruction using self-supervised learning only. Our novel coarse-to-fine deep architecture allows us to solve the challenging problem of decoupling face reflectance from geometry using a single image, at high computational speed. Compared to state-of-the-art methods, our method achieves more visually appealing reconstruction. △ Less

Submitted 5 April, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: 24 Pages, 22 Figures

MSC Class: 68T45; 68T07; 68U10; 68U05 ACM Class: I.4.5; I.4.8; I.4.9; I.3.3; I.2.10

arXiv:2110.02124 [pdf, other]

doi 10.1145/3485441.3485646

FacialFilmroll: High-resolution multi-shot video editing

Authors: Bharath Bhushan Damodaran, Emmanuel Jolly, Gilles Puy, Philippe Henri Gosselin, Cédric Thébault, Junghyun Ahn, Tim Christensen, Paul Ghezzo, Pierre Hellier

Abstract: We present FacialFilmroll, a solution for spatially and temporally consistent editing of faces in one or multiple shots. We build upon unwrap mosaic [Rav-Acha et al. 2008] by specializing it to faces. We leverage recent techniques to fit a 3D face model on monocular videos to (i) improve the quality of the mosaic for edition and (ii) permit the automatic transfer of edits from one shot to other sh… ▽ More We present FacialFilmroll, a solution for spatially and temporally consistent editing of faces in one or multiple shots. We build upon unwrap mosaic [Rav-Acha et al. 2008] by specializing it to faces. We leverage recent techniques to fit a 3D face model on monocular videos to (i) improve the quality of the mosaic for edition and (ii) permit the automatic transfer of edits from one shot to other shots of the same actor. We explain how FacialFilmroll is integrated in post-production facility. Finally, we present video editing results using FacialFilmroll on high resolution videos. △ Less

Submitted 17 November, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

Comments: European Conference on Visual Media Production (CVMP '21)

Journal ref: European Conference on Visual Media Production (CVMP '21), 2021

arXiv:2103.15432 [pdf, other]

Towards High Fidelity Monocular Face Reconstruction with Rich Reflectance using Self-supervised Learning and Ray Tracing

Authors: Abdallah Dib, Cedric Thebault, Junghyun Ahn, Philippe-Henri Gosselin, Christian Theobalt, Louis Chevallier

Abstract: Robust face reconstruction from monocular image in general lighting conditions is challenging. Methods combining deep neural network encoders with differentiable rendering have opened up the path for very fast monocular reconstruction of geometry, lighting and reflectance. They can also be trained in self-supervised manner for increased robustness and better generalization. However, their differen… ▽ More Robust face reconstruction from monocular image in general lighting conditions is challenging. Methods combining deep neural network encoders with differentiable rendering have opened up the path for very fast monocular reconstruction of geometry, lighting and reflectance. They can also be trained in self-supervised manner for increased robustness and better generalization. However, their differentiable rasterization based image formation models, as well as underlying scene parameterization, limit them to Lambertian face reflectance and to poor shape details. More recently, ray tracing was introduced for monocular face reconstruction within a classic optimization-based framework and enables state-of-the art results. However optimization-based approaches are inherently slow and lack robustness. In this paper, we build our work on the aforementioned approaches and propose a new method that greatly improves reconstruction quality and robustness in general scenes. We achieve this by combining a CNN encoder with a differentiable ray tracer, which enables us to base the reconstruction on much more advanced personalized diffuse and specular albedos, a more sophisticated illumination model and a plausible representation of self-shadows. This enables to take a big leap forward in reconstruction quality of shape, appearance and lighting even in scenes with difficult illumination. With consistent face attributes reconstruction, our method leads to practical applications such as relighting and self-shadows removal. Compared to state-of-the-art methods, our results show improved accuracy and validity of the approach. △ Less

Submitted 22 November, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

Comments: International Conference on Computer Vision (ICCV 2021)

arXiv:2101.05356 [pdf, other]

Practical Face Reconstruction via Differentiable Ray Tracing

Authors: Abdallah Dib, Gaurav Bharaj, Junghyun Ahn, Cédric Thébault, Philippe-Henri Gosselin, Marco Romeo, Louis Chevallier

Abstract: We present a differentiable ray-tracing based novel face reconstruction approach where scene attributes - 3D geometry, reflectance (diffuse, specular and roughness), pose, camera parameters, and scene illumination - are estimated from unconstrained monocular images. The proposed method models scene illumination via a novel, parameterized virtual light stage, which in-conjunction with differentiabl… ▽ More We present a differentiable ray-tracing based novel face reconstruction approach where scene attributes - 3D geometry, reflectance (diffuse, specular and roughness), pose, camera parameters, and scene illumination - are estimated from unconstrained monocular images. The proposed method models scene illumination via a novel, parameterized virtual light stage, which in-conjunction with differentiable ray-tracing, introduces a coarse-to-fine optimization formulation for face reconstruction. Our method can not only handle unconstrained illumination and self-shadows conditions, but also estimates diffuse and specular albedos. To estimate the face attributes consistently and with practical semantics, a two-stage optimization strategy systematically uses a subset of parametric attributes, where subsequent attribute estimations factor those previously estimated. For example, self-shadows estimated during the first stage, later prevent its baking into the personalized diffuse and specular albedos in the second stage. We show the efficacy of our approach in several real-world scenarios, where face attributes can be estimated even under extreme illumination conditions. Ablation studies, analyses and comparisons against several recent state-of-the-art methods show improved accuracy and versatility of our approach. With consistent face attributes reconstruction, our method leads to several style -- illumination, albedo, self-shadow -- edit and transfer applications, as discussed in the paper. △ Less

Submitted 13 January, 2021; originally announced January 2021.

Comments: 16 pages, 14 figures

MSC Class: 65D19; 68U05 ACM Class: I.4.5; I.4.8; I.3.7

arXiv:2008.08049 [pdf, other]

doi 10.1016/j.mfglet.2020.08.001

Efficient planning of peen-forming patterns via artificial neural networks

Authors: Wassime Siguerdidjane, Farbod Khameneifar, Frédérick P. Gosselin

Abstract: Robust automation of the shot peen forming process demands a closed-loop feedback in which a suitable treatment pattern needs to be found in real-time for each treatment iteration. In this work, we present a method for finding the peen-forming patterns, based on a neural network (NN), which learns the nonlinear function that relates a given target shape (input) to its optimal peening pattern (outp… ▽ More Robust automation of the shot peen forming process demands a closed-loop feedback in which a suitable treatment pattern needs to be found in real-time for each treatment iteration. In this work, we present a method for finding the peen-forming patterns, based on a neural network (NN), which learns the nonlinear function that relates a given target shape (input) to its optimal peening pattern (output), from data generated by finite element simulations. The trained NN yields patterns with an average binary accuracy of 98.8\% with respect to the ground truth in microseconds. △ Less

Submitted 18 August, 2020; originally announced August 2020.

arXiv:1910.05200 [pdf, other]

Face Reflectance and Geometry Modeling via Differentiable Ray Tracing

Authors: Abdallah Dib, Gaurav Bharaj, Junghyun Ahn, Cedric Thebault, Philippe-Henri Gosselin, Louis Chevallier

Abstract: We present a novel strategy to automatically reconstruct 3D faces from monocular images with explicitly disentangled facial geometry (pose, identity and expression), reflectance (diffuse and specular albedo), and self-shadows. The scene lights are modeled as a virtual light stage with pre-oriented area lights used in conjunction with differentiable Monte-Carlo ray tracing to optimize the scene and… ▽ More We present a novel strategy to automatically reconstruct 3D faces from monocular images with explicitly disentangled facial geometry (pose, identity and expression), reflectance (diffuse and specular albedo), and self-shadows. The scene lights are modeled as a virtual light stage with pre-oriented area lights used in conjunction with differentiable Monte-Carlo ray tracing to optimize the scene and face parameters. With correctly disentangled self-shadows and specular reflection parameters, we can not only obtain robust facial geometry reconstruction, but also gain explicit control over these parameters, with several practical applications. We can change facial expressions with accurate resultant self-shadows or relight the scene and obtain accurate specular reflection and several other parameter combinations. △ Less

Submitted 3 October, 2019; originally announced October 2019.

arXiv:1711.09783 [pdf, other]

Data Dependent Kernel Approximation using Pseudo Random Fourier Features

Authors: Bharath Bhushan Damodaran, Nicolas Courty, Philippe-Henri Gosselin

Abstract: Kernel methods are powerful and flexible approach to solve many problems in machine learning. Due to the pairwise evaluations in kernel methods, the complexity of kernel computation grows as the data size increases; thus the applicability of kernel methods is limited for large scale datasets. Random Fourier Features (RFF) has been proposed to scale the kernel method for solving large scale dataset… ▽ More Kernel methods are powerful and flexible approach to solve many problems in machine learning. Due to the pairwise evaluations in kernel methods, the complexity of kernel computation grows as the data size increases; thus the applicability of kernel methods is limited for large scale datasets. Random Fourier Features (RFF) has been proposed to scale the kernel method for solving large scale datasets by approximating kernel function using randomized Fourier features. While this method proved very popular, still it exists shortcomings to be effectively used. As RFF samples the randomized features from a distribution independent of training data, it requires sufficient large number of feature expansions to have similar performances to kernelized classifiers, and this is proportional to the number samples in the dataset. Thus, reducing the number of feature dimensions is necessary to effectively scale to large datasets. In this paper, we propose a kernel approximation method in a data dependent way, coined as Pseudo Random Fourier Features (PRFF) for reducing the number of feature dimensions and also to improve the prediction performance. The proposed approach is evaluated on classification and regression problems and compared with the RFF, orthogonal random features and Nystr{ö}m approach △ Less

Submitted 27 November, 2017; originally announced November 2017.

arXiv:1410.8151 [pdf, other]

doi 10.1109/TIP.2015.2423557

A comparison of dense region detectors for image search and fine-grained classification

Authors: Ahmet Iscen, Giorgos Tolias, Philippe-Henri Gosselin, Hervé Jégou

Abstract: We consider a pipeline for image classification or search based on coding approaches like Bag of Words or Fisher vectors. In this context, the most common approach is to extract the image patches regularly in a dense manner on several scales. This paper proposes and evaluates alternative choices to extract patches densely. Beyond simple strategies derived from regular interest region detectors, we… ▽ More We consider a pipeline for image classification or search based on coding approaches like Bag of Words or Fisher vectors. In this context, the most common approach is to extract the image patches regularly in a dense manner on several scales. This paper proposes and evaluates alternative choices to extract patches densely. Beyond simple strategies derived from regular interest region detectors, we propose approaches based on super-pixels, edges, and a bank of Zernike filters used as detectors. The different approaches are evaluated on recent image retrieval and fine-grain classification benchmarks. Our results show that the regular dense detector is outperformed by other methods in most situations, leading us to improve the state of the art in comparable setups on standard retrieval and fined-grain benchmarks. As a byproduct of our study, we show that existing methods for blob and super-pixel extraction achieve high accuracy if the patches are extracted along the edges and not around the detected regions. △ Less

Submitted 17 April, 2015; v1 submitted 29 October, 2014; originally announced October 2014.

Comments: Accepted to IEEE Transactions on Image Processing

Showing 1–8 of 8 results for author: Gosselin, P