-
Controllable Inversion of Black-Box Face Recognition Models via Diffusion
Authors:
Manuel Kansy,
Anton Raël,
Graziana Mignone,
Jacek Naruniec,
Christopher Schroers,
Markus Gross,
Romann M. Weber
Abstract:
Face recognition models embed a face image into a low-dimensional identity vector containing abstract encodings of identity-specific facial features that allow individuals to be distinguished from one another. We tackle the challenging task of inverting the latent space of pre-trained face recognition models without full model access (i.e., the black-box setting). A variety of methods have been proposed in the literature for this task, but they have serious shortcomings such as a lack of realistic outputs and strong requirements for the data set and accessibility of the face recognition model. By analyzing the black-box inversion problem, we show that the conditional diffusion model loss naturally emerges and that we can effectively sample from the inverse distribution even without an identity-specific loss. Our method, named identity denoising diffusion probabilistic model (ID3PM), leverages the stochastic nature of the denoising diffusion process to produce high-quality, identity-preserving face images with various backgrounds, lighting, poses, and expressions. We demonstrate state-of-the-art performance in terms of identity preservation and diversity both qualitatively and quantitatively, and our method is the first black-box face recognition model inversion method that offers intuitive control over the generation process.
Submitted 30 September, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
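The conditional diffusion loss mentioned in the abstract above can be illustrated with a minimal sketch of the standard DDPM noise-prediction objective, conditioned on an identity vector. This is not the ID3PM implementation: the linear beta schedule, the function names, and the `predict_noise(x_t, t, identity)` callable standing in for a trained network are all assumptions made for illustration.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def conditional_denoising_loss(x0, identity, t, alpha_bar, predict_noise, rng):
    """Mean squared error between sampled noise and the model's noise prediction.

    x0: flattened clean image (list of floats); identity: conditioning vector.
    predict_noise is a hypothetical stand-in for a conditional denoiser.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = alpha_bar[t]
    # Forward process: blend the clean sample with Gaussian noise at step t.
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    eps_hat = predict_noise(x_t, t, identity)
    return sum((e - p) ** 2 for e, p in zip(eps, eps_hat)) / len(x0)
```

The only identity signal in this sketch is the conditioning input passed to the denoiser, so the loss contains no separate identity-specific term.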
-
Augmentation for small object detection
Authors:
Mate Kisantal,
Zbigniew Wojna,
Jakub Murawski,
Jacek Naruniec,
Kyunghyun Cho
Abstract:
In recent years, object detection has experienced impressive progress. Despite these improvements, there is still a significant gap in the performance between the detection of small and large objects. We analyze the current state-of-the-art model, Mask-RCNN, on a challenging dataset, MS COCO. We show that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold. We conjecture this is due to two factors: (1) only a few images contain small objects, and (2) small objects do not appear enough even within each image containing them. We thus propose to oversample those images with small objects and augment each of those images by copy-pasting small objects many times. This allows us to trade off the quality of the detector on large objects with that on small objects. We evaluate different pasting augmentation strategies, and ultimately, we achieve a 9.7% relative improvement on instance segmentation and 7.1% on object detection of small objects, compared to the current state-of-the-art method on MS COCO.
Submitted 19 February, 2019;
originally announced February 2019.
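The copy-paste augmentation described in the abstract above can be sketched in a few lines. This is a simplified illustration, not the authors' code: the image is a plain list of pixel rows, the object is copied as its whole bounding box rather than by its mask, and the rule that pasted copies must not overlap existing boxes is an assumption. The function names are hypothetical.

```python
import random

def boxes_overlap(a, b):
    """True if two (x0, y0, x1, y1) boxes with exclusive max edges intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def copy_paste_small_object(image, box, n_copies, rng, max_tries=100):
    """Paste the patch inside `box` at up to `n_copies` random free locations.

    image: list of rows of pixel values; box: (x0, y0, x1, y1), exclusive max.
    Returns (new_image, boxes); boxes starts with the original box.
    """
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    patch = [row[x0:x1] for row in image[y0:y1]]
    out = [row[:] for row in image]  # copy so the input is left untouched
    boxes = [box]
    for _ in range(n_copies):
        for _ in range(max_tries):
            nx = rng.randint(0, width - w)
            ny = rng.randint(0, height - h)
            candidate = (nx, ny, nx + w, ny + h)
            # Assumed rule: skip locations that would cover an existing object.
            if not any(boxes_overlap(candidate, b) for b in boxes):
                for dy in range(h):
                    out[ny + dy][nx:nx + w] = patch[dy]
                boxes.append(candidate)
                break
    return out, boxes
```

Each pasted copy also yields a new ground-truth box, so a single image contributes more small-object training examples.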
-
Face Alignment Using K-Cluster Regression Forests With Weighted Splitting
Authors:
Marek Kowalski,
Jacek Naruniec
Abstract:
In this work we present a face alignment pipeline based on two novel methods: weighted splitting for K-cluster Regression Forests and 3D Affine Pose Regression for face shape initialization. Our face alignment method is based on the Local Binary Feature framework, where instead of standard regression forests and pixel difference features used in the original method, we use our K-cluster Regression Forests with Weighted Splitting (KRFWS) and Pyramid HOG features. We also use KRFWS to perform Affine Pose Regression (APR) and 3D-Affine Pose Regression (3D-APR), which intend to improve the face shape initialization. APR applies a rigid 2D transform to the initial face shape that compensates for inaccuracy in the initial face location, size and in-plane rotation. 3D-APR estimates the parameters of a 3D transform that additionally compensates for out-of-plane rotation. The resulting pipeline, consisting of APR and 3D-APR followed by face alignment, shows an improvement of 20% over standard LBF on the challenging IBUG dataset, and state-of-the-art accuracy on the entire 300-W dataset.
Submitted 6 June, 2017;
originally announced June 2017.
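The 2D correction that APR applies to the initial face shape can be illustrated as follows. The sketch only shows how a transform with scale, in-plane rotation, and translation acts on landmark coordinates; modeling it as a similarity transform is an assumption based on the abstract's wording, and the regression of the parameters by KRFWS is not shown. The function name is hypothetical.

```python
import math

def apply_similarity_transform(shape, scale, angle_rad, tx, ty):
    """Scale, rotate (counter-clockwise, in-plane), then translate each landmark.

    shape: list of (x, y) landmark coordinates of the initial face shape.
    """
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    return [
        (scale * (cos_a * x - sin_a * y) + tx,
         scale * (sin_a * x + cos_a * y) + ty)
        for x, y in shape
    ]
```

Scale corrects the face size, the rotation angle corrects in-plane rotation, and the translation corrects the face location.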
-
Deep Alignment Network: A convolutional neural network for robust face alignment
Authors:
Marek Kowalski,
Jacek Naruniec,
Tomasz Trzcinski
Abstract:
In this paper, we propose Deep Alignment Network (DAN), a robust face alignment method based on a deep neural network architecture. DAN consists of multiple stages, where each stage improves the locations of the facial landmarks estimated by the previous stage. Our method uses entire face images at all stages, contrary to the recently proposed face alignment methods that rely on local patches. This is possible thanks to the use of landmark heatmaps which provide visual information about landmark locations estimated at the previous stages of the algorithm. The use of entire face images rather than patches allows DAN to handle face images with large variation in head pose and difficult initializations. An extensive evaluation on two publicly available datasets shows that DAN reduces the state-of-the-art failure rate by up to 70%. Our method has also been submitted for evaluation as part of the Menpo challenge.
Submitted 10 August, 2017; v1 submitted 6 June, 2017;
originally announced June 2017.
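A landmark heatmap of the kind mentioned in the DAN abstract can be sketched as an image whose intensity is highest at the landmarks estimated so far and decays with distance from them. The specific form used here, 1 / (1 + distance to the nearest landmark), is an assumption for illustration, as is the function name; the abstract does not specify the formula.

```python
import math

def landmark_heatmap(landmarks, width, height):
    """Return heat[y][x] = 1 / (1 + distance from pixel (x, y) to nearest landmark).

    landmarks: non-empty list of (x, y) positions in pixel coordinates.
    """
    heat = []
    for y in range(height):
        row = []
        for x in range(width):
            nearest = min(math.hypot(x - lx, y - ly) for lx, ly in landmarks)
            row.append(1.0 / (1.0 + nearest))
        heat.append(row)
    return heat
```

Passing such a map to the next stage alongside the full face image gives that stage visual access to the previous landmark estimates without cropping local patches.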