Search | arXiv e-print repository

Revisiting Outer Optimization in Adversarial Training

Authors: Ali Dabouei, Fariborz Taherkhani, Sobhan Soleymani, Nasser M. Nasrabadi

Abstract: Despite the fundamental distinction between adversarial and natural training (AT and NT), AT methods generally adopt momentum SGD (MSGD) for the outer optimization. This paper aims to analyze this choice by investigating the overlooked role of outer optimization in AT. Our exploratory evaluations reveal that AT induces higher gradient norm and variance compared to NT. This phenomenon hinders the o… ▽ More Despite the fundamental distinction between adversarial and natural training (AT and NT), AT methods generally adopt momentum SGD (MSGD) for the outer optimization. This paper aims to analyze this choice by investigating the overlooked role of outer optimization in AT. Our exploratory evaluations reveal that AT induces higher gradient norm and variance compared to NT. This phenomenon hinders the outer optimization in AT since the convergence rate of MSGD is highly dependent on the variance of the gradients. To this end, we propose an optimization method called ENGM which regularizes the contribution of each input example to the average mini-batch gradients. We prove that the convergence rate of ENGM is independent of the variance of the gradients, and thus, it is suitable for AT. We introduce a trick to reduce the computational cost of ENGM using empirical observations on the correlation between the norm of gradients w.r.t. the network parameters and input examples. Our extensive evaluations and ablation studies on CIFAR-10, CIFAR-100, and TinyImageNet demonstrate that ENGM and its variants consistently improve the performance of a wide range of AT methods. Furthermore, ENGM alleviates major shortcomings of AT including robust overfitting and high sensitivity to hyperparameter settings. △ Less

Submitted 2 September, 2022; originally announced September 2022.

arXiv:2208.14263 [pdf]

Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance

Authors: Fariborz Taherkhani, Aashish Rai, Quankai Gao, Shaunak Srivastava, Xuanbai Chen, Fernando de la Torre, Steven Song, Aayush Prakash, Daeil Kim

Abstract: 3D face modeling has been an active area of research in computer vision and computer graphics, fueling applications ranging from facial expression transfer in virtual avatars to synthetic data generation. Existing 3D deep learning generative models (e.g., VAE, GANs) allow generating compact face representations (both shape and texture) that can model non-linearities in the shape and appearance spa… ▽ More 3D face modeling has been an active area of research in computer vision and computer graphics, fueling applications ranging from facial expression transfer in virtual avatars to synthetic data generation. Existing 3D deep learning generative models (e.g., VAE, GANs) allow generating compact face representations (both shape and texture) that can model non-linearities in the shape and appearance space (e.g., scatter effects, specularities, etc.). However, they lack the capability to control the generation of subtle expressions. This paper proposes a new 3D face generative model that can decouple identity and expression and provides granular control over expressions. In particular, we propose using a pair of supervised auto-encoder and generative adversarial networks to produce high-quality 3D faces, both in terms of appearance and shape. Experimental results in the generation of 3D faces learned with holistic expression labels, or Action Unit labels, show how we can decouple identity and expression; gaining fine-control over expressions while preserving identity. △ Less

Submitted 30 August, 2022; originally announced August 2022.

Comments: 8 Pages

arXiv:2112.05827 [pdf, other]

Quality-Aware Multimodal Biometric Recognition

Authors: Sobhan Soleymani, Ali Dabouei, Fariborz Taherkhani, Seyed Mehdi Iranmanesh, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: We present a quality-aware multimodal recognition framework that combines representations from multiple biometric traits with varying quality and number of samples to achieve increased recognition accuracy by extracting complimentary identification information based on the quality of the samples. We develop a quality-aware framework for fusing representations of input modalities by weighting their… ▽ More We present a quality-aware multimodal recognition framework that combines representations from multiple biometric traits with varying quality and number of samples to achieve increased recognition accuracy by extracting complimentary identification information based on the quality of the samples. We develop a quality-aware framework for fusing representations of input modalities by weighting their importance using quality scores estimated in a weakly-supervised fashion. This framework utilizes two fusion blocks, each represented by a set of quality-aware and aggregation networks. In addition to architecture modifications, we propose two task-specific loss functions: multimodal separability loss and multimodal compactness loss. The first loss assures that the representations of modalities for a class have comparable magnitudes to provide a better quality estimation, while the multimodal representations of different classes are distributed to achieve maximum discrimination in the embedding space. The second loss, which is considered to regularize the network weights, improves the generalization performance by regularizing the framework. We evaluate the performance by considering three multimodal datasets consisting of face, iris, and fingerprint modalities. The efficacy of the framework is demonstrated through comparison with the state-of-the-art algorithms. In particular, our framework outperforms the rank- and score-level fusion of modalities of BIOMDATA by more than 30% for true acceptance rate at false acceptance rate of $10^{-4}$. △ Less

Submitted 10 December, 2021; originally announced December 2021.

Comments: IEEE Transactions on Biometrics, Behavior, and Identity Science

arXiv:2108.04353 [pdf, other]

Tasks Structure Regularization in Multi-Task Learning for Improving Facial Attribute Prediction

Authors: Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: The great success of Convolutional Neural Networks (CNN) for facial attribute prediction relies on a large amount of labeled images. Facial image datasets are usually annotated by some commonly used attributes (e.g., gender), while labels for the other attributes (e.g., big nose) are limited which causes their prediction challenging. To address this problem, we use a new Multi-Task Learning (MTL)… ▽ More The great success of Convolutional Neural Networks (CNN) for facial attribute prediction relies on a large amount of labeled images. Facial image datasets are usually annotated by some commonly used attributes (e.g., gender), while labels for the other attributes (e.g., big nose) are limited which causes their prediction challenging. To address this problem, we use a new Multi-Task Learning (MTL) paradigm in which a facial attribute predictor uses the knowledge of other related attributes to obtain a better generalization performance. Here, we leverage MLT paradigm in two problem settings. First, it is assumed that the structure of the tasks (e.g., grou** pattern of facial attributes) is known as a prior knowledge, and parameters of the tasks (i.e., predictors) within the same group are represented by a linear combination of a limited number of underlying basis tasks. Here, a sparsity constraint on the coefficients of this linear combination is also considered such that each task is represented in a more structured and simpler manner. Second, it is assumed that the structure of the tasks is unknown, and then structure and parameters of the tasks are learned jointly by using a Laplacian regularization framework. Our MTL methods are compared with competing methods for facial attribute prediction to show its effectiveness. △ Less

Submitted 17 August, 2021; v1 submitted 29 July, 2021; originally announced August 2021.

arXiv:2108.04352 [pdf, other]

Attribute Guided Sparse Tensor-Based Model for Person Re-Identification

Authors: Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: Visual perception of a person is easily influenced by many factors such as camera parameters, pose and viewpoint variations. These variations make person Re-Identification (ReID) a challenging problem. Nevertheless, human attributes usually stand as robust visual properties to such variations. In this paper, we propose a new method to leverage features from human attributes for person ReID. Our mo… ▽ More Visual perception of a person is easily influenced by many factors such as camera parameters, pose and viewpoint variations. These variations make person Re-Identification (ReID) a challenging problem. Nevertheless, human attributes usually stand as robust visual properties to such variations. In this paper, we propose a new method to leverage features from human attributes for person ReID. Our model uses a tensor to non-linearly fuse identity and attribute features, and then forces the parameters of the tensor in the loss function to generate discriminative fused features for ReID. Since tensor-based methods usually contain a large number of parameters, training all of these parameters becomes very slow, and the chance of overfitting increases as well. To address this issue, we propose two new techniques based on Structural Sparsity Learning (SSL) and Tensor Decomposition (TD) methods to create an accurate and stable learning problem. We conducted experiments on several standard pedestrian datasets, and experimental results indicate that our tensor-based approach significantly improves person ReID baselines and also outperforms state of the art methods. △ Less

Submitted 29 July, 2021; originally announced August 2021.

arXiv:2107.13742 [pdf, other]

Profile to Frontal Face Recognition in the Wild Using Coupled Conditional GAN

Authors: Fariborz Taherkhani, Veeru Talreja, Jeremy Dawson, Matthew C. Valenti, Nasser M. Nasrabadi

Abstract: In recent years, with the advent of deep-learning, face recognition has achieved exceptional success. However, many of these deep face recognition models perform much better in handling frontal faces compared to profile faces. The major reason for poor performance in handling of profile faces is that it is inherently difficult to learn pose-invariant deep representations that are useful for profil… ▽ More In recent years, with the advent of deep-learning, face recognition has achieved exceptional success. However, many of these deep face recognition models perform much better in handling frontal faces compared to profile faces. The major reason for poor performance in handling of profile faces is that it is inherently difficult to learn pose-invariant deep representations that are useful for profile face recognition. In this paper, we hypothesize that the profile face domain possesses a latent connection with the frontal face domain in a latent feature subspace. We look to exploit this latent connection by projecting the profile faces and frontal faces into a common latent subspace and perform verification or retrieval in the latent domain. We leverage a coupled conditional generative adversarial network (cpGAN) structure to find the hidden relationship between the profile and frontal images in a latent common embedding subspace. Specifically, the cpGAN framework consists of two conditional GAN-based sub-networks, one dedicated to the frontal domain and the other dedicated to the profile domain. Each sub-network tends to find a projection that maximizes the pair-wise correlation between the two feature domains in a common embedding feature subspace. The efficacy of our approach compared with the state-of-the-art is demonstrated using the CFP, CMU Multi-PIE, IJB-A, and IJB-C datasets. Additionally, we have also implemented a coupled convolutional neural network (cpCNN) and an adversarial discriminative domain adaptation network (ADDA) for profile to frontal face recognition. We have evaluated the performance of cpCNN and ADDA and compared it with the proposed cpGAN. Finally, we have also evaluated our cpGAN for reconstruction of frontal faces from input profile faces contained in the VGGFace2 dataset. △ Less

Submitted 29 July, 2021; originally announced July 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2005.02166

arXiv:2012.03790 [pdf, other]

Matching Distributions via Optimal Transport for Semi-Supervised Learning

Authors: Fariborz Taherkhani, Hadi Kazemi, Ali Dabouei, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: Semi-Supervised Learning (SSL) approaches have been an influential framework for the usage of unlabeled data when there is not a sufficient amount of labeled data available over the course of training. SSL methods based on Convolutional Neural Networks (CNNs) have recently provided successful results on standard benchmark tasks such as image classification. In this work, we consider the general se… ▽ More Semi-Supervised Learning (SSL) approaches have been an influential framework for the usage of unlabeled data when there is not a sufficient amount of labeled data available over the course of training. SSL methods based on Convolutional Neural Networks (CNNs) have recently provided successful results on standard benchmark tasks such as image classification. In this work, we consider the general setting of SSL problem where the labeled and unlabeled data come from the same underlying probability distribution. We propose a new approach that adopts an Optimal Transport (OT) technique serving as a metric of similarity between discrete empirical probability measures to provide pseudo-labels for the unlabeled data, which can then be used in conjunction with the initial labeled data to train the CNN model in an SSL manner. We have evaluated and compared our proposed method with state-of-the-art SSL algorithms on standard datasets to demonstrate the superiority and effectiveness of our SSL algorithm. △ Less

Submitted 21 October, 2021; v1 submitted 4 December, 2020; originally announced December 2020.

arXiv:2012.01542 [pdf, other]

Mutual Information Maximization on Disentangled Representations for Differential Morph Detection

Authors: Sobhan Soleymani, Ali Dabouei, Fariborz Taherkhani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: In this paper, we present a novel differential morph detection framework, utilizing landmark and appearance disentanglement. In our framework, the face image is represented in the embedding domain using two disentangled but complementary representations. The network is trained by triplets of face images, in which the intermediate image inherits the landmarks from one image and the appearance from… ▽ More In this paper, we present a novel differential morph detection framework, utilizing landmark and appearance disentanglement. In our framework, the face image is represented in the embedding domain using two disentangled but complementary representations. The network is trained by triplets of face images, in which the intermediate image inherits the landmarks from one image and the appearance from the other image. This initially trained network is further trained for each dataset using contrastive representations. We demonstrate that, by employing appearance and landmark disentanglement, the proposed framework can provide state-of-the-art differential morph detection performance. This functionality is achieved by the using distances in landmark, appearance, and ID domains. The performance of the proposed framework is evaluated using three morph datasets generated with different methodologies. △ Less

Submitted 2 December, 2020; originally announced December 2020.

Comments: IEEE Winter Conference on Applications of Computer Vision (WACV 2021)

arXiv:2010.11689 [pdf, other]

Cross-Spectral Iris Matching Using Conditional Coupled GAN

Authors: Moktari Mostofa, Fariborz Taherkhani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: Cross-spectral iris recognition is emerging as a promising biometric approach to authenticating the identity of individuals. However, matching iris images acquired at different spectral bands shows significant performance degradation when compared to single-band near-infrared (NIR) matching due to the spectral gap between iris images obtained in the NIR and visual-light (VIS) spectra. Although res… ▽ More Cross-spectral iris recognition is emerging as a promising biometric approach to authenticating the identity of individuals. However, matching iris images acquired at different spectral bands shows significant performance degradation when compared to single-band near-infrared (NIR) matching due to the spectral gap between iris images obtained in the NIR and visual-light (VIS) spectra. Although researchers have recently focused on deep-learning-based approaches to recover invariant representative features for more accurate recognition performance, the existing methods cannot achieve the expected accuracy required for commercial applications. Hence, in this paper, we propose a conditional coupled generative adversarial network (CpGAN) architecture for cross-spectral iris recognition by projecting the VIS and NIR iris images into a low-dimensional embedding domain to explore the hidden relationship between them. The conditional CpGAN framework consists of a pair of GAN-based networks, one responsible for retrieving images in the visible domain and other responsible for retrieving images in the NIR domain. Both networks try to map the data into a common embedding subspace to ensure maximum pair-wise similarity between the feature vectors from the two iris modalities of the same subject. To prove the usefulness of our proposed approach, extensive experimental results obtained on the PolyU dataset are compared to existing state-of-the-art cross-spectral recognition methods. △ Less

Submitted 9 October, 2020; originally announced October 2020.

Comments: International Joint Conference on Biometrics (IJCB-2020)

arXiv:2005.02166 [pdf, other]

PF-cpGAN: Profile to Frontal Coupled GAN for Face Recognition in the Wild

Authors: Fariborz Taherkhani, Veeru Talreja, Jeremy Dawson, Matthew C. Valenti, Nasser M. Nasrabadi

Abstract: In recent years, due to the emergence of deep learning, face recognition has achieved exceptional success. However, many of these deep face recognition models perform relatively poorly in handling profile faces compared to frontal faces. The major reason for this poor performance is that it is inherently difficult to learn large pose invariant deep representations that are useful for profile face… ▽ More In recent years, due to the emergence of deep learning, face recognition has achieved exceptional success. However, many of these deep face recognition models perform relatively poorly in handling profile faces compared to frontal faces. The major reason for this poor performance is that it is inherently difficult to learn large pose invariant deep representations that are useful for profile face recognition. In this paper, we hypothesize that the profile face domain possesses a gradual connection with the frontal face domain in the deep feature space. We look to exploit this connection by projecting the profile faces and frontal faces into a common latent space and perform verification or retrieval in the latent domain. We leverage a coupled generative adversarial network (cpGAN) structure to find the hidden relationship between the profile and frontal images in a latent common embedding subspace. Specifically, the cpGAN framework consists of two GAN-based sub-networks, one dedicated to the frontal domain and the other dedicated to the profile domain. Each sub-network tends to find a projection that maximizes the pair-wise correlation between two feature domains in a common embedding feature subspace. The efficacy of our approach compared with the state-of-the-art is demonstrated using the CFP, CMU MultiPIE, IJB-A, and IJB-C datasets. △ Less

Submitted 25 April, 2020; originally announced May 2020.

arXiv:2004.03378 [pdf, other]

Error-Corrected Margin-Based Deep Cross-Modal Hashing for Facial Image Retrieval

Authors: Fariborz Taherkhani, Veeru Talreja, Matthew C. Valenti, Nasser M. Nasrabadi

Abstract: Cross-modal hashing facilitates map** of heterogeneous multimedia data into a common Hamming space, which can beutilized for fast and flexible retrieval across different modalities. In this paper, we propose a novel cross-modal hashingarchitecture-deep neural decoder cross-modal hashing (DNDCMH), which uses a binary vector specifying the presence of certainfacial attributes as an input query to… ▽ More Cross-modal hashing facilitates map** of heterogeneous multimedia data into a common Hamming space, which can beutilized for fast and flexible retrieval across different modalities. In this paper, we propose a novel cross-modal hashingarchitecture-deep neural decoder cross-modal hashing (DNDCMH), which uses a binary vector specifying the presence of certainfacial attributes as an input query to retrieve relevant face images from a database. The DNDCMH network consists of two separatecomponents: an attribute-based deep cross-modal hashing (ADCMH) module, which uses a margin (m)-based loss function toefficiently learn compact binary codes to preserve similarity between modalities in the Hamming space, and a neural error correctingdecoder (NECD), which is an error correcting decoder implemented with a neural network. The goal of NECD network in DNDCMH isto error correct the hash codes generated by ADCMH to improve the retrieval efficiency. The NECD network is trained such that it hasan error correcting capability greater than or equal to the margin (m) of the margin-based loss function. This results in NECD cancorrect the corrupted hash codes generated by ADCMH up to the Hamming distance of m. We have evaluated and comparedDNDCMH with state-of-the-art cross-modal hashing methods on standard datasets to demonstrate the superiority of our method. △ Less

Submitted 3 April, 2020; originally announced April 2020.

Comments: IEEE TRANSACTIONS ON BIOMETRICS, BEHAVIOR, AND IDENTITY SCIENCE, 2020

arXiv:2003.05034 [pdf, other]

SuperMix: Supervising the Mixing Data Augmentation

Authors: Ali Dabouei, Sobhan Soleymani, Fariborz Taherkhani, Nasser M. Nasrabadi

Abstract: This paper presents a supervised mixing augmentation method termed SuperMix, which exploits the salient regions within input images to construct mixed training samples. SuperMix is designed to obtain mixed images rich in visual features and complying with realistic image priors. To enhance the efficiency of the algorithm, we develop a variant of the Newton iterative method, $65\times$ faster than… ▽ More This paper presents a supervised mixing augmentation method termed SuperMix, which exploits the salient regions within input images to construct mixed training samples. SuperMix is designed to obtain mixed images rich in visual features and complying with realistic image priors. To enhance the efficiency of the algorithm, we develop a variant of the Newton iterative method, $65\times$ faster than gradient descent on this problem. We validate the effectiveness of SuperMix through extensive evaluations and ablation studies on two tasks of object classification and knowledge distillation. On the classification task, SuperMix provides comparable performance to the advanced augmentation methods, such as AutoAugment and RandAugment. In particular, combining SuperMix with RandAugment achieves 78.2\% top-1 accuracy on ImageNet with ResNet50. On the distillation task, solely classifying images mixed using the teacher's knowledge achieves comparable performance to the state-of-the-art distillation methods. Furthermore, on average, incorporating mixed images into the distillation objective improves the performance by 3.4\% and 3.1\% on CIFAR-100 and ImageNet, respectively. {\it The code is available at https://github.com/alldbi/SuperMix}. △ Less

Submitted 10 December, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

arXiv:2001.04559 [pdf, other]

Boosting Deep Face Recognition via Disentangling Appearance and Geometry

Authors: Ali Dabouei, Fariborz Taherkhani, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: In this paper, we propose a framework for disentangling the appearance and geometry representations in the face recognition task. To provide supervision for this aim, we generate geometrically identical faces by incorporating spatial transformations. We demonstrate that the proposed approach enhances the performance of deep face recognition models by assisting the training process in two ways. Fir… ▽ More In this paper, we propose a framework for disentangling the appearance and geometry representations in the face recognition task. To provide supervision for this aim, we generate geometrically identical faces by incorporating spatial transformations. We demonstrate that the proposed approach enhances the performance of deep face recognition models by assisting the training process in two ways. First, it enforces the early and intermediate convolutional layers to learn more representative features that satisfy the properties of disentangled embeddings. Second, it augments the training set by altering faces geometrically. Through extensive experiments, we demonstrate that integrating the proposed approach into state-of-the-art face recognition methods effectively improves their performance on challenging datasets, such as LFW, YTF, and MegaFace. Both theoretical and practical aspects of the method are analyzed rigorously by concerning ablation studies and knowledge transfer tasks. Furthermore, we show that the knowledge leaned by the proposed method can favor other face-related tasks, such as attribute prediction. △ Less

Submitted 13 January, 2020; originally announced January 2020.

Comments: WACV 2020

arXiv:1910.03624 [pdf, other]

SmoothFool: An Efficient Framework for Computing Smooth Adversarial Perturbations

Authors: Ali Dabouei, Sobhan Soleymani, Fariborz Taherkhani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: Deep neural networks are susceptible to adversarial manipulations in the input domain. The extent of vulnerability has been explored intensively in cases of $\ell_p$-bounded and $\ell_p$-minimal adversarial perturbations. However, the vulnerability of DNNs to adversarial perturbations with specific statistical properties or frequency-domain characteristics has not been sufficiently explored. In th… ▽ More Deep neural networks are susceptible to adversarial manipulations in the input domain. The extent of vulnerability has been explored intensively in cases of $\ell_p$-bounded and $\ell_p$-minimal adversarial perturbations. However, the vulnerability of DNNs to adversarial perturbations with specific statistical properties or frequency-domain characteristics has not been sufficiently explored. In this paper, we study the smoothness of perturbations and propose SmoothFool, a general and computationally efficient framework for computing smooth adversarial perturbations. Through extensive experiments, we validate the efficacy of the proposed method for both the white-box and black-box attack scenarios. In particular, we demonstrate that: (i) there exist extremely smooth adversarial perturbations for well-established and widely used network architectures, (ii) smoothness significantly enhances the robustness of perturbations against state-of-the-art defense mechanisms, (iii) smoothness improves the transferability of adversarial perturbations across both data points and network architectures, and (iv) class categories exhibit a variable range of susceptibility to smooth perturbations. Our results suggest that smooth APs can play a significant role in exploring the vulnerability extent of DNNs to adversarial examples. △ Less

Submitted 8 October, 2019; originally announced October 2019.

arXiv:1909.08130 [pdf, other]

Identity-Aware Deep Face Hallucination via Adversarial Face Verification

Authors: Hadi Kazemi, Fariborz Taherkhani, Nasser M. Nasrabadi

Abstract: In this paper, we address the problem of face hallucination by proposing a novel multi-scale generative adversarial network (GAN) architecture optimized for face verification. First, we propose a multi-scale generator architecture for face hallucination with a high up-scaling ratio factor, which has multiple intermediate outputs at different resolutions. The intermediate outputs have the growing g… ▽ More In this paper, we address the problem of face hallucination by proposing a novel multi-scale generative adversarial network (GAN) architecture optimized for face verification. First, we propose a multi-scale generator architecture for face hallucination with a high up-scaling ratio factor, which has multiple intermediate outputs at different resolutions. The intermediate outputs have the growing goal of synthesizing small to large images. Second, we incorporate a face verifier with the original GAN discriminator and propose a novel discriminator which learns to discriminate different identities while distinguishing fake generated HR face images from their ground truth images. In particular, the learned generator cares for not only the visual quality of hallucinated face images but also preserving the discriminative features in the hallucination process. In addition, to capture perceptually relevant differences we employ a perceptual similarity loss, instead of similarity in pixel space. We perform a quantitative and qualitative evaluation of our framework on the LFW and CelebA datasets. The experimental results show the advantages of our proposed method against the state-of-the-art methods on the 8x downsampled testing dataset. △ Less

Submitted 17 September, 2019; originally announced September 2019.

Comments: BTAS 2019

arXiv:1908.09630 [pdf, other]

Deep Sparse Band Selection for Hyperspectral Face Recognition

Authors: Fariborz Taherkhani, Jeremy Dawson, Nasser M. Nasrabadi

Abstract: Hyperspectral imaging systems collect and process information from specific wavelengths across the electromagnetic spectrum. The fusion of multi-spectral bands in the visible spectrum has been exploited to improve face recognition performance over all the conventional broad band face images. In this book chapter, we propose a new Convolutional Neural Network (CNN) framework which adopts a structur… ▽ More Hyperspectral imaging systems collect and process information from specific wavelengths across the electromagnetic spectrum. The fusion of multi-spectral bands in the visible spectrum has been exploited to improve face recognition performance over all the conventional broad band face images. In this book chapter, we propose a new Convolutional Neural Network (CNN) framework which adopts a structural sparsity learning technique to select the optimal spectral bands to obtain the best face recognition performance over all of the spectral bands. Specifically, in this method, images from all bands are fed to a CNN, and the convolutional filters in the first layer of the CNN are then regularized by employing a group Lasso algorithm to zero out the redundant bands during the training of the network. Contrary to other methods which usually select the useful bands manually or in a greedy fashion, our method selects the optimal spectral bands automatically to achieve the best face recognition performance over all spectral bands. Moreover, experimental results demonstrate that our method outperforms state of the art band selection methods for face recognition on several publicly-available hyperspectral face image datasets. △ Less

Submitted 15 August, 2019; originally announced August 2019.

arXiv:1908.01790 [pdf, other]

Attribute-Guided Coupled GAN for Cross-Resolution Face Recognition

Authors: Veeru Talreja, Fariborz Taherkhani, Matthew C Valenti, Nasser M Nasrabadi

Abstract: In this paper, we propose a novel attribute-guided cross-resolution (low-resolution to high-resolution) face recognition framework that leverages a coupled generative adversarial network (GAN) structure with adversarial training to find the hidden relationship between the low-resolution and high-resolution images in a latent common embedding subspace. The coupled GAN framework consists of two sub-… ▽ More In this paper, we propose a novel attribute-guided cross-resolution (low-resolution to high-resolution) face recognition framework that leverages a coupled generative adversarial network (GAN) structure with adversarial training to find the hidden relationship between the low-resolution and high-resolution images in a latent common embedding subspace. The coupled GAN framework consists of two sub-networks, one dedicated to the low-resolution domain and the other dedicated to the high-resolution domain. Each sub-network aims to find a projection that maximizes the pair-wise correlation between the two feature domains in a common embedding subspace. In addition to projecting the images into a common subspace, the coupled network also predicts facial attributes to improve the cross-resolution face recognition. Specifically, our proposed coupled framework exploits facial attributes to further maximize the pair-wise correlation by implicitly matching facial attributes of the low and high-resolution images during the training, which leads to a more discriminative embedding subspace resulting in performance enhancement for cross-resolution face recognition. The efficacy of our approach compared with the state-of-the-art is demonstrated using the LFWA, Celeb-A, SCFace and UCCS datasets. △ Less

Submitted 5 August, 2019; originally announced August 2019.

arXiv:1902.04139 [pdf, other]

Using Deep Cross Modal Hashing and Error Correcting Codes for Improving the Efficiency of Attribute Guided Facial Image Retrieval

Authors: Veeru Talreja, Fariborz Taherkhani, Matthew C. Valenti, Nasser M. Nasrabadi

Abstract: With benefits of fast query speed and low storage cost, hashing-based image retrieval approaches have garnered considerable attention from the research community. In this paper, we propose a novel Error-Corrected Deep Cross Modal Hashing (CMH-ECC) method which uses a bitmap specifying the presence of certain facial attributes as an input query to retrieve relevant face images from the database. In… ▽ More With benefits of fast query speed and low storage cost, hashing-based image retrieval approaches have garnered considerable attention from the research community. In this paper, we propose a novel Error-Corrected Deep Cross Modal Hashing (CMH-ECC) method which uses a bitmap specifying the presence of certain facial attributes as an input query to retrieve relevant face images from the database. In this architecture, we generate compact hash codes using an end-to-end deep learning module, which effectively captures the inherent relationships between the face and attribute modality. We also integrate our deep learning module with forward error correction codes to further reduce the distance between different modalities of the same subject. Specifically, the properties of deep hashing and forward error correction codes are exploited to design a cross modal hashing framework with high retrieval performance. Experimental results using two standard datasets with facial attributes-image modalities indicate that our CMH-ECC face image retrieval model outperforms most of the current attribute-based face image retrieval approaches. △ Less

Submitted 11 February, 2019; originally announced February 2019.

Comments: To be published in Proc. IEEE Global SIP 2018

arXiv:1811.11979 [pdf, other]

Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound

Authors: Hadi Kazemi, Sobhan Soleymani, Fariborz Taherkhani, Seyed Mehdi Iranmanesh, Nasser M. Nasrabadi

Abstract: Unsupervised image-to-image translation is a class of computer vision problems which aims at modeling conditional distribution of images in the target domain, given a set of unpaired images in the source and target domains. An image in the source domain might have multiple representations in the target domain. Therefore, ambiguity in modeling of the conditional distribution arises, specially when… ▽ More Unsupervised image-to-image translation is a class of computer vision problems which aims at modeling conditional distribution of images in the target domain, given a set of unpaired images in the source and target domains. An image in the source domain might have multiple representations in the target domain. Therefore, ambiguity in modeling of the conditional distribution arises, specially when the images in the source and target domains come from different modalities. Current approaches mostly rely on simplifying assumptions to map both domains into a shared-latent space. Consequently, they are only able to model the domain-invariant information between the two modalities. These approaches usually fail to model domain-specific information which has no representation in the target domain. In this work, we propose an unsupervised image-to-image translation framework which maximizes a domain-specific variational information bound and learns the target domain-invariant representation of the two domain. The proposed framework makes it possible to map a single source image into multiple images in the target domain, utilizing several target domain-specific codes sampled randomly from the prior distribution, or extracted from reference images. △ Less

Submitted 29 November, 2018; originally announced November 2018.

Comments: NIPS 2018

arXiv:1810.05361 [pdf, other]

Unsupervised Facial Geometry Learning for Sketch to Photo Synthesis

Authors: Hadi Kazemi, Fariborz Taherkhani, Nasser M. Nasrabadi

Abstract: Face sketch-photo synthesis is a critical application in law enforcement and digital entertainment industry where the goal is to learn the map** between a face sketch image and its corresponding photo-realistic image. However, the limited number of paired sketch-photo training data usually prevents the current frameworks to learn a robust map** between the geometry of sketches and their matchi… ▽ More Face sketch-photo synthesis is a critical application in law enforcement and digital entertainment industry where the goal is to learn the map** between a face sketch image and its corresponding photo-realistic image. However, the limited number of paired sketch-photo training data usually prevents the current frameworks to learn a robust map** between the geometry of sketches and their matching photo-realistic images. Consequently, in this work, we present an approach for learning to synthesize a photo-realistic image from a face sketch in an unsupervised fashion. In contrast to current unsupervised image-to-image translation techniques, our framework leverages a novel perceptual discriminator to learn the geometry of human face. Learning facial prior information empowers the network to remove the geometrical artifacts in the face sketch. We demonstrate that a simultaneous optimization of the face photo generator network, employing the proposed perceptual discriminator in combination with a texture-wise discriminator, results in a significant improvement in quality and recognition rate of the synthesized photos. We evaluate the proposed network by conducting extensive experiments on multiple baseline sketch-photo datasets. △ Less

Submitted 12 October, 2018; originally announced October 2018.

Comments: Published as a conference paper in BIOSIG 2018

arXiv:1805.00324 [pdf, other]

A Deep Face Identification Network Enhanced by Facial Attributes Prediction

Authors: Fariborz Taherkhani, Nasser M. Nasrabadi, Jeremy Dawson

Abstract: In this paper, we propose a new deep framework which predicts facial attributes and leverage it as a soft modality to improve face identification performance. Our model is an end to end framework which consists of a convolutional neural network (CNN) whose output is fanned out into two separate branches; the first branch predicts facial attributes while the second branch identifies face images. Co… ▽ More In this paper, we propose a new deep framework which predicts facial attributes and leverage it as a soft modality to improve face identification performance. Our model is an end to end framework which consists of a convolutional neural network (CNN) whose output is fanned out into two separate branches; the first branch predicts facial attributes while the second branch identifies face images. Contrary to the existing multi-task methods which only use a shared CNN feature space to train these two tasks jointly, we fuse the predicted attributes with the features from the face modality in order to improve the face identification performance. Experimental results show that our model brings benefits to both face identification as well as facial attribute prediction performance, especially in the case of identity facial attributes such as gender prediction. We tested our model on two standard datasets annotated by identities and face attributes. Experimental results indicate that the proposed model outperforms most of the current existing face identification and attribute prediction methods. △ Less

Submitted 20 April, 2018; originally announced May 2018.

arXiv:1706.00083

Blood capillaries and vessels segmentation in optical coherence tomography angiogram using fuzzy C-means and Curvelet transform

Authors: Fariborz Taherkhani

Abstract: This paper has been removed from arXiv as the submitter did not have ownership of the data presented in this work. This paper has been removed from arXiv as the submitter did not have ownership of the data presented in this work. △ Less

Submitted 31 May, 2017; originally announced June 2017.

Comments: arXiv admin note: This paper has been removed from arXiv as the submitter did not have ownership of the data presented in this work

arXiv:1607.06824 [pdf]

Permutation, Multiscale and Modified Multiscale Entropies a Natural Complexity for Low-High Infection Level Intracellular Viral Reaction Kinetics

Authors: Fariborz Taherkhani, Farid Taherkhani

Abstract: Viral infectious diseases, such as HIV virus growth, cause an important health concern. Study of intracellular viral processes can provide us to develop drug and understanding the drug dose to decrease the HIV virus in during growth. Kinetics Monte Carlo simulation has been done for solving Master equation about dynamics of intracellular viral reaction kinetics. Scaling relationship between equili… ▽ More Viral infectious diseases, such as HIV virus growth, cause an important health concern. Study of intracellular viral processes can provide us to develop drug and understanding the drug dose to decrease the HIV virus in during growth. Kinetics Monte Carlo simulation has been done for solving Master equation about dynamics of intracellular viral reaction kinetics. Scaling relationship between equilibrium time and initial population of template has been found as power low, , where N , are the number of initial population of template species , equilibrium time, a = 163.1 , b = -0.1429 respectively. Stochastic dynamics shows that increasing initial population of template decreases the time of equilibrium. Entropy generation has been considered in low, intermediate and high infection level of intracellular viral kinetics reaction in during dynamical process. Permutation, multi scaling and modified multiscaling entropies have been calculated for three kinds of species in intracellular reaction dynamics, genome, structural protein, and template. Our result shows that presence of noise in dynamical process of intracellular reaction will change order of permutation entropy for the mentioned of three species. In addition to multiscaling entropy is computed for mentioned model and it has the following order: template > structural protein> genome. Dependency of permutation entropy result to permutation order becomes small in high infection level in intracellular viral kinetics dynamics. At short time scale in intracellular reaction dynamics, convergency of permutation entropy occurs with medium permutation order value. In the big time scale of intracellular dynamics, permutation entropy scale with permutation order n as a scaling relation . △ Less

Submitted 28 March, 2017; v1 submitted 22 July, 2016; originally announced July 2016.

arXiv:1607.06803 [pdf]

Restoring highly corrupted images by impulse noise using radial basis functions interpolation

Authors: Fariborz Taherkhani, Mansour Jamzad

Abstract: Preserving details in restoring images highly corrupted by impulse noise remains a challenging problem. We proposed an algorithm based on radial basis functions (RBF) interpolation which estimates the intensities of corrupted pixels by their neighbors. In this algorithm, first intensity values of noisy pixels in the corrupted image are estimated using RBFs. Next, the image is smoothed. The propose… ▽ More Preserving details in restoring images highly corrupted by impulse noise remains a challenging problem. We proposed an algorithm based on radial basis functions (RBF) interpolation which estimates the intensities of corrupted pixels by their neighbors. In this algorithm, first intensity values of noisy pixels in the corrupted image are estimated using RBFs. Next, the image is smoothed. The proposed algorithm can effectively remove the highly dense impulse noise. Experimental results show the superiority of the proposed algorithm in comparison to the recent similar methods both in noise suppression and detail preservation. Extensive simulations show better results in measure of peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), especially when the image is corrupted by very highly dense impulse noise. △ Less

Submitted 16 February, 2017; v1 submitted 22 July, 2016; originally announced July 2016.

arXiv:1607.06797 [pdf]

A probabilistic patch based image representation using Conditional Random Field model for image classification

Authors: Fariborz Taherkhani

Abstract: In this paper we proposed an ordered patch based method using Conditional Random Field (CRF) in order to encode local properties and their spatial relationship in images to address texture classification, face recognition, and scene classification problems. Typical image classification approaches work without considering spatial causality among distinctive properties of an image for image represen… ▽ More In this paper we proposed an ordered patch based method using Conditional Random Field (CRF) in order to encode local properties and their spatial relationship in images to address texture classification, face recognition, and scene classification problems. Typical image classification approaches work without considering spatial causality among distinctive properties of an image for image representation in feature space. In this method first, each image is encoded as a sequence of ordered patches, including local properties. Second, the sequence of these ordered patches is modeled as a probabilistic feature vector by CRF to model spatial relationship of these local properties. And finally, image classification is performed on such probabilistic image representation. Experimental results on several standard image datasets indicate that proposed method outperforms some of existing image classification methods. △ Less

Submitted 5 October, 2016; v1 submitted 22 July, 2016; originally announced July 2016.

arXiv:1607.06794 [pdf]

An ensemble learning method for scene classification based on Hidden Markov Model image representation

Authors: Fariborz Taherkhani, Reza Hedayati

Abstract: Low level images representation in feature space performs poorly for classification with high accuracy since this level of representation is not able to project images into the discriminative feature space. In this work, we propose an efficient image representation model for classification. First we apply Hidden Markov Model (HMM) on ordered grids represented by different type of image descriptors… ▽ More Low level images representation in feature space performs poorly for classification with high accuracy since this level of representation is not able to project images into the discriminative feature space. In this work, we propose an efficient image representation model for classification. First we apply Hidden Markov Model (HMM) on ordered grids represented by different type of image descriptors in order to include causality of local properties existing in image for feature extraction and then we train up a separate classifier for each of these features sets. Finally we ensemble these classifiers efficiently in a way that they can cancel out each other errors for obtaining higher accuracy. This method is evaluated on 15 natural scene dataset. Experimental results show the superiority of the proposed method in comparison to some current existing methods △ Less

Submitted 5 October, 2016; v1 submitted 22 July, 2016; originally announced July 2016.

Comments: This paper has been withdrawn by the authors due to significant errors in equations 3 and 4

Showing 1–26 of 26 results for author: Taherkhani, F