Face Reconstruction Transfer Attack as Out-of-Distribution Generalization

Authors: Yoon Gyo Jung, Jaewoo Park, Xingbo Dong, Hojin Park, Andrew Beng Jin Teoh, Octavia Camps

Abstract: Understanding the vulnerability of face recognition systems to malicious attacks is of critical importance. Previous works have focused on reconstructing face images that can penetrate a targeted verification system. Even in the white-box scenario, however, naively reconstructed images misrepresent the identity information, hence the attacks are easily neutralized once the face system is updated or changed. In this paper, we aim to reconstruct face images which are capable of transferring face attacks on unseen encoders. We term this problem Face Reconstruction Transfer Attack (FRTA) and show that it can be formulated as an out-of-distribution (OOD) generalization problem. Inspired by its OOD nature, we propose to solve FRTA by Averaged Latent Search and Unsupervised Validation with pseudo target (ALSUV). To strengthen the reconstruction attack on OOD unseen encoders, ALSUV reconstructs the face by searching the latent of amortized generator StyleGAN2 through multiple latent optimization, latent optimization trajectory averaging, and unsupervised validation with a pseudo target. We demonstrate the efficacy and generalization of our method on widely used face datasets, accompanying it with extensive ablation studies and visual, qualitative, and quantitative analyses. The source code will be released.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV2024

arXiv:2406.19672

Beyond First-Order: A Multi-Scale Approach to Finger Knuckle Print Biometrics

Authors: Chengrui Gao, Ziyuan Yang, Andrew Beng Jin Teoh, Min Zhu

Abstract: Recently, finger knuckle prints (FKPs) have gained attention due to their rich textural patterns, positioning them as a promising biometric for identity recognition. Prior FKP recognition methods predominantly leverage first-order feature descriptors, which capture intricate texture details but fail to account for structural information. Emerging research, however, indicates that second-order textures, which describe the curves and arcs of the textures, encompass this overlooked structural information. This paper introduces a novel FKP recognition approach, the Dual-Order Texture Competition Network (DOTCNet), designed to capture texture information in FKP images comprehensively. DOTCNet incorporates three dual-order texture competitive modules (DTCMs), each targeting textures at different scales. Each DTCM employs a learnable texture descriptor, specifically a learnable Gabor filter (LGF), to extract texture features. By leveraging LGFs, the network extracts first and second order textures to describe fine textures and structural features thoroughly. Furthermore, an attention mechanism enhances relevant features in the first-order features, thereby highlighting significant texture details. For second-order features, a competitive mechanism emphasizes structural information while reducing noise from higher-order features. Extensive experimental results reveal that DOTCNet significantly outperforms several standard algorithms on the publicly available PolyU-FKP dataset.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2403.02680

A Dual-Level Cancelable Framework for Palmprint Verification and Hack-Proof Data Storage

Authors: Ziyuan Yang, Ming Kang, Andrew Beng Jin Teoh, Chengrui Gao, Wen Chen, Bob Zhang, Yi Zhang

Abstract: In recent years, palmprints have been widely used for individual verification. The rich privacy information in palmprint data necessitates its protection to ensure security and privacy without sacrificing system performance. Existing systems often use cancelable technologies to protect templates, but these technologies ignore the potential risk of data leakage. Upon breaching the system and gaining access to the stored database, a hacker could easily manipulate the stored templates, compromising the security of the verification system. To address this issue, we propose a dual-level cancelable palmprint verification framework in this paper. Specifically, the raw template is initially encrypted using a competition hashing network with a first-level token, facilitating the end-to-end generation of cancelable templates. Different from previous works, the protected template undergoes further encryption to differentiate the second-level protected template from the first-level one. The system specifically creates a negative database (NDB) with the second-level token for dual-level protection during the enrollment stage. Reversing the NDB is NP-hard and a fine-grained algorithm for NDB generation is introduced to manage the noise and specified bits. During the verification stage, we propose an NDB matching algorithm based on matrix operation to accelerate the matching process of previous NDB methods caused by dictionary-based matching rules. This approach circumvents the need to store templates identical to those utilized for verification, reducing the risk of potential data leakage. Extensive experiments conducted on public palmprint datasets have confirmed the effectiveness and generality of the proposed framework. Upon acceptance of the paper, the code will be accessible at https://github.com/Deep-Imaging-Group/NPR.

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2311.12049

Energizing Federated Learning via Filter-Aware Attention

Authors: Ziyuan Yang, Zerui Shao, Huijie Huangfu, Hui Yu, Andrew Beng Jin Teoh, Xiaoxiao Li, Hongming Shan, Yi Zhang

Abstract: Federated learning (FL) is a promising distributed paradigm, eliminating the need for data sharing but facing challenges from data heterogeneity. Personalized parameter generation through a hypernetwork proves effective, yet existing methods fail to personalize local model structures. This leads to redundant parameters struggling to adapt to diverse data distributions. To address these limitations, we propose FedOFA, utilizing personalized orthogonal filter attention for parameter recalibration. The core is the Two-stream Filter-aware Attention (TFA) module, meticulously designed to extract personalized filter-aware attention maps, incorporating Intra-Filter Attention (IntraFA) and Inter-Filter Attention (InterFA) streams. These streams enhance representation capability and explore optimal implicit structures for local models. Orthogonal regularization minimizes redundancy by averting inter-correlation between filters. Furthermore, we introduce an Attention-Guided Pruning Strategy (AGPS) for communication efficiency. AGPS selectively retains crucial neurons while masking redundant ones, reducing communication costs without performance sacrifice. Importantly, FedOFA operates on the server side, incurring no additional computational cost on the client, making it advantageous in communication-constrained scenarios. Extensive experiments validate superior performance over state-of-the-art approaches, with code availability upon paper acceptance.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2311.11354

Scale-aware competition network for palmprint recognition

Authors: Chengrui Gao, Ziyuan Yang, Min Zhu, Andrew Beng Jin Teoh

Abstract: Palmprint biometrics garner heightened attention in palm-scanning payment and social security due to their distinctive attributes. However, prevailing methodologies singularly prioritize texture orientation, neglecting the significant texture scale dimension. We design an innovative network for concurrently extracting intra-scale and inter-scale features to redress this limitation. This paper proposes a scale-aware competitive network (SAC-Net), which includes the Inner-Scale Competition Module (ISCM) and the Across-Scale Competition Module (ASCM) to capture texture characteristics related to orientation and scale. ISCM efficiently integrates learnable Gabor filters and a self-attention mechanism to extract rich orientation data and discern textures with long-range discriminative properties. Subsequently, ASCM leverages a competitive strategy across various scales to effectively encapsulate the competitive texture scale elements. By synergizing ISCM and ASCM, our method adeptly characterizes palmprint features. Rigorous experimentation across three benchmark datasets unequivocally demonstrates our proposed approach's exceptional recognition performance and resilience relative to state-of-the-art alternatives.

Submitted 20 November, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

arXiv:2310.05316

Understanding the Feature Norm for Out-of-Distribution Detection

Authors: Jaewoo Park, Jacky Chen Long Chai, Jaeho Yoon, Andrew Beng Jin Teoh

Abstract: A neural network trained on a classification dataset often exhibits a higher vector norm of hidden layer features for in-distribution (ID) samples, while producing relatively lower norm values on unseen instances from out-of-distribution (OOD). Despite this intriguing phenomenon being utilized in many applications, the underlying cause has not been thoroughly investigated. In this study, we demystify this very phenomenon by scrutinizing the discriminative structures concealed in the intermediate layers of a neural network. Our analysis leads to the following discoveries: (1) The feature norm is a confidence value of a classifier hidden in the network layer, specifically its maximum logit. Hence, the feature norm distinguishes OOD from ID in the same manner that a classifier confidence does. (2) The feature norm is class-agnostic, thus it can detect OOD samples across diverse discriminative models. (3) The conventional feature norm fails to capture the deactivation tendency of hidden layer neurons, which may lead to misidentification of ID samples as OOD instances. To resolve this drawback, we propose a novel negative-aware norm (NAN) that can capture both the activation and deactivation tendencies of hidden layer neurons. We conduct extensive experiments on NAN, demonstrating its efficacy and compatibility with existing OOD detectors, as well as its capability in label-free environments.

Submitted 8 October, 2023; originally announced October 2023.

Comments: Accepted to ICCV2023

arXiv:2309.14888

Nearest Neighbor Guidance for Out-of-Distribution Detection

Authors: Jaewoo Park, Yoon Gyo Jung, Andrew Beng Jin Teoh

Abstract: Detecting out-of-distribution (OOD) samples is crucial for machine learning models deployed in open-world environments. Classifier-based scores are a standard approach for OOD detection due to their fine-grained detection capability. However, these scores often suffer from overconfidence issues, misclassifying OOD samples distant from the in-distribution region. To address this challenge, we propose a method called Nearest Neighbor Guidance (NNGuide) that guides the classifier-based score to respect the boundary geometry of the data manifold. NNGuide reduces the overconfidence of OOD samples while preserving the fine-grained capability of the classifier-based score. We conduct extensive experiments on ImageNet OOD detection benchmarks under diverse settings, including a scenario where the ID data undergoes natural distribution shift. Our results demonstrate that NNGuide provides a significant performance improvement on the base detection scores, achieving state-of-the-art results on the AUROC, FPR95, and AUPR metrics. The code is given at https://github.com/roomo7time/nnguide.

Submitted 26 September, 2023; originally announced September 2023.

Comments: Accepted to ICCV2023

arXiv:2308.00451

Physics-Driven Spectrum-Consistent Federated Learning for Palmprint Verification

Authors: Ziyuan Yang, Andrew Beng Jin Teoh, Bob Zhang, Lu Leng, Yi Zhang

Abstract: Palmprint as biometrics has gained increasing attention recently due to its discriminative ability and robustness. However, existing methods mainly improve palmprint verification within one spectrum, which is challenging to verify across different spectrums. Additionally, in distributed server-client-based deployment, palmprint verification systems predominantly necessitate clients to transmit private data for model training on the centralized server, thereby engendering privacy apprehensions. To alleviate the above issues, in this paper, we propose a physics-driven spectrum-consistent federated learning method for palmprint verification, dubbed PSFed-Palm. PSFed-Palm draws upon the inherent physical properties of distinct wavelength spectrums, wherein images acquired under similar wavelengths display heightened resemblances. Our approach first partitions clients into short- and long-spectrum groups according to the wavelength range of their local spectrum images. Subsequently, we introduce anchor models for short- and long-spectrum, which constrain the optimization directions of local models associated with long- and short-spectrum images. Specifically, a spectrum-consistent loss that enforces the model parameters and feature representation to align with their corresponding anchor models is designed. Finally, we impose constraints on the local models to ensure their consistency with the global model, effectively preventing model drift. This measure guarantees spectrum consistency while protecting data privacy, as there is no need to share local data. Extensive experiments are conducted to validate the efficacy of our proposed PSFed-Palm approach. The proposed PSFed-Palm demonstrates compelling performance despite only a limited amount of training data. The codes will be released at https://github.com/Zi-YuanYang/PSFed-Palm.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2304.10066

Recognizability Embedding Enhancement for Very Low-Resolution Face Recognition and Quality Estimation

Authors: Jacky Chen Long Chai, Tiong-Sik Ng, Cheng-Yaw Low, Jaewoo Park, Andrew Beng Jin Teoh

Abstract: Very low-resolution face recognition (VLRFR) poses unique challenges, such as tiny regions of interest and poor resolution due to extreme standoff distance or wide viewing angle of the acquisition devices. In this paper, we study principled approaches to elevate the recognizability of a face in the embedding space instead of the visual quality. We first formulate a robust learning-based face recognizability measure, namely recognizability index (RI), based on two criteria: (i) proximity of each face embedding against the unrecognizable faces cluster center and (ii) closeness of each face embedding against its positive and negative class prototypes. We then devise an index diversion loss to push the hard-to-recognize face embedding with low RI away from unrecognizable faces cluster to boost the RI, which reflects better recognizability. Additionally, a perceptibility attention mechanism is introduced to attend to the most recognizable face regions, which offers better explanatory and discriminative traits for embedding learning. Our proposed model is trained end-to-end and simultaneously serves recognizability-aware embedding learning and face quality estimation. To address VLRFR, our extensive evaluations on three challenging low-resolution datasets and face quality assessment demonstrate the superiority of the proposed model over the state-of-the-art methods.

Submitted 19 April, 2023; originally announced April 2023.

Comments: Accepted to CVPR23

arXiv:2301.01922

doi 10.1109/ICPR56361.2022.9956714

Open-Set Face Identification on Few-Shot Gallery by Fine-Tuning

Authors: Hojin Park, Jaewoo Park, Andrew Beng Jin Teoh

Abstract: In this paper, we focus on addressing the open-set face identification problem on a few-shot gallery by fine-tuning. The problem assumes a realistic scenario for face identification, where only a small number of face images is given for enrollment and any unknown identity must be rejected during identification. We observe that face recognition models pretrained on a large dataset and naively fine-tuned models perform poorly for this task. Motivated by this issue, we propose an effective fine-tuning scheme with classifier weight imprinting and exclusive BatchNorm layer tuning. For further improvement of rejection accuracy on unknown identities, we propose a novel matcher called Neighborhood Aware Cosine (NAC) that computes similarity based on neighborhood information. We validate the effectiveness of the proposed schemes thoroughly on large-scale face benchmarks across different convolutional neural network architectures. The source code for this project is available at: https://github.com/1ho0**1/OSFI-by-FineTuning

Submitted 5 January, 2023; originally announced January 2023.

Journal ref: 2022 26th International Conference on Pattern Recognition (ICPR), 2022, pp. 1026-1032

arXiv:2210.00294

Gait-based Age Group Classification with Adaptive Graph Neural Network

Authors: Timilehin B. Aderinola, Tee Connie, Thian Song Ong, Andrew Beng Jin Teoh, Michael Kah Ong Goh

Abstract: Deep learning techniques have recently been utilized for model-free age-associated gait feature extraction. However, acquiring model-free gait demands accurate pre-processing such as background subtraction, which is non-trivial in unconstrained environments. On the other hand, model-based gait can be obtained without background subtraction and is less affected by covariates. For model-based gait-based age group classification problems, present works rely solely on handcrafted features, where feature extraction is tedious and requires domain expertise. This paper proposes a deep learning approach to extract age-associated features from model-based gait for age group classification. Specifically, we first develop an unconstrained gait dataset called Multimedia University Gait Age and Gender dataset (MMU GAG). Next, the body joint coordinates are determined via pose estimation algorithms and represented as compact gait graphs via a novel part aggregation scheme. Then, a Part-AdaptIve Residual Graph Convolutional Neural Network (PairGCN) is designed for age-associated feature learning. Experiments suggest that PairGCN features are far more informative than handcrafted features, yielding up to 99% accuracy for classifying subjects as a child, adult, or senior in the MMU GAG dataset.

Submitted 1 October, 2022; originally announced October 2022.

arXiv:2209.11436

doi 10.1016/j.patcog.2023.109942

Understanding Open-Set Recognition by Jacobian Norm and Inter-Class Separation

Authors: Jaewoo Park, Hojin Park, Eunju Jeong, Andrew Beng Jin Teoh

Abstract: The findings on open-set recognition (OSR) show that models trained on classification datasets are capable of detecting unknown classes not encountered during the training process. Specifically, after training, the learned representations of known classes dissociate from the representations of the unknown class, facilitating OSR. In this paper, we investigate this emergent phenomenon by examining the relationship between the Jacobian norm of representations and the inter/intra-class learning dynamics. We provide a theoretical analysis, demonstrating that intra-class learning reduces the Jacobian norm for known class samples, while inter-class learning increases the Jacobian norm for unknown samples, even in the absence of direct exposure to any unknown sample. Overall, the discrepancy in the Jacobian norm between the known and unknown classes enables OSR. Based on this insight, which highlights the pivotal role of inter-class learning, we devise a marginal one-vs-rest (m-OvR) loss function that promotes strong inter-class separation. To further improve OSR performance, we integrate the m-OvR loss with additional strategies that maximize the Jacobian norm disparity. We present comprehensive experimental results that support our theoretical observations and demonstrate the efficacy of our proposed OSR approach.

Submitted 29 September, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: Accepted to Pattern Recognition

arXiv:2208.00563

doi 10.1016/j.patcog.2023.109844

Deep Fidelity in DNN Watermarking: A Study of Backdoor Watermarking for Classification Models

Authors: Guang Hua, Andrew Beng Jin Teoh

Abstract: Backdoor watermarking is a promising paradigm to protect the copyright of deep neural network (DNN) models. In the existing works on this subject, researchers have intensively focused on watermarking robustness, while the concept of fidelity, which is concerned with the preservation of the model's original functionality, has received less attention. In this paper, focusing on deep image classification models, we show that the existing shared notion of the sole measurement of learning accuracy is inadequate to characterize backdoor fidelity. Meanwhile, we show that the analogous concept of embedding distortion in multimedia watermarking, interpreted as the total weight loss (TWL) in DNN backdoor watermarking, is also problematic for fidelity measurement. To address this challenge, we propose the concept of deep fidelity, which states that the backdoor watermarked DNN model should preserve both the feature representation and decision boundary of the unwatermarked host model. To achieve deep fidelity, we propose two loss functions termed penultimate feature loss (PFL) and softmax probability-distribution loss (SPL) to preserve feature representation, while the decision boundary is preserved by the proposed fix last layer (FixLL) treatment, inspired by the recent discovery that deep learning with a fixed classifier causes no loss of learning accuracy. With the above designs, both embedding from scratch and fine-tuning strategies are implemented to evaluate the deep fidelity of backdoor embedding, whose advantages over the existing methods are verified via experiments using ResNet18 for MNIST and CIFAR-10 classifications, and wide residual network (i.e., WRN28_10) for CIFAR-100 task. PyTorch codes are available at https://github.com/ghua-ac/dnn_watermark.

Submitted 31 October, 2023; v1 submitted 31 July, 2022; originally announced August 2022.

Comments: Published in Pattern Recognition

Journal ref: Pattern Recognition, Vol. 144, Dec. 2023

arXiv:2206.04295

Reconstruct Face from Features Using GAN Generator as a Distribution Constraint

Authors: Xingbo Dong, Zhihui Miao, Lan Ma, Jiajun Shen, Zhe Jin, Zhenhua Guo, Andrew Beng Jin Teoh

Abstract: Face recognition based on the deep convolutional neural networks (CNN) shows superior accuracy performance attributed to the high discriminative features extracted. Yet, the security and privacy of the extracted features from deep learning models (deep features) have been often overlooked. This paper proposes the reconstruction of face images from deep features without accessing the CNN network configurations as a constrained optimization problem. Such optimization minimizes the distance between the features extracted from the original face image and the reconstructed face image. Instead of directly solving the optimization problem in the image space, we innovatively reformulate the problem by looking for a latent vector of a GAN generator, then use it to generate the face image. The GAN generator serves a dual role in this novel framework, i.e., face distribution constraint of the optimization goal and a face generator. On top of the novel optimization task, we also propose an attack pipeline to impersonate the target user based on the generated face image. Our results show that the generated face images can achieve a state-of-the-art successful attack rate of 98.0% on LFW under type-I attack @ FAR of 0.1%. Our work sheds light on the biometric deployment to meet the privacy-preserving and security policies.

Submitted 9 June, 2022; originally announced June 2022.

arXiv:2205.14575

3D-C2FT: Coarse-to-fine Transformer for Multi-view 3D Reconstruction

Authors: Leslie Ching Ow Tiong, Dick Sigmund, Andrew Beng Jin Teoh

Abstract: Recently, the transformer model has been successfully employed for the multi-view 3D reconstruction problem. However, challenges remain in designing an attention mechanism to explore the multi-view features and exploit their relations for reinforcing the encoding-decoding modules. This paper proposes a new model, namely 3D coarse-to-fine transformer (3D-C2FT), by introducing a novel coarse-to-fine (C2F) attention mechanism for encoding multi-view features and rectifying defective 3D objects. The C2F attention mechanism enables the model to learn multi-view information flow and synthesize 3D surface correction in a coarse to fine-grained manner. The proposed model is evaluated on the ShapeNet and Multi-view Real-life datasets. Experimental results show that 3D-C2FT achieves notable results and outperforms several competing models on these datasets.

Submitted 17 January, 2023; v1 submitted 29 May, 2022; originally announced May 2022.

Comments: Accepted by Asian Conference on Computer Vision (ACCV) 2022

arXiv:2203.04042

Abandoning the Bayer-Filter to See in the Dark

Authors: Xingbo Dong, Wanyan Xu, Zhihui Miao, Lan Ma, Chao Zhang, Jiewen Yang, Zhe Jin, Andrew Beng Jin Teoh, Jiajun Shen

Abstract: Low-light image enhancement, a pervasive but challenging problem, plays a central role in enhancing the visibility of an image captured in a poor illumination environment. Because not all photons can pass the Bayer-Filter on the sensor of the color camera, in this work, we first present a De-Bayer-Filter simulator based on deep neural networks to generate a monochrome raw image from the colored raw image. Next, a fully convolutional network is proposed to achieve the low-light image enhancement by fusing colored raw data with synthesized monochrome raw data. Channel-wise attention is also introduced to the fusion process to establish a complementary interaction between features from colored and monochrome raw images. To train the convolutional networks, we propose a dataset with monochrome and color raw pairs named Mono-Colored Raw paired dataset (MCR) collected by using a monochrome camera without Bayer-Filter and a color camera with Bayer-Filter. The proposed pipeline takes advantage of the fusion of the virtual monochrome and the color raw images, and our extensive experiments indicate that significant improvement can be achieved by leveraging raw sensor data and data-driven learning.

Submitted 22 March, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

arXiv:2106.06341 [pdf, ps, other]

doi 10.1109/LSP.2021.3089437

Towards End-to-End Synthetic Speech Detection

Authors: Guang Hua, Andrew Beng Jin Teoh, Haijian Zhang

Abstract: The constant Q transform (CQT) has been shown to be one of the most effective speech signal pre-transforms to facilitate synthetic speech detection, followed by either hand-crafted (subband) constant Q cepstral coefficient (CQCC) feature extraction and a back-end binary classifier, or a deep neural network (DNN) directly for further feature extraction and classification. Despite the rich literature on such a pipeline, we show in this paper that the pre-transform and hand-crafted features could simply be replaced by end-to-end DNNs. Specifically, we experimentally verify that by only using standard components, a light-weight neural network could outperform the state-of-the-art methods for the ASVspoof2019 challenge. The proposed model is termed Time-domain Synthetic Speech Detection Net (TSSDNet), having ResNet- or Inception-style structures. We further demonstrate that the proposed models also have attractive generalization capability. Trained on ASVspoof2019, they could achieve promising detection performance when tested on disjoint ASVspoof2015, significantly better than the existing cross-dataset results. This paper reveals the great potential of end-to-end DNNs for synthetic speech detection, without hand-crafted features.

Submitted 11 June, 2021; originally announced June 2021.

Comments: Accepted in IEEE Signal Processing Letters 2021

arXiv:2012.06746 [pdf, other]

doi 10.1016/j.neucom.2024.127263

Periocular Embedding Learning with Consistent Knowledge Distillation from Face

Authors: Yoon Gyo Jung, Jaewoo Park, Cheng Yaw Low, Jacky Chen Long Chai, Leslie Ching Ow Tiong, Andrew Beng Jin Teoh

Abstract: Periocular biometric, the peripheral area of the ocular, is a collaborative alternative to the face, especially when the face is occluded or masked. However, in practice, the periocular biometric alone captures the least salient facial features, thereby lacking discriminative information, particularly in wild environments. To address these problems, we transfer discriminatory information from the face to support the training of a periocular network by using knowledge distillation. Specifically, we leverage face images for periocular embedding learning, but periocular alone is utilized for identity identification or verification. To enhance periocular embeddings by face effectively, we propose Consistent Knowledge Distillation (CKD) that imposes consistency between face and periocular networks across prediction and feature layers. We find that imposing consistency at the prediction layer enables (1) extraction of global discriminative relationship information from face images and (2) effective transfer of the information from the face network to the periocular network. Particularly, consistency regularizes the prediction units to extract and store profound inter-class relationship information of face images. (3) The feature layer consistency, on the other hand, makes the periocular features robust against identity-irrelevant attributes. Overall, CKD empowers the sole periocular network to produce robust discriminative embeddings for periocular recognition in the wild.
We theoretically and empirically validate the core principles of the distillation mechanism in CKD, discovering that CKD is equivalent to label smoothing with a novel sparsity-oriented regularizer that helps the network prediction to capture the global discriminative relationship. Extensive experiments reveal that CKD achieves state-of-the-art results on standard periocular recognition benchmark datasets.

Submitted 28 January, 2024; v1 submitted 12 December, 2020; originally announced December 2020.

Comments: Accepted to Neurocomputing

arXiv:2006.13051 [pdf, other]

Interpretable security analysis of cancellable biometrics using constrained-optimized similarity-based attack

Authors: Hanrui Wang, Xingbo Dong, Zhe Jin, Andrew Beng Jin Teoh, Massimo Tistarelli

Abstract: In cancellable biometrics (CB) schemes, template security is achieved by applying, mainly non-linear, transformations to the biometric template. The transformation is designed to preserve the template distance/similarity in the transformed domain. Despite its effectiveness, the security issues attributed to the similarity preservation property of CB are underestimated. Dong et al. [BTAS'19] exploited the similarity preservation trait of CB and proposed a similarity-based attack with a high attack success rate. The similarity-based attack utilizes preimages that are generated from the protected biometric template for impersonation and performs cross matching. In this paper, we propose a constrained optimization similarity-based attack (CSA), which improves upon Dong's genetic algorithm enabled similarity-based attack (GASA). The CSA applies algorithm-specific equality or inequality relations as constraints to optimize preimage generation. We interpret the effectiveness of CSA from the supervised learning perspective. We identify such constraints and then conduct extensive experiments to demonstrate CSA against CB with the LFW face dataset. The results suggest that CSA is effective in breaching IoM hashing and BioHashing security, and outperforms GASA significantly. Inferring from the above results, we further remark that, other than IoM and BioHashing, CSA is critical to other CB schemes as far as the constraints can be formulated. Furthermore, we reveal the correlation between hash code size and the attack performance of CSA.

Submitted 17 June, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

arXiv:2003.01665 [pdf, other]

doi 10.1109/ICPR48806.2021.9413248

Discriminative Multi-level Reconstruction under Compact Latent Space for One-Class Novelty Detection

Authors: Jaewoo Park, Yoon Gyo Jung, Andrew Beng Jin Teoh

Abstract: In one-class novelty detection, a model learns solely on the in-class data to single out out-class instances. Autoencoder (AE) variants aim to compactly model the in-class data to reconstruct it exclusively, thus differentiating the in-class from out-class by the reconstruction error. However, compact modeling in an improper way might collapse the latent representations of the in-class data and thus their reconstruction, which would lead to performance deterioration. Moreover, to properly measure the reconstruction error of high-dimensional data, a metric is required that captures high-level semantics of the data. To this end, we propose Discriminative Compact AE (DCAE) that learns both compact and collapse-free latent representations of the in-class data, thereby reconstructing them both finely and exclusively. In DCAE, (a) we force a compact latent space to bijectively represent the in-class data by reconstructing them through internal discriminative layers of generative adversarial nets. (b) Based on the deep encoder's vulnerability to open set risk, out-class instances are encoded into the same compact latent space and reconstructed poorly without sacrificing the quality of in-class data reconstruction. (c) In inference, the reconstruction error is measured by a novel metric that computes the dissimilarity between a query and its reconstruction based on the class semantics captured by the internal discriminator.
Extensive experiments on public image datasets validate the effectiveness of our proposed model on both novelty and adversarial example detection, delivering state-of-the-art performance.

Submitted 17 February, 2021; v1 submitted 3 March, 2020; originally announced March 2020.

Comments: Accepted to ICPR 2020 Oral (acceptance rate 4.4%)

arXiv:1910.07770 [pdf, other]

On the Risk of Cancelable Biometrics

Authors: Xingbo Dong, Jaewoo Park, Zhe Jin, Andrew Beng Jin Teoh, Massimo Tistarelli, KokSheik Wong

Abstract: Cancelable biometrics (CB) employs an irreversible transformation to convert the biometric features into transformed templates while preserving the relative distance between two templates for security and privacy protection. However, distance preservation invites unexpected security issues such as pre-image attacks, which are often neglected. This paper presents a generalized pre-image attack method and its extended version that operates on practical CB systems. We theoretically reveal that the distance preservation property is a vulnerability source in the CB schemes. We then propose an empirical information leakage estimation algorithm to assess the pre-image attack risk of the CB schemes. The experiments conducted with six CB schemes designed for the face, iris and fingerprint demonstrate that the risks originating from the distance computed from two transformed templates significantly compromise the security of CB schemes. Our work reveals the potential risk of existing CB systems theoretically and experimentally.

Submitted 29 September, 2022; v1 submitted 17 October, 2019; originally announced October 2019.

arXiv:1902.06383 [pdf, other]

Periocular Recognition in the Wild with Orthogonal Combination of Local Binary Coded Pattern in Dual-stream Convolutional Neural Network

Authors: Leslie Ching Ow Tiong, Andrew Beng Jin Teoh, Yunli Lee

Abstract: In spite of the advancements made in periocular recognition, the dataset and periocular recognition in the wild remain a challenge. In this paper, we propose a multilayer fusion approach by means of a pair of shared parameters (dual-stream) convolutional neural network where each network accepts RGB data and a novel colour-based texture descriptor, namely Orthogonal Combination-Local Binary Coded Pattern (OC-LBCP), for periocular recognition in the wild. Specifically, two distinct late-fusion layers are introduced in the dual-stream network to aggregate the RGB data and OC-LBCP. Thus, the network benefits from this new feature of the late-fusion layers for an accuracy performance gain. We also introduce and share a new dataset for periocular in the wild, namely the Ethnic-ocular dataset, for benchmarking. The proposed network has also been assessed on one publicly available dataset, namely UBIPr. The proposed network outperforms several competing approaches on these datasets.

Submitted 19 March, 2019; v1 submitted 17 February, 2019; originally announced February 2019.

Comments: Accepted in International Conference On Biometrics 2019

arXiv:1811.11489 [pdf]

Fixed-length Bit-string Representation of Fingerprint by Normalized Local Structures

Authors: Jun Beom Kho, Andrew B. J. Teoh, Wonjune Lee, Jaihie Kim

Abstract: In this paper, we propose a method to represent a fingerprint image by an ordered, fixed-length bit-string providing improved accuracy performance, faster matching time and compressibility. First, we devise a novel minutia-based local structure modeled by a mixture of 2D elliptical Gaussian functions in the pixel space. Each local structure is mapped to the Euclidean space by normalizing the local structure with the number of minutiae that associates to it. This simple yet crucial crux enables fast dissimilarity computation of two local structures with Euclidean distance without distortion. A complementary texture-based local structure to the minutia-based local structure is also introduced whereby both can be compressed via principal component analysis and fused easily in the Euclidean space. The fused local structure is then converted to a K-bit ordered string via a K-means clustering algorithm. This chain of computation with sole use of Euclidean distance is vital for speedy and discriminative bit-string conversion. The accuracy can be further improved by a finger-specific bit-training algorithm in which two criteria are leveraged to select useful bit positions for matching. Experiments are performed on Fingerprint Verification Competition (FVC) databases for comparison with existing techniques to show the superiority of the proposed method.

Submitted 28 November, 2018; originally announced November 2018.

Comments: 16 pages, 15 figures, 10 tables

arXiv:1809.11045 [pdf]

A Symmetric Keyring Encryption Scheme for Biometric Cryptosystems

Authors: Yen-Lung Lai, Jung-Yeon Hwang, Zhe Jin, Soohyong Kim, Sangrae Cho, Andrew Beng Jin Teoh

Abstract: In this paper, we propose a novel biometric cryptosystem for vectorial biometrics named symmetric keyring encryption (SKE) inspired by Rivest's keyring model (2016). Unlike conventional biometric secret-binding primitives, such as fuzzy commitment and fuzzy vault, the proposed scheme reframes the biometric secret-binding problem as a fuzzy symmetric encryption problem with a notion called resilient vector pair. In this study, the pair resembles the encryption-decryption key pair in symmetric key cryptosystems. This notion is realized using the index of maximum hashed vectors, a special instance of the ranking-based locality-sensitive hashing function. With a simple filtering mechanism and [m,k] Shamir's secret-sharing scheme, we show that SKE, both in theoretical and empirical evaluation, can retrieve the exact secret with overwhelming probability for a genuine input yet negligible probability for an imposter input. Though SKE can be applied to any vectorial biometrics, we adopt the fingerprint vector as a case study in this work. The experiments have been performed under several subsets of FVC 2002, 2004, and 2006 datasets. We formalize and analyze the threat model of SKE that encloses several major security attacks.

Submitted 28 September, 2018; originally announced September 2018.

Comments: 15 pages, 5 figures, 5 tables

arXiv:1703.05455 [pdf]

doi 10.1109/TIFS.2017.2753172

Ranking Based Locality Sensitive Hashing Enabled Cancelable Biometrics: Index-of-Max Hashing

Authors: Zhe Jin, Yen-Lung Lai, Jung-Yeon Hwang, Soohyung Kim, Andrew Beng Jin Teoh

Abstract: In this paper, we propose a ranking based locality sensitive hashing inspired two-factor cancelable biometrics, dubbed "Index-of-Max" (IoM) hashing, for biometric template protection. With externally generated random parameters, IoM hashing transforms a real-valued biometric feature vector into a discrete index (max ranked) hashed code. We demonstrate two realizations of the IoM hashing notion, namely Gaussian Random Projection based and Uniformly Random Permutation based hashing schemes. The discrete indices representation nature of IoM hashed codes enjoys several merits. Firstly, IoM hashing empowers strong concealment of the biometric information. This contributes to the solid ground of the non-invertibility guarantee. Secondly, IoM hashing is insensitive to the feature magnitude, hence is more robust against biometric feature variation. Thirdly, the magnitude-independence trait of IoM hashing makes the hash codes scale-invariant, which is critical for matching and feature alignment. The experimental results demonstrate favorable accuracy performance on benchmark FVC2002 and FVC2004 fingerprint databases. The analyses justify its resilience to the existing and newly introduced security and privacy attacks as well as satisfy the revocability and unlinkability criteria of cancelable biometrics.

Submitted 17 September, 2017; v1 submitted 15 March, 2017; originally announced March 2017.

Comments: 15 pages, 8 figures, 6 tables
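
The Gaussian Random Projection realization mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `make_projections` and `iom_hash`, the parameters `num_hashes` and `q`, and the use of a seeded generator as the "externally generated random parameters" are all assumptions made for illustration. Each hash entry is taken to be the index of the largest of `q` Gaussian random projections of the feature vector.

```python
import random


def make_projections(dim, num_hashes, q, seed):
    """Draw num_hashes groups of q Gaussian vectors (hypothetical user-specific token)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(q)]
        for _ in range(num_hashes)
    ]


def iom_hash(features, projections):
    """Return one index per group: the position of the maximum projected value."""
    code = []
    for group in projections:
        # Project the feature vector onto each random vector in the group.
        scores = [sum(w * x for w, x in zip(row, features)) for row in group]
        # Keep only the index of the maximum; the magnitudes are discarded.
        code.append(max(range(len(scores)), key=scores.__getitem__))
    return code


features = [0.3, -1.2, 0.8, 2.0]
projections = make_projections(dim=4, num_hashes=8, q=5, seed=42)
code = iom_hash(features, projections)

# Scaling the features by a positive constant leaves the hashed code unchanged.
scaled_code = iom_hash([2.0 * v for v in features], projections)
print(code == scaled_code)
```

Because only the argmax index is stored, multiplying the input by a positive constant does not change the code, which is one way to read the magnitude-independence claim in the abstract.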

arXiv:1607.06902 [pdf]

Rank Correlation Measure: A Representational Transformation for Biometric Template Protection

Authors: Zhe Jin, Yen-Lung Lai, Andrew Beng Jin Teoh

Abstract: Although a variety of theoretically sound techniques have been proposed for biometric template protection, there is rarely a practical solution that guarantees non-invertibility, cancellability, non-linkability and performance simultaneously. In this paper, a ranking-based representational transformation is proposed for fingerprint templates. The proposed method transforms a real-valued feature vector into an index code such that the pairwise-order measure in the resultant codes is closely correlated with the rank similarity measure. Such a ranking based technique offers two major merits: 1) resilience to noises/perturbations in numeric values; and 2) highly nonlinear embedding based on partial order statistics. The former takes care of the accuracy performance by mitigating numeric noises/perturbations, while the latter offers strong non-invertible transformation via nonlinear feature embedding from Euclidean to Rank space that leads to toughness in inversion. The experimental results demonstrate reasonable accuracy performance on benchmark FVC2002 and FVC2004 fingerprint databases, thus confirming the proposition of the rank correlation. Moreover, the security and privacy analyses justify the strong capability against the existing major privacy attacks.

Submitted 23 July, 2016; originally announced July 2016.

Comments: 6 pages, 5 figures, 2 tables, 1 algorithm

arXiv:1604.07057 [pdf]

doi 10.1109/TCSVT.2017.2761829

Multi-Fold Gabor, PCA and ICA Filter Convolution Descriptor for Face Recognition

Authors: Cheng Yaw Low, Andrew Beng Jin Teoh, Cong Jie Ng

Abstract: This paper devises a new means of filter diversification, dubbed multi-fold filter convolution (M-FFC), for face recognition. On the assumption that M-FFC receives single-scale Gabor filters of varying orientations as input, these filters are self-cross convolved by M-fold to instantiate a filter offspring set. The M-FFC flexibility also permits cross convolution amongst Gabor filters and other filter banks of profoundly dissimilar traits, e.g., principal component analysis (PCA) filters, and independent component analysis (ICA) filters. The 2-FFC of Gabor, PCA and ICA filters thus yields three offspring sets: (1) Gabor filters solely, (2) Gabor-PCA filters, and (3) Gabor-ICA filters, to render the learning-free and the learning-based 2-FFC descriptors. To facilitate a sensible Gabor filter selection for M-FFC, the 40 multi-scale, multi-orientation Gabor filters are condensed into 8 elementary filters. Aside from that, an average histogram pooling operator is employed to leverage the 2-FFC histogram features, prior to the final whitening PCA compression. The empirical results substantiate that the 2-FFC descriptors prevail over, or are on par with, other face descriptors on both identification and verification tasks.

Submitted 19 October, 2017; v1 submitted 24 April, 2016; originally announced April 2016.

Comments: 14 pages, 10 figures, 10 tables

arXiv:1507.02049 [pdf]

doi 10.1109/APSIPA.2015.7415375

DCTNet : A Simple Learning-free Approach for Face Recognition

Authors: Cong Jie Ng, Andrew Beng Jin Teoh

Abstract: PCANet was proposed as a lightweight deep learning network that mainly leverages Principal Component Analysis (PCA) to learn multistage filter banks followed by binarization and block-wise histogramming. PCANet was shown to work surprisingly well in various image classification tasks. However, PCANet is data-dependent and hence inflexible. In this paper, we propose a data-independent network, dubbed DCTNet, for face recognition in which we adopt Discrete Cosine Transform (DCT) as filter banks in place of PCA. This is motivated by the fact that the 2D DCT basis is indeed a good approximation for high ranked eigenvectors of PCA. Both 2D DCT and PCA resemble a kind of modulated sine-wave pattern, which can be perceived as a bandpass filter bank. DCTNet is free from learning as 2D DCT bases can be computed in advance. Besides that, we also propose an effective method to regulate the block-wise histogram feature vector of DCTNet for robustness. It is shown to provide a surprising performance boost when the probe image is considerably different in appearance from the gallery image. We evaluate the performance of DCTNet extensively on a number of benchmark face databases and achieve accuracy on par with, or often better than, PCANet.

Submitted 29 September, 2015; v1 submitted 8 July, 2015; originally announced July 2015.

Comments: APSIPA ASC 2015
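
The statement in the abstract above that 2D DCT bases can be computed in advance can be made concrete with a small sketch. It only builds an orthonormal 2D DCT-II basis of size k x k from separable 1D cosines; the helper names `dct2_basis` and `inner`, the choice of DCT-II normalization, and the filter size are assumptions for illustration, and the sketch does not reproduce the filter selection, binarization, or histogram stages of DCTNet.

```python
import math


def dct2_basis(k):
    """Return a dict mapping (u, v) to a k x k orthonormal 2D DCT-II basis filter."""

    def c(u, n):
        # 1D DCT-II basis value at frequency u and position n, with orthonormal scaling.
        scale = math.sqrt(1.0 / k) if u == 0 else math.sqrt(2.0 / k)
        return scale * math.cos(math.pi * (2 * n + 1) * u / (2 * k))

    filters = {}
    for u in range(k):
        for v in range(k):
            # Each 2D filter is the outer product of two 1D basis vectors.
            filters[(u, v)] = [[c(u, i) * c(v, j) for j in range(k)] for i in range(k)]
    return filters


def inner(a, b):
    """Frobenius inner product of two equally sized 2D filters."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


filters = dct2_basis(4)
print(len(filters))  # 16 fixed filters, none of them learned from data
```

Since the filters depend only on k, they can be tabulated once and reused for any dataset, in contrast to PCA filters, which must be estimated from training images.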

Showing 1–28 of 28 results for author: Teoh, A B J