CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models

Authors: Kushal Kumar Jain, Steve Grosz, Anoop M. Namboodiri, Anil K. Jain

Abstract: Forensic sketch-to-mugshot matching is a challenging task in face recognition, primarily hindered by the scarcity of annotated forensic sketches and the modality gap between sketches and photographs. To address this, we propose CLIP4Sketch, a novel approach that leverages diffusion models to generate a large and diverse set of sketch images, which helps in enhancing the performance of face recognition systems in sketch-to-mugshot matching. Our method utilizes Denoising Diffusion Probabilistic Models (DDPMs) to generate sketches with explicit control over identity and style. We combine CLIP and Adaface embeddings of a reference mugshot, along with textual descriptions of style, as the conditions to the diffusion model. We demonstrate the efficacy of our approach by generating a comprehensive dataset of sketches corresponding to mugshots and training a face recognition model on our synthetic data. Our results show significant improvements in sketch-to-mugshot matching accuracy over training on an existing, limited amount of real face sketch data, validating the potential of diffusion models in enhancing the performance of face recognition systems across modalities. We also compare our dataset with datasets generated using GAN-based methods to show its superiority.

Submitted 13 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

arXiv:2406.00287

GenPalm: Contactless Palmprint Generation with Diffusion Models

Authors: Steven A. Grosz, Anil K. Jain

Abstract: The scarcity of large-scale palmprint databases poses a significant bottleneck to advancements in contactless palmprint recognition. To address this, researchers have turned to synthetic data generation. While Generative Adversarial Networks (GANs) have been widely used, they suffer from instability and mode collapse. Recently, diffusion probabilistic models have emerged as a promising alternative, offering stable training and better distribution coverage. This paper introduces a novel palmprint generation method using diffusion probabilistic models, develops an end-to-end framework for synthesizing multiple palm identities, and validates the realism and utility of the generated palmprints. Experimental results demonstrate the effectiveness of our approach in generating palmprint images which enhance contactless palmprint recognition performance across several test databases utilizing challenging cross-database and time-separated evaluation protocols.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2404.13791

Universal Fingerprint Generation: Controllable Diffusion Model with Multimodal Conditions

Authors: Steven A. Grosz, Anil K. Jain

Abstract: The utilization of synthetic data for fingerprint recognition has garnered increased attention due to its potential to alleviate privacy concerns surrounding sensitive biometric data. However, current methods for generating fingerprints have limitations in creating impressions of the same finger with useful intra-class variations. To tackle this challenge, we present GenPrint, a framework to produce fingerprint images of various types while maintaining identity and offering humanly understandable control over different appearance factors such as fingerprint class, acquisition type, sensor device, and quality level. Unlike previous fingerprint generation approaches, GenPrint is not confined to replicating style characteristics from the training dataset alone: it enables the generation of novel styles from unseen devices without requiring additional fine-tuning. To accomplish these objectives, we developed GenPrint using latent diffusion models with multimodal conditions (text and image) for consistent generation of style and identity. Our experiments leverage a variety of publicly available datasets for training and evaluation. Results demonstrate the benefits of GenPrint in terms of identity preservation, explainable control, and universality of generated images. Importantly, the GenPrint-generated images yield comparable or even superior accuracy to models trained solely on real data and further enhance performance when augmenting the diversity of existing real fingerprint datasets.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2401.08111

Mobile Contactless Palmprint Recognition: Use of Multiscale, Multimodel Embeddings

Authors: Steven A. Grosz, Akash Godbole, Anil K. Jain

Abstract: Contactless palmprints are comprised of both global and local discriminative features. Most prior work focuses on extracting global features or local features alone for palmprint matching, whereas this research introduces a novel framework that combines global and local features for enhanced palmprint matching accuracy. Leveraging recent advancements in deep learning, this study integrates a vision transformer (ViT) and a convolutional neural network (CNN) to extract complementary local and global features. Next, a mobile-based, end-to-end palmprint recognition system is developed, referred to as Palm-ID. On top of the ViT and CNN features, Palm-ID incorporates a palmprint enhancement module and efficient dimensionality reduction (for faster matching). Palm-ID balances the trade-off between accuracy and latency, requiring just 18 ms to extract a template of size 516 bytes, which can be efficiently searched against a 10,000 palmprint gallery in 0.33 ms on an AMD EPYC 7543 32-Core CPU utilizing 128 threads. Cross-database matching protocols and evaluations on large-scale operational datasets demonstrate the robustness of the proposed method, achieving a TAR of 98.06% at FAR=0.01% on a newly collected, time-separated dataset. To show a practical deployment of the end-to-end system, the entire recognition pipeline is embedded within a mobile device for enhanced user privacy and security.

Submitted 15 January, 2024; originally announced January 2024.
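
Several abstracts in this listing report accuracy as a true accept rate (TAR) at a fixed false accept rate (FAR), as in the entry above. The following is a minimal Python sketch of how such a figure can be computed from similarity scores. It is not the evaluation code of any listed paper: the function name `tar_at_far`, the "accept if score is strictly greater than the threshold" rule, and the toy scores are all assumptions made for illustration.

```python
import math


def tar_at_far(genuine, impostor, far):
    """Return the TAR at the strictest threshold whose empirical FAR <= far.

    genuine:  similarity scores of same-identity comparisons
    impostor: similarity scores of different-identity comparisons
    far:      target false accept rate as a fraction (0.001 means 0.1%)
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    if not 0.0 <= far <= 1.0:
        raise ValueError("far must lie in [0, 1]")

    # Rank impostor scores from highest to lowest.
    ranked = sorted(impostor, reverse=True)
    # Number of impostor comparisons that may be (wrongly) accepted.
    allowed = math.floor(far * len(ranked))
    # Assumed rule: accept a comparison if its score is strictly greater
    # than the threshold, so at most `allowed` impostors are accepted.
    threshold = ranked[allowed] if allowed < len(ranked) else -math.inf

    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)


# Toy scores, invented purely for illustration.
genuine_scores = [0.9, 0.8, 0.55, 0.4]
impostor_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.45, 0.5, 0.6]
print(tar_at_far(genuine_scores, impostor_scores, far=0.1))
```

With real data the impostor set must be large enough to resolve the target FAR: at FAR=0.01%, fewer than 10,000 impostor comparisons give `allowed == 0`, and the threshold simply becomes the highest impostor score.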

arXiv:2311.11753

AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems

Authors: Sai Amrit Patnaik, Shivali Chansoriya, Anil K. Jain, Anoop M. Namboodiri

Abstract: Evaluating the risk level of adversarial images is essential for safely deploying face authentication models in the real world. Popular approaches for physical-world attacks, such as print or replay attacks, suffer from some limitations, like including physical and geometrical artifacts. Recently, adversarial attacks have gained traction; these attacks try to digitally deceive the learning strategy of a recognition system using slight modifications to the captured image. While most previous research assumes that the adversarial image could be digitally fed into the authentication systems, this is not always the case for systems deployed in the real world. This paper demonstrates the vulnerability of face authentication systems to adversarial images in physical world scenarios. We propose AdvGen, an automated Generative Adversarial Network, to simulate print and replay attacks and generate adversarial images that can fool state-of-the-art PADs in a physical domain attack setting. Using this attack strategy, the attack success rate goes up to 82.01%. We test AdvGen extensively on four datasets and ten state-of-the-art PADs. We also demonstrate the effectiveness of our attack by conducting experiments in a realistic, physical environment.

Submitted 20 November, 2023; originally announced November 2023.

Comments: 10 pages, 9 figures, Accepted to the International Joint Conference on Biometrics (IJCB 2023)

arXiv:2306.14808

Maximum State Entropy Exploration using Predecessor and Successor Representations

Authors: Arnav Kumar Jain, Lucas Lehnert, Irina Rish, Glen Berseth

Abstract: Animals have a developed ability to explore that aids them in important tasks such as locating food, exploring for shelter, and finding misplaced items. These exploration skills necessarily track where they have been so that they can plan for finding items with relative efficiency. Contemporary exploration algorithms often learn a less efficient exploration strategy because they either condition only on the current state or simply rely on making random open-loop exploratory moves. In this work, we propose $ηψ$-Learning, a method to learn efficient exploratory policies by conditioning on past episodic experience to make the next exploratory move. Specifically, $ηψ$-Learning learns an exploration policy that maximizes the entropy of the state visitation distribution of a single trajectory. Furthermore, we demonstrate how variants of the predecessor representation and successor representations can be combined to predict the state visitation entropy. Our experiments demonstrate the efficacy of $ηψ$-Learning to strategically explore the environment and maximize the state coverage with limited samples.

Submitted 26 June, 2023; originally announced June 2023.
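
The objective named in the abstract above, the entropy of the state visitation distribution of a single trajectory, can be illustrated with a small Python sketch. It assumes discrete, hashable states and uses plain empirical visitation frequencies; the function name `visitation_entropy` and the toy trajectories are hypothetical. This is not the paper's $ηψ$-Learning method, which predicts this quantity using predecessor and successor representations.

```python
import math
from collections import Counter


def visitation_entropy(trajectory):
    """Shannon entropy (in nats) of the empirical state visitation distribution."""
    if not trajectory:
        raise ValueError("trajectory must contain at least one state")
    counts = Counter(trajectory)  # how often each state was visited
    total = len(trajectory)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Two toy trajectories of equal length (invented for illustration).
revisiting = ["a", "b", "a", "b"]  # keeps returning to the same two states
covering = ["a", "b", "c", "d"]    # visits four distinct states once each

print(visitation_entropy(revisiting))
print(visitation_entropy(covering))
```

For a fixed trajectory length, the entropy is largest when every visited state is distinct, which is why maximizing it encourages broad state coverage.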

arXiv:2306.00272

Accelerated Fingerprint Enhancement: A GPU-Optimized Mixed Architecture Approach

Authors: André Brasil Vieira Wyzykowski, Anil K. Jain

Abstract: This document presents a preliminary approach to latent fingerprint enhancement, fundamentally designed around a mixed Unet architecture. It combines the capabilities of the Resnet-101 network and Unet encoder, aiming to form a potentially powerful composite. This combination, enhanced with attention mechanisms and forward skip connections, is intended to optimize the enhancement of ridge and minutiae features in fingerprints. One innovative element of this approach includes a novel Fingerprint Enhancement Gabor layer, specifically designed for GPU computations. This illustrates how modern computational resources might be harnessed to expedite enhancement. Given its potential functionality as either a CNN or Transformer layer, this Gabor layer could offer improved agility and processing speed to the system. However, it is important to note that this approach is still in the early stages of development and has not yet been fully validated through rigorous experiments. As such, it may require additional time and testing to establish its robustness and usability in the field of latent fingerprint enhancement. This includes improvements in processing speed, enhancement adaptability with distinct latent fingerprint types, and full validation in experimental approaches such as open-set (identification 1:N) and open-set validation, fingerprint quality evaluation, among others.

Submitted 31 May, 2023; originally announced June 2023.

arXiv:2306.00231

A Universal Latent Fingerprint Enhancer Using Transformers

Authors: Andre Brasil Vieira Wyzykowski, Anil K. Jain

Abstract: Forensic science heavily relies on analyzing latent fingerprints, which are crucial for criminal investigations. However, various challenges, such as background noise, overlapping prints, and contamination, make the identification process difficult. Moreover, limited access to real crime scene and laboratory-generated databases hinders the development of efficient recognition algorithms. This study aims to develop a fast method, which we call ULPrint, to enhance various latent fingerprint types, including those obtained from real crime scenes and laboratory-created samples, to boost fingerprint recognition system performance. In closed-set identification accuracy experiments, the enhanced image was able to improve the performance of the MSU-AFIS from 61.56% to 75.19% in the NIST SD27 database, from 67.63% to 77.02% in the MSP Latent database, and from 46.90% to 52.12% in the NIST SD302 database. Our contributions include (1) the development of a two-step latent fingerprint enhancement method that combines Ridge Segmentation with UNet and Mix Visual Transformer (MiT) SegFormer-B5 encoder architecture, (2) the implementation of multiple dilated convolutions in the UNet architecture to capture intricate, non-local patterns better and enhance ridge segmentation, and (3) the guided blending of the predicted ridge mask with the latent fingerprint. This novel approach, ULPrint, streamlines the enhancement process, addressing challenges across diverse latent fingerprint types to improve forensic investigations and criminal justice outcomes.

Submitted 31 May, 2023; originally announced June 2023.

arXiv:2305.07602

ViT Unified: Joint Fingerprint Recognition and Presentation Attack Detection

Authors: Steven A. Grosz, Kanishka P. Wijewardena, Anil K. Jain

Abstract: A secure fingerprint recognition system must contain both a presentation attack (i.e., spoof) detection and recognition module in order to protect users against unwanted access by malicious users. Traditionally, these tasks would be carried out by two independent systems; however, recent studies have demonstrated the potential to have one unified system architecture in order to reduce the computational burdens on the system, while maintaining high accuracy. In this work, we leverage a vision transformer architecture for joint spoof detection and matching and report competitive results with state-of-the-art (SOTA) models for both a sequential system (two ViT models operating independently) and a unified architecture (a single ViT model for both tasks). ViT models are particularly well suited for this task as the ViT's global embedding encodes features useful for recognition, whereas the individual, local embeddings are useful for spoof detection. We demonstrate the capability of our unified model to achieve an average integrated matching (IM) accuracy of 98.87% across LivDet 2013 and 2015 CrossMatch sensors. This is comparable to IM accuracy of 98.95% of our sequential dual-ViT system, but with ~50% of the parameters and ~58% of the latency.

Submitted 12 May, 2023; originally announced May 2023.

arXiv:2305.05161

Child Palm-ID: Contactless Palmprint Recognition for Children

Authors: Akash Godbole, Steven A. Grosz, Anil K. Jain

Abstract: Effective distribution of nutritional and healthcare aid for children, particularly infants and toddlers, in some of the least developed and most impoverished countries of the world, is a major problem due to the lack of reliable identification documents. Biometric authentication technology has been investigated to address child recognition in the absence of reliable ID documents. We present a mobile-based contactless palmprint recognition system, called Child Palm-ID, which meets the requirements of usability, hygiene, cost, and accuracy for child recognition. Using a contactless child palmprint database, Child-PalmDB1, consisting of 19,158 images from 1,020 unique palms (in the age range of 6 mos. to 48 mos.), we report a TAR=94.11% @ FAR=0.1%. The proposed Child Palm-ID system is also able to recognize adults, achieving a TAR=99.4% on the CASIA contactless palmprint database and a TAR=100% on the COEP contactless adult palmprint database, both @ FAR=0.1%. These accuracies are competitive with the SOTA provided by COTS systems. Despite these high accuracies, we show that the TAR for time-separated child-palmprints is only 78.1% @ FAR=0.1%.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2304.13800

Latent Fingerprint Recognition: Fusion of Local and Global Embeddings

Authors: Steven A. Grosz, Anil K. Jain

Abstract: One of the most challenging problems in fingerprint recognition continues to be establishing the identity of a suspect associated with partial and smudgy fingerprints left at a crime scene (i.e., latent prints or fingermarks). Despite the success of fixed-length embeddings for rolled and slap fingerprint recognition, the features learned for latent fingerprint matching have mostly been limited to local minutiae-based embeddings and have not directly leveraged global representations for matching. In this paper, we combine global embeddings with local embeddings for state-of-the-art latent to rolled matching accuracy with high throughput. The combination of both local and global representations leads to improved recognition accuracy across NIST SD 27, NIST SD 302, MSP, MOLF DB1/DB4, and MOLF DB2/DB4 latent fingerprint datasets for both closed-set (84.11%, 54.36%, 84.35%, 70.43%, 62.86% rank-1 retrieval rate, respectively) and open-set (0.50, 0.74, 0.44, 0.60, 0.68 FNIR at FPIR=0.02, respectively) identification scenarios on a gallery of 100K rolled fingerprints. Not only do we fuse the complementary representations, we also use the local features to guide the global representations to focus on discriminatory regions in two fingerprint images to be compared. This leads to a multi-stage matching paradigm in which subsets of the retrieved candidate lists for each probe image are passed to subsequent stages for further processing, resulting in a considerable reduction in latency (requiring just 0.068 ms per latent to rolled comparison on an AMD EPYC 7543 32-Core Processor, roughly 15K comparisons per second). Finally, we show the generalizability of the fused representations for improving authentication accuracy across several rolled, plain, and contactless fingerprint datasets.

Submitted 7 September, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

arXiv:2212.07299

Child PalmID: Contactless Palmprint Recognition

Authors: Anil K. Jain, Akash Godbole, Anjoo Bhatnagar, Prem Sewak Sudhish

Abstract: Developing and least developed countries face the dire challenge of ensuring that each child in their country receives required doses of vaccination, adequate nutrition and proper medication. International agencies such as UNICEF, WHO and WFP, among other organizations, strive to find innovative solutions to determine which children have received the benefits and which have not. Biometric recognition systems have been sought out to help solve this problem. To that end, this report establishes a baseline accuracy of a commercial contactless palmprint recognition system that may be deployed for recognizing children in the age group of one to five years old. On a database of contactless palmprint images of one thousand unique palms from 500 children, we establish SOTA authentication accuracy of 90.85% @ FAR of 0.01%, rank-1 identification accuracy of 99.0% (closed set), and FPIR=0.01 @ FNIR=0.3 for open-set identification using PalmMobile SDK from Armatura.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 9 pages, 14 figures

arXiv:2211.13897

AFR-Net: Attention-Driven Fingerprint Recognition Network

Authors: Steven A. Grosz, Anil K. Jain

Abstract: The use of vision transformers (ViT) in computer vision is increasing due to limited inductive biases (e.g., locality, weight sharing, etc.) and increased scalability compared to other deep learning methods. This has led to some initial studies on the use of ViT for biometric recognition, including fingerprint recognition. In this work, we improve on these initial studies for transformers in fingerprint recognition by i.) evaluating additional attention-based architectures, ii.) scaling to larger and more diverse training and evaluation datasets, and iii.) combining the complementary representations of attention-based and CNN-based embeddings for improved state-of-the-art (SOTA) fingerprint recognition (both authentication and identification). Our combined architecture, AFR-Net (Attention-Driven Fingerprint Recognition Network), outperforms several baseline transformer and CNN-based models, including a SOTA commercial fingerprint system, Verifinger v12.3, across intra-sensor, cross-sensor, and latent to rolled fingerprint matching datasets. Additionally, we propose a realignment strategy using local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low certainty situations, which boosts the overall recognition accuracy significantly across each of the models. This realignment strategy requires no additional training and can be applied as a wrapper to any existing deep learning network (including attention-based, CNN-based, or both) to boost its performance.

Submitted 3 December, 2022; v1 submitted 25 November, 2022; originally announced November 2022.

arXiv:2210.13994

Minutiae-Guided Fingerprint Embeddings via Vision Transformers

Authors: Steven A. Grosz, Joshua J. Engelsma, Rajeev Ranjan, Naveen Ramakrishnan, Manoj Aggarwal, Gerard G. Medioni, Anil K. Jain

Abstract: Minutiae matching has long dominated the field of fingerprint recognition. However, deep networks can be used to extract fixed-length embeddings from fingerprints. To date, the few studies that have explored the use of CNN architectures to extract such embeddings have shown extreme promise. Inspired by these early works, we propose the first use of a Vision Transformer (ViT) to learn a discriminative fixed-length fingerprint embedding. We further demonstrate that by guiding the ViT to focus on local, minutiae-related features, we can boost the recognition performance. Finally, we show that by fusing embeddings learned by CNNs and ViTs we can reach near parity with a commercial state-of-the-art (SOTA) matcher. In particular, we obtain a TAR=94.23% @ FAR=0.1% on the NIST SD 302 public-domain dataset, compared to a SOTA commercial matcher which obtains TAR=96.71% @ FAR=0.1%. Additionally, our fixed-length embeddings can be matched orders of magnitude faster than the commercial system (2.5 million matches/second compared to 50K matches/second). We make our code and models publicly available to encourage further research on this topic: https://github.com/tba.

Submitted 25 October, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

arXiv:2210.11698

Learning Robust Dynamics through Variational Sparse Gating

Authors: Arnav Kumar Jain, Shivakanth Sujit, Shruti Joshi, Vincent Michalski, Danijar Hafner, Samira Ebrahimi-Kahou

Abstract: Learning world models from their sensory inputs enables agents to plan for actions by imagining their future outcomes. World models have previously been shown to improve sample-efficiency in simulated environments with few objects, but have not yet been applied successfully to environments with many objects. In environments with many objects, often only a small number of them are moving or interacting at the same time. In this paper, we investigate integrating this inductive bias of sparse interactions into the latent dynamics of world models trained from pixels. First, we introduce Variational Sparse Gating (VSG), a latent dynamics model that updates its feature dimensions sparsely through stochastic binary gates. Moreover, we propose a simplified architecture Simple Variational Sparse Gating (SVSG) that removes the deterministic pathway of previous models, resulting in a fully stochastic transition function that leverages the VSG mechanism. We evaluate the two model architectures in the BringBackShapes (BBS) environment that features a large number of moving objects and partial observability, demonstrating clear improvements over prior models.

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2209.02425

Learning an Ensemble of Deep Fingerprint Representations

Authors: Akash Godbole, Karthik Nandakumar, Anil K. Jain

Abstract: Deep neural networks (DNNs) have shown incredible promise in learning fixed-length representations from fingerprints. Since the representation learning is often focused on capturing specific prior knowledge (e.g., minutiae), there is no universal representation that comprehensively encapsulates all the discriminatory information available in a fingerprint. While learning an ensemble of representations can mitigate this problem, two critical challenges need to be addressed: (i) How to extract multiple diverse representations from the same fingerprint image? and (ii) How to optimally exploit these representations during the matching process? In this work, we train multiple instances of DeepPrint (a state-of-the-art DNN-based fingerprint encoder) on different transformations of the input image to generate an ensemble of fingerprint embeddings. We also propose a feature fusion technique that distills these multiple representations into a single embedding, which faithfully captures the diversity present in the ensemble without increasing the computational complexity. The proposed approach has been comprehensively evaluated on five databases containing rolled, plain, and latent fingerprints (NIST SD4, NIST SD14, NIST SD27, NIST SD302, and FVC2004 DB2A) and statistically significant improvements in accuracy have been consistently demonstrated across a range of verification as well as closed- and open-set identification settings. The proposed approach serves as a wrapper capable of improving the accuracy of any DNN-based recognition system.

Submitted 2 September, 2022; originally announced September 2022.

arXiv:2208.13811

Synthetic Latent Fingerprint Generator

Authors: Andre Brasil Vieira Wyzykowski, Anil K. Jain

Abstract: Given a full fingerprint image (rolled or slap), we present CycleGAN models to generate multiple latent impressions of the same identity as the full print. Our models can control the degree of distortion, noise, blurriness and occlusion in the generated latent print images to obtain Good, Bad and Ugly latent image categories as introduced in the NIST SD27 latent database. The contributions of our work are twofold: (i) demonstrate the similarity of synthetically generated latent fingerprint images to crime scene latents in NIST SD27 and MSP databases as evaluated by the NIST NFIQ 2 quality measure and ROC curves obtained by a SOTA fingerprint matcher, and (ii) use of synthetic latents to augment small-size latent training databases in the public domain to improve the performance of DeepPrint, a SOTA fingerprint matcher designed for rolled to rolled fingerprint matching on three latent databases (NIST SD27, NIST SD302, and IIITD-SLF). As an example, with synthetic latent data augmentation, the Rank-1 retrieval performance of DeepPrint is improved from 15.50% to 29.07% on the challenging NIST SD27 latent database. Our approach for generating synthetic latent fingerprints can be used to improve the recognition performance of any latent matcher and its individual components (e.g., enhancement, segmentation and feature extraction).

Submitted 29 August, 2022; originally announced August 2022.

arXiv:2205.09318

On Demographic Bias in Fingerprint Recognition

Authors: Akash Godbole, Steven A. Grosz, Karthik Nandakumar, Anil K. Jain

Abstract: Fingerprint recognition systems have been deployed globally in numerous applications including personal devices, forensics, law enforcement, banking, and national identity systems. For these systems to be socially acceptable and trustworthy, it is critical that they perform equally well across different demographic groups. In this work, we propose a formal statistical framework to test for the existence of bias (demographic differentials) in fingerprint recognition across four major demographic groups (white male, white female, black male, and black female) for two state-of-the-art (SOTA) fingerprint matchers operating in verification and identification modes. Experiments on two different fingerprint databases (with 15,468 and 1,014 subjects) show that demographic differentials in SOTA fingerprint recognition systems decrease as the matcher accuracy increases and any small bias that may be evident is likely due to certain outlier, low-quality fingerprint images.

Submitted 19 May, 2022; originally announced May 2022.

arXiv:2205.03809

Fingerprint Template Invertibility: Minutiae vs. Deep Templates

Authors: Kanishka P. Wijewardena, Steven A. Grosz, Kai Cao, Anil K. Jain

Abstract: Much of the success of fingerprint recognition is attributed to minutiae-based fingerprint representation. It was believed that minutiae templates could not be inverted to obtain a high fidelity fingerprint image, but this assumption has been shown to be false. The success of deep learning has resulted in alternative fingerprint representations (embeddings), in the hope that they might offer better recognition accuracy as well as non-invertibility of deep network-based templates. We evaluate whether deep fingerprint templates suffer from the same reconstruction attacks as the minutiae templates. We show that while a deep template can be inverted to produce a fingerprint image that could be matched to its source image, deep templates are more resistant to reconstruction attacks than minutiae templates. In particular, reconstructed fingerprint images from minutiae templates yield a TAR of about 100.0% (98.3%) @ FAR of 0.01% for type-I (type-II) attacks using a state-of-the-art commercial fingerprint matcher, when tested on NIST SD4. The corresponding attack performance for reconstructed fingerprint images from deep templates using the same commercial matcher yields a TAR of less than 1% for both type-I and type-II attacks; however, when the reconstructed images are matched using the same deep network, they achieve a TAR of 85.95% (68.10%) for type-I (type-II) attacks.
Furthermore, what is missing from previous fingerprint template inversion studies is an evaluation of the black-box attack performance, which we perform using 3 different state-of-the-art fingerprint matchers. We conclude that fingerprint images generated by inverting minutiae templates are highly susceptible to both white-box and black-box attack evaluations, while fingerprint images generated by deep templates are resistant to black-box evaluations and comparatively less susceptible to white-box evaluations.

Submitted 8 May, 2022; originally announced May 2022.

arXiv:2204.06498

SpoofGAN: Synthetic Fingerprint Spoof Images

Authors: Steven A. Grosz, Anil K. Jain

Abstract: A major limitation to advances in fingerprint spoof detection is the lack of publicly available, large-scale fingerprint spoof datasets, a problem which has been compounded by increased concerns surrounding privacy and security of biometric data. Furthermore, most state-of-the-art spoof detection algorithms rely on deep networks which perform best in the presence of a large amount of training data. This work aims to demonstrate the utility of synthetic (both live and spoof) fingerprints in supplying these algorithms with sufficient data to improve the performance of fingerprint spoof detection algorithms beyond the capabilities when training on a limited amount of publicly available real datasets. First, we provide details of our approach in modifying a state-of-the-art generative architecture to synthesize high quality live and spoof fingerprints. Then, we provide quantitative and qualitative analysis to verify the quality of our synthetic fingerprints in mimicking the distribution of real data samples. We showcase the utility of our synthetic live and spoof fingerprints in training a deep network for fingerprint spoof detection, which dramatically boosts the performance across three different evaluation datasets compared to an identical model trained on real data alone. Finally, we demonstrate that only 25% of the original (real) dataset is required to obtain similar detection performance when augmenting the training dataset with synthetic data.

Submitted 15 April, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

arXiv:2204.00964

AdaFace: Quality Adaptive Margin for Face Recognition

Authors: Minchul Kim, Anil K. Jain, Xiaoming Liu

Abstract: Recognition in low quality face datasets is challenging because facial attributes are obscured and degraded. Advances in margin-based loss functions have resulted in enhanced discriminability of faces in the embedding space. Further, previous studies have studied the effect of adaptive losses to assign more importance to misclassified (hard) examples. In this work, we introduce another aspect of adaptiveness in the loss function, namely the image quality. We argue that the strategy to emphasize misclassified samples should be adjusted according to their image quality. Specifically, the relative importance of easy or hard samples should be based on the sample's image quality. We propose a new loss function that emphasizes samples of different difficulties based on their image quality. Our method achieves this in the form of an adaptive margin function by approximating the image quality with feature norms. Extensive experiments show that our method, AdaFace, improves the face recognition performance over the state-of-the-art (SoTA) on four datasets (IJB-B, IJB-C, IJB-S and TinyFace). Code and models are released in https://github.com/mk-minchul/AdaFace.

Submitted 15 February, 2023; v1 submitted 2 April, 2022; originally announced April 2022.

Comments: Published in CVPR2022 (Oral)

arXiv:2201.04806

RealGait: Gait Recognition for Person Re-Identification

Authors: Shaoxiong Zhang, Yunhong Wang, Tianrui Chai, Annan Li, Anil K. Jain

Abstract: Human gait is considered a unique biometric identifier which can be acquired in a covert manner at a distance. However, models trained on existing public domain gait datasets which are captured in controlled scenarios lead to drastic performance decline when applied to real-world unconstrained gait data. On the other hand, video person re-identification techniques have achieved promising performance on large-scale publicly available datasets. Given the diversity of clothing characteristics, clothing cue is not reliable for person recognition in general. So, it is actually not clear why the state-of-the-art person re-identification methods work as well as they do. In this paper, we construct a new gait dataset by extracting silhouettes from an existing video person re-identification challenge which consists of 1,404 persons walking in an unconstrained manner. Based on this dataset, a consistent and comparative study between gait recognition and person re-identification can be carried out. Given that our experimental results show that current gait recognition approaches designed under data collected in controlled scenarios are inappropriate for real surveillance scenarios, we propose a novel gait recognition method, called RealGait. Our results suggest that recognizing people by their gait in real surveillance scenarios is feasible and the underlying gait pattern is probably the true reason why video person re-identification works in practice.

Submitted 8 February, 2023; v1 submitted 13 January, 2022; originally announced January 2022.

arXiv:2201.03674

PrintsGAN: Synthetic Fingerprint Generator

Authors: Joshua J. Engelsma, Steven A. Grosz, Anil K. Jain

Abstract: A major impediment to researchers working in the area of fingerprint recognition is the lack of publicly available, large-scale, fingerprint datasets. The publicly available datasets that do exist contain very few identities and impressions per finger. This limits research on a number of topics, including e.g., using deep networks to learn fixed length fingerprint embeddings. Therefore, we propose PrintsGAN, a synthetic fingerprint generator capable of generating unique fingerprints along with multiple impressions for a given fingerprint. Using PrintsGAN, we synthesize a database of 525k fingerprints (35K distinct fingers, each with 15 impressions). Next, we show the utility of the PrintsGAN generated dataset by training a deep network to extract a fixed-length embedding from a fingerprint. In particular, an embedding model trained on our synthetic fingerprints and fine-tuned on a small number of publicly available real fingerprints (25K prints from NIST SD302) obtains a TAR of 87.03% @ FAR=0.01% on the NIST SD4 database (a boost from TAR=73.37% when only trained on NIST SD302). Prevailing synthetic fingerprint generation methods do not enable such performance gains due to i) lack of realism or ii) inability to generate multiple impressions per finger. We plan to release our database of synthetic fingerprints to the public.

Submitted 20 January, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

arXiv:2108.04212

AutoVideo: An Automated Video Action Recognition System

Authors: Daochen Zha, Zaid Pervaiz Bhat, Yi-Wei Chen, Yicheng Wang, Sirui Ding, Jiaben Chen, Kwei-Herng Lai, Mohammad Qazim Bhat, Anmoll Kumar Jain, Alfredo Costilla Reyes, Na Zou, Xia Hu

Abstract: Action recognition is an important task for video understanding with broad applications. However, developing an effective action recognition solution often requires extensive engineering efforts in building and testing different combinations of the modules and their hyperparameters. In this demo, we present AutoVideo, a Python system for automated video action recognition. AutoVideo is featured for 1) highly modular and extendable infrastructure following the standard pipeline language, 2) an exhaustive list of primitives for pipeline construction, 3) data-driven tuners to save the efforts of pipeline tuning, and 4) easy-to-use Graphical User Interface (GUI). AutoVideo is released under MIT license at https://github.com/datamllab/autovideo

Submitted 16 July, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Comments: Accepted by IJCAI https://github.com/datamllab/autovideo

arXiv:2107.06641

Trustworthy AI: A Computational Perspective

Authors: Haochen Liu, Yiqi Wang, Wenqi Fan, Xiaorui Liu, Yaxin Li, Shaili Jain, Yunhao Liu, Anil K. Jain, Jiliang Tang

Abstract: In the past few decades, artificial intelligence (AI) technology has experienced swift developments, changing everyone's daily life and profoundly altering the course of human society. The intention of developing AI is to benefit humans, by reducing human labor, bringing everyday convenience to human lives, and promoting social good. However, recent research and AI applications show that AI can cause unintentional harm to humans, such as making unreliable decisions in safety-critical scenarios or undermining fairness by inadvertently discriminating against one group. Thus, trustworthy AI has attracted immense attention recently, which requires careful consideration to avoid the adverse effects that AI may bring to humans, so that humans can fully trust and live in harmony with AI technologies. Recent years have witnessed a tremendous amount of research on trustworthy AI. In this survey, we present a comprehensive survey of trustworthy AI from a computational perspective, to help readers understand the latest technologies for achieving trustworthy AI. Trustworthy AI is a large and complex area, involving various dimensions. In this work, we focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
We also discuss the accordant and conflicting interactions among different dimensions and discuss potential aspects for trustworthy AI to investigate in the future.

Submitted 18 August, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

Comments: 55 pages

arXiv:2105.06625

Biometrics: Trust, but Verify

Authors: Anil K. Jain, Debayan Deb, Joshua J. Engelsma

Abstract: Over the past two decades, biometric recognition has exploded into a plethora of different applications around the globe. This proliferation can be attributed to the high levels of authentication accuracy and user convenience that biometric recognition systems afford end-users. However, in spite of the success of biometric recognition systems, there are a number of outstanding problems and concerns pertaining to the various sub-modules of biometric recognition systems that create an element of mistrust in their use - both by the scientific community and also the public at large. Some of these problems include: i) questions related to system recognition performance, ii) security (spoof attacks, adversarial attacks, template reconstruction attacks and demographic information leakage), iii) uncertainty over the bias and fairness of the systems to all users, iv) explainability of the seemingly black-box decisions made by most recognition systems, and v) concerns over data centralization and user privacy. In this paper, we provide an overview of each of the aforementioned open-ended challenges. We survey work that has been conducted to address each of these concerns and highlight the issues requiring further attention. Finally, we provide insights into how the biometric community can address core biometric recognition systems design issues to better instill trust, fairness, and security for all.

Submitted 31 May, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

Comments: 20 pages, 15 figures

arXiv:2104.03008

FedFace: Collaborative Learning of Face Recognition Model

Authors: Divyansh Aggarwal, Jiayu Zhou, Anil K. Jain

Abstract: DNN-based face recognition models require large centrally aggregated face datasets for training. However, due to the growing data privacy concerns and legal restrictions, accessing and sharing face datasets has become exceedingly difficult. We propose FedFace, a federated learning (FL) framework for collaborative learning of face recognition models in a privacy-aware manner. FedFace utilizes the face images available on multiple clients to learn an accurate and generalizable face recognition model where the face images stored at each client are neither shared with other clients nor the central host and each client is a mobile device containing face images pertaining to only the owner of the device (one identity per client). Our experiments show the effectiveness of FedFace in enhancing the verification performance of pre-trained face recognition system on standard face verification benchmarks namely LFW, IJB-A, and IJB-C.

Submitted 24 June, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

arXiv:2104.02811

C2CL: Contact to Contactless Fingerprint Matching

Authors: Steven A. Grosz, Joshua J. Engelsma, Eryun Liu, Anil K. Jain

Abstract: Matching contactless fingerprints or finger photos to contact-based fingerprint impressions has received increased attention in the wake of COVID-19 due to the superior hygiene of the contactless acquisition and the widespread availability of low cost mobile phones capable of capturing photos of fingerprints with sufficient resolution for verification purposes. This paper presents an end-to-end automated system, called C2CL, comprised of a mobile finger photo capture app, preprocessing, and matching algorithms to handle the challenges inhibiting previous cross-matching methods; namely i) low ridge-valley contrast of contactless fingerprints, ii) varying roll, pitch, yaw, and distance of the finger to the camera, iii) non-linear distortion of contact-based fingerprints, and iv) different image qualities of smartphone cameras. Our preprocessing algorithm segments, enhances, scales, and unwarps contactless fingerprints, while our matching algorithm extracts both minutiae and texture representations. A sequestered dataset of 9,888 contactless 2D fingerprints and corresponding contact-based fingerprints from 206 subjects (2 thumbs and 2 index fingers for each subject) acquired using our mobile capture app is used to evaluate the cross-database performance of our proposed algorithm. Furthermore, additional experimental results on 3 publicly available datasets show substantial improvement in the state-of-the-art for contact to contactless fingerprint matching (TAR in the range of 96.67% to 98.30% at FAR=0.01%).

Submitted 9 December, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

arXiv:2104.02156

Unified Detection of Digital and Physical Face Attacks

Authors: Debayan Deb, Xiaoming Liu, Anil K. Jain

Abstract: State-of-the-art defense mechanisms against face attacks achieve near perfect accuracies within one of three attack categories, namely adversarial, digital manipulation, or physical spoofs; however, they fail to generalize well when tested across all three categories. Poor generalization can be attributed to learning incoherent attacks jointly. To overcome this shortcoming, we propose a unified attack detection framework, namely UniFAD, that can automatically cluster 25 coherent attack types belonging to the three categories. Using a multi-task learning framework along with k-means clustering, UniFAD learns joint representations for coherent attacks, while uncorrelated attack types are learned separately. Proposed UniFAD outperforms prevailing defense methods and their fusion with an overall TDR = 94.73% @ 0.2% FDR on a large fake face dataset consisting of 341K bona fide images and 448K attack images of 25 types across all 3 categories. Proposed method can detect an attack within 3 milliseconds on an Nvidia 2080Ti. UniFAD can also identify the attack types and categories with 75.81% and 97.37% accuracies, respectively.

Submitted 5 April, 2021; originally announced April 2021.

arXiv:2011.14371

Predicting Regional Locust Swarm Distribution with Recurrent Neural Networks

Authors: Hadia Mohmmed Osman Ahmed Samil, Annabelle Martin, Arnav Kumar Jain, Susan Amin, Samira Ebrahimi Kahou

Abstract: Locust infestation of some regions in the world, including Africa, Asia and the Middle East, has become a concerning issue that can affect the health and the lives of millions of people. In this respect, there have been attempts to resolve or reduce the severity of this problem via detection and monitoring of locust breeding areas using satellites and sensors, or the use of chemicals to prevent the formation of swarms. However, such methods have not been able to suppress the emergence and the collective behaviour of locusts. The ability to predict the location of the locust swarms prior to their formation, on the other hand, can help people get prepared and tackle the infestation issue more effectively. Here, we use machine learning to predict the location of locust swarms using the available data published by the Food and Agriculture Organization of the United Nations. The data includes the location of the observed swarms as well as environmental information, including soil moisture and the density of vegetation. The obtained results show that our proposed model can successfully, and with reasonable precision, predict the location of locust swarms, as well as their likely level of damage using a notion of density.

Submitted 12 November, 2021; v1 submitted 29 November, 2020; originally announced November 2020.

arXiv:2011.14218

FaceGuard: A Self-Supervised Defense Against Adversarial Face Images

Authors: Debayan Deb, Xiaoming Liu, Anil K. Jain

Abstract: Prevailing defense mechanisms against adversarial face images tend to overfit to the adversarial perturbations in the training set and fail to generalize to unseen adversarial attacks. We propose a new self-supervised adversarial defense framework, namely FaceGuard, that can automatically detect, localize, and purify a wide variety of adversarial faces without utilizing pre-computed adversarial training samples. During training, FaceGuard automatically synthesizes challenging and diverse adversarial attacks, enabling a classifier to learn to distinguish them from real faces and a purifier attempts to remove the adversarial perturbations in the image space. Experimental results on LFW dataset show that FaceGuard can achieve 99.81% detection accuracy on six unseen adversarial attack types. In addition, the proposed method can enhance the face recognition performance of ArcFace from 34.27% TAR @ 0.1% FAR under no defense to 77.46% TAR @ 0.1% FAR.

Submitted 5 April, 2021; v1 submitted 28 November, 2020; originally announced November 2020.

arXiv:2011.13126

Lifting 2D StyleGAN for 3D-Aware Face Generation

Authors: Yichun Shi, Divyansh Aggarwal, Anil K. Jain

Abstract: We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation. Our model is "3D-aware" in the sense that it is able to (1) disentangle the latent space of StyleGAN2 into texture, shape, viewpoint, lighting and (2) generate 3D components for rendering synthetic images. Unlike most previous methods, our method is completely self-supervised, i.e. it neither requires any manual annotation nor 3DMM model for training. Instead, it learns to generate images as well as their 3D components by distilling the prior knowledge in StyleGAN2 with a differentiable renderer. The proposed model is able to output both the 3D shape and texture, allowing explicit pose and lighting control over generated images. Qualitative and quantitative results show the superiority of our approach over existing methods on 3D-controllable GANs in content controllability while generating realistic high quality images.

Submitted 18 April, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

Comments: in CVPR 2021

arXiv:2010.06121

To be Robust or to be Fair: Towards Fairness in Adversarial Training

Authors: Han Xu, Xiaorui Liu, Yaxin Li, Anil K. Jain, Jiliang Tang

Abstract: Adversarial training algorithms have been proved to be reliable to improve machine learning models' robustness against adversarial examples. However, we find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data. For instance, a PGD adversarially trained ResNet18 model on CIFAR-10 has 93% clean accuracy and 67% PGD l-infty-8 robust accuracy on the class "automobile" but only 65% and 17% on the class "cat". This phenomenon happens in balanced datasets and does not exist in naturally trained models when only using clean samples. In this work, we empirically and theoretically show that this phenomenon can happen under general adversarial training algorithms which minimize DNN models' robust errors. Motivated by these findings, we propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses. Experimental results validate the effectiveness of FRL.

Submitted 18 May, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

arXiv:2010.03624 [pdf, other]

Infant-ID: Fingerprints for Global Good

Authors: Joshua J. Engelsma, Debayan Deb, Kai Cao, Anjoo Bhatnagar, Prem S. Sudhish, Anil K. Jain

Abstract: In many of the least developed and developing countries, a multitude of infants continue to suffer and die from vaccine-preventable diseases and malnutrition. Lamentably, the lack of official identification documentation makes it exceedingly difficult to track which infants have been vaccinated and which infants have received nutritional supplements. Answering these questions could prevent this infant suffering and premature death around the world. To that end, we propose Infant-Prints, an end-to-end, low-cost, infant fingerprint recognition system. Infant-Prints is comprised of our (i) custom built, compact, low-cost (85 USD), high-resolution (1,900 ppi), ergonomic fingerprint reader, and (ii) high-resolution infant fingerprint matcher. To evaluate the efficacy of Infant-Prints, we collected a longitudinal infant fingerprint database captured in 4 different sessions over a 12-month time span (December 2018 to January 2020), from 315 infants at the Saran Ashram Hospital, a charitable hospital in Dayalbagh, Agra, India. Our experimental results demonstrate, for the first time, that Infant-Prints can deliver accurate and reliable recognition (over time) of infants enrolled between the ages of 2-3 months, in time for effective delivery of vaccinations, healthcare, and nutritional supplements (TAR=95.2% @ FAR = 1.0% for infants aged 8-16 weeks at enrollment and authenticated 3 months later).

Submitted 7 October, 2020; originally announced October 2020.

Comments: 16 pages, 16 figures

arXiv:2009.07888 [pdf, other]

Transfer Learning in Deep Reinforcement Learning: A Survey

Authors: Zhuangdi Zhu, Kaixiang Lin, Anil K. Jain, Jiayu Zhou

Abstract: Reinforcement learning is a learning paradigm for solving sequential decision-making problems. Recent years have witnessed remarkable progress in reinforcement learning upon the fast development of deep neural networks. Along with the promising prospects of reinforcement learning in numerous domains such as robotics and game-playing, transfer learning has arisen to tackle various challenges faced by reinforcement learning, by transferring knowledge from external expertise to facilitate the efficiency and effectiveness of the learning process. In this survey, we systematically investigate the recent progress of transfer learning approaches in the context of deep reinforcement learning. Specifically, we provide a framework for categorizing the state-of-the-art transfer learning approaches, under which we analyze their goals, methodologies, compatible reinforcement learning backbones, and practical applications. We also draw connections between transfer learning and other relevant topics from the reinforcement learning perspective and explore their potential challenges that await future research progress.

Submitted 4 July, 2023; v1 submitted 16 September, 2020; originally announced September 2020.

arXiv:2008.00128 [pdf, other]

White-Box Evaluation of Fingerprint Recognition Systems

Authors: Steven A. Grosz, Joshua J. Engelsma, Anil K. Jain

Abstract: Typical evaluations of fingerprint recognition systems consist of end-to-end black-box evaluations, which assess performance in terms of overall identification or authentication accuracy. However, these black-box tests of system performance do not reveal insights into the performance of the individual modules, including image acquisition, feature extraction, and matching. On the other hand, white-box evaluations, the topic of this paper, measure the individual performance of each constituent module in isolation. While a few studies have conducted white-box evaluations of the fingerprint reader, feature extractor, and matching components, no existing study has provided a full system, white-box analysis of the uncertainty introduced at each stage of a fingerprint recognition system. In this work, we extend previous white-box evaluations of fingerprint recognition system components and provide a unified, in-depth analysis of fingerprint recognition system performance based on the aggregated white-box evaluation results. In particular, we analyze the uncertainty introduced at each stage of the fingerprint recognition system due to adverse capture conditions (i.e., varying illumination, moisture, and pressure) at the time of acquisition. Our experiments show that a system that performs better overall, in terms of black-box recognition performance, does not necessarily perform best at each module in the fingerprint recognition system pipeline, which can only be seen with white-box analysis of each sub-module. Findings such as these enable researchers to better focus their efforts in improving fingerprint recognition systems.

Submitted 31 July, 2020; originally announced August 2020.

arXiv:2006.07576 [pdf, other]

Mitigating Face Recognition Bias via Group Adaptive Classifier

Authors: Sixue Gong, Xiaoming Liu, Anil K. Jain

Abstract: Face recognition is known to exhibit bias - subjects in a certain demographic group can be better recognized than other groups. This work aims to learn a fair face representation, where faces of every group could be more equally represented. Our proposed group adaptive classifier mitigates bias by using adaptive convolution kernels and attention mechanisms on faces based on their demographic attributes. The adaptive module comprises kernel masks and channel-wise attention maps for each demographic group so as to activate different facial regions for identification, leading to more discriminative features pertinent to their demographics. Our introduced automated adaptation strategy determines whether to apply adaptation to a certain layer by iteratively computing the dissimilarity among demographic-adaptive parameters. A new de-biasing loss function is proposed to mitigate the gap of average intra-class distance between demographic groups. Experiments on face benchmarks (RFW, LFW, IJB-A, and IJB-C) show that our work is able to mitigate face recognition bias across demographic groups while maintaining the competitive accuracy.

Submitted 30 November, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

arXiv:2006.02834 [pdf, other]

Look Locally Infer Globally: A Generalizable Face Anti-Spoofing Approach

Authors: Debayan Deb, Anil K. Jain

Abstract: State-of-the-art spoof detection methods tend to overfit to the spoof types seen during training and fail to generalize to unknown spoof types. Given that face anti-spoofing is inherently a local task, we propose a face anti-spoofing framework, namely Self-Supervised Regional Fully Convolutional Network (SSR-FCN), that is trained to learn local discriminative cues from a face image in a self-supervised manner. The proposed framework improves generalizability while maintaining the computational efficiency of holistic face anti-spoofing approaches (< 4 ms on a Nvidia GTX 1080Ti GPU). The proposed method is interpretable since it localizes which parts of the face are labeled as spoofs. Experimental results show that SSR-FCN can achieve TDR = 65% @ 2.0% FDR when evaluated on a dataset comprising of 13 different spoof types under unknown attacks while achieving competitive performances under standard benchmark datasets (Oulu-NPU, CASIA-MFSD, and Replay-Attack).

Submitted 15 June, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

arXiv:2004.03756 [pdf, other]

DashCam Pay: A System for In-vehicle Payments Using Face and Voice

Authors: Cori Tymoszek, Sunpreet S. Arora, Kim Wagner, Anil K. Jain

Abstract: We present our ongoing work on developing a system, called DashCam Pay, that enables in-vehicle payments in a seamless and secure manner using face and voice biometrics. A plug-and-play device (dashcam) mounted in the vehicle is used to capture face images and voice commands of passengers. Privacy-preserving biometric comparison techniques are used to compare the biometric data captured by the dashcam with the biometric data enrolled on the users' mobile devices over a wireless interface (e.g., Bluetooth or Wi-Fi Direct) to determine the payer. Once the payer is identified, payment is conducted using the enrolled payment credential on the mobile device of the payer. We conduct preliminary analysis on data collected using a commercially available dashcam to show the feasibility of building the proposed system. A prototype of the proposed system is also developed in Android. DashCam Pay can be integrated as a software solution by dashcam or vehicle manufacturers to enable open loop in-vehicle payments.

Submitted 8 September, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: 9 pages

arXiv:2004.02941 [pdf, other]

Fingerprint Presentation Attack Detection: A Sensor and Material Agnostic Approach

Authors: Steven A. Grosz, Tarang Chugh, Anil K. Jain

Abstract: The vulnerability of automated fingerprint recognition systems to presentation attacks (PA), i.e., spoof or altered fingers, has been a growing concern, warranting the development of accurate and efficient presentation attack detection (PAD) methods. However, one major limitation of the existing PAD solutions is their poor generalization to new PA materials and fingerprint sensors, not used in training. In this study, we propose a robust PAD solution with improved cross-material and cross-sensor generalization. Specifically, we build on top of any CNN-based architecture trained for fingerprint spoof detection combined with cross-material spoof generalization using a style transfer network wrapper. We also incorporate adversarial representation learning (ARL) in deep neural networks (DNN) to learn sensor and material invariant representations for PAD. Experimental results on LivDet 2015 and 2017 public domain datasets exhibit the effectiveness of the proposed approach.

Submitted 6 April, 2020; originally announced April 2020.

arXiv:2003.12197 [pdf, other]

HERS: Homomorphically Encrypted Representation Search

Authors: Joshua J. Engelsma, Anil K. Jain, Vishnu Naresh Boddeti

Abstract: We present a method to search for a probe (or query) image representation against a large gallery in the encrypted domain. We require that the probe and gallery images be represented in terms of a fixed-length representation, which is typical for representations obtained from learned networks. Our encryption scheme is agnostic to how the fixed-length representation is obtained and can therefore be applied to any fixed-length representation in any application domain. Our method, dubbed HERS (Homomorphically Encrypted Representation Search), operates by (i) compressing the representation towards its estimated intrinsic dimensionality with minimal loss of accuracy (ii) encrypting the compressed representation using the proposed fully homomorphic encryption scheme, and (iii) efficiently searching against a gallery of encrypted representations directly in the encrypted domain, without decrypting them. Numerical results on large galleries of face, fingerprint, and object datasets such as ImageNet show that, for the first time, accurate and fast image search within the encrypted domain is feasible at scale (500 seconds; $275\times$ speed up over state-of-the-art for encrypted search against a gallery of 100 million). Code is available at https://github.com/human-analysis/hers-encrypted-image-search

Submitted 18 June, 2022; v1 submitted 26 March, 2020; originally announced March 2020.

Comments: Published in the Trustworthy Biometrics Special Issue of IEEE Transactions on Biometrics, Behavior, and Identity Science 2021

arXiv:2003.08788 [pdf, other]

Child Face Age-Progression via Deep Feature Aging

Authors: Debayan Deb, Divyansh Aggarwal, Anil K. Jain

Abstract: Given a gallery of face images of missing children, state-of-the-art face recognition systems fall short in identifying a child (probe) recovered at a later age. We propose a feature aging module that can age-progress deep face features output by a face matcher. In addition, the feature aging module guides age-progression in the image space such that synthesized aged faces can be utilized to enhance longitudinal face recognition performance of any face matcher without requiring any explicit training. For time lapses larger than 10 years (the missing child is found after 10 or more years), the proposed age-progression module improves the closed-set identification accuracy of FaceNet from 16.53% to 21.44% and CosFace from 60.72% to 66.12% on a child celebrity dataset, namely ITWCC. The proposed method also outperforms state-of-the-art approaches with a rank-1 identification rate of 95.91%, compared to 94.91%, on a public aging dataset, FG-NET, and 99.58%, compared to 99.50%, on CACD-VS. These results suggest that aging face features enhances the ability to identify young children who are possible victims of child trafficking or abduction.

Submitted 17 March, 2020; originally announced March 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1911.07538

arXiv:2003.07936 [pdf, other]

Boosting Unconstrained Face Recognition with Auxiliary Unlabeled Data

Authors: Yichun Shi, Anil K. Jain

Abstract: In recent years, significant progress has been made in face recognition, which can be partially attributed to the availability of large-scale labeled face datasets. However, since the faces in these datasets usually contain limited degree and types of variation, the resulting trained models generalize poorly to more realistic unconstrained face datasets. While collecting labeled faces with larger variations could be helpful, it is practically infeasible due to privacy and labor cost. In comparison, it is easier to acquire a large number of unlabeled faces from different domains, which could be used to regularize the learning of face representations. We present an approach to use such unlabeled faces to learn generalizable face representations, where we assume neither the access to identity labels nor domain labels for unlabeled images. Experimental results on unconstrained datasets show that a small amount of unlabeled data with sufficient diversity can (i) lead to an appreciable gain in recognition performance and (ii) outperform the supervised baseline when combined with less than half of the labeled data. Compared with the state-of-the-art face recognition methods, our method further improves their performance on challenging benchmarks, such as IJB-B, IJB-C and IJB-S.

Submitted 18 April, 2021; v1 submitted 17 March, 2020; originally announced March 2020.

arXiv:2002.11841 [pdf, other]

Towards Universal Representation Learning for Deep Face Recognition

Authors: Yichun Shi, Xiang Yu, Kihyuk Sohn, Manmohan Chandraker, Anil K. Jain

Abstract: Recognizing wild faces is extremely hard as they appear with all kinds of variations. Traditional methods either train with specifically annotated variation data from target domains, or by introducing unlabeled target variation data to adapt from the training data. Instead, we propose a universal representation learning framework that can deal with larger variation unseen in the given training data without leveraging target domain knowledge. We firstly synthesize training data alongside some semantically meaningful variations, such as low resolution, occlusion and head pose. However, directly feeding the augmented data for training will not converge well as the newly introduced samples are mostly hard examples. We propose to split the feature embedding into multiple sub-embeddings, and associate different confidence values for each sub-embedding to smooth the training procedure. The sub-embeddings are further decorrelated by regularizing variation classification loss and variation adversarial loss on different partitions of them. Experiments show that our method achieves top performance on general face recognition datasets such as LFW and MegaFace, while significantly better on extreme benchmarks such as TinyFace and IJB-S.

Submitted 26 February, 2020; originally announced February 2020.

Comments: to appear in CVPR 2020

arXiv:1912.08240 [pdf, other]

Fingerprint Spoof Detection: Temporal Analysis of Image Sequence

Authors: Tarang Chugh, Anil K. Jain

Abstract: We utilize the dynamics involved in the imaging of a fingerprint on a touch-based fingerprint reader, such as perspiration, changes in skin color (blanching), and skin distortion, to differentiate real fingers from spoof (fake) fingers. Specifically, we utilize a deep learning-based architecture (CNN-LSTM) trained end-to-end using sequences of minutiae-centered local patches extracted from ten color frames captured on a COTS fingerprint reader. A time-distributed CNN (MobileNet-v1) extracts spatial features from each local patch, while a bi-directional LSTM layer learns the temporal relationship between the patches in the sequence. Experimental results on a database of 26,650 live frames from 685 subjects (1,333 unique fingers), and 32,910 spoof frames of 7 spoof materials (with 14 variants) shows the superiority of the proposed approach in both known-material and cross-material (generalization) scenarios. For instance, the proposed approach improves the state-of-the-art cross-material performance from TDR of 81.65% to 86.20% @ FDR = 0.2%.

Submitted 17 December, 2019; originally announced December 2019.

Comments: 8 pages

arXiv:1912.07195 [pdf, other]

Fingerprint Synthesis: Search with 100 Million Prints

Authors: Vishesh Mistry, Joshua J. Engelsma, Anil K. Jain

Abstract: Evaluation of large-scale fingerprint search algorithms has been limited due to lack of publicly available datasets. To address this problem, we utilize a Generative Adversarial Network (GAN) to synthesize a fingerprint dataset consisting of 100 million fingerprint images. In contrast to existing fingerprint synthesis algorithms, we incorporate an identity loss which guides the generator to synthesize fingerprints corresponding to more distinct identities. The characteristics of our synthesized fingerprints are shown to be more similar to real fingerprints than existing methods via eight different metrics (minutiae count - block and template, minutiae direction - block and template, minutiae convex hull area, minutiae spatial distribution, block minutiae quality distribution, and NFIQ 2.0 scores). Additionally, the synthetic fingerprints based on our approach are shown to be more distinct than synthetic fingerprints based on published methods through search results and imposter distribution statistics. Finally, we report for the first time in open literature, search accuracy against a gallery of 100 million fingerprint images (NIST SD4 Rank-1 accuracy of 89.7%).

Submitted 12 June, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

arXiv:1912.03737 [pdf, other]

Universal Material Translator: Towards Spoof Fingerprint Generalization

Authors: Rohit Gajawada, Additya Popli, Tarang Chugh, Anoop Namboodiri, Anil K. Jain

Abstract: Spoof detectors are classifiers that are trained to distinguish spoof fingerprints from bonafide ones. However, state of the art spoof detectors do not generalize well on unseen spoof materials. This study proposes a style transfer based augmentation wrapper that can be used on any existing spoof detector and can dynamically improve the robustness of the spoof detection system on spoof materials for which we have very low data. Our method is an approach for synthesizing new spoof images from a few spoof examples that transfers the style or material properties of the spoof examples to the content of bonafide fingerprints to generate a larger number of examples to train the classifier on. We demonstrate the effectiveness of our approach on materials in the publicly available LivDet 2015 dataset and show that the proposed approach leads to robustness to fingerprint spoofs of the target material.

Submitted 8 December, 2019; originally announced December 2019.

Comments: 8 pages, 6 figures, conference

Journal ref: IAPR International Conference on Biometrics (ICB), 2019

arXiv:1912.02710 [pdf, other]

Fingerprint Spoof Generalization

Authors: Tarang Chugh, Anil K. Jain

Abstract: We present a style-transfer based wrapper, called Universal Material Generator (UMG), to improve the generalization performance of any fingerprint spoof detector against spoofs made from materials not seen during training. Specifically, we transfer the style (texture) characteristics between fingerprint images of known materials with the goal of synthesizing fingerprint images corresponding to unknown materials, that may occupy the space between the known materials in the deep feature space. Synthetic live fingerprint images are also added to the training dataset to force the CNN to learn generative-noise invariant features which discriminate between lives and spoofs. The proposed approach is shown to improve the generalization performance of a state-of-the-art spoof detector, namely Fingerprint Spoof Buster, from TDR of 75.24% to 91.78% @ FDR = 0.2%. These results are based on a large-scale dataset of 5,743 live and 4,912 spoof images fabricated using 12 different materials. Additionally, the UMG wrapper is shown to improve the average cross-sensor spoof detection performance from 67.60% to 80.63% when tested on the LivDet 2017 dataset. Training the UMG wrapper requires only 100 live fingerprint images from the target sensor, alleviating the time and resources required to generate large-scale live and spoof datasets for a new sensor. We also fabricate physical spoof artifacts using a mixture of known spoof materials to explore the role of cross-material style transfer in improving generalization performance.

Submitted 5 December, 2019; originally announced December 2019.

Comments: 14 pages, 10 figures

arXiv:1911.08080 [pdf, other]

Jointly De-biasing Face Recognition and Demographic Attribute Estimation

Authors: Sixue Gong, Xiaoming Liu, Anil K. Jain

Abstract: We address the problem of bias in automated face recognition and demographic attribute estimation algorithms, where errors are lower on certain cohorts belonging to specific demographic groups. We present a novel de-biasing adversarial network (DebFace) that learns to extract disentangled feature representations for both unbiased face recognition and demographics estimation. The proposed network consists of one identity classifier and three demographic classifiers (for gender, age, and race) that are trained to distinguish identity and demographic attributes, respectively. Adversarial learning is adopted to minimize correlation among feature factors so as to abate bias influence from other factors. We also design a new scheme to combine demographics with identity features to strengthen robustness of face representation in different demographic groups. The experimental results show that our approach is able to reduce bias in face recognition as well as demographics estimation while achieving state-of-the-art performance.

Submitted 31 July, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1911.07538 [pdf, other]

Finding Missing Children: Aging Deep Face Features

Authors: Debayan Deb, Divyansh Aggarwal, Anil K. Jain

Abstract: Given a gallery of face images of missing children, state-of-the-art face recognition systems fall short in identifying a child (probe) recovered at a later age. We propose an age-progression module that can age-progress deep face features output by any commodity face matcher. For time lapses larger than 10 years (the missing child is found after 10 or more years), the proposed age-progression module improves the closed-set identification accuracy of FaceNet from 40% to 49.56% and CosFace from 56.88% to 61.25% on a child celebrity dataset, namely ITWCC. The proposed method also outperforms state-of-the-art approaches with a rank-1 identification rate from 94.91% to 95.91% on a public aging dataset, FG-NET, and from 99.50% to 99.58% on CACD-VS. These results suggest that aging face features enhances the ability to identify young children who are possible victims of child trafficking or abduction.

Submitted 18 November, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

Showing 1–50 of 103 results for author: Jain, A K