-
A Covariate-Adjusted Homogeneity Test with Application to Facial Recognition Accuracy Assessment
Authors:
Ngoc-Ty Nguyen,
P. Jonathon Phillips,
Larry Tang
Abstract:
Ordinal scores occur commonly in medical imaging studies and in black-box forensic studies \citep{Phillips:2018}. To assess the accuracy of raters in the studies, one needs to estimate the receiver operating characteristic (ROC) curve while accounting for covariates of raters. In this paper, we propose a covariate-adjusted homogeneity test to determine differences in accuracy among multiple rater groups. We derived the theoretical results of the proposed test and conducted extensive simulation studies to evaluate the finite sample performance of the proposed test. Our proposed test is applied to a face recognition study to identify statistically significant differences among five participant groups.
Submitted 17 July, 2023;
originally announced July 2023.
-
Human-Machine Comparison for Cross-Race Face Verification: Race Bias at the Upper Limits of Performance?
Authors:
Geraldine Jeckeln,
Selin Yavuzcan,
Kate A. Marquis,
Prajay Sandipkumar Mehta,
Amy N. Yates,
P. Jonathon Phillips,
Alice J. O'Toole
Abstract:
Face recognition algorithms perform more accurately than humans in some cases, though humans and machines both show race-based accuracy differences. As algorithms continue to improve, it is important to continually assess their race bias relative to humans. We constructed a challenging test of 'cross-race' face verification and used it to compare humans and two state-of-the-art face recognition systems. Pairs of same- and different-identity faces of White and Black individuals were selected to be difficult for humans and an open-source implementation of the ArcFace face recognition algorithm from 2019 (5). Human participants (54 Black; 51 White) judged whether face pairs showed the same identity or different identities on a 7-point Likert-type scale. Two top-performing face recognition systems from the Face Recognition Vendor Test-ongoing performed the same test (7). By design, the test proved challenging for humans as a group, who performed above chance, but far less than perfect. Both state-of-the-art face recognition systems scored perfectly (no errors), consequently with equal accuracy for both races. We conclude that state-of-the-art systems for identity verification between two frontal face images of Black and White individuals can surpass the general population. Whether this result generalizes to challenging in-the-wild images is a pressing concern for deploying face recognition systems in unconstrained environments.
Submitted 30 May, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
A Flexible Multi-Metric Bayesian Framework for Decision-Making in Phase II Multi-Arm Multi-Stage Studies
Authors:
Suzanne M. Dufault,
Angela M. Crook,
Katie Rolfe,
Patrick P. J. Phillips
Abstract:
We propose a multi-metric flexible Bayesian framework to support efficient interim decision-making in multi-arm multi-stage phase II clinical trials. Multi-arm multi-stage phase II studies increase the efficiency of drug development, but early decisions regarding the futility or desirability of a given arm carry considerable risk since sample sizes are often low and follow-up periods may be short. Further, since intermediate outcomes based on biomarkers of treatment response are rarely perfect surrogates for the primary outcome and different trial stakeholders may have different levels of risk tolerance, a single hypothesis test is insufficient for comprehensively summarizing the state of the collected evidence. We present a Bayesian framework comprised of multiple metrics based on point estimates, uncertainty, and evidence towards desired thresholds (a Target Product Profile) for 1) ranking of arms and 2) comparison of each arm against an internal control. Using a large public-private partnership targeting novel TB arms as a motivating example, we find via simulation study that our multi-metric framework provides sufficient confidence for decision-making with sample sizes as low as 30 patients per arm, even when intermediate outcomes have only moderate correlation with the primary outcome. Our reframing of trial design and the decision-making procedure has been well-received by research partners and is a practical approach to more efficient assessment of novel therapeutics.
Submitted 23 October, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
-
Distill and De-bias: Mitigating Bias in Face Verification using Knowledge Distillation
Authors:
Prithviraj Dhar,
Joshua Gleason,
Aniket Roy,
Carlos D. Castillo,
P. Jonathon Phillips,
Rama Chellappa
Abstract:
Face recognition networks generally demonstrate bias with respect to sensitive attributes like gender, skintone etc. For gender and skintone, we observe that the regions of the face that a network attends to vary by the category of an attribute. This might contribute to bias. Building on this intuition, we propose a novel distillation-based approach called Distill and De-bias (D&D) to enforce a network to attend to similar face regions, irrespective of the attribute category. In D&D, we train a teacher network on images from one category of an attribute; e.g. light skintone. Then distilling information from the teacher, we train a student network on images of the remaining category; e.g., dark skintone. A feature-level distillation loss constrains the student network to generate teacher-like representations. This allows the student network to attend to similar face regions for all attribute categories and enables it to reduce bias. We also propose a second distillation step on top of D&D, called D&D++. Here, we distill the `un-biasedness' of the D&D network into a new student network, the D&D++ network, while training this new network on all attribute categories; e.g., both light and dark skintones. This helps us train a network that is less biased for an attribute, while obtaining higher face verification performance than D&D. We show that D&D++ outperforms existing baselines in reducing gender and skintone bias on the IJB-C dataset, while obtaining higher face verification performance than existing adversarial de-biasing methods. We evaluate the effectiveness of our proposed methods on two state-of-the-art face recognition networks: ArcFace and Crystalface.
Submitted 16 April, 2022; v1 submitted 17 December, 2021;
originally announced December 2021.
-
Face Identification Proficiency Test Designed Using Item Response Theory
Authors:
Géraldine Jeckeln,
Ying Hu,
Jacqueline G. Cavazos,
Amy N. Yates,
Carina A. Hahn,
Larry Tang,
P. Jonathon Phillips,
Alice J. O'Toole
Abstract:
Measures of face-identification proficiency are essential to ensure accurate and consistent performance by professional forensic face examiners and others who perform face-identification tasks in applied scenarios. Current proficiency tests rely on static sets of stimulus items, and so, cannot be administered validly to the same individual multiple times. To create a proficiency test, a large number of items of "known" difficulty must be assembled. Multiple tests of equal difficulty can be constructed then using subsets of items. We introduce the Triad Identity Matching (TIM) test and evaluate it using Item Response Theory (IRT). Participants view face-image "triads" (N=225) (two images of one identity, one image of a different identity) and select the different identity. In Experiment 1, university students (N=197) showed wide-ranging accuracy on the TIM test, and IRT modeling demonstrated that the TIM items span various difficulty levels. In Experiment 2, we used IRT-based item metrics to partition the test into subsets of specific difficulties. Simulations showed that subsets of the TIM items yielded reliable estimates of subject ability. In Experiments 3a and 3b, we found that the student-derived IRT model reliably evaluated the ability of non-student participants and that ability generalized across different test sessions. In Experiment 3c, we show that TIM test performance correlates with other common face-recognition tests. In summary, the TIM test provides a starting point for developing a framework that is flexible and calibrated to measure proficiency across various ability levels (e.g., professionals or populations with face-processing deficits).
Submitted 9 August, 2022; v1 submitted 22 June, 2021;
originally announced June 2021.
-
Four Principles of Explainable AI as Applied to Biometrics and Facial Forensic Algorithms
Authors:
P. Jonathon Phillips,
Mark Przybocki
Abstract:
Traditionally, researchers in automatic face recognition and biometric technologies have focused on developing accurate algorithms. With this technology being integrated into operational systems, engineers and scientists are being asked, do these systems meet societal norms? The origin of this line of inquiry is `trust' of artificial intelligence (AI) systems. In this paper, we concentrate on adapting explainable AI to face recognition and biometrics, and we present four principles of explainable AI to face recognition and biometrics. The principles are illustrated by $\it{four}$ case studies, which show the challenges and issues in developing algorithms that can produce explanations.
Submitted 3 February, 2020;
originally announced February 2020.
-
Accuracy comparison across face recognition algorithms: Where are we on measuring race bias?
Authors:
Jacqueline G. Cavazos,
P. Jonathon Phillips,
Carlos D. Castillo,
Alice J. O'Toole
Abstract:
Previous generations of face recognition algorithms differ in accuracy for images of different races (race bias). Here, we present the possible underlying factors (data-driven and scenario modeling) and methodological considerations for assessing race bias in algorithms. We discuss data driven factors (e.g., image quality, image population statistics, and algorithm architecture), and scenario modeling factors that consider the role of the "user" of the algorithm (e.g., threshold decisions and demographic constraints). To illustrate how these issues apply, we present data from four face recognition algorithms (a previous-generation algorithm and three deep convolutional neural networks, DCNNs) for East Asian and Caucasian faces. First, dataset difficulty affected both overall recognition accuracy and race bias, such that race bias increased with item difficulty. Second, for all four algorithms, the degree of bias varied depending on the identification decision threshold. To achieve equal false accept rates (FARs), East Asian faces required higher identification thresholds than Caucasian faces, for all algorithms. Third, demographic constraints on the formulation of the distributions used in the test, impacted estimates of algorithm accuracy. We conclude that race bias needs to be measured for individual applications and we provide a checklist for measuring this bias in face recognition algorithms.
Submitted 4 June, 2020; v1 submitted 16 December, 2019;
originally announced December 2019.
-
How are attributes expressed in face DCNNs?
Authors:
Prithviraj Dhar,
Ankan Bansal,
Carlos D. Castillo,
Joshua Gleason,
P. Jonathon Phillips,
Rama Chellappa
Abstract:
As deep networks become increasingly accurate at recognizing faces, it is vital to understand how these networks process faces. While these networks are solely trained to recognize identities, they also contain face related information such as sex, age, and pose of the face. The networks are not trained to learn these attributes. We introduce expressivity as a measure of how much a feature vector informs us about an attribute, where a feature vector can be from internal or final layers of a network. Expressivity is computed by a second neural network whose inputs are features and attributes. The output of the second neural network approximates the mutual information between feature vectors and an attribute. We investigate the expressivity for two different deep convolutional neural network (DCNN) architectures: a Resnet-101 and an Inception Resnet v2. In the final fully connected layer of the networks, we found the order of expressivity for facial attributes to be Age > Sex > Yaw. Additionally, we studied the changes in the encoding of facial attributes over training iterations. We found that as training progresses, expressivities of yaw, sex, and age decrease. Our technique can be a tool for investigating the sources of bias in a network and a step towards explaining the network's identity decisions.
Submitted 12 October, 2019;
originally announced October 2019.
-
Re-thinking non-inferiority: a practical trial design for optimising treatment duration
Authors:
Matteo Quartagno,
A. Sarah Walker,
James R. Carpenter,
Patrick P. J. Phillips,
Mahesh K. B. Parmar
Abstract:
Background: trials to identify the minimal effective treatment duration are needed in different therapeutic areas, including bacterial infections, TB and Hepatitis--C. However, standard non-inferiority designs have several limitations, including arbitrariness of non-inferiority margins, choice of research arms and very large sample sizes.
Methods: we recast the problem of finding an appropriate non-inferior treatment duration in terms of modelling the entire duration-response curve within a pre-specified range. We propose a multi-arm randomised trial design, allocating patients to different treatment durations. We use fractional polynomials and spline-based methods to flexibly model the duration-response curve. We compare different methods in terms of a scaled version of the area between true and estimated prediction curves. We evaluate sensitivity to key design parameters, including sample size, number and position of arms.
Results: a total sample size of $\sim 500$ patients divided into a moderate number of equidistant arms (5-7) is sufficient to estimate the duration-response curve within a $5\%$ error margin in $95\%$ of the simulations. Fractional polynomials provide similar or better results than spline-based methods in most scenarios.
Conclusions: our proposed practical randomised trial design is an alternative to standard non-inferiority designs, avoiding many of their limitations, and yet being fairly robust to different possible duration-response curves. The trial outcome is the whole duration-response curve, which could be used by clinicians and policy makers to make informed decisions, facilitating a move away from a forced binary hypothesis testing paradigm.
Submitted 5 February, 2018;
originally announced February 2018.
-
Experimental verification of orbital engineering at the atomic scale: charge transfer and symmetry breaking in nickelate heterostructures
Authors:
Patrick J. Phillips,
Paolo Longo,
Alexandru B. Georgescu,
Eiji Okunishi,
Xue Rui,
Ankit S. Disa,
Fred Walker,
Sohrab Ismail-Beigi,
Charles H. Ahn,
Robert F. Klie
Abstract:
Epitaxial strain, layer confinement and inversion symmetry breaking have emerged as powerful new approaches to control the electronic and atomic-scale structural properties in complex metal oxides. Nickelate heterostructures, based on RENiO$_3$, where RE is a trivalent rare-earth cation, have been shown to be relevant model systems since the orbital occupancy, degeneracy, and, consequently, the electronic/magnetic properties can be altered as a function of epitaxial strain, layer thickness and superlattice structure. One such recent example is the tri-component LaTiO$_3$-LaNiO$_3$-LaAlO$_3$ superlattice, which exhibits charge transfer and orbital polarization as the result of its interfacial dipole electric field. A crucial step towards control of these parameters for future electronic and magnetic device applications is to develop an understanding of both the magnitude and range of the octahedral network's response towards interfacial strain and electric fields. An approach that provides atomic-scale resolution and sensitivity towards the local octahedral distortions and orbital occupancy is therefore required. Here, we employ atomic-resolution imaging coupled with electron spectroscopies and first principles theory to examine the role of interfacial charge transfer and symmetry breaking in a tricomponent nickelate superlattice system. We find that nearly complete charge transfer occurs between the LaTiO$_3$ and LaNiO$_3$ layers, resulting in a Ni$^{2+}$ valence state. We further demonstrate that this charge transfer is highly localized with a range of about 1 unit cell, within the LaNiO$_3$ layers. The results presented here provide important feedback to synthesis efforts aimed at stabilizing new electronic phases that are not accessible by conventional bulk or epitaxial film approaches.
Submitted 16 December, 2016;
originally announced December 2016.
-
Reversible modulation of orbital occupations via an interface-induced state in metallic manganites
Authors:
Hanghui Chen,
Qiao Qiao,
Matthew S. J. Marshall,
Alexandru B. Georgescu,
Ahmet Gulec,
Patrick J. Phillips,
Robert F. Klie,
Frederick J. Walker,
Charles H. Ahn,
Sohrab Ismail-Beigi
Abstract:
The breaking of orbital degeneracy on a transition metal cation and the resulting unequal electronic occupations of these orbitals provide a powerful lever over electron density and spin ordering in metal oxides. Here, we use ab initio calculations to show that reversibly modulating the orbital populations on Mn atoms can be achieved at ferroelectric/manganite interfaces by the presence of ferroelectric polarization on the nanoscale. The change in orbital occupation can be as large as 10%, greatly exceeding that of bulk manganites. This reversible orbital splitting is in large part controlled by the propagation of ferroelectric polar displacements into the interfacial region, a structural motif absent in the bulk and unique to the interface. We use epitaxial thin film growth and scanning transmission electron microscopy to verify this key interfacial polar distortion and discuss the potential of reversible control of orbital polarization via nanoscale ferroelectrics.
Submitted 12 September, 2014;
originally announced September 2014.
-
Dynamical control of orbital occupations via a ferroelectric-induced polar state in metallic manganites
Authors:
Hanghui Chen,
Qiao Qiao,
Matthew S. J. Marshall,
Alexandru B. Georgescu,
Ahmet Gulec,
Patrick J. Phillips,
Robert F. Klie,
Frederick J. Walker,
Charles H. Ahn,
Sohrab Ismail-Beigi
Abstract:
The breaking of orbital degeneracy on a transition metal cation and the resulting unequal electronic occupations of these orbitals provide a powerful lever over electron density and spin ordering in metal oxides. Here, we show how to dynamically modulate the orbital populations on Mn atoms at ferroelectric/manganite interfaces by switching the ferroelectric polarization. The change in orbital occupation can be as large as 10\%, greatly exceeding that of bulk manganites. This flippable orbital splitting is in large part controlled by the propagation of ferroelectric polar displacements into the interfacial region, a structural motif absent in the bulk and unique to the interface. We use {\it ab initio} theory, epitaxial thin film growth, and scanning transmission electron microscopy to verify the predicted interfacial polar state and concomitant orbital splittings.
Submitted 14 September, 2014; v1 submitted 11 September, 2013;
originally announced September 2013.
-
Coaxial Nanowire Resonant Tunneling Diodes from non-polar AlN/GaN on Silicon
Authors:
S. D. Carnevale,
C. Marginean,
P. J. Phillips,
T. F. Kent,
A. T. M. G. Sarwar,
M. J. Mills,
R. C. Myers
Abstract:
Resonant tunneling diodes are formed using AlN/GaN core-shell nanowire heterostructures grown by plasma assisted molecular beam epitaxy on n-Si(111) substrates. By using a coaxial geometry these devices take advantage of non-polar (m-plane) nanowire sidewalls. Device modeling predicts non-polar orientation should enhance resonant tunneling compared to a polar structure and that AlN double barriers will lead to higher peak-to-valley current ratios compared to AlGaN barriers. Electrical measurements of ensembles of nanowires show negative differential resistance appearing only at cryogenic temperature. Individual nanowire measurements show negative differential resistance at room temperature with peak current density of 5*10^5 A/cm^2.
Submitted 23 March, 2012; v1 submitted 27 February, 2012;
originally announced February 2012.
-
Intersublevel Polaron Dephasing in Self-Assembled Quantum Dots
Authors:
E. A. Zibik,
T. Grange,
B. A. Carpenter,
R. Ferreira,
G. Bastard,
N. Q. Vinh,
P. J. Phillips,
M. J. Steer,
M. Hopkinson,
J. W. Cockburn,
M. S. Skolnick,
L. R. Wilson
Abstract:
Polaron dephasing processes are investigated in InAs/GaAs dots using far-infrared transient four wave mixing (FWM) spectroscopy. We observe an oscillatory behaviour in the FWM signal shortly (< 5 ps) after resonant excitation of the lowest energy conduction band transition due to coherent acoustic phonon generation. The subsequent single exponential decay yields long intraband dephasing times of 90 ps. We find excellent agreement between our measured and calculated FWM dynamics, and show that both real and virtual acoustic phonon processes are necessary to explain the temperature dependence of the polarization decay.
Submitted 26 October, 2007;
originally announced October 2007.
-
Spin-galvanic effect due to optical spin orientation
Authors:
S. D. Ganichev,
Petra Schneider,
V. V. Bel'kov,
E. L. Ivchenko,
S. A. Tarasenko,
W. Wegscheider,
D. Weiss,
D. Schuh,
B. N. Murdin,
P. J. Phillips,
C. R. Pidgeon,
D. G. Clarke,
M. Merrick,
P. Murzyn,
E. V. Beregulin,
W. Prettl
Abstract:
Under oblique incidence of circularly polarized infrared radiation the spin-galvanic effect has been unambiguously observed in (001)-grown $n$-type GaAs quantum well (QW) structures in the absence of any external magnetic field. Resonant inter-subband transitions have been obtained making use of the tunability of the free-electron laser FELIX. It is shown that a helicity dependent photocurrent along one of the $<110>$ axes is predominantly contributed by the spin-galvanic effect while that along the perpendicular in-plane axis is mainly due to the circular photogalvanic effect. This strong non-equivalence of the [110] and [1$\bar{1}$0] directions is determined by the interplay between bulk and structural inversion asymmetries. A microscopic theory of the spin-galvanic effect for direct inter-subband optical transitions has been developed being in good agreement with experimental findings.
Submitted 6 June, 2003; v1 submitted 11 March, 2003;
originally announced March 2003.