Re: IQ Biometrix FACES EDU PLUS V4.0


Rene Thivierge

Jul 16, 2024, 1:49:12 PM
to corfelowhe

As one of the most natural biometric techniques for identification [11], face recognition is well suited to law enforcement applications. In fact, the natural variation among individuals leads to good inter-class separation, making facial characteristics appealing for biometric recognition [12]. Whereas early face recognition methodologies were based on Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) (i.e., Eigenfaces [13] and Fisherfaces [14]), face recognition matured with the results achieved by Convolutional Neural Networks (CNNs). CNNs were successfully applied to face verification, i.e., the task of assessing whether two face images belong to the same person, and identification, i.e., the task of assessing whether a face image belongs to a specific identity in a set of known subjects [15]. Thanks to such advancements, face recognition is currently used for biometric authentication in applications such as smartphone unlocking [16] and passport verification [17].
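Although the systems cited above are CNN-based, the distinction between verification and identification can be illustrated with a minimal sketch over embedding vectors. Everything below (the function names, the threshold value, the toy gallery) is illustrative, not from the cited works:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(emb_a, emb_b, threshold=0.8):
    """Verification: do the two embeddings belong to the same person?"""
    return cosine_similarity(emb_a, emb_b) >= threshold

def identify(probe, gallery):
    """Identification: which enrolled identity matches the probe best?"""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))

# Toy 2-D embeddings standing in for real CNN features
gallery = {"subject_a": np.array([1.0, 0.0]),
           "subject_b": np.array([0.0, 1.0])}
```

In a real system the embeddings would come from a trained CNN and the threshold would be tuned on a validation set; the structure of the two tasks, however, stays the same.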

In fact, as analyzed in Section 2, despite the availability of many databases for face verification and identification, SCFace is the only one that includes both mugshots and surveillance camera images, allowing face recognition to be performed on CCTV frames using pictures from nine different points of view as reference images. However, all the faces in the surveillance camera frames are almost frontal. Therefore, we built a new dataset, the FRMDB, containing more mugshots for each subject (28) and videos from surveillance cameras taken from five different points of view. The FRMDB is specifically tailored to present a set of mugshots systematically taken from multiple points of view. The videos from the security cameras currently share the same lighting conditions and do not include occlusions; the background clutter, however, differs for each of the five points of view, as described in Section 3.

With the results achieved by CNNs in image recognition and face recognition, databases with more face images and unique identities appeared, to the point that training and evaluating CNNs at the scale of millions of images is possible. To this end, the CASIA-WebFace database [42] includes 494,414 face images of 10,575 unique identities. The images are crawled from the web at various resolutions. The database is available upon request, even if the official website seems to be discontinued at the time of writing. The MegaFace Challenge Dataset [43,44] includes 4.7 million color photos of 672,057 unique subjects at various resolutions. As the MegaFace Challenge ended, the database was discontinued and MegaFace data were no longer officially distributed. The VGGFace Dataset [32] contains 982,803 color images (95% frontal, 5% profile) of 2622 unique identities, whereas the VGGFace2 Dataset [33] includes 3.31 million color images of 9131 unique subjects. Both the VGGFace and the VGGFace2 datasets are free to use and open-access. The MegaFace Challenge, the VGGFace, and the VGGFace2 datasets include faces collected from the web under different conditions of lighting, pose, expression, and occlusion, similar to the LFW and YouTube Faces Datasets. The amount of images available in such databases makes them ideal for training DL-based techniques such as CNNs, even in a transfer learning fashion, as we did with the VGGFace and VGGFace2 datasets in this paper. However, given that these databases do not include systematically taken sets of mugshots and security videos to compare against, they are not suitable for evaluating the impact of using mugshots from multiple POVs on face recognition performance in surveillance scenarios.

For the experiments presented in this paper, we manually selected one frame for each video and cropped the face to test the recognition performance on such frames using different sets of mugshots. The selected frames and the cropped faces are available in the proposed dataset repository.

Face recognition is an easy task for humans. Experiments in [270] have shown that even one-to-three-day-old babies are able to distinguish between known faces. So how hard could it be for a computer? It turns out we still know little about human face recognition. Are inner features (eyes, nose, mouth) or outer features (head shape, hairline) used for successful face recognition? How do we analyze an image, and how does the brain encode it? David Hubel and Torsten Wiesel showed that our brain has specialized nerve cells responding to specific local features of a scene, such as lines, edges, angles, or movement. Since we don't see the world as scattered pieces, our visual cortex must somehow combine the different sources of information into useful patterns. Automatic face recognition is all about extracting those meaningful features from an image, putting them into a useful representation, and performing some kind of classification on them.

The Eigenfaces method described in [271] took a holistic approach to face recognition: a facial image is a point in a high-dimensional image space, and a lower-dimensional representation is found where classification becomes easy. The lower-dimensional subspace is found with Principal Component Analysis, which identifies the axes of maximum variance. While this kind of transformation is optimal from a reconstruction standpoint, it doesn't take any class labels into account. Imagine a situation where the variance is generated by an external source, say, the lighting. The axes with maximum variance do not necessarily contain any discriminative information at all, so classification becomes impossible. A class-specific projection with Linear Discriminant Analysis was therefore applied to face recognition in [23]. The basic idea is to minimize the variance within a class while maximizing the variance between the classes at the same time.
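The holistic pipeline just described (center the data, find the principal axes, project) can be sketched with NumPy; random vectors stand in for the flattened face images here, and the sizes are chosen purely for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-ins for flattened face images: 8 samples, 16 "pixels" each
X = rng.normal(size=(8, 16))
mean = X.mean(axis=0)

# PCA via SVD of the mean-centered data: the rows of Vt are the
# principal axes (the "Eigenfaces"), ordered by decreasing variance
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)

# Project one face into a 3-dimensional subspace and map it back
k = 3
coeffs = (X[0] - mean) @ Vt[:k].T
approximation = mean + coeffs @ Vt[:k]
```

With real images, each row of `X` would be a grayscale image flattened to a vector, and reshaping a row of `Vt` back to the image dimensions yields the Eigenface pictures shown below.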

Let's dissect the line. /path/to/image.ext is the path to an image; on Windows it would look something like C:/faces/person0/image0.jpg. Then there is the separator ;, and finally we assign the label 0 to the image. Think of the label as the subject (the person) this image belongs to, so the same subjects (persons) should have the same label.
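A minimal reader for this path;label format might look like the following (the function name and the handling of blank lines are my own choices, not from the original demo code):

```python
def read_csv(filename):
    """Parse 'path;label' lines into parallel lists of paths and int labels."""
    paths, labels = [], []
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            path, label = line.split(";")
            paths.append(path)
            labels.append(int(label))
    return paths, labels
```

Images of the same subject end up sharing one integer label, which is exactly what a classifier trained on this list expects.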

I've used the jet colormap, so you can see how the grayscale values are distributed within the specific Eigenfaces. You can see that the Eigenfaces encode not only facial features but also the illumination in the images (see the light from the left in Eigenface #4 and from the right in Eigenface #5):

We've already seen that we can reconstruct a face from its lower-dimensional approximation. So let's see how many Eigenfaces are needed for a good reconstruction. I'll do a subplot with \(10,30,\ldots,310\) Eigenfaces:
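Stripped of the plotting, the experiment boils down to reconstructing one sample from a growing number of eigenvectors and watching the error shrink. A numeric sketch, with random vectors standing in for the AT&T images:

```python
import numpy as np

def reconstruct(sample, mean, eigenvectors, k):
    """Project 'sample' onto the first k eigenvectors and map it back."""
    coeffs = (sample - mean) @ eigenvectors[:k].T
    return mean + coeffs @ eigenvectors[:k]

rng = np.random.default_rng(42)
X = rng.normal(size=(20, 64))          # toy stand-ins for flattened faces
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)

# The reconstruction error shrinks as more eigenvectors are kept
errors = [np.linalg.norm(X[0] - reconstruct(X[0], mean, Vt, k))
          for k in (1, 5, 10, 19)]
```

On real face data the same monotone decrease appears; the interesting question, addressed next, is where along that curve "good enough" lies.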

10 Eigenvectors are obviously not sufficient for a good image reconstruction; 50 Eigenvectors may already be sufficient to encode important facial features. You'll get a good reconstruction with approximately 300 Eigenvectors for the AT&T Facedatabase. There are rules of thumb for how many Eigenfaces you should choose for a successful face recognition, but it heavily depends on the input data. [315] is the perfect point to start researching this:
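One common rule of thumb is to keep the smallest number of components that explains a target fraction of the total variance. A sketch of that criterion (the 95% default is a convention, not a value from the text):

```python
import numpy as np

def components_for_variance(singular_values, target=0.95):
    """Smallest k whose components explain at least 'target' of the variance.

    The variance explained by each component is proportional to the
    square of its singular value.
    """
    var = singular_values ** 2
    ratio = np.cumsum(var) / var.sum()
    return int(np.searchsorted(ratio, target) + 1)
```

Feeding in the singular values S from the PCA step gives a data-driven choice of k instead of a fixed number.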

The Principal Component Analysis (PCA), which is the core of the Eigenfaces method, finds a linear combination of features that maximizes the total variance in the data. While this is clearly a powerful way to represent data, it doesn't consider any classes, so a lot of discriminative information may be lost when throwing components away. Imagine a situation where the variance in your data is generated by an external source, say, the lighting. The components identified by a PCA do not necessarily contain any discriminative information at all, so the projected samples are smeared together and classification becomes impossible (see _lda_with_gnu_octave for an example).
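The "variance from an external source" argument can be demonstrated on synthetic 2D data: two classes separated along one axis, with a much larger "lighting" variance along the other. This is a toy two-class Fisher LDA written from scratch, not a library implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Two classes separated along x, with huge "lighting" variance along y
class0 = np.column_stack([rng.normal(-1, 0.2, n), rng.normal(0, 10, n)])
class1 = np.column_stack([rng.normal(+1, 0.2, n), rng.normal(0, 10, n)])
X = np.vstack([class0, class1])

# PCA: the top component follows the biggest variance, the y axis
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
pca_dir = Vt[0]

# Fisher LDA for two classes: w is proportional to Sw^-1 (mu1 - mu0)
mu0, mu1 = class0.mean(axis=0), class1.mean(axis=0)
Sw = np.cov(class0.T) + np.cov(class1.T)   # within-class scatter
lda_dir = np.linalg.solve(Sw, mu1 - mu0)
lda_dir /= np.linalg.norm(lda_dir)
```

PCA's first axis aligns with the uninformative lighting direction, while the LDA direction aligns with the axis that actually separates the classes, which is precisely the failure mode the paragraph describes.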

For this example I am going to use the Yale Facedatabase A, just because the plots are nicer. Each Fisherface has the same length as an original image, thus it can be displayed as an image. The demo shows (or saves) the first (at most 16) Fisherfaces:

The Fisherfaces method learns a class-specific transformation matrix, so they do not capture illumination as obviously as the Eigenfaces method. The Discriminant Analysis instead finds the facial features that discriminate between the persons. It's important to mention that the performance of the Fisherfaces heavily depends on the input data as well. Practically speaking: if you learn the Fisherfaces from well-illuminated pictures only and then try to recognize faces in badly illuminated scenes, the method is likely to find the wrong components (just because those features may not be predominant in badly illuminated images). This is somewhat logical, since the method had no chance to learn the illumination.

The Fisherfaces allow a reconstruction of the projected image, just like the Eigenfaces did. But since we only identified the features that distinguish between subjects, you can't expect a nice reconstruction of the original image. For the Fisherfaces method, we'll instead project the sample image onto each of the Fisherfaces. So you'll get a nice visualization of which features each of the Fisherfaces describes:
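That per-component visualization amounts to projecting the sample onto one Fisherface at a time and mapping each scalar coefficient back to image space. A sketch with toy vectors (not the demo's actual code):

```python
import numpy as np

def project_onto_each(sample, mean, fisherfaces):
    """Reconstruct the sample from one Fisherface at a time."""
    reconstructions = []
    for w in fisherfaces:              # each row: one unit-length Fisherface
        coeff = (sample - mean) @ w    # scalar projection onto this component
        reconstructions.append(mean + coeff * w)
    return reconstructions
```

Each returned vector, reshaped to the image dimensions, shows only the part of the face that the corresponding Fisherface responds to.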

Eigenfaces and Fisherfaces take a somewhat holistic approach to face recognition. You treat your data as a vector somewhere in a high-dimensional image space. We all know high dimensionality is bad, so a lower-dimensional subspace is identified where (probably) useful information is preserved. The Eigenfaces approach maximizes the total scatter, which can lead to problems if the variance is generated by an external source, because components with maximum variance over all classes aren't necessarily useful for classification (see _lda_with_gnu_octave). To preserve some discriminative information, we applied a Linear Discriminant Analysis and optimized as described in the Fisherfaces method. The Fisherfaces method worked great... at least for the constrained scenario we've assumed in our model.
