Robust face alignment and partial face recognition
Date of Issue: 2016-03-02
School of Electrical and Electronic Engineering
Face alignment and face recognition are two fundamental problems in facial analysis. Face alignment forms the basis for accurate face recognition, age estimation, and facial expression recognition. Face recognition has been widely applied in practical scenarios such as access control systems, large-scale surveillance, and human-computer interaction. There are two main lines of work in these fields: holistic face alignment and recognition, and partial face alignment and recognition. Numerous holistic face alignment and recognition methods have been proposed, and recent state-of-the-art methods have surpassed human recognition capability on the challenging LFW dataset. One of the major challenges in this area lies in designing a robust holistic face alignment method that can accurately detect landmarks on faces with large pose variations. In contrast, relatively few works have addressed partial face alignment and recognition, and they have achieved limited success. In this thesis, we aim to advance holistic face alignment and to contribute to the field of partial face alignment and recognition. In particular, for holistic face alignment, we devise two deep-learning-based approaches that estimate facial landmark positions with great robustness and high accuracy. For partial face alignment and recognition, we present an approach based on robust feature set matching, which achieves partial face alignment and recognition jointly in a single framework.

For holistic face alignment, we focus on the facial landmark detection problem. Mainstream facial landmark detection approaches consist of a pose initialization stage and a pose update stage. The pose initialization stage derives an initial pose for face alignment. Since facial landmark detection is a highly non-convex problem, this initial pose largely determines the local basin in which the final solution lies.
The pose update stage then locally refines the initial pose to achieve high alignment accuracy. Both stages are critical for robust and accurate face alignment. In our first work, to improve the robustness of the pose initialization stage against large pose variations, we devise a Global Exemplar-based Deep Auto-encoder Network (GEDAN), whose top regression layer deploys several exemplars to assist pose estimation. For the pose update stage, we design a series of Localized Deep Auto-encoder Networks (LDANs). Specifically, the first layer of each LDAN consists of individual Local Auto-Encoders (LAEs). Each LAE extracts pose-related features from its corresponding local patch. The outputs of these LAEs are fed directly into their corresponding local regressors. In addition, these outputs are concatenated into a global feature vector, which is further encoded by several auto-encoder layers to preserve the global facial structure. By assembling GEDAN and several LDANs in a coarse-to-fine manner, our approach achieves superior alignment accuracy at real-time speed. We term this network ensemble Cascaded Deep Auto-encoder Networks (CDAN).

While CDAN works well on near-upright faces, it is incapable of detecting landmarks in arbitrarily rotated facial images. To address this, we leverage the strength of Convolutional Neural Networks (CNNs) and devise a Hierarchical CNN (HiCNN) cascade. In particular, HiCNN consists of a global CNN, a part-based CNN, and a patch-based CNN. The global CNN generates a preliminary four-landmark configuration from the low-resolution facial image. Based on this preliminary result, the part-based CNN estimates landmark positions from the corresponding facial parts at a higher resolution. Lastly, the patch-based CNN refines the landmark positions using pose-indexed patches at the highest resolution.
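
As a rough illustration of the coarse-to-fine structure described above (a pose initialization stage followed by a sequence of pose update stages), the following Python sketch chains stand-in regressors. It is not the CDAN or HiCNN implementation: `run_cascade` and `make_toy_stage` are hypothetical names, and the toy stages simply move the estimate toward a known target in place of trained networks.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Shape = List[Point]
# Hypothetical interfaces: an initializer maps an image to an initial shape,
# and an update stage maps (image, current shape) to per-landmark offsets.
Initializer = Callable[[object], Shape]
UpdateStage = Callable[[object, Shape], List[Point]]


def run_cascade(image: object, initializer: Initializer,
                stages: Sequence[UpdateStage]) -> Shape:
    """Initialize a landmark shape, then refine it stage by stage."""
    shape = initializer(image)
    for stage in stages:
        deltas = stage(image, shape)
        # Each stage predicts an additive correction to the current shape.
        shape = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(shape, deltas)]
    return shape


def make_toy_stage(target: Shape, step: float) -> UpdateStage:
    """Toy stand-in for a trained regressor (assumption, not a real model):
    it moves each landmark a fraction `step` of the way toward `target`."""
    def stage(image: object, shape: Shape) -> List[Point]:
        return [(step * (tx - x), step * (ty - y))
                for (x, y), (tx, ty) in zip(shape, target)]
    return stage


if __name__ == "__main__":
    target = [(10.0, 10.0), (20.0, 10.0)]
    stages = [make_toy_stage(target, 0.5) for _ in range(3)]
    # The initializer here returns a fixed shape; a real one would use the image.
    print(run_cascade(None, lambda image: [(0.0, 0.0), (0.0, 0.0)], stages))
```

In this sketch every stage sees the output of the previous one, so a poor initial shape leaves more residual error for the later stages to correct, which mirrors why both stages matter.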
Extensive experiments on three benchmarks show that the proposed HiCNN can accurately detect landmarks in facial images with arbitrary in-plane rotation, large scale variations, and random face shifts.

Both CDAN and HiCNN are holistic face alignment methods, so they may fail if the input is an arbitrary facial patch. In realistic scenarios, however, faces may be severely occluded or randomly cropped, resulting in partial faces. It is desirable to automatically align these partial faces to holistic facial images and subsequently recognize them. To this end, we propose a new partial face recognition approach based on feature set matching, named Robust Point Set Matching (RPSM), which automatically aligns partial face patches to holistic gallery faces and is robust to occlusions and illumination changes. Given a gallery image and a probe face patch, we first detect keypoints and extract their local features. RPSM then matches the extracted local feature sets by minimizing their geometric and textural differences. Lastly, the similarity of the two faces is computed from the distance between the two feature sets. The matching problem is formulated in a linear programming framework; hence, an affine transformation constraint can easily be applied to prevent unrealistic face warping. The proposed RPSM achieves superior results in both partial face alignment and partial face recognition on four public face datasets.
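
To make the idea of matching two local feature sets under a combined textural and geometric cost more concrete, here is a minimal Python sketch. It deliberately swaps in a greedy one-to-one assignment for the linear programming formulation, omits the affine transformation constraint, and assumes the keypoints of both images are already expressed in a common coordinate frame. The names `pair_cost` and `match_feature_sets` and the weight `lam` are illustrative assumptions, not part of RPSM.

```python
import math
from typing import List, Sequence, Tuple

# A keypoint is ((x, y), descriptor); descriptors must share one length.
Keypoint = Tuple[Tuple[float, float], Sequence[float]]


def pair_cost(p: Keypoint, g: Keypoint, lam: float) -> float:
    """Textural distance plus `lam` times geometric distance."""
    (px, py), p_desc = p
    (gx, gy), g_desc = g
    textural = math.dist(p_desc, g_desc)
    geometric = math.hypot(px - gx, py - gy)
    return textural + lam * geometric


def match_feature_sets(probe: List[Keypoint], gallery: List[Keypoint],
                       lam: float = 0.1) -> Tuple[List[Tuple[int, int]], float]:
    """Greedy one-to-one matching (a simplification, not the LP formulation).

    Returns the matched (probe index, gallery index) pairs and the mean
    matched cost, used here as the distance between the two feature sets.
    """
    candidates = sorted(
        (pair_cost(p, g, lam), i, j)
        for i, p in enumerate(probe)
        for j, g in enumerate(gallery)
    )
    used_probe, used_gallery = set(), set()
    matches: List[Tuple[int, int]] = []
    total = 0.0
    for cost, i, j in candidates:  # cheapest pairs are accepted first
        if i in used_probe or j in used_gallery:
            continue
        used_probe.add(i)
        used_gallery.add(j)
        matches.append((i, j))
        total += cost
    distance = total / len(matches) if matches else math.inf
    return matches, distance
```

A probe patch would be compared against every gallery image in this way and assigned the identity of the gallery image with the smallest distance. Unlike the linear programming formulation, the greedy assignment is not guaranteed to find the minimum-cost matching.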