Robust multiple people tracking and re-identification in cluttered environment
Date of Issue: 2016
School of Electrical and Electronic Engineering
As the computational power of computers grows, there has been increasing interest in detecting, labelling and tracking people with computer vision algorithms in surveillance systems. To free human operators from the tedious task of watching over a set of monitors, and to improve the efficiency of surveillance systems, an automated system that can process tasks with little or no human involvement is desired. These computer vision techniques can be used in many applications, such as abnormal event detection, searching for and tracking a specific target (for example, a criminal suspect), and activity analysis. Although these areas have been studied for decades, some challenges remain unsolved. Occlusion is one of the essential challenges, including partial occlusion, full occlusion and mutual occlusion between multiple targets. Building a model to identify a target is also challenging, especially under varying conditions such as illumination change, different view angles and pose change. The objective of this research is to develop methods that help to deal with occlusion in tracking and to label targets discriminatively. Three contributions are presented in this thesis.
In the first part, a key-point based single-target tracking method is proposed. The method aims to solve the partial occlusion problem, which is one of the major problems that object tracking faces in a cluttered environment. An object model and a surrounding background model are constructed simultaneously to store key-points from the target and from the surrounding background respectively. Key-points are located in the same way as SURF key-points, and are evaluated and learned on-line by Random Ferns. The labelled key-points can be used to improve both the tracking performance and the learning of the Random Ferns. Tracking a target under partial occlusion depends on locating the key-points that belong to the target. Long-term tracking is achieved by combining detection and tracking. Experiments were conducted on videos in which targets were under different occlusion conditions.
In the second part, the idea of occlusion reasoning in single-target tracking is extended: a framework that follows the tracking-by-detection approach is proposed to enhance the performance of multi-people tracking. The saliency of an object or area is the quality that makes it stand out from its neighbourhood, and it is an important cue when we observe objects in the real world. To solve occlusion by background objects and mutual occlusion between targets, salient parts inside the target as well as those around it are extracted to assist tracking. Salient parts inside the target are considered representative parts of the target, while salient parts around the target are used as context information. Short-term tracking of the salient parts is applied when a target fails to be associated with a detection. Tracking of the target can be improved by a supporting model that indicates the spatial relationship between the salient parts and the target; the supporting model is updated on-line. To evaluate the performance, experiments were conducted on several public datasets that contain the movement of multiple people and complicated conditions such as mutual occlusion and trajectory intersection.
The third contribution of the thesis is a study of human descriptors to model the targets and to assist the tracking of multiple targets. As biometric information is not available in applications where the camera resolution is low, this research exploits soft-biometric information to construct the human descriptor. A framework is constructed to extract soft-biometric based descriptors in real scenes. A soft-biometric based description is more invariant to changing factors than the direct use of low-level features such as colour and texture, and an ensemble of soft-biometric traits can achieve good performance in people re-identification. The body of each detected person is divided into three parts, and the selected soft-biometric traits are extracted from each part. Specific methods were designed to extract soft-biometric traits from certain body areas. All traits are then combined to form the final descriptor. This framework is further used in experiments on people re-identification. Furthermore, the human descriptors are employed in multi-people tracking. This enables the proposed system to re-identify a target after the tracker has lost it, so that the target's trajectory remains continuous even if the tracker fails for some reason in between.
In summary, one object tracking method and two frameworks are proposed in this research work. The key-point based object tracking method can track the target when it is under partial occlusion, and by combining tracking with detection it can achieve long-term tracking. In the first framework, a salient-part based multi-people tracking method is proposed. By tracking salient parts inside and around the target patch, problems such as partial occlusion and mutual occlusion can be solved, and targets can be tracked more robustly. In the second framework, a human descriptor with a set of soft-biometric features can be extracted from surveillance video. It can be used to assist the construction of the human model and target re-identification in tracking. With the help of human descriptors, the trajectory intersection problem can be solved.