Sampling-based image and video matting without compositing equation
Date of Issue: 2017
School of Computer Science and Engineering
Centre for Multimedia and Network Technology
Image and video matting play a fundamental role in image and video editing applications. Matting methods are generally classified into α-propagation approaches and color-sampling approaches. α-propagation methods leverage the correlation between neighboring pixels with respect to local image statistics to interpolate the known alpha values into the unknown regions. Color-sampling methods estimate alpha using foreground (F) and background (B) samples from the known regions that represent the true colors of the unknown pixels. Complex color distributions of the foreground and background regions, highly textured edges, and the unavailability of true F and B samples are some of the main challenges faced by current methods. In addition, sampling methods have traditionally followed the compositing equation, using (F, B) pairs for alpha estimation. When matting is extended to videos, the unavailability of user-defined trimaps in each frame and the additional requirement of temporal coherency across the sequence make matte extraction a highly challenging task. We aim to develop novel natural matting algorithms for both images and videos that alleviate the drawbacks of current methods in generating a good-quality matte. We achieve these objectives through the following contributions. First, a sampling-based image matting algorithm is proposed that utilizes sparse coding in the image domain to extract the alpha matte. Multiple F and B samples, as opposed to a single (F, B) pair, are used to describe the color at a blended pixel. A carefully chosen dictionary of feature vectors from the F and B regions, refined through a foreground probability map, ensures that the constrained sparse-code coefficients can be interpreted as an approximation of the alpha value. Experimental evaluations on a public benchmark database show that our method achieves state-of-the-art results.
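For context, the standard compositing equation that sampling methods traditionally follow, together with a sketch of the sparse-coding view described above, can be written as follows (the dictionary D, coefficient vector β, and index set notation here are illustrative, not necessarily the thesis's exact formulation):

```latex
% Compositing equation: the observed color I_z at pixel z is a convex
% combination of an underlying foreground and background color.
I_z = \alpha_z F_z + (1 - \alpha_z) B_z, \qquad \alpha_z \in [0, 1]

% Sparse-coding view (illustrative notation): the blended color is
% reconstructed from a dictionary D whose atoms are multiple F and B
% samples, rather than from a single (F, B) pair,
I_z \approx D\,\beta_z, \qquad \beta_z \ge 0,\ \ \|\beta_z\|_1 = 1,

% so that alpha can be approximated by the total coefficient weight
% assigned to the foreground atoms:
\hat{\alpha}_z = \sum_{i \in \mathcal{F}} \beta_{z,i}.
```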
Second, a new video matting algorithm is proposed that uses a multi-frame graphical model to ensure temporal coherency in the extracted matte. The alpha value at a pixel needs to be consistent and smooth across the video sequence for better temporal coherence. This is accomplished by simultaneously solving for the alpha mattes of multiple consecutive frames. An objective function is proposed that can be solved in closed form as a sparse linear system. An adaptive temporal trimap propagation using motion-assisted shape blending is utilized to propagate the trimaps automatically between the key frames. Experimental evaluations on an exclusive video matting dataset validate the effectiveness of the method.

Third, a new sampling-based video matting algorithm is proposed that reinterprets the matting problem from the perspective of the sparse reconstruction error of F and B samples. Sampling methods generally select the (F, B) pair that produces the least reconstruction error, but the significance of the error itself has been left unexamined. Two patch-based frameworks are used to ensure temporal coherency in the video mattes: a multi-frame non-local means framework using coherency-sensitive hashing, and a patch-based multi-frame graph model using motion. Qualitative and quantitative evaluations demonstrate the performance of the method in reducing temporal jitter and maintaining spatial accuracy in the video mattes.
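The pair-based alpha estimate and its reconstruction error, which the third contribution re-examines, take the standard form used in sampling-based matting (shown here for context in generic notation):

```latex
% Closed-form alpha estimate for a candidate (F, B) pair given the
% observed pixel color I:
\hat{\alpha} = \frac{(I - B) \cdot (F - B)}{\|F - B\|^{2}}

% Reconstruction error of the pair -- conventionally used only to rank
% candidate (F, B) pairs, with the magnitude of the error itself left
% unexamined:
\varepsilon(F, B) = \big\| I - \big(\hat{\alpha} F + (1 - \hat{\alpha}) B\big) \big\|
```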
DRNTU::Engineering::Computer science and engineering