Screen content image evaluation and processing
Date of Issue: 2015
School of Computer Engineering
Centre for Multimedia and Network Technology
A Screen Content Image (SCI) is a typical kind of compound image that contains text, graphics and pictures together. With the rapid development of digital devices and computing techniques, SCIs have increasingly appeared in multi-client communication systems. The related applications bring many challenges for SCI processing, such as acquisition, segmentation, compression, transmission and quality evaluation. SCIs have different characteristics from natural scene images and scanned document images, so existing classical image processing methods cannot process them effectively. Hence, specialized algorithms for SCI processing are much desired. Currently, there is not much research work in the literature on SCI processing. In this research work, we try to understand the basic properties of SCIs and focus on addressing challenging problems in three aspects: segmentation, compression and perceptual quality assessment of SCIs.
SCI segmentation, which aims to distinguish text from other components, is a fundamental step in various SCI processing techniques. In this research work, we first propose a coarse-to-fine framework to segment text with arbitrary scales and orientations from other components in SCIs. A Local Image Activity Measure (LIAM) is designed to enhance the difference between textual and pictorial regions and to eliminate most of the low-frequency pictorial regions. To remove the surviving pictorial regions (those mistaken for text), a new Scale and Orientation Invariant Grouping (SOIG) algorithm is proposed to construct Textual Connected Components (TCCs) with uniform geometrical features. False positive components are finally filtered out by three verification criteria. The proposed text segmentation algorithm can maintain the integrity of text with varied scales and orientations, which benefits the compression and evaluation procedures for SCIs.
It has been demonstrated that traditional coding methods with a single basis function, such as JPEG and JPEG2000, cannot achieve good performance for SCI compression because of the intensive high-frequency variations in textual regions. In this work, a novel SCI compression scheme is proposed that uses different basis functions to encode different components. A tailored text dictionary for textual image representation is learned via a modified dictionary learning method, i.e., K-Singular Value Decomposition (K-SVD). Compared with the Discrete Cosine Transform (DCT) based representation, the textual representation derived from the tailored text dictionary is much sparser, which offers greater potential for encoding SCIs effectively. The proposed coding scheme achieves much higher coding performance than existing standard coding methods, especially for SCIs with a large percentage of textual regions.
To evaluate the visual quality of SCIs processed by compression and other operations, we present a study on perceptual quality assessment of SCIs. A large SCI Quality Assessment Database (SIQAD) is constructed, with visual quality scores obtained through subjective testing. In addition, we investigate the correlations between the subjective scores of different regions, which reveal the impact of textual and pictorial regions on the overall visual quality. A new SCI Perceptual Quality Assessment (SPQA) scheme is also proposed to automatically evaluate the visual quality of distorted SCIs by taking into account the different properties and contributions of textual and pictorial regions. Compared with state-of-the-art Image Quality Assessment (IQA) methods, the proposed SPQA achieves much higher consistency with subjective results.
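The abstract does not say how consistency with subjective results is measured. In IQA work, agreement between an objective metric and subjective scores is commonly reported with the Pearson linear correlation and the Spearman rank-order correlation, so the following is a minimal sketch under that assumption. The function names (`pearson`, `ranks`, `spearman`) are hypothetical and not taken from the thesis.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Given a list of subjective scores and a list of objective predictions for the same distorted images, `pearson` reflects how linearly the predictions track the scores, while `spearman` reflects only whether the two lists order the images the same way.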
DRNTU::Engineering::Computer science and engineering