Blind quality assessment of image and speech signals
Date of Issue: 2017-05-09
School of Computer Science and Engineering
Quality assessment of multimedia signals is of great interest to researchers and practitioners in the signal processing community. Because most multimedia services and systems are provided for human consumption, objective quality assessment methods should reproduce human judgement of perceived quality. Among these methods, no-reference (blind) methods, which operate solely on the distorted signal, are the most desirable, because reference signals are unavailable in many practical applications. However, blind quality assessment is very challenging because of the variety of distortion types and the diversity of content. In this thesis, I present a series of works on designing better blind models to automatically estimate the perceptual quality of image and speech signals for modern multimedia systems. The first work deals with quality assessment of multiply-distorted images. We propose a novel structural feature, the gradient-weighted histogram of local binary patterns computed on the gradient map, which effectively describes the complex degradation patterns introduced by multiple distortions. In the second work, we propose a general-purpose method to predict the visual quality of images degraded by various distortion types. By exploring the characteristics of the human visual system (HVS), we extract two new perceptual features that represent the structural information and luminance changes in distorted images. We show that the complementary information provided by the extracted statistical structural and luminance features plays an important role in image quality estimation. The third work extends this method in two respects: 1) we show that linear filter responses can complement the widely used local contrast normalization response; 2) we fuse the luminance and structural information through a weighting scheme.
The second and third works are perceptual-feature-based methods that account for HVS properties in the feature design. In the fourth work, we explore the use of natural scene statistics (NSS) for general-purpose blind image quality assessment. We present a new model for natural images that uses a multivariate Gaussian mixture model to approximate the joint distribution of log-contrast responses. This is the first attempt to use a joint NSS model for blind image quality assessment, and it has several advantages over related works. The last work of this thesis presents a novel non-intrusive speech quality assessment method that applies the bag-of-words model to speech feature extraction, which provides an effective way to produce a global representation from local segments. In all of these works, we compare the proposed methods with state-of-the-art related methods.
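
As an illustration of the structural feature described for multiply-distorted images, the following is a minimal sketch of a gradient-weighted histogram of local binary patterns computed on a gradient map. It is not the thesis implementation. It assumes a central-difference gradient operator, a plain 8-neighbour LBP with 256 codes (the thesis may use a rotation-invariant or uniform variant), and L1 normalization of the histogram. All function names are hypothetical.

```python
import numpy as np

def gradient_magnitude(img):
    # Assumption: central differences; the thesis may use another
    # gradient operator (e.g. Prewitt or Sobel).
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def lbp_codes(arr):
    # Plain 8-neighbour LBP on interior pixels, giving codes in 0..255.
    # Assumption: no rotation invariance or uniform-pattern mapping.
    h, w = arr.shape
    center = arr[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = arr[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes

def gradient_weighted_lbp_histogram(img):
    # LBP is computed on the gradient map, and each pixel votes for its
    # LBP code with a weight equal to its gradient magnitude.
    grad = gradient_magnitude(img)
    codes = lbp_codes(grad)
    weights = grad[1:-1, 1:-1]
    hist = np.bincount(codes.ravel(), weights=weights.ravel(), minlength=256)
    total = hist.sum()
    # Assumption: L1 normalization; a flat image yields an all-zero histogram.
    return hist / total if total > 0 else hist
```

Weighting each vote by gradient magnitude lets pixels in strongly structured regions contribute more to the histogram than pixels in flat regions. In a blind quality model, a feature vector of this kind would typically be passed to a learned regressor that maps it to a quality score; that step is omitted here.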