Modeling user factors in multimedia preferences
Guntuku, Sharath Chandra
Date of Issue: 2017-11-23
School of Computer Science and Engineering
Centre for Multimedia and Network Technology
With the increasing proliferation of data-production technologies, such as cameras, and consumption avenues, such as social media, multimedia has become a primary interaction channel among users today. Consequently, automatically modeling users’ perception (e.g., liking an image or finding a video funny) is an important and challenging problem. This challenge goes beyond content- or genre-based multimedia analysis, because the same content can arouse varying (and possibly opposite) perceptions depending on a user’s cultural and psychophysical framework. However, existing multimedia systems largely ignore the subjective nature of such perception, in which individual user factors (such as personality and culture) play a crucial role. While individual differences are extraneous to objective tasks such as identifying the objects or colors present in content, neglecting them is inappropriate for subjective perception tasks such as modeling the experience of affect or the concepts users ‘like’. In this thesis, we carry out a series of studies to answer two questions pertaining to individual user factors: 1) whether, and to what extent, individual factors influence users’ multimedia perception, specifically the perception of affect and enjoyment in videos; and 2) how user-level personality prediction models can be built from social media data. The challenge in considering individual user factors in practical systems is that they are not freely available and must be collected using survey questionnaires. Use of social media has become widespread due to ubiquitous Internet access and multimedia-enabled devices, which makes multimedia a viable source for user modeling. Users’ innate preferences and tastes are conveyed through the media they consume and share, thereby contributing to the fields of personalized recommender systems and content generation systems.
A key factor to consider in personality recognition is the source used for assessing users’ personality, i.e., the behavior we base the assessment upon. Studies in psychology show that to obtain strong personality cues, users should be given the freedom of control and the motivation to express themselves through that behavior. However, in the computational domain, which focuses mostly on curated behavior of the participants involved, these two criteria are often missing. In this thesis, we aim to address this by exploring how non-intrusive data sources, such as posted and liked content on social media, can be leveraged to automatically predict users’ personality profiles. In modeling users’ personality based on posted content, we crawl selfies from Sina Weibo, a popular Chinese microblogging site, propose several mid-level cues (such as the presence of duck faces, pressed lips, etc.), and detect them computationally using low-level visual features (such as LBP and BoVW). These mid-level cues are then used as predictors to model users’ personality measured using the Big Five model, and are shown to outperform low-level visual features. In a second study, with image posts from Twitter, we identify the ways in which, and the extent to which, personality differences are related to online image sharing in general. We extract interpretable semantic image features (color aesthetics and objects) from the posted images using large-scale content analysis. We then give insights into the personality differences among the people posting them, and build predictive models of users’ personality. In modeling users’ personality based on liked content, we use images tagged as ‘Favorite’ on Flickr. Here, there are two challenges: a) how best to represent users’ likes on images, and b) how to build effective models that predict users’ personality from the images they liked.
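To make the low-level features mentioned above concrete, the basic 8-neighbor Local Binary Pattern (LBP) encodes each pixel by thresholding its neighbors against the center pixel and then histogramming the resulting codes into a texture descriptor. The following is a minimal generic sketch in NumPy of that basic variant; it is an illustration only, not the feature-extraction pipeline used in the thesis.

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbor LBP codes for the interior pixels of a grayscale image."""
    g = gray.astype(np.int32)
    center = g[1:-1, 1:-1]
    # Offsets of the 8 neighbors, clockwise from top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy: g.shape[0] - 1 + dy, 1 + dx: g.shape[1] - 1 + dx]
        # Set this bit wherever the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.int32) << bit
    return codes

def lbp_histogram(gray, bins=256):
    """Normalized histogram of LBP codes, usable as a texture feature vector."""
    codes = lbp_image(gray)
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / hist.sum()
```

In practice, libraries such as scikit-image provide optimized LBP variants (uniform patterns, multiple radii); the histogram above would be one block of a larger feature vector computed over image regions.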
To address the first challenge, we represent images using interpretable semantic features - for instance, whether the image is colored or black-and-white, whether it is computer-rendered or a natural photograph, and whether there are people in the image and, if so, what their characteristics are. We further propose a deep bi-modal representation that exploits the visual content along with the tags associated with images, improving the state of the art on modeling user ‘likes’ by 15-20%. To address the second challenge, we use the semantic interpretable features identified previously together with a novel Feature Selection based Ordinal Regression for personality prediction. Extensive experiments show that the proposed method significantly improves personality prediction performance.
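The general idea behind ordinal regression - treating personality trait levels as ordered rather than as unrelated classes - can be sketched with the classic threshold-decomposition reduction (Frank & Hall style), which trains one binary model per "level greater than k" question and combines their probabilities. This is a generic illustrative sketch with a toy gradient-descent logistic learner, not the Feature Selection based Ordinal Regression method proposed in the thesis.

```python
import numpy as np

def train_logistic(X, y, lr=0.05, iters=5000):
    """Tiny batch-gradient-descent logistic regression (bias folded into X)."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

class OrdinalRegression:
    """Threshold decomposition: one binary model per question 'is y > k?'."""

    def fit(self, X, y, n_levels):
        Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
        self.models = [train_logistic(Xb, (y > k).astype(float))
                       for k in range(n_levels - 1)]
        return self

    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        # P(y > k) for each threshold; the expected level is the sum of
        # exceedance probabilities, rounded to the nearest integer level.
        probs = np.array([1.0 / (1.0 + np.exp(-Xb @ w)) for w in self.models])
        return np.rint(probs.sum(axis=0)).astype(int)
```

The payoff of the ordinal formulation is that mispredicting an adjacent level is penalized less than mispredicting a distant one, which matches the ordered nature of trait scores better than plain multi-class classification.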