PoliDB : a repository for storing multi-dimensional relation information case study with Indian politics
Toh, Jeck Chee
Date of Issue: 2014
School of Computer Engineering
Social network usage has proliferated in recent years. Social networks not only enhance relationships among people by providing a more convenient form of communication (Petre, 2010), but also contain a large amount of information. Patterns and trends can be analysed with a system capable of capturing, processing, storing and analysing data from social networks. Twitter is a social platform commonly used to express brief thoughts, feelings or opinions in words. It is used by people all around the world to express their thoughts, and by celebrities as an advertising tool. A text message broadcast through Twitter is called a tweet, and each tweet can contain a maximum of 140 characters. This size restriction keeps messages scan-friendly, which suits the modern, attention-deficit world (Gil, 2012). Hence, tweet data is used in this project as a major input for the analysis of Indian political trends. As the 2014 Indian general election approaches, discussions about Indian politics are becoming more popular on social platforms. This creates an opportunity to obtain a large amount of related information for analysing Indian politics. The number of active Twitter users has increased rapidly in recent years (Bennett, 2013), and statistics also suggest that Twitter is growing in popularity among younger people (Cheng, Evans, 2009). In addition, social networking sites have become a significant additional arena for politics. According to statistics (Lee, 2012), registered voters aged 18-29 are more likely than other age groups to share their presidential choice on social media. Hence, political trends among the younger generation can be analysed and forecast from tweet data. Historical and demographic data are also obtained to cover what tweet data lacks.
For example, the Indian National Congress (INC) is a more traditional party than the Bharatiya Janata Party (BJP). INC may have less impact than BJP on social networks, but it is still supported by some of the older generation. Therefore, both the tweet data crawled in real time and the historical and demographic data should be captured and stored for a more complete and reliable analysis of Indian politics. The objective of this project can be logically separated into two parts. The first part is to provide a repository for the crawled tweet data and the manually populated historical data by designing and implementing a relational database. The second part aims to provide multidimensional analysis by designing and implementing a data warehouse. With the database created, statistics for some important historical and geographical data can be retrieved. With the data warehouse implemented, the tweet data can be analysed through the various combinations of data obtained by manipulating the dimensions.
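To make the idea of analysing tweet data by manipulating dimensions more concrete, the following is a minimal sketch of a star-schema warehouse using Python's built-in sqlite3 module. It is not the project's actual design: the table names (fact_tweet, dim_party, dim_state, dim_date), the choice of dimensions, the helper functions and the sample rows are all assumptions made purely for illustration, and the sample rows are placeholders rather than real tweet counts.

```python
import sqlite3


def build_warehouse(conn):
    """Create a hypothetical star schema: one fact table, three dimensions."""
    conn.executescript(
        """
        CREATE TABLE dim_party (party_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE dim_state (state_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE dim_date  (date_id  INTEGER PRIMARY KEY,
                                day TEXT NOT NULL, month TEXT NOT NULL);
        CREATE TABLE fact_tweet (
            tweet_id    INTEGER PRIMARY KEY,
            party_id    INTEGER NOT NULL REFERENCES dim_party(party_id),
            state_id    INTEGER NOT NULL REFERENCES dim_state(state_id),
            date_id     INTEGER NOT NULL REFERENCES dim_date(date_id),
            tweet_count INTEGER NOT NULL DEFAULT 1
        );
        """
    )


def load_sample(conn):
    """Insert placeholder rows (illustrative only, not real data)."""
    conn.executemany("INSERT INTO dim_party VALUES (?, ?)",
                     [(1, "INC"), (2, "BJP")])
    conn.executemany("INSERT INTO dim_state VALUES (?, ?)",
                     [(1, "State A"), (2, "State B")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, "2014-03-01", "2014-03"), (2, "2014-04-01", "2014-04")])
    conn.executemany(
        "INSERT INTO fact_tweet (party_id, state_id, date_id) VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 1, 1), (2, 1, 2), (2, 2, 2), (1, 2, 2)],
    )


# Whitelist mapping dimension names to columns, so no caller-supplied
# text is ever interpolated into the SQL string.
DIMENSIONS = {"party": "p.name", "state": "s.name", "month": "d.month"}


def tweets_by(conn, *dimensions):
    """Aggregate tweet counts over any combination of the known dimensions."""
    if not dimensions:
        raise ValueError("at least one dimension is required")
    cols = ", ".join(DIMENSIONS[d] for d in dimensions)  # KeyError if unknown
    sql = f"""
        SELECT {cols}, SUM(f.tweet_count)
        FROM fact_tweet f
        JOIN dim_party p ON p.party_id = f.party_id
        JOIN dim_state s ON s.state_id = f.state_id
        JOIN dim_date  d ON d.date_id  = f.date_id
        GROUP BY {cols}
        ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()
```

Under these assumptions, calling tweets_by(conn, "party") rolls the facts up to one row per party, while tweets_by(conn, "party", "month") drills down to party per month. Changing the argument list is the "manipulating the dimensions" step; the fact table itself is never modified.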
DRNTU::Engineering::Computer science and engineering
Final Year Project (FYP)
Nanyang Technological University