Detecting influences in social network
Date of Issue: 2016-04-18
School of Computer Engineering
We live in a world where social media is deeply rooted in our lives. Whether on computers, mobile phones or tablets, social media applications are used everywhere (Dewing, 2012). Almost all social media applications involve users forming links with other users (‘friends’ as in Facebook, or ‘followers’ as in Instagram). These links form a graph, with the users as nodes and the edges representing their connectivity. Graphs can therefore be used to put this information into perspective, and they can be analyzed and mined for information according to our needs, a practice known as Social Network Analysis (SNA). Information collected from SNA has a wide range of uses, from marketing to tracing the potential pathway of the spread of a disease. This report is concerned with finding the ‘influencers’ in a social network or, in network terms, the most important nodes. The term ‘importance’ is subjective, and so there are many different theories and algorithms, each identifying important nodes in its own way. Three algorithms are considered in this report: PageRank, betweenness, and the influence game. The rankings of node importance produced by each algorithm will be compared, and the reasons for the differences in rankings will be explained. Another problem is that it is often hard to be certain that an algorithm works for every kind of graph. Even if an algorithm finds the important nodes of one graph accurately, there is no guarantee that it can do the same for every other graph. To improve the situation, the concept of the Kronecker graph is proposed. A Kronecker graph is generated by expanding a graph of n nodes by a factor of n, thus increasing the size of the graph. The new graph can then be tested against the algorithm to determine whether its accuracy still holds. This also raises a new problem: how large can a graph grow while the algorithm remains efficient?
If the algorithm cannot handle an extremely large graph, how can the scaling of a Kronecker graph be reduced so that it does not multiply by n each round? Lastly, clusters in graphs also need to be identified, such that nodes inside a cluster are much more similar to each other than to nodes outside the cluster. With these goals and problems in mind, extensive study and research will be carried out using Gephi, a visualization software for all kinds of graphs and networks. While Gephi already has some built-in functions such as PageRank, the other algorithms used to analyze the graph have to be coded as Gephi plugins.
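To illustrate the scaling concern described above, the following is a minimal Python sketch of Kronecker expansion, not the project's Gephi plugin code. It assumes the graph is stored as a 0/1 adjacency matrix and that each round takes the Kronecker product of the current graph with the original one. The function names and the 3-node initiator are hypothetical and chosen only for illustration.

```python
def kronecker_product(a, b):
    """Kronecker product of two square adjacency matrices (lists of lists)."""
    n, m = len(a), len(b)
    # Entry (i, j) of the result combines block (i // m, j // m) of `a`
    # with position (i % m, j % m) of `b`.
    return [
        [a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
        for i in range(n * m)
    ]


def kronecker_power(initiator, rounds):
    """Expand the initiator graph by `rounds` successive Kronecker products."""
    graph = initiator
    for _ in range(rounds):
        graph = kronecker_product(graph, initiator)
    return graph


# Hypothetical 3-node initiator: a path 0 - 1 - 2 with self-loops.
initiator = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

# Node counts after 0, 1 and 2 rounds of expansion.
sizes = [len(kronecker_power(initiator, r)) for r in range(3)]
print(sizes)  # [3, 9, 27]
```

With n = 3, the node count is multiplied by 3 in every round, which is the growth rate that an influence-ranking algorithm would have to keep up with when tested on successive expansions.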
Final Year Project (FYP)
Nanyang Technological University