Fine-grained sentiment classification of social media
Le Thi, Nhu Y
Date of Issue: 2017-05-12
School of Electrical and Electronic Engineering
Institute of High Performance Computing, A*STAR
This research aims to improve sentiment analysis for classifying data collected from Twitter into two classes, positive and negative. The project begins by implementing a valence-based and rule-based method to improve on the existing simple polarity-based method. The implementation includes a tuning procedure that determines the threshold value giving the best classification results. The results are then compared and discussed, leading to the conclusion that the valence-based method performs better than the polarity-based method on various datasets. In addition, a Random Forest classifier using word frequency as its feature is implemented and evaluated against four other machine learning classifiers: Support Vector Machine, Naïve Bayes, Maximum Entropy and Extreme Learning Machine. The tuning of Random Forest hyperparameters on different datasets is also explained, together with a discussion of how these parameters affect performance and of the classifier's prospective applications. The results show that, with proper tuning of its hyperparameters, including the number of decision trees and the maximum number of random features, Random Forest gives the highest accuracy among the five classifiers discussed in this report on the larger datasets.
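The threshold tuning mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the lexicon, the scoring rule, or the search procedure used in the report, so everything below is an assumption made for illustration: the toy lexicon `TOY_LEXICON`, the function names `valence_score`, `classify` and `tune_threshold`, and the plain grid search over candidate thresholds are all hypothetical.

```python
# Hypothetical toy valence lexicon (word -> signed intensity).
# A real system would load a published lexicon; these values are invented.
TOY_LEXICON = {
    "good": 1.9, "great": 3.1, "love": 3.2, "okay": 0.9,
    "bad": -2.5, "awful": -3.0, "hate": -2.7,
}


def valence_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every known token; unknown tokens contribute 0."""
    return sum(lexicon.get(token, 0.0) for token in text.lower().split())


def classify(text, threshold, lexicon=TOY_LEXICON):
    """Label a text positive when its valence score reaches the threshold."""
    return "positive" if valence_score(text, lexicon) >= threshold else "negative"


def tune_threshold(samples, candidates, lexicon=TOY_LEXICON):
    """Grid search: return the (threshold, accuracy) pair with the best
    accuracy on labelled samples. Ties go to the earliest candidate."""
    best_threshold, best_accuracy = None, -1.0
    for threshold in candidates:
        correct = sum(
            classify(text, threshold, lexicon) == label for text, label in samples
        )
        accuracy = correct / len(samples)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


# Invented labelled tweets, used only to exercise the functions.
samples = [
    ("love this great phone", "positive"),
    ("okay I guess", "negative"),
    ("awful service hate it", "negative"),
    ("good day", "positive"),
]
best_threshold, best_accuracy = tune_threshold(samples, [-1.0, 0.0, 1.0, 2.0])
```

In this toy setting a threshold of 1.0 separates the weakly positive "okay I guess" from the clearly positive tweets, which a threshold of 0 would not. In practice the threshold would be chosen on held-out data rather than on the samples it is later evaluated on.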
Final Year Project (FYP)
Nanyang Technological University