Recommender system for events with hybrid filtering and ensemble machine learning
Date of Issue: 2016
School of Computer Engineering
This work builds a recommender system for local events that combines hybrid information filtering methods with ensemble machine learning. The project is motivated by the lack of a universal event aggregator that recommends personalized local events to its users. The sources people currently use to find local events do not cater to individual preferences and sometimes contain incomplete details. The objective of the project is to make it easier for a user to discover local events. The event data was scraped from Facebook Events and comprised around 1.5 million events and 38,000 users. This data was stored in a database organized into tables. The data was then preprocessed: missing values were inferred, irrelevant data was removed, the data was subsetted to obtain a smaller training set, and values were normalized. We then performed model-based information filtering to calculate similarity metrics, covering user-user collaborative, event-event collaborative, content-based and demographic filtering. Next, we performed feature engineering by building a set of features from this data and calculating their values from the preprocessed data. We used the feature vector and the response variable to fit three classifiers (logistic regression, decision tree and random forest) and evaluated the cross-validation accuracy of each. Under 11-fold cross-validation, random forests gave the best performance with a mean accuracy of 90.1%, followed by decision trees and logistic regression. We investigated why the random forest classifier was successful in building an improved model from our training data. We also attribute the good performance of the recommender system to its architecture, which calculates the similarity metrics and feature values first and then performs machine learning.
Finally, we suggested some improvements for the recommender system, which include using more sophisticated natural language processing techniques to extract important keywords from event descriptions and using an ensemble of ensemble classifiers to improve accuracy.
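As an illustration of the user-user collaborative filtering step mentioned in the abstract, the following is a minimal sketch and not the project's actual implementation. It assumes that each user is represented by the set of events they attended and that cosine similarity over these binary attendance sets is the similarity metric; the abstract does not state which metric was used. The function names and the toy data are hypothetical.

```python
from math import sqrt


def cosine_similarity(events_a, events_b):
    """Cosine similarity between two users' binary attendance sets.

    Assumption: a user is modelled as the set of event ids they attended.
    """
    if not events_a or not events_b:
        return 0.0
    shared = len(events_a & events_b)
    return shared / sqrt(len(events_a) * len(events_b))


def user_user_score(target, event, attendance):
    """Similarity-weighted share of the other users who attended `event`.

    The result could serve as one feature value for a (user, event) pair.
    """
    weighted, total = 0.0, 0.0
    for other, events in attendance.items():
        if other == target:
            continue
        sim = cosine_similarity(attendance[target], events)
        total += sim
        if event in events:
            weighted += sim
    return weighted / total if total else 0.0


# Hypothetical toy data: user id -> set of attended event ids.
attendance = {
    "u1": {"e1", "e2"},
    "u2": {"e1", "e2", "e3"},
    "u3": {"e4"},
}

# u2 is the only user similar to u1 and attended e3, so e3 scores highly.
score = user_user_score("u1", "e3", attendance)
```

In an architecture like the one described, a score of this kind would be computed for each user-event pair and passed, together with the other similarity-based features, to the classifiers as part of the feature vector.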
Final Year Project (FYP)
Nanyang Technological University