Automatic generation of keywords from images
Ho, Galvin Yuan Hao
Date of Issue: 2016
School of Electrical and Electronic Engineering
A Star I2R
Every year, approximately 700 movies are released in cinemas. As a result, movies make up a large portion of the databases in the entertainment industry. While movie genre is an important tool for the film industry, classifying a huge collection of movies without automation is not an easy task. A method commonly used by the film industry is to classify genre based on the movie synopsis. This approach is considered semi-automatic, as it requires the manual process of collecting synopses from movie viewers. While this method yields a decent performance of 64.34%, 54.42% and 79.77% for Recall, Precision and Accuracy respectively, it depends on manual work and is potentially prone to human error. Hence there is a demand for full automation to speed up the classification process and eliminate human error. This report presents MovieNet, a framework designed for automatic movie genre classification. MovieNet preprocesses the raw movie film into image frames via scene change detection. An image classifier, consisting of image captioning and object recognition, is used to give a caption and labels to every image frame. These sentences are combined to simulate a movie synopsis, which is then converted into a feature vector using bag-of-words. Classifiers such as SVM and ELM are used to classify the features and predict genres. The robustness of MovieNet was tested with 396 movies covering 10 popular movie genres: Action, Animated, Comedy, Crime, Epic, Horror, Romance, Science Fiction, War and Western. Results show that MovieNet yields 57.57%, 51.26% and 76.04% for Recall, Precision and Accuracy respectively. Although it slightly underperforms the synopsis-based approach, it benefits from automation and freedom from human error. Final studies also show that MovieNet is capable of making genre recommendations and identifying existing human errors in the genre labels of current databases.
Possible recommendations such as retraining the image classifier, implementing spatial localization or detection, removing noise, extracting closed captions or subtitles, and exploiting audio features are considered feasible and beneficial to the current MovieNet framework, and could improve its performance significantly.
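
The bag-of-words step described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the captions, the movie names and the helper functions (tokenize, build_vocabulary, bag_of_words) are hypothetical, and the tokenization rule is an assumption. It only shows how per-frame captions might be joined into a pseudo-synopsis and turned into a count vector that a classifier such as an SVM could consume.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token across all documents to a column index."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return a word-count vector for one document.

    Tokens that are not in the vocabulary are ignored.
    """
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(document)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical captions, standing in for the output of an image
# captioning model applied to the frames kept after scene change detection.
captions_per_movie = {
    "movie_a": ["a man riding a horse", "a man holding a gun"],
    "movie_b": ["a couple kissing on a beach"],
}

# Join the per-frame captions into one pseudo-synopsis per movie.
pseudo_synopses = {
    name: " ".join(captions) for name, captions in captions_per_movie.items()
}

vocabulary = build_vocabulary(pseudo_synopses.values())
features = {
    name: bag_of_words(text, vocabulary)
    for name, text in pseudo_synopses.items()
}
```

Each entry of features is a fixed-length vector with one column per vocabulary word, so movies with different numbers of frames still yield inputs of the same size for the genre classifier.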
Final Year Project (FYP)
Nanyang Technological University