Machine learning for mathematical question difficulty classification
Pang, Jarald Qi Kai
Date of Issue: 2019
School of Computer Science and Engineering
This project is an experimental study of how machine learning models can be used to classify GCE ‘A’ Level mathematical questions. Two levels of classification are carried out: first, questions are classified by topic; second, they are classified by difficulty level. The report explains each step of the experiment in detail. The evaluation metrics used are F1 score, precision, recall and accuracy. For data pre-processing, three text vectorization methods were tested: count vectors, word-level TF-IDF and N-gram-level TF-IDF. Four machine learning methods, Support Vector Machines, Naïve Bayes, Random Forest and Extreme Gradient Boosting, were then used to classify the questions by topic, and the models’ performance on each topic was analysed. The same four methods were then used to classify the difficulty of each question from the vectorized question and its predicted topic. A final analysis was then done on the models’ performance in difficulty classification.
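The four evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. The difficulty labels ("easy", "medium", "hard") and the predictions below are hypothetical values chosen only for illustration; they are not data or results from the project, and the function names are assumptions rather than the author's code.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a label is never predicted or never true.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, for illustration only (not project data).
y_true = ["easy", "easy", "medium", "hard", "hard", "medium"]
y_pred = ["easy", "medium", "medium", "hard", "easy", "medium"]

scores = per_class_scores(y_true, y_pred)
acc = accuracy(y_true, y_pred)

for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={acc:.2f}")
```

Precision, recall and F1 are computed per class here, which suits a per-topic or per-difficulty analysis; how the project averaged them across classes is not stated in the abstract.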
DRNTU::Engineering::Computer science and engineering
Final Year Project (FYP)
Nanyang Technological University