Extreme learning machine for classification and regression
Date of Issue: 2014
School of Electrical and Electronic Engineering
Machine learning techniques have been studied extensively over the past few decades. One of the most common approaches is the Artificial Neural Network (ANN), commonly referred to as the Neural Network (NN), which is inspired by how the human brain functions. Many learning algorithms and paradigms have been developed for neural networks since the 1940s. However, most traditional neural network learning algorithms suffer from problems such as local minima, slow learning rates and tedious human intervention. The Extreme Learning Machine (ELM), proposed by Huang et al. in 2004, is an emergent technology with great potential to overcome the problems faced by traditional neural network learning algorithms. ELM is based on the structure of "generalized" single hidden-layer feedforward neural networks, in which the hidden node parameters are randomly generated. From the standpoint of standard optimization methods, the ELM problem can be formulated as an optimization problem similar to the formulation of the Support Vector Machine (SVM). However, SVM tends to obtain a solution that is suboptimal compared to ELM's. Given this relationship between ELM and SVM, ELM can be extended to many of SVM's variants. In the work presented in chapter 3, the equality constrained approach from both the Least Squares SVM and the Proximal SVM was adopted in the optimization method based ELM. By implementing equality constraints in its optimization equations, ELM can provide a unified solution to different practical applications (e.g. regression, binary and multiclass classification). ELM can also provide different solutions depending on the size of the application, thereby reducing training complexity. The kernel trick can also be used in ELM's solution. As supported by theory and simulation results, ELM tends to achieve better generalization performance than SVM and its variants when the same kernel functions are used.
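The basic ELM training procedure described above (randomly generated hidden node parameters, followed by a regularized least-squares solution for the output weights) can be sketched as follows. This is a minimal NumPy illustration, not the thesis code: the function names, the sigmoid activation, and the parameters L (number of hidden nodes) and C (regularization constant) are illustrative assumptions.

```python
import numpy as np

def elm_train(X, T, L=100, C=1.0, seed=None):
    """Equality-constrained ELM (illustrative sketch).

    X: (N, d) training inputs; T: (N, m) targets (one-hot for classification).
    Hidden node parameters W, b are random and never tuned; only the
    output weights beta are solved for in closed form.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(d, L))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=L)       # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden layer output matrix
    # beta = (I/C + H^T H)^{-1} H^T T  -- regularized least squares
    beta = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta  # for classification, take argmax over columns
```

For multiclass classification the predicted class is the index of the largest output, which is what makes a single formulation cover regression, binary and multiclass problems.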
The equality constrained optimization method based ELM has shown promising results on benchmark datasets. It is also important to test its performance in real-world applications. In chapter 5, the kernel based ELM was applied to credit risk evaluation on two credit datasets. Simulation results showed that the kernel based ELM is more suitable for credit risk evaluation than the popular Support Vector Machine when the overall, good and bad accuracies are all taken into consideration. Compared with other machine learning techniques, ELM has greater potential for solving large-scale dataset problems due to its simple network structure. However, when solving very large data problems, ELM requires a large number of hidden nodes to map the data into a higher dimensional space where it can be separated well. This large number of hidden nodes results in a large hidden layer matrix, which usually requires very large memory during computation. In chapter 6, a stacked ELMs (S-ELMs) method was proposed to solve this memory issue. Instead of using one single ELM with a large hidden layer matrix, the network is broken into multiple small ELMs connected serially. The S-ELMs method not only reduces the computational memory requirement but also saves training time. The generalization performance of S-ELMs can be further improved by implementing an unsupervised pretraining approach (usually an autoencoder) in each layer. The work presented in chapter 7 shows that adding an ELM based autoencoder to each layer of the S-ELMs network can significantly improve testing accuracy.
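The kernel based ELM mentioned above replaces the explicit random hidden layer with a kernel matrix, so no hidden node count needs to be chosen. A minimal sketch, assuming an RBF kernel and the standard closed form f(x) = K(x, X)(I/C + Omega)^(-1) T with Omega_ij = K(x_i, x_j); the function names and the parameters gamma and C are illustrative, not taken from the thesis.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def kernel_elm_train(X, T, C=1.0, gamma=0.1):
    """Kernel ELM (illustrative sketch): solve alpha = (I/C + Omega)^{-1} T."""
    N = X.shape[0]
    Omega = rbf_kernel(X, X, gamma)               # N x N kernel matrix
    alpha = np.linalg.solve(np.eye(N) / C + Omega, T)
    return alpha

def kernel_elm_predict(Xnew, X, alpha, gamma=0.1):
    # f(Xnew) = K(Xnew, X) @ alpha
    return rbf_kernel(Xnew, X, gamma) @ alpha
```

Note that the N x N kernel matrix is exactly the memory bottleneck the abstract points out for very large datasets, which motivates the stacked S-ELMs alternative.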
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence