Word representation learning
Date of Issue: 2016-05-30
School of Electrical and Electronic Engineering
This dissertation studies word representation learning, which aims to learn numerical vector representations for words in natural language. The learned word vectors can serve as a dictionary for computers and can be applied in many natural language processing tasks. The dissertation follows two major research directions: addressing problems that arise when word vector representations are applied, and enhancing existing word vector representations through postprocessing. The work is organized into four chapters according to the problems addressed: the effect of imbalanced word frequency, the bias of context definition, multi-prototype word representation learning, and sentence vector representation learning (compositional distributional semantics). First, an inconsistency between existing word vector representations and WordNet is identified through empirical analysis. Many potential contributing factors are explored to locate the root cause, and the inconsistency is found to be a side effect of existing word vector representation algorithms combined with imbalanced word frequency. To alleviate the problem, two measures based on ordinal information and piecewise linear mapping are proposed. The experimental results empirically demonstrate the effectiveness of the proposed measures. The first study reveals that the ranking of cosine values is more robust than the cosine values themselves. This motivates the author to improve existing word vector representations by adjusting the ranking of cosine similarity values. With the help of ranking learning, a supervised fine-tuning framework is proposed to alleviate the bias caused by context definition. As a postprocessing framework, the proposed fine-tuning framework is compatible with all word representation learning models that represent words as vectors.
Various empirical experiments show that the proposed framework can significantly improve the performance of existing word vector representations. After addressing the bias of context definition, the supervised fine-tuning framework is further enhanced to learn multi-prototype word vector representations. A mini-context word sense disambiguation method is proposed and integrated into the framework. Armed with new initialization and learning algorithms, the framework can transform a single-prototype word vector representation into a multi-prototype word vector representation. The experimental results show that the learned multi-prototype word vector representation can encode different senses of polysemous words and outperforms the original single-prototype word vector representation. Finally, a vectorial sentence model is proposed to extend the existing word-level vector representation to a sentence-level vector representation. The proposed vectorial sentence model is based on phrase-level semantic composition models and a recursive neural network. It has a dynamic model structure that is consistent with the dependency tree of the given sentence. The model can benefit from both data-driven learning algorithms and grammar rules defined by linguistic experts. Both phrase-level and sentence-level evaluation experiments show that the proposed models are effective.
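The abstract's observation that the ranking of cosine values is more robust than the values themselves can be illustrated with a minimal sketch. This is not the dissertation's measure: the toy vectors, the function names, and the `distort` mapping are all assumptions invented for illustration. The sketch only shows that a monotone distortion of cosine scores changes the values but leaves the ordinal ranking intact.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks_from_scores(scores):
    """Map each word to its rank; rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {word: i + 1 for i, word in enumerate(ordered)}


def distort(s):
    """Hypothetical monotone increasing piecewise linear map (illustration only)."""
    return 0.5 * s if s < 0.5 else 0.25 + 1.5 * (s - 0.5)


# Toy 2-d word vectors, invented for this example.
toy_vectors = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "car": [0.1, 1.0],
    "bus": [0.2, 0.9],
}

query = "cat"
sims = {
    word: cosine(toy_vectors[query], vec)
    for word, vec in toy_vectors.items()
    if word != query
}
distorted_sims = {word: distort(s) for word, s in sims.items()}

ranks = ranks_from_scores(sims)
distorted_ranks = ranks_from_scores(distorted_sims)

print(ranks)
print(distorted_ranks)
```

The two printed dictionaries are identical even though the underlying scores differ, which is the sense in which an ordinal measure is insensitive to shifts in raw cosine values.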
DRNTU::Engineering::Electrical and electronic engineering