Development of a word segmentation algorithm for Myanmar language
U Tun Thura Thet
Date of Issue: 2006
Wee Kim Wee School of Communication and Information
This study develops a word segmentation algorithm and solution for the Myanmar language. It is the first work of its kind on word segmentation for the Myanmar language to use the Unicode Standard version 5.1. The Unicode encoding of the Myanmar character set was not very stable in earlier versions, and version 5.1 includes significant changes that address the major issues found in those versions. The literature review covers studies of not only the Myanmar script but also similar scripts such as Thai, Khmer and Lao. Word segmentation approaches for the Thai, Vietnamese and Chinese languages that are relevant to this study are also reviewed to understand how other solutions were developed and evaluated.
DRNTU::Library and information science::Libraries::Technologies
Nanyang Technological University