Machine translation of software-specific documentations
Tee, Li Yin
Date of Issue: 2017-04-13
School of Computer Science and Engineering
Machine translation is the automated translation of text from one language to another by computer software. It uses bilingual data to build a language model and a translation model, which are then used to translate the text. It is one of the important steps in software localisation. Many studies have been carried out on locating need-to-translate strings in software and on adapting the UI layout after the text has been translated into the new language. However, no work has been done on one of the most important and time-consuming steps, which is the translation of the software text itself. Software text has some unique characteristics, for example application-specific naming, context-sensitive translation, and domain-specific rare words, that general machine translation tools such as Google Translate cannot translate properly. Therefore, in this project, we study a statistical machine translation system with a phrase-based model, trained for software text translation. We collect human-translated bilingual sentence pairs from Python-related documentation on the internet. After preprocessing the data, we apply an open-source software toolkit for statistical machine translation. Lastly, we evaluate our test sets with BLEU (bilingual evaluation understudy) and WER (word error rate) to measure translation quality and to find out how well this model performs, what the problems are, and what can be improved.
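As an illustration of the WER metric mentioned above, the following is a minimal sketch, not the project's actual evaluation code. It assumes whitespace tokenisation and the common definition of WER as the word-level edit distance (substitutions, insertions, deletions) divided by the number of words in the reference. The function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of three reference words.
score = word_error_rate("open the file", "open a file")
```

A lower WER indicates a hypothesis closer to the reference. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.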
Final Year Project (FYP)
Nanyang Technological University