Stack overflow data explorer : dictionary tool for software engineering terms from user-generated content
Date of Issue: 2016-04-25
School of Computer Engineering
In this information age, a large amount of informal text is generated and posted online. Such user-generated content often contains misspelled terms and abbreviations that hinder readers' understanding. This is common on software engineering community websites and technology blogs, where the various forms of the same term can be misleading. Whenever people encounter a software engineering term they do not understand, they must run additional searches to find an explanation. Queries such as which library to use for a certain task demand even more effort: extensive reading, information extraction and comparison. This project aims to shorten the time taken to look up software engineering terms online and to help people find direct answers to their queries. We developed a dictionary tool specific to the software engineering domain that can retrieve the explanation and related terms for a misspelled word or an abbreviation, and that gives direct answers in the form of library recommendations by utilizing relational and analogical knowledge mined from Stack Overflow. The software engineering corpus is built from a large set of unlabeled text, from which the semantics of terms are learned and related terms are extracted; abbreviations and morphological forms of a term are then identified among its semantically related terms. Our solution provides a helpful way to look up the various forms of software engineering domain terms and offers smart recommendations in answer to a query. According to the survey we conducted, our software engineering dictionary tool assists the lookup and understanding of terms effectively, and users find the tool very useful.
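The last step described in the abstract, identifying abbreviations and morphological forms among semantically related terms, can be illustrated with a minimal Python sketch. It is not the project's implementation. It assumes the list of related terms has already been produced by some semantic model, and the string heuristics, the function names, and the 0.3 threshold are all hypothetical choices made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def is_morphological_variant(a: str, b: str, max_ratio: float = 0.3) -> bool:
    """Assumed heuristic: a small edit distance relative to the longer string."""
    a, b = a.lower(), b.lower()
    if a == b:
        return False
    return edit_distance(a, b) / max(len(a), len(b)) <= max_ratio


def is_abbreviation(short: str, full: str) -> bool:
    """Assumed heuristic: same first letter, strictly shorter, and the
    characters of `short` appear in order inside `full`."""
    short, full = short.lower(), full.lower()
    if not short or len(short) >= len(full) or short[0] != full[0]:
        return False
    remaining = iter(full)
    return all(ch in remaining for ch in short)


def classify_related(term: str, related: list[str]) -> dict[str, list[str]]:
    """Split semantically related candidates into variants, abbreviations, other."""
    result = {"abbreviations": [], "variants": [], "other": []}
    for cand in related:
        # Check variants first so near-identical misspellings are not
        # mistaken for abbreviations.
        if is_morphological_variant(cand, term):
            result["variants"].append(cand)
        elif is_abbreviation(cand, term) or is_abbreviation(term, cand):
            result["abbreviations"].append(cand)
        else:
            result["other"].append(cand)
    return result


if __name__ == "__main__":
    # Hypothetical related-term list for the term "javascript".
    print(classify_related("javascript", ["js", "javascrip", "java script", "jquery"]))
```

Restricting the string checks to terms that are already semantically related is what keeps such simple heuristics usable: a short string that merely shares letters with a term is never considered unless it also appears in similar contexts.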
Final Year Project (FYP)
Nanyang Technological University