Exploiting text mining for Java package mappings
Ong, Kent Long Xiong
Date of Issue: 2017-04-10
School of Computer Science and Engineering
Developers often need to use methods that provide the same functionality from more than one program library, either to obtain the latest optimized implementation or to find a desired feature. For example, a developer may be using the array feature of the program library “org.json” and therefore rely on methods from the package “org.json.JSONArray” to perform array operations. If “org.json” is no longer under active development, he/she may wish to search for methods in another analogical program library (e.g. gson) that perform operations on arrays, such as methods from the package “com.google.gson.JsonArray”. As a result, a mapping between these packages is required. Such mappings are called package mappings. Due to the large number of package mappings, defining them manually is tedious and error-prone. To relieve developers of this tiresome process, an automatic technique to create a database of likely package mappings is desired. Therefore, this report proposes the use of Term Frequency-Inverse Document Frequency (TF-IDF) to perform package mappings between analogical Java program libraries. Our approach applies TF-IDF to package names and their descriptions from the Java documentation to measure similarity and define the package mappings between analogical program libraries. We used Application Programming Interface (API) mappings between four pairs of analogical program libraries as ground truth to evaluate our approach. Our results indicate that the approach ranked the correct analogical API within the top-10 recommended results over 50% of the time. Building on this result, we also present a web application (http://similarpackage.appspot.com/) which can recommend analogical packages for 71,775 packages from 117 pairs of analogical Java program libraries with diverse functionalities.
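To make the idea concrete, the following is a minimal sketch of ranking candidate packages by TF-IDF cosine similarity over package names and descriptions. It is not the report's implementation: the function names (tokenize, tfidf_vectors, cosine, recommend), the smoothed IDF formula, the camelCase splitting, and the short package descriptions are all assumptions made for illustration. The descriptions in particular are invented placeholders, not text taken from the actual Java documentation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Split lower-to-upper camelCase boundaries (JsonArray -> Json Array),
    # then keep lowercase alphabetic tokens only.
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def tfidf_vectors(docs):
    # docs maps a package name to its description; the name itself is
    # included in the text so that name tokens also contribute.
    tokenized = {name: tokenize(name + " " + desc) for name, desc in docs.items()}
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        # Assumed weighting: relative term frequency times a smoothed IDF.
        vectors[name] = {
            t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
    return vectors


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(source, source_docs, target_docs, k=10):
    # Rank every target package by similarity to the source package and
    # return the top-k (package, score) pairs, best first.
    vectors = tfidf_vectors({**source_docs, **target_docs})
    ranked = sorted(
        ((cosine(vectors[source], vectors[t]), t) for t in target_docs),
        reverse=True,
    )
    return [(t, s) for s, t in ranked[:k]]


# Hypothetical descriptions, written for this example only.
source_docs = {
    "org.json.JSONArray": "An ordered sequence of values in a JSON array",
}
target_docs = {
    "com.google.gson.JsonArray": "A class representing an array type in JSON",
    "com.google.gson.stream.JsonWriter": "Writes a JSON encoded value to a stream",
    "com.google.gson.GsonBuilder": "Builder to construct a Gson instance with custom configuration",
}

ranked = recommend("org.json.JSONArray", source_docs, target_docs)
```

With these placeholder descriptions, the first entry of ranked is “com.google.gson.JsonArray”, since it shares the most distinctive terms with the source package. A top-10 evaluation like the one described above would check whether the ground-truth mapping appears anywhere in the returned list.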
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Final Year Project (FYP)
Nanyang Technological University