LdClusterView : a visualization for genomics data
Date of Issue: 2017-04-18
School of Computer Science and Engineering
Agency for Science, Technology and Research (A*STAR)
Scalable processing of large, heterogeneous, and possibly incomplete or conflicting data makes the analysis of haplotype data a challenging task. Moreover, with genome sequences nearing completion and attention returning to research analysis, effective display of genomic sequences has become essential: billions of genomic DNA letters displayed on the screen as plain text are cumbersome and difficult to understand. It is therefore of paramount importance to be able to collect and digest the large amount of data about biological systems that is accumulating in the literature. Visualization has successfully aided in gaining a better understanding of such data. Researchers also wish to view all facets of genotype and haplotype data, including the spatial distribution of loci along a chromosome, the different frequencies of haplotypes in different subgroups, and possibly the correlation of occurring haplotypes. This emphasizes the need for a dynamic visualization that can address such large and complex data sets at many different levels. As a solution, the Singapore Immunology Network (SIgN) aims to provide a customizable and highly interactive display of requested portions of genomes. Apart from kick-starting the project, SIgN aims to release it into the public domain so that collaborators from all over the world can contribute to and expand it. As a foundation, three kinds of plots have been developed to better analyse genomic sequences: the Manhattan Plot, the Genes Plot, and the Leaf Nodes Plot.
Final Year Project (FYP)
Nanyang Technological University