dc.contributor.author: Ho, Guanlin
dc.date.accessioned: 2014-04-22T01:45:45Z
dc.date.available: 2014-04-22T01:45:45Z
dc.date.copyright: 2014 [en_US]
dc.date.issued: 2014
dc.identifier.uri: http://hdl.handle.net/10356/59048
dc.description.abstract: Metagenomics and its associated processes have had a great impact on advances in biology. Next-generation sequencing (NGS) technologies deliver high-throughput sequencing efficiently, but at the cost of sequence quality. Error correction tools were introduced to improve that quality. However, many such tools are available, and they use different kinds of algorithms. In addition, reads produced by different NGS technologies exhibit different error characteristics. Three error correction tools were selected for benchmarking in this project: Coral, CD-HIT and USEARCH. The tools were used to correct simulated 454 pyrosequencing and Illumina’s Solexa reads. The MapQ score was used to compare the quality of the corrected reads against the original genome. Coral, which uses exact k-mer clustering and corrects reads using multiple alignments, produced the most accurate reads when correcting 454 pyrosequencing reads, while USEARCH, which trims the 3’ end and discards reads with high expected errors, performed best when correcting Illumina’s reads. Based on these results, the project goes on to integrate USEARCH’s fast clustering method with Coral’s detailed individual read error correction method. The integrated method is called Fast Clustering Detailed Correction (FCDC). FCDC reduced the percentage of low-quality reads compared with the reads corrected by USEARCH and by Coral. Future development of this project includes improving the error correction method for Illumina’s reads. The benchmarking process can also be extended to other NGS technologies such as Ion Torrent and SOLiD. [en_US]
dc.format.extent: 60 p. [en_US]
dc.language.iso: en [en_US]
dc.rights: Nanyang Technological University
dc.subject: DRNTU::Engineering::Computer science and engineering::Theory of computation::Analysis of algorithms and problem complexity [en_US]
dc.title: Evaluation and improvement of error correction tools for erroneous metagenomic reads [en_US]
dc.type: Final Year Project (FYP) [en_US]
dc.contributor.supervisor: Kwoh Chee Keong (SCE) [en_US]
dc.contributor.school: School of Computer Engineering [en_US]
dc.description.degree: COMPUTER SCIENCE [en_US]

