IDBA-MT: de novo assembler for metatranscriptomic data generated from next-generation sequencing technology

Henry C M Leung, Siu-Ming Yiu, John Parkinson, Francis Y L Chin
Journal of Computational Biology 2013, 20 (7): 540-50
High-throughput next-generation sequencing technology provides a great opportunity for analyzing metatranscriptomic data. However, the reads produced by these technologies are short, and an assembly step is required to combine the short reads into longer contigs. Because there are many repeat patterns in mRNAs from different genomes and the abundance ratio of mRNAs in a sample varies widely, existing assemblers for genomic, transcriptomic, and metagenomic data do not work on metatranscriptomic data and produce chimeric contigs, that is, incorrect contigs formed by merging multiple mRNA sequences. To the best of our knowledge, there is no assembler designed for metatranscriptomic data. In this article, we introduce an assembler called IDBA-MT, which is designed for assembling reads from metatranscriptomic data. IDBA-MT produces far fewer chimeric contigs (a reduction of 50% or more) than existing assemblers such as Oases, IDBA-UD, and Trinity.
