PLOS

Evaluation of Different Reference Based Annotation Strategies Using RNA-Seq – A Case Study in Drososphila pseudoobscura

Overview of attention for article published in PLOS ONE, October 2012

Mentioned by

12 X users

Readers on

91 Mendeley
2 CiteULike

Title
Evaluation of Different Reference Based Annotation Strategies Using RNA-Seq – A Case Study in Drososphila pseudoobscura
Published in
PLOS ONE, October 2012
DOI
10.1371/journal.pone.0046415
Pubmed ID
Authors

Nicola Palmieri, Viola Nolte, Anton Suvorov, Carolin Kosiol, Christian Schlötterer

Abstract

RNA-Seq is a powerful tool for the annotation of genomes, in particular for the identification of isoforms and UTRs. Nevertheless, several software tools exist and no standard strategy to obtain a reliable annotation is yet established. We tested different combinations of the most commonly used reference-based alignment tools (TopHat, GSNAP) in combination with two frequently used reference-based assemblers (Cufflinks, Scripture) and evaluated the potential of RNA-Seq to improve the annotation of Drosophila pseudoobscura. While GSNAP maps a higher proportion of reads, TopHat resulted in a more accurate annotation when used in combination with Cufflinks. Scripture had the lowest sensitivity. Interestingly, after subsampling to the same coverage for GSNAP and TopHat, we find that both mappers have similar performance, implying that the advantage of TopHat is mainly an artifact of the lower coverage. Overall, we observed a low concordance among the different approaches tested both at junction and isoform levels. Using data from both sexes of two adult strains of D. pseudoobscura we detected alternative splicing for about 30% of the FlyBase multiple-exon genes. Moreover, we extended the boundaries for 6523 genes (about 40%). We annotated 669 new genes, 45% of them with splicing evidence. Most of the new genes are located on unassembled contigs, reflecting their incomplete annotation. Finally, we identified 99 additional new genes that are not represented in the current genome contigs of D. pseudoobscura, probably due to location in genomic regions that are difficult to assemble (e.g. heterochromatic regions).

X Demographics

The data shown below were collected from the profiles of 12 X users who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 91 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 5 5%
United Kingdom 2 2%
Netherlands 1 1%
France 1 1%
Czechia 1 1%
Germany 1 1%
China 1 1%
Brazil 1 1%
Unknown 78 86%

Demographic breakdown

Readers by professional status Count As %
Researcher 33 36%
Student > Ph. D. Student 19 21%
Student > Master 8 9%
Student > Postgraduate 4 4%
Student > Bachelor 3 3%
Other 14 15%
Unknown 10 11%

Readers by discipline Count As %
Agricultural and Biological Sciences 58 64%
Biochemistry, Genetics and Molecular Biology 18 20%
Computer Science 2 2%
Medicine and Dentistry 2 2%
Neuroscience 1 1%
Other 0 0%
Unknown 10 11%