
PLOS

Improving PacBio Long Read Accuracy by Short Read Alignment

Overview of attention for article published in PLOS ONE, October 2012

Mentioned by

blogs
1 blog
twitter
6 X users
patent
4 patents
wikipedia
2 Wikipedia pages

Citations

dimensions_citation
299 Dimensions

Readers on

mendeley
408 Mendeley
citeulike
4 CiteULike
Title
Improving PacBio Long Read Accuracy by Short Read Alignment
Published in
PLOS ONE, October 2012
DOI 10.1371/journal.pone.0046679
Pubmed ID
Authors

Kin Fai Au, Jason G. Underwood, Lawrence Lee, Hung Wong

Abstract

The recent development of third generation sequencing (TGS) generates much longer reads than second generation sequencing (SGS) and thus provides a chance to solve problems that are difficult to study through SGS alone. However, higher raw read error rates are an intrinsic drawback in most TGS technologies. Here we present a computational method, LSC, to perform error correction of TGS long reads (LR) by SGS short reads (SR). Aiming to reduce the error rate in homopolymer runs in the main TGS platform, the PacBio® RS, LSC applies a homopolymer compression (HC) transformation strategy to increase the sensitivity of SR-LR alignment without sacrificing alignment accuracy. We applied LSC to 100,000 PacBio long reads from human brain cerebellum RNA-seq data and 64 million single-end 75 bp reads from human brain RNA-seq data. The results show LSC can correct PacBio long reads to reduce the error rate by more than 3-fold. The improved accuracy greatly benefits many downstream analyses, such as directional gene isoform detection in RNA-seq studies. Compared with another hybrid correction tool, LSC can achieve over double the sensitivity and similar specificity.
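
The homopolymer compression (HC) transformation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not LSC's implementation: the function names `homopolymer_compress` and `homopolymer_expand` and the example reads are hypothetical, and the sketch only assumes that HC collapses each run of identical bases to a single base while recording the run lengths so the original sequence can be restored.

```python
from itertools import groupby


def homopolymer_compress(seq):
    """Collapse each run of identical bases to one base.

    Returns the compressed sequence and the list of run lengths
    needed to expand it again. Illustrative only, not LSC's code.
    """
    bases = []
    run_lengths = []
    for base, run in groupby(seq):
        bases.append(base)
        run_lengths.append(sum(1 for _ in run))
    return "".join(bases), run_lengths


def homopolymer_expand(compressed, run_lengths):
    """Invert homopolymer_compress using the recorded run lengths."""
    return "".join(base * n for base, n in zip(compressed, run_lengths))


# Hypothetical reads that differ only in homopolymer run lengths.
long_read = "AAACGGTTTA"
short_read = "AACGGGTTA"

lr_compressed, lr_runs = homopolymer_compress(long_read)
sr_compressed, sr_runs = homopolymer_compress(short_read)

# After compression the two reads are identical, so an aligner
# working in compressed space is not penalised for run-length errors.
print(lr_compressed, sr_compressed)
```

In this toy case the two reads disagree at several positions in their raw form but match exactly once compressed, which is the intuition behind the sensitivity gain the abstract reports for SR-LR alignment. The stored run lengths are what would let a corrector map evidence from the short reads back onto the uncompressed long read.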

X Demographics

The data shown below were collected from the profiles of 6 X users who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 408 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States     10      2%
United Kingdom    4       <1%
France            3       <1%
Netherlands       2       <1%
Italy             2       <1%
Sweden            2       <1%
Australia         2       <1%
Mexico            2       <1%
Spain             2       <1%
Other             9       2%
Unknown           370     91%

Demographic breakdown

Readers by professional status     Count   As %
Researcher                         103     25%
Student > Ph. D. Student           98      24%
Student > Master                   44      11%
Student > Bachelor                 30      7%
Professor > Associate Professor    27      7%
Other                              67      16%
Unknown                            39      10%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences            205     50%
Biochemistry, Genetics and Molecular Biology    73      18%
Computer Science                                43      11%
Engineering                                     7       2%
Medicine and Dentistry                          6       1%
Other                                           25      6%
Unknown                                         49      12%