MOSAIK: A Hash-Based Algorithm for Accurate Next-Generation Sequencing Short-Read Mapping

Overview of attention for article published in PLOS ONE, March 2014

Mentioned by

1 blog
26 X users
6 patents

Readers on

325 Mendeley
3 CiteULike
Title
MOSAIK: A Hash-Based Algorithm for Accurate Next-Generation Sequencing Short-Read Mapping
Published in
PLOS ONE, March 2014
DOI 10.1371/journal.pone.0090581
Pubmed ID
Authors

Wan-Ping Lee, Michael P. Stromberg, Alistair Ward, Chip Stewart, Erik P. Garrison, Gabor T. Marth

Abstract

MOSAIK is a stable, sensitive and open-source program for mapping second- and third-generation sequencing reads to a reference genome. Uniquely among current mapping tools, MOSAIK can align reads generated by all the major sequencing technologies, including Illumina, Applied Biosystems SOLiD, Roche 454, Ion Torrent and Pacific BioSciences SMRT. Indeed, MOSAIK was the only aligner to provide consistent mappings for all the generated data (sequencing technologies, low-coverage and exome) in the 1000 Genomes Project. To provide highly accurate alignments, MOSAIK employs a hash clustering strategy coupled with the Smith-Waterman algorithm. This method is well-suited to capture mismatches as well as short insertions and deletions. To support the growing interest in larger structural variant (SV) discovery, MOSAIK provides explicit support for handling known-sequence SVs, e.g. mobile element insertions (MEIs), as well as generating outputs tailored to aid in SV discovery. All variant discovery benefits from an accurate description of the read placement confidence. To this end, MOSAIK uses a neural-network-based training scheme to provide well-calibrated mapping quality scores, demonstrated by a correlation coefficient between MOSAIK-assigned and actual mapping qualities greater than 0.98. In order to ensure that studies of any genome are supported, a training pipeline is provided to ensure optimal mapping quality scores for the genome under investigation. MOSAIK is multi-threaded, open source, and incorporated into our command and pipeline launcher system GKNO (http://gkno.me).
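
The abstract names two ingredients, hash-based seeding and Smith-Waterman local alignment, without giving implementation details. The sketch below is one generic way such a seed-and-extend scheme can be put together, and it is not MOSAIK's actual code. The function names, the k-mer size `k`, the window padding `pad`, the scoring values, and the linear gap penalty are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Hash every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_diagonals(read, index, k):
    """Cluster seed hits by diagonal (reference offset minus read offset)."""
    hits = defaultdict(int)
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            hits[i - j] += 1
    return hits


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a against b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def map_read(read, reference, k=4, pad=2):
    """Pick the best-supported diagonal, then score a padded window around it.

    Returns (window_start, score), or None when no seed matches.
    """
    index = build_kmer_index(reference, k)
    hits = candidate_diagonals(read, index, k)
    if not hits:
        return None
    diagonal = max(hits, key=hits.get)
    start = max(0, diagonal - pad)
    end = min(len(reference), diagonal + len(read) + pad)
    return start, smith_waterman(read, reference[start:end])


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCAGTAGGCATCGA"
    read = "GACCAGTTGGC"  # one mismatch relative to the reference
    print(map_read(read, reference))
```

Restricting the dynamic-programming step to a small window around the clustered seeds is what keeps this kind of approach tractable: the quadratic-time alignment runs only where the hash lookup suggests a plausible placement, while still tolerating mismatches and short indels inside that window.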

X Demographics

The data shown below were collected from the profiles of 26 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 325 Mendeley readers of this research output.

Geographical breakdown

Country          Count  As %
United States       14    4%
Germany              3   <1%
Brazil               3   <1%
France               2   <1%
Sweden               2   <1%
Australia            1   <1%
Italy                1   <1%
Norway               1   <1%
United Kingdom       1   <1%
Other                7    2%
Unknown            290   89%

Demographic breakdown

Readers by professional status   Count  As %
Student > Ph. D. Student            99   30%
Researcher                          64   20%
Student > Master                    38   12%
Student > Bachelor                  28    9%
Other                               16    5%
Other                               40   12%
Unknown                             40   12%

Readers by discipline                          Count  As %
Agricultural and Biological Sciences             148   46%
Biochemistry, Genetics and Molecular Biology      68   21%
Computer Science                                  26    8%
Medicine and Dentistry                             7    2%
Immunology and Microbiology                        6    2%
Other                                             24    7%
Unknown                                           46   14%