Title | Bayesian Genome Assembly and Assessment by Markov Chain Monte Carlo Sampling |
---|---|
Published in | PLOS ONE, June 2014 |
DOI | 10.1371/journal.pone.0099497 |
Pubmed ID | |
Authors | Mark Howison, Felipe Zapata, Erika J. Edwards, Casey W. Dunn |
Abstract | Most genome assemblers construct point estimates, choosing only a single genome sequence from among many alternative hypotheses that are supported by the data. We present a Markov chain Monte Carlo approach to sequence assembly that instead generates distributions of assembly hypotheses with posterior probabilities, providing an explicit statistical framework for evaluating alternative hypotheses and assessing assembly uncertainty. We implement this approach in a prototype assembler, called Genome Assembly by Bayesian Inference (GABI), and illustrate its application to the bacteriophage ΦX174. Our sampling strategy achieves both good mixing and convergence on Illumina test data for ΦX174, demonstrating the feasibility of our approach. We summarize the posterior distribution of assembly hypotheses generated by GABI as a majority-rule consensus assembly. Then we compare the posterior distribution to external assemblies of the same test data, and annotate those assemblies by assigning posterior probabilities to features that are in common with GABI's assembly graph. GABI is freely available under a GPL license from https://bitbucket.org/mhowison/gabi. |
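
The abstract's idea of summarizing a posterior sample as a majority-rule consensus can be illustrated with a small sketch. This is not GABI's implementation: the function name `majority_rule_consensus`, the representation of each sampled assembly as a set of edge labels, and the toy data are all assumptions made for illustration. It estimates each feature's posterior probability as the fraction of samples containing it, then keeps the features present in more than half of the samples.

```python
from collections import Counter


def majority_rule_consensus(samples, threshold=0.5):
    """Return {feature: estimated posterior probability} for majority features.

    `samples` is a list of sets; each set holds the features (hypothetical
    edge labels here) present in one sampled assembly hypothesis.
    """
    if not samples:
        return {}
    # Count how many sampled assemblies contain each feature.
    counts = Counter(feature for sample in samples for feature in sample)
    n = len(samples)
    # Keep only features whose sample frequency strictly exceeds the threshold.
    return {f: c / n for f, c in counts.items() if c / n > threshold}


# Toy posterior sample of four assembly hypotheses, invented for illustration.
samples = [
    {"A->B", "B->C", "C->D"},
    {"A->B", "B->C", "C->E"},
    {"A->B", "B->D"},
    {"A->B", "B->C", "C->D"},
]
consensus = majority_rule_consensus(samples)
```

The same per-feature frequencies could, under these assumptions, be used to annotate an external assembly: look up each of its features in the counts and report the fraction of samples that share it.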
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 13 | 41% |
United Kingdom | 4 | 13% |
France | 2 | 6% |
Colombia | 2 | 6% |
Germany | 1 | 3% |
Canada | 1 | 3% |
China | 1 | 3% |
Norway | 1 | 3% |
India | 1 | 3% |
Other | 0 | 0% |
Unknown | 6 | 19% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 18 | 56% |
Scientists | 12 | 38% |
Unknown | 2 | 6% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 4 | 6% |
Germany | 3 | 4% |
Switzerland | 1 | 1% |
Norway | 1 | 1% |
France | 1 | 1% |
Sweden | 1 | 1% |
Brazil | 1 | 1% |
Unknown | 60 | 83% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 21 | 29% |
Student > Ph. D. Student | 15 | 21% |
Professor > Associate Professor | 7 | 10% |
Student > Bachelor | 5 | 7% |
Student > Master | 5 | 7% |
Other | 12 | 17% |
Unknown | 7 | 10% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 38 | 53% |
Computer Science | 10 | 14% |
Biochemistry, Genetics and Molecular Biology | 8 | 11% |
Mathematics | 2 | 3% |
Environmental Science | 1 | 1% |
Other | 5 | 7% |
Unknown | 8 | 11% |