PLOS

Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

Overview of attention for article published in PLOS ONE, December 2009

Mentioned by
2 blogs
3 Wikipedia pages

Readers on
290 Mendeley
7 CiteULike

Title: Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics
Published in: PLOS ONE, December 2009
DOI: 10.1371/journal.pone.0007891
Pubmed ID
Authors: Bryan Kolaczkowski, Joseph W. Thornton

Abstract

Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
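
The abstract's central contrast is between estimating branch lengths from the data (ML) and integrating over a distribution of possible values (BI). The toy sketch below illustrates only that general distinction. It is not the authors' analysis and does not model sequence evolution or reproduce long-branch attraction. The two likelihood curves, the uniform prior on [0, 1], and all names (`bump`, `lik_a`, `lik_b`, `maximized`, `integrated`) are assumptions invented for illustration.

```python
import math


def bump(t, height, center, width):
    """Hypothetical likelihood curve over one nuisance parameter t.

    t stands in for a branch length; the Gaussian shape is an assumption,
    not the output of any substitution model.
    """
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)


# Two made-up hypotheses, standing in for two competing topologies.
def lik_a(t):
    # Tall, narrow peak: fits very well, but only for a sliver of t values.
    return bump(t, height=1.0, center=0.2, width=0.01)


def lik_b(t):
    # Lower, broad peak: fits moderately well across many t values.
    return bump(t, height=0.5, center=0.5, width=0.1)


# Evenly spaced grid on [0, 1] used for both the search and the integral.
GRID = [i / 10000 for i in range(10001)]


def maximized(lik):
    """ML-style score: likelihood at the best single value of t."""
    return max(lik(t) for t in GRID)


def integrated(lik):
    """BI-style score: likelihood averaged over a uniform prior on [0, 1].

    The prior density is 1 on the interval, so the marginal likelihood is
    the plain integral of lik(t), approximated with the trapezoid rule.
    """
    step = GRID[1] - GRID[0]
    values = [lik(t) for t in GRID]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


if __name__ == "__main__":
    for name, lik in (("A", lik_a), ("B", lik_b)):
        print(name, "maximized:", maximized(lik), "integrated:", integrated(lik))
```

Under these made-up curves, the maximized score ranks hypothesis A above B, while the integrated score ranks B above A, because integration rewards a broad region of moderately good parameter values over a single sharp optimum. This shows only that the two summaries can disagree; it says nothing about which ranking is correct in a real phylogenetic analysis.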

Mendeley readers

The data shown below were compiled from readership statistics for 290 Mendeley readers of this research output.

Geographical breakdown

Country         Count   As %
United States      22     8%
Brazil              6     2%
Spain               4     1%
Germany             2    <1%
Colombia            2    <1%
United Kingdom      2    <1%
Poland              2    <1%
Canada              2    <1%
Sweden              1    <1%
Other               9     3%
Unknown           238    82%

Demographic breakdown

Readers by professional status   Count   As %
Researcher                          89    31%
Student > Ph. D. Student            72    25%
Student > Master                    27     9%
Professor > Associate Professor     23     8%
Professor                           12     4%
Other                               48    17%
Unknown                             19     7%

Readers by discipline                         Count   As %
Agricultural and Biological Sciences            213    73%
Biochemistry, Genetics and Molecular Biology     17     6%
Computer Science                                  7     2%
Environmental Science                             6     2%
Earth and Planetary Sciences                      6     2%
Other                                            14     5%
Unknown                                          27     9%