PLOS

Inferring Phylogenies from RAD Sequence Data

Overview of attention for article published in PLOS ONE, April 2012

Mentioned by

2 blogs
7 X users
1 Wikipedia page

Citations

275 Dimensions

Readers on

686 Mendeley
3 CiteULike

Title
Inferring Phylogenies from RAD Sequence Data
Published in
PLOS ONE, April 2012
DOI 10.1371/journal.pone.0033394
Pubmed ID
Authors

Benjamin E. R. Rubin, Richard H. Ree, Corrie S. Moreau

Abstract

Reduced-representation genome sequencing represents a new source of data for systematics, and its potential utility in interspecific phylogeny reconstruction has not yet been explored. One approach that seems especially promising is the use of inexpensive short-read technologies (e.g., Illumina, SOLiD) to sequence restriction-site associated DNA (RAD)--the regions of the genome that flank the recognition sites of restriction enzymes. In this study, we simulated the collection of RAD sequences from sequenced genomes of different taxa (Drosophila, mammals, and yeasts) and developed a proof-of-concept workflow to test whether informative data could be extracted and used to accurately reconstruct "known" phylogenies of species within each group. The workflow consists of three basic steps: first, sequences are clustered by similarity to estimate orthology; second, clusters are filtered by taxonomic coverage; and third, they are aligned and concatenated for "total evidence" phylogenetic analysis. We evaluated the performance of clustering and filtering parameters by comparing the resulting topologies with well-supported reference trees and we were able to identify conditions under which the reference tree was inferred with high support. For Drosophila, whole genome alignments allowed us to directly evaluate which parameters most consistently recovered orthologous sequences. For the parameter ranges explored, we recovered the best results at the low ends of sequence similarity and taxonomic representation of loci; these generated the largest supermatrices with the highest proportion of missing data. Applications of the method to mammals and yeasts were less successful, which we suggest may be due partly to their much deeper evolutionary divergence times compared to Drosophila (crown ages of approximately 100 and 300 versus 60 Mya, respectively). 
RAD sequences thus appear to hold promise for reconstructing phylogenetic relationships in younger clades in which sufficient numbers of orthologous restriction sites are retained across species.
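
The second and third workflow steps described in the abstract (filtering clusters by taxonomic coverage, then concatenating them into a supermatrix) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `?` missing-data symbol, and the toy loci are assumptions made for illustration, and the sketch assumes the loci have already been clustered and aligned.

```python
from typing import Dict, List

# A locus maps taxon name -> aligned sequence.
# Assumption: all sequences within one locus have the same length.
Locus = Dict[str, str]


def filter_by_coverage(loci: List[Locus], min_taxa: int) -> List[Locus]:
    """Keep only loci that were recovered in at least `min_taxa` taxa."""
    return [locus for locus in loci if len(locus) >= min_taxa]


def concatenate(loci: List[Locus], taxa: List[str]) -> Dict[str, str]:
    """Join loci end to end; a taxon absent from a locus is padded with '?'."""
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for locus in loci:
        length = len(next(iter(locus.values())))
        for taxon in taxa:
            parts[taxon].append(locus.get(taxon, "?" * length))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


def missing_fraction(matrix: Dict[str, str]) -> float:
    """Proportion of supermatrix cells that are missing data ('?')."""
    total = sum(len(seq) for seq in matrix.values())
    missing = sum(seq.count("?") for seq in matrix.values())
    return missing / total if total else 0.0


# Hypothetical toy data: three loci with decreasing taxonomic coverage.
taxa = ["A", "B", "C", "D"]
loci = [
    {"A": "ACGT", "B": "ACGA", "C": "ACGT", "D": "TCGT"},  # all four taxa
    {"A": "GG", "B": "GA"},                                # two taxa
    {"C": "TTT"},                                          # one taxon
]

strict = concatenate(filter_by_coverage(loci, min_taxa=4), taxa)
relaxed = concatenate(filter_by_coverage(loci, min_taxa=2), taxa)
```

With these toy inputs, the relaxed threshold yields a longer supermatrix than the strict one, but with some cells filled by `?`. That is the trade-off the abstract refers to when it notes that the best results came from the largest supermatrices with the highest proportion of missing data.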

X Demographics


The data shown below were collected from the profiles of 7 X users who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 686 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States     25      4%
Brazil            5       <1%
Switzerland       4       <1%
France            3       <1%
United Kingdom    3       <1%
Australia         3       <1%
Netherlands       2       <1%
Canada            2       <1%
Belgium           2       <1%
Other             13      2%
Unknown           624     91%

Demographic breakdown

Readers by professional status                  Count   As %
Student > Ph. D. Student                        224     33%
Researcher                                      139     20%
Student > Master                                77      11%
Student > Doctoral Student                      40      6%
Student > Bachelor                              40      6%
Other                                           111     16%
Unknown                                         55      8%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences            481     70%
Biochemistry, Genetics and Molecular Biology    68      10%
Environmental Science                           33      5%
Computer Science                                10      1%
Earth and Planetary Sciences                    7       1%
Other                                           15      2%
Unknown                                         72      10%