Report for: Joint Analysis of Multiple Metagenomic Samples

Title	Joint Analysis of Multiple Metagenomic Samples
Published in	PLoS Computational Biology, February 2012
DOI	10.1371/journal.pcbi.1002373
Pubmed ID	22359490
Authors	Yael Baran, Eran Halperin
Abstract	The availability of metagenomic sequencing data, generated by sequencing DNA pooled from multiple microbes living jointly, has increased sharply in the last few years with developments in sequencing technology. Characterizing the contents of metagenomic samples is a challenging task, which has been extensively attempted by both supervised and unsupervised techniques, each with its own limitations. Common to practically all the methods is the processing of single samples only; when multiple samples are sequenced, each is analyzed separately and the results are combined. In this paper we propose to perform a combined analysis of a set of samples in order to obtain a better characterization of each of the samples, and provide two applications of this principle. First, we use an unsupervised probabilistic mixture model to infer hidden components shared across metagenomic samples. We incorporate the model in a novel framework for studying association of microbial sequence elements with phenotypes, analogous to the genome-wide association studies performed on human genomes: We demonstrate that stratification may result in false discoveries of such associations, and that the components inferred by the model can be used to correct for this stratification. Second, we propose a novel read clustering (also termed "binning") algorithm which operates on multiple samples simultaneously, leveraging on the assumption that the different samples contain the same microbial species, possibly in different proportions. We show that integrating information across multiple samples yields more precise binning on each of the samples. Moreover, for both applications we demonstrate that given a fixed depth of coverage, the average per-sample performance generally increases with the number of sequenced samples as long as the per-sample coverage is high enough.


X Demographics

The data shown below were collected from the profiles of 3 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United Kingdom	1	33%
United States	1	33%
Italy	1	33%

Demographic breakdown

Type	Count	As %
Scientists	3	100%

Mendeley readers

The data shown below were compiled from readership statistics for 122 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United States	6	5%
United Kingdom	5	4%
Denmark	3	2%
Italy	2	2%
Switzerland	1	<1%
Mexico	1	<1%
Sweden	1	<1%
Estonia	1	<1%
Germany	1	<1%
Other	0	0%
Unknown	101	83%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	36	30%
Student > Ph. D. Student	33	27%
Student > Master	14	11%
Professor > Associate Professor	11	9%
Student > Bachelor	7	6%
Other	15	12%
Unknown	6	5%

Readers by discipline	Count	As %
Agricultural and Biological Sciences	72	59%
Biochemistry, Genetics and Molecular Biology	10	8%
Mathematics	9	7%
Computer Science	9	7%
Immunology and Microbiology	3	2%
Other	9	7%
Unknown	10	8%
