PLOS

Where Have All the Interactions Gone? Estimating the Coverage of Two-Hybrid Protein Interaction Maps

Overview of attention for article published in PLoS Computational Biology, November 2007

Mentioned by

1 research highlight platform (F1000)

Citations

135 Dimensions

Readers on

146 Mendeley
7 CiteULike
4 Connotea

Title
Where Have All the Interactions Gone? Estimating the Coverage of Two-Hybrid Protein Interaction Maps
Published in
PLoS Computational Biology, November 2007
DOI 10.1371/journal.pcbi.0030214
Pubmed ID
Authors

Hailiang Huang, Bruno M Jedynak, Joel S Bader

Abstract

Yeast two-hybrid screens are an important method for mapping pairwise physical interactions between proteins. The fraction of interactions detected in independent screens can be very small, and an outstanding challenge is to determine the reason for the low overlap. Low overlap can arise from either a high false-discovery rate (interaction sets have low overlap because each set is contaminated by a large number of stochastic false-positive interactions) or a high false-negative rate (interaction sets have low overlap because each misses many true interactions). We extend capture-recapture theory to provide the first unified model for false-positive and false-negative rates for two-hybrid screens. Analysis of yeast, worm, and fly data indicates that 25% to 45% of the reported interactions are likely false positives. Membrane proteins have higher false-discovery rates on average, and signal transduction proteins have lower rates. The overall false-negative rate ranges from 75% for worm to 90% for fly, which arises from a roughly 50% false-negative rate due to statistical undersampling and a 55% to 85% false-negative rate due to proteins that appear to be systematically lost from the assays. Finally, statistical model selection conclusively rejects the Erdős-Rényi network model in favor of the power law model for yeast and the truncated power law for worm and fly degree distributions. Much as genome sequencing coverage estimates were essential for planning the human genome sequencing project, the coverage estimates developed here will be valuable for guiding future proteomic screens. All software and datasets are available in the article's supporting information and are also available from our Web site, http://www.baderzone.org.
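
As a rough illustration of the reasoning in the abstract, the sketch below shows two things. The first is the classical two-sample Lincoln-Petersen capture-recapture estimator, which is the textbook starting point and not the extended model the authors develop. The second is one plausible way the two false-negative components could combine, assuming they act independently, so that an interaction is detected only if it survives both. That independence assumption, the function names, and the toy counts are illustrative and do not come from the paper.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Classical two-sample capture-recapture estimate of the total population.

    n1, n2: number of items found in screen 1 and screen 2.
    overlap: number of items found in both screens.
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive for the estimate to exist")
    return n1 * n2 / overlap


def combined_false_negative_rate(fn_sampling: float, fn_systematic: float) -> float:
    """Combine two loss mechanisms under an (assumed) independence model.

    An interaction is detected only if it is neither missed by undersampling
    nor systematically lost, so the detection probabilities multiply.
    """
    return 1.0 - (1.0 - fn_sampling) * (1.0 - fn_systematic)


if __name__ == "__main__":
    # Hypothetical toy counts, not data from the paper.
    print(lincoln_petersen(1000, 800, 200))

    # Rates quoted in the abstract: ~50% undersampling, 55% to 85% systematic loss.
    for fn_systematic in (0.55, 0.85):
        print(round(combined_false_negative_rate(0.5, fn_systematic), 3))
```

Under this assumed model, the combined rates come out near 78% and 93%, broadly in line with the 75% to 90% overall false-negative range quoted in the abstract.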

Mendeley readers

The data shown below were compiled from readership statistics for 146 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 5 3%
Germany 3 2%
United Kingdom 3 2%
Netherlands 2 1%
Australia 1 <1%
Austria 1 <1%
France 1 <1%
Belgium 1 <1%
Argentina 1 <1%
Other 2 1%
Unknown 126 86%

Demographic breakdown

Readers by professional status Count As %
Student > Ph. D. Student 47 32%
Researcher 34 23%
Student > Master 18 12%
Student > Bachelor 12 8%
Other 6 4%
Other 17 12%
Unknown 12 8%

Readers by discipline Count As %
Agricultural and Biological Sciences 82 56%
Biochemistry, Genetics and Molecular Biology 26 18%
Computer Science 11 8%
Chemistry 3 2%
Unspecified 2 1%
Other 9 6%
Unknown 13 9%