PLOS

From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction

Overview of attention for article published in PLoS Computational Biology, August 2013
Mentioned by
3 X users

Citations
124 Dimensions

Readers on
219 Mendeley
7 CiteULike

Title: From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction
Published in: PLoS Computational Biology, August 2013
DOI: 10.1371/journal.pcbi.1003176
Pubmed ID:
Authors: Simona Cocco, Remi Monasson, Martin Weigt

Abstract

Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant 'patterns' of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold.
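
As a rough illustration of the idea in the abstract, the sketch below builds a residue-residue correlation matrix from a toy alignment, diagonalizes it, and measures how localized each eigenmode is. It is a minimal sketch under stated assumptions, not the authors' procedure: the random toy alignment, the helper names (`one_hot`, `correlation_modes`, `inverse_participation_ratio`), the simple diagonal Pearson normalization, the small `eps` regularizer, and the use of the inverse participation ratio as a localization score are all choices made here for illustration.

```python
import numpy as np

def one_hot(msa, q):
    """Encode an integer alignment (B sequences x L sites, symbols 0..q-1)
    as a B x (L*q) matrix of 0/1 indicators."""
    B, L = msa.shape
    X = np.zeros((B, L * q))
    rows = np.repeat(np.arange(B), L)
    cols = (np.arange(L) * q + msa).ravel()
    X[rows, cols] = 1.0
    return X

def correlation_modes(msa, q, eps=1e-8):
    """Return the correlation matrix of the encoded alignment together with
    its eigenvalues (ascending) and eigenvectors (as columns)."""
    B, L = msa.shape
    X = one_hot(msa, q)
    # Drop the last symbol of every site: the q indicators of a site sum
    # to one, so keeping all of them makes the matrix trivially singular.
    keep = np.arange(L * q) % q != q - 1
    X = X[:, keep]
    C = np.cov(X, rowvar=False, bias=True)
    # Assumption: plain diagonal normalization, with eps guarding against
    # zero variance in fully conserved columns.
    std = np.sqrt(np.diag(C) + eps)
    gamma = C / np.outer(std, std)
    eigvals, eigvecs = np.linalg.eigh(gamma)
    return gamma, eigvals, eigvecs

def inverse_participation_ratio(eigvecs):
    """Sum of fourth powers of each unit-norm eigenvector: close to 1 for a
    mode concentrated on one entry, close to 1/N for a fully spread mode."""
    return np.sum(eigvecs ** 4, axis=0)

# Toy data (hypothetical): 200 random sequences, 6 sites, 3 symbols,
# with site 1 copying site 0 in roughly 90% of the sequences.
rng = np.random.default_rng(0)
msa = rng.integers(0, 3, size=(200, 6))
mask = rng.random(200) < 0.9
msa[mask, 1] = msa[mask, 0]

gamma, eigvals, eigvecs = correlation_modes(msa, q=3)
ipr = inverse_participation_ratio(eigvecs)

# PCA would keep only the modes at the end of this list (largest
# eigenvalues); the abstract argues the modes at the start matter too.
for lam, loc in zip(eigvals, ipr):
    print(f"eigenvalue {lam:7.3f}   localization {loc:.3f}")
```

Printing the eigenvalues next to the localization scores makes it easy to compare the two ends of the spectrum on real data; on this toy alignment the only structure is the single coupled pair of sites.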

X Demographics

The data shown below were collected from the profiles of 3 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 219 Mendeley readers of this research output.

Geographical breakdown

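| Country | Count | As % |
|---|---|---|
| United States | 6 | 3% |
| United Kingdom | 4 | 2% |
| France | 2 | <1% |
| India | 2 | <1% |
| Germany | 2 | <1% |
| Canada | 2 | <1% |
| Czechia | 1 | <1% |
| Netherlands | 1 | <1% |
| Argentina | 1 | <1% |
| Other | 1 | <1% |
| Unknown | 197 | 90% |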

Demographic breakdown

| Readers by professional status | Count | As % |
|---|---|---|
| Student > Ph. D. Student | 77 | 35% |
| Researcher | 47 | 21% |
| Student > Doctoral Student | 21 | 10% |
| Student > Master | 18 | 8% |
| Student > Bachelor | 11 | 5% |
| Other | 28 | 13% |
| Unknown | 17 | 8% |

| Readers by discipline | Count | As % |
|---|---|---|
| Agricultural and Biological Sciences | 67 | 31% |
| Biochemistry, Genetics and Molecular Biology | 34 | 16% |
| Physics and Astronomy | 31 | 14% |
| Computer Science | 17 | 8% |
| Chemistry | 16 | 7% |
| Other | 26 | 12% |
| Unknown | 28 | 13% |