PLOS

On Evaluating MHC-II Binding Peptide Prediction Methods

Overview of attention for article published in PLOS ONE, September 2008

Mentioned by

1 X user
1 Wikipedia page

Citations

41 Dimensions

Readers on

46 Mendeley
Title
On Evaluating MHC-II Binding Peptide Prediction Methods
Published in
PLOS ONE, September 2008
DOI 10.1371/journal.pone.0003268
Pubmed ID
Authors

Yasser EL-Manzalawy, Drena Dobbs, Vasant Honavar

Abstract

Choice of one method over another for MHC-II binding peptide prediction is typically based on published reports of their estimated performance on standard benchmark datasets. We show that several standard benchmark datasets of unique peptides used in such studies contain a substantial number of peptides that share a high degree of sequence identity with one or more other peptide sequences in the same dataset. Thus, in a standard cross-validation setup, the test set and the training set are likely to contain sequences that share a high degree of sequence identity with each other, leading to overly optimistic estimates of performance. Hence, to more rigorously assess the relative performance of different prediction methods, we explore the use of similarity-reduced datasets. We introduce three similarity-reduced MHC-II benchmark datasets derived from MHCPEP, MHCBN, and IEDB databases. The results of our comparison of the performance of three MHC-II binding peptide prediction methods estimated using datasets of unique peptides with that obtained using their similarity-reduced counterparts show that the former can be rather optimistic relative to the performance of the same methods on similarity-reduced counterparts of the same datasets. Furthermore, our results demonstrate that conclusions regarding the superiority of one method over another drawn on the basis of performance estimates obtained using commonly used datasets of unique peptides are often contradicted by the observed performance of the methods on the similarity-reduced versions of the same datasets. These results underscore the importance of using similarity-reduced datasets in rigorously comparing the performance of alternative MHC-II peptide prediction methods.
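
To make the idea of a similarity-reduced dataset concrete, the following is a minimal Python sketch of one possible greedy filter. It is not the procedure used by the authors, which this page does not describe. The function names `ungapped_identity` and `similarity_reduce`, the ungapped sliding-window identity measure, and the 0.8 threshold are all illustrative assumptions.

```python
from typing import List


def ungapped_identity(a: str, b: str) -> float:
    """Best fraction of matching residues when the shorter peptide is slid
    along the longer one without gaps (hypothetical identity measure)."""
    short, long_ = sorted((a, b), key=len)
    if not short:
        return 0.0
    best = 0
    for offset in range(len(long_) - len(short) + 1):
        # zip stops at the end of the shorter peptide
        matches = sum(x == y for x, y in zip(short, long_[offset:]))
        best = max(best, matches)
    return best / len(short)


def similarity_reduce(peptides: List[str], threshold: float = 0.8) -> List[str]:
    """Greedily keep a peptide only if its identity to every peptide already
    kept is below the threshold (the threshold value is an assumption)."""
    kept: List[str] = []
    for pep in peptides:
        if all(ungapped_identity(pep, k) < threshold for k in kept):
            kept.append(pep)
    return kept


if __name__ == "__main__":
    toy = ["AAAAAAAAA", "AAAAAAAAC", "CCCCCCCCC", "AAAAAAAAA"]
    print(similarity_reduce(toy))
```

With a filter of this kind applied before splitting the data, no two peptides in different cross-validation folds can exceed the chosen identity threshold, which is the property the abstract argues is needed for less optimistic performance estimates. The result of a greedy filter depends on input order, so a real study would need to fix and report that order or use a clustering tool instead.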

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 46 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States    2       4%
Germany          1       2%
Chile            1       2%
Italy            1       2%
Ireland          1       2%
Spain            1       2%
Argentina        1       2%
Unknown          38      83%

Demographic breakdown

Readers by professional status                  Count   As %
Student > Ph. D. Student                        19      41%
Researcher                                      9       20%
Student > Master                                5       11%
Other                                           3       7%
Student > Bachelor                              3       7%
Other                                           4       9%
Unknown                                         3       7%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences            24      52%
Biochemistry, Genetics and Molecular Biology    6       13%
Immunology and Microbiology                     3       7%
Computer Science                                3       7%
Chemistry                                       2       4%
Other                                           5       11%
Unknown                                         3       7%