Semantic Similarity for Automatic Classification of Chemical Compounds

Overview of attention for article published in PLoS Computational Biology, September 2010
Readers on

103 Mendeley
6 CiteULike
Title: Semantic Similarity for Automatic Classification of Chemical Compounds
Published in: PLoS Computational Biology, September 2010
DOI: 10.1371/journal.pcbi.1000937
Authors: João D. Ferreira, Francisco M. Couto

Abstract

With the increasing amount of data made available in the chemical field, there is a strong need for systems capable of comparing and classifying chemical compounds in an efficient and effective way. The best approaches existing today are based on the structure-activity relationship premise, which states that biological activity of a molecule is strongly related to its structural or physicochemical properties. This work presents a novel approach to the automatic classification of chemical compounds by integrating semantic similarity with existing structural comparison methods. Our approach was assessed based on the Matthews Correlation Coefficient for the prediction, and achieved values of 0.810 when used as a prediction of blood-brain barrier permeability, 0.694 for P-glycoprotein substrate, and 0.673 for estrogen receptor binding activity. These results expose a significant improvement over the currently existing methods, whose best performances were 0.628, 0.591, and 0.647 respectively. It was demonstrated that the integration of semantic similarity is a feasible and effective way to improve existing chemical compound classification systems. Among other possible uses, this tool helps the study of the evolution of metabolic pathways, the study of the correlation of metabolic networks with properties of those networks, or the improvement of ontologies that represent chemical information.
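The abstract reports its results as Matthews Correlation Coefficient (MCC) values. As a minimal sketch of how that metric is computed from a binary confusion matrix, the Python below uses the standard MCC formula. The function name and the example counts are illustrative assumptions and are not taken from the paper. Only the reported and baseline scores in the dictionary come from the abstract.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than
    chance) to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when any marginal total is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical confusion-matrix counts, for illustration only.
example_mcc = matthews_corrcoef(tp=45, tn=40, fp=5, fn=10)

# (reported MCC, best previous MCC) as stated in the abstract.
reported = {
    "blood-brain barrier permeability": (0.810, 0.628),
    "P-glycoprotein substrate": (0.694, 0.591),
    "estrogen receptor binding activity": (0.673, 0.647),
}

# Absolute MCC gain over the best previous method, per task.
gains = {task: round(new - old, 3) for task, (new, old) in reported.items()}

if __name__ == "__main__":
    print(f"example MCC: {example_mcc:.3f}")
    for task, gain in gains.items():
        print(f"{task}: +{gain:.3f}")
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unbalanced.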

Mendeley readers

The data shown below were compiled from readership statistics for 103 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 7 7%
Portugal 4 4%
France 2 2%
United Kingdom 2 2%
India 1 <1%
Spain 1 <1%
China 1 <1%
Unknown 85 83%

Demographic breakdown

Readers by professional status Count As %
Researcher 31 30%
Student > Ph. D. Student 14 14%
Student > Master 12 12%
Professor > Associate Professor 8 8%
Student > Doctoral Student 5 5%
Other 21 20%
Unknown 12 12%

Readers by discipline Count As %
Computer Science 28 27%
Agricultural and Biological Sciences 21 20%
Chemistry 15 15%
Medicine and Dentistry 6 6%
Biochemistry, Genetics and Molecular Biology 5 5%
Other 16 16%
Unknown 12 12%