PLOS

Benchmarking Ontologies: Bigger or Better?

Overview of attention for article published in PLoS Computational Biology, January 2011

Mentioned by

1 blog
1 X user

Citations

17 Dimensions

Readers on

111 Mendeley
12 CiteULike

Title: Benchmarking Ontologies: Bigger or Better?
Published in: PLoS Computational Biology, January 2011
DOI: 10.1371/journal.pcbi.1001055
Pubmed ID:
Authors: Lixia Yao, Anna Divoli, Ilya Mayzus, James A. Evans, Andrey Rzhetsky

Abstract

A scientific ontology is a formal representation of knowledge within a domain, typically including central concepts, their properties, and relations. With the rise of computers and high-throughput data collection, ontologies have become essential to data mining and sharing across communities in the biomedical sciences. Powerful approaches exist for testing the internal consistency of an ontology, but not for assessing the fidelity of its domain representation. We introduce a family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain. We then test these metrics using (1) four of the most common medical ontologies with respect to a corpus of medical documents and (2) seven of the most popular English thesauri with respect to three corpora that sample language from medicine, news, and novels. Here we show that our approach captures the quality of ontological representation and guides efforts to narrow the breach between ontology and collective discourse within a domain. Our results also demonstrate key features of medical ontologies, English thesauri, and discourse from different domains. Medical ontologies have a small intersection, as do English thesauri. Moreover, dialects characteristic of distinct domains vary strikingly as many of the same words are used quite differently in medicine, news, and novels. As ontologies are intended to mirror the state of knowledge, our methods to tighten the fit between ontology and domain will increase their relevance for new areas of biomedical science and improve the accuracy and power of inferences computed across them.

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 111 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States    12      11%
Germany          3       3%
Brazil           3       3%
Mexico           2       2%
Sweden           2       2%
Cuba             1       <1%
France           1       <1%
Chile            1       <1%
Norway           1       <1%
Other            3       3%
Unknown          82      74%

Demographic breakdown

Readers by professional status     Count   As %
Researcher                         28      25%
Student > Ph. D. Student           23      21%
Professor > Associate Professor    10      9%
Other                              9       8%
Student > Master                   9       8%
Other                              22      20%
Unknown                            10      9%

Readers by discipline                           Count   As %
Computer Science                                33      30%
Agricultural and Biological Sciences            32      29%
Social Sciences                                 6       5%
Business, Management and Accounting             4       4%
Biochemistry, Genetics and Molecular Biology    4       4%
Other                                           20      18%
Unknown                                         12      11%