PLOS

Learning to Recognize Phenotype Candidates in the Auto-Immune Literature Using SVM Re-Ranking

Overview of attention for article published in PLOS ONE, October 2013

Mentioned by
X (Twitter): 1 user

Readers on
Mendeley: 38
CiteULike: 2
Title
Learning to Recognize Phenotype Candidates in the Auto-Immune Literature Using SVM Re-Ranking
Published in
PLOS ONE, October 2013
DOI 10.1371/journal.pone.0072965
Pubmed ID
Authors

Nigel Collier, Mai-vu Tran, Hoang-quynh Le, Quang-Thuy Ha, Anika Oellrich, Dietrich Rebholz-Schuhmann

Abstract

The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because phenotype descriptions vary widely, a number of challenges remain in their identification and semantic normalisation before they can be fully exploited for research purposes. This paper presents novel techniques for identifying potential complex phenotype mentions by exploiting a hybrid model based on machine learning, rules and dictionary matching. A systematic study is made of how to combine sequence labels from these modules, as well as of the merits of various ontological resources. We evaluated our approach on a subset of Medline abstracts, related to auto-immune diseases, that are cited by the Online Mendelian Inheritance in Man database. Using partial matching, the best micro-averaged F-score for phenotypes and five other entity classes was 79.9%. A best performance of 75.3% was achieved for phenotype candidates using all semantic resources. We observed the advantage of using SVM-based learn-to-rank for sequence label combination over maximum entropy and a priority list approach. The results indicate that the identification of simple entity types such as chemicals and genes is robustly supported by single semantic resources, whereas phenotypes require combinations. Altogether we conclude that our approach coped well with the compositional structure of phenotypes in the auto-immune domain.
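The micro-averaged F-score quoted in the abstract pools the counts from all entity classes before computing precision and recall, so frequent classes weigh more than rare ones. The following is a minimal sketch of that calculation, not the authors' evaluation script. The function name `micro_f1` and the toy counts are hypothetical, and it assumes that the matching criterion (for example partial matching, where an overlapping span counts as a hit) has already been applied when the true positives, false positives and false negatives were tallied.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one entity class.
Counts = Tuple[int, int, int]


def micro_f1(per_class: Dict[str, Counts]) -> float:
    """Micro-averaged F1: pool counts over all classes, then compute P, R and F1."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    if tp == 0:
        # No correct predictions at all: define the score as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy counts for illustration only; these are NOT results from the paper.
toy_counts = {"phenotype": (8, 2, 2), "gene": (4, 0, 2)}
score = micro_f1(toy_counts)
print(f"micro-averaged F1 = {score:.3f}")
```

With these toy counts the pooled totals are 12 true positives, 2 false positives and 4 false negatives, giving a precision of 12/14, a recall of 12/16 and an F1 of 0.8.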

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 38 Mendeley readers of this research output.

Geographical breakdown

| Country | Count | Percentage |
|---|---|---|
| United States | 2 | 5% |
| France | 1 | 3% |
| Australia | 1 | 3% |
| Unknown | 34 | 89% |

Demographic breakdown

| Readers by professional status | Count | Percentage |
|---|---|---|
| Researcher | 7 | 18% |
| Student > Ph. D. Student | 7 | 18% |
| Student > Bachelor | 4 | 11% |
| Student > Doctoral Student | 4 | 11% |
| Lecturer | 2 | 5% |
| Other | 7 | 18% |
| Unknown | 7 | 18% |

| Readers by discipline | Count | Percentage |
|---|---|---|
| Computer Science | 13 | 34% |
| Medicine and Dentistry | 7 | 18% |
| Agricultural and Biological Sciences | 6 | 16% |
| Biochemistry, Genetics and Molecular Biology | 3 | 8% |
| Engineering | 2 | 5% |
| Other | 1 | 3% |
| Unknown | 6 | 16% |