PLOS

Textrous!: Extracting Semantic Textual Meaning from Gene Sets

Overview of attention for article published in PLOS ONE, April 2013
Mentioned by

twitter
3 X users

Readers on

mendeley
38 Mendeley
citeulike
4 CiteULike
Title
Textrous!: Extracting Semantic Textual Meaning from Gene Sets
Published in
PLOS ONE, April 2013
DOI 10.1371/journal.pone.0062665
Pubmed ID
Authors

Hongyu Chen, Bronwen Martin, Caitlin M. Daimon, Sana Siddiqui, Louis M. Luttrell, Stuart Maudsley

Abstract

The unbiased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically relevant information from simple gene lists, a mathematical association to scientific language and meaningful words or sentences is crucial. Unfortunately, existing software for deriving meaningful and easily appreciable scientific textual 'tokens' from large gene sets either relies on controlled vocabularies (Medical Subject Headings, Gene Ontology, BioCarta) or employs Boolean text searching and co-occurrence models that are incapable of detecting indirect links in the literature. As an improvement to existing web-based informatic tools, we have developed Textrous!, a web-based framework for the extraction of biomedical semantic meaning from a given input gene set of arbitrary length. Textrous! employs natural language processing techniques, including latent semantic indexing (LSI), sentence splitting, word tokenization, parts-of-speech tagging, and noun-phrase chunking, to mine MEDLINE abstracts, PubMed Central articles, articles from the Online Mendelian Inheritance in Man (OMIM), and Mammalian Phenotype annotation obtained from Jackson Laboratories. Textrous! has the ability to generate meaningful output data with even very small input datasets, using two different text extraction methodologies (collective and individual) for the selection, ranking, clustering, and visualization of English words obtained from the user data. Textrous!, therefore, is able to facilitate the output of quantitatively significant and easily appreciable semantic words and phrases linked to both individual gene and batch genomic data.

X Demographics

The data shown below were collected from the profiles of 3 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 38 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Malaysia 1 3%
Belgium 1 3%
Unknown 36 95%

Demographic breakdown

Readers by professional status Count As %
Student > Ph. D. Student 10 26%
Researcher 9 24%
Student > Master 5 13%
Other 3 8%
Student > Bachelor 2 5%
Other 6 16%
Unknown 3 8%

Readers by discipline Count As %
Agricultural and Biological Sciences 12 32%
Arts and Humanities 4 11%
Medicine and Dentistry 3 8%
Linguistics 2 5%
Biochemistry, Genetics and Molecular Biology 2 5%
Other 11 29%
Unknown 4 11%