Using Sequence-Specific Chemical and Structural Properties of DNA to Predict Transcription Factor Binding Sites

Overview of attention for article published in PLoS Computational Biology, November 2010

Mentioned by
1 research highlight platform (F1000)

Readers on
86 Mendeley
7 CiteULike

Title
Using Sequence-Specific Chemical and Structural Properties of DNA to Predict Transcription Factor Binding Sites
Published in
PLoS Computational Biology, November 2010
DOI
10.1371/journal.pcbi.1001007
Pubmed ID
Authors

Amy L. Bauer, William S. Hlavacek, Pat J. Unkefer, Fangping Mu

Abstract

An important step in understanding gene regulation is to identify the DNA binding sites recognized by each transcription factor (TF). Conventional approaches to prediction of TF binding sites involve the definition of consensus sequences or position-specific weight matrices and rely on statistical analysis of DNA sequences of known binding sites. Here, we present a method called SiteSleuth in which DNA structure prediction, computational chemistry, and machine learning are applied to develop models for TF binding sites. In this approach, binary classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. These features are determined via molecular dynamics calculations in which we consider each base in different local neighborhoods. For each of 54 TFs in Escherichia coli, for which at least five DNA binding sites are documented in RegulonDB, the TF binding sites and portions of the non-coding genome sequence are mapped to feature vectors and used in training. According to cross-validation analysis and a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth outperforms three conventional approaches: Match, MATRIX SEARCH, and the method of Berg and von Hippel. SiteSleuth also outperforms QPMEME, a method similar to SiteSleuth in that it involves a learning algorithm. The main advantage of SiteSleuth is a lower false positive rate.
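The abstract contrasts SiteSleuth with conventional predictors built on position-specific weight matrices. As a rough illustration of that baseline only (not of SiteSleuth, and not of the exact scoring used by Match, MATRIX SEARCH, or Berg and von Hippel), the sketch below builds a log-odds matrix from aligned sites and scans a sequence with it. The toy sites, the uniform background, the pseudocount, and the threshold are all assumptions made up for this example; the function names are hypothetical.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=None):
    """Build per-position log2-odds scores from aligned, equal-length sites.

    Assumption: a uniform background (0.25 per base) unless one is supplied.
    """
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    pwm = []
    for pos in range(width):
        # Start every count at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background[base])
             for base in ALPHABET}
        )
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds for one window of the same width as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window_score = score(pwm, sequence[start:start + width])
        if window_score >= threshold:
            hits.append((start, window_score))
    return hits


# Hypothetical aligned binding sites, invented for illustration only.
TOY_SITES = ["TGACGT", "TGACGA", "TGATGT", "TGACGT", "TAACGT"]

pwm = build_pwm(TOY_SITES)
hits = scan(pwm, "CCTGACGTCC", threshold=4.0)
```

A matrix like this scores each position independently from base counts alone. The approach described in the abstract instead maps each candidate site to a vector of chemical and structural features and trains a binary classifier on those vectors, which this sketch does not attempt.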

Mendeley readers

The data shown below were compiled from readership statistics for 86 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 8 9%
Chile 1 1%
France 1 1%
Australia 1 1%
Israel 1 1%
Saudi Arabia 1 1%
Korea, Republic of 1 1%
Belgium 1 1%
Argentina 1 1%
Other 2 2%
Unknown 68 79%

Demographic breakdown

Readers by professional status Count As %
Researcher 24 28%
Student > Ph. D. Student 18 21%
Professor > Associate Professor 10 12%
Professor 7 8%
Student > Bachelor 6 7%
Other 15 17%
Unknown 6 7%

Readers by discipline Count As %
Agricultural and Biological Sciences 48 56%
Biochemistry, Genetics and Molecular Biology 10 12%
Computer Science 9 10%
Medicine and Dentistry 3 3%
Engineering 2 2%
Other 5 6%
Unknown 9 10%