Report for: High Resolution Models of Transcription Factor-DNA Affinities Improve In Vitro and In Vivo Binding Predictions

Title	High Resolution Models of Transcription Factor-DNA Affinities Improve In Vitro and In Vivo Binding Predictions
Published in	PLoS Computational Biology, September 2010
DOI	10.1371/journal.pcbi.1000916
Pubmed ID	20838582
Authors	Phaedra Agius, Aaron Arvey, William Chang, William Stafford Noble, Christina Leslie
Abstract	Accurately modeling the DNA sequence preferences of transcription factors (TFs), and using these models to predict in vivo genomic binding sites for TFs, are key pieces in deciphering the regulatory code. These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices (PSSMs), which may match large numbers of sites and produce an unreliable list of target genes. Recently, protein binding microarray (PBM) experiments have emerged as a new source of high resolution data on in vitro TF binding specificities. PBM data has been analyzed either by estimating PSSMs or via rank statistics on probe intensities, so that individual sequence patterns are assigned enrichment scores (E-scores). This representation is informative but unwieldy because every TF is assigned a list of thousands of scored sequence patterns. Meanwhile, high-resolution in vivo TF occupancy data from ChIP-seq experiments is also increasingly available. We have developed a flexible discriminative framework for learning TF binding preferences from high resolution in vitro and in vivo data. We first trained support vector regression (SVR) models on PBM data to learn the mapping from probe sequences to binding intensities. We used a novel k-mer based string kernel called the di-mismatch kernel to represent probe sequence similarities. The SVR models are more compact than E-scores, more expressive than PSSMs, and can be readily used to scan genomic regions to predict in vivo occupancy. Using a large data set of yeast and mouse TFs, we found that our SVR models can better predict probe intensity than the E-score method or PBM-derived PSSMs. Moreover, by using SVRs to score yeast, mouse, and human genomic regions, we were better able to predict genomic occupancy as measured by ChIP-chip and ChIP-seq experiments. Finally, we found that by training kernel-based models directly on ChIP-seq data, we greatly improved in vivo occupancy prediction, and by comparing a TF's in vitro and in vivo models, we could identify cofactors and disambiguate direct and indirect binding.
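
The abstract describes comparing probe sequences through a k-mer based string kernel. As a rough illustration of that idea only, the sketch below implements the classic (k, m)-mismatch kernel, not the paper's di-mismatch kernel, whose details are not given here. The function names, the values k=3 and m=1, and the example sequences are all assumptions chosen for illustration.

```python
from itertools import product

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_features(seq, k=3, m=1):
    """Map a DNA sequence to counts over all 4**k possible k-mers.

    Each k-mer observed in `seq` adds one count to every k-mer that lies
    within `m` mismatches of it (its mismatch neighbourhood).
    """
    features = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for kmer in features:
            if hamming(window, kmer) <= m:
                features[kmer] += 1
    return features


def mismatch_kernel(s, t, k=3, m=1):
    """Similarity of two sequences: dot product of their feature maps."""
    fs = mismatch_features(s, k, m)
    ft = mismatch_features(t, k, m)
    return sum(fs[kmer] * ft[kmer] for kmer in fs)


if __name__ == "__main__":
    # Hypothetical probe fragments, not taken from any PBM data set.
    print(mismatch_kernel("ACGTAC", "ACGTTC"))
```

A matrix of such pairwise kernel values could in principle be passed to a support vector regressor that accepts a precomputed kernel. Enumerating all 4**k k-mers is only practical for small k, so this sketch is not suited to realistic probe lengths.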

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.

Geographical breakdown

Country	Count	As %
United States	1	100%

Demographic breakdown

Type	Count	As %
Scientists	1	100%

Mendeley readers

The data shown below were compiled from readership statistics for 145 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
United States	9	6%
France	3	2%
Germany	2	1%
Sweden	2	1%
Canada	2	1%
United Kingdom	1	<1%
Hong Kong	1	<1%
Argentina	1	<1%
Singapore	1	<1%
Other	2	1%
Unknown	121	83%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	44	30%
Student > Ph. D. Student	43	30%
Student > Master	11	8%
Professor > Associate Professor	8	6%
Student > Bachelor	7	5%
Other	22	15%
Unknown	10	7%

Readers by discipline	Count	As %
Agricultural and Biological Sciences	82	57%
Computer Science	19	13%
Biochemistry, Genetics and Molecular Biology	14	10%
Mathematics	4	3%
Medicine and Dentistry	3	2%
Other	8	6%
Unknown	15	10%
