Assessing transcription factor motif drift from noisy decoy sequences

Timothy E Reddy, Charles DeLisi, Boris E Shakhnovich
Genome Informatics 2005, 16 (1): 59-67
Genome-scale identification of transcription factor binding sites (TFBS) is fundamental to understanding the complexities of mRNA expression at both the cell and organismal levels. While high-throughput experimental methods provide associations between transcription factors and the genes they regulate under a specified experimental condition, computational methods are still required to pinpoint the exact location of binding. Moreover, since the binding site is an intrinsic property of the promoter region, computational methods are in principle more general than condition-dependent experimental methods. Computational identification of TFBSs is complicated in at least two ways. First, transcription factors bind a heterogeneous distribution of sites and therefore have a distribution of affinities. Second, not every sequence in the set for which a common site is to be determined contains a site for the TF of interest. In this paper, we evaluate the robustness of TFBS identification with respect to both effects. We show that adding upstream regions that lack the TFBS destroys the specificity of the predicted binding site. We also propose a method to calculate the distance between position weight matrices that can be used to measure "drift" from the canonical binding site. The results presented here could be useful in developing future transcription factor binding site identification algorithms.
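
The idea of measuring drift as a distance between position weight matrices can be illustrated with a minimal sketch. The abstract does not specify the distance the authors use, so the metric below (the mean per-column Jensen-Shannon divergence) is an assumption chosen only for illustration, as are the function names `build_pwm`, `js_divergence`, and `pwm_distance`, the pseudocount, and the toy sequences.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Return column-wise base frequencies for equal-length aligned sites.

    Illustrative helper, not the authors' procedure. The pseudocount keeps
    every frequency strictly positive.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[i]] += 1.0
        total = sum(counts.values())
        pwm.append([counts[base] / total for base in ALPHABET])
    return pwm


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def pwm_distance(pwm_a, pwm_b):
    """Mean per-column divergence between two PWMs of equal width.

    Assumed metric for illustration; the paper's own distance may differ.
    """
    if len(pwm_a) != len(pwm_b):
        raise ValueError("PWMs must have the same number of columns")
    total = sum(js_divergence(p, q) for p, q in zip(pwm_a, pwm_b))
    return total / len(pwm_a)


# Toy data (made up): sites resembling one motif, plus unrelated decoys.
canonical_sites = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTCA"]
decoy_sites = ["ACGTTGC", "GGCATAC"]

canonical_pwm = build_pwm(canonical_sites)
noisy_pwm = build_pwm(canonical_sites + decoy_sites)
drift = pwm_distance(canonical_pwm, noisy_pwm)
print(f"drift from canonical PWM: {drift:.4f}")
```

Under this assumed metric, a distance of zero means the two matrices are identical, and larger values indicate that the matrix built with decoy sequences has moved further from the canonical one.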
