Abstract
The computational identification of DNA binding sites that have high affinity for a specific transcription factor is an important problem that has been only partially addressed in prokaryotes and lower eukaryotes. In higher eukaryotes, where regulatory regions are longer and DNA binding signatures have relatively low complexity, methods to address this problem are lacking. In this paper, we propose a novel computational framework that combines cellular network reverse engineering, integrative genomics, and comparative genomics to address this problem for a set of human transcription factors. Specifically, we study the regulatory regions of putative orthologous targets of a given transcription factor, obtained by reverse engineering methods, in several mammalian genomes. Highly conserved regions are identified by pattern discovery. Finally, DNA binding sites are inferred from these regions using a standard Position Weight Matrix (PWM) discovery algorithm. By framing the identification of the PWM as an optimization problem over the two parameters of the method, we are able to discover known binding sites for several genes and to propose reasonable signatures for genes that have not been previously characterized.
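To make the notion of a PWM concrete, the following is a minimal sketch of how a log-odds matrix can be built from a handful of aligned binding sites and then used to scan a sequence. It is not the discovery algorithm used in the paper, which the abstract does not specify. The function names (`build_pwm`, `score_window`, `best_hit`), the uniform background, the pseudocount of 1, and the example sites are all illustrative assumptions.

```python
import math

BASES = "ACGT"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log2-odds PWM from equal-length aligned sites.

    Assumption: sites contain only A, C, G, T (no ambiguity codes).
    Returns one dict per position, mapping base -> log2(p / background).
    """
    if not sites:
        raise ValueError("need at least one site")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    # Assumed uniform background; real genomes are not uniform.
    background = background or {base: 0.25 for base in BASES}

    pwm = []
    for pos in range(width):
        # Start every count at the pseudocount so unseen bases
        # never produce log(0).
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background[base])
             for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (offset, score) of the highest-scoring window."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence shorter than the PWM")
    offsets = range(len(sequence) - width + 1)
    best = max(offsets,
               key=lambda i: score_window(pwm, sequence[i:i + width]))
    return best, score_window(pwm, sequence[best:best + width])


if __name__ == "__main__":
    # Hypothetical aligned sites, for illustration only.
    example_sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
    matrix = build_pwm(example_sites)
    print(best_hit(matrix, "AAATGACTCAAA"))
```

In a scan of real regulatory regions, the score threshold that separates a hit from background would be a tunable choice; this sketch simply reports the single best window.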
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Banerjee, N., Califano, A. (2006). Transcription Factor Centric Discovery of Regulatory Elements in Mammalian Genomes Using Alignment-Independent Conservation Maps. In: Bourque, G., El-Mabrouk, N. (eds) Comparative Genomics. RCG 2006. Lecture Notes in Computer Science, vol 4205. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11864127_16
DOI: https://doi.org/10.1007/11864127_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44529-6
Online ISBN: 978-3-540-44530-2
eBook Packages: Computer Science, Computer Science (R0)