Abstract
Many algorithms have been proposed for finding biologically significant motifs in promoter regions. They fall into two large families: combinatorial methods and probabilistic methods. Probabilistic methods have been used more extensively because their output is easier to interpret. Combinatorial methods have the potential to identify hard-to-detect motifs, but their output is much harder to interpret, since it may consist of hundreds or thousands of motifs. In this work, we propose a method that processes the output of combinatorial motif finders to find groups of motifs that represent variations of the same motif, thus reducing the output to a manageable size. The method builds a graph that represents the co-occurrences of motifs and then finds communities in this graph. We show that this approach leads to a method that is as easy to use as a probabilistic motif finder and as sensitive to low-quorum motifs as a combinatorial motif finder. The method was integrated with two combinatorial motif finders and made available on the Web.
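The graph-and-communities idea can be illustrated with a minimal sketch. It is not the authors' implementation: the edge weight (number of promoter sequences shared by two motifs), the naive greedy modularity merge, the function names, and the toy motif data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(occurrences):
    """Build weighted edges between motifs that occur in the same sequences.

    occurrences maps motif -> set of sequence ids (hypothetical input format).
    The weight is the number of shared sequences, an assumed definition.
    """
    edges = {}
    for a, b in combinations(sorted(occurrences), 2):
        shared = len(occurrences[a] & occurrences[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def modularity(edges, communities):
    """Newman-Girvan modularity of a partition of a weighted graph."""
    m = sum(edges.values())
    if m == 0:
        return 0.0
    label = {node: i for i, group in enumerate(communities) for node in group}
    inside = defaultdict(float)   # edge weight inside each community
    degree = defaultdict(float)   # total weighted degree of each community
    for (a, b), w in edges.items():
        degree[label[a]] += w
        degree[label[b]] += w
        if label[a] == label[b]:
            inside[label[a]] += w
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def greedy_communities(nodes, edges):
    """Repeatedly apply the merge that most increases modularity.

    Naive version that recomputes modularity for every candidate merge;
    adequate for small graphs only.
    """
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        best = None
        for i, j in combinations(range(len(communities)), 2):
            merged = [c for k, c in enumerate(communities) if k not in (i, j)]
            merged.append(communities[i] | communities[j])
            q = modularity(edges, merged)
            if q > best_q:
                best_q, best = q, merged
        if best is None:          # no merge improves modularity
            break
        communities = best
    return communities


# Toy data (invented): motif -> promoter sequences in which it was reported.
occurrences = {
    "TATAAA": {"s1", "s2", "s3"},
    "TATAAT": {"s1", "s2"},
    "TATATA": {"s2", "s3"},
    "CACGTG": {"s4", "s5"},
    "CACGTT": {"s3", "s4", "s5"},
}
edges = cooccurrence_graph(occurrences)
groups = greedy_communities(sorted(occurrences), edges)
for group in groups:
    print(sorted(group))
```

Each resulting group is a candidate family of variants of one motif, which is how a list of hundreds of reported motifs could shrink to a handful of groups. For realistic graph sizes, a dedicated community-detection implementation would be preferable to this quadratic-per-step loop.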
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Francisco, A.P., Oliveira, A.L., Freitas, A.T. (2008). Identification of Transcription Factor Binding Sites in Promoter Regions by Modularity Analysis of the Motif Co-occurrence Graph. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science, vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_21
DOI: https://doi.org/10.1007/978-3-540-79450-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79449-3
Online ISBN: 978-3-540-79450-9
eBook Packages: Computer Science (R0)