Abstract
ExpaRNA’s core algorithm computes, for two fixed RNA structures, a maximal non-overlapping set of maximal exact matchings. We introduce an algorithm ExpaRNA-P that solves the lifted problem of finding such sets of exact matchings in entire Boltzmann-distributed structure ensembles of two RNAs. Due to a novel kind of structural sparsification, the new algorithm maintains the time and space complexity of the algorithm for fixed input structures. Furthermore, we generalized the chaining algorithm of ExpaRNA in order to compute a compatible subset of ExpaRNA-P’s exact matchings. We show that ExpaRNA-P outperforms ExpaRNA in BRAliBase 2.1 benchmarks, where we pass the chained exact matchings as anchor constraints to the RNA alignment tool LocARNA. Compared to LocARNA, this novel approach shows similar accuracy but is six times faster.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
The FANTOM Consortium: The transcriptional landscape of the mammalian genome. Science 309(5740), 1559–1563 (2005)
Cheng, J., Kapranov, P., Drenkow, J., Dike, S., Brubaker, S., Patel, S., Long, J., Stern, D., Tammana, H., Helt, G., Sementchenko, V., Piccolboni, A., Bekiranov, S., Bailey, D.K., Ganesh, M., Ghosh, S., Bell, I., Gerhard, D.S., Gingeras, T.R.: Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science 308, 1149–1154 (2005)
Bertone, P., Stoc, V., Royce, T.E., Rozowsky, J.S., Urban, A.E., Zhu, X., Rinn, J.L., Tongprasit, W., Samanta, M., Weissman, S., Gerstein, M., Snyder, M.: Global identification of human transcribed sequences with genome tiling arrays. Science 306, 2242–2246 (2004)
Kapranov, P., Willingham, A.T., Gingeras, T.R.: Genome-wide transcription and the implications for genomic organization. Nat. Rev. Genet. 8(6), 413–423 (2007)
Mattick, J.S., Taft, R.J., Faulkner, G.J.: A global view of genomic information - moving beyond the gene and the master regulator. Trends in Genetics (2009)
Consortium, A.F.B., Backofen, R., Bernhart, S.H., Flamm, C., Fried, C., Fritzsch, G., Hackermuller, J., Hertel, J., Hofacker, I.L., Missal, K., Mosig, A., Prohaska, S.J., Rose, D., Stadler, P.F., Tanzer, A., Washietl, S., Will, S.: RNAs everywhere: genome-wide annotation of structured RNAs. J. Exp. Zoolog. B. Mol. Dev. Evol. 308(1), 1–25 (2007)
Rivas, E., Eddy, S.R.: Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics 2(1), 8 (2001)
Washietl, S., Hofacker, I.L.: Identifying structural noncoding RNAs using RNAz. In: Curr. Protoc. Bioinformatics, ch.12, Unit 12.7 (2007)
Pedersen, J.S., Bejerano, G., Siepel, A., Rosenbloom, K., Lindblad-Toh, K., Lander, E.S., Kent, J., Miller, W., Haussler, D.: Identification and Classification of Conserved RNA Secondary Structures in the Human Genome. PLoS Comput. Biol. 2(4), e33 (2006)
Will, S., Reiche, K., Hofacker, I.L., Stadler, P.F., Backofen, R.: Inferring non-coding RNA families and classes by means of genome-scale structure-based clustering. PLOS Computational Biology 3(4), e65 (2007)
Kaczkowski, B., Torarinsson, E., Reiche, K., Havgaard, J.H., Stadler, P.F., Gorodkin, J.: Structural profiles of human miRNA families from pairwise clustering. Bioinformatics 25(3), 291–294 (2009)
Parker, B.J., Moltke, I., Roth, A., Washietl, S., Wen, J., Kellis, M., Breaker, R., Pedersen, J.S.: New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes. Genome Res. (2011)
Höchsmann, M., Töller, T., Giegerich, R., Kurtz, S.: Local similarity in RNA secondary structures. In: Proceedings of Computational Systems Bioinformatics (CSB 2003), vol. 2, pp. 159–168. IEEE Computer Society (2003)
Siebert, S., Backofen, R.: MARNA: multiple alignment and consensus structure prediction of RNAs based on sequence structure comparisons. Bioinformatics 21(16), 3352–3359 (2005)
Sankoff, D.: Simultaneous solution of the RNA folding, alignment and protosequence problems. SIAM J. Appl. Math. 45(5), 810–825 (1985)
Havgaard, J.H., Lyngso, R.B., Stormo, G.D., Gorodkin, J.: Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%. Bioinformatics 21(9), 1815–1824 (2005)
Mathews, D.H., Turner, D.H.: Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. Journal of Molecular Biology 317(2), 191–203 (2002)
Hofacker, I.L., Bernhart, S.H., Stadler, P.F.: Alignment of RNA base pairing probability matrices. Bioinformatics 20(14), 2222–2227 (2004)
McCaskill, J.S.: The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29(6-7), 1105–1119 (1990)
Gorodkin, J., Heyer, L., Stormo, G.: Finding the most significant common sequence and structure motifs in a set of RNA sequences. Nucleic Acids Res. 25(18), 3724–3732 (1997)
Bradley, R.K., Pachter, L., Holmes, I.: Specific alignment of structured RNA: stochastic grammars and sequence annealing. Bioinformatics 24(23), 2677–2683 (2008)
Torarinsson, E., Havgaard, J.H., Gorodkin, J.: Multiple structural alignment and clustering of RNA sequences. Bioinformatics 23(8), 926–932 (2007)
Bauer, M., Klau, G.W., Reinert, K.: Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization. BMC Bioinformatics 8, 271 (2007)
Do, C.B., Foo, C.S., Batzoglou, S.: A max-margin model for efficient simultaneous alignment and folding of RNA sequences. Bioinformatics 24(13), i68–i76 (2008)
Heyne, S., Will, S., Beckstette, M., Backofen, R.: Lightweight comparison of RNAs based on exact sequence-structure matches. Bioinformatics 25(16), 2095–2102 (2009)
Backofen, R., Siebert, S.: Fast detection of common sequence structure patterns in RNAs. Journal of Discrete Algorithms 5(2), 212–228 (2007)
Wexler, Y., Zilberstein, C., Ziv-Ukelson, M.: A study of accessible motifs and RNA folding complexity. Journal of Computational Biology 14(6), 856–872 (2007)
Havgaard, J.H., Torarinsson, E., Gorodkin, J.: Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix. PLoS Comput. Biol. 3(10), 1896–1908 (2007)
Ziv-Ukelson, M., Gat-Viks, I., Wexler, Y., Shamir, R.: A Faster Algorithm for RNA Co-folding. In: Crandall, K.A., Lagergren, J. (eds.) WABI 2008. LNCS (LNBI), vol. 5251, pp. 174–185. Springer, Heidelberg (2008)
Backofen, R., Tsur, D., Zakov, S., Ziv-Ukelson, M.: Sparse RNA Folding: Time and Space Efficient Algorithms. In: Kucherov, G., Ukkonen, E. (eds.) CPM 2009. LNCS, vol. 5577, pp. 249–262. Springer, Heidelberg (2009)
Salari, R., Möhl, M., Will, S., Sahinalp, S.C., Backofen, R.: Time and Space Efficient RNA-RNA Interaction Prediction via Sparse Folding. In: Berger, B. (ed.) RECOMB 2010. LNCS, vol. 6044, pp. 473–490. Springer, Heidelberg (2010)
Backofen, R., Will, S.: Local sequence-structure motifs in RNA. Journal of Bioinformatics and Computational Biology (JBCB) 2(4), 681–698 (2004)
Otto, W., Will, S., Backofen, R.: Structure local multiple alignment of RNA. In: Proceedings of German Conference on Bioinformatics (GCB 2008). LNI, Gesellschaft für Informatik (GI), vol. P-136, pp. 178–188 (2008)
Wilm, A., Mainz, I., Steger, G.: An enhanced RNA alignment benchmark for sequence alignment programs. Algorithms Mol. Biol. 1, 19 (2006)
Gardner, P.P., Wilm, A., Washietl, S.: A benchmark of multiple sequence alignment programs upon structural RNAs. Nucleic Acids Research 33(8), 2433–2439 (2005)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schmiedl, C. et al. (2012). Exact Pattern Matching for RNA Structure Ensembles. In: Chor, B. (eds) Research in Computational Molecular Biology. RECOMB 2012. Lecture Notes in Computer Science(), vol 7262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29627-7_27
Download citation
DOI: https://doi.org/10.1007/978-3-642-29627-7_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29626-0
Online ISBN: 978-3-642-29627-7
eBook Packages: Computer ScienceComputer Science (R0)