Abstract
As one of the earliest problems in computational biology, the RNA secondary structure prediction problem (sometimes referred to as “RNA folding”) has attracted renewed attention thanks to the recent discovery of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the “consensus folding” approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input. However, computing the seed alignment is itself a challenging problem for diverged RNA families.
In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we use primary sequence information and thermodynamic stability simultaneously for prediction. We show that our method can predict the correct common RNA secondary structures even when given only a limited number of unaligned RNA sequences, and that it outperforms current algorithms in sensitivity and accuracy.
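To make the notion of a "putative stack" concrete, the following is a minimal sketch, not the authors' algorithm, of how candidate stacks could be enumerated in a single sequence. It assumes that a stack is a run of consecutive complementary base pairs, that G-U wobble pairs are allowed, and that hairpin loops contain at least three unpaired bases. The function name `putative_stacks` and the parameters `min_len` and `min_loop` are hypothetical. The sketch does not score thermodynamic stability or match stacks across sequences, which is the step the paper builds on.

```python
# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def putative_stacks(seq, min_len=3, min_loop=3):
    """Return maximal stacks as (i, j, length) tuples with 0-based indices.

    A stack pairs seq[i + k] with seq[j - k] for k in range(length), leaving
    at least min_loop unpaired bases inside the innermost pair.
    """
    n = len(seq)
    stacks = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a stack starting one step
            # further out, so that only maximal stacks are reported.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            # Extend inward while bases pair and the loop stays long enough.
            length = 0
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stacks.append((i, j, length))
    return stacks
```

For the toy hairpin `"GGGAAAACCC"`, `putative_stacks` reports the single stack `(0, 9, 3)`: positions 0-2 pair with positions 9-7. A consensus method would then look for stacks that recur, with similar sequence and stability, across the input sequences.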
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bafna, V., Tang, H., Zhang, S. (2005). Consensus Folding of Unaligned RNA Sequences Revisited. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2005. Lecture Notes in Computer Science, vol 3500. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11415770_13
DOI: https://doi.org/10.1007/11415770_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25866-7
Online ISBN: 978-3-540-31950-4
eBook Packages: Computer Science, Computer Science (R0)