Consensus Folding of Unaligned RNA Sequences Revisited

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2005)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3500)

Abstract

As one of the earliest problems in computational biology, RNA secondary structure prediction (sometimes referred to as “RNA folding”) has attracted renewed attention thanks to the recent discovery of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the “consensus folding” approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input to the problem. However, computing the seed alignment is itself a challenging problem for diverged RNA families.

In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we use primary sequence information and thermodynamic stability simultaneously for prediction. We show that our method can predict the correct common RNA secondary structures even when given only a limited number of unaligned RNA sequences, and that it outperforms current algorithms in sensitivity and accuracy.
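To make the notion of a "putative stack" concrete, the following is a minimal sketch that enumerates maximal runs of consecutive complementary base pairs in a single sequence. It is an illustration only, not the authors' algorithm: the function name `putative_stacks`, the parameters `min_len` and `min_loop`, and the choice to allow G-U wobble pairs are all assumptions, and no thermodynamic scoring or cross-sequence matching is attempted.

```python
# Pairing rules: Watson-Crick plus G-U wobble (an assumption for this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def putative_stacks(seq, min_len=3, min_loop=3):
    """Enumerate maximal stacks in seq.

    Each result (i, j, length) means bases i, i+1, ..., i+length-1 pair with
    bases j, j-1, ..., j-length+1 (0-based), with at least min_loop unpaired
    bases enclosed by the innermost pair.
    """
    n = len(seq)
    stacks = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Only start at the outermost pair, so each stack is reported once.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while bases pair and the hairpin loop stays long enough.
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stacks.append((i, j, length))
    return stacks


if __name__ == "__main__":
    print(putative_stacks("GGGAAAACCC"))
```

For the toy hairpin `GGGAAAACCC` this reports a single stack of three G-C pairs closing a four-base loop. A consensus method in the spirit of the abstract would additionally score each stack for stability and look for stacks that correspond across the input sequences; both steps are left out here.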

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bafna, V., Tang, H., Zhang, S. (2005). Consensus Folding of Unaligned RNA Sequences Revisited. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2005. Lecture Notes in Computer Science, vol 3500. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11415770_13

  • DOI: https://doi.org/10.1007/11415770_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25866-7

  • Online ISBN: 978-3-540-31950-4

  • eBook Packages: Computer Science, Computer Science (R0)
