Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2381716.2381799acmotherconferencesArticle/Chapter ViewAbstractPublication PagescubeConference Proceedingsconference-collections
research-article

Lyndon fountains and the Burrows-Wheeler transform

Published: 03 September 2012 Publication History

Abstract

In this paper we study Lyndon structures related to the Burrows-Wheeler Transform with potential application to bioinformatics. Next-Generation Sequencing techniques require the alignment of a large set of short reads (between dozens to hundreds of letters) on a reference sequence (millions of letters). The Burrows-Wheeler Transform has been used in various alignment programs which generally compute the Lyndon factorization of the reference sequence as a preprocessing step. We compute the quadratic factorization of all rotations of an input string and the Burrows-Wheeler Transform of a Lyndon substring. From the factored rotations we introduce the Lyndon fountain.

References

[1]
D. Adjeroh, T. Bell, and A. Mukherjee, The Burrows-Wheeler Transform: data compression, suffix arrays, and pattern matching, 2008.
[2]
P. Antoniou, J. W. Daykin, C. S. Iliopoulos, D. Kourie, L. Mouchard, and S. P. Pissis, Mapping uniquely occuring short sequences derived from high throughput technologies to a reference genome, Proceedings of the 9th IEEE International Conference on Information Technology and Applications in Biomedicine (ITAB 2009), 2009.
[3]
D. Breslauer, R. Grossi, and F. Mignosi, Simple real-time constant-space string matching, CPM, 2011, pp. 173--183.
[4]
M. Burrows and D. Wheeler, A block-sorting lossless data compression algorithm, Tech. report, Digital SRC Research Report 124, 1994.
[5]
M. Chemillier, Periodic musical sequences and Lyndon words, Soft Comput. 8 (2004), no. 9, 611--616.
[6]
K. T. Chen, R. H. Fox, and R. C. Lyndon., Free differential calculus, iv --- the quotient groups of the lower central series., (1958), 68:81--95.
[7]
M. Crochemore, J. Désarménien, and D. Perrin, A note on the Burrows-Wheeler Transformation, CoRR abs/cs/0502073 (2005).
[8]
M. Crochemore and D. Perrin, Two-way string matching, J. ACM 38 (1991), no. 3, 651--675.
[9]
O. Delgrange and E. Rivals, STAR: an algorithm to search for tandem approximate repeats, Bioinformatics 20 (2004), no. 16, 2812--2820.
[10]
J. P. Duval, Factorizing words over an ordered alphabet, J. Algorithms 4 (1983), no. 4, 363--381.
[11]
J. Y. Gil and D. A. Scott, A bijective string sorting transform, CoRR abs/1201.3077 (2012).
[12]
P. Ko and S. Aluru, Space efficient linear time construction of suffix arrays, CPM, 2003, pp. 200--210.
[13]
B. Langmead., C. Trapnell., M. Pop, and S. J. Salzberg, Ultrafast and memory-efficient alignment of short dna sequences to the human genome, Genome Biol 10 (2009), R25 -- R25.
[14]
H. Li and R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler Transform, Bioinformatics 25 (2009), no. 14, 1754--1760.
[15]
R. Li, C. Yu, Y. Li, T. W. Lam, S. M. Yiu, K. Kristiansen, and J. Wang, Soap2: an improved ultrafast tool for short read alignment, Bioinformatics 25 (2009), no. 15, 1966--1967.
[16]
M. Lothaire, Combinatorics on words, Reading, MA (1983); 2nd Edition, Cambridge University Press, Cambridge (1997)., Addison-Wesley, 1983.
[17]
L. Perret, A chosen ciphertext attack on a public key cryptosystem based on Lyndon words, IACR Cryptology ePrint Archive 2005 (2005), 14.
[18]
C. Reutenauer, Free lie algebras, London Math. Soc. Monographs New Ser. 7, Oxford University Press, 1993.
[19]
M. Salson, T. Lecroq, M. Léonard, and L. Mouchard, A four-stage algorithm for updating a Burrows-Wheeler Transform, Theoretical Computer Science 410 (2009), no. 43, 4350--4359.
[20]
B. Smyth, Computing patterns in strings, ACM Press Bks, Pearson/Addison-Wesley, 2003.
  1. Lyndon fountains and the Burrows-Wheeler transform

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Other conferences
      CUBE '12: Proceedings of the CUBE International Information Technology Conference
      September 2012
      879 pages
      ISBN:9781450311854
      DOI:10.1145/2381716
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      • CUOT: Curtin University of Technology

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 03 September 2012

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Burrows-Wheeler transform
      2. Lyndon fountain
      3. alignment
      4. factor
      5. factorization
      6. lexicographic order
      7. next-generation sequencing
      8. short reads

      Qualifiers

      • Research-article

      Conference

      CUBE '12
      Sponsor:
      • CUOT

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • 0
        Total Citations
      • 109
        Total Downloads
      • Downloads (Last 12 months)1
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 19 Nov 2024

      Other Metrics

      Citations

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media