Abstract
A Lyndon word is a primitive string which is lexicographically smallest among cyclic permutations of its characters. Lyndon words are used for constructing bases in free Lie algebras, constructing de Bruijn sequences, finding the lexicographically smallest or largest substring in a string, and succinct suffix–prefix matching of highly periodic strings. In this paper, we extend the concept of the Lyndon word to two dimensions. We introduce the 2D Lyndon word and use it to capture 2D horizontal periodicity of a matrix in which each row is highly periodic, and to efficiently solve 2D horizontal suffix–prefix matching among a set of patterns. This yields a succinct and efficient algorithm for 2D dictionary matching. We present several algorithms that compute the 2D Lyndon word that represents a matrix. The final algorithm achieves linear time complexity even when the least common multiple of the periods of the rows is exponential in the matrix width.
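The one-dimensional definition above, and the point about the least common multiple of row periods, can be illustrated with a minimal sketch. The function names `is_lyndon`, `period`, and `lcm_of_periods` are hypothetical and do not come from the paper, and the naive quadratic checks are used only for clarity. They are not the linear-time algorithms the paper presents.

```python
from functools import reduce
from math import gcd


def is_lyndon(w: str) -> bool:
    """True if w is strictly smaller than each of its nontrivial rotations.

    Strict inequality also rules out non-primitive strings such as "abab",
    which equal one of their own rotations.
    """
    return len(w) > 0 and all(w < w[i:] + w[:i] for i in range(1, len(w)))


def period(s: str) -> int:
    """Smallest p >= 1 with s[i] == s[i + p] for every valid i (naive check)."""
    n = len(s)
    for p in range(1, n + 1):
        if all(s[i] == s[i + p] for i in range(n - p)):
            return p
    return n


def lcm_of_periods(rows) -> int:
    """Least common multiple of the periods of the given non-empty rows."""
    return reduce(lambda a, b: a * b // gcd(a, b), (period(r) for r in rows), 1)


rows = ["abababab", "abcabcab", "aaaaaaaa"]   # row periods 2, 3, 1
print(is_lyndon("aab"), is_lyndon("abab"), lcm_of_periods(rows))
```

Rows whose periods are pairwise coprime make the least common multiple grow as the product of the periods, which is why an approach that expands rows to a common period can become expensive.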
Notes
Lyndon word naming of the matrix rows takes \(O(m^2)\) time, which is linear in the size of the matrix [27].
Processing columns, even for this type of pattern, would have several additional drawbacks, including working only for matrices of uniform size in both dimensions.
Hashing techniques allow us to traverse a compressed trie of LWpos arrays in linear time.
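As a rough illustration of what naming a row by its Lyndon word could look like, the sketch below maps one periodic row to the Lyndon rotation of its period, together with the offset at which that rotation starts. This is an assumption about the general idea only: `row_name` is a hypothetical helper, the search is naive and quadratic rather than the linear-time method of [27], and it is not claimed to match the paper's LWpos arrays.

```python
def row_name(row: str):
    """Return (lyndon_word, offset) for a non-empty row (hypothetical helper).

    lyndon_word is the least rotation of the row's shortest period, and
    offset is the index within that period where the rotation begins.
    """
    n = len(row)
    # Smallest period p of the row, found by brute force.
    p = next(p for p in range(1, n + 1)
             if all(row[i] == row[i + p] for i in range(n - p)))
    u = row[:p]  # primitive, because p is the smallest period
    # Least rotation of a primitive string is unique, so min has no ties.
    offset = min(range(p), key=lambda i: u[i:] + u[:i])
    return u[offset:] + u[:offset], offset


print(row_name("abcabcab"), row_name("cabcabca"))
```

The two rows in the usage line share the Lyndon word `abc` and differ only in offset, so under this reading they are horizontal shifts of the same periodic structure.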
References
Aho, A.V., Corasick, M.J.: Efficient string matching: an aid to bibliographic search. Commun. ACM 18(6), 333–340 (1975)
Amir, A., Benson, G.: Two-dimensional periodicity and its applications. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 440–452 (1992)
Amir, A., Benson, G.: Two-dimensional periodicity in rectangular arrays. SIAM J. Comput. 27(1), 90–106 (1998)
Amir, A., Benson, G., Farach, M.: An alphabet independent approach to two-dimensional pattern matching. SIAM J. Comput. 23(2), 313–323 (1994)
Amir, A., Farach, M.: Two-dimensional dictionary matching. Inf. Process. Lett. 44(5), 233–239 (1992)
Apostolico, A., Crochemore, M.: Fast parallel Lyndon factorization with applications. Math. Syst. Theory 28(2), 89–108 (1995)
Baker, T.J.: A technique for extending rapid exact-match string matching to arrays of more than one dimension. SIAM J. Comput. 7, 533–541 (1978)
Bannai, H., Inenaga, S., Nakashima, Y., Takeda, M., Tsuruta, K.: A new characterization of maximal repetitions by Lyndon trees. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 562–571 (2015)
Bird, R.S.: Two dimensional pattern matching. Inf. Process. Lett. 6(5), 168–170 (1977)
Chemillier, M.: Periodic musical sequences and Lyndon words. Soft Comput. 8(9), 611–616 (2004)
Choi, Y., Lam, T.-W.: Two-dimensional dynamic dictionary matching. In: International Symposium on Algorithms and Computation (ISAAC), pp. 85–94 (1996)
Crochemore, M., Gasieniec, L., Hariharan, R., Muthukrishnan, S., Rytter, W.: A constant time optimal parallel algorithm for two-dimensional pattern matching. SIAM J. Comput. 27(3), 668–681 (1998)
Crochemore, M., Iliopoulos, C., Korda, M., Reid, J.: A failure function for multiple two-dimensional pattern matching. J. Comb. Math. Comb. Comput. 35, 225–238 (2000)
Delgrange, O., Rivals, E.: Star: an algorithm to search for tandem approximate repeats. Bioinformatics 20(16), 2812–2820 (2004)
Farhi, B.: Nontrivial lower bounds for the least common multiple of some finite sequences of integers. J. Number Theory 125(2), 393–411 (2007)
Fredricksen, H., Maiorana, J.: Necklaces of beads in k colors and k-ary de Bruijn sequences. Discrete Math. 23(3), 207–210 (1978)
Giancarlo, R.: A generalization of the suffix tree to square matrices, with applications. SIAM J. Comput. 24(3), 520–562 (1995)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences—Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)
Gusfield, D., Landau, G.M., Schieber, B.: An efficient algorithm for the all pairs suffix–prefix problem. Inf. Process. Lett. 41(4), 181–185 (1992)
Idury, R.M., Schäffer, A.A.: Multiple matching of rectangular patterns. Inf. Comput. 117(1), 78–90 (1995)
Kedem, Z.M., Landau, G.M., Palem, K.V.: Parallel suffix–prefix-matching algorithm and applications. SIAM J. Comput. 25(5), 998–1023 (1996)
Knuth, D.E.: The Art of Computer Programming, vol. 2. Addison Wesley, Redwood (1998)
Koshy, T.: Elementary Number Theory with Applications, 2nd edn. Academic Press, New York (2001)
Lothaire, M.: Applied Combinatorics on Words (Encyclopedia of Mathematics and Its Applications). Cambridge University Press, New York (2005)
Lyndon, R.C.: On Burnside’s problem. Trans. Am. Math. Soc. 77(2), 202–215 (1954)
Mucha, M.: Lyndon words and short superstrings. In: ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 958–972 (2013)
Neuburger, S., Sokol, D.: Succinct 2D dictionary matching. Algorithmica 65(3), 662–684 (2013)
Ohlebusch, E., Gog, S.: Efficient algorithms for the all-pairs suffix–prefix problem and the all-pairs substring-prefix problem. Inf. Process. Lett. 110(3), 123–128 (2010)
Acknowledgments
This work was supported in part by PSC-CUNY research award 65112-0043. The authors would like to thank Binyomin Balsam for his helpful discussions and his insight into the modular arithmetic solution.
Cite this article
Marcus, S., Sokol, D. 2D Lyndon Words and Applications. Algorithmica 77, 116–133 (2017). https://doi.org/10.1007/s00453-015-0065-z