Abstract
Given m documents of total length n, we consider the problem of finding a longest string common to at least d ≥ 2 of the documents. This problem is known as the longest common substring (LCS) problem and has a classic \(\mathcal{O}(n)\) space and \(\mathcal{O}(n)\) time solution (Weiner [FOCS’73], Hui [CPM’92]). However, the use of linear space is impractical in many applications. In this paper we show that for any trade-off parameter 1 ≤ τ ≤ n, the LCS problem can be solved in \(\mathcal{O}(\tau)\) space and \(\mathcal{O}(n^2/\tau)\) time, thus providing the first smooth deterministic time-space trade-off from constant to linear space. The result uses a new and very simple algorithm, which computes a τ-additive approximation to the LCS in \(\mathcal{O}(n^2/\tau)\) time and \(\mathcal{O}(1)\) space. We also show a time-space trade-off lower bound for deterministic branching programs, which implies that any deterministic RAM algorithm solving the LCS problem on documents from a sufficiently large alphabet in \(\mathcal{O}(\tau)\) space must use \(\Omega(n\sqrt{\log(n/(\tau\log n))/\log\log(n/(\tau\log n)})\) time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Afek, Y., Bremler-Barr, A., Landau Feibish, S.: Automated signature extraction for high volume attacks. In: Proc. 9th ANCS, pp. 147–156 (2013)
Beame, P.: Clifford, R., Machmouchi, W.: Element Distinctness, Frequency Moments, and Sliding Windows. In: Proc. 54th FOCS, pp. 290–299 (2013)
Beame, P., Saks, M., Sun, X., Vee, E.: Time-Space Trade-Off Lower Bounds for Randomized Computation of Decision Problems. Journal of the ACM 50(2), 154–195 (2003)
Borodin, A., Cook, S.A.: A Time-Space Tradeoff for Sorting on a General Sequential Model of Computation. SIAM Journal on Computing 11(2), 287–297 (1982)
Breslauer, D., Grossi, R., Mignosi, F.: Simple Real-Time Constant-Space String Matching. Theor. Comput. Sci. 483, 2–9 (2013)
Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on Strings. Cambridge University Press (2007)
Farach-Colton, M.: Optimal Suffix Tree Construction with Large Alphabets. In: Proc. 38th FOCS, pp. 137–143 (1997)
Grossi, R., Vitter, J.S.: Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching. SIAM Journal on Computing 35(2), 378–407 (2005)
Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press (1997)
Han, Y.: Deterministic sorting in O(nloglogn) time and linear space. Journal of Algorithms 50(1), 96–105 (2004)
Hui, L.C.K.: Color Set Size Problem with Applications to String Matching. In: Apostolico, A., Galil, Z., Manber, U., Crochemore, M. (eds.) CPM 1992. LNCS, vol. 644, pp. 230–243. Springer, Heidelberg (1992)
Kreibich, C., Crowcroft, J.: Honeycomb: Creating Intrusion Detection Signatures Using Honeypots. ACM SIGCOMM Comput. Commun. Rev. 34(1), 51–56 (2004)
Navarro, G., Mäkinen, V.: Compressed Full-Text Indexes. ACM Computing Surveys (CSUR) 39(1), 2 (2007)
Ružić, M.: Constructing Efficient Dictionaries in Close to Sorting Time. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part I. LNCS, vol. 5125, pp. 84–95. Springer, Heidelberg (2008)
Starikovskaya, T., Vildhøj, H.W.: Time-Space Trade-Offs for the Longest Common Substring Problem. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 223–234. Springer, Heidelberg (2013)
Wang, K., Cretu, G.F., Stolfo, S.J.: Anomalous Payload-Based Worm Detection and Signature Generation. In: Valdes, A., Zamboni, D. (eds.) RAID 2005. LNCS, vol. 3858, pp. 227–246. Springer, Heidelberg (2006)
Weiner, P.: Linear Pattern Matching Algorithms. In: Proc. 14th FOCS (SWAT), pp. 1–11 (1973)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kociumaka, T., Starikovskaya, T., Vildhøj, H.W. (2014). Sublinear Space Algorithms for the Longest Common Substring Problem. In: Schulz, A.S., Wagner, D. (eds) Algorithms - ESA 2014. ESA 2014. Lecture Notes in Computer Science, vol 8737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44777-2_50
Download citation
DOI: https://doi.org/10.1007/978-3-662-44777-2_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44776-5
Online ISBN: 978-3-662-44777-2
eBook Packages: Computer ScienceComputer Science (R0)