Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Relationship between superstring and compression measures: : New insights on the greedy conjecture

Published: 20 August 2018 Publication History

Abstract

A superstring of a set of words is a string that contains each input word as a substring. Given such a set, the Shortest Superstring Problem (SSP) asks for a superstring of minimum length. SSP is an important theoretical problem related to the Asymmetric Travelling Salesman Problem, and also has practical applications in data compression and in bioinformatics. Indeed, it models the question of assembling a genome from a set of sequencing reads. Unfortunately, SSP is known to be NP-hard even on a binary alphabet and also hard to approximate with respect to the superstring length or to the compression achieved by the superstring. Even the variant in which all words share the same length r, called r-SSP, is NP-hard whenever r > 2. Numerous involved approximation algorithms achieve approximation ratio above 2 for the superstring, but remain difficult to implement in practice. In contrast the greedy conjecture asked in 1988 whether a simple greedy algorithm achieves ratio of 2 for SSP. Here, we present a novel approach to bound the superstring approximation ratio with the compression ratio, which, when applied to the greedy algorithm, shows a 2 approximation ratio for 3-SSP, and also that greedy achieves ratios smaller than 2. This leads to a new version of the greedy conjecture.

References

[1]
A. Blum, T. Jiang, M. Li, J. Tromp, M. Yannakakis, Linear approximation of shortest superstrings, J. ACM 41 (4) (1994) 630–647.
[2]
B. Cazaux, R. Cánovas, E. Rivals, Shortest DNA cyclic cover in compressed space, in: Data Compression Conference, DCC, IEEE Computer Society Press, 2016, pp. 536–545.
[3]
B. Cazaux, E. Rivals, Approximation of greedy algorithms for Max-ATSP, maximal compression, maximal cycle cover, and shortest cyclic cover of strings, in: Proc. of Prague Stringology Conference, (PSC), Czech Technical Univ. Prague, 2014, pp. 148–161. http://www.stringology.org/event/2014/p14.html.
[4]
B. Cazaux, E. Rivals, The power of greedy algorithms for approximating Max-ATSP, Cyclic Cover, and superstrings, Discrete Appl. Math. 212 (2016) 48–60,. Stringology Algorithms.
[5]
M. Demange, V.T. Paschos, On an approximation measure founded on the links between optimization and polynomial approximation theory, Theoret. Comput. Sci. 158 (1&2) (1996) 117–141,.
[6]
G. Fici, T. Kociumaka, J. Radoszewski, W. Rytter, T. Walen, On the greedy algorithm for the Shortest Common Superstring problem with reversals, Inform. Process. Lett. 116 (3) (2016) 245–251.
[7]
J. Gallant, D. Maier, J.A. Storer, On finding minimal length superstrings, J. Comput. System Sci. 20 (1980) 50–58.
[8]
T.P. Gevezes, L.S. Pitsoulis, Optimization in science and engineering, in: Honor of the 60th Birthday of Panos M. Pardalos, Springer New York, New York, NY, 2014, pp. 189–227,. (Chapter). The Shortest Superstring Problem.
[9]
A. Golovnev, A. Kulikov, I. Mihajlin, Approximating shortest superstring problem using de Bruijn graphs, in: Combinatorial Pattern Matching, in: LNCS, Vol. 7922, Springer Verlag, 2013, pp. 120–129.
[10]
H. Kaplan, N. Shafrir, The greedy algorithm for shortest superstrings, Inform. Process. Lett. 93 (1) (2005) 13–17.
[11]
A.S. Kulikov, S. Savinov, E. Sluzhaev, Greedy conjecture for strings of length 4, in: Combinatorial Pattern Matching - 26th Annual Symposium, CPM 2015, Ischia Island, Italy, June 29 - July 1, 2015, Proceedings, 2015, pp. 307–315.
[12]
B. Ma, Why greed works for shortest common superstring problem, Theor. Comput. Sci. 410 (51) (2009) 5374–5381.
[13]
M. Mucha, Lyndon words and short superstrings, in: Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2013, pp. 958–972.
[14]
K.E. Paluch, Better Approximation Algorithms for Maximum Asymmetric Traveling Salesman and Shortest Superstring, CoRR abs/1401.3670.
[15]
H.J. Romero, C.A. Brizuela, A. Tchernykh, An experimental comparison of approximation algorithms for the shortest common superstring problem, in: 5th Mexican International Conference on Computer Science, 2004, pp. 27–34.
[16]
J. Tarhio, E. Ukkonen, A greedy approximation algorithm for constructing shortest common superstrings, Theor. Comput. Sci. 57 (1988) 131–145.
[17]
M. Weinard, G. Schnitger, On the greedy superstring conjecture, SIAM J. Discrete Math. 20 (2) (2006) 502–522.
[18]
A. Zaritsky, M. Sipper, The preservation of favored building blocks in the struggle for fitness: the puzzle algorithm, IEEE Trans. Evol. Comput. 8 (5) (2004) 443–455.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Discrete Applied Mathematics
Discrete Applied Mathematics  Volume 245, Issue C
Aug 2018
265 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 20 August 2018

Author Tags

  1. Approximation algorithm
  2. Shortest Common Superstring Problem
  3. Stringology
  4. Data compression
  5. Assembly
  6. Greedy conjecture

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 0
    Total Downloads
  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

View Options

View options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media