Abstract
Given a set of n strings of length L and a radius d, the closest string problem asks for a new string t sol that is within a Hamming distance of d to each of the given strings. It is known that the problem is NP-hard and its optimization version admits a polynomial time approximation scheme (PTAS). Parameterized algorithms have been then developed to solve the problem when d is small. In this paper, with a new approach (called the 3-string approach), we first design a parameterized algorithm for binary strings that runs in \(O\left(nL + nd^3 6.731^d\right)\) time, while the previous best runs in \(O\left(nL + nd 8^d\right)\) time. We then extend the algorithm to arbitrary alphabet sizes, obtaining an algorithm that runs in \( O \left( nL + nd 1.612^d \left( \alpha^2 +1 - 2 \alpha^{-1} + \alpha^{-2} \right) ^{3d}\right) \) time, where \(\alpha = \sqrt[3]{\mathstrut \sqrt{|\Sigma|-1}+1 }\). This new time bound is better than the previous best for small alphabets, including the very important case where |Σ| = 4 (i.e., the case of DNA strings).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Ben-Dor, A., Lancia, G., Perone, J., Ravi, R.: Banishing bias from consensus sequences. In: Hein, J., Apostolico, A. (eds.) CPM 1997. LNCS, vol. 1264, pp. 247–261. Springer, Heidelberg (1997)
Chen, Z.Z., Wang, L.: Article 3A fast exact algorithm for the closest substring problem and its application to the planted (l,d)-motif model. In: TCBB (2009) (submitted for publication)
Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: Genetic design of drugs without side-effects. SIAM Journal on Computing 32(4), 1073–1090 (2003)
Dopazo, J., Rodríguez, A., Sáiz, J.C., Sobrino, F.: Design of primers for PCR amplification of highly variable genomes. CABIOS 9, 123–125 (1993)
Downey, R.G., Fellows, M.R.: Parameterized complexity. Monographs in Computer Science. Springer, New York (1999)
Evans, P.A., Smith, A.D.: Complexity of approximating closest substring problems. In: Lingas, A., Nilsson, B.J. (eds.) FCT 2003. LNCS, vol. 2751, pp. 210–221. Springer, Heidelberg (2003)
Fellows, M.R., Gramm, J., Niedermeier, R.: On the parameterized intractability of motif search problems. Combinatorica 26(2), 141–167 (2006)
Frances, M., Litman, A.: On covering problems of codes. Theoretical Computer Science 30, 113–119 (1997)
Gramm, J., Guo, J., Niedermeier, R.: On exact and approximation algorithms for distinguishing substring selection. In: Lingas, A., Nilsson, B.J. (eds.) FCT 2003. LNCS, vol. 2751, pp. 159–209. Springer, Heidelberg (2003)
Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37, 25–42 (2003)
Lanctot, K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string search problems. In: SODA 1999, pp. 633–642 (1999)
Li, M., Ma, B., Wang, L.: On the closest string and substring problems. Journal of the ACM 49(2), 157–171 (2002)
Ma, B., Sun, X.: More efficient algorithms for closest string and substring problems. In: RECOMB 2008, pp. 396–409 (2008)
Marx, D.: The closest substring problem with small distances. In: FOCS 2005, pp. 63–72 (2005)
Meneses, C.N., Lu, Z., Oliveira, C.A.S., Pardalos, P.M.: Optimal solutions for the closest-string problem via integer programming. INFORMS Journal on Computing (2004)
Nicolas, F., Rivals, E.: Complexities of the centre and median string problems. In: Proceedings of the 14th Annual Symposium on Combinatorial Pattern Matching, pp. 315–327 (2003)
Stojanovic, N., Berman, P., Gumucio, D., Hardison, R., Miller, W.: A linear-time algorithm for the 1-mismatch problem. In: Proceedings of the 5th International Workshop on Algorithms and Data Structures, pp. 126–135 (1997)
Wang, L., Dong, L.: Randomized algorithms for motif detection. Journal of Bioinformatics and Computational Biology 3(5), 1039–1052 (2005)
Wang, L., Zhu, B.: Efficient algorithms for the closest string and distinguishing string selection problems. In: The Third International Frontiers of Algorithmics Workshop, pp. 261–270 (2009)
Wang, Y., Chen, W., Li, X., Cheng, B.: Degenerated primer design to amplify the heavy chain variable region from immunoglobulin cDNA. BMC Bioinformatics 7(Suppl. 4), S9 (2006)
Zhao, R., Zhang, N.: A more efficient closest string algorithm. In: 2nd International Conference on Bioinformatics and Computational Biology (to appear, 2010)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, ZZ., Ma, B., Wang, L. (2010). A Three-String Approach to the Closest String Problem. In: Thai, M.T., Sahni, S. (eds) Computing and Combinatorics. COCOON 2010. Lecture Notes in Computer Science, vol 6196. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14031-0_48
Download citation
DOI: https://doi.org/10.1007/978-3-642-14031-0_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14030-3
Online ISBN: 978-3-642-14031-0
eBook Packages: Computer ScienceComputer Science (R0)