Abstract
Given a set S = {s 1, s 2, …, s n } of strings of equal length L and an integer d, the closest string problem (CSP) requires the computation of a string s of length L such that d(s, s i ) ≤ d for each s i ∈ S, where d(s, s i ) is the Hamming distance between s and s i . The problem is NP-hard and has been extensively studied in the context of approximation algorithms and parameterized algorithms. Parameterized algorithms provide the most practical solutions to its real-life applications in bioinformatics. In this paper we develop the first randomized parameterized algorithms for CSP. Not only are the randomized algorithms much simpler than their deterministic counterparts, their expected-time complexities are also significantly better than the previously best known (deterministic) algorithms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Böcker, S., Jahn, K., Mixtacki, J., Stoye, J.: Computation of median gene clusters. Journal of Computational Biology 16(8), 1085–1099 (2009)
Boucher, C., Brown, D.G.: Detecting motifs in a large data set: Applying probabilistic insights to motif finding. In: Rajasekaran, S. (ed.) BICoB 2009. LNCS, vol. 5462, pp. 139–150. Springer, Heidelberg (2009)
Ben-Dor, A., Lancia, G., Perone, J., Ravi, R.: Banishing bias from consensus sequences. In: Hein, J., Apostolico, A. (eds.) CPM 1997. LNCS, vol. 1264, pp. 247–261. Springer, Heidelberg (1997)
Chen, Z.-Z., Ma, B., Wang, L.: A three-string approach to the closest string problem. Journal of Computer and System Sciences 78, 164–178 (2012)
Chen, Z.-Z., Wang, L.: Fast exact algorithms for the closest string and substring problems with application to the planted (ℓ,d)-motif model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 8(5), 1400–1410 (2011)
Davila, J., Balla, S., Rajasekaran, S.: Space and time efficient algorithms for planted motif search. In: Proc. of the International Conference on Computational Science, pp. 822–829 (2006)
Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: Genetic design of drugs without side-effects. SIAM J. Comput. 32(4), 1073–1090 (2003)
Dopazo, J., Rodríguez, A., Sáiz, J.C., Sobrino, F.: Design of primers for PCR amplification of highly variable genomes. CABIOS 9, 123–125 (1993)
Evans, P.A., Smith, A.D.: Complexity of approximating closest substring problems. In: Proc. of the 14th International Symposium on Foundations of Complexity Theory, pp. 210–221 (2003)
Fellows, M.R., Gramm, J., Niedermeier, R.: On the parameterized intractability of motif search problems. Combinatorica 26(2), 141–167 (2006)
Frances, M., Litman, A.: On covering problems of codes. Theoret. Comput. Sci. 30, 113–119 (1997)
Gramm, J., Guo, J., Niedermeier, R.: On exact and approximation algorithms for distinguishing substring selection. In: Lingas, A., Nilsson, B.J. (eds.) FCT 2003. LNCS, vol. 2751, pp. 195–209. Springer, Heidelberg (2003)
Gramm, J., Hüffner, F., Niedermeier, R.: Closest strings, primer design, and motif search. In: Florea, L., et al (eds.), Currents in Computational Molecular Biology. Poster Abstracts of RECOMB 2002, pp. 74–75 (2002)
Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37, 25–42 (2003)
Hufsky, F., Kuchenbecker, L., Jahn, K., Stoye, J., Böcker, S.: Swiftly computing center strings. In: Moulton, V., Singh, M. (eds.) WABI 2010. LNCS, vol. 6293, pp. 325–336. Springer, Heidelberg (2010)
Jiao, Y., Xu, J., Li, M.: On the k-closest substring and k-consensus pattern problems. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 130–144. Springer, Heidelberg (2004)
Lanctot, K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string search problems. Inform. and Comput. 185, 41–55 (2003)
Li, M., Ma, B., Wang, L.: On the closest string and substring problems. J. ACM 49(2), 157–171 (2002)
Lucas, K., Busch, M., Mösinger, S., Thompson, J.A.: An improved microcomputer program for finding gene- or gene family-specific oligonucleotides suitable as primers for polymerase chain reactions or as probes. CABIOS 7, 525–529 (1991)
Ma, B., Sun, X.: More efficient algorithms for closest string and substring problems. SIAM J. Comput. 39(4), 1432–1443 (2010)
Marx, D.: Closest substring problems with small distances. SIAM J. Comput. 38(4), 1382–1410 (2008)
Marx, D.: Randomized techniques for parameterized algorithms. In: Thilikos, D.M., Woeginger, G.J. (eds.) IPEC 2012. LNCS, vol. 7535, p. 2. Springer, Heidelberg (2012)
Mauch, H., Melzer, M.J., Hu, J.S.: Genetic algorithm approach for the closest string problem. In: Proc. of the 2nd IEEE Computer Society Bioinformatics Conference (CSB), pp. 560–561 (2003)
Meneses, C.N., Lu, Z., Oliveira, C.A.S., Pardalos, P.M.: Optimal solutions for the closest-string problem via integer programming. INFORMS J. Comput. (2004)
Nicolas, F., Rivals, E.: Complexities of the centre and median string problems. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 315–327. Springer, Heidelberg (2003)
Proutski, V., Holme, E.C.: Primer master: A new program for the design and analysis of PCR primers. CABIOS 12, 253–255 (1996)
Stojanovic, N., Berman, P., Gumucio, D., Hardison, R., Miller, W.: A linear-time algorithm for the 1-mismatch problem. In: Rau-Chaplin, A., Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 1997. LNCS, vol. 1272, pp. 126–135. Springer, Heidelberg (1997)
Wang, L., Dong, L.: Randomized algorithms for motif detection. J. Bioinform. Comput. Biol. 3(5), 1039–1052 (2005)
Wang, L., Zhu, B.: Efficient algorithms for the closest string and distinguishing string selection problems. In: Deng, X., Hopcroft, J.E., Xue, J. (eds.) FAW 2009. LNCS, vol. 5598, pp. 261–270. Springer, Heidelberg (2009)
Wang, Y., Chen, W., Li, X., Cheng, B.: Degenerated primer design to amplify the heavy chain variable region from immunoglobulin cDNA. BMC Bioinform. 7(suppl. 4), S9 (2006)
Zhao, R., Zhang, N.: A more efficient closest string algorithm. In: Proc. of the 2nd International Conference on Bioinformatics and Computational Biology (2010)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, ZZ., Ma, B., Wang, L. (2014). Randomized and Parameterized Algorithms for the Closest String Problem. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_11
Download citation
DOI: https://doi.org/10.1007/978-3-319-07566-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07565-5
Online ISBN: 978-3-319-07566-2
eBook Packages: Computer ScienceComputer Science (R0)