Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/1875616.1875627guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

On table arrangements, scrabble freaks, and jumbled pattern matching

Published: 02 June 2010 Publication History

Abstract

Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of Parikh vector q (a "jumbled string") in the text s requires to find a substring t of s with p(t) = q. The corresponding decision problem is to verify whether at least one such match exists. So, for example for the alphabet Σ = {a,b,c}, the string s = abaccbabaaa has Parikh vector p(s) = (6,3,2), and the Parikh vector q = (2,1,1) appears once in s in position (1,4). Like its more precise counterpart, the renown Exact String Matching, Jumbled Pattern Matching has ubiquitous applications, e.g., string matching with a dyslectic word processor, table rearrangements, anagram checking, Scrabble playing and, allegedly, also analysis of mass spectrometry data. We consider two simple algorithms for Jumbled Pattern Matching and use very complicated data structures and analytic tools to show that they are not worse than the most obvious algorithm. We also show that we can achieve non-trivial efficient average case behavior, but that's less fun to describe in this abstract so we defer the details to the main part of the article, to be read at the reader's risk... well, at the reader's discretion.

References

[1]
Amir, A., Apostolico, A., Landau, G.M., Satta, G.: Efficient text fingerprinting via Parikh mapping. J. Discrete Algorithms 1(5-6), 409-421 (2003).
[2]
Babai, L., Felzenszwalb, P.F.: Computing rank-convolutions with a mask. ACM Trans. Algorithms 6(1), 1-13 (2009).
[3]
Bellman, R., Karush, W.: Mathematical programming and the maximum transform. Journal of the Soc. for Industrial and Applied Math. 10(3), 550-567 (1962).
[4]
Benson, G.: Composition alignment. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 447-461. Springer, Heidelberg (2003).
[5]
Böcker, S.: Simulating multiplexed SNP discovery rates using base-specific cleavage and mass spectrometry. Bioinformatics 23(2), 5-12 (2007).
[6]
Böcker, S., Lipták, Z.: A fast and simple algorithm for the Money Changing Problem. Algorithmica 48(4), 413-432 (2007).
[7]
Bremner, D., Chan, T.M., Demaine, E.D., Erickson, J., Hurtado, F., Iacono, J., Langerman, S., Taslakian, P.: Necklaces, convolutions, and X + Y. In: Azar, Y., Erlebach, T. (eds.) ESA 2006. LNCS, vol. 4168, pp. 160-171. Springer, Heidelberg (2006).
[8]
Butman, A., Eres, R., Landau, G.M.: Scaled and permuted string matching. Inf. Process. Lett. 92(6), 293-297 (2004).
[9]
Chan, T.M.: All-pairs shortest paths with real weights in O(n 3/ log n) time. Algorithmica 50(2), 236-243 (2008).
[10]
Cicalese, F., Fici, G., Lipták, Z.: Searching for jumbled patterns in strings. In: Proc. of the Prague Stringology Conference 2009, pp. 105-117 (2009).
[11]
Cieliebak, M., Erlebach, T., Lipták, Z., Stoye, J., Welzl, E.: Algorithmic complexity of protein identification: combinatorics of weighted strings. Discrete Applied Mathematics 137(1), 27-46 (2004).
[12]
Clark, D.: Compact pat trees. PhD thesis, University of Waterloo, Canada (1996).
[13]
Eppstein, D.A.: Efficient algorithms for sequence analysis with concave and convex gap costs. PhD thesis, New York, NY, USA (1989).
[14]
Eres, R., Landau, G.M., Parida, L.: Permutation pattern discovery in biosequences. Journal of Computational Biology 11(6), 1050-1060 (2004).
[15]
Felzenszwalb, P.F., Huttenlocher, D.P., Kleinberg, J.M.: Fast algorithms for large-state-space HMMs with applications to web usage analysis. In: Thrun, S., Saul, L.K., Schölkopf, B. (eds.) NIPS. MIT Press, Cambridge (2003).
[16]
Goczyła, K.: The generalized Banach match-box problem: Application in disc storage management. Acta Applicandae Mathematicae 5, 27-36 (1986).
[17]
Grossi, R., Gupta, A., Vitter, J.S.: High-order entropy-compressed text indexes. In: SODA, pp. 841-850 (2003).
[18]
Jokinen, P., Tarhio, J., Ukkonen, E.: A comparison of approximate string matching algorithms. Software Practice and Experience 26(12), 1439-1458 (1996).
[19]
Mendelson, H., Pliskin, J., Yechiali, U.: Optimal storage allocation for serial files. Communications of the ACM 22, 124-130 (1979).
[20]
Mendelson, H., Pliskin, J., Yechiali, U.: A stochastic allocation problem. Operations Research 28, 687-693 (1980).
[21]
Munro, J.I.: Tables. In: Chandru, V., Vinay, V. (eds.) FSTTCS 1996. LNCS, vol. 1180, pp. 37-42. Springer, Heidelberg (1996).
[22]
Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Comput. Surv. 39(1) (2007).
[23]
Parida, L.: Gapped permutation patterns for comparative genomics. In: Bücher, P., Moret, B.M.E. (eds.) WABI 2006. LNCS (LNBI), vol. 4175, pp. 376-387. Springer, Heidelberg (2006).

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
FUN'10: Proceedings of the 5th international conference on Fun with algorithms
June 2010
381 pages
ISBN:3642131212
  • Editors:
  • Paolo Boldi,
  • Luisa Gargano

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 02 June 2010

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2019)On Problems Equivalent to (min,+)-ConvolutionACM Transactions on Algorithms10.1145/329346515:1(1-25)Online publication date: 8-Jan-2019
  • (2017)On prefix normal words and prefix normal formsTheoretical Computer Science10.1016/j.tcs.2016.10.015659:C(1-13)Online publication date: 10-Jan-2017
  • (2017)Fast algorithms for Abelian periods in words and greatest common divisor queriesJournal of Computer and System Sciences10.1016/j.jcss.2016.09.00384:C(205-218)Online publication date: 1-Mar-2017
  • (2017)Efficient Indexes for Jumbled Pattern Matching with Constant-Sized AlphabetAlgorithmica10.1007/s00453-016-0140-077:4(1194-1215)Online publication date: 1-Apr-2017
  • (2016)Algorithms for Jumbled Indexing, Jumbled Border and Jumbled Square on run-length encoded stringsTheoretical Computer Science10.1016/j.tcs.2016.04.030656:PB(146-159)Online publication date: 20-Dec-2016
  • (2015)Clustered Integer 3SUM via Additive CombinatoricsProceedings of the forty-seventh annual ACM symposium on Theory of Computing10.1145/2746539.2746568(31-40)Online publication date: 14-Jun-2015
  • (2015)Subquadratic-Time Algorithms for Abelian Stringology ProblemsRevised Selected Papers of the 6th International Conference on Mathematical Aspects of Computer and Information Sciences - Volume 958210.1007/978-3-319-32859-1_27(320-334)Online publication date: 11-Nov-2015
  • (2014)Algorithms for computing Abelian periods of wordsDiscrete Applied Mathematics10.1016/j.dam.2013.08.021163(287-297)Online publication date: 1-Jan-2014
  • (2014)Algorithms for Jumbled Indexing, Jumbled Border and Jumbled Square on Run-Length Encoded StringsProceedings of the 21st International Symposium on String Processing and Information Retrieval - Volume 879910.1007/978-3-319-11918-2_5(45-51)Online publication date: 20-Oct-2014
  • (2011)On prefix normal wordsProceedings of the 15th international conference on Developments in language theory10.5555/2032178.2032200(228-238)Online publication date: 19-Jul-2011

View Options

View options

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media