Improved Variable-to-Fixed Length Codes

Shmuel T. Klein⁴ &
Dana Shapira⁵

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 5280))

Included in the following conference series:

International Symposium on String Processing and Information Retrieval

841 Accesses
3 Altmetric

Abstract

Though many compression methods are based on the use of variable length codes, there has recently been a trend to search for alternatives in which the lengths of the codewords are more restricted, which can be useful for fast decoding and compressed searches. This paper explores the construction of variable-to-fixed length codes, which have been suggested long ago by Tunstall. Using a new heuristic based on suffix trees, the performance of Tunstall codes could be improved by more than 30%.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Run-Length Encoding in a Finite Universe

Efficient and Compact Representations of Some Non-canonical Prefix-Free Codes

A generalized construction of variable-length non-overlapping codes

Article 15 February 2025

References

Abrahams, J.: Code and parse trees for lossless source encoding. Comm. in Information and Systems 1(2), 113–146 (2001)
Article MathSciNet MATH Google Scholar
Brisaboa, N.R., Fariña, A., Navarro, G., Esteller, M.F. (s,c)-dense coding: an optimized compression code for natural language text databases. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds.) SPIRE 2003. LNCS, vol. 2857, pp. 122–136. Springer, Heidelberg (2003)
Chapter Google Scholar
Brisaboa, N.R., Iglesias, E.L., Navarro, G., Paramá, J.R.: An efficient compression code for text databases. In: Sebastiani, F. (ed.) ECIR 2003. LNCS, vol. 2633, pp. 468–481. Springer, Heidelberg (2003)
Chapter Google Scholar
Crochemore, M., Ilie, L., Smyth, W.F.: A Simple Algorithm for Computing the Lempel Ziv Factorization. In: Proc. Data Compression Conference DCC 2008, Snowbird, Utah, pp. 482–488 (2008)
Google Scholar
Fraenkel, A.S., Klein, S.T.: Complexity Aspects of Guessing Prefix Codes. Algorithmica 12, 409–419 (1994)
Article MathSciNet MATH Google Scholar
Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. ACM Transactions on Algorithms 1(2), 350–366 (2005)
Article MathSciNet MATH Google Scholar
Fraenkel, A.S., Mor, M., Perl, Y.: Is text compression by prefixes and suffixes practical? Acta Informatica 20, 371–389 (1983)
Article MathSciNet MATH Google Scholar
Huffman, D.: A method for the construction of minimum redundancy codes. Proc. of the IRE 40, 1098–1101 (1952)
Article MATH Google Scholar
Klein, S.T.: Improving static compression schemes by alphabet extension. In: Giancarlo, R., Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 210–221. Springer, Heidelberg (2000)
Chapter Google Scholar
Klein, S.T., Kopel Ben-Nissan, M.: Using Fibonacci compression codes as alternatives to dense codes. In: Proc. Data Compression Conference DCC 2008, Snowbird, Utah, pp. 472–481 (2008)
Google Scholar
Moffat, A.: Word-based text compression. Software – Practice & Experience 19, 185–198 (1989)
Article Google Scholar
de Moura, E.S., Navarro, G., Ziviani, N., Baeza-Yates, R.: Fast and flexible word searching on compressed text. ACM Trans. on Information Systems 18, 113–139 (2000)
Article Google Scholar
Savari, S.A., Gallager, R.G.: Generalized Tunstall codes for sources with memory. IEEE Trans. Info. Theory IT–43, 658–668 (1997)
Article MATH Google Scholar
Tjalkens, T.J., Willems, F.M.J.: Variable to fixed length codes for Markov sources. IEEE Trans. Info. Theory IT–33, 246–257 (1987)
Article MathSciNet MATH Google Scholar
Tunstall, B.P.: Synthesis of noiseless compression codes, Ph.D dissertation, Georgia Institute of Technology, Atlanta, GA (1967)
Google Scholar
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Article MathSciNet MATH Google Scholar
Véronis, J., Langlais, P.: Evaluation of parallel text alignment systems: The arcade project. In: Véronis, J. (ed.) Parallel Text Processing, pp. 369–388. Kluwer Academic Publishers, Dordrecht (2000)
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

Dept. of Computer Science, Bar Ilan University, Ramat Gan, 52900, Israel
Shmuel T. Klein
Dept. of Computer Science, Ashkelon Academic College, Ashkelon, 78211, Israel
Dana Shapira

Authors

Shmuel T. Klein
View author publications
You can also search for this author in PubMed Google Scholar
Dana Shapira
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Computer Science, Bar-Ilan University, 52900, Ramat-Gan, Israel
Amihood Amir
School of Computer Science and Information Technology, RMIT University, Melbourne, Australia
Andrew Turpin
NICTA Victoria Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne, Victoria, Australia
Alistair Moffat

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Klein, S.T., Shapira, D. (2008). Improved Variable-to-Fixed Length Codes. In: Amir, A., Turpin, A., Moffat, A. (eds) String Processing and Information Retrieval. SPIRE 2008. Lecture Notes in Computer Science, vol 5280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89097-3_6

Download citation

DOI: https://doi.org/10.1007/978-3-540-89097-3_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89096-6
Online ISBN: 978-3-540-89097-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics