Keywords and Synonyms
Compression of multi-dimensional data; Storage, compression and transmission of tables; Compressive estimates of entropy
Problem Definition
Table compression was introduced by Buchsbaum et al. [2] as a unique application of compression, based on several distinguishing characteristics. Tables are collections of fixed-length records and can grow to be terabytes in size. They are often generated by information systems and kept in data warehouses to facilitate ongoing operations. These data warehouses will typically manage many terabytes of data online, with significant capital and operational costs. In addition, the tables must be transmitted to different parts of an organization, incurring additional costs for transmission. Typical examples are tables of transaction activity, like phone calls and credit card usage, which are stored once but then shipped repeatedly to different parts of an organization: for fraud detection, billing, operations support, etc. The...
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Recommended Reading
Blum, A., Li, M., Tromp, J., Yannakakis, M.: Linear approximation of shortest superstrings. J. ACM 41, 630–47 (1994)
Buchsbaum, A.L., Caldwell, D.F., Church, K.W., Fowler, G.S., Muthukrishnan, S.: Engineering the compression of massive tables: An experimental approach. In: Proc. 11th ACM-SIAM Symp. on Discrete Algorithms, 2000, pp. 175–84
Buchsbaum, A.L., Fowler, G.S., Giancarlo, R.: Improving table compression with combinatorial optimization. J. ACM 50, 825–851 (2003)
Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)
Cilibrasi, R., Vitanyi, P.M.B.: Clustering by compression. IEEE Trans. Inf. Theory 51, 1523–1545 (2005)
Cormack, G.: Data compression in a data base system. Commun. ACM 28, 1336–1350 (1985)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley Interscience, New York, USA (1990)
Ferragina, P., Giancarlo, R., Manzini, G., Sciortino, M.: Boosting textual compression in optimal linear time. J. ACM 52, 688–713 (2005)
Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Structuring Labeled Trees for Optimal Succinctness, and beyond. In: Proc. 45th Annual IEEE Symposium on Foundations of Computer Science, 2005, pp. 198–207
Giancarlo, R., Sciortino, M., Restivo, A.: From first principles to the Burrows and Wheeler transform and beyond, via combinatorial optimization. Theor. Comput. Sci. (2007)
Li, M., Chen, X., Li, X., Ma, B., Vitanyi, P.M.B.: The similarity metric. IEEE Trans. Inf. Theory 50, 3250–3264 (2004)
Liefke, H., Suciu, D.: XMILL: An efficient compressor for XML data. In: Proceedings of the 2000 ACM SIGMOD Int. Conf. on Management of Data, pp. 153–164. ACM, New York, USA (2000)
Lifshits, Y., Mozes, S., Weimann, O., Ziv-Ukelson, M.: Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica to appear doi:10.1007/s00453-007-9128-0
Manzini, G.: An analysis of the Burrows–Wheeler transform. J. ACM 48, 407–430 (2001)
Vo, B.D., Vo, K.-P.: Compressing table data with column dependency. Theor. Comput. Sci. 387, 273–283 (2007)
Vo, B.D., Vo, K.-P.: Using column dependency to compress tables. In: DCC: Data Compression Conference, pp. 92–101. IEEE Computer Society TCC, Washington DC, USA (2004)
Vo., K.-P.: Compression as data transformation. In: DCC: Data Compression Conference. IEEE Computer Society TCC, pp. 403. Washington DCD, USA (2006)
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23, 337–343 (1977)
Ziv, J., Lempel, A.: Compression of individual sequences via variable length coding. IEEE Trans. Inf. Theory 24, 530–536 (1978)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag
About this entry
Cite this entry
Buchsbaum, A., Giancarlo, R. (2008). Table Compression. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30162-4_418
Download citation
DOI: https://doi.org/10.1007/978-0-387-30162-4_418
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30770-1
Online ISBN: 978-0-387-30162-4
eBook Packages: Computer ScienceReference Module Computer Science and Engineering