Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/1083592.1083642dlproceedingsArticle/Chapter ViewAbstractPublication PagesvldbConference Proceedingsconference-collections
Article

Space efficiency in synopsis construction algorithms

Published: 30 August 2005 Publication History

Abstract

Histograms and Wavelet synopses have been found to be useful in query optimization, approximate query answering and mining. Over the last few years several good synopsis algorithms have been proposed. These have mostly focused on the running time of the synopsis constructions, optimum or approximate, vis-a-vis their quality. However the space complexity of synopsis construction algorithms has not been investigated as thoroughly. Many of the optimum synopsis construction algorithms (as well as few of the approximate ones) are expensive in space. In this paper, we propose a general technique that reduces space complexity. We show that the notion of "working space" proposed in these contexts is redundant. We believe that our algorithm also generalizes to a broader range of dynamic programs beyond synopsis construction. Our modifications can be easily adapted to existing algorithms. We demonstrate the performance benefits through experiments on real-life and synthetic data.

References

[1]
S. Acharya, P. Gibbons, V. Poosala, and S. Ramaswamy. The Aqua Approximate Query Answering System. SIGMOD Conference, 1999.]]
[2]
L. Amsaleg, P. Bonnet, M. J. Franklin, A. Tomasic, and T. Urhan. Improving responsiveness for wide-area data access. IEEE Data Eng., 20(3):3--11, 1997.]]
[3]
L. Arge. External memory data structures. Proc. of ESA, pages 1--29, 2001.]]
[4]
K. Chakrabarti, M. N. Garofalakis, R. Rastogi, and K. Shim. Approximate query processing using wavelets. VLDB Conference, 2000.]]
[5]
K. Chakrabarti, E. J. Keogh, S. Mehrotra, and M. J. Pazzani. Locally adaptive dimensionality reduction for indexing large time series databases. ACM TODS, 27(2):188--228, 2002.]]
[6]
I. Daubechies. Ten Lectures on Wavelets. SIAM, 1992.]]
[7]
A. Deligiannakis and N. Roussopoulos. Extended wavelets for multiple measures. SIGMOD Conference, 2003.]]
[8]
M. Garofalakis and A. Kumar. Deterministic wavelet thresholding for maximum error metric. Proc. of PODS, 2004.]]
[9]
M. N. Garofalakis and P. B. Gibbons. Probabilistic wavelet synopses. ACM TODS (See also SIGMOD 2002), 29:43--90, 2004.]]
[10]
P. Gibbons and Y. Matias. Synopsis data structures for massive data sets. Proc. of SODA, 1999.]]
[11]
P. Gibbons, Y. Matias, and V. Poosala. Fast Incremental Maintenance of Approximate Histograms. VLDB Conference, 1997.]]
[12]
A. Gilbert, Y. Kotadis, S. Muthukrishnan, and M. Strauss. Surfing Wavelets on Streams: One Pass Summaries for Approximate Aggregate Queries. VLDB Conference, 2001.]]
[13]
S. Guha. Space efficiency in optimal, approximation and streaming algorithms for synopsis construction problems. Manuscript (email for copy)., 2005.]]
[14]
S. Guha and B. Harb. Wavelet synopsis for data streams: Minimizing non-euclidean error. Proc. of SIGKDD Conference, 2005.]]
[15]
S. Guha, P. Indyk, S. Muthukrishnan, and M. Strauss. Histogramming data streams with fast per-item processing. Proc. of ICALP, 2002.]]
[16]
S. Guha, N. Koudas, and K. Shim. Data Streams and Histograms. Proc. of STOC, (see also the full version of {16}, available at http://www.cis.upenn.edu/~sudipto/mypapers/histjour.pdf.gz), 2001.]]
[17]
S. Guha, K. Shim, and J. Woo. REHIST: Relative error histogram construction algorithms. VLDB Conference, 2004.]]
[18]
J. M. Hellerstein, P. J. Haas, and H. J. Wang. Online aggregation. SIGMOD Conference, 1997.]]
[19]
Y. E. Ioannidis. Universality of serial histograms. VLDB Conference, 1993.]]
[20]
Y. E. Ioannidis and V. Poosala. Balancing Histogram Optimality and Practicality for Query Result Size Estimation. ACM SIGMOD Conference, 1995.]]
[21]
H. V. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. C. Sevcik, and T. Suel. Optimal Histograms with Quality Guarantees. VLDB Conference, 1998.]]
[22]
P. Karras and N. Mamoulis. One pass wavelet synopis for maximum error metrics. This proceeding, 2005.]]
[23]
Y. Matias and D. Urieli. Manuscript. 2004.]]
[24]
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. SIGMOD Conference, 1998.]]
[25]
M. Muralikrishna and D. J. DeWitt. Equi-depth histograms for estimating selectivity factors for multidimensional queries. SIGMOD Conference, 1998.]]
[26]
S. Muthukrishnan. Workload optimal wavelet synopsis. DIMACS TR, 2004.]]
[27]
S. Muthukrishnan and M. Strauss. Approximate histogram and wavelet summaries of streaming data. DIMACS TR 52, 2003.]]
[28]
V. Poosala, Y. E. Ioannidis, P. Haas, and E. Shekita. Improved Histograms for Selectivity Estimation of Range Predicates. SIGMOD Conference, 1996.]]
[29]
P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. SIGMOD Conference, 1979.]]
[30]
J. S. Vitter. External memory algorithms and data structures: dealing with massive data. ACM Comput. Surv., 33(2):209--271, 2001.]]

Cited By

View all
  • (2022)SIEVE: A Space-Efficient Algorithm for Viterbi DecodingProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3526170(1136-1145)Online publication date: 10-Jun-2022
  • (2019)Maintaining Wavelet Synopses for Sliding-Window AggregatesProceedings of the 31st International Conference on Scientific and Statistical Database Management10.1145/3335783.3335793(73-84)Online publication date: 23-Jul-2019
  • (2017)Efficient Haar+ synopsis construction for the maximum absolute error measureProceedings of the VLDB Endowment10.14778/3151113.315111711:1(40-52)Online publication date: 1-Sep-2017
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image DL Hosted proceedings
VLDB '05: Proceedings of the 31st international conference on Very large data bases
August 2005
1392 pages
ISBN:1595931546

Publisher

VLDB Endowment

Publication History

Published: 30 August 2005

Qualifiers

  • Article

Conference

ICMI05

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)SIEVE: A Space-Efficient Algorithm for Viterbi DecodingProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3526170(1136-1145)Online publication date: 10-Jun-2022
  • (2019)Maintaining Wavelet Synopses for Sliding-Window AggregatesProceedings of the 31st International Conference on Scientific and Statistical Database Management10.1145/3335783.3335793(73-84)Online publication date: 23-Jul-2019
  • (2017)Efficient Haar+ synopsis construction for the maximum absolute error measureProceedings of the VLDB Endowment10.14778/3151113.315111711:1(40-52)Online publication date: 1-Sep-2017
  • (2016)Distributed Wavelet Thresholding for Maximum Error MetricsProceedings of the 2016 International Conference on Management of Data10.1145/2882903.2915230(663-677)Online publication date: 26-Jun-2016
  • (2013)Finding the minimum number of elements with sum above a thresholdInformation Sciences: an International Journal10.1016/j.ins.2013.02.035238(205-211)Online publication date: 1-Jul-2013
  • (2013)Computing Unrestricted Synopses Under Maximum Error BoundAlgorithmica10.1007/s00453-011-9571-965:1(1-42)Online publication date: 1-Jan-2013
  • (2009)Unrestricted wavelet synopses under maximum error boundProceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology10.1145/1516360.1516445(732-743)Online publication date: 24-Mar-2009
  • (2009)Tight results for clustering and summarizing data streamsProceedings of the 12th International Conference on Database Theory10.1145/1514894.1514926(268-275)Online publication date: 23-Mar-2009
  • (2009)Hierarchically compressed wavelet synopsesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-008-0096-z18:1(203-231)Online publication date: 1-Jan-2009
  • (2007)Building data synopses within a known maximum error boundProceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management10.5555/1769708.1769768(463-470)Online publication date: 16-Jun-2007
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media