Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Efficient sort-based skyline evaluation

Published: 12 December 2008 Publication History

Abstract

Skyline queries compute the set of Pareto-optimal tuples in a relation, that is, those tuples that are not dominated by any other tuple in the same relation. Although several algorithms have been proposed for efficiently evaluating skyline queries, they either necessitate the relation to have been indexed or have to perform the dominance tests on all the tuples in order to determine the result. In this article we introduce salsa, a novel skyline algorithm that exploits the idea of presorting the input data so as to effectively limit the number of tuples to be read and compared. This makes salsa also attractive when skyline queries are executed on top of systems that do not understand skyline semantics, or when the skyline logic runs on clients with limited power and/or bandwidth. We prove that, if one considers symmetric sorting functions, the number of tuples to be read is minimized by sorting data according to a “minimum coordinate,” minC, criterion, and that performance can be further improved if data distribution is known and an asymmetric sorting function is used. Experimental results obtained on synthetic and real datasets show that salsa consistently outperforms state-of-the-art sequential skyline algorithms and that its performance can be accurately predicted.

Supplementary Material

Bartolini Appendix (a31-bartolini-apndx.pdf)
Online appendix to efficient sort-based skyline evaluation. The appendix supports the information on article 31.

References

[1]
Acharya, S., Poosala, V., and Ramaswamy, S. 1999. Selectivity estimation in spatial databases. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data (SIGMOD). Philadelphia, PA, ACM, New York, NY, 13--24.
[2]
Balke, W.-T., Güntzer, U., and Zheng, J. X. 2004. Efficient distributed skylining for web information systems. In Proceedings of the 6th International Conference on Extending Database Technology (EDBT). Lecture Notes in Computer Science, vol. 2992, Springer, Berlin, Heidelberg, New York, 256--273.
[3]
Bartolini, I., Ciaccia, P., Oria, V., and Özsu, T. 2004. Integrating the results of multimedia sub-queries using qualitative preferences. In Proceedings of the 10th International Workshop on Multimedia Information Systems (MIS). College Park, MD. Lecture Notes in Computer Science, vol. 2992, Springer, Berlin, Heidelberg, New York, 66--75.
[4]
Bartolini, I., Ciaccia, P., Oria, V., and Özsu, T. 2007. Flexible integration of multimedia sub-queries with qualitative preferences. Multimed. Tools Appl. 33, 3, 275--300.
[5]
Bartolini, I., Ciaccia, P., and Patella, M. 2006. SaLSa: computing the skyline without scanning the whole sky. In Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management. Arlington, VA, ACM, New York, NY, 405--414.
[6]
Börzsönyi, S., Kossmann, D., and Stocker, K. 2001. The skyline operator. In Proceedings of the 17th International Conference on Data Engineering (ICDE). Heidelberg, Germany, IEEE Computer Society, Washington, DC, 421--430.
[7]
Buchta, C. 1989. On the average number of maxima in a set of vectors. Inform. Process. Lett. 33, 2, 63--65.
[8]
Chaudhuri, S., Dalvi, N., and Kaushik, R. 2006. Robust cardinality and cost estimation for skyline operator. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). Atlanta, GA. IEEE Computer Society, Washington, DC, 64.
[9]
Chomicki, J. 2003. Preference Formulas in Relational Queries. ACM Trans. Database Syst. 28, 4, 427--466.
[10]
Chomicki, J., Godfrey, P., Gryz, J., and Liang, D. 2002. Skyline with presorting. Tech. Rep. CS-2002-04, York University, Toronto, ON.
[11]
Chomicki, J., Godfrey, P., Gryz, J., and Liang, D. 2003. Skyline with presorting. In Proceedings of the 19th International Conference on Data Engineering (ICDE). Bangalore, India. IEEE Computer Society, Washington, DC, 717--816.
[12]
Garey, M. R. and Johnson, D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman.
[13]
Godfrey, P. 2004. Skyline cardinality for relational processing. In Proceedings of the 3rd International Symposium on Foundations of Information and Knowledge Systems (FoIKS). Lecture Notes in Computer Science, vol. 2942, Springer, Berlin, Heidelberg, New York, 78--97.
[14]
Godfrey, P., Shipley, R., and Gryz, J. 2005. Maximal vector computation in large data sets. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB). Trondheim, Norway. Morgan Kaufmann, San Francisco, CA, 229--240.
[15]
Kiessling, W. 2002. Foundations of preferences in database systems. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB). Hong Kong, China. Morgan Kaufmann, San Francisco, CA, 311--322.
[16]
Kossmann, D., Ramsak, F., and Rost, S. 2002. Shooting stars in the sky: an online algorithm for skyline queries. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB). Hong Kong, China. Morgan Kaufmann, San Francisco, CA, 275--286.
[17]
Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2003. An optimal and progressive algorithm for skyline queries. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD). San Diego, CA. ACM, New York, NY, 467--478.
[18]
Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2005. Progressive skyline computation in database systems. ACM Trans. Database Syst. 30, 1, 41--82.
[19]
Paredes, R. and Navarro, G. 2006. Optimal incremental sorting. In Proceedings of the 8th Workshop on Algorithm Engineering and Experiments (ALENEX) and the 3rd Workshop on Analytic Algorithmics and Combinatorics (ANALCO). Miami, FL. SIAM Press, Philadelphia, PA, 171--182.
[20]
Preparata, F. P. and Shamos, M. I. 1985. Computational Geometry—An Introduction. Springer.
[21]
Sakamoto, H. 1943. On the distributions of the product and the quotient of the independent and uniformly distributed random variables. Tohoku Math. J. 49, 243--260.
[22]
Tan, K.-L., Eng, P.-K., and Ooi, B. C. 2001. Efficient progressive skyline computation. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB). Rome, Italy, Morgan Kaufmann, San Francisco, CA, 301--310.
[23]
Tao, Y., Xiao, X., and Pei, J. 2006. SUBSKY: efficient computation of skylines in subspaces. In Proceedings of the 22nd International Conference on Data Engineering (ICDE). Atlanta, GA. IEEE, Computer Society, Washington, DC, 65.

Cited By

View all
  • (2024)CSQUiD: an index and non-probability framework for constrained skyline query processing over uncertain dataPeerJ Computer Science10.7717/peerj-cs.222510(e2225)Online publication date: 16-Sep-2024
  • (2024)Efficient Skyline Keyword-Based Tree Retrieval on Attributed GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.338898836:11(6056-6070)Online publication date: Nov-2024
  • (2024)Multi-objective Test Recommendation for Adaptive LearningTransactions on Large-Scale Data- and Knowledge-Centered Systems LVI10.1007/978-3-662-69603-3_1(1-36)Online publication date: 21-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 33, Issue 4
November 2008
379 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/1412331
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 December 2008
Accepted: 01 July 2008
Revised: 01 March 2008
Received: 01 June 2007
Published in TODS Volume 33, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Monotone functions
  2. Skyline query

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)10
  • Downloads (Last 6 weeks)1
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)CSQUiD: an index and non-probability framework for constrained skyline query processing over uncertain dataPeerJ Computer Science10.7717/peerj-cs.222510(e2225)Online publication date: 16-Sep-2024
  • (2024)Efficient Skyline Keyword-Based Tree Retrieval on Attributed GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.338898836:11(6056-6070)Online publication date: Nov-2024
  • (2024)Multi-objective Test Recommendation for Adaptive LearningTransactions on Large-Scale Data- and Knowledge-Centered Systems LVI10.1007/978-3-662-69603-3_1(1-36)Online publication date: 21-Jul-2024
  • (2023)Adaptive Test Recommendation for Mastery LearningProceedings of the 2nd International Workshop on Data Systems Education: Bridging education practice with education research10.1145/3596673.3596977(18-23)Online publication date: 23-Jun-2023
  • (2023)Efficient and Secure Skyline Queries Over Vertical Data FederationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2022.322241535:9(9269-9280)Online publication date: 1-Sep-2023
  • (2023)Federated $k$-Dominant Skyline: An Efficient Approach Under Differential Privacy2023 5th International Conference on Data-driven Optimization of Complex Systems (DOCS)10.1109/DOCS60977.2023.10294954(1-8)Online publication date: 22-Sep-2023
  • (2023)SkyFlow: Heterogeneous streaming for skyline computation using FlowGraph and SYCLFuture Generation Computer Systems10.1016/j.future.2022.11.021141(269-283)Online publication date: Apr-2023
  • (2023)Collaborative Filtering Skyline (CFS) for Enhanced Recommender SystemsHandbook of Computational Sciences10.1002/9781119763468.ch3(37-67)Online publication date: 14-Jul-2023
  • (2022)Handling qualitative preferences in SPARQL over virtual ontology-based data accessSemantic Web10.3233/SW-21289513:4(659-682)Online publication date: 31-May-2022
  • (2022)RG-SKY: A fuzzy group skyline relaxation for combinatorial decision makingComputer Science and Information Systems10.2298/CSIS211020015N19:2(887-912)Online publication date: 2022
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media