Abstract
In many applications, range skyline query is an important operation to find the interesting tuples in a potentially huge data space. Given selection condition, range skyline query returns tuples both satisfying the specified selection condition and not dominated by other satisfying tuples. It is found that most of the existing skyline algorithms do not consider the selection condition. This paper proposes a novel table-scan-based Presorted-table-based Range Skyline (PRS) algorithm to efficiently compute range skyline results on massive data. PRS first presorts the table for early termination. The early termination checking is proposed in this paper, along with the theoretical analysis of scan depth. The selection checking and dominance checking are devised in this paper to skip the unsatisfying or dominated tuples directly. The theoretical analysis proves that the overwhelming majority of candidates can be skipped. The extensive experimental results, conducted on synthetic and real-life data sets, show that PRS outperforms the existing algorithms significantly.
Similar content being viewed by others
Notes
In this paper, the tuples are considered in general position [31].
References
Berger M (1987) Geometry I. Springer, Berlin
Börzsönyi S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of the 17th international conference on data engineering, pp 421–430
Chomicki J, Godfrey P, Gryz J, Liang D (2003) Skyline with presorting. In: Proceedings of the 19th international conference on data engineering, pp 717–719
Dellis E, Vlachou A, Vladimirskiy I, et al (2006) Constrained subspace skyline computation. In: Proceedings of the 15th ACM international conference on information and knowledge management, pp 415–424
Endres M (2015) The structure of preference orders. In: Proceedings of 19th East European conference on advances in databases and information systems, pp 32–45
Endres M, Kießling W (2011a) Semi-skyline optimization of constrained skyline queries. In: 22nd Australasian database conference, pp 7–16
Endres M, Kießling W (2011b) Skyline snippets. In: Proceedings of 9th international conference on flexible query answering systems, pp 246–257
Endres M, Kießling W (2014) High parallel skyline computation over low-cardinality domains. In: Proceedings of the 18th East European conference on advances in databases and information systems, pp 97–111
Endres M, Kießling W (2015) Parallel skyline computation exploiting the lattice structure. J Database Manag 26(4):18–43
Endres M, Preisinger T (2017) Beyond skylines: explicit preferences. In: Proceedings of the 22nd international conference on database systems for advanced applications, Part I, pp 327–342
Endres M, Roocks P, Kießling W (2015) Scalagon: an efficient skyline algorithm for all seasons. In: Proceedings of 20th international conference on database systems for advanced applications, Part II, pp 292–308
Fagin R, Lotem A, Naor M (2001) Optimal aggregation algorithms for middleware. In: Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, pp 102–113
Godfrey P, Shipley R, Gryz J (2007) Algorithms and analyses for maximal vector computation. VLDB J 16(1):5–28
Han X, Li J, Gao H (2015) Efficient top-k retrieval on massive data. IEEE Trans Knowl Data Eng 27(10):2687–2699
Han X, Li J, Yang D, Wang J (2013) Efficient skyline computation on big data. IEEE Trans Knowl Data Eng 25(11):2521–2535
Hsueh Y, Lin C, Chang C (2017) An efficient indexing method for skyline computations with partially ordered domains. IEEE Trans Knowl Data Eng 29(5):963–976
Kießling W, Endres M, Wenzel F (2011) The preference SQL system—an overview. IEEE Data Eng Bull 34(2):11–18
Korn F, Pagel B, Faloutsos C (2001) On the ’dimensionality curse’ and the ’self-similarity blessing’. IEEE Trans Knowl Data Eng 13(1):96–111
Kossmann D, Ramsak F, Rost S (2002) Shooting stars in the sky: an online algorithm for skyline queries. In: Proceedings of the 28th international conference on very large data bases, pp 275–286
Kung H, Luccio F, Preparata F (1975) On finding the maxima of a set of vectors. J ACM 22(4):469–476
Lee J, Hwang S (2014a) Scalable skyline computation using a balanced pivot selection technique. Inf Syst 39:1–21
Lee J, Hwang S (2014b) Toward efficient multidimensional subspace skyline computation. VLDB J 23(1):129–145
Liu X, Lu R, Ma J et al (2016) Efficient and privacy-preserving skyline computation framework across domains. Future Gener Comput Syst 62(C):161–174
Mandl S, Kozachuk O, Endres M, Kießling W (2015) Preference analytics in exasolution. In: Datenbanksysteme für Business, Technologie und Web (BTW), 16. Fachtagung des GI-Fachbereichs “Datenbanken und Informationssysteme” (DBIS). Proceedings, pp 613–632
Morse M, Patel J, Jagadish H (2007) Efficient skyline computation over low-cardinality domains. In: Proceedings of the 33rd international conference on very large data bases, pp 267–278
Mortensen M, Chester S, Assent I, Magnani M (2015) Efficient caching for constrained skyline queries. In: Proceedings of the 18th international conference on extending database technology, pp 337–348
O’Neil P, Quass D (1997) Improved query performance with variant indexes. In: Proceedings of the 1997 ACM SIGMOD international conference on management of data, pp 38–49
Papadias D, Tao Y, Fu G, Seeger B (2005) Progressive skyline computation in database systems. ACM Trans Database Syst 30(1):41–82
Park Y, Min J, Shim K (2017) Efficient processing of skyline queries using mapreduce. IEEE Trans Knowl Data Eng 29(5):1031–1044
Pei J, Yuan Y, Lin X et al (2006) Towards multidimensional subspace skyline analysis. ACM Trans Database Syst 31(4):1335–1381
Sheng C, Tao Y (2011) On finding skylines in external memory. In: Proceedings of the 30th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, pp 107–116
Steuer R (1986) Multiple criteria optimization. Wiley, Hoboken
Tan K, Eng P, Ooi B (2001) Efficient progressive skyline computation. In: Proceedings of the 27th international conference on very large data bases, pp 301–310
Tao Y, Xiao X, Pei J (2007) Efficient skyline and top-k retrieval in subspaces. IEEE Trans Knowl Data Eng 19(8):1072–1088
Wu K, Shoshani A, Stockinger K (2008) Analyses of multi-level and multi-component compressed bitmap indexes. ACM Trans Database Syst 35(1):2:1–2:52
Xia T, Zhang D, Fang Z et al (2012) Online subspace skyline query processing using the compressed skycube. ACM Trans Database Syst 37(2):15:1–15:36
Xin D, Han J, Cheng H, Li X (2006) Answering top-k queries with multi-dimensional selections: the ranking cube approach. In: Proceedings of the 32nd international conference on very large data bases, pp 463–474
Zhang M, Alhajj R (2010) Skyline queries with constraints: integrating skyline and traditional query operators. Data Knowl Eng 69(1):153–168
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61872106, 61632010, 61502121 and 61402130, National key research and development program of China under Grant No. 2016YFB1000703 and Weihai-HIT co-construction program under Grant No. ZMZ001702.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Han, X., Li, X., Wang, B. et al. PRS: efficient range skyline computation on massive data via presorting. Knowl Inf Syst 60, 1511–1548 (2019). https://doi.org/10.1007/s10115-018-1310-y
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10115-018-1310-y