Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Sliding window top-k dominating query processing over distributed data streams

  • Published:
Distributed and Parallel Databases Aims and scope Submit manuscript

Abstract

Preference query processing is important for a wide range of applications involving distributed databases, such as network monitoring, web-based systems, and market analysis. In such applications, data objects are generated frequently and massively, which presents an important and challenging problem of continuous query processing over distributed data stream environments. A top-k dominating query, which has been receiving much research attention recently, returns the k data objects that dominate the highest number of data objects in a given dataset, and due to its dominance-based ranking function, we can easily obtain superior data objects. An emerging requirement in distributed stream environments is an efficient technique for continuously monitoring top-k dominating data objects. Despite of this fact, no study has addressed this problem. In this paper, therefore, we address the problem of continuous top-k dominating query processing over distributed data stream environments. We present two algorithms that monitor the exact top-k dominating data and efficiently eliminate unqualified data objects for the result, which reduces both communication and computation costs. In addition to these algorithms, we present an approximate algorithm that further reduces both communication and computation costs. Extensive experiments on both synthetic and real data have demonstrated the efficiency and scalability of our algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

References

  1. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in data stream systems. In: Proceedings of PODS, pp. 1–16. Springer, New York (2002)

  2. Babcock, B., Olston, C.: Distributed top-k monitoring. In: Proceedings of SIGMOD, pp. 28–39. Springer, San Diego (2003)

  3. Börzsönyi, S., Kossmann, D., Stocker, K.: The skyline operator. In: Proceedings of ICDE, pp. 421–430. Springer, New York (2001)

  4. Chan, C.Y., Jagadish, H., Tan, K.L., Tung, A.K., Zhang, Z.: Finding k-dominant skylines in high dimensional space. In: Proceedings of SIGMOD, pp. 503–514. Springer, New York (2006)

  5. Cheema, M.A., Lin, X., Zhang, W., Zhang, Y.: A safe zone based approach for monitoring moving skyline queries. In: Proceedings of EDBT, pp. 275–286. IEEE Press, Geneva (2013)

  6. Das, G., Gunopulos, D., Koudas, N., Sarkas, N.: Ad-hoc top-k query answering for data streams. In: Proceedings of VLDB, pp. 183–194. ACM, New York (2007)

  7. Garofalakis, M., Keren, D., Samoladas, V.: Sketch-based geometric monitoring of distributed stream queries. PVLDB 6(10), 937–948 (2013)

    Google Scholar 

  8. Giatrakos, N., Deligiannakis, A., Garofalakis, M., Sharfman, I., Schuster, A.: Prediction-based geometric monitoring over distributed data streams. In: Proceedings of SIGMOD, pp. 265–276. ACM Press, Scottsdale (2012)

  9. Han, X., Li, J., Gao, H.: Tdep: Efficiently Processing Top-k Dominating Query on Massive Data, pp. 1–30. Knowledge and Information Systems, Springer, Berlin (2014)

    Google Scholar 

  10. He, Z., Lo, E.: Answering why-not questions on top-k queries. In: Proceedings of ICDE, pp. 750–761. Springer, New York (2012)

  11. Hose, K., Vlachou, A.: A survey of skyline processing in highly distributed environments. VLDB J. 21(3), 359–384 (2012)

    Article  Google Scholar 

  12. Huang, Q., Lee, P.P.: Ld-sketch: A distributed sketching design for accurate and scalable anomaly detection in network data streams. In: Proceedings of INFOCOM, pp. 1420–1428. IEEE Press, Geneva (2014)

  13. Huang, Z., Lu, H., Ooi, B.C., Tung, A.: Continuous skyline queries for moving objects. IEEE TKDE 18(12), 1645–1658 (2006)

    Google Scholar 

  14. Kontaki, M., Papadopoulos, A.N., Manolopoulos, Y.: Continuous top-k dominating queries in subspaces. In: Panhellenic Conference on Informatics, pp. 31–35. IEEE Press, New York (2008)

  15. Kontaki, M., Papadopoulos, A.N., Manolopoulos, Y.: Continuous top-k dominating queries. IEEE TKDE 24(5), 840–853 (2012)

    Google Scholar 

  16. Lee, Y.W., Lee, K.Y., Kim, M.H.: Efficient processing of multiple continuous skyline queries over a data stream. Inf. Sci. 221, 316–337 (2013)

    Article  Google Scholar 

  17. Lian, X., Chen, L.: Top-k dominating queries in uncertain databases. In: Proceedings of EDBT, pp. 660–671. IEEE Press, New York (2009)

  18. Lin, X., Yuan, Y., Wang, W., Lu, H.: Stabbing the sky: efficient skyline computation over sliding windows. In: Proceedings of ICDE, pp. 502–513. IEEE Press, San Diego (2005)

  19. Lin, X., Yuan, Y., Zhang, Q., Zhang, Y.: Selecting stars: the k most representative skyline operator. In: Proceedings of SIGMOD, pp. 86–95. Springer, Heidelberg (2007)

  20. Lu, H., Zhou, Y., Haustad, J.: Continuous skyline monitoring over distributed data streams. In: SSDBM, pp. 565–583. Springer, Heidelberg (2010)

  21. Lu, H., Zhou, Y., Haustad, J.: Efficient and scalable continuous skyline monitoring in two-tier streaming settings. Inf. Syst. 38(1), 68–81 (2013)

    Article  Google Scholar 

  22. Morse, M., Patel, J.M., Grosky, W.I.: Efficient continuous skyline computation. Inf. Sci. 177(17), 3411–3437 (2007)

    Article  MathSciNet  Google Scholar 

  23. Mouratidis, K., Bakiras, S., Papadias, D.: Continuous monitoring of top-k queries over sliding windows. In: SIGMOD, pp. 635–646. ACM New York (2006)

  24. Papadias, D., Tao, Y., Fu, G., Seeger, B.: Progressive skyline computation in database systems. ACM Trans. Database Syst. 30(1), 41–82 (2005)

    Article  Google Scholar 

  25. Papapetrou, O., Garofalakis, M.: Continuous fragmented skylines over distributed streams. In: Proceedings of ICDE, pp. 124–135. ACM Press, New York (2014)

  26. Papapetrou, O., Garofalakis, M., Deligiannakis, A.: Sketch-based querying of distributed sliding-window data streams. PVLDB 5(10), 992–1003 (2012)

    Google Scholar 

  27. Santoso, B., Chiu, G.: Close dominance graph: an efficient framework for answering continuous top-k dominating queries. IEEE TKDE 26(8), 1853–1865 (2014)

    Google Scholar 

  28. Shen, Z., Cheema, M.A., Lin, X., Zhang, W., Wang, H.: Efficiently monitoring top-k pairs over sliding windows. In: Proceedings of ICDE, pp. 798–809. Springer, New York (2012)

  29. Skoutas, D., Sacharidis, D., Simitsis, A., Kantere, V., Sellis, T.: Top-k dominant web services under multi-criteria matching. In: Proceedings of EDBT, pp. 898–909. Springer, New York (2009)

  30. Sun, S., Huang, Z., Zhong, H., Dai, D., Liu, H., Li, J.: Efficient monitoring of skyline queries over distributed data streams. Knowl. Inf. Syst. 25(3), 575–606 (2010)

  31. Tao, Y., Papadias, D.: Maintaining sliding window skylines on data streams. IEEE TKDE 18(3), 377–391 (2006)

    Google Scholar 

  32. Tiakas, E., Valkans, G., Papadopoulos, A.N., Manolopoulos, Y., Gunopoulos, D.: Metric-based top-k dominating queries. In: Proceedings of EDBT, pp. 415–426. IEEE Press, San Diego (2014)

  33. Vlachou, A., Doulkeridis, C., Nørvåg, K., Vazirgiannis, M.: On efficient top-k query processing in highly distributed environments. In: Proceedings of SIGMOD, pp. 753–764. Springer, New York (2008)

  34. Xie, X., Lu, H., Chen, J., Shang, S.: Top-k neighborhood dominating query. In: Proceedings of DASFAA, pp. 131–145. Springer, Berlin (2013)

  35. Yang, D., Shastri, A., Rundensteiner, E.A., Ward, M.O.: An optimal strategy for monitoring top-k queries in streaming windows. In: Proceedings of EDBT, pp. 57–68. ACM, New York (2011)

  36. Yiu, M.L., Mamoulis, N.: Efficient processing of top-k dominating queries on multi-dimensional data. In: Proceedings of VLDB, pp. 483–494. Springer, New York (2007)

  37. Yiu, M.L., Mamoulis, N.: Multi-dimensional top-k dominating queries. VLDB J. 18(3), 695–718 (2009)

    Article  Google Scholar 

  38. Yu, A., Agarwal, P., Yang, J.: Processing a large number of continuous preference top-k queries. In: Proceedings of SIGMOD, pp. 397–408. ACM Press, New York (2012)

  39. Zhang, W., Lin, X., Zhang, Y., Pei, J., Wang, W.: Threshold-based probabilistic top-k dominating queries. VLDB J. 19(2), 283–305 (2010)

    Article  Google Scholar 

  40. Zhang, W., Lin, X., Zhang, Y., Pei, J., Wang, W.: Progressive processing of subspace dominating queries. VLDB J. 20(6), 921–948 (2011)

    Article  Google Scholar 

  41. Zhang, Z., Cheng, R., Papadias, D., Tung, A.K.: Minimizing the communication cost for continuous skyline maintenance. In: Proceedings of SIGMOD, pp. 495–508. ACM Press, New York (2009)

Download references

Acknowledgments

This research is partially supported by the Grant-in-Aid for Scientific Research (A)(26240013) of MEXT, Japan, and JST, Strategic International Collaborative Research Program, SICORP.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Daichi Amagata.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Amagata, D., Hara, T. & Nishio, S. Sliding window top-k dominating query processing over distributed data streams. Distrib Parallel Databases 34, 535–566 (2016). https://doi.org/10.1007/s10619-015-7187-9

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10619-015-7187-9

Keywords

Navigation