Abstract
The increasing volume of live content poses new challenges for publish/subscribe services at cloud scale. Serving live content efficiently is difficult because most subscriptions cover only a small portion of the entire subscription space, i.e., they target a limited set of live content. As a result, the real-world workload of a publish/subscribe service for live content is skewed, and subscriptions are distributed unevenly, which makes event processing inefficient. We present a correlation-based balanced content space partitioning technique for a content-based publish/subscribe service. The technique reduces the imbalance caused by a skewed subscription workload by using the correlation coefficient between attributes to build dimension groups: attributes with low correlation are assigned to the same dimension group to balance the subscription workload. We also analyze how varying the partitioning granularity affects load balance and the efficiency of message processing. We conducted empirical experiments to evaluate the effectiveness of the partitioning technique and to measure the impact of varying partitioning granularity. The results show that the proposed technique distributes subscriptions among brokers more evenly than conventional partitioning techniques. They also show that load balance can be improved by increasing the partitioning granularity through two parameters, the segment degree and the dimension group degree.
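The idea of grouping low-correlation attributes can be illustrated with a minimal sketch. The abstract does not specify the grouping algorithm, so the greedy strategy below is an assumption, as are the names `pearson`, `build_dimension_groups`, and `group_size`, and the choice to represent each attribute by one numeric value per subscription.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant attribute is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def build_dimension_groups(columns, group_size):
    """Greedily place weakly correlated attributes in the same group.

    columns: dict mapping attribute name -> list of values, one value per
             subscription (hypothetical representation, not from the paper).
    group_size: maximum number of attributes per dimension group.
    """
    names = sorted(columns)
    # Absolute correlation for every unordered attribute pair.
    corr = {
        frozenset((a, b)): abs(pearson(columns[a], columns[b]))
        for a, b in combinations(names, 2)
    }
    remaining = list(names)
    groups = []
    while remaining:
        group = [remaining.pop(0)]
        while remaining and len(group) < group_size:
            # Choose the attribute whose strongest correlation with the
            # current group members is the weakest.
            best = min(
                remaining,
                key=lambda c: max(corr[frozenset((c, g))] for g in group),
            )
            remaining.remove(best)
            group.append(best)
        groups.append(group)
    return groups


if __name__ == "__main__":
    sample = {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 10],   # perfectly correlated with "a"
        "c": [2, 5, 1, 5, 2],    # uncorrelated with "a" and "b"
        "d": [4, 10, 2, 10, 4],  # perfectly correlated with "c"
    }
    print(build_dimension_groups(sample, group_size=2))
```

In this toy input, the strongly correlated pairs ("a", "b") and ("c", "d") end up in different groups, so each group combines attributes whose subscription values vary independently of each other.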
Acknowledgements
This research was jointly supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01431) supervised by the IITP (Institute for Information & communications Technology Promotion); by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07043858); and by the National Supercomputing Center with supercomputing resources including technical support (KSC-2019-CRE-0105).
Cite this article
Yoon, D., Li, Z. & Oh, S. Balanced content space partitioning for pub/sub: a study on impact of varying partitioning granularity. J Supercomput 77, 13676–13702 (2021). https://doi.org/10.1007/s11227-021-03821-5
DOI: https://doi.org/10.1007/s11227-021-03821-5