Abstract
As the technological landscape evolves, Real-Time Analytics has become a de facto practice and an area of active interest for many industrial applications. It has proven effective for a number of critical problems, and a wide range of real-time use cases has emerged. Working with streaming data spans several stages of software development and engineering, from data collection and movement to processing and the building of actionable insights. Every component of a streaming analytics architecture must be designed and developed with the requirement that insights derived from such data materialize in near real time. Many industrial organizations and large-scale enterprises have grown by adopting, and investing heavily in, systems built on event-based streaming architectures. In this article, we provide insights into three main aspects of architectural components in the context of Real-Time Analytics, together with advancements in the field: recent research and development on Real-Time Analytics architecture and its use cases, industrial applications, and the development of tools and technologies. We also highlight open challenges and research issues that need further attention from researchers and industry experts.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Dashora, R., Babu, M.R. (2023). A Survey on Advancements of Real-Time Analytics Architecture Components. In: Asari, V.K., Singh, V., Rajasekaran, R., Patel, R.B. (eds) Computational Methods and Data Engineering. Lecture Notes on Data Engineering and Communications Technologies, vol 139. Springer, Singapore. https://doi.org/10.1007/978-981-19-3015-7_41
DOI: https://doi.org/10.1007/978-981-19-3015-7_41
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-3014-0
Online ISBN: 978-981-19-3015-7
eBook Packages: Intelligent Technologies and Robotics (R0)