A Survey on Advancements of Real-Time Analytics Architecture Components

  • Conference paper
Computational Methods and Data Engineering

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 139)


Abstract

As the technological landscape evolves, Real-Time Analytics has become a de facto practice and an area of interest for a wide range of industrial applications. It has proven impactful for many critical problems, and a wide range of real-time use cases has emerged. Working with streaming data involves several stages of software development and engineering, from data collection and movement through processing to building actionable insights. Every component of a streaming analytics architecture must be designed and developed with the requirement that insights drawn from such data materialize in near real time. Many industrial organizations and large-scale enterprises have grown by adopting and investing heavily in event-based streaming architectures. In this article, we provide insights on three main aspects of architectural components in the context of Real-Time Analytics, along with advancements in the field: recent research and development on Real-Time Analytics architecture and its use cases, industrial applications, and the development of tools and technologies. We also highlight open challenges and research issues that need further attention from researchers and industry experts.

Corresponding author

Correspondence to Rajnish Dashora.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Dashora, R., Babu, M.R. (2023). A Survey on Advancements of Real-Time Analytics Architecture Components. In: Asari, V.K., Singh, V., Rajasekaran, R., Patel, R.B. (eds) Computational Methods and Data Engineering. Lecture Notes on Data Engineering and Communications Technologies, vol 139. Springer, Singapore. https://doi.org/10.1007/978-981-19-3015-7_41
