DOI: 10.1145/3468737.3494096
research-article
Open access

Amoeba: aligning stream processing operators with externally-managed state

Published: 17 December 2021

Abstract

Scalable stream processing systems (SPS) often require external storage systems for long-term storage of non-ephemeral state. Such state cannot be accommodated in the internal stores of SPSes, which are mainly geared toward fault tolerance of streaming jobs, lack externally visible APIs, and dispose of their state at the end of such jobs. Recent research has pointed to scalable in-memory key-value stores (KVS) as an efficient solution for managing external state. While such data stores have been interconnected with scalable streaming systems, they are currently managed independently, missing opportunities for optimizations such as exploiting locality between stream partitions and table shards and coordinating elasticity actions. Both processing and data management systems are typically designed for scalability; however, coordination between them poses a significant challenge. In this work we describe Amoeba, a system that dynamically adapts data-partitioning schemes and/or task or data placement across systems to eliminate unnecessary network communication across nodes. Our evaluation using state-of-the-art systems, such as the Flink SPS and the Redis KVS, demonstrated a 2.6x performance improvement when aligning SPS tasks with KVS shards in AWS deployments of up to 64 nodes.
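To make the locality argument concrete, the following is a minimal toy model, not Amoeba's actual mechanism. It compares a stream partitioner that is independent of the KVS sharding with one that reuses the shard mapping. The only detail taken from Redis is the slot function (CRC16 of the key modulo 16384, as defined in the Redis Cluster specification); the equal contiguous slot ranges, the "task i runs on node i" placement, the key names, and all function names are assumptions made for illustration.

```python
"""Toy model of SPS-task / KVS-shard alignment (illustrative only)."""
import zlib

NUM_SLOTS = 16384  # number of hash slots in a Redis Cluster


def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XMODEM variant: polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Slot of a key; hash tags ({...}) are ignored in this sketch."""
    return crc16_xmodem(key.encode()) % NUM_SLOTS


def shard_node(key: str, num_nodes: int) -> int:
    # Assumption: slots are split into equal contiguous ranges, one per node.
    return hash_slot(key) * num_nodes // NUM_SLOTS


def unaligned_task(key: str, num_nodes: int) -> int:
    # Assumption: the SPS partitions by an unrelated hash; task i runs on node i.
    return zlib.crc32(key.encode()) % num_nodes


def aligned_task(key: str, num_nodes: int) -> int:
    # Route each record to the task co-located with the shard that owns its key.
    return shard_node(key, num_nodes)


def remote_fraction(keys, partitioner, num_nodes: int) -> float:
    """Fraction of keys whose state access crosses nodes."""
    remote = sum(
        1 for k in keys if partitioner(k, num_nodes) != shard_node(k, num_nodes)
    )
    return remote / len(keys)


# Hypothetical keys, chosen only to have something to hash.
KEYS = [f"vehicle:{i}" for i in range(1000)]

if __name__ == "__main__":
    for name, part in (("unaligned", unaligned_task), ("aligned", aligned_task)):
        print(name, remote_fraction(KEYS, part, num_nodes=4))
```

In this model the aligned partitioner never issues a remote state access by construction, while the unaligned one sends most accesses to another node when the two hashes are unrelated. Real deployments also have to handle shard migration and rescaling, which this sketch leaves out.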

Information & Contributors

Information

Published In

UCC '21: Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing
December 2021
214 pages
ISBN:9781450385640
DOI:10.1145/3468737
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

In-Cooperation

  • CIMPA: International Center for Pure and Applied Mathematics

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data locality
  2. data stream processing
  3. scalable cloud services

Qualifiers

  • Research-article

Conference

UCC '21

Acceptance Rates

UCC '21 Paper Acceptance Rate 21 of 62 submissions, 34%;
Overall Acceptance Rate 38 of 125 submissions, 30%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 328
  • Downloads (Last 12 months): 81
  • Downloads (Last 6 weeks): 9

Reflects downloads up to 18 Nov 2024
