InfiniStore: Elastic Serverless Cloud Storage

Published: 01 March 2023

Abstract

Cloud object storage such as AWS S3 is cost-effective and highly elastic but relatively slow, while high-performance cloud storage such as AWS ElastiCache is expensive and provides limited elasticity. We present a new cloud storage service called ServerlessMemory, which stores data in the memory of serverless functions. ServerlessMemory employs a sliding-window-based memory management strategy, inspired by the garbage collection mechanisms used in programming languages, to effectively segregate hot and cold data. It provides fine-grained elasticity, good performance, and a pay-per-access cost model with extremely low cost.
We then design and implement InfiniStore, a persistent and elastic cloud storage system that seamlessly couples the function-based ServerlessMemory layer with a persistent, inexpensive cloud object store layer. InfiniStore provides durability despite function failures by using a fast parallel recovery scheme built on the auto-scaling functionality of a FaaS (Function-as-a-Service) platform. We evaluate InfiniStore extensively using both microbenchmarks and two real-world applications. Results show that InfiniStore offers greater performance benefits than AWS ElastiCache and Anna for objects larger than 10 MB, and that it achieves 26.25% and 97.24% tenant-side cost reduction compared to InfiniCache and ElastiCache, respectively.
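
To make the idea of sliding-window hot/cold segregation over a persistent backing layer more concrete, the following is a minimal toy sketch. It is not the authors' algorithm or implementation: the class name `SlidingWindowStore`, the `window_size` parameter, the explicit `advance()` call, and the use of plain in-process dictionaries in place of serverless function memory and a cloud object store are all assumptions made purely for illustration.

```python
from collections import OrderedDict


class SlidingWindowStore:
    """Toy two-tier store (hypothetical): recent buckets stand in for the
    memory layer, and a plain dict stands in for the persistent object store."""

    def __init__(self, window_size=2):
        self.window_size = window_size   # number of buckets kept in memory
        self.buckets = OrderedDict()     # epoch -> {key: value}, oldest first
        self.cold = {}                   # stand-in for the persistent layer
        self.epoch = 0
        self.buckets[self.epoch] = {}

    def put(self, key, value):
        # Write through to the persistent layer so evictions never lose data.
        self.cold[key] = value
        # Drop stale copies held in older buckets, then store in the newest.
        for bucket in self.buckets.values():
            bucket.pop(key, None)
        self.buckets[self.epoch][key] = value

    def get(self, key):
        # Hot path: the key is still inside the window.
        for bucket in self.buckets.values():
            if key in bucket:
                value = bucket.pop(key)
                # Promote recently accessed data to the newest bucket.
                self.buckets[self.epoch][key] = value
                return value, "memory"
        # Cold path: fall back to the persistent layer (KeyError if unknown).
        value = self.cold[key]
        self.buckets[self.epoch][key] = value
        return value, "cold"

    def advance(self):
        # Slide the window: open a new bucket and release the oldest ones.
        self.epoch += 1
        self.buckets[self.epoch] = {}
        while len(self.buckets) > self.window_size:
            self.buckets.popitem(last=False)


store = SlidingWindowStore(window_size=2)
store.put("a", 1)
store.put("b", 2)
store.advance()
hit_a = store.get("a")    # "a" is accessed, so it moves to the newest bucket
store.advance()           # the oldest bucket, still holding "b", is released
miss_b = store.get("b")   # "b" has to come from the persistent stand-in
hit_a_again = store.get("a")
```

In this sketch, data that keeps being accessed migrates forward with the window and stays in memory, while untouched data ages out and is served from the slower layer on its next access, which is the general hot/cold separation the abstract describes.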


Cited By

  • (2024) MinFlow. Proceedings of the 22nd USENIX Conference on File and Storage Technologies, 311-328. DOI: 10.5555/3650697.3650716. Online publication date: 27-Feb-2024
  • (2024) FaaSKeeper: Learning from Building Serverless Services with ZooKeeper as an Example. Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 94-108. DOI: 10.1145/3625549.3658661. Online publication date: 3-Jun-2024
  • (2023) λFS: A Scalable and Elastic Distributed File System Metadata Service using Serverless Functions. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, 394-411. DOI: 10.1145/3623278.3624765. Online publication date: 25-Mar-2023


Published In

Proceedings of the VLDB Endowment, Volume 16, Issue 7
March 2023
203 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 March 2023
Published in PVLDB Volume 16, Issue 7

Qualifiers

  • Research-article
