research-article

Phoebe: a learning-based checkpoint optimizer

Authors:

Matteo Interlandi,

Krishnadhan Das,

Alekh JindalAuthors Info & Claims

Proceedings of the VLDB Endowment, Volume 14, Issue 11

Pages 2505 - 2518

https://doi.org/10.14778/3476249.3476298

Published: 01 July 2021 Publication History

Abstract

Easy-to-use programming interfaces paired with cloud-scale processing engines have enabled big data system users to author arbitrarily complex analytical jobs over massive volumes of data. However, as the complexity and scale of analytical jobs increase, they encounter a number of unforeseen problems, hotspots with large intermediate data on temporary storage, longer job recovery time after failures, and worse query optimizer estimates being examples of issues that we are facing at Microsoft.

To address these issues, we propose Phoebe, an efficient learning-based checkpoint optimizer. Given a set of constraints and an objective function at compile-time, Phoebe is able to determine the decomposition of job plans, and the optimal set of checkpoints to preserve their outputs to durable global storage. Phoebe consists of three machine learning predictors and one optimization module. For each stage of a job, Phoebe makes accurate predictions for: (1) the execution time, (2) the output size, and (3) the start/end time taking into account the inter-stage dependencies. Using these predictions, we formulate checkpoint optimization as an integer programming problem and propose a scalable heuristic algorithm that meets the latency requirement of the production environment.

We demonstrate the effectiveness of Phoebe in production workloads, and show that we can free the temporary storage on hotspots by more than 70% and restart failed jobs 68% faster on average with minimum performance impact. Phoebe also illustrates that adding multiple sets of checkpoints is not cost-efficient, which dramatically reduces the complexity of the optimization.

References

[1]

Zeeshan Ahmed, Saeed Amizadeh, Mikhail Bilenko, Rogan Carr, Wei-Sheng Chin, Yael Dekel, Xavier Dupre, Vadim Eksarevskiy, Senja Filipi, Tom Finley, et al. 2019. Machine Learning at Microsoft with ML. NET. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2448--2458.

Digital Library

[2]

Mert Akdere, Ugur Cetintemel, Matteo Riondato, Eli Upfal, and Stanley B Zdonik. 2012. Learning-based query performance modeling and prediction. In 2012 IEEE 28th International Conference on Data Engineering. IEEE, 390--401.

Digital Library

[3]

Amazon.com, Inc. 2020. Amazon Athena. Retrieved July 4, 2020 from hhttps://aws.amazon.com/athena/

[4]

Magdalena Balazinska, Hari Balakrishnan, Samuel Madden, and Michael Stonebraker.2005. Fault-tolerance in the Borealis distributed stream processing system. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data. 13--24.

Digital Library

[5]

Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin. 2003. A neural probabilistic language model. Journal of machine learning research 3, Feb (2003), 1137--1155.

Digital Library

[6]

Christopher M Bishop. 2006. Pattern recognition and machine learning. springer.

Digital Library

[7]

Leo Breiman. 2001. Random forests. Machine learning 45, 1 (2001), 5--32.

Digital Library

[8]

Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. 2013. Integrating scale out and fault tolerance in stream processing using operator state management. In Proceedings of the 2013 ACM SIGMOD international conference on Management of data. 725--736.

Digital Library

[9]

Ronnie Chaiken, Bob Jenkins, Per-Åke Larson, Bill Ramsey, Darren Shakib, Simon Weaver, and Jingren Zhou. 2008. SCOPE: easy and efficient parallel processing of massive data sets. Proceedings of the VLDB Endowment 1, 2 (2008), 1265--1276.

Digital Library

[10]

Ting Chen and Kenjiro Taura. 2013. A selective checkpointing mechanism for query plans in a parallel database system. In 2013 IEEE International Conference on Big Data. IEEE, 237--245.

[11]

Microsoft Corp. 2020. Azure Machine Learning. Retrieved July 4, 2020 from https://azure.microsoft.com/en-us/services/machine-learning/

[12]

Microsoft Corp. 2020. Nimbusml. Retrieved December 14, 2020 from https://docs.microsoft.com/en-us/NimbusML/overview

[13]

Carlo Curino, Subru Krishnan, Konstantinos Karanasos, Sriram Rao, Giovanni M Fumarola, Botong Huang, Kishore Chaliparambil, Arun Suresh, Young Chen, Solom Heddaya, et al. 2019. Hydra: a federated resource manager for data-center scale analytics. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19). 177--192.

Digital Library

[14]

John T Daly. 2006. A higher order estimate of the optimum checkpoint interval for restart dumps. Future generation computer systems 22, 3 (2006), 303--312.

Digital Library

[15]

Francesco Diaz and Roberto Freato. 2018. Azure Data Lake Store and Azure Data Lake Analytics. In Cloud Data Design, Orchestration, and Management Using Microsoft Azure. Springer, 327--392.

[16]

Anshuman Dutt, Chi Wang, Azade Nazi, Srikanth Kandula, Vivek Narasayya, and Surajit Chaudhuri. 2019. Selectivity estimation for range predicates using lightweight models. Proceedings of the VLDB Endowment 12, 9 (2019), 1044--1057.

Digital Library

[17]

J Forrest, T Ralphs, S Vigerske, B LouHafer, Kristjansson, jpfasano, EdwinStraver, M Lubin, HG Santos, and M Saltzman. 2020. coin-or/cbc: Version 2.9.9. Retrieved July 4, 2020 from https://zenodo.org/badge/latestdoi/30382416

[18]

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770--778.

[19]

Benjamin Hilprecht, Andreas Schmidt, Moritz Kulessa, Alejandro Molina, Kristian Kersting, and Carsten Binnig. 2020. DeepDB: Learn from Data, not from Queries! Proceedings of the VLDB Endowment 13, 7 (2020), 992--1005.

Digital Library

[20]

Wei-Ning Hsu, Yu Zhang, Ann Lee, and James Glass. 2016. Exploiting depth and highway connections in convolutional recurrent deep neural networks for speech recognition. cell 50 (2016), 1.

[21]

Alekh Jindal, Konstantinos Karanasos, Sriram Rao, and Hiren Patel. 2018. Selecting subexpressions to materialize at datacenter scale. Proceedings of the VLDB Endowment 11, 7 (2018), 800--812.

Digital Library

[22]

Alekh Jindal, Hiren Patel, Abhishek Roy, Shi Qiao, Zhicheng Yin, Rathijit Sen, and Subru Krishnan. 2019. Peregrine: Workload Optimization for Cloud Query Engines. In Proceedings of the ACM Symposium on Cloud Computing. 416--427.

Digital Library

[23]

Alekh Jindal, Shi Qiao, Hiren Patel, Zhicheng Yin, Jieming Di, Malay Bag, Marc Friedman, Yifung Lin, Konstantinos Karanasos, and Sriram Rao. 2018. Computation reuse in analytics job service at microsoft. In Proceedings of the 2018 International Conference on Management of Data. 191--203.

Digital Library

[24]

Alekh Jindal, Shi Qiao, Rathijit Sen, and Hiren Patel. 2020. Microlearner: A fine-grained Learning Optimizer for Big Data Workloads at Microsoft. Under Submission (2020).

[25]

Sangeetha Abdu Jyothi, Carlo Curino, Ishai Menache, Shravan Matthur Narayanamurthy, Alexey Tumanov, Jonathan Yaniv, Ruslan Mavlyutov, Íñigo Goiri, Subru Krishnan, Janardhan Kulkarni, et al. 2016. Morpheus: Towards automated slos for enterprise clusters. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 117--134.

Digital Library

[26]

Zoi Kaoudi, Jorge-Arnulfo Quiané-Ruiz, Bertty Contreras-Rojas, Rodrigo Pardo-Meza, Anis Troudi, and Sanjay Chawla. 2020. ML-based cross-platform query optimization. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1489--1500.

[27]

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. Lightgbm: A highly efficient gradient boosting decision tree. In Advances in neural information processing systems. 3146--3154.

Digital Library

[28]

Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter A. Boncz, and Alfons Kemper. 2019. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In 9th Biennial Conference on Innovative Data Systems Research.

[29]

YongChul Kwon, Magdalena Balazinska, and Albert Greenberg. 2008. Fault-tolerant stream processing using a distributed, replicated file system. Proceedings of the VLDB Endowment 1, 1 (2008), 574--585.

Digital Library

[30]

Jyoti Leeka and Kaushik Rajan. 2019. Incorporating super-operators in big-data query optimizers. Proceedings of the VLDB Endowment 13, 3 (2019), 348--361.

Digital Library

[31]

Youfu Li, Mingda Li, Ling Ding, and Matteo Interlandi. 2018. RIOS: Runtime Integrated Optimizer for Spark. In Proceedings of the ACM Symposium on Cloud Computing (SoCC '18). 275--287.

Digital Library

[32]

John DC Little. 1961. A proof for the queuing formula: L= λ W. Operations research 9, 3 (1961), 383--387.

Digital Library

[33]

Alberto Marchetti-Spaccamela and Carlo Vercellis. 1995. Stochastic on-line knapsack problems. Mathematical Programming 68, 1--3 (1995), 73--104.

Digital Library

[34]

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, and Nesime Tatbul. 2019. Neo: A learned query optimizer. Proceedings of the VLDB Endowment 12, 11 (2019), 1705--1718.

Digital Library

[35]

Ryan Marcus and Olga Papaemmanouil. 2019. Plan-structured deep neural network models for query performance prediction. Proceedings of the VLDB Endowment 12, 11 (2019), 1733--1746.

Digital Library

[36]

Guido Moerkotte, Thomas Neumann, and Gabriele Steidl. 2009. Preventing bad plans by bounding the impact of cardinality estimation errors. Proceedings of the VLDB Endowment 2, 1 (2009), 982--993.

Digital Library

[37]

Lili Mou, Ge Li, Lu Zhang, Tao Wang, and Zhi Jin. 2016. Convolutional neural networks over tree structures for programming language processing. In Thirtieth AAAI Conference on Artificial Intelligence.

Digital Library

[38]

Parimarjan Negi, Ryan Marcus, Hongzi Mao, Nesime Tatbul, Tim Kraska, and Mohammad Alizadeh. 2020. Cost-guided cardinality estimation: Focus where it matters. In 2020 IEEE 36th International Conference on Data Engineering Workshops (ICDEW). IEEE, 154--157.

[39]

Thomas Neumann and Cesar Galindo-Legaria. 2013. Taking the edge off cardinality estimation errors using incremental execution. Datenbanksysteme für Business, Technologie und Web (BTW) 2019 (2013).

[40]

Jennifer Ortiz, Magdalena Balazinska, Johannes Gehrke, and S Sathiya Keerthi. 2018. Learning state representations for query optimization with deep reinforcement learning. In Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning. 1--4.

Digital Library

[41]

Laurent Perron and Vincent Furnon. [n.d.]. OR-Tools. Google. https://developers.google.com/optimization/

[42]

Raghu Ramakrishnan, Baskar Sridharan, John R Douceur, Pavan Kasturi, Balaji Krishnamachari-Sampath, Karthick Krishnamoorthy, Peng Li, Mitica Manu, Spiro Michaylov, Rogério Ramos, et al. 2017. Azure data lake store: a hyperscale distributed file service for big data analytics. In Proceedings of the 2017 ACM International Conference on Management of Data. 51--63.

Digital Library

[43]

Abdallah Salama, Carsten Binnig, Tim Kraska, and Erfan Zamanian. 2015. Cost-based fault-tolerance for parallel data processing. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. 285--297.

Digital Library

[44]

Aurobindo Sarkar and Amit Shah. 2018. Learning AWS: Design, build, and deploy responsive applications using AWS Cloud components. Packt Publishing Ltd.

[45]

Liqun Shao, Yiwen Zhu, Siqi Liu, Abhiram Eswaran, Kristin Lieber, Janhavi Mahajan, Minsoo Thigpen, Sudhir Darbha, Subru Krishnan, Soundar Srinivasan, et al. 2019. Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms. In Proceedings of the ACM Symposium on Cloud Computing. 441--452.

Digital Library

[46]

Prateek Sharma, Tian Guo, Xin He, David Irwin, and Prashant Shenoy. 2016. Flint: Batch-interactive data-intensive processing on transient servers. In Proceedings of the Eleventh European Conference on Computer Systems. 1--15.

Digital Library

[47]

Tarique Siddiqui, Alekh Jindal, Shi Qiao, Hiren Patel, and Wangchao Le. 2020. Cost models for big data query processing: Learning, retrofitting, and our findings. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 99--113.

Digital Library

[48]

Vikram Singh and SK Jain. 2016. A progressive query materialization for interactive data exploration. In Proceeding of 1st International Workshop Social Data Analytics and Management (SoDAM'2016) Co-located at 44th VLDB. 1--10.

[49]

Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy. 2009. Hive: a warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment 2, 2 (2009), 1626--1629.

Digital Library

[50]

Jordan Tigani and Siddartha Naidu. 2014. Google BigQuery Analytics. John Wiley & Sons.

Digital Library

[51]

Edward John Triou, Fei Xu, Hiren Patel, Jingren Zhou, et al. 2020. Analyzing multiple data streams as a single data object. US Patent 10,565,208.

[52]

Prasang Upadhyaya, YongChul Kwon, and Magdalena Balazinska. 2011. A latency and fault-tolerance optimizer for online parallel query plans. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. 241--252.

Digital Library

[53]

Vinod Kumar Vavilapalli, Arun C Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar, Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, et al. 2013. Apache hadoop yarn: Yet another resource negotiator. In Proceedings of the 4th annual Symposium on Cloud Computing. ACM, 5.

Digital Library

[54]

Herman van Hövell Wenchen Fan and MaryAnn Xue. 2020. Adaptive query execution in Apache Spark. Retrieved Jan 20, 2021 from https://databricks.com/blog/2020/05/29/adaptive-query-execution-speeding-up-spark-sql-at-runtime.html

[55]

Wikipedia. 2020. Topological Sorting. Retrieved July 4, 2020 from https://en.wikipedia.org/wiki/Topological_sorting

[56]

Chenggang Wu, Alekh Jindal, Saeed Amizadeh, Hiren Patel, Wangchao Le, Shi Qiao, and Sriram Rao. 2018. Towards a learning optimizer for shared clouds. Proceedings of the VLDB Endowment 12, 3 (2018), 210--222.

Digital Library

[57]

Ying Yan, Yanjie Gao, Yang Chen, Zhongxin Guo, Bole Chen, and Thomas Moscibroda. 2016. Tr-spark: Transient computing for big data analytics. In Proceedings of the Seventh ACM Symposium on Cloud Computing. 484--496.

Digital Library

[58]

Christopher Yang, Christine Yen, Ceryen Tan, and Samuel R Madden. 2010. Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database. In 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010). IEEE, 657--668.

[59]

Zongheng Yang, Amog Kamsetty, Sifei Luan, Eric Liang, Yan Duan, Xi Chen, and Ion Stoica. 2020. NeuroCard. Proceedings of the VLDB Endowment 14, 1 (Sep 2020), 61--73.

Digital Library

[60]

Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauly, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2012. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12). USENIX Association, 15--28.

Digital Library

[61]

Matei Zaharia, Mosharaf Chowdhury, Michael J Franklin, Scott Shenker, Ion Stoica, et al. 2010. Spark: Cluster computing with working sets. HotCloud 10, 10--10 (2010), 95.

Digital Library

[62]

Jingren Zhou, Nicolas Bruno, Ming-Chuan Wu, Per-Ake Larson, Ronnie Chaiken, and Darren Shakib. 2012. SCOPE: parallel databases meet MapReduce. The VLDB Journal 21, 5 (2012), 611--636.

Digital Library

Cited By

Zhu YSen RHorton RAgosta J(2023)Runtime Variation in Big Data AnalyticsProceedings of the ACM on Management of Data10.1145/35889211:1(1-20)Online publication date: 30-May-2023
https://dl.acm.org/doi/10.1145/3588921
Zhu YTian YCahoon JKrishnan SAgarwal AAlotaibi RCamacho-Rodriguez JChundatt BChung ADutta NFogarty AGruenheid AHaynes BInterlandi MIyer MJurgens NKhushalani SKroth BKumar MLeeka JMatusevych SMittal MMueller AMuthyala KNagulapalli HPark YPatel HPavlenko APoppe ORavindran SSaur KSen RSuh STarafdar AWaghray KWang DCurino CRamakrishnan RDas SPandis ISelçuk Candan KAmer-Yahia S(2023)Towards Building Autonomous Data Services on AzureCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589674(217-224)Online publication date: 4-Jun-2023
https://dl.acm.org/doi/10.1145/3555041.3589674
Lyu CFan QSong FSinha ADiao YChen WMa LFeng YLi YZeng KZhou J(2022)Fine-grained modeling and optimization for intelligent resource management in big data processingProceedings of the VLDB Endowment10.14778/3551793.355185515:11(3098-3111)Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.14778/3551793.3551855
Show More Cited By

Index Terms

Phoebe: a learning-based checkpoint optimizer

Index terms have been assigned to the content through auto-classification.

Recommendations

A Hybrid Approach to Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation

The current parallel storage systems use thousands of inexpensive disks to meet the storage requirement of applications. Data redundancy and/or coding are used to enhance data availability, for instance, Row-diagonal parity (RDP) and EVENODD codes, ...
Higher reliability redundant disk arrays: Organization, operation, and coding

Parity is a popular form of data protection in redundant arrays of inexpensive/independent disks (RAID). RAID5 dedicates one out of N disks to parity to mask single disk failures, that is, the contents of a block on a failed disk can be reconstructed by ...
HPDA: A hybrid parity-based disk array for enhanced performance and reliability

Flash-based Solid State Drive (SSD) has been productively shipped and deployed in large scale storage systems. However, a single flash-based SSD cannot satisfy the capacity, performance and reliability requirements of the modern storage systems that ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment

Proceedings of the VLDB Endowment Volume 14, Issue 11

July 2021

732 pages

ISSN:2150-8097

Editors:
Xin Luna Dong
Amazon
,
Felix Naumann
HPI, University of Potsdam

Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 July 2021

Published in PVLDB Volume 14, Issue 11

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

5
Total Citations
View Citations
35
Total Downloads

Downloads (Last 12 months)7
Downloads (Last 6 weeks)0

Reflects downloads up to 12 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Zhu YSen RHorton RAgosta J(2023)Runtime Variation in Big Data AnalyticsProceedings of the ACM on Management of Data10.1145/35889211:1(1-20)Online publication date: 30-May-2023
https://dl.acm.org/doi/10.1145/3588921
Zhu YTian YCahoon JKrishnan SAgarwal AAlotaibi RCamacho-Rodriguez JChundatt BChung ADutta NFogarty AGruenheid AHaynes BInterlandi MIyer MJurgens NKhushalani SKroth BKumar MLeeka JMatusevych SMittal MMueller AMuthyala KNagulapalli HPark YPatel HPavlenko APoppe ORavindran SSaur KSen RSuh STarafdar AWaghray KWang DCurino CRamakrishnan RDas SPandis ISelçuk Candan KAmer-Yahia S(2023)Towards Building Autonomous Data Services on AzureCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589674(217-224)Online publication date: 4-Jun-2023
https://dl.acm.org/doi/10.1145/3555041.3589674
Lyu CFan QSong FSinha ADiao YChen WMa LFeng YLi YZeng KZhou J(2022)Fine-grained modeling and optimization for intelligent resource management in big data processingProceedings of the VLDB Endowment10.14778/3551793.355185515:11(3098-3111)Online publication date: 1-Jul-2022
https://dl.acm.org/doi/10.14778/3551793.3551855
Zhang WInterlandi MMineiro PQiao SGhazanfari NLie KFriedman MHosn RPatel HJindal AIves ZBonifati AEl Abbadi A(2022)Deploying a Steered Query Optimizer in Production at MicrosoftProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3526052(2299-2311)Online publication date: 10-Jun-2022
https://dl.acm.org/doi/10.1145/3514221.3526052
Jindal AInterlandi M(2021)Machine learning for cloud data systemsProceedings of the VLDB Endowment10.14778/3476311.347640814:12(3202-3205)Online publication date: 28-Oct-2021
https://dl.acm.org/doi/10.14778/3476311.3476408

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents