Research article. DOI: 10.1145/3358960.3379137

Sampling Effect on Performance Prediction of Configurable Systems: A Case Study

Published: 20 April 2020

Abstract

Numerous software systems are highly configurable and provide a myriad of configuration options that users can tune to fit their functional and performance requirements (e.g., execution time). Measuring all configurations of a system is the most obvious way to understand the effect of options and their interactions, but it is too costly or infeasible in practice. Numerous works thus propose to measure only a few configurations (a sample) to learn and predict the performance of any combination of options' values. A challenging issue is to sample a small and representative set of configurations that yields accurate performance prediction models. A recent study devised a new algorithm, called distance-based sampling, that achieves state-of-the-art prediction accuracy on different subject systems. In this paper, we replicate this study through an in-depth analysis of x264, a popular and configurable video encoder. We systematically measure all 1,152 configurations of x264 with 17 input videos and two quantitative properties (encoding time and encoding size). Our goal is to understand whether there is a dominant sampling strategy over the very same subject system (x264), i.e., whatever the workload and targeted performance properties. The findings from this study show that random sampling leads to more accurate performance models. However, when random sampling is set aside, there is no single "dominant" sampling strategy; instead, different strategies perform best on different inputs and non-functional properties, further challenging practitioners and researchers.
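The sample-then-predict workflow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four binary options, the synthetic `measure` function standing in for a real measurement, and the nearest-neighbour predictor are hypothetical and are not the paper's subject system, learning method, or results.

```python
import itertools
import random


def measure(config):
    """Hypothetical stand-in for a real measurement (e.g. encoding time).

    Synthetic formula: a base cost, per-option effects, and one interaction.
    """
    a, b, c, d = config
    return 10.0 + 4.0 * a + 2.0 * b + 1.0 * c + 3.0 * a * d


def hamming(x, y):
    """Number of options on which two configurations differ."""
    return sum(xi != yi for xi, yi in zip(x, y))


def predict(config, measured):
    """Average the measured values of the sampled configurations nearest to `config`."""
    best = min(hamming(config, s) for s in measured)
    nearest = [v for s, v in measured.items() if hamming(config, s) == best]
    return sum(nearest) / len(nearest)


def mape(configs, measured):
    """Mean absolute percentage error of the predictor over all configurations."""
    errors = [abs(predict(c, measured) - measure(c)) / measure(c) for c in configs]
    return 100.0 * sum(errors) / len(errors)


def random_sample_error(sample_size, seed=0):
    """Measure a uniform random sample, then report the error over the whole space."""
    configs = list(itertools.product([0, 1], repeat=4))  # toy space of 16 configurations
    rng = random.Random(seed)
    sample = rng.sample(configs, sample_size)
    measured = {c: measure(c) for c in sample}
    return mape(configs, measured)


if __name__ == "__main__":
    for size in (2, 4, 8, 16):
        print(size, round(random_sample_error(size), 2))
```

Measuring the whole toy space (sample size 16) gives zero error by construction, since every configuration is then its own nearest neighbour; smaller samples trade measurement cost against prediction error, which is the trade-off a sampling strategy tries to manage.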

Published In

ICPE '20: Proceedings of the ACM/SPEC International Conference on Performance Engineering
April 2020
319 pages
ISBN:9781450369916
DOI:10.1145/3358960
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. configurable systems
  2. machine learning
  3. performance prediction
  4. software product lines

Qualifiers

  • Research-article

Funding Sources

  • Agence Nationale de la Recherche

Conference

ICPE '20

Acceptance Rates

ICPE '20 Paper Acceptance Rate 15 of 62 submissions, 24%;
Overall Acceptance Rate 252 of 851 submissions, 30%

Article Metrics

  • Downloads (Last 12 months): 33
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 22 Dec 2024

Cited By

  • (2024) Towards Automated Configuration Documentation. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 2256-2261. DOI: 10.1145/3691620.3695311. Online publication date: 27-Oct-2024
  • (2024) Optimization Space Learning: A Lightweight, Noniterative Technique for Compiler Autotuning. Proceedings of the 28th ACM International Systems and Software Product Line Conference, pp. 36-46. DOI: 10.1145/3646548.3672588. Online publication date: 2-Sep-2024
  • (2024) Predicting Configuration Performance in Multiple Environments with Sequential Meta-Learning. Proceedings of the ACM on Software Engineering 1:FSE, pp. 359-382. DOI: 10.1145/3643743. Online publication date: 12-Jul-2024
  • (2024) Embracing Deep Variability For Reproducibility and Replicability. Proceedings of the 2nd ACM Conference on Reproducibility and Replicability, pp. 30-35. DOI: 10.1145/3641525.3663621. Online publication date: 18-Jun-2024
  • (2024) Learning input-aware performance models of configurable systems: An empirical evaluation. Journal of Systems and Software 208, article 111883. DOI: 10.1016/j.jss.2023.111883. Online publication date: Feb-2024
  • (2024) Software product line testing: a systematic literature review. Empirical Software Engineering 29:6. DOI: 10.1007/s10664-024-10516-x. Online publication date: 2-Sep-2024
  • (2024) VaryMinions: leveraging RNNs to identify variants in variability-intensive systems’ logs. Empirical Software Engineering 29:4. DOI: 10.1007/s10664-024-10473-5. Online publication date: 15-Jun-2024
  • (2023) Specialization of Run-time Configuration Space at Compile-time: An Exploratory Study. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, pp. 1459-1468. DOI: 10.1145/3555776.3578613. Online publication date: 27-Mar-2023
  • (2023) Analysing the Impact of Workloads on Modeling the Performance of Configurable Software Systems. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pp. 2085-2097. DOI: 10.1109/ICSE48619.2023.00176. Online publication date: May-2023
  • (2023) CoMSA: A Modeling-Driven Sampling Approach for Configuration Performance Testing. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 1352-1363. DOI: 10.1109/ASE56229.2023.00091. Online publication date: 11-Sep-2023
