DOI: 10.1145/3106237.3106238

Using bad learners to find good configurations

Published: 21 August 2017

Abstract

Finding the optimally performing configuration of a software system for a given setting is often challenging. Recent approaches address this challenge by learning performance models based on a sample set of configurations. However, building an accurate performance model can be very expensive (and is often infeasible in practice). The central insight of this paper is that exact performance values (e.g., the response time of a software system) are not required to rank configurations and to identify the optimal one. As shown by our experiments, performance models that are cheap to learn but inaccurate (with respect to the difference between actual and predicted performance) can still be used to rank configurations and hence to find the optimal configuration. This novel rank-based approach allows us to significantly reduce the cost (in terms of the number of measurements of sample configurations) as well as the time required to build performance models. We evaluate our approach with 21 scenarios based on 9 software systems and demonstrate that it is beneficial in 16 scenarios; for the remaining 5 scenarios, an accurate model can be built with very few samples anyway, without the need for a rank-based approach.
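To make the central insight concrete, the sketch below is a minimal illustration, not the authors' actual procedure: the synthetic perf function, the 500-point space of 10 binary options, and all parameter choices are assumptions made for the example. It trains a regression tree on a small sample of measured configurations and contrasts the model's absolute accuracy (MAPE) with its ranking quality (Spearman correlation between predicted and true performance).

    # Minimal sketch (illustrative only, not the paper's exact method):
    # a cheap, inaccurate performance model can still rank configurations.
    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)

    # Hypothetical configuration space: 500 configurations of 10 binary options.
    configs = rng.integers(0, 2, size=(500, 10))

    def perf(c):
        # Stand-in for a real measurement (e.g., response time; lower is
        # better), with main effects, one option interaction, and noise.
        return 50.0 + 30 * c[0] + 20 * c[1] * c[2] - 10 * c[3] + rng.normal(0, 1)

    y = np.array([perf(c) for c in configs])

    # "Cheap" model: train on only 30 measured configurations.
    train = rng.choice(len(configs), size=30, replace=False)
    model = DecisionTreeRegressor(min_samples_leaf=5, random_state=0)
    model.fit(configs[train], y[train])
    pred = model.predict(configs)

    mape = np.mean(np.abs((y - pred) / y)) * 100  # absolute prediction error
    rho, _ = spearmanr(y, pred)                   # rank agreement

    print(f"MAPE: {mape:.1f}%, Spearman rank correlation: {rho:.2f}")
    # True rank (0 = best) of the configuration the model predicts to be best:
    print("true rank of predicted-best config:",
          int(np.argsort(np.argsort(y))[np.argmin(pred)]))

Under these assumptions, the tree's absolute error is largely immaterial: what matters for configuration optimization is only that the predicted ordering agrees with the true ordering near the top, which is exactly the property a rank-based approach measures and exploits.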




Published In

ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
August 2017
1073 pages
ISBN: 978-1-4503-5105-8
DOI: 10.1145/3106237


Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. Performance Prediction
  2. Rank-based method
  3. SBSE
  4. Sampling

Qualifiers

  • Research-article


Conference

ESEC/FSE'17

Acceptance Rates

Overall Acceptance Rate: 112 of 543 submissions, 21%


Article Metrics

  • Downloads (Last 12 months): 81
  • Downloads (Last 6 weeks): 13
Reflects downloads up to 26 Sep 2024


Cited By

  • Predicting Configuration Performance in Multiple Environments with Sequential Meta-Learning. Proceedings of the ACM on Software Engineering 1(FSE), 359–382 (2024). DOI: 10.1145/3643743
  • A Demonstration of End-User Code Customization Using Generative AI. Proceedings of the 18th International Working Conference on Variability Modelling of Software-Intensive Systems, 139–145 (2024). DOI: 10.1145/3634713.3634732
  • Learning input-aware performance models of configurable systems: An empirical evaluation. Journal of Systems and Software 208, 111883 (2024). DOI: 10.1016/j.jss.2023.111883
  • VaryMinions: leveraging RNNs to identify variants in variability-intensive systems' logs. Empirical Software Engineering 29(4) (2024). DOI: 10.1007/s10664-024-10473-5
  • Cost-sensitive precomputation of real-time-aware reconfiguration strategies based on stochastic priced timed games. Software and Systems Modeling (2024). DOI: 10.1007/s10270-024-01195-9
  • Validation Framework of a Digital Twin: A System Identification Approach. INCOSE International Symposium 34(1), 249–267 (2024). DOI: 10.1002/iis2.13145
  • Finding Near-optimal Configurations in Colossal Spaces with Statistical Guarantees. ACM Transactions on Software Engineering and Methodology 33(1), 1–36 (2023). DOI: 10.1145/3611663
  • IoPV: On Inconsistent Option Performance Variations. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 845–857 (2023). DOI: 10.1145/3611643.3616319
  • Generative AI for Reengineering Variants into Software Product Lines. Proceedings of the 27th ACM International Systems and Software Product Line Conference - Volume B, 57–66 (2023). DOI: 10.1145/3579028.3609016
  • What Not to Test (For Cyber-Physical Systems). IEEE Transactions on Software Engineering 49(7), 3811–3826 (2023). DOI: 10.1109/TSE.2023.3272309
