Predicting relevance of change recommendations

Authors:

Thomas Rolfsnes,

David Binkley

ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering

Pages 694 - 705

Published: 30 October 2017

Abstract

Software change recommendation seeks to suggest artifacts (e.g., files or methods) that are related to changes made by a developer, and thus identifies possible omissions or next steps. While one obvious challenge for recommender systems is to produce accurate recommendations, a complementary challenge is to rank recommendations based on their relevance. In this paper, we address this challenge for recommendation systems that are based on evolutionary coupling. Such systems use targeted association-rule mining to identify relevant patterns in a software system's change history. Traditionally, this process involves ranking artifacts using interestingness measures such as confidence and support. However, these measures often fall short when used to assess recommendation relevance.
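As an illustration of the two interestingness measures named above, the following minimal sketch computes support and confidence for a single association rule over a toy change history. It assumes the common textbook definitions (support as the fraction of transactions containing both sides of the rule, confidence as the fraction of antecedent-containing transactions that also contain the consequent). The function name `support_and_confidence` and the file names are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support_and_confidence(
    history: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent.

    Each element of `history` is one transaction (e.g., the set of files
    changed together in a commit). Definitions are an assumption: support
    is a fraction of all transactions, not an absolute count.
    """
    lhs = frozenset(antecedent)
    both = lhs | frozenset(consequent)
    n_lhs = sum(1 for t in history if lhs <= t)    # transactions containing the antecedent
    n_both = sum(1 for t in history if both <= t)  # transactions containing both sides
    support = n_both / len(history) if history else 0.0
    confidence = n_both / n_lhs if n_lhs else 0.0
    return support, confidence


# Hypothetical change history: four commits over three files.
history = [
    frozenset({"a.c", "b.c"}),
    frozenset({"a.c", "b.c", "c.c"}),
    frozenset({"a.c"}),
    frozenset({"c.c"}),
]

# Rule "a.c changed -> b.c also changes": support 2/4, confidence 2/3.
support, confidence = support_and_confidence(history, {"a.c"}, {"b.c"})
print(support, confidence)
```

Ranking candidate artifacts by such a score uses only co-occurrence counts for the rule itself. It takes no account of how earlier recommendations turned out.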

We propose the use of random forest classification models to assess recommendation relevance. This approach improves on past use of various interestingness measures by learning from previous change recommendations. We empirically evaluate our approach on fourteen open source systems and two systems from our industry partners. Furthermore, we consider complementing two mining algorithms: Co-Change and Tarmaq. The results show that random forest classification significantly outperforms previous approaches, receives lower Brier scores, and has a superior trade-off between precision and recall. The results are consistent across software systems and mining algorithms.
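The evaluation measures mentioned above can be sketched in a few lines. The example below assumes the standard definition of the Brier score (mean squared difference between predicted probability and the 0/1 outcome, lower is better) and computes precision and recall at a fixed probability threshold. The helper names, the threshold, and the sample data are hypothetical, and this is not the paper's evaluation code.

```python
from typing import Sequence, Tuple


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)


def precision_recall(
    probabilities: Sequence[float], outcomes: Sequence[int], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall when predictions >= threshold count as 'relevant'."""
    predicted = [p >= threshold for p in probabilities]
    tp = sum(1 for pred, o in zip(predicted, outcomes) if pred and o == 1)
    fp = sum(1 for pred, o in zip(predicted, outcomes) if pred and o == 0)
    fn = sum(1 for pred, o in zip(predicted, outcomes) if not pred and o == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical relevance predictions for three recommendations,
# with 1 meaning the recommendation turned out to be relevant.
predicted = [0.9, 0.2, 0.6]
actual = [1, 0, 0]

print(brier_score(predicted, actual))       # (0.01 + 0.04 + 0.36) / 3
print(precision_recall(predicted, actual))  # one true positive, one false positive
```

Varying `threshold` trades precision against recall, which is the trade-off the abstract refers to.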

Cited By

Isemoto K, Kobayashi T, Hayashi S, Rastogi A, Tufano R, Bavota G, Arnaoudova V, Haiduc S (2022). Revisiting the effect of branch handling strategies on change recommendation. Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, pp. 162-172. doi:10.1145/3524610.3527870. Online publication date: 16-May-2022
https://dl.acm.org/doi/10.1145/3524610.3527870

Lyu Y, Li H, Sayagh M, Jiang Z, Hassan A (2021). An Empirical Study of the Impact of Data Splitting Decisions on the Performance of AIOps Solutions. ACM Transactions on Software Engineering and Methodology 30:4, pp. 1-38. doi:10.1145/3447876. Online publication date: 23-Jul-2021
https://dl.acm.org/doi/10.1145/3447876

Published In

ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering

October 2017

1033 pages

ISBN: 9781538626849

General Chair: Grigore Rosu (University of Illinois at Urbana-Champaign, USA)

Program Chairs: Massimiliano Di Penta (University of Sannio, Italy), Tien N. Nguyen (University of Texas at Dallas, USA)

Publisher

IEEE Press

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
