
DOI: 10.1145/3338906.3338947

Assessing the quality of the steps to reproduce in bug reports

Published: 12 August 2019

Abstract

A major problem with user-written bug reports, reported by developers and documented by researchers, is the frequently low quality of the steps to reproduce the bugs. Low-quality steps to reproduce lead to excessive manual effort during bug triage and resolution. This paper proposes Euler, an approach that automatically identifies and assesses the quality of the steps to reproduce in a bug report and gives reporters feedback they can use to improve the report. The feedback provided by Euler was assessed by external evaluators; the results indicate that Euler correctly identified 98% of the existing steps to reproduce and 58% of the missing ones, and that 73% of its quality annotations are correct.
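
To make the task concrete, the sketch below shows a minimal, heuristic way to spot sentences that look like steps to reproduce in a bug report and to flag reports that appear to lack them. This is not Euler's actual technique; the verb list, the looks_like_step helper, and the example report are illustrative assumptions only.

```python
import re

# Illustrative assumption: a small set of imperative verbs that often start
# user-written reproduction steps ("Open the app", "Click Save", ...).
STEP_VERBS = {"open", "click", "tap", "go", "select", "enter", "type",
              "press", "launch", "navigate", "scroll", "install", "run"}

def looks_like_step(line: str) -> bool:
    """Heuristic: treat a line as a step to reproduce if, after removing a
    leading list marker, it starts with an imperative action verb."""
    text = re.sub(r"^(\d+[.)]|[-*])\s*", "", line.strip())
    words = text.split()
    return bool(words) and words[0].lower() in STEP_VERBS

def assess_report(report: str) -> dict:
    """Collect candidate steps and produce simple feedback when none are
    found. Purely a sketch, not the paper's method."""
    lines = [l for l in report.splitlines() if l.strip()]
    steps = [l.strip() for l in lines if looks_like_step(l)]
    return {
        "candidate_steps": steps,
        "has_steps_to_reproduce": bool(steps),
        "feedback": None if steps else
            "No steps to reproduce detected; consider adding a numbered list "
            "of the actions needed to trigger the bug.",
    }

if __name__ == "__main__":
    example = """The app crashes when saving.
1. Open the app and go to Settings.
2. Tap 'Export data' and press Save.
Expected: a file is written. Actual: the app crashes."""
    print(assess_report(example))
```

A real tool, Euler included, relies on much richer textual and dynamic analysis; the sketch only illustrates what identifying steps to reproduce and flagging missing ones means operationally.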





    Information

    Published In

    ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    August 2019
    1264 pages
    ISBN:9781450355728
    DOI:10.1145/3338906


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 August 2019


    Badges

    • Distinguished Paper

    Author Tags

    1. Bug Report Quality
    2. Dynamic Software Analysis
    3. Textual Analysis

    Qualifiers

    • Research-article

    Conference

    ESEC/FSE '19

    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%


    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 77
    • Downloads (last 6 weeks): 11
    Reflects downloads up to 24 Sep 2024


    Citations

    Cited By

    • (2024) Automating Issue Reporting in Software Testing: Lessons Learned from Using the Template Generator Tool. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 278-282. DOI: 10.1145/3663529.3663847. Online publication date: 10-Jul-2024.
    • (2024) Mobile Bug Report Reproduction via Global Search on the App UI Model. Proceedings of the ACM on Software Engineering 1(FSE), 2656-2676. DOI: 10.1145/3660824. Online publication date: 12-Jul-2024.
    • (2024) Early and Realistic Exploitability Prediction of Just-Disclosed Software Vulnerabilities: How Reliable Can It Be? ACM Transactions on Software Engineering and Methodology 33(6), 1-41. DOI: 10.1145/3654443. Online publication date: 27-Jun-2024.
    • (2024) Feedback-Driven Automated Whole Bug Report Reproduction for Android Apps. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1048-1060. DOI: 10.1145/3650212.3680341. Online publication date: 11-Sep-2024.
    • (2024) How do Hugging Face Models Document Datasets, Bias, and Licenses? An Empirical Study. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, 370-381. DOI: 10.1145/3643916.3644412. Online publication date: 15-Apr-2024.
    • (2024) The NLBSE'24 Tool Competition. Proceedings of the Third ACM/IEEE International Workshop on NL-based Software Engineering, 33-40. DOI: 10.1145/3643787.3648038. Online publication date: 20-Apr-2024.
    • (2024) An Empirical Analysis of Issue Templates Usage in Large-Scale Projects on GitHub. ACM Transactions on Software Engineering and Methodology 33(5), 1-28. DOI: 10.1145/3643673. Online publication date: 3-Jun-2024.
    • (2024) MissConf: LLM-Enhanced Reproduction of Configuration-Triggered Bugs. Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, 484-495. DOI: 10.1145/3639478.3647635. Online publication date: 14-Apr-2024.
    • (2024) Toward Rapid Bug Resolution for Android Apps. Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, 237-241. DOI: 10.1145/3639478.3639812. Online publication date: 14-Apr-2024.
    • (2024) CrashTranslator: Automatically Reproducing Mobile Application Crashes Directly from Stack Trace. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3623298. Online publication date: 20-May-2024.
