
DOI: 10.1145/2591062.2591164

Comparing test quality measures for assessing student-written tests

Published: 31 May 2014

Abstract

Many educators now include software testing activities in programming assignments, so there is a growing demand for appropriate methods of assessing the quality of student-written software tests. While tests can be hand-graded, some educators also use objective performance metrics to assess software tests. The most common measures used at present are code coverage measures—tracking how much of the student’s code (in terms of statements, branches, or some combination) is exercised by the corresponding software tests. Code coverage has limitations, however, and sometimes it overestimates the true quality of the tests. Some researchers have suggested that mutation analysis may provide a better indication of test quality, while some educators have experimented with simply running every student’s test suite against every other student’s program—an “all-pairs” strategy that gives a bit more insight into the quality of the tests. However, it is still unknown which of these measures most closely predicts the true bug-revealing capability of a given test suite. This paper directly compares all three methods of measuring test quality in terms of how well they predict the observed bug-revealing capabilities of student-written tests when run against a naturally occurring collection of student-produced defects. Experimental results show that all-pairs testing—running each student’s tests against every other student’s solution—is the most effective predictor of the underlying bug-revealing capability of a test suite. Further, no strong correlation was found between bug-revealing capability and either code coverage or mutation analysis scores.
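
To make the all-pairs strategy concrete, the following minimal Python sketch illustrates the scoring idea only; it is not the authors' implementation, and the reveals_bug(suite, solution) helper is a hypothetical stand-in for whatever test runner actually executes one student's suite against another student's program, returning True when at least one test fails.

    from typing import Callable, Dict

    def all_pairs_scores(
        suites: Dict[str, object],        # student id -> that student's test suite
        solutions: Dict[str, object],     # student id -> that student's program
        reveals_bug: Callable[[object, object], bool],  # hypothetical runner hook
    ) -> Dict[str, float]:
        """Score each suite by the fraction of *other* students' solutions
        on which at least one of its tests fails."""
        scores: Dict[str, float] = {}
        for tester, suite in suites.items():
            others = [sid for sid in solutions if sid != tester]
            if not others:
                scores[tester] = 0.0  # degenerate case: nothing to run against
                continue
            flagged = sum(1 for sid in others if reveals_bug(suite, solutions[sid]))
            scores[tester] = flagged / len(others)
        return scores

Under these assumptions, a suite's score is the fraction of classmates' solutions on which it fails at least one test, which is the sense in which all-pairs scoring approximates the bug-revealing capability the paper measures.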






Published In

ICSE Companion 2014: Companion Proceedings of the 36th International Conference on Software Engineering
May 2014
741 pages
ISBN:9781450327688
DOI:10.1145/2591062
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


In-Cooperation

  • TCSE: IEEE Computer Society Technical Council on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. software testing
  2. automated assessment
  3. automated grading
  4. mutation testing
  5. programming assignments
  6. test coverage
  7. test metrics
  8. test quality

Qualifiers

  • Article

Conference

ICSE '14

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%




Article Metrics

  • Downloads (last 12 months): 37
  • Downloads (last 6 weeks): 3
Reflects downloads up to 20 Nov 2024

Cited By

  • (2024) Investigating the graphical IEC 61131-3 language impact on test case design and evaluation of mechatronic apprentices. at - Automatisierungstechnik 72(3), 176-188. DOI: 10.1515/auto-2023-0162. Online publication date: 29-Feb-2024.
  • (2024) How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging. Artificial Intelligence in Education, 265-279. DOI: 10.1007/978-3-031-64302-6_19. Online publication date: 2-Jul-2024.
  • (2023) A Model of How Students Engineer Test Cases With Feedback. ACM Transactions on Computing Education 24(1), 1-31. DOI: 10.1145/3628604. Online publication date: 20-Oct-2023.
  • (2023) Systematic Literature Review on Test Case Quality Characteristics and Metrics. 2023 3rd International Conference on Emerging Smart Technologies and Applications (eSmarTA), 1-8. DOI: 10.1109/eSmarTA59349.2023.10293544. Online publication date: 10-Oct-2023.
  • (2022) On the use of mutation analysis for evaluating student test suite quality. Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, 263-275. DOI: 10.1145/3533767.3534217. Online publication date: 18-Jul-2022.
  • (2022) Students vs. professionals. Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, 294-296. DOI: 10.1145/3510454.3517058. Online publication date: 21-May-2022.
  • (2022) Students vs. Professionals: Improving the Learning of Software Testing. 2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), 294-296. DOI: 10.1109/ICSE-Companion55297.2022.9793734. Online publication date: May-2022.
  • (2022) Automated Assessment in Computer Science: A Bibliometric Analysis of the Literature. Learning Technologies and Systems, 122-134. DOI: 10.1007/978-3-031-33023-0_11. Online publication date: 21-Nov-2022.
  • (2021) Mutation testing and self/peer assessment. Proceedings of the 43rd International Conference on Software Engineering: Joint Track on Software Engineering Education and Training, 231-240. DOI: 10.1109/ICSE-SEET52601.2021.00033. Online publication date: 25-May-2021.
  • (2020) Integrating Testing Throughout the CS Curriculum. 2020 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 441-444. DOI: 10.1109/ICSTW50294.2020.00079. Online publication date: Oct-2020.
