
DOI: 10.1145/3230977.3230999
Public Access

Who Tests the Testers?

Published: 08 August 2018

Abstract

Instructors routinely use automated assessment methods to evaluate the semantic qualities of student implementations and, sometimes, test suites. In this work, we distill a variety of automated assessment methods in the literature down to a pair of assessment models. We identify pathological assessment outcomes in each model that point to underlying methodological flaws. These theoretical flaws broadly threaten the validity of the techniques, and we actually observe them in multiple assignments of an introductory programming course. We propose adjustments that remedy these flaws and then demonstrate, on these same assignments, that our interventions improve the accuracy of assessment. We believe that with these adjustments, instructors can greatly improve the accuracy of automated assessment.
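The two assessment models are developed in the body of the paper; as a rough illustration of the territory the abstract describes, the sketch below shows, in Python, the two scoring schemes commonly found in the literature the paper surveys: grading an implementation by the instructor's test suite, and grading a student test suite by the known-buggy implementations it rejects. Every name and scoring rule here is an invented stand-in, not the authors' method; the consistency guard in grade_test_suite merely gestures at the kind of adjustment the paper argues for.

```python
# A minimal sketch (our illustration, NOT the paper's tooling) of two common
# automated-assessment models: (1) grade an implementation by the fraction of
# instructor tests it passes; (2) grade a student test suite by the fraction
# of known-buggy implementations it rejects. All names are invented.
from typing import Callable, List

Test = Callable[[Callable], bool]   # a test runs an implementation, returns pass/fail
Implementation = Callable

def grade_implementation(impl: Implementation, instructor_tests: List[Test]) -> float:
    """Model 1: fraction of the instructor's tests that the implementation passes."""
    return sum(1 for t in instructor_tests if t(impl)) / len(instructor_tests)

def grade_test_suite(suite: List[Test],
                     correct_impl: Implementation,
                     buggy_impls: List[Implementation]) -> float:
    """Model 2: fraction of buggy implementations the suite rejects.
    The guard below is one simple consistency check: a suite that fails a
    known-correct implementation scores zero, since otherwise a suite that
    rejects *every* program would look maximally effective."""
    if not all(t(correct_impl) for t in suite):
        return 0.0
    caught = sum(1 for impl in buggy_impls if any(not t(impl) for t in suite))
    return caught / len(buggy_impls)

# Toy usage on an invented `median` assignment:
def median_correct(xs): return sorted(xs)[len(xs) // 2]
def median_buggy(xs):   return xs[len(xs) // 2]   # forgets to sort first

tests = [lambda f: f([3, 1, 2]) == 2,   # catches the sorting bug
         lambda f: f([5]) == 5]

print(grade_implementation(median_buggy, tests))                 # 0.5
print(grade_test_suite(tests, median_correct, [median_buggy]))   # 1.0
```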

Published In

ICER '18: Proceedings of the 2018 ACM Conference on International Computing Education Research
August 2018
307 pages
ISBN: 9781450356282
DOI: 10.1145/3230977

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. advice to instructors
  2. assessing implementations
  3. assessing test suites
  4. automated assessment
  5. program correctness

Qualifiers

  • Research-article

Funding Sources

  • US National Science Foundation

Conference

ICER '18

Acceptance Rates

ICER '18 Paper Acceptance Rate: 28 of 125 submissions, 22%
Overall Acceptance Rate: 189 of 803 submissions, 24%

Cited By

  • (2024) Forge: A Tool and Language for Teaching Formal Methods. Proceedings of the ACM on Programming Languages 8 (OOPSLA1), 613–641. DOI: 10.1145/3649833
  • (2024) Testing and Debugging Habits of Intermediate Student Programmers. 2024 IEEE Global Engineering Education Conference (EDUCON), 1–10. DOI: 10.1109/EDUCON60312.2024.10578650
  • (2023) A Model of How Students Engineer Test Cases With Feedback. ACM Transactions on Computing Education 24 (1), 1–31. DOI: 10.1145/3628604
  • (2023) Evaluating Copilot on CS1 Code Writing Problems with Suppressed Specifications. Proceedings of the 16th Annual ACM India Compute Conference, 104–107. DOI: 10.1145/3627217.3627235
  • (2023) Proving and Disproving Equivalence of Functional Programming Assignments. Proceedings of the ACM on Programming Languages 7 (PLDI), 928–951. DOI: 10.1145/3591258
  • (2023) Investigating the Potential of GPT-3 in Providing Feedback for Programming Assessments. Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1, 292–298. DOI: 10.1145/3587102.3588852
  • (2023) Using Model-Checking and Peer-Grading to Provide Automated Feedback to Concurrency Exercises in Progvis. Proceedings of the 25th Australasian Computing Education Conference, 11–20. DOI: 10.1145/3576123.3576125
  • (2022) Making Hay from Wheats: A Classsourcing Method to Identify Misconceptions. Proceedings of the 22nd Koli Calling International Conference on Computing Education Research, 1–7. DOI: 10.1145/3564721.3564726
  • (2022) On the use of mutation analysis for evaluating student test suite quality. Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, 263–275. DOI: 10.1145/3533767.3534217
  • (2022) Automatic Generation of Programming Exercises and Code Explanations Using Large Language Models. Proceedings of the 2022 ACM Conference on International Computing Education Research - Volume 1, 27–43. DOI: 10.1145/3501385.3543957
