Abstract
Program comprehension is an important human factor in software engineering. To measure and evaluate program comprehension, researchers typically conduct experiments. However, designing experiments requires considerable effort, because confounding parameters need to be controlled for. Our aim is to support researchers in identifying relevant confounding parameters and selecting appropriate techniques to control their influence. To this end, we conducted a literature survey of 13 journals and conferences over a time span of 10 years. As a result, we created a catalog of 39 confounding parameters, including an overview of measurement and control techniques. With the catalog, we give experimenters a tool to design reliable and valid experiments.
Notes
ICPC was a workshop until 2005.
ESEM originated in 2007 from the merger of the International Symposium on Empirical Software Engineering (ISESE) and the International Software Metrics Symposium (METRICS).
VLHCC was called Human-Centric Computing Languages and Environments until 2003.
CHASE first took place in 2008.
In Section 5, we discuss techniques and parameters in detail. Here, we give only an overview.
Some argue that intelligence is learned rather than inborn. Thus, we could also classify it as individual knowledge. However, since our classification aims at a better overview, we do not enter this discussion.
The magical number seven is the subject of controversy (Baddeley 2001).
The specific contents of courses depend on the country and specific university.
For examples of all identified parameters for specific experiments, see the first author’s PhD thesis (Siegmund 2012).
References
Anderson M (2001) Permutation tests for univariate or multivariate analysis of variance and regression. Can J Fish Aquat Sci 58(3):626–639
Anderson T, Finn J (1996) The new statistical analysis of data. Springer, New York
Baddeley A (2001) Is working memory still working? Am Psychol 56(11):851–864
Beckwith L, Burnett M, Wiedenbeck S, Cook C, Sorte S, Hastings M (2005) Effectiveness of end-user debugging software features: are there gender issues? In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 869–878. ACM Press
Bergersen G, Gustafsson JE (2011) Programming skill, knowledge, and working memory among professional software developers from an investment theory perspective. J Individ Differ 32(4):201–209
Bettenburg N, Hassan A (2010) Studying the impact of social structures on software quality. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 124–133. IEEE CS
Biffl S, Halling M (2003) Investigating the defect detection effectiveness and cost benefit of nominal inspection teams. IEEE Trans Softw Eng 29(5):385–397
Binkley D, Lawrie D, Maex S, Morrell C (2008) Impact of limited memory resources. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 83–92. IEEE CS
Boehm B (1981) Software engineering economics. Prentice Hall, Englewood Cliffs
Briand LC, Labiche Y, Di Penta M, Yan-Bondoc HD (2005) An experimental investigation of formality in UML-based development. IEEE Trans Softw Eng 31(10):833–849
Brooks R (1978) Using a behavioral theory of program comprehension in software engineering. In: Proc. Int’l Conf. Software Engineering (ICSE), pp. 196–201. IEEE CS
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20(1):37–46
Cook T, Campbell D (1979) Quasi-experimentation: design & analysis issues for field settings. Houghton Mifflin, Boston
Corbett A, Anderson J (2001) Locus of feedback control in computer-based tutoring: Impact on learning rate, achievement and attitudes. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 245–252. ACM Press
Druin A, Foss E, Hutchinson H, Golub E, Hatley L (2010) Children’s roles using keyword search interfaces at home. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 413–422. ACM Press
Dybå T, Kampenes VB, Sjøberg D (2006) A systematic review of statistical power in software engineering experiments. J Inf Softw Technol 48(8):745–755
Dzidek W, Arisholm E, Briand L (2008) A realistic empirical evaluation of the costs and benefits of UML in software maintenance. IEEE Trans Softw Eng 34(3):407–432
Ellis B, Stylos J, Myers B (2007) The factory pattern in API design: a usability evaluation. In: Proc. Int’l Conf. Software Engineering (ICSE), pp. 302–312. IEEE CS
Ericsson K, Simon H (1980) Verbal reports as data. Psychol Rev 87(2):215–251
Feigenspan J (2009) Empirical comparison of FOSD approaches regarding program comprehension—a feasibility study. Master’s thesis, University of Magdeburg
Feigenspan J, Siegmund N, Fruth J (2011) On the role of program comprehension in embedded systems. In: Proc. Workshop Software Reengineering (WSR), pp. 34–35. http://wwwiti.cs.uni-magdeburg.de/iti_db/publikationen/ps/auto/FeSiFr11
Feigenspan J, Kästner C, Liebig J, Apel S, Hanenberg S (2012) Measuring programming experience. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 73–82. IEEE CS
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701
Fry Z, Weimer W (2010) A human study of fault localization accuracy. In: Proc. Int’l Conf. Software Maintenance (ICSM), pp. 1–10. IEEE CS
Goldstein B (2002) Sensation and perception, 5th edn. Cengage Learning Services
Gong L, Lai J (2001) Shall we mix synthetic speech and human speech? Impact on users’ performance, perception, and attitude. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 158–165. ACM Press
Goodwin J (1999) Research in psychology: methods and design, 2nd edn. Wiley
Grigoreanu V, Cao J, Kulesza T, Bogart C, Rector K, Burnett M, Wiedenbeck S (2008) Can feature design reduce the gender gap in end-user software development environments? In: Proc. Symposium Visual Languages and Human-Centric Computing (VLHCC), pp. 149–156. IEEE CS
Güleşir G, Berg K, Bergmans L, Akşit M (2009) Experimental evaluation of a tool for the verification and transformation of source code in event-driven systems. Empir Softw Eng 14(6):720–777
Hu W, Lee H, Zhang Q, Liu T, Geng L, Seghier M, Shakeshaft C, Twomey T, Green D, Yang Y, Price C (2010) Developmental dyslexia in Chinese and English populations: dissociating the effect of dyslexia from language differences. Brain 133(6):1694–1706
Ishihara S (1972) Test for colour-blindness. Kanehara Shuppan Co., Tokyo
Jablonski P, Hou D (2010) Aiding software maintenance with copy-and-paste clone-awareness. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 170–179. IEEE CS
Jäger A, Süß HM, Beauducel A (1997) Berliner Intelligenzstruktur-Test. Hogrefe, Göttingen
Jedlitschka A, Ciolkowski M, Pfahl D (2008) Reporting experiments in software engineering. In: Guide to advanced empirical software engineering, pp. 201–228. Springer
Jensen E (1998) Teaching with the brain in mind. Atlantic Books, London
Juristo N, Moreno A (2001) Basics of software engineering experimentation. Kluwer, Boston
Kampenes V, Dybå T, Hannay J, Sjøberg D (2009) A systematic review of quasi-experiments in software engineering. Inf Softw Technol 51(1):71–82
Ko A, Uttl B (2003) Individual differences in program comprehension strategies in unfamiliar programming systems. In: Proc. Int’l Workshop Program Comprehension (IWPC), pp. 175–184. IEEE CS
Ko A, Myers B, Coblenz M, Aung H (2006) An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks. IEEE Trans Softw Eng 32(12):971–987
McConnell S (2011) What does 10× mean? Measuring variations in programmer productivity. In: Making Software, pp. 567–574. O’Reilly & Associates, Inc
McQuiggan SW, Rowe JP, Lester JC (2008) The effects of empathetic virtual characters on presence in narrative-centered learning environments. In: Proc. Conf. Human Factors in Computing Systems (CHI), pp. 1511–1520. ACM Press
Miller G (1956) The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 63(2):81–97
Mook D (1996) Motivation: the organization of action, 2nd edn. W.W. Norton & Co., New York
von Neumann J (1945) First draft of a report on the EDVAC
Oberauer K, Süß HM, Schulze R, Wilhelm O, Wittmann W (2000) Working memory capacity—facets of a cognitive ability construct. Personal Individ Differ 29(6):1017–1045
Oezbek C, Prechelt L (2007) JTourBus: simplifying program understanding by documentation that provides tours through the source code. In: Proc. Int’l Conf. Software Maintenance (ICSM), pp. 64–73. IEEE CS
Pennington N (1987) Stimulus structures and mental representations in expert comprehension of computer programs. Cogn Psychol 19(3):295–341
Raven J (1936) Mental tests used in genetic studies: the performances of related individuals in tests mainly educative and mainly reproductive. Master’s thesis, University of London
Roethlisberger F (1939) Management and the worker. Harvard University Press, Cambridge
Rosenthal R, Jacobson L (1966) Teachers’ expectancies: determinants of pupils’ IQ gains. Psychol Rep 19(1):115–118
Sackman H, Erikson W, Grant E (1968) Exploratory experimental studies comparing online and offline programming performance. Commun ACM 11(1):3–11
Schlaug G (2001) The brain of musicians. A model for functional and structural adaptation. Ann N Y Acad Sci 930:281–299
Shadish W, Cook T, Campbell D (2002) Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin Company, Boston
Shaft T, Vessey I (1995) The relevance of application domain knowledge: the case of computer program comprehension. Inf Syst Res 6(3):286–299
Sharafi Z, Soh Z, Guéhéneuc YG, Antoniol G (2012) Women and men–different but equal: on the impact of identifier style on source code reading. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 27–36. IEEE CS
Sharif B, Maletic J (2009) An empirical study on the comprehension of stereotyped UML class diagram layouts. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 268–272. IEEE CS
Sharif B, Maletic J (2010) An eye tracking study on camel case and underscore identifier styles. In: Proc. Int’l Conf. Program Comprehension (ICPC), pp. 196–205. IEEE CS
Shneiderman B, Mayer R (1979) Syntactic/semantic interactions in programmer behavior: a model and experimental results. Int J Parallel Prog 8(3):219–238
Siegmund J (2012) Framework for measuring program comprehension. Ph.D. thesis, School of Computer Science, University of Magdeburg
Sjøberg D, Hannay J, Hansen O, Kampenes VB, Karahasanovic A, Liborg NK, Rekdal A (2005) A survey of controlled experiments in software engineering. IEEE Trans Softw Eng 31(9):733–753
Soloway E, Ehrlich K (1984) Empirical studies of programming knowledge. IEEE Trans Softw Eng 10(5):595–609
Standish T (1984) An essay on software reuse. IEEE Trans Softw Eng 10(5):494–497
Tiarks R (2011) What programmers really do: an observational study. In: Proc. Workshop Software Reengineering (WSR), pp. 36–37
Torchiano M (2004) Empirical assessment of UML static object diagrams. In: Proc. Int’l Workshop Program Comprehension (IWPC), pp. 226–230. IEEE CS
Vitharana P, Ramamurthy K (2003) Computer-mediated group support, anonymity, and the software inspection process: an empirical investigation. IEEE Trans Softw Eng 29(2):167–180
von Mayrhauser A, Vans M (1995) Program comprehension during software maintenance and evolution. Computer 28(8):44–55
von Mayrhauser A, Vans M, Howe A (1997) Program understanding behaviour during enhancement of large-scale software. J Softw Maint Res Pract 9(5):299–327
Wechsler D (1950) The measurement of adult intelligence, 3rd edn. American Psychological Association, Washington, DC
Wohlin C, Runeson P, Höst M, Ohlsson M, Regnell B, Wesslén A (2000) Experimentation in software engineering: an introduction. Kluwer Academic Publishers, Boston
Wundt W (1874) Grundzüge der Physiologischen Psychologie. Engelmann, Leipzig
Acknowledgments
Thanks to Norbert Siegmund and Christian Kästner for helpful discussions. Thanks to all reviewers for their constructive feedback. Thanks to Raimund Dachselt for his encouragement to write this article. Thanks to Andreas Meister for his support in selecting relevant papers.
Additional information
Communicated by: Andrian Marcus
This author published previous work as Janet Feigenspan.
Cite this article
Siegmund, J., Schumann, J. Confounding parameters on program comprehension: a literature survey. Empir Software Eng 20, 1159–1192 (2015). https://doi.org/10.1007/s10664-014-9318-8
DOI: https://doi.org/10.1007/s10664-014-9318-8