research-article

A simpler model of software readability

Authors:

Premkumar Devanbu

MSR '11: Proceedings of the 8th Working Conference on Mining Software Repositories

Pages 73 - 82

https://doi.org/10.1145/1985441.1985454

Published: 21 May 2011

Abstract

Software readability is a property that influences how easily a given piece of code can be read and understood. Since readability can affect maintainability, quality, etc., programmers are very concerned about the readability of code. If automatic readability checkers could be built, they could be integrated into development tool-chains, and thus continually inform developers about the readability level of the code. Unfortunately, readability is a subjective code property, and not amenable to direct automated measurement. In a recently published study, Buse et al. asked 100 participants to rate code snippets by readability, yielding arguably reliable mean readability scores for each snippet; they then built a fairly complex predictive model for these mean scores using a large, diverse set of directly measurable source code properties. We build on this work: we present a simple, intuitive theory of readability, based on size and code entropy, and show how this theory leads to a much sparser, yet statistically significant, model of the mean readability scores produced in Buse's studies. Our model uses well-known size metrics and Halstead metrics, which are easily extracted using a variety of tools. We argue that this provides a more theoretically well-founded and practically usable approach to readability measurement.
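As a rough illustration of the kinds of quantities the abstract names (size, Halstead metrics, entropy), the sketch below computes a line count, the Halstead volume V = N log2(n), and the Shannon entropy of the token distribution for a snippet. The regex tokenizer, the tiny keyword list, and the name `readability_features` are assumptions made for this example only; they are not the authors' tooling, and the fitted model and its coefficients are not reproduced here.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small keyword list; a real tool would use a
# language-aware lexer instead of this approximation.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "void", "new", "class"}

# Identifiers, numeric literals, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def is_operand(token):
    """Treat identifiers and literals as operands; keywords and symbols as operators."""
    return (token[0].isalnum() or token[0] == "_") and token not in KEYWORDS


def readability_features(snippet):
    """Return line count, Halstead volume, and token entropy for a code snippet."""
    tokens = TOKEN_RE.findall(snippet)
    operands = [t for t in tokens if is_operand(t)]
    operators = [t for t in tokens if not is_operand(t)]

    # Halstead program length N and vocabulary n.
    length = len(operators) + len(operands)
    vocabulary = len(set(operators)) + len(set(operands))
    volume = length * math.log2(vocabulary) if vocabulary else 0.0

    # Shannon entropy (in bits) of the empirical token distribution.
    counts = Counter(tokens)
    entropy = sum(-(c / length) * math.log2(c / length) for c in counts.values())

    # Size measured as the number of non-blank lines.
    lines = sum(1 for line in snippet.splitlines() if line.strip())
    return {"lines": lines, "volume": volume, "entropy": entropy}


if __name__ == "__main__":
    print(readability_features("a = a + b"))
```

Features like these could then be fed into a regression against mean human readability ratings; which features to keep and how to weight them is exactly what the paper's model addresses, so this sketch stops at feature extraction.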

References

[1]

The Zen of Python. http://www.python.org/dev/peps/pep-0020/. [Online; accessed 31-January-2011].

[2]

K. Aggarwal, Y. Singh, and J. Chhabra. An integrated measure of software maintainability. In Reliability and Maintainability Symposium, 2002. Proceedings. Annual, pages 235--241. IEEE, 2002.

[3]

R. M. Baecker and A. Marcus. Human factors and typography for more readable programs. ACM, New York, NY, USA, 1989.

[4]

J. Börstler, M. Caspersen, and M. Nordström. Beauty and the Beast: Toward a Measurement Framework for Example Program Quality. Department of Computing Science, Umeå University, 2008.

[5]

L. Briand and J. Wüst. Empirical studies of quality models in object-oriented systems. Advances in Computers, 56:97--166, 2002.

[6]

R. Buse and W. Weimer. Learning a Metric for Code Readability. Software Engineering, IEEE Transactions on, 36(4):546--558, 2010.

[7]

S. Butler, M. Wermelinger, Y. Yu, and H. Sharp. Relating Identifier Naming Flaws and Code Quality: An Empirical Study. In 2009 16th Working Conference on Reverse Engineering, pages 31--35. IEEE, 2009.

[8]

S. Butler, M. Wermelinger, Y. Yu, and H. Sharp. Exploring the influence of identifier names on code quality: An empirical study. In 14th European Conference on Software Maintenance and Reengineering, pages 159--168, March 2010.

[9]

J. Cohen. Applied multiple regression/correlation analysis for the behavioral sciences. Lawrence Erlbaum, 2003.

[10]

D. Coleman, D. Ash, B. Lowther, and P. Oman. Using metrics to evaluate software system maintainability. Computer, 27(8):44--49, 1994.

[11]

N. Coulter. Software science and cognitive psychology. IEEE Transactions on Software Engineering, pages 166--171, 1983.

[12]

S. Dahiya, J. Chhabra, and S. Kumar. Use of genetic algorithm for software maintainability metrics' conditioning. In Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on, pages 87--92. IEEE, 2008.

[13]

F. Détienne and F. Bott. Software design-cognitive aspects. Springer Verlag, 2002.

[14]

K. El Emam, S. Benlarbi, N. Goel, and S. Rai. The confounding effect of class size on the validity of object-oriented metrics. Software Engineering, IEEE Transactions on, 27(7):630--650, 2001.

[15]

J. Elshoff and M. Marcotty. Improving computer program readability to aid modification. Communications of the ACM, 25(8):512--521, 1982.

[16]

L. Etzkorn, S. Gholston, and W. Hughes Jr. A semantic entropy metric. Journal of Software Maintenance and Evolution: Research and Practice, 14(4):293--310, 2002.

[17]

R. Flesch. A new readability yardstick. Journal of applied psychology, 32(3):221--233, 1948.

[18]

R. Forax. Why extension methods are evil. http://weblogs.java.net/blog/forax/archive/2009/11/-28/why-extension-methods-are-evil. [Online; accessed 31-January-2011].

[19]

B. Guzel. Top 15 best practices for writing super readable code. http://net.tutsplus.com/tutorials/htmlcss-techniques/top-15-best-practices-for-writing-superreadable-code/. [Online; accessed 31-January-2011].

[20]

M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. Witten. The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1):10--18, 2009.

[21]

M. Halstead. Elements of software science. Elsevier New York, 1977.

[22]

A. Hindle, M. Godfrey, and R. Holt. Reading beside the lines: Indentation as a proxy for complexity metric. In Program Comprehension, 2008. ICPC 2008. The 16th IEEE International Conference on, pages 133--142. IEEE, 2008.

[23]

M. Kanat-Alexander. Readability and naming things. http://www.codesimplicity.com/post/readability-andnaming-things/. [Online; accessed 31-January-2011].

[24]

J. Kearney, R. Sedlmeyer, W. Thompson, M. Gray, and M. Adler. Software complexity measurement. Communications of the ACM, 29(11):1044--1050, 1986.

[25]

D. Kozlov, J. Koskinen, M. Sakkinen, and J. Markkula. Assessing maintainability change over multiple software releases. Journal of Software Maintenance and Evolution: Research and Practice, 20(1):31--58, 2008.

[26]

J. Kumar Chhabra, K. Aggarwal, and Y. Singh. Code and data spatial complexity: two important software understandability measures. Information and software Technology, 45(8):539--546, 2003.

[27]

S. Lessmann, B. Baesens, C. Mues, and S. Pietsch. Benchmarking classification models for software defect prediction: A proposed framework and novel findings. IEEE Transactions on Software Engineering, 34(4):485, 2008.

[28]

J. Lin and K. Wu. A Model for Measuring Software Understandability. In Computer and Information Technology, 2006. CIT'06. The Sixth IEEE International Conference on, page 192. IEEE, 2006.

[29]

J. Lin and K. Wu. Evaluation of software understandability based on fuzzy matrix. In Fuzzy Systems, 2008. FUZZ-IEEE 2008.(IEEE World Congress on Computational Intelligence). IEEE International Conference on, pages 887--892. IEEE, 2008.

[30]

A. Mohan, N. Gold, and P. Layzell. An initial approach to assessing program comprehensibility using spatial complexity, number of concepts and typographical style. In Reverse Engineering, 2004. Proceedings. 11th Working Conference on, pages 246--255. IEEE, 2005.

[31]

N. Naeem, M. Batchelder, and L. Hendren. Metrics for measuring the effectiveness of decompilers and obfuscators. In Program Comprehension, 2007. ICPC'07. 15th IEEE International Conference on, pages 253--258. IEEE, 2007.

[32]

P. Peduzzi, J. Concato, E. Kemper, T. Holford, and A. Feinstein. A simulation study of the number of events per variable in logistic regression analysis. Journal of clinical epidemiology, 49(12):1373--1379, 1996.

[33]

D. Raymond. Reading source code. In Proceedings of the 1991 conference of the Centre for Advanced Studies on Collaborative research, pages 3--16. IBM Press, 1991.

[34]

P. Relf. Tool assisted identifier naming for improved software readability: an empirical study. In Empirical Software Engineering, 2005. 2005 International Symposium on, page 10. IEEE, 2005.

Index Terms

A simpler model of software readability
1. General and reference
  1. Cross-computing tools and techniques
    1. Metrics
2. Information systems
  1. Information systems applications

Published In

MSR '11: Proceedings of the 8th Working Conference on Mining Software Repositories

May 2011

260 pages

ISBN:9781450305747

DOI:10.1145/1985441

General Chair: Arie van Deursen (Delft University of Technology, The Netherlands)

Program Chairs: Tao Xie (North Carolina State University, USA), Thomas Zimmermann (Microsoft Research, USA)

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICSE11

Sponsor:

SIGSOFT

ICSE11: International Conference on Software Engineering

May 21 - 22, 2011

Waikiki, Honolulu, HI, USA

Bibliometrics

Total Citations: 110
Total Downloads: 1,449
Downloads (Last 12 months): 122
Downloads (Last 6 weeks): 19

Reflects downloads up to 16 Nov 2024

Cited By

Gao F, Chen H, Zhou Y, Wang K, Filkov V, Ray B, Zhou M (2024). Shoot Yourself in the Foot — Efficient Code Causes Inefficiency in Compiler Optimizations. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pages 1846-1857. Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695548
Mostafavi Ghahfarokhi M, Jahantigh H, Kianiangolafshani S, Khademian A, Asadi A, Heydarnoori A, Filkov V, Ray B, Zhou M (2024). Can Code Metrics Enhance Documentation Generation for Computational Notebooks? Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pages 2472-2473. Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695334
Lee G, Ju H, Lee S, Filkov V, Ray B, Zhou M (2024). NeuroJIT: Improving Just-In-Time Defect Prediction Using Neurophysiological and Empirical Perceptions of Modern Developers. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pages 594-605. Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695056
Mostafavi Ghahfarokhi M, Asgari A, Abolnejadian M, Heydarnoori A, Spinellis D, Constantinou E, Bacchelli A (2024). DistilKaggle: A Distilled Dataset of Kaggle Jupyter Notebooks. Proceedings of the 21st International Conference on Mining Software Repositories, pages 647-651. Online publication date: 15-Apr-2024
https://dl.acm.org/doi/10.1145/3643991.3644882
Sergeyuk A, Lvova O, Titov S, Serova A, Bagirov F, Kirillova E, Bryksin T, Baysal O, Linares-Vasquez M, Moran K, Steinmacher I (2024). Reassessing Java Code Readability Models with a Human-Centered Approach. Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, pages 225-235. Online publication date: 15-Apr-2024
https://dl.acm.org/doi/10.1145/3643916.3644435
Eom H, Kim D, Lim S, Koo H, Hwang S (2024). R2I: A Relative Readability Metric for Decompiled Code. Proceedings of the ACM on Software Engineering, 1(FSE), pages 383-405. Online publication date: 12-Jul-2024
https://dl.acm.org/doi/10.1145/3643744
Sampaio I, Sampaio A (2024). Replication of a Study about the Impact of Method Chaining and Comments on Readability and Comprehension. 2024 4th International Conference on Code Quality (ICCQ), pages 35-52. Online publication date: 22-Jun-2024
https://doi.org/10.1109/ICCQ60895.2024.10576941
Chuang Y, Chang H (2024). Analyzing Novice and Competent Programmers' Problem-Solving Behaviors Using an Automated Evaluation System. Science of Computer Programming, article 103138. Online publication date: May-2024
https://doi.org/10.1016/j.scico.2024.103138
Mashhadi E, Chowdhury S, Modaberi S, Hemmati H, Uddin G (2024). An empirical study on bug severity estimation using source code metrics and static analysis. Journal of Systems and Software, article 112179. Online publication date: Aug-2024
https://doi.org/10.1016/j.jss.2024.112179
Mondal S, Roy B (2024). Reproducibility of issues reported in stack overflow questions: Challenges, impact & estimation. Journal of Systems and Software, 217, article 112158. Online publication date: Nov-2024
https://doi.org/10.1016/j.jss.2024.112158
