DOI: 10.1145/3377816.3381736
Short paper, open access

Where should I comment my code?: a dataset and model for predicting locations that need comments

Published: 18 September 2020

Abstract

Programmers should write code comments, but not on every line of code. We have created a machine learning model that suggests locations where a programmer should write a code comment. We trained it on existing commented code to learn the locations that developers choose. Once trained, the model can predict such locations in new code. Our models achieved a precision of 74% and a recall of 13% in identifying comment-worthy locations. This first success opens the door to future work, both on the new where-to-comment problem and on guiding comment generation. Our code and data are available at http://groups.inf.ed.ac.uk/cup/comment-locator/.
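
To make the reported metrics concrete, the following minimal sketch shows how precision and recall could be computed for a where-to-comment predictor. It assumes, purely for illustration, that each line of code is one candidate location with a true/false label; the function name `precision_recall` and the toy labels are hypothetical and are not taken from the paper's code or data.

```python
from typing import Sequence, Tuple


def precision_recall(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (precision, recall) for binary comment-location predictions.

    predicted[i] is True if the model flags location i as comment-worthy;
    actual[i] is True if a developer really placed a comment there.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)

    # Precision: of the locations the model flagged, how many were right.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of the truly commented locations, how many the model found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Toy data (made up): ten candidate locations, four of them truly commented.
actual = [True, False, False, True, False, True, False, False, True, False]
# A cautious model that flags only one location, and gets it right.
predicted = [True, False, False, False, False, False, False, False, False, False]

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=1.00 recall=0.25
```

The toy model illustrates the same trade-off as the reported figures: a predictor that flags few locations can be right most of the time (high precision) while still missing most of the places developers actually commented (low recall).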


Published In

ICSE-NIER '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results
June 2020
128 pages
ISBN: 9781450371261
DOI: 10.1145/3377816
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

In-Cooperation

  • KIISE: Korean Institute of Information Scientists and Engineers
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NLP
  2. comments
  3. natural language processing

Qualifiers

  • Short-paper

Conference

ICSE '20

Article Metrics

  • Downloads (Last 12 months): 132
  • Downloads (Last 6 weeks): 13
Reflects downloads up to 02 Oct 2024

Cited By

  • (2024) Taxonomy of inline code comment smells. Empirical Software Engineering 29:3. DOI: 10.1007/s10664-023-10425-5. Online publication date: 3-Apr-2024
  • (2024) Are your comments outdated? Toward automatically detecting code‐comment consistency. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2718. Online publication date: 27-Aug-2024
  • (2023) Finding associations between natural and computer languages. Journal of Systems and Software 201:C. DOI: 10.1016/j.jss.2023.111651. Online publication date: 1-Jul-2023
  • (2021) Why My Code Summarization Model Does Not Work. ACM Transactions on Software Engineering and Methodology 30:2 (1-29). DOI: 10.1145/3434280. Online publication date: 10-Feb-2021
