Abstract
Code summarization, which provides a high-level description of the function implemented by code, plays a vital role in software maintenance and code retrieval. Traditional approaches focus on retrieving similar code snippets to generate summaries, and recently researchers pay increasing attention to leverage deep learning approaches, especially the encoder-decoder framework. Approaches based on encoder-decoder suffer from two drawbacks: (a) Lack of summarization in functionality level; (b) Code snippets are always too long (more than ten words), regular encoders perform poorly. In this paper, we propose a novel code representation with the help of Abstract Syntax Trees, which could describe the functionality of code snippets and shortens the length of inputs. Based on our proposed code representation, we develop Generative Task, which aims to generate summary sentences of code snippets. Experiments on large-scale real-world industrial Java projects indicate that our approaches are effective and outperform the state-of-the-art approaches in code summarization.
Q. Chen, H. Hu and Z. Liu—Equal contribution.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
The camel-cased and underline identifiers are split into several words, for example, split checkJavaFile to three words: check, Java, File, so a leaf may consist of several tokens.
- 2.
- 3.
- 4.
- 5.
References
Allamanis, M., Peng, H., Sutton, C.: A convolutional attention network for extreme summarization of source code. In: International Conference on Machine Learning, pp. 2091–2100 (2016)
Hu, X., Li, G., Xia, X., Lo, D., Jin, Z.: Deep code comment generation. In: Proceedings of the 26th Conference on Program Comprehension, pp. 200–210. ACM (2018)
Hu, X., Li, G., Xia, X., Lo, D., Lu, S., Jin, Z.: Summarizing source code with transferred API knowledge. In: IJCAI, pp. 2269–2275 (2018)
Iyer, S., Konstas, I., Cheung, A., Zettlemoyer, L.: Summarizing source code using a neural attention model. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, pp. 2073–2083 (2016)
Iyer, S., Konstas, I., Cheung, A., Zettlemoyer, L.: Mapping language to code in programmatic context (2018)
Kingma, D., Ba, J.: Adam: a method for stochastic optimization. Comput. Sci. (2014)
Liang, Y., Zhu, K.Q.: Automatic generation of text descriptive comments for code blocks. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Ling, W., et al.: Latent predictor networks for code generation (2016)
McBurney, P.W., McMillan, C.: Automatic documentation generation via source code summarization of method context. In: Proceedings of the 22nd International Conference on Program Comprehension, pp. 279–290. ACM (2014)
Moreno, L., Aponte, J., Sridhara, G., Marcus, A., Pollock, L., Vijay-Shanker, K.: Automatic generation of natural language summaries for java classes. In: 2013 21st International Conference on Program Comprehension (ICPC), pp. 23–32. IEEE (2013)
Movshovitz-Attias, D., Cohen, W.W.: Natural language models for predicting programming comments. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), vol. 2, pp. 35–40 (2013)
Sridhara, G., Hill, E., Muppaneni, D., Pollock, L., Vijay-Shanker, K.: Towards automatically generating summary comments for java methods. In: Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, pp. 43–52. ACM (2010)
Sridhara, G., Pollock, L., Vijay-Shanker, K.: Generating parameter comments and integrating with method summaries. In: 2011 IEEE 19th International Conference on Program Comprehension, pp. 71–80. IEEE (2011)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Q., Hu, H., Liu, Z. (2019). Code Summarization with Abstract Syntax Tree. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1143. Springer, Cham. https://doi.org/10.1007/978-3-030-36802-9_69
Download citation
DOI: https://doi.org/10.1007/978-3-030-36802-9_69
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36801-2
Online ISBN: 978-3-030-36802-9
eBook Packages: Computer ScienceComputer Science (R0)