DOI: 10.1109/ASE51524.2021.9678882

AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization

Published: 24 June 2022

Abstract

Code summarization aims to generate brief natural-language descriptions of source code. Because source code is highly structured and follows strict programming-language grammars, its Abstract Syntax Tree (AST) is often used to supply the encoder with structural information. However, linearized ASTs are usually much longer than the source code itself; current approaches ignore this size blow-up and simply feed the whole linearized AST into the encoder. To address this problem, we propose AST-Transformer, which encodes tree-structured ASTs efficiently. Experiments show that AST-Transformer outperforms state-of-the-art approaches by a substantial margin while reducing the computational complexity of the encoding process by 90-95%.
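
To make the size problem concrete, here is a minimal sketch of AST linearization by pre-order traversal, using Python's built-in ast module purely for illustration (an assumption; the paper does not tie the idea to a particular parser or traversal order). It compares the length of the linearized AST against the token count of the code it came from:

    import ast

    # Illustrative sketch only: Python's built-in ast module stands in for
    # whatever parser a summarization pipeline might actually use.
    def linearize(node: ast.AST):
        # Pre-order traversal: yield the parent's node type, then recurse
        # into its children in order.
        yield type(node).__name__
        for child in ast.iter_child_nodes(node):
            yield from linearize(child)

    source = "def add(a, b):\n    return a + b\n"
    node_seq = list(linearize(ast.parse(source)))
    print(node_seq)
    # ['Module', 'FunctionDef', 'arguments', 'arg', 'arg',
    #  'Return', 'BinOp', 'Name', 'Add', 'Name']
    print(len(source.split()), "source tokens vs.", len(node_seq), "AST nodes")
    # 7 source tokens vs. 10 AST nodes

Even this two-line function produces more AST nodes than source tokens, and the gap widens with nesting depth, which is why feeding the whole linearized sequence into a standard encoder is costly.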

Cited By

  • Generative AI for Self-Adaptive Systems: State of the Art and Research Roadmap. ACM Transactions on Autonomous and Adaptive Systems 19(3), pp. 1-60, Sep 2024. https://doi.org/10.1145/3686803
  • A Tale of Two Comprehensions? Analyzing Student Programmer Attention during Code Summarization. ACM Transactions on Software Engineering and Methodology 33(7), pp. 1-37, Aug 2024. https://doi.org/10.1145/3664808
  • Query-oriented two-stage attention-based model for code search. Journal of Systems and Software 210(C), Apr 2024. https://doi.org/10.1016/j.jss.2023.111948

Published In

ASE '21: Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering
November 2021, 1446 pages
ISBN: 9781665403375

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Publication History

Published: 24 June 2022

Author Tags

  1. source code summarization
  2. tree-based neural network

Qualifiers

  • Research-article

Conference

ASE '21

Acceptance Rates

Overall acceptance rate: 82 of 337 submissions (24%)
