DOI: 10.1109/ASE51524.2021.9678882

AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization

Published: 24 June 2022

Abstract

Code summarization aims to generate brief natural-language descriptions of source code. Because source code is highly structured and follows strict programming-language grammars, its Abstract Syntax Tree (AST) is often used to supply the encoder with structural information. However, linearized ASTs are usually much longer than the source code itself; current approaches ignore this size blow-up and simply feed the whole linearized AST into the encoder. To address this problem, we propose AST-Transformer, which encodes tree-structured ASTs efficiently. Experiments show that AST-Transformer outperforms state-of-the-art approaches by a substantial margin while reducing the computational complexity of the encoding process by 90-95%.
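
To make the size problem concrete, here is a minimal sketch of AST linearization by pre-order traversal, using Python's built-in ast module purely for illustration (an assumption; the paper does not tie the idea to a particular parser or traversal order). It compares the length of the linearized AST against the token count of the code it came from:

    import ast

    # Illustrative sketch only: Python's built-in ast module stands in for
    # whatever parser a summarization pipeline might actually use.
    def linearize(node: ast.AST):
        # Pre-order traversal: yield the parent's node type, then recurse
        # into its children in order.
        yield type(node).__name__
        for child in ast.iter_child_nodes(node):
            yield from linearize(child)

    source = "def add(a, b):\n    return a + b\n"
    node_seq = list(linearize(ast.parse(source)))
    print(node_seq)
    # ['Module', 'FunctionDef', 'arguments', 'arg', 'arg',
    #  'Return', 'BinOp', 'Name', 'Add', 'Name']
    print(len(source.split()), "source tokens vs.", len(node_seq), "AST nodes")
    # 7 source tokens vs. 10 AST nodes

Even this two-line function produces more AST nodes than source tokens, and the gap widens with nesting depth, which is why feeding the whole linearized sequence into a standard encoder is costly.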

Cited By

  • Generative AI for Self-Adaptive Systems: State of the Art and Research Roadmap. ACM Transactions on Autonomous and Adaptive Systems 19(3), pp. 1-60, Sep 2024. https://doi.org/10.1145/3686803
  • A Tale of Two Comprehensions? Analyzing Student Programmer Attention during Code Summarization. ACM Transactions on Software Engineering and Methodology 33(7), pp. 1-37, Aug 2024. https://doi.org/10.1145/3664808
  • Query-oriented two-stage attention-based model for code search. Journal of Systems and Software 210(C), Apr 2024. https://doi.org/10.1016/j.jss.2023.111948

Published In

ASE '21: Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering
November 2021, 1446 pages
ISBN: 9781665403375

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Publication History

Published: 24 June 2022

Author Tags

  1. source code summarization
  2. tree-based neural network

Qualifiers

  • Research-article

Conference

ASE '21

Acceptance Rates

Overall acceptance rate: 82 of 337 submissions (24%)
