Learning to Substitute Spans towards Improving Compositional Generalization

Abstract

Despite the rising prevalence of neural sequence models, recent empirical evidence suggests that they are deficient in compositional generalization. One of the current de facto solutions to this problem is compositional data augmentation, which aims to introduce additional compositional inductive bias. Nonetheless, the improvement offered by existing handcrafted augmentation strategies is limited when successful systematic generalization of neural sequence models requires multi-grained compositional bias (i.e., not limited to either lexical or structural biases only) or differentiation of training sequences in an imbalanced difficulty distribution. To address these two challenges, we first propose a novel compositional augmentation strategy dubbed Span Substitution (SpanSub) that enables multi-grained composition of substantial substructures across the whole training set. Beyond that, we introduce the Learning to Substitute Span (L2S2) framework, which learns the span substitution probabilities in SpanSub in an end-to-end manner by maximizing the loss of neural sequence models, so as to place more weight on challenging compositions with elusive concepts and novel surroundings. Our empirical results on three standard compositional generalization benchmarks, SCAN, COGS and GeoQuery (with improvements of up to 66.5%, 10.3%, and 1.2%, respectively), demonstrate the superiority of SpanSub, L2S2 and their combination.

Anthology ID: 2023.acl-long.157
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 2791–2811
URL: https://aclanthology.org/2023.acl-long.157
DOI: 10.18653/v1/2023.acl-long.157
Cite (ACL): Zhaoyi Li, Ying Wei, and Defu Lian. 2023. Learning to Substitute Spans towards Improving Compositional Generalization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2791–2811, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Learning to Substitute Spans towards Improving Compositional Generalization (Li et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-long.157.pdf
Video: https://aclanthology.org/2023.acl-long.157.mp4
