Computer Science > Computation and Language

arXiv:2205.10866 (cs)

[Submitted on 22 May 2022]

Title: Blackbird's language matrices (BLMs): a new benchmark to investigate disentangled generalisation in neural networks

Authors: Paola Merlo, Aixiu An, Maria A. Rodriguez

Abstract: Current successes of machine learning architectures are based on computationally expensive algorithms and prohibitively large amounts of data. We need to develop tasks and data to train networks to reach more complex and more compositional skills. In this paper, we illustrate Blackbird's language matrices (BLMs), a novel grammatical dataset developed to test a linguistic variant of Raven's progressive matrices, an intelligence test usually based on visual stimuli. The dataset consists of 44,800 sentences, generatively constructed to support investigations of current models' linguistic mastery of grammatical agreement rules and their ability to generalise them. We present the logic of the dataset, the method to automatically construct data on a large scale and the architecture to learn them. Through error analysis and several experiments on variations of the dataset, we demonstrate that this language task and the data that instantiate it provide a new challenging testbed to understand generalisation and abstraction.

Comments:	15 pages, 9 figures, 1 table
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.10866 [cs.CL]
	(or arXiv:2205.10866v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.10866

Submission history

From: Paola Merlo
[v1] Sun, 22 May 2022 16:51:24 UTC (1,567 KB)
