Computer Science > Computation and Language

arXiv:2403.18769 (cs)

[Submitted on 27 Mar 2024]

Title: Improved Neural Protoform Reconstruction via Reflex Prediction

Authors: Liang Lu, Jingzhi Wang, David R. Mortensen

Abstract: Protolanguage reconstruction is central to historical linguistics. The comparative method, one of the most influential theoretical and methodological frameworks in the history of the language sciences, allows linguists to infer protoforms (reconstructed ancestral words) from their reflexes (related modern words) based on the assumption of regular sound change. Not surprisingly, numerous computational linguists have attempted to operationalize comparative reconstruction through various computational models, the most successful of which have been supervised encoder-decoder models, which treat the problem of predicting protoforms given sets of reflexes as a sequence-to-sequence problem. We argue that this framework ignores one of the most important aspects of the comparative method: not only should protoforms be inferable from cognate sets (sets of related reflexes) but the reflexes should also be inferable from the protoforms. Leveraging another line of research -- reflex prediction -- we propose a system in which candidate protoforms from a reconstruction model are reranked by a reflex prediction model. We show that this more complete implementation of the comparative method allows us to surpass state-of-the-art protoform reconstruction methods on three of four Chinese and Romance datasets.
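
Illustrative sketch (not from the paper): the reranking idea in the abstract can be pictured as follows. A reconstruction model proposes scored candidate protoforms, and each candidate is then rescored by how well the observed reflexes can be predicted from it. Everything below is an assumption made for illustration: the function names, the additive combination with a `weight` parameter, and the toy mismatch-based scorer that stands in for a trained reflex prediction model. The abstract does not specify how the two scores are combined.

```python
from typing import Callable, Dict, List, Tuple

# (protoform, reconstruction score); higher is better. Hypothetical format.
Candidate = Tuple[str, float]


def toy_reflex_log_prob(protoform: str, language: str, reflex: str) -> float:
    """Toy stand-in for a reflex prediction model (assumption, not the paper's).

    Penalizes each mismatched position and any length difference.
    The language argument is ignored by this toy scorer.
    """
    mismatches = sum(a != b for a, b in zip(protoform, reflex))
    return -float(mismatches + abs(len(protoform) - len(reflex)))


def rerank(
    candidates: List[Candidate],
    reflexes: Dict[str, str],
    reflex_log_prob: Callable[[str, str, str], float],
    weight: float = 1.0,
) -> List[Candidate]:
    """Rerank candidates by reconstruction score plus weighted reflex score."""
    rescored = []
    for protoform, recon_score in candidates:
        # How well are all observed reflexes predicted from this candidate?
        reflex_score = sum(
            reflex_log_prob(protoform, lang, form)
            for lang, form in reflexes.items()
        )
        rescored.append((protoform, recon_score + weight * reflex_score))
    # Best combined score first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Made-up cognate set and candidate list, for illustration only.
reflexes = {"LangA": "kapa", "LangB": "kaba"}
candidates = [("kata", -0.4), ("kapa", -0.5)]
reranked = rerank(candidates, reflexes, toy_reflex_log_prob)
print(reranked[0][0])
```

In this made-up example the reconstruction score alone prefers "kata", but "kapa" accounts for the reflexes better and moves to the top after reranking.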

Comments:	Accepted to LREC-COLING 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2403.18769 [cs.CL]
	(or arXiv:2403.18769v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.18769

Submission history

From: Liang Lu
[v1] Wed, 27 Mar 2024 17:13:38 UTC (496 KB)
