Computer Science > Computation and Language

arXiv:2303.08809 (cs)

[Submitted on 15 Mar 2023 (v1), last revised 9 May 2023 (this version, v2)]

Title: Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences

Authors: Yuan Tseng, Cheng-I Lai, Hung-yi Lee


Abstract: Past work on unsupervised parsing has been limited to written text. In this paper, we present the first study on unsupervised spoken constituency parsing given unlabeled spoken sentences and unpaired textual data. The goal is to determine the hierarchical syntactic structure of spoken sentences in the form of constituency parse trees, such that each node is a span of audio that corresponds to a constituent. We compare two approaches: (1) cascading an unsupervised automatic speech recognition (ASR) model and an unsupervised parser to obtain parse trees on ASR transcripts, and (2) directly training an unsupervised parser on continuous word-level speech representations. The direct approach first splits utterances into sequences of word-level segments, then aggregates self-supervised speech representations within each segment to obtain segment embeddings. We find that separately training a parser on the unpaired text and directly applying it to ASR transcripts at inference time produces better results for unsupervised parsing. Additionally, our results suggest that accurate segmentation alone may be sufficient to parse spoken sentences accurately. Finally, we show that the direct approach may learn head-directionality correctly for both head-initial and head-final languages without any explicit inductive bias.
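
The segment-embedding step of the direct approach can be illustrated with a minimal sketch. The abstract does not say how representations are aggregated within a segment, so mean pooling is an assumption here, as are the function name `mean_pool_segments`, the half-open `(start, end)` frame-index format for word boundaries, and the toy 2-dimensional frames standing in for self-supervised speech features.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def mean_pool_segments(
    frames: Sequence[Frame],
    boundaries: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Average frame-level features inside each [start, end) segment.

    Mean pooling is an assumed aggregation; the abstract only says that
    representations are aggregated within segments.
    """
    embeddings: List[List[float]] = []
    for start, end in boundaries:
        if not 0 <= start < end <= len(frames):
            raise ValueError(f"invalid segment boundary: ({start}, {end})")
        segment = frames[start:end]
        dim = len(segment[0])
        # One embedding per word-level segment: the per-dimension mean.
        embeddings.append(
            [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]
        )
    return embeddings


# Toy utterance: 5 frames of 2-dimensional features (hypothetical values).
frames = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0], [9.0, 8.0]]
# Hypothetical word boundaries: frames 0-1 form one word, frames 2-4 another.
boundaries = [(0, 2), (2, 5)]

segment_embeddings = mean_pool_segments(frames, boundaries)
print(segment_embeddings)  # [[2.0, 1.0], [7.0, 6.0]]
```

The resulting sequence of segment embeddings plays the role that word embeddings play in a text parser: one vector per word-level unit, in utterance order.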

Comments:	Accepted to ICASSP 2023; updated compute resource acknowledgements
Subjects:	Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2303.08809 [cs.CL]
	(or arXiv:2303.08809v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2303.08809

Submission history

From: Yuan Tseng
[v1] Wed, 15 Mar 2023 17:57:22 UTC (578 KB)
[v2] Tue, 9 May 2023 10:36:55 UTC (578 KB)
