Improving Embedding-based Unsupervised Keyphrase Extraction by Incorporating Structural Information

Mingyang Song, Huafeng Liu, Yi Feng, Liping Jing

Abstract

Keyphrase extraction aims to extract a set of phrases that capture the central idea of the source document. In a structured document, there are certain locations (e.g., the title or the first sentence) where a keyphrase is most likely to appear. However, most existing embedding-based unsupervised keyphrase extraction models ignore the indicative role of the highlights in these locations, which leads to incorrect keyphrase extraction. In this paper, we propose a new Highlight-Guided Unsupervised Keyphrase Extraction model (HGUKE) to address this issue. Specifically, HGUKE first models the phrase-document relevance via the highlights of the document. Next, HGUKE calculates the cross-phrase relevance between all candidate phrases. Finally, HGUKE aggregates these two relevance scores into an importance score for each candidate phrase, which it uses to rank and extract keyphrases. Experimental results on three benchmarks demonstrate that HGUKE outperforms state-of-the-art unsupervised keyphrase extraction baselines.
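
The three steps named in the abstract (phrase-document relevance via highlights, cross-phrase relevance, aggregation) can be illustrated with a minimal sketch. The abstract does not specify the embedding model, the similarity function, or the aggregation rule, so everything below is an assumption made for illustration: the toy vectors stand in for real embeddings, cosine similarity stands in for both relevance measures, cross-phrase relevance is taken as the mean similarity to the other candidates, and the two scores are simply summed. The names `cosine` and `rank_phrases` are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_phrases(phrase_vecs, highlight_vec, top_k=3):
    """Rank candidate phrases by an assumed sum of two relevance scores.

    phrase_vecs:   dict mapping candidate phrase -> embedding (list of floats)
    highlight_vec: embedding of the highlight (e.g. title or first sentence)
    """
    names = list(phrase_vecs)
    scores = {}
    for name in names:
        # Step 1 (assumed form): phrase-document relevance via the highlight.
        doc_rel = cosine(phrase_vecs[name], highlight_vec)
        # Step 2 (assumed form): mean similarity to every other candidate.
        others = [cosine(phrase_vecs[name], phrase_vecs[o])
                  for o in names if o != name]
        cross_rel = sum(others) / len(others) if others else 0.0
        # Step 3 (assumed form): aggregate by simple addition.
        scores[name] = doc_rel + cross_rel
    ranked = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranked[:top_k], scores


# Toy embeddings, invented for this example only.
highlight = [1.0, 0.0, 0.0]
candidates = {
    "keyphrase extraction": [1.0, 0.0, 0.0],
    "embedding model": [1.0, 1.0, 0.0],
    "weather report": [0.0, 0.0, 1.0],
}
ranked, scores = rank_phrases(candidates, highlight)
print(ranked)
```

In this toy setup the candidate aligned with the highlight ranks first and the unrelated candidate ranks last. A real system would obtain the vectors from a pre-trained language model and would likely use a different aggregation than the plain sum assumed here.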

Anthology ID: 2023.findings-acl.66
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1041–1048
URL: https://aclanthology.org/2023.findings-acl.66
DOI: 10.18653/v1/2023.findings-acl.66
Cite (ACL): Mingyang Song, Huafeng Liu, Yi Feng, and Liping Jing. 2023. Improving Embedding-based Unsupervised Keyphrase Extraction by Incorporating Structural Information. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1041–1048, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Improving Embedding-based Unsupervised Keyphrase Extraction by Incorporating Structural Information (Song et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.66.pdf
