
DOI: 10.1007/978-981-97-5495-3_24
Article

Entity Set Expansion Based on Category Prompts in MOOCs

Published: 16 August 2024

Abstract

Entity Set Expansion (ESE) is an important task in natural language processing that expands a small seed set of entities into new entities belonging to the same semantic class. Early ESE approaches generated entities iteratively through bootstrapping, but this method suffers from semantic drift. Some studies have introduced categories to guide entity generation, yet the corpus plays no part in generating those categories; the resulting categories can therefore be too coarse-grained and inaccurate, which in turn directly degrades entity generation. To address these challenges, we introduce a text summarization model to fully mine the semantic information of entities and apply double filtering to further sharpen their semantic boundaries. In addition, we propose a two-stage framework for entity expansion. We also provide a Chinese MOOC-ESE dataset consisting of 476 courses and 45,938 entity concepts. Experimental results show that our method outperforms baseline models on MAP evaluation.
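The abstract evaluates ranked expansion lists with MAP (mean average precision). As a reminder of how that metric scores a ranked list against a gold entity set, here is a minimal sketch; the entity names and the exact normalization (dividing by the gold-set size) are illustrative assumptions, not taken from the paper.

```python
def average_precision(ranked, gold):
    """Average precision of one ranked expansion list against a gold entity set."""
    hits = 0
    precisions = []
    for rank, entity in enumerate(ranked, start=1):
        if entity in gold:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant hit
    return sum(precisions) / len(gold) if gold else 0.0

def mean_average_precision(runs):
    """MAP over several (ranked list, gold set) query pairs."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)

# Example: two seed-set queries with hypothetical course-concept entities.
runs = [
    (["gradient descent", "overfitting", "recursion"],
     {"gradient descent", "overfitting"}),
    (["stack", "queue", "entropy"], {"queue"}),
]
print(mean_average_precision(runs))  # → 0.75
```

The first query hits gold entities at ranks 1 and 2 (AP = 1.0); the second hits its only gold entity at rank 2 (AP = 0.5), giving a MAP of 0.75.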



Published In

Knowledge Science, Engineering and Management: 17th International Conference, KSEM 2024, Birmingham, UK, August 16–18, 2024, Proceedings, Part II
Aug 2024
476 pages
ISBN: 978-981-97-5494-6
DOI: 10.1007/978-981-97-5495-3
Editors: Cungeng Cao, Huajun Chen, Liang Zhao, Junaid Arshad, Taufiq Asyhari, Yonghao Wang

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Entity set extension
  2. Chinese MOOCs
  3. Category generation
  4. Prompt
  5. Entity generation
  6. Fine-grained
