
DOI: 10.1007/978-981-97-5495-3_24
Article

Entity Set Expansion Based on Category Prompts in MOOCs

Published: 16 August 2024

Abstract

Entity Set Expansion (ESE) is an important task in natural language processing that expands a small seed set of entities into new entities belonging to the same semantic class. Early ESE approaches generated entities iteratively through bootstrapping, but this method suffers from semantic drift. Some studies have introduced categories to guide entity generation, yet the corpus plays no part in generating those categories; the resulting categories can therefore be too coarse-grained and inaccurate, which in turn directly degrades entity generation. To address these challenges, we introduce a text summarization model to fully mine the semantic information of entities and apply double filtering to further sharpen their semantic boundaries. In addition, we propose a two-stage framework for entity expansion. We also provide a Chinese MOOC-ESE dataset consisting of 476 courses and 45,938 entity concepts. Experimental results show that our method outperforms baseline models on MAP evaluation.
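The abstract evaluates ranked expansion lists with MAP (mean average precision). As a reminder of how that metric scores a ranked list against a gold entity set, here is a minimal sketch; the entity names and the exact normalization (dividing by the gold-set size) are illustrative assumptions, not taken from the paper.

```python
def average_precision(ranked, gold):
    """Average precision of one ranked expansion list against a gold entity set."""
    hits = 0
    precisions = []
    for rank, entity in enumerate(ranked, start=1):
        if entity in gold:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant hit
    return sum(precisions) / len(gold) if gold else 0.0

def mean_average_precision(runs):
    """MAP over several (ranked list, gold set) query pairs."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)

# Example: two seed-set queries with hypothetical course-concept entities.
runs = [
    (["gradient descent", "overfitting", "recursion"],
     {"gradient descent", "overfitting"}),
    (["stack", "queue", "entropy"], {"queue"}),
]
print(mean_average_precision(runs))  # → 0.75
```

The first query hits gold entities at ranks 1 and 2 (AP = 1.0); the second hits its only gold entity at rank 2 (AP = 0.5), giving a MAP of 0.75.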



Published In

Knowledge Science, Engineering and Management: 17th International Conference, KSEM 2024, Birmingham, UK, August 16–18, 2024, Proceedings, Part II
Aug 2024
476 pages
ISBN: 978-981-97-5494-6
DOI: 10.1007/978-981-97-5495-3
Editors: Cungeng Cao, Huajun Chen, Liang Zhao, Junaid Arshad, Taufiq Asyhari, Yonghao Wang

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Entity set extension
  2. Chinese MOOCs
  3. Category generation
  4. Prompt
  5. Entity generation
  6. Fine-grained
