Abstract
Bag-of-words is the most commonly used representation in text mining and many other applications. However, it has obvious shortcomings, notably that it ignores semantic information, and in document analysis semantic information often plays a more important role than individual words. To tackle this problem, we borrow semantic information from ontologies to model text better. An expert-edited ontology is usually well structured and more authoritative than an online cyclopedia. On the other hand, because expert editing is costly, such ontologies struggle to keep up with the deluge of new words. In this paper, we propose a method for constructing a Chinese ontology that keeps the carefully designed structure of an expert-edited ontology while incorporating new vocabulary from an online cyclopedia. We name the enhanced ontology the Chinese Concept Encyclopedia (CCE) and employ it in several text mining applications. The experimental results show that CCE outperforms the expert-edited ontology Chinese Concept Dictionary (CCD).
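The bag-of-words limitation described above can be illustrated with a minimal sketch. It is not the CCE method: the word-to-concept table `TOY_ONTOLOGY` is a hypothetical stand-in for an ontology lookup, the helper names are assumptions, and English tokens are used only for readability. It compares two short documents that share no words but express the same concepts.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy ontology (an assumption, not CCE or CCD): word -> concept.
TOY_ONTOLOGY = {
    "car": "vehicle",
    "automobile": "vehicle",
    "buy": "purchase",
    "purchase": "purchase",
}


def bag_of_words(tokens):
    """Count surface words only; synonyms stay distinct features."""
    return Counter(tokens)


def bag_of_concepts(tokens):
    """Map each word to its concept when the toy ontology knows it."""
    return Counter(TOY_ONTOLOGY.get(t, t) for t in tokens)


doc_a = ["buy", "car"]
doc_b = ["purchase", "automobile"]

# No shared surface words, so the bag-of-words similarity is zero.
bow_sim = cosine(bag_of_words(doc_a), bag_of_words(doc_b))

# Both documents map to the same two concepts, so the similarity is close to 1.
concept_sim = cosine(bag_of_concepts(doc_a), bag_of_concepts(doc_b))

print(bow_sim, concept_sim)
```

The sketch also hints at why vocabulary coverage matters: a word missing from the lookup table falls back to its surface form, so new words gain nothing from the ontology until they are added to it.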
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nian, J., Jiang, S., Huang, C., Zhang, Y. (2011). CCE: A Chinese Concept Encyclopedia Incorporating the Expert-Edited Chinese Concept Dictionary with Online Cyclopedias. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science(), vol 7120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25853-4_16
DOI: https://doi.org/10.1007/978-3-642-25853-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25852-7
Online ISBN: 978-3-642-25853-4
eBook Packages: Computer Science (R0)