The Princeton WordNet® (PWN) is a widely used lexical knowledge database for semantic information processing, and wordnets are now being built for many languages worldwide. In this paper, we construct a wordnet for Pre-Qin ancient Chinese (PQAC), called PQAC WordNet (PQAC-WN), to support the processing of PQAC semantic information. Most recently constructed wordnets have been built either manually by experts or automatically from resources that yield translation pairs between English and the target language. The former approach is time-consuming, and the latter cannot be applied to PQAC because the required language resources are lacking. We therefore propose a method based on word definitions in a monolingual dictionary. Specifically, for each word sense, kernel words are first extracted from its definition, and the senses of each kernel word are then determined by graph-based Word Sense Disambiguation. Finally, one optimal sense is chosen from the kernel word senses to guide the mapping between the word sense and a PWN synset. With this method, 66% of PQAC senses can be shared with English, and a further 14% are language-specific senses that are added to PQAC-WN as new synsets. Overall, the automatic mapping achieves a precision of over 85%.
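As an illustration only, the disambiguation and selection steps might be sketched as below. The sketch rests on assumptions that are not taken from this paper: the toy English sense inventory, the sense identifiers, the use of gloss-word overlap as edge weight, and weighted degree as the graph score are all hypothetical stand-ins for whatever graph-based measure is actually used.

```python
from itertools import combinations

# Hypothetical toy inventory: kernel word -> {sense id: gloss words}.
SENSES = {
    "bank": {
        "bank.river": {"river", "water", "slope", "land"},
        "bank.money": {"money", "deposit", "institution"},
    },
    "stream": {
        "stream.water": {"river", "water", "flow"},
        "stream.data": {"data", "flow", "sequence"},
    },
    "shore": {
        "shore.land": {"land", "water", "edge"},
    },
}


def disambiguate(senses):
    """Score each candidate sense by its weighted degree in a sense graph."""
    nodes = [(word, sense) for word, inv in senses.items() for sense in inv]
    score = {sense: 0 for _, sense in nodes}
    for (w1, s1), (w2, s2) in combinations(nodes, 2):
        if w1 == w2:
            continue  # senses of the same word compete, so they share no edge
        # Assumed edge weight: number of gloss words the two senses share.
        weight = len(senses[w1][s1] & senses[w2][s2])
        score[s1] += weight
        score[s2] += weight
    # Keep the highest-scoring sense of each kernel word.
    best = {word: max(inv, key=lambda s: score[s]) for word, inv in senses.items()}
    return best, score


best, score = disambiguate(SENSES)
# Assumed selection rule: the single best-connected sense guides the mapping.
guide = max(best.values(), key=lambda s: score[s])
print(best, guide)
```

In this toy setting the water-related senses reinforce one another, so they are preferred over the unrelated financial and data senses; a real system would replace the overlap weight with relations drawn from PWN.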
The 25 books are Zuo Zhuan (左传), Guanzi (管子), Hanfeizi (韩非子), Lü Shi Chun Qiu (吕氏春秋), Liji (礼记), Mozi (墨子), Xunzi (荀子), Guo Yu (国语), Yili (仪礼), Zhuangzi (庄子), The Rites of Zhou (周礼), Gongyang Zhuan (公羊传), Guliang Zhuan (谷梁传), Yanzi Chun Qiu (晏子春秋), Mengzi (孟子), Book of Poetry (诗经), Shang Shu (尚书), Book of Changes (周易), Shang Jun Shu (商君书), The Analects (论语), Chu Ci (楚辞), The Art of War (孙子兵法), Tao Te Ching (道德经), Wuzi (吴子) and Xiao Jing (孝经).
We are grateful to the reviewers for their comments. This work is an interim result of projects supported by the National Social Science Foundation of China (10&ZD117, 12&ZD177) and the Ministry of Education of China (16YJC740034).
