Using Semi-supervised Learning for Question Classification

Nguyen Thanh Tri,
Nguyen Minh Le &
Akira Shimazu

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Included in the following conference series:

International Conference on Computer Processing of Oriental Languages

Abstract

This paper uses unlabelled questions in combination with labelled questions for semi-supervised learning, with the aim of improving the performance of the question classification task. We also propose two modifications to Tri-training, a simple but efficient co-training style algorithm, to make it more suitable for question data. Both avoid bootstrap-sampling the training set to obtain different training sets for the three classifiers: the first uses multiple algorithms for the classifiers in Tri-training, and the second uses multiple algorithms for the classifiers in combination with multiple views. The modifications prevent the error rate at the initial step from increasing, and our experiments show promising results.
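The labelling step that Tri-training is built on can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the agreement rule from Tri-training (an unlabelled example is proposed to one classifier when the other two agree on its label) and omits the error-rate checks and retraining loop of the full algorithm. The three rule-based functions are hypothetical stand-ins for three different learning algorithms, and the class labels and example questions are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps a question string to a class label.
Classifier = Callable[[str], str]


def by_wh_word(question: str) -> str:
    """Hypothetical classifier 1: decide from the first word only."""
    first = question.lower().split()[0]
    table = {"who": "HUMAN", "where": "LOCATION", "when": "NUMERIC"}
    return table.get(first, "DESCRIPTION")


def by_keyword(question: str) -> str:
    """Hypothetical classifier 2: decide from keywords anywhere in the text."""
    text = question.lower()
    if "city" in text or "country" in text:
        return "LOCATION"
    if "year" in text or "many" in text:
        return "NUMERIC"
    if "who" in text:
        return "HUMAN"
    return "DESCRIPTION"


def by_length(question: str) -> str:
    """Hypothetical classifier 3: a deliberately weak length-based rule."""
    return "DESCRIPTION" if len(question.split()) > 6 else "HUMAN"


def pseudo_label_round(
    classifiers: Sequence[Classifier], unlabelled: Sequence[str]
) -> List[List[Tuple[str, str]]]:
    """One labelling pass of the agreement rule.

    For each classifier i, collect the unlabelled questions on which the
    other two classifiers agree, paired with the agreed label.
    """
    if len(classifiers) != 3:
        raise ValueError("the agreement rule needs exactly three classifiers")
    proposals: List[List[Tuple[str, str]]] = [[], [], []]
    for question in unlabelled:
        preds = [clf(question) for clf in classifiers]
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            if preds[j] == preds[k]:
                proposals[i].append((question, preds[j]))
    return proposals


unlabelled_questions = [
    "Who wrote Hamlet ?",
    "Where is the city of Nomi ?",
    "What year did the war end ?",
]
proposals = pseudo_label_round(
    [by_wh_word, by_keyword, by_length], unlabelled_questions
)
for index, items in enumerate(proposals):
    print(f"classifier {index} receives {items}")
```

Because the three classifiers already differ in how they decide, they disagree on some questions even when given the same labelled data, which is the property the abstract relies on to avoid bootstrap sampling.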

References

Berger, A., Pietra, S.D., Pietra, V.D.: A maximum entropy approach to natural language processing. Computational Linguistics 22(1) (1996)
Carlson, A., Cumby, C., Roth, D.: The SNoW learning architecture, Technical Report UIUC-DCS-R-99-2101, UIUC Computer Science Department (1999)
Cortes, C., Vapnik, V.: Support vector networks. Machine Learning 20(3), 273–297 (1995)
Zhang, D., Lee, W.S.: Question classification using Support vector machine. In: Proceedings of the 26th Annual International ACM SIGIR Conference, pp. 26–32 (2003)
Voorhees, E.: The TREC-8 Question Answering Track Report. In: Proceedings of the 8th Text Retrieval Conference (TREC8), pp. 77–82 (1999)
Voorhees, E.: The TREC-9 Question Answering Track. In: Proceedings of the 9th Text Retrieval Conference (TREC9), pp. 71–80 (2000)
Voorhees, E.: Overview of the TREC 2001 Question Answering Track. In: Proceedings of the 10th Text Retrieval Conference (TREC10), pp. 157–165 (2001)
Muhlenbach, F., et al.: Identifying and handling mislabelled instances. Journal of Intelligent Information Systems 22(1), 89–109 (2004)
Kanji, G.K.: 100 Statistical Tests. SAGE Publications, Thousand Oaks (1994)
Hacioglu, K., Ward, W.: Question classification with Support vector machines and error correcting codes. In: Proceedings of NAACL/Human Language Technology Conference, pp. 28–30 (2003)
Goldman, S., Zhou, Y.: Enhancing supervised learning with unlabeled data. In: Proceedings of the 17th International Conference on Machine Learning, pp. 327–334 (2000)
Joachims, T.: Text categorization with Support vector machines: Learning with many relevant features. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 137–142. Springer, Heidelberg (1998)
Li, X., Roth, D.: Learning question classifiers. In: Proceedings of the 19th International Conference on Computational Linguistics, pp. 556–562 (2002)
Zhou, Z.-H., Li, M.: Tri-training: Exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering 17(11) (2005)

Author information

Authors and Affiliations

School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan
Nguyen Thanh Tri, Nguyen Minh Le & Akira Shimazu

Editor information

Editors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan
Yuji Matsumoto
Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA
Richard W. Sproat
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Kam-Fai Wong
State Key Lab of Intelligent Tech. & Sys., Tsinghua University,
Min Zhang

About this paper

Cite this paper

Tri, N.T., Le, N.M., Shimazu, A. (2006). Using Semi-supervised Learning for Question Classification. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_4

DOI: https://doi.org/10.1007/11940098_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science (R0)
