A Feature Selection for Text Categorization on Research Support System Papits

Tadachika Ozono, Toramatsu Shintani, Takayuki Ito & Tomoharu Hasegawa

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3157)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

We have developed a research support system, called Papits, that shares research information, such as PDF files of research papers, among computers on a network and classifies that information by research field. Users of Papits can share a variety of research information and survey the corpora of their particular fields of research. To realize Papits, we need a mechanism for identifying which words are best suited to classifying documents into predefined classes. We must also handle cases where documents are classified into multivalued fields and where there is insufficient data for classification. In this paper, we present an implementation of automatic classification for Papits based on a text classification technique. We also propose a new feature selection method for classifying documents, represented as bags of words, into a multivalued category. Our method transforms the multivalued category into a binary category so that the words characteristic of a category can be identified easily even from a small amount of training data. Our experimental results indicate that our method can effectively classify documents in Papits.
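The abstract does not give the details of the selection procedure, so the following is only a minimal sketch of the general idea it describes: recast a multivalued category as a binary one (one category versus the rest) and rank bag-of-words features against that binary label. The scoring function (information gain), the function names, and the toy documents are all assumptions made for illustration, not the authors' method or data.

```python
import math


def entropy(pos, total):
    """Binary entropy (in bits) of a set with `pos` positives out of `total`."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(docs, binary_labels, word):
    """Information gain of the presence/absence of `word` for a 0/1 label.

    Information gain is an assumed scoring function; the abstract does not
    say which measure the paper uses.
    """
    n = len(docs)
    with_word = [y for d, y in zip(docs, binary_labels) if word in d]
    without_word = [y for d, y in zip(docs, binary_labels) if word not in d]
    conditional = (
        len(with_word) / n * entropy(sum(with_word), len(with_word))
        + len(without_word) / n * entropy(sum(without_word), len(without_word))
    )
    return entropy(sum(binary_labels), n) - conditional


def select_features(docs, labels, k):
    """For each category, binarize the labels (category vs. rest) and keep
    the k words with the highest information gain for that binary label."""
    vocab = sorted(set().union(*docs))
    selected = {}
    for category in sorted(set(labels)):
        binary = [1 if y == category else 0 for y in labels]
        ranked = sorted(
            vocab,
            key=lambda w: information_gain(docs, binary, w),
            reverse=True,
        )
        selected[category] = ranked[:k]
    return selected


# Hypothetical toy corpus: each document is a bag (set) of words.
docs = [
    {"svm", "kernel", "margin"},
    {"svm", "kernel", "text"},
    {"agent", "auction", "negotiation"},
    {"agent", "negotiation", "text"},
    {"parser", "grammar", "text"},
    {"parser", "grammar", "corpus"},
]
labels = ["ml", "ml", "agents", "agents", "nlp", "nlp"]

features = select_features(docs, labels, k=2)
for category, words in features.items():
    print(category, words)
```

In this toy corpus the word "text" appears once in every category, so it carries no information about any single category and is never selected, while words confined to one category rank highest for it.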

Author information

Authors and Affiliations

Nagoya Institute of Technology, Gokiso, Showa-ku, Nagoya, Aichi, 466-8555, Japan
Tadachika Ozono, Toramatsu Shintani, Takayuki Ito & Tomoharu Hasegawa

Editor information

Editors and Affiliations

Faculty of Engineering and Information Technology, Centre for Quantum Computation and Intelligent Systems, and Australian ACS National Committee for Artificial Intelligence, University of Technology, Sydney, Australia
Chengqi Zhang
Department of Computer Science, Auckland University of Technology, 1020, Auckland, New Zealand
Hans W. Guesgen
Artificial Intelligence Technology Centre, Auckland University of Technology, Auckland, New Zealand
Wai-Kiang Yeap

About this paper

Cite this paper

Ozono, T., Shintani, T., Ito, T., Hasegawa, T. (2004). A Feature Selection for Text Categorization on Research Support System Papits. In: Zhang, C., Guesgen, H.W., Yeap, WK. (eds) PRICAI 2004: Trends in Artificial Intelligence. PRICAI 2004. Lecture Notes in Computer Science, vol 3157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28633-2_56

DOI: https://doi.org/10.1007/978-3-540-28633-2_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22817-2
Online ISBN: 978-3-540-28633-2
eBook Packages: Springer Book Archive
