Accuracy improvement of automatic text classification based on feature transformation

Published: 20 November 2003
DOI: 10.1145/958220.958242

Abstract

In this paper, we present a comparative study of feature transformation and classification techniques for improving the accuracy of automatic text classification. Three transformations were applied to the feature vectors: normalization to relative word frequency, principal component analysis (K-L transformation), and the power transformation. The transformed vectors were then classified by the Euclidean distance, the linear discriminant function, the projection distance, the modified projection distance, and the SVM.
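The transformation chain named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exponent `v=0.5`, the two retained components, the order of the steps, the reading of "Euclidean distance" as a nearest-class-mean rule, and the toy word counts are all assumptions made for illustration. Only NumPy is used.

```python
import numpy as np

def relative_frequency(counts):
    """Divide each document's word counts by its total count (rows sum to 1)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # keep all-zero documents as zero vectors
    return counts / totals

def power_transform(x, v=0.5):
    """Element-wise power transformation x**v; v=0.5 is an assumed exponent."""
    return np.power(x, v)

def fit_pca(x, n_components):
    """Return the training mean and the leading principal axes (K-L basis)."""
    mean = x.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(x, mean, axes):
    """Project vectors onto the principal axes."""
    return (x - mean) @ axes.T

def nearest_mean_predict(train_x, train_y, test_x):
    """Assign each test vector to the class whose mean is closest (Euclidean)."""
    classes = np.unique(train_y)
    means = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_x[:, None, :] - means[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Hypothetical toy data: 4 training documents, 4 vocabulary words, 2 classes.
train_counts = np.array([[8, 1, 1, 0], [6, 2, 0, 0], [0, 1, 7, 4], [1, 0, 5, 6]])
train_labels = np.array([0, 0, 1, 1])
test_counts = np.array([[5, 1, 0, 0], [0, 0, 3, 3]])

rel_train = relative_frequency(train_counts)
rel_test = relative_frequency(test_counts)

mean, axes = fit_pca(power_transform(rel_train), n_components=2)
z_train = project(power_transform(rel_train), mean, axes)
z_test = project(power_transform(rel_test), mean, axes)

pred = nearest_mean_predict(z_train, train_labels, z_test)
print(pred)
```

The PCA basis is fitted on the training vectors only and then reused for the test vectors, so no information from the test documents leaks into the transformation.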

    Published In

    DocEng '03: Proceedings of the 2003 ACM symposium on Document engineering
    November 2003
    260 pages
    ISBN:1581137249
    DOI:10.1145/958220
    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. automatic text classification
    2. principal component analysis
    3. variable transformation

    Qualifiers

    • Article

    Conference

    DocEng03: ACM Symposium on Document Engineering
    November 20 - 22, 2003
    Grenoble, France

    Acceptance Rates

    Overall Acceptance Rate 194 of 564 submissions, 34%

    Cited By

    • (2022) Machine Learning Algorithm for Text Categorization of News Articles from Senegalese Online News Websites. 2022 17th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1-8. DOI: 10.23919/CISTI54924.2022.9820408. Online publication date: 22-Jun-2022
    • (2017) Text Classification for Organizational Researchers. Organizational Research Methods 21:3, pp. 766-799. DOI: 10.1177/1094428117719322. Online publication date: 12-Jul-2017
    • (2017) A filtering of message in online social network using hybrid classifier. Cluster Computing. DOI: 10.1007/s10586-017-1300-y. Online publication date: 12-Nov-2017
    • (2009) Increasing the Accuracy of Discriminative of Multinomial Bayesian Classifier in Text Classification. Proceedings of the 2009 Fourth International Conference on Computer Sciences and Convergence Information Technology, pp. 1246-1251. DOI: 10.1109/ICCIT.2009.13. Online publication date: 24-Nov-2009
    • (2008) MEDLINE Abstracts Classification Based on Noun Phrases Extraction. Biomedical Engineering Systems and Technologies, pp. 507-519. DOI: 10.1007/978-3-540-92219-3_38. Online publication date: 2008
    • (2006) NEWPAR. Proceedings of the 2006 ACM symposium on Document engineering, pp. 128-137. DOI: 10.1145/1166160.1166196. Online publication date: 10-Oct-2006
    • (2006) The impact of OCR accuracy and feature transformation on automatic text classification. Proceedings of the 7th international conference on Document Analysis Systems, pp. 506-517. DOI: 10.1007/11669487_45. Online publication date: 13-Feb-2006
    • (2005) Text classification. Proceedings of the 9th WSEAS International Conference on Computers, pp. 1-6. DOI: 10.5555/1369599.1369724. Online publication date: 14-Jul-2005
    • (2004) Text type structure and logical document structure. Proceedings of the 2004 ACL Workshop on Discourse Annotation, pp. 49-56. DOI: 10.5555/1608938.1608945. Online publication date: 25-Jul-2004
    • (2004) The Impact of OCR Accuracy on Automatic Text Classification. Content Computing, pp. 403-409. DOI: 10.1007/978-3-540-30483-8_49. Online publication date: 2004