DOI: 10.1145/956863.956924
Article

Efficient multi-way text categorization via generalized discriminant analysis

Published: 03 November 2003

Abstract

Text categorization is an important research area that has received much attention owing to the growth of online information and of the Internet. Automated text categorization is generally cast as a multi-class classification problem, yet much previous work has focused on binary document classification. Support vector machines (SVMs) excel at binary classification, but the elegant theory behind large-margin hyperplanes cannot be easily extended to multi-class text classification. In addition, training time and scalability are important concerns. On the other hand, techniques that extend naturally to multi-class classification are generally not as accurate as SVMs. This paper presents a simple and efficient solution to multi-class text categorization. Classification problems are first formulated as optimization problems via discriminant analysis. Text categorization is then cast as the problem of finding coordinate transformations that reflect the inherent similarity in the data. While most previous approaches decompose a multi-class classification problem into multiple independent binary classification tasks, the proposed approach enables direct multi-class classification. Using the Generalized Singular Value Decomposition (GSVD), a coordinate transformation that reflects the inherent class structure indicated by the generalized singular values is identified. Extensive experiments demonstrate the efficiency and effectiveness of the proposed approach.
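
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: direct multi-class classification through a discriminant coordinate transformation. It is not the authors' method. In place of the GSVD it uses a ridge-regularized Cholesky whitening followed by a symmetric eigendecomposition, which is a common stand-in when the within-class scatter matrix may be singular. Classifying by nearest class centroid in the projected space is likewise an assumption. The names `fit_discriminant`, `predict`, and `reg` are hypothetical.

```python
import numpy as np


def fit_discriminant(X, y, reg=1e-3):
    """Learn a linear map W that separates the classes in y.

    X: (n_docs, n_features) document-feature matrix.
    y: (n_docs,) integer class labels.
    reg: ridge term (ASSUMPTION, replaces the paper's GSVD treatment).
    """
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for c in classes:
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_w += centered.T @ centered
        gap = (mean_c - overall_mean)[:, None]
        S_b += len(X_c) * (gap @ gap.T)

    # Keep S_w positive definite so the Cholesky factorization exists.
    S_w += reg * np.eye(n_features)

    # Whiten with S_w = L L^T, turning S_b w = lambda S_w w into a
    # symmetric eigenproblem on L^{-1} S_b L^{-T}.
    L = np.linalg.cholesky(S_w)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order

    # At most (number of classes - 1) directions carry class information.
    top = np.argsort(eigvals)[::-1][: len(classes) - 1]
    W = L_inv.T @ eigvecs[:, top]

    centroids = np.vstack([X[y == c].mean(axis=0) @ W for c in classes])
    return W, centroids, classes


def predict(X, W, centroids, classes):
    """Assign each row of X to the nearest class centroid after projection."""
    Z = X @ W
    dists = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


if __name__ == "__main__":
    # Synthetic stand-in for document vectors: three separated clusters.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0, 0, 0], [20.0, 0, 0, 0], [0, 20.0, 0, 0]])
    y = np.repeat(np.arange(3), 30)
    X = centers[y] + rng.normal(size=(90, 4))
    W, centroids, classes = fit_discriminant(X, y)
    print("training accuracy:", (predict(X, W, centroids, classes) == y).mean())
```

All classes are handled in one projection, with no decomposition into independent binary tasks, which is the contrast the abstract draws. For real term-document data the feature dimension usually exceeds the number of documents, so the ridge term here does real work, and a GSVD-based formulation, as the paper proposes, avoids that ad hoc choice.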


Published In

CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management
November 2003
592 pages
ISBN:1581137230
DOI:10.1145/956863
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GSVD
  2. discriminant analysis
  3. multi-class text categorization

Qualifiers

  • Article

Conference

CIKM03

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Cited By

  • (2023) The Typology of Public Schools in the State of Louisiana and Interventions to Improve Performance: A Machine Learning Approach. Education Sciences 13:2 (160). DOI: 10.3390/educsci13020160. Online publication date: 2-Feb-2023
  • (2018) Graph and Sparse-Based Robust Nonnegative Block Value Decomposition for Clustering. IEEE Journal of Selected Topics in Signal Processing 12:6 (1561-1574). DOI: 10.1109/JSTSP.2018.2877041. Online publication date: Dec-2018
  • (2013) Finding multiple global linear correlations in sparse and noisy data sets. Knowledge-Based Systems 53 (40-50). DOI: 10.1016/j.knosys.2013.08.015. Online publication date: 1-Nov-2013
  • (2012) Complementary distribution BPSO for feature selection. Intelligent Data Analysis 16:2 (183-198). DOI: 10.5555/2608507.2608510. Online publication date: 1-Mar-2012
  • (2011) Representing document as dependency graph for document clustering. Proceedings of the 20th ACM international conference on Information and knowledge management (2177-2180). DOI: 10.1145/2063576.2063920. Online publication date: 24-Oct-2011
  • (2011) A multiclass/multilabel document categorization system. Applied Soft Computing 11:8 (4981-4990). DOI: 10.1016/j.asoc.2011.06.002. Online publication date: 1-Dec-2011
  • (2009) Proximal support vector machine using local information. Neurocomputing 73:1-3 (357-365). DOI: 10.1016/j.neucom.2009.08.002. Online publication date: 1-Dec-2009
  • (2008) Text categorization via generalized discriminant analysis. Information Processing and Management: an International Journal 44:5 (1684-1697). DOI: 10.1016/j.ipm.2008.03.005. Online publication date: 1-Sep-2008
  • (2007) Searching with style. Proceedings of the thirtieth Australasian conference on Computer science - Volume 62 (59-68). DOI: 10.5555/1273749.1273757. Online publication date: 30-Jan-2007
  • (2007) Hierarchical document classification using automatically generated hierarchy. Journal of Intelligent Information Systems 29:2 (211-230). DOI: 10.1007/s10844-006-0019-7. Online publication date: 1-Oct-2007
