DOI: 10.1145/1102351.1102404
Article

Generalized LARS as an effective feature selection tool for text classification with SVMs

Published: 07 August 2005

Abstract

In this paper we generalize the LARS feature selection method to the linear SVM model, derive an efficient algorithm for it, and empirically demonstrate its usefulness as a feature selection tool for text classification.


        Published In

        ICML '05: Proceedings of the 22nd international conference on Machine learning
        August 2005
        1113 pages
        ISBN:1595931805
        DOI:10.1145/1102351
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate 140 of 548 submissions, 26%

        Article Metrics

        • Downloads (last 12 months): 5
        • Downloads (last 6 weeks): 0
        Reflects downloads up to 21 Sep 2024

        Cited By

        • (2023) A Best Balance Ratio Ordered Feature Selection Methodology for Robust and Fast Statistical Analysis of Memory Designs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:6, 1742-1755. DOI: 10.1109/TCAD.2022.3213762. Online publication date: Jun-2023
        • (2022) A Word-Concept Heterogeneous Graph Convolutional Network for Short Text Classification. Neural Processing Letters, 55:1, 735-750. DOI: 10.1007/s11063-022-10906-6. Online publication date: 22-Jun-2022
        • (2019) Performance and Evaluation of Support Vector Machine and Artificial Neural Network over Heterogeneous Data. Recent Trends in Image Processing and Pattern Recognition, 584-595. DOI: 10.1007/978-981-13-9181-1_51. Online publication date: 20-Jul-2019
        • (2011) Data-Driven Evaluation of Ontologies Using Machine Learning Algorithms. Applied Semantic Web Technologies, 211-273. DOI: 10.1201/b11085-13. Online publication date: 10-Aug-2011
        • (2010) Nearest-neighbour ensembles in lasso feature subspaces. IET Computer Vision, 4:4, 306. DOI: 10.1049/iet-cvi.2009.0056. Online publication date: 2010
        • (2010) Quadratic Programming and Machine Learning — Large Scale Problems and Sparsity. Optimization in Signal and Image Processing, 111-135. DOI: 10.1002/9780470611319.ch5. Online publication date: 27-Jan-2010
        • (2009) Two-Stage Feature Selection Method for Text Classification. Proceedings of the 2009 International Conference on Multimedia Information Networking and Security - Volume 01, 234-238. DOI: 10.1109/MINES.2009.127. Online publication date: 18-Nov-2009
        • (2008) Cascaded search for similar documents between mobile devices. Proceedings of the 12th WSEAS international conference on Computers, 122-127. DOI: 10.5555/1513605.1513630. Online publication date: 23-Jul-2008
        • (2008) Heterogeneous data fusion for alzheimer's disease study. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 1025-1033. DOI: 10.1145/1401890.1402012. Online publication date: 24-Aug-2008
        • (2008) Taxonomic support for document classification in mobile device environment. 2008 Conference on Human System Interactions, 674-679. DOI: 10.1109/HSI.2008.4581521. Online publication date: May-2008
