
Quadratic Programming Feature Selection

Published: 01 August 2010

Abstract

Identifying a subset of features that preserves classification accuracy is a problem of growing importance, because of the increasing size and dimensionality of real-world data sets. We propose a new feature selection method, named Quadratic Programming Feature Selection (QPFS), that reduces the task to a quadratic optimization problem. In order to limit the computational complexity of solving the optimization problem, QPFS uses the Nyström method for approximate matrix diagonalization. QPFS is thus capable of dealing with very large data sets, for which the use of other methods is computationally expensive. In experiments with small and medium data sets, the QPFS method leads to classification accuracy similar to that of other successful techniques. For large data sets, QPFS is superior in terms of computational efficiency.
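The quadratic formulation described above can be sketched in a few lines. The following is a minimal illustration only, not the paper's implementation: it assumes absolute Pearson correlation as the similarity measure for both redundancy and relevance, uses scipy's general-purpose SLSQP solver in place of a dedicated QP solver, and omits the Nyström approximation step entirely.

```python
import numpy as np
from scipy.optimize import minimize

def qpfs_sketch(X, y, alpha=0.5):
    """Sketch of a QPFS-style objective (illustrative assumptions only).

    Minimizes (1 - alpha) * x^T Q x - alpha * f^T x over the probability
    simplex, where Q holds pairwise feature redundancy and f holds
    feature-target relevance. Both are measured here with absolute
    Pearson correlation, one possible choice of similarity measure.
    Returns a weight per feature; larger weights suggest keeping it.
    """
    n_features = X.shape[1]
    # Pairwise redundancy: |correlation| between features.
    Q = np.abs(np.corrcoef(X, rowvar=False))
    # Relevance: |correlation| between each feature and the target.
    f = np.array([np.abs(np.corrcoef(X[:, j], y)[0, 1])
                  for j in range(n_features)])

    def objective(x):
        return (1 - alpha) * x @ Q @ x - alpha * f @ x

    # Weights are constrained to the simplex: x >= 0, sum(x) = 1.
    x0 = np.full(n_features, 1.0 / n_features)
    res = minimize(objective, x0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * n_features,
                   constraints=[{"type": "eq",
                                 "fun": lambda x: x.sum() - 1.0}])
    return res.x

# Toy usage: feature 0 tracks the target, the others are noise,
# so feature 0 should receive the largest weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + 0.1 * rng.normal(size=200)
weights = qpfs_sketch(X, y)
ranking = np.argsort(weights)[::-1]
```

The parameter `alpha` trades off redundancy against relevance; for the very large problems the paper targets, the dense matrix `Q` would be diagonalized approximately via the Nyström method rather than formed and solved exactly as done here.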




Published In

The Journal of Machine Learning Research, Volume 11 (3/1/2010), 3637 pages
ISSN: 1532-4435, EISSN: 1533-7928

Publisher

JMLR.org

Publication History

Published: 01 August 2010
Published in JMLR Volume 11

Cited By

  • (2024) Dimension Importance Estimation for Dense Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1318-1328. DOI: 10.1145/3626772.3657691
  • (2023) Service-level agreement aware energy-efficient load balancing in cloud using hybrid optimization model. Service Oriented Computing and Applications, 17(2):77-91. DOI: 10.1007/s11761-023-00359-7
  • (2022) An Efficient Hybrid Feature Selection Method Using the Artificial Immune Algorithm for High-Dimensional Data. Computational Intelligence and Neuroscience, 2022. DOI: 10.1155/2022/1452301
  • (2022) Classification of COVID-19 from chest x-ray images using deep features and correlation coefficient. Multimedia Tools and Applications, 81(19):27631-27655. DOI: 10.1007/s11042-022-12500-3
  • (2022) Nowcasting with Numerous Candidate Predictors. Machine Learning and Knowledge Discovery in Databases, pp. 370-385. DOI: 10.1007/978-3-662-44848-9_24
  • (2021) A dynamic VM consolidation approach based on load balancing using Pearson correlation in cloud computing. The Journal of Supercomputing, 77(6):5840-5881. DOI: 10.1007/s11227-020-03494-6
  • (2020) Statistical Analysis of the Performance of Rank Fusion Methods Applied to a Homogeneous Ensemble Feature Ranking. Scientific Programming, 2020. DOI: 10.1155/2020/8860044
  • (2020) Analysis of the European stock market's advance response time to COVID-19 based on Pearson correlation Coefficient. Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence, pp. 1-7. DOI: 10.1145/3446132.3446149
  • (2020) A Feature Selection Algorithm Based on Equal Interval Division and Minimal-Redundancy–Maximal-Relevance. Neural Processing Letters, 51(2):1237-1263. DOI: 10.1007/s11063-019-10144-3
  • (2018) Small-Scale Building Load Forecast based on Hybrid Forecast Engine. Neural Processing Letters, 48(1):329-351. DOI: 10.1007/s11063-017-9723-2
