DOI: 10.1145/1273496.1273600
Article

Supervised feature selection via dependence estimation

Published: 20 June 2007

Abstract

We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that good features should maximise such dependence. Feature selection for various supervised learning problems (including classification and regression) is unified under this framework, and the solutions can be approximated using a backward-elimination algorithm. We demonstrate the usefulness of our method on both artificial and real world datasets.
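The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it makes several assumptions that the abstract does not state: the biased empirical HSIC estimate tr(KHLH)/(m-1)^2 with centering matrix H, a Gaussian (RBF) kernel on the features with a hypothetical bandwidth `sigma`, a linear kernel on the labels, and removal of one feature per iteration. The function names `rbf_kernel`, `hsic_biased`, and `backward_elimination` are illustrative only.

```python
import numpy as np


def rbf_kernel(X, sigma=1.0):
    """Gaussian kernel matrix for the rows of X (sigma is an assumed bandwidth)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def hsic_biased(K, L):
    """Biased empirical HSIC: tr(K H L H) / (m - 1)^2 with H = I - (1/m) 11^T."""
    m = K.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    return np.trace(K @ H @ L @ H) / (m - 1) ** 2


def backward_elimination(X, y, n_keep, sigma=1.0):
    """Greedily drop features until n_keep remain.

    At each step, remove the feature whose removal leaves the highest
    HSIC between the remaining features and the labels.
    """
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    L = y @ y.T  # linear kernel on the labels (an assumption)
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        scores = []
        for j in remaining:
            subset = [f for f in remaining if f != j]
            scores.append(hsic_biased(rbf_kernel(X[:, subset], sigma), L))
        remaining.pop(int(np.argmax(scores)))
    return remaining


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))
    y = np.sign(X[:, 0])  # only feature 0 carries label information
    print(backward_elimination(X, y, n_keep=1))
```

Each elimination step recomputes one kernel matrix per candidate feature, so the sketch is only practical for small problems; removing several features per iteration would reduce the cost.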


    Information

    Published In

    ICML '07: Proceedings of the 24th international conference on Machine learning
    June 2007
    1233 pages
    ISBN: 9781595937933
    DOI: 10.1145/1273496

    Sponsors

    • Machine Learning Journal

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    ICML '07 & ILP '07

    Acceptance Rates

    Overall Acceptance Rate 140 of 548 submissions, 26%

    Article Metrics

    • Downloads (Last 12 months): 54
    • Downloads (Last 6 weeks): 11
    Reflects downloads up to 24 Nov 2024

    Cited By

    • (2024) A preprocessing Shapley value-based approach to detect relevant and disparity prone features in machine learning. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 279-289. DOI: 10.1145/3630106.3658905. Online publication date: 3-Jun-2024
    • (2024) MIRFuse: an infrared and visible image fusion model based on disentanglement representation via mutual information regularization. Journal of Electronic Imaging, 33:02. DOI: 10.1117/1.JEI.33.2.023005. Online publication date: 1-Mar-2024
    • (2024) Generalizing Graph Neural Networks on Out-of-Distribution Graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46:1, 322-337. DOI: 10.1109/TPAMI.2023.3321097. Online publication date: Jan-2024
    • (2024) Exploring Large-scale Financial Knowledge Graph for SMEs Supply Chain Mining. IEEE Transactions on Knowledge and Data Engineering, 1-12. DOI: 10.1109/TKDE.2023.3317631. Online publication date: 2024
    • (2024) Multiple Collaboration Preserving Projection for Monitoring of Complex Industrial Process. IEEE Transactions on Instrumentation and Measurement, 73, 1-9. DOI: 10.1109/TIM.2023.3330211. Online publication date: 2024
    • (2024) A Multimodal Sentiment Analysis Method Based on Fuzzy Attention Fusion. IEEE Transactions on Fuzzy Systems, 32:10, 5886-5898. DOI: 10.1109/TFUZZ.2024.3434614. Online publication date: 1-Oct-2024
    • (2024) Inductive Link Prediction via Interactive Learning Across Relations in Multiplex Networks. IEEE Transactions on Computational Social Systems, 11:3, 3118-3130. DOI: 10.1109/TCSS.2022.3176928. Online publication date: Jun-2024
    • (2024) Wireless Image Semantic Cooperative Transmission in Distributed Edge Networks: An Information Disentanglement Method. 2024 IEEE 25th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), 461-465. DOI: 10.1109/SPAWC60668.2024.10694656. Online publication date: 10-Sep-2024
    • (2024) A Fine-Grained Tri-Modal Interaction Model for Multimodal Sentiment Analysis. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 5715-5719. DOI: 10.1109/ICASSP48485.2024.10447872. Online publication date: 14-Apr-2024
    • (2024) Kernel-based Sensitivity Analysis for (Excursion) Sets. Technometrics, 66:4, 575-587. DOI: 10.1080/00401706.2024.2336537. Online publication date: 13-May-2024
