DOI: 10.1145/3067695.3082053
research-article

Towards a method for automatically selecting and configuring multi-label classification algorithms

Published: 15 July 2017

Abstract

Given a new dataset for classification in Machine Learning (ML), finding the best classification algorithm and the best configuration of its (hyper-)parameters for that particular dataset is an open issue. The Automatic ML (Auto-ML) area has emerged to address this task. With this issue in mind, this work focuses on a specific type of classification problem, called multi-label classification (MLC). In MLC, each example in the dataset can be associated with one or more class labels, making the task considerably harder than traditional, single-label classification. In addition, the cost of learning rises due to the higher complexity of the data. Although the literature has proposed some methods to solve the Auto-ML task, those methods address only the traditional, single-label classification problem. By contrast, this work proposes the first method (an evolutionary algorithm) for solving the Auto-ML task in MLC, i.e., the first method for automatically selecting and configuring the best MLC algorithm for a given input dataset. The proposed evolutionary algorithm is evaluated on three MLC datasets and compared against two baseline methods according to four different multi-label predictive accuracy measures. The results show that the proposed evolutionary algorithm is competitive with the baselines, but there is still room for improvement.
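
To make the idea of evolutionary algorithm selection and configuration for MLC more concrete, the following is a minimal sketch and not the authors' method. Everything in it is an assumption made for illustration: the toy scores and label sets, the two hypothetical candidate "algorithms" (`threshold` and `top_k`), the mutation scheme, and the use of Hamming loss as the fitness measure (the abstract does not name the four measures used in the paper).

```python
import random

# Toy data (made up): per-label scores, as if produced by some model,
# and the true label set of each example.
SCORES = [
    [0.9, 0.2, 0.7],
    [0.1, 0.8, 0.3],
    [0.6, 0.6, 0.1],
    [0.2, 0.1, 0.9],
]
TRUE = [{0, 2}, {1}, {0, 1}, {2}]
N_LABELS = 3


def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of (example, label) pairs that are predicted incorrectly."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def predict(config, scores):
    """Turn a score vector into a label set according to a candidate config."""
    kind, value = config
    if kind == "threshold":
        return {j for j, s in enumerate(scores) if s >= value}
    # "top_k": keep the k highest-scoring labels
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:value])


def fitness(config):
    """Higher is better: 1 minus the Hamming loss on the toy data."""
    preds = [predict(config, s) for s in SCORES]
    return 1.0 - hamming_loss(TRUE, preds, N_LABELS)


def random_config(rng):
    """Sample an (algorithm, hyper-parameter) pair from the toy search space."""
    if rng.random() < 0.5:
        return ("threshold", rng.random())
    return ("top_k", rng.randint(1, N_LABELS))


def mutate(config, rng):
    """Perturb the hyper-parameter, or occasionally switch algorithm."""
    kind, value = config
    if rng.random() < 0.2:
        return random_config(rng)
    if kind == "threshold":
        return (kind, min(1.0, max(0.0, value + rng.gauss(0, 0.1))))
    return (kind, min(N_LABELS, max(1, value + rng.choice([-1, 1]))))


def evolve(pop_size=10, generations=20, seed=0):
    """Simple evolutionary loop with elitism and binary tournament selection."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(population, key=fitness)]      # keep the best as-is
        while len(children) < pop_size:
            a, b = rng.sample(population, 2)           # binary tournament
            children.append(mutate(max(a, b, key=fitness), rng))
        population = children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

In a realistic setting the fitness function would train and validate an actual MLC algorithm on held-out data, which is where most of the computational cost mentioned in the abstract would arise.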

    Published In

    GECCO '17: Proceedings of the Genetic and Evolutionary Computation Conference Companion
    July 2017
    1934 pages
    ISBN: 9781450349390
    DOI: 10.1145/3067695

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. automatic machine learning
    2. evolutionary algorithms
    3. multi-label

    Qualifiers

    • Research-article

    Funding Sources

    • FAPEMIG
    • CNPq
    • CAPES

    Conference

    GECCO '17
    Acceptance Rates

    Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
