DOI: 10.1145/3583780.3614996
research-article

On the Thresholding Strategy for Infrequent Labels in Multi-label Classification

Published: 21 October 2023

Abstract

In multi-label classification, the imbalance between labels is often a concern. For a label that seldom occurs, the default threshold used to generate binarized predictions of that label is usually sub-optimal. However, directly tuning the threshold to optimize the F-measure has been observed to overfit easily. In this work, we explain why this overfitting occurs. We then analyze the FBR heuristic, a previous technique proposed to address the overfitting issue. We explain its success but also point out some previously unobserved problems. We then make two proposals. First, we propose a variant of the FBR heuristic that not only fixes these problems but is also more justifiable. Second, we propose a new technique based on smoothing the F-measure when tuning the threshold. We theoretically prove that, with proper parameters, smoothing results in desirable properties of the tuned threshold. Based on the idea of smoothing, we further propose jointly optimizing micro-F and macro-F as a lightweight alternative free from extra hyperparameters. Our methods are empirically evaluated on text and node classification datasets. The results show that our methods consistently outperform the FBR heuristic.
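To make the setting in the abstract concrete, the following is a minimal sketch of per-label threshold tuning on validation data. It is not the paper's implementation. The function names, the toy data, and in particular the additive smoothing constants `a` and `b` are illustrative assumptions; the abstract does not state the exact form of the smoothed F-measure. With `a = b = 0` the code computes the ordinary F1.

```python
import numpy as np

def f_measure(tp, fp, fn, a=0.0, b=0.0):
    """F1 from counts: 2TP / (2TP + FP + FN).

    a and b are HYPOTHETICAL additive smoothing constants (an assumed
    form, not taken from the paper); a = b = 0 gives the plain F1.
    """
    denom = 2 * tp + fp + fn + b
    return (2 * tp + a) / denom if denom > 0 else 0.0

def tune_threshold(scores, labels, a=0.0, b=0.0):
    """Pick the threshold that maximizes the (smoothed) F-measure.

    Candidates are midpoints between adjacent distinct scores, plus one
    value below all scores (predict everything positive) and one above
    all scores (predict nothing positive).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    uniq = np.unique(scores)  # sorted distinct decision values
    candidates = np.concatenate(
        ([uniq[0] - 1.0], (uniq[:-1] + uniq[1:]) / 2, [uniq[-1] + 1.0])
    )
    best_t, best_f = None, -1.0
    for t in candidates:
        pred = scores > t
        tp = int(np.sum(pred & (labels == 1)))
        fp = int(np.sum(pred & (labels == 0)))
        fn = int(np.sum(~pred & (labels == 1)))
        f = f_measure(tp, fp, fn, a, b)
        if f > best_f:  # strict comparison: ties keep the lower threshold
            best_t, best_f = t, f
    return best_t, best_f

# Toy validation set for one rare label: a single positive instance
# whose decision value (-0.5) falls below the default threshold of 0.
val_scores = [-2.0, -1.5, -1.0, -0.5, 0.2]
val_labels = [0, 0, 0, 1, 0]
threshold, f1 = tune_threshold(val_scores, val_labels)
```

On this toy data the default threshold of 0 yields an F1 of 0, while the tuned threshold is -0.75 with an F1 of 2/3. With so few positive instances, the selected threshold hinges on one or two validation points, which gives some intuition for why direct tuning can overfit for infrequent labels.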

Cited By

  • (2023) Generalized test utilities for long-tail performance in extreme multi-label classification. Proceedings of the 37th International Conference on Neural Information Processing Systems, 22269-22303. DOI: 10.5555/3666122.3667100. Online publication date: 10-Dec-2023

    Published In

    CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
    October 2023
    5508 pages
    ISBN:9798400701245
    DOI:10.1145/3583780
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. F-measure
    2. infrequent labels
    3. multi-label classification
    4. threshold adjustion

    Qualifiers

    • Research-article

    Conference

    CIKM '23
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Article Metrics

    • Downloads (Last 12 months): 93
    • Downloads (Last 6 weeks): 11
    Reflects downloads up to 22 Nov 2024
