Abstract
Probabilistic Classifier Chains are a multi-label classification method that has gained the attention of researchers in recent years, owing to its ability to estimate the joint conditional probability of an entire label combination through the product rule of probability. Their main drawback is that obtaining Bayes optimal predictions requires an exhaustive search, that is, computing this probability for every possible label combination and then selecting the combination with the highest probability. For this reason, several works have been published in recent years that avoid exploring all combinations. Approaches such as greedy search, beam search and Monte Carlo search reduce the computational cost, but at the price of no longer ensuring Bayes optimal predictions (although, in general, they provide close-to-optimal solutions). Methods based on heuristic search do provide optimal predictions, but their computational time has not been as good as expected. In this respect, the \(\epsilon \)-approximate algorithm has been found to be the best inference approach among those that provide Bayes optimal predictions, thanks to its computational time. However, this paper shows, both theoretically and experimentally, that it sometimes backtracks during the search for optimal predictions, which may prolong the prediction time. The aim of this paper is thus to improve this algorithm by achieving a more direct search. Specifically, the criterion under which the next node to expand is chosen is enhanced with heuristic information, although the enhancement is only applicable to linear-based models. The experiments carried out confirm that the improved \(\epsilon \)-approximate algorithm explores fewer nodes and reduces the computational time of the original version.
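To make the inference problem concrete, the following Python sketch (ours, not the paper's) contrasts exhaustive inference, which is Bayes optimal for subset 0/1 loss, with greedy search. It assumes each link of the chain is available as a callable f(x, prefix) returning \(P(y_j = 1 \mid x, y_1, \ldots, y_{j-1})\); the names (conditionals, joint_probability, and so on) are illustrative.

from itertools import product

def joint_probability(conditionals, x, y):
    # Product rule: P(y | x) = prod_j P(y_j | x, y_1, ..., y_{j-1}).
    p = 1.0
    for j, f in enumerate(conditionals):
        p_one = f(x, y[:j])  # estimated P(y_j = 1 | x, prefix)
        p *= p_one if y[j] == 1 else 1.0 - p_one
    return p

def exhaustive_inference(conditionals, x):
    # Bayes optimal for subset 0/1 loss: argmax over all 2^m label combinations.
    m = len(conditionals)
    return max(product((0, 1), repeat=m),
               key=lambda y: joint_probability(conditionals, x, y))

def greedy_inference(conditionals, x):
    # Follows a single path down the probability tree: m evaluations instead of
    # 2^m, but the result is not guaranteed to be the most probable combination.
    y = []
    for f in conditionals:
        y.append(1 if f(x, tuple(y)) >= 0.5 else 0)
    return tuple(y)

Inference methods such as the \(\epsilon \)-approximate algorithm sit between these two extremes: they prune the search over the \(2^m\) combinations while still returning the same prediction as the exhaustive search.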
Notes
A heuristic is admissible if it never underestimates the gain of reaching the goal, i.e., the gain it estimates for reaching the goal is never lower than the highest gain actually attainable. Under this property, the A* algorithm is guaranteed to return an optimal solution.
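In symbols (the notation is ours, not the paper's): writing \(h(v)\) for the gain the heuristic estimates from node \(v\) to a goal and \(g^*(v)\) for the highest gain actually attainable from \(v\), admissibility in this maximization setting reads \[ h(v) \ge g^*(v) \quad \text{for every node } v, \] the mirror image of the usual cost-minimization condition \(h(v) \le c^*(v)\): the heuristic must be optimistic.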
A heuristic dominates another heuristic if its estimate of the gain of reaching the goal is closer to the actual value (while remaining admissible). A consequence of this property is that A* guided by the dominant heuristic explores no more nodes than A* guided by the other.
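With the same (assumed) notation as in the previous note, \(h_1\) dominates \(h_2\) when \[ g^*(v) \le h_1(v) \le h_2(v) \quad \text{for every node } v, \] in which case A* guided by \(h_1\) expands no more nodes than A* guided by \(h_2\).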
Additional information
This research has been funded by MINECO (the Spanish Ministerio de Economía y Competitividad) and ERDF (European Regional Development Fund) under Grant TIN2015-65069-C2-2-R (MINECO/FEDER).
Cite this article
Fdez-Díaz, M., Fdez-Díaz, L., Mena, D. et al. Improving the \(\epsilon \)-approximate algorithm for Probabilistic Classifier Chains. Knowl Inf Syst 62, 2709–2738 (2020). https://doi.org/10.1007/s10115-020-01436-5