
Learning high-dependence Bayesian network classifier with robust topology

Published: 17 April 2024

Abstract

The increase in data variability and quantity makes learning complex multivariate probability distributions increasingly urgent. The state-of-the-art Tree Augmented Naive Bayes (TAN) classifier uses a maximum weighted spanning tree (MWST) to graphically model data with excellent time and space complexity. In this paper, we theoretically prove the feasibility of scaling up the one-dependence MWST to model high-dependence relationships, and then propose a heuristic search strategy to improve the fitness of the extended topology to the data. Because the edges newly added to each attribute node may yield only a locally optimal solution, ensemble learning is introduced to improve generalization performance and reduce sensitivity to variation in the training data. Experimental results on 32 benchmark datasets reveal that this highly scalable algorithm inherits the expressive power of TAN, achieves an excellent bias–variance tradeoff, and demonstrates competitive classification performance compared to a range of high-dependence and ensemble learning algorithms.
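For readers unfamiliar with the underlying construction, the Chow–Liu-style MWST step that TAN builds on can be sketched as follows. This is a generic illustration only, not the paper's extended algorithm: attributes are scored pairwise by conditional mutual information given the class, and a maximum weighted spanning tree is grown over those weights (the function names and toy dataset here are hypothetical).

```python
import itertools
from collections import Counter
from math import log

def cond_mutual_info(data, i, j, c_idx):
    """Estimate I(Xi; Xj | C) from co-occurrence counts in the data."""
    n = len(data)
    cnt_ijc = Counter((r[i], r[j], r[c_idx]) for r in data)
    cnt_ic = Counter((r[i], r[c_idx]) for r in data)
    cnt_jc = Counter((r[j], r[c_idx]) for r in data)
    cnt_c = Counter(r[c_idx] for r in data)
    cmi = 0.0
    for (xi, xj, c), n_ijc in cnt_ijc.items():
        # p(xi, xj, c) * log( p(xi, xj | c) / (p(xi | c) * p(xj | c)) )
        cmi += (n_ijc / n) * log((n_ijc * cnt_c[c]) /
                                 (cnt_ic[(xi, c)] * cnt_jc[(xj, c)]))
    return cmi

def tan_structure(data, n_attrs, c_idx):
    """Grow a maximum weighted spanning tree (Prim's algorithm) over
    the attributes, weighted by conditional mutual information."""
    weights = {(i, j): cond_mutual_info(data, i, j, c_idx)
               for i, j in itertools.combinations(range(n_attrs), 2)}
    in_tree = {0}                  # attribute 0 is the (arbitrary) root
    edges = []
    while len(in_tree) < n_attrs:
        best = max(((u, v) for u in in_tree
                    for v in range(n_attrs) if v not in in_tree),
                   key=lambda e: weights[tuple(sorted(e))])
        edges.append(best)         # edge directed away from the root
        in_tree.add(best[1])
    return edges

# Toy dataset: attribute 1 copies attribute 0, attribute 2 is unrelated.
data = [
    (0, 0, 0, 'a'), (0, 0, 1, 'a'), (1, 1, 0, 'a'), (1, 1, 1, 'a'),
    (0, 0, 0, 'b'), (0, 0, 1, 'b'), (1, 1, 0, 'b'), (1, 1, 1, 'b'),
]
edges = tan_structure(data, n_attrs=3, c_idx=3)
# The strongly dependent pair (0, 1) is joined first.
```

In the full TAN classifier each attribute then receives the class plus its tree parent as parents; the paper's extension adds further edges beyond this one-dependence tree.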

Highlights

We extend TAN to identify high-dependence relationships.
Causal semantics are described in the form of a DAG.
Ensemble learning helps improve the topology robustness.
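The ensemble step in the third highlight amounts to model averaging over the member topologies' class predictions. A minimal sketch with uniform weights and made-up posteriors (the paper's actual combination scheme may differ):

```python
def ensemble_posterior(member_posteriors):
    """Average class posteriors from ensemble members with uniform weights."""
    n = len(member_posteriors)
    classes = member_posteriors[0].keys()
    return {c: sum(p[c] for p in member_posteriors) / n for c in classes}

# Hypothetical posteriors produced by three candidate topologies.
members = [
    {"yes": 0.7, "no": 0.3},
    {"yes": 0.4, "no": 0.6},
    {"yes": 0.8, "no": 0.2},
]
avg = ensemble_posterior(members)
pred = max(avg, key=avg.get)   # predict the class with highest averaged posterior
```

Averaging over several trees grown from perturbed data reduces the variance contributed by any single locally optimal topology, which is the bias–variance rationale stated in the abstract.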



Published In

Expert Systems with Applications: An International Journal, Volume 239, Issue C
Apr 2024
1593 pages

Publisher

Pergamon Press, Inc.

United States


Author Tags

  1. Tree Augmented Naive Bayes
  2. Maximum weighted spanning tree
  3. Ensemble learning
  4. Bias–variance tradeoff

Qualifiers

  • Research-article

