Abstract
Widening is a method where parallel resources are used to find better solutions from greedy algorithms instead of merely trying to find the same solutions more quickly. To date, every example of Widening has used some form of communication between the parallel workers to maintain their distances from one another in the model space. For the first time, we present a communication-free, widened extension to a standard machine learning algorithm. By using Locality Sensitive Hashing on the Bayesian networks’ Fiedler vectors, we demonstrate the ability to learn classifiers superior to those of standard implementations and to those generated with a greedy heuristic alone.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Isomorphic graphs have similar Fiedler vectors. The converse is not necessarily true.
References
Akbar, Zaenal, Ivanova, Violeta N., Berthold, Michael R.: Parallel data mining revisited. Better, not faster. In: Hollmén, Jaakko, Klawonn, Frank, Tucker, Allan (eds.) IDA 2012. LNCS, vol. 7619, pp. 23–34. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34156-4_4
Akl, S.G.: Parallel real-time computation: sometimes quantity means quality. In: Proceedings of International Symposium on Parallel Architectures, Algorithms and Networks, 2000. I-SPAN 2000, pp. 2–11. IEEE (2000)
Bielza, Concha, Larrañaga, Pedro: Discrete Bayesian network classifiers: a survey. ACM Comput. Surv. (CSUR) 47(1), 5 (2014)
Andrei Z. Broder. On the resemblance and containment of documents. In: Proceedings of Compression and Complexity of Sequences 1997, pp. 21–29. IEEE (1997)
Buhler, Jeremy: Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics 17(5), 419–428 (2001)
Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, pp. 380–388. ACM (2002)
Fan-Roon Kim Chung. Spectral Graph Theory. Number 92 in Regional Conference Series in Mathematics. American Mathematical Society, 1997
Coenen, F.: LUCS-KDD DN software (2003)
Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the Twentieth Annual Symposium on Computational Geometry, pp. 253–262. ACM (2004)
Doyle, P.G., Laurie Snell, J.: Random Walks and Electric Networks. Mathematical Association of America (1984)
Fillbrunn, Alexander, Berthold, Michael R.: Diversity-driven widening of hierarchical agglomerative clustering. In: Fromont, Elisa, De Bie, Tijl, van Leeuwen, Matthijs (eds.) IDA 2015. LNCS, vol. 9385, pp. 84–94. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24465-5_8
Fillbrunn, Alexander, Wörteler, Leonard, Grossniklaus, Michael, Berthold, Michael R.: Bucket selection: a model-independent diverse selection strategy for widening. In: Adams, Niall, Tucker, Allan, Weston, David (eds.) IDA 2017. LNCS, vol. 10584, pp. 87–98. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68765-0_8
Gionis, Aristides, Indyk, Piotr, Motwani, Rajeev: Similarity search in high dimensions via hashing. VLDB 99, 518–529 (1999)
Guo, Krystal, Mohar, Bojan: Hermitian adjacency matrix of digraphs and mixed graphs. J. Graph Theory 85(1), 217–248 (2017)
Koski, T.J.T., Noble, J.M.: A review of Bayesian networks and structure learning. Mathematica Applicanda 40(1), 53–103 (2012)
Kulis, B., Grauman, K.: Kernelized locality-sensitive hashing for scalable image search. In: 12th International Conference on Computer Vision, pp. 2130–7. IEEE (2009)
Larrañaga, Pedro, Karshenas, Hossein, Bielza, Concha, Santana, Roberto: A review on evolutionary algorithms in Bayesian network learning and inference tasks. Inf. Sci. 233, 109–125 (2013)
Lichman, M.: UCI Machine Learning Repository (2013)
Luo, Bin, Wilson, Richard C., Hancock, Edwin R.: Spectral feature vectors for graph clustering. In: Caelli, Terry, Amin, Adnan, Duin, Robert P.W., de Ridder, Dick, Kamel, Mohamed (eds.) SSPR /SPR 2002. LNCS, vol. 2396, pp. 83–93. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-70659-3_8
Marolt, Matija: A mid-level representation for melody-based retrieval in audio collections. IEEE Trans. Multimed. 10(8), 1617–1625 (2008)
Meinl, T.: Maximum-Score Diversity Selection. Ph.D. thesis, University of Konstanz, Konstanz, Germany (2010)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., Burlington (1988)
Qiu, Huaijun, Hancock, Edwin R.: Graph matching and clustering using spectral partitions. Pattern Recognit. 39(1), 22–34 (2006)
Sampson, Oliver, Berthold, Michael R.: Widened KRIMP: better performance through diverse parallelism. In: Blockeel, Hendrik, van Leeuwen, Matthijs, Vinciotti, Veronica (eds.) IDA 2014. LNCS, vol. 8819, pp. 276–285. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12571-8_24
Sampson, Oliver R., Berthold, Michael R.: Widened learning of Bayesian network classifiers. In: Boström, Henrik, Knobbe, Arno, Soares, Carlos, Papapetrou, Panagiotis (eds.) IDA 2016. LNCS, vol. 9897, pp. 215–225. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46349-0_19
Satu Elisa Schaeffer: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)
Scutari, Marco: Learning Bayesian networks with the bnlearn R package. J. Stat. Softw. 35(3), 1–22 (2010)
Terasawa, Kengo, Tanaka, Yuzuru: Spherical LSH for approximate nearest neighbor search on unit hypersphere. In: Dehne, Frank, Sack, Jörg-Rüdiger, Zeh, Norbert (eds.) WADS 2007. LNCS, vol. 4619, pp. 27–38. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73951-7_4
Van Dam, E.R., Haemers, W.H.: Which graphs are determined by their spectrum? Linear Algebra Appl. 373, 241–272 (2003)
Vishveshwara, S., Brinda, K.V., Kannan, N.: Protein structure: insights from graph theory. J. Theor. Comput. Chem. 1(01), 187–211 (2002)
Zhang, Boyu, Liu, Xianglong, Lang, Bo: Fast graph similarity search via locality sensitive hashing. In: Ho, Yo-Sung, Sang, Jitao, Ro, Yong Man, Kim, Junmo, Wu, Fei (eds.) PCM 2015. LNCS, vol. 9314, pp. 623–633. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24075-6_60
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Sampson, O.R., Borgelt, C., Berthold, M.R. (2018). Communication-Free Widened Learning of Bayesian Network Classifiers Using Hashed Fiedler Vectors. In: Duivesteijn, W., Siebes, A., Ukkonen, A. (eds) Advances in Intelligent Data Analysis XVII. IDA 2018. Lecture Notes in Computer Science(), vol 11191. Springer, Cham. https://doi.org/10.1007/978-3-030-01768-2_22
Download citation
DOI: https://doi.org/10.1007/978-3-030-01768-2_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01767-5
Online ISBN: 978-3-030-01768-2
eBook Packages: Computer ScienceComputer Science (R0)