Abstract
Deep learning has recently received renewed attention in the field of artificial intelligence. A deep belief network (DBN) has a deep network architecture that can represent multiple features of input patterns hierarchically, using pre-trained restricted Boltzmann machines (RBMs). Such deep architectures achieve markedly higher classification accuracy on many tasks than earlier methods. However, determining the many parameters needed to design an effective deep network architecture is difficult even for experienced designers, since traditional RBMs and DBNs cannot change their network structure during training. An adaptive structure learning method was previously proposed for finding the optimal number of hidden neurons in multilayered neural networks; it employs a neuron generation–annihilation algorithm that observes the variance of weight decays. We extend this adaptive structure learning method to RBMs and DBNs, combining neuron generation–annihilation with a layer generation algorithm, both driven by the observed variance of certain model parameters. The effectiveness of the proposed model was verified by tenfold cross-validation on the benchmark data sets CIFAR-10 and CIFAR-100. The adaptive DBN achieved the highest classification accuracy (97.4% on CIFAR-10 and 81.2% on CIFAR-100) among several recent DBN- and CNN-based methods.
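The neuron generation–annihilation idea sketched in the abstract can be illustrated as follows: during training, track statistics for each hidden unit; a unit whose weight updates keep fluctuating with high variance is not settling and is split into two units, while a unit whose activations stay near zero contributes little and is removed. The sketch below is a minimal illustration of that mechanism, not the paper's actual algorithm; the function name, thresholds, and decision criteria are all hypothetical assumptions.

```python
import numpy as np


def adapt_hidden_layer(W, dW_history, activations,
                       gen_thresh=0.05, ann_thresh=0.01, rng=None):
    """Illustrative neuron generation/annihilation for one hidden layer.

    W           : (n_visible, n_hidden) weight matrix
    dW_history  : (T, n_visible, n_hidden) recent weight updates
    activations : (n_samples, n_hidden) hidden-unit activations
    Returns a possibly resized weight matrix.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n_hidden = W.shape[1]

    # Per-unit variance of recent weight updates, averaged over inputs.
    update_var = dW_history.var(axis=0).mean(axis=0)  # shape (n_hidden,)

    # Generation: duplicate (with small noise) units whose updates
    # keep fluctuating, so the new pair can share the representation.
    for j in np.where(update_var > gen_thresh)[0]:
        new_col = W[:, j:j + 1] + 0.01 * rng.standard_normal((W.shape[0], 1))
        W = np.hstack([W, new_col])

    # Annihilation: drop original units that are almost never active.
    act_mean = activations.mean(axis=0)               # shape (n_hidden,)
    keep = np.ones(W.shape[1], dtype=bool)            # newly added units kept
    keep[:n_hidden] = act_mean > ann_thresh
    return W[:, keep]
```

In the paper's setting this check would run periodically during contrastive-divergence training of each RBM, with layer generation applying an analogous variance criterion to decide when to stack a new RBM on top.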
Funding
This study was funded by JAPAN MIC SCOPE (Grant Number 162308002), Artificial Intelligence Research Promotion Foundation, and JSPS KAKENHI (Grant Number JP17J11178).
Ethics declarations
Conflict of interest
Author Takumi Ichimura has received research grants from JAPAN MIC SCOPE and Artificial Intelligence Research Promotion Foundation. Author Shin Kamada has received a research grant from JSPS KAKENHI.
Cite this article
Kamada, S., Ichimura, T., Hara, A. et al. Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation. Neural Comput & Applic 31, 8035–8049 (2019). https://doi.org/10.1007/s00521-018-3622-y