Abstract
Deep learning has recently received renewed attention in the field of artificial intelligence. A deep belief network (DBN) has a deep network architecture that can represent multiple features of input patterns hierarchically, using pre-trained restricted Boltzmann machines (RBMs). Such deep architectures achieve markedly higher classification accuracy on many tasks than earlier methods. However, determining the many parameters needed to design an effective deep network architecture is difficult even for experienced designers, since traditional RBMs and DBNs cannot change their network structure during training. An adaptive structure learning method was previously proposed for finding the optimal number of hidden neurons in multilayered neural networks; it employs a neuron generation–annihilation algorithm that observes the variance of weight decays. We extend this adaptive structure learning method to RBMs and DBNs, combining neuron generation–annihilation with a layer generation algorithm, both driven by the observed variance of certain model parameters. The effectiveness of the proposed model was verified by tenfold cross-validation on the benchmark data sets CIFAR-10 and CIFAR-100. The adaptive DBN achieved the highest classification accuracy (97.4% on CIFAR-10 and 81.2% on CIFAR-100) among several recent DBN- and CNN-based methods.
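The neuron generation–annihilation idea sketched in the abstract can be illustrated as follows: during training, track statistics for each hidden unit; a unit whose weight updates keep fluctuating with high variance is not settling and is split into two units, while a unit whose activations stay near zero contributes little and is removed. The sketch below is a minimal illustration of that mechanism, not the paper's actual algorithm; the function name, thresholds, and decision criteria are all hypothetical assumptions.

```python
import numpy as np


def adapt_hidden_layer(W, dW_history, activations,
                       gen_thresh=0.05, ann_thresh=0.01, rng=None):
    """Illustrative neuron generation/annihilation for one hidden layer.

    W           : (n_visible, n_hidden) weight matrix
    dW_history  : (T, n_visible, n_hidden) recent weight updates
    activations : (n_samples, n_hidden) hidden-unit activations
    Returns a possibly resized weight matrix.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n_hidden = W.shape[1]

    # Per-unit variance of recent weight updates, averaged over inputs.
    update_var = dW_history.var(axis=0).mean(axis=0)  # shape (n_hidden,)

    # Generation: duplicate (with small noise) units whose updates
    # keep fluctuating, so the new pair can share the representation.
    for j in np.where(update_var > gen_thresh)[0]:
        new_col = W[:, j:j + 1] + 0.01 * rng.standard_normal((W.shape[0], 1))
        W = np.hstack([W, new_col])

    # Annihilation: drop original units that are almost never active.
    act_mean = activations.mean(axis=0)               # shape (n_hidden,)
    keep = np.ones(W.shape[1], dtype=bool)            # newly added units kept
    keep[:n_hidden] = act_mean > ann_thresh
    return W[:, keep]
```

In the paper's setting this check would run periodically during contrastive-divergence training of each RBM, with layer generation applying an analogous variance criterion to decide when to stack a new RBM on top.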
Funding
This study was funded by JAPAN MIC SCOPE (Grant Number 162308002), Artificial Intelligence Research Promotion Foundation, and JSPS KAKENHI (Grant Number JP17J11178).
Ethics declarations
Conflict of interest
Author Takumi Ichimura has received research grants from JAPAN MIC SCOPE and Artificial Intelligence Research Promotion Foundation. Author Shin Kamada has received a research grant from JSPS KAKENHI.
Cite this article
Kamada, S., Ichimura, T., Hara, A. et al. Adaptive structure learning method of deep belief network using neuron generation–annihilation and layer generation. Neural Comput & Applic 31, 8035–8049 (2019). https://doi.org/10.1007/s00521-018-3622-y