
DOI: 10.1145/3357384.3357921

Regularizing Deep Neural Networks by Ensemble-based Low-Level Sample-Variances Method

Published: 03 November 2019

Abstract

Deep Neural Networks (DNNs) with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. To date, many regularizers, such as dropout and data augmentation, have been proposed to prevent overfitting. Motivated by ensemble learning, we treat each hidden layer in a neural network as an ensemble of base learners by dividing its hidden units into non-overlapping groups, each of which is considered a base learner. Based on a theoretical analysis of the generalization error of ensemble estimators (the bias-variance-covariance decomposition), we find that the variance of each base learner plays an important role in preventing overfitting, and we propose a novel regularizer, the Ensemble-based Low-Level Sample-Variances Method (ELSM), to encourage each base learner of a hidden layer to have a low-level sample-variance. Experiments across a number of datasets and network architectures show that ELSM can effectively reduce overfitting and improve the generalization ability of DNNs.
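The abstract describes the method only at a high level. The sketch below is a rough illustration of a penalty of this kind, not the paper's exact formulation: the units of one hidden layer are split into non-overlapping groups, each group is read as a base learner, and the sum of the base learners' sample variances over a mini-batch is added to the training loss. The function name elsm_penalty, the use of the group mean as each base learner's output, and the NumPy setting are assumptions made for illustration.

```python
import numpy as np

def elsm_penalty(hidden_activations, num_groups):
    """Hypothetical sketch of an ELSM-style penalty.

    hidden_activations: array of shape (batch_size, num_units),
        the outputs of one hidden layer on a mini-batch.
    num_groups: number of non-overlapping groups (base learners)
        the units are divided into.
    Returns the summed sample-variance of the base learners.
    """
    batch_size, num_units = hidden_activations.shape
    assert num_units % num_groups == 0, "units must split evenly into groups"
    group_size = num_units // num_groups

    penalty = 0.0
    for g in range(num_groups):
        # Units belonging to the g-th base learner.
        group = hidden_activations[:, g * group_size:(g + 1) * group_size]
        # One plausible choice: the base learner's output is the mean
        # activation of its group (an assumption, not the paper's exact form).
        learner_output = group.mean(axis=1)  # shape (batch_size,)
        # Sample variance of that output across the mini-batch.
        penalty += learner_output.var()
    return penalty

# Usage: add lambda_reg * elsm_penalty(h, num_groups) to the task loss,
# where lambda_reg is a regularization coefficient tuned on validation data.
```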

Cited By

  • (2023) Detection Mature Bud for Daylily Based on Faster R-CNN Integrated With CBAM. IEEE Access, 11, 81646-81655. DOI: 10.1109/ACCESS.2023.3299595

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN:9781450369763
DOI:10.1145/3357384
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 November 2019

Author Tags

  1. bias-variance-covariance decomposition
  2. ensemble learning
  3. generalization ability
  4. neural networks

Qualifiers

  • Research-article

Funding Sources

  • National Key R&D Program of China
  • National Natural Science Foundation of China
  • Alibaba Innovation Research Foundation 2017
  • European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement

Conference

CIKM '19

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 05 Mar 2025
