Abstract
Convolutional neural networks (CNNs) have achieved significant progress in computer vision systems, efficiently extracting feature information by sliding filters over the input images. However, CNNs have difficulty capturing specific properties when the images are affected by various kinds of noise. This paper proposes an attention-based convolutional pooling neural network (ACPNN) in which an attention mechanism is applied to feature maps to obtain key features, and max pooling is replaced with convolutional pooling to improve recognition accuracy in harsh environments. With its attention mechanism and convolutional pooling structure, the ACPNN is robust against external noise and maintains classification performance under such conditions. The proposed ACPNN was validated on the German traffic sign recognition benchmark in various cases. Because traffic signs are subject to various kinds of noise, recognition performance was compared against a conventional CNN and state-of-the-art CNNs such as the multi-scale CNN, committee of CNNs, hierarchical CNN, and multi-column deep neural network. Under such harsh conditions, the proposed ACPNN achieves 66.981% and 83.198%, respectively, which are the best performances among the compared CNNs.
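The two ideas named in the abstract, replacing max pooling with convolutional pooling and applying attention to feature maps, can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the 2x2 stride-2 window, the softmax form of the spatial attention, the function names, and the kernel weights are hypothetical and are not taken from the paper.

```python
import math

def max_pool2x2(fmap):
    """Conventional 2x2 max pooling with stride 2 over a 2D feature map."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def conv_pool2x2(fmap, kernel):
    """Downsample with a 2x2 stride-2 convolution; in a real network the
    kernel weights would be learned rather than fixed (assumed form)."""
    return [[sum(kernel[a][b] * fmap[i + a][j + b]
                 for a in range(2) for b in range(2))
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def spatial_attention(fmap):
    """Reweight each position by a softmax taken over all positions
    (an assumed attention form, not the paper's definition)."""
    flat = [v for row in fmap for v in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in flat]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(fmap[0])
    return [[fmap[i][j] * weights[i * width + j] for j in range(width)]
            for i in range(len(fmap))]

# Toy 4x4 feature map and hypothetical kernel weights.
feature_map = [[1.0, 2.0, 0.0, 1.0],
               [3.0, 4.0, 1.0, 0.0],
               [0.0, 1.0, 2.0, 2.0],
               [1.0, 0.0, 2.0, 2.0]]
averaging_kernel = [[0.25, 0.25], [0.25, 0.25]]

pooled_max = max_pool2x2(feature_map)
pooled_conv = conv_pool2x2(feature_map, averaging_kernel)
attended = spatial_attention(feature_map)
```

Max pooling keeps only the largest value in each window, whereas the strided convolution combines all four values with weights that training can adjust, which is one plausible reason such a layer could be less sensitive to isolated noisy pixels.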
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2016R1D1A1B01016071 and NRF-2017R1D1A1B03031467).
Cite this article
Chung, J.H., Kim, D.W., Kang, T.K. et al. Traffic Sign Recognition in Harsh Environment Using Attention Based Convolutional Pooling Neural Network. Neural Process Lett 51, 2551–2573 (2020). https://doi.org/10.1007/s11063-020-10211-0
DOI: https://doi.org/10.1007/s11063-020-10211-0