DOI: 10.1145/3293353.3293383

HSD-CNN: Hierarchically self decomposing CNN architecture using class specific filter sensitivity analysis

Published: 03 May 2020

Abstract

Conventional convolutional neural networks (CNNs) are trained on large domain datasets and are hence typically over-represented and inefficient in limited-class applications. An efficient way to convert such a large many-class pre-trained network into a small few-class network is through a hierarchical decomposition of its feature maps. We therefore propose an automated, four-step framework for such decomposition, the Hierarchically Self Decomposing CNN (HSD-CNN). The hierarchy is derived automatically using a class-specific filter sensitivity analysis that quantifies the impact of individual filters on each class prediction. The decomposed hierarchical network can be deployed directly to obtain sub-networks for any subset of classes, and these sub-networks are shown to perform well without retraining. Experimental results show that HSD-CNN generally does not degrade accuracy when the full set of classes is used. Interestingly, when operating on known subsets of classes, HSD-CNN improves accuracy while using a much smaller model that requires far fewer operations. The HSD-CNN flow is verified on the CIFAR10, CIFAR100, and CALTECH101 datasets. We report accuracies of up to 85.6% (94.75%) in scenarios with 13 (4) classes of CIFAR100, starting from a VGG-16 network pre-trained on the full dataset. In this case, the proposed HSD-CNN requires 3.97x fewer parameters and saves 71.22% of operations compared to the baseline VGG-16, which contains features for all 100 classes.
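
To make the core idea concrete, below is a minimal sketch of one way a class-specific filter sensitivity analysis could be implemented in PyTorch. It ablates each filter of a chosen convolutional layer in turn and records the resulting drop in each class's softmax score. The names (`model`, `layer`, `loader`) and the ablation-based criterion itself are illustrative assumptions, not the paper's exact method.

```python
import torch

# Hedged sketch (not the paper's exact algorithm): estimate a class-specific
# sensitivity matrix S[c, f] for every filter f of one convolutional layer by
# zeroing that filter's output channel and measuring how much the softmax
# score of class c drops on examples of class c.

@torch.no_grad()
def filter_sensitivity(model, layer, loader, num_classes, device="cpu"):
    model.eval().to(device)
    num_filters = layer.out_channels          # assumes `layer` is nn.Conv2d
    sens = torch.zeros(num_classes, num_filters)
    counts = torch.zeros(num_classes)

    state = {"ablate": None}
    def hook(_module, _inputs, output):
        if state["ablate"] is not None:
            output = output.clone()
            output[:, state["ablate"]] = 0.0  # silence one filter's channel
        return output
    handle = layer.register_forward_hook(hook)

    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        state["ablate"] = None
        base = model(images).softmax(dim=1)   # unperturbed class scores
        for f in range(num_filters):
            state["ablate"] = f
            drop = base - model(images).softmax(dim=1)
            for c in range(num_classes):
                mask = labels == c
                if mask.any():
                    sens[c, f] += drop[mask, c].sum().cpu()
        counts += torch.bincount(labels.cpu(), minlength=num_classes).float()

    handle.remove()
    return sens / counts.clamp(min=1).unsqueeze(1)  # mean score drop per class
```

Classes with similar sensitivity rows depend on similar filters, so one plausible way to derive the class hierarchy is to cluster those rows agglomeratively, e.g. `scipy.cluster.hierarchy.linkage(sens.numpy(), method="ward")`, and decompose the network along the resulting tree.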

Published In

ICVGIP '18: Proceedings of the 11th Indian Conference on Computer Vision, Graphics and Image Processing
December 2018
659 pages
ISBN: 9781450366151
DOI: 10.1145/3293353

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CNN
  2. classification
  3. clustering
  4. hierarchical
  5. model transfer
  6. neural networks
  7. sub-networks

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICVGIP 2018

Acceptance Rates

Overall Acceptance Rate 95 of 286 submissions, 33%

Cited By

  • (2024) Decomposition of Deep Neural Networks into Modules via Mutation Analysis. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1669-1681. https://doi.org/10.1145/3650212.3680390. Online publication date: 11-Sep-2024.
  • (2023) DecompoVision: Reliability Analysis of Machine Vision Components through Decomposition and Reuse. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 541-552. https://doi.org/10.1145/3611643.3616333. Online publication date: 30-Nov-2023.
  • (2023) Decomposing a Recurrent Neural Network into Modules for Enabling Reusability and Replacement. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), 1020-1032. https://doi.org/10.1109/ICSE48619.2023.00093. Online publication date: May-2023.
  • (2022) Decomposing convolutional neural networks into reusable and replaceable modules. Proceedings of the 44th International Conference on Software Engineering, 524-535. https://doi.org/10.1145/3510003.3510051. Online publication date: 21-May-2022.
  • (2022) CaptorX: A Class-Adaptive Convolutional Neural Network Reconfiguration Framework. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41(3), 530-543. https://doi.org/10.1109/TCAD.2021.3061520. Online publication date: Mar-2022.
  • (2022) Taxonomy Driven Learning Of Semantic Hierarchy Of Classes. 2022 IEEE International Conference on Image Processing (ICIP), 171-175. https://doi.org/10.1109/ICIP46576.2022.9898007. Online publication date: 16-Oct-2022.
