Sensitivity Pruner: Filter-level compression algorithm for deep neural networks

Published: 01 August 2023

Highlights

We integrate the sensitivity measure from SNIP into the “pruning while fine-tuning” framework to form a more powerful pruning strategy, adapting SNIP's unstructured pruning measure to allow filter-level compression. In practice, the sensitivity score is easily computed as the gradient of the connection mask applied to the weight matrix (a minimal sketch follows these highlights). Because it is independent of the model structure, the sensitivity score can be applied to most neural networks for pruning purposes.
We mitigate the sampling bias of the single-shot influence score by introducing the difference between the learned pruning strategy and the single-shot strategy as a second loss component. Filter influence is measured on batched data, with a convolutional layer used to extract the robust influence signal from the noise within the batch. The learning process is guided by the score provided by this influence measure.
Our algorithm can dynamically shift the training goal between improving model accuracy and pruning more filters. We add a self-adaptive hyper-parameter that balances these two objectives during training.
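
The following is a minimal sketch (assumed PyTorch, not the authors' released code) of the SNIP-style sensitivity score referenced in the highlights: an all-ones connection mask is multiplied into a convolution's weights, and the magnitude of the loss gradient with respect to that mask, aggregated per output filter, serves as the filter-level saliency. The toy model, the random batch, and the aggregation choice are illustrative placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConvNet(nn.Module):
    """Toy CNN whose first conv layer carries an explicit connection mask."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # All-ones mask over the conv weights; it is never trained, only used
        # to read out d(loss)/d(mask) as the sensitivity signal.
        self.mask = nn.Parameter(torch.ones_like(self.conv.weight))
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        w = self.conv.weight * self.mask            # mask applied to the weight matrix
        x = F.conv2d(x, w, self.conv.bias, padding=1)
        x = F.relu(x).mean(dim=(2, 3))              # global average pooling
        return self.head(x)

def filter_sensitivity(model, images, labels):
    """SNIP-style scores: |d loss / d mask|, summed over each output filter."""
    loss = F.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, model.mask)[0]  # same shape as the conv weight
    per_filter = grad.abs().sum(dim=(1, 2, 3))       # collapse to one score per filter
    return per_filter / per_filter.sum()             # normalised filter saliency

# One random batch stands in for real data; the filters with the lowest
# scores would be the candidates for removal.
model = MaskedConvNet()
scores = filter_sensitivity(model,
                            torch.randn(8, 3, 32, 32),
                            torch.randint(0, 10, (8,)))
print(scores)
```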

Abstract

As neural networks grow deeper in pursuit of better performance, the demand for models that can be deployed on resource-constrained devices also grows. In this work, we propose eliminating less sensitive filters to compress models. A previous method, SNIP, evaluates neuron importance using the gradient of the connection mask in a single shot. To mitigate the resulting sampling bias, we integrate this measure into the previously proposed “pruning while fine-tuning” framework. Besides the classification error, we introduce the difference between the learned and the single-shot strategy as a second loss component, with a self-adaptive hyper-parameter that balances the training goal between improving accuracy and pruning more filters. Our Sensitivity Pruner (SP) adapts the unstructured pruning saliency metric to structured pruning tasks and enables the strategy to be derived sequentially to accommodate the updating sparsity. Experimental results demonstrate that SP significantly reduces the computational cost, and the pruned models give comparable or better performance on the CIFAR-10, CIFAR-100, and ILSVRC-12 datasets.
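
As a companion to the abstract, here is a minimal, hypothetical sketch of the two-part objective it describes: the classification error plus the discrepancy between the learned pruning strategy and the single-shot strategy, traded off by a hyper-parameter adjusted from the current sparsity. The function name, the use of a mean-squared discrepancy, and the update rule for alpha are assumptions made for illustration, not the paper's published equations.

```python
import torch
import torch.nn.functional as F

def sensitivity_pruning_loss(logits, labels, learned_mask, single_shot_scores,
                             current_ratio, target_ratio, alpha):
    """Classification error + alpha * strategy discrepancy (illustrative only)."""
    cls_loss = F.cross_entropy(logits, labels)
    # Discrepancy between the learned (soft) filter-keep probabilities and the
    # single-shot SNIP-style scores; any distance measure could be substituted.
    strategy_loss = F.mse_loss(learned_mask, single_shot_scores)
    # Assumed self-adaptive rule: weight the pruning term more while the model
    # is still far from the target sparsity, less once the target is reached.
    alpha = alpha * (1.0 + (target_ratio - current_ratio))
    return cls_loss + alpha * strategy_loss, alpha

# Toy call illustrating the interface (shapes are only for demonstration).
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
learned = torch.sigmoid(torch.randn(16))        # soft keep-probability per filter
single_shot = torch.rand(16)
single_shot = single_shot / single_shot.sum()   # normalised single-shot scores
loss, alpha = sensitivity_pruning_loss(logits, labels, learned, single_shot,
                                       current_ratio=0.2, target_ratio=0.5, alpha=1.0)
```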

References

[1]
I. Manipur, M. Manzo, I. Granata, M. Giordano, L. Maddalena, M.R. Guarracino, Netpro2vec: a graph embedding framework for biomedical applications, IEEE/ACM TCBB 19 (2) (2022) 729–740.
[2]
Y. Cheng, D. Wang, P. Zhou, T. Zhang, Model compression and acceleration for deep neural networks: the principles, progress, and challenges, IEEE Signal Process. Mag. 35 (1) (2018) 126–136.
[3]
H. Li, H. Samet, A. Kadav, I. Durdanovic, H.P. Graf, Pruning filters for efficient ConvNets, ICLR, 2017.
[4]
J.-H. Luo, H. Zhang, H.-Y. Zhou, C.-W. Xie, J. Wu, W. Lin, Thinet: pruning CNN filters for a thinner net, TPAMI 41 (10) (2019) 2525–2538.
[5]
J. Liu, B. Zhuang, Z. Zhuang, Y. Guo, J. Huang, J. Zhu, M. Tan, Discrimination-aware network pruning for deep model compression, TPAMI 44 (8) (2022) 4035–4051.
[6]
Y. He, P. Liu, Z. Wang, Z. Hu, Y. Yang, Filter pruning via geometric median for deep convolutional neural networks acceleration, CVPR, 2019, pp. 4340–4349.
[7]
Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, S. Han, AMC: AutoML for model compression and acceleration on mobile devices, ECCV, 2018, pp. 815–832.
[8]
J.-H. Luo, J. Wu, Autopruner: an end-to-end trainable filter pruning method for efficient deep model inference, Pattern Recognit. 107 (2020) 107461.
[9]
N. Lee, T. Ajanthan, P.H. Torr, SNIP: single-shot network pruning based on connection sensitivity, ICLR, 2019.
[10]
T. Zhuang, Z. Zhang, Y. Huang, X. Zeng, K. Shuang, X. Li, Neuron-level structured pruning using polarization regularizer, NeurIPS, volume 33, 2020, pp. 9865–9877.
[11]
C. Tai, T. Xiao, X. Wang, W. E, Convolutional neural networks with low-rank regularization, ICLR, 2016.
[12]
V. Lebedev, Y. Ganin, M. Rakhuba, I.V. Oseledets, V.S. Lempitsky, Speeding-up Convolutional neural networks using fine-tuned CP-decomposition, ICLR, 2015.
[13]
T. Cohen, M. Welling, Group equivariant convolutional networks, ICML, 2016, pp. 2990–2999.
[14]
W. Shang, K. Sohn, D. Almeida, H. Lee, Understanding and improving convolutional neural networks via concatenated rectified linear units, ICML, 2016, pp. 2217–2225.
[15]
A. Romero, N. Ballas, S.E. Kahou, A. Chassang, C. Gatta, Y. Bengio, FitNets: hints for thin deep nets, ICLR, 2015.
[16]
P. Luo, Z. Zhu, Z. Liu, X. Wang, X. Tang, Face model compression by distilling knowledge from neurons, AAAI, 2016, pp. 3560–3566.
[17]
T. Chen, I.J. Goodfellow, J. Shlens, Net2Net: accelerating learning via knowledge transfer, ICLR, 2016.
[18]
S. Gupta, A. Agrawal, K. Gopalakrishnan, P. Narayanan, Deep learning with limited numerical precision, ICML, 2015, pp. 1737–1746.
[19]
M. Courbariaux, Y. Bengio, J.-P. David, BinaryConnect: training deep neural networks with binary weights during propagations, NeurIPS, 2015, pp. 3123–3131.
[20]
M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi, XNOR-Net: ImageNet classification using binary convolutional neural networks, ECCV, 2016, pp. 525–542.
[21]
J.M. Alvarez, M. Salzmann, Learning the number of neurons in deep networks, NeurIPS, 2016, pp. 2262–2270.
[22]
X. Zhu, W. Zhou, H. Li, Improving deep neural network sparsity through decorrelation regularization, IJCAI, 2018, pp. 3264–3270.
[23]
S. Han, H. Mao, W.J. Dally, Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding, in: Y. Bengio, Y. LeCun (Eds.), ICLR, 2016.
[24]
S. Han, J. Pool, J. Tran, W.J. Dally, Learning both weights and connections for efficient neural network, NeurIPS, 2015, pp. 1135–1143.
[25]
Y. He, X. Zhang, J. Sun, Channel pruning for accelerating very deep neural networks, ICCV, 2017, pp. 1398–1406.
[26]
K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, ICLR, 2015.
[27]
K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, CVPR, 2016, pp. 770–778.
[28]
M. Sandler, A.G. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, MobileNetV2: inverted residuals and linear bottlenecks, CVPR, 2018, pp. 4510–4520.
[29]
A. Krizhevsky, Learning multiple layers of features from tiny images, Technical Report, 2009.
[30]
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, ImageNet: a large-scale hierarchical image database, CVPR, 2009, pp. 248–255.
[31]
Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, C. Zhang, Learning efficient convolutional networks through network slimming, ICCV, 2017, pp. 2755–2763.
[32]
H. Peng, J. Wu, S. Chen, J. Huang, Collaborative channel pruning for deep networks, ICML, 2019, pp. 5113–5122.
[33]
Y. He, G. Kang, X. Dong, Y. Fu, Y. Yang, Soft filter pruning for accelerating deep convolutional neural networks, IJCAI, 2018, pp. 2234–2240.
[34]
X. Ding, T. Hao, J. Tan, J. Liu, J. Han, Y. Guo, G. Ding, ResRep: lossless CNN pruning via decoupling remembering and forgetting, ICCV, 2021, pp. 4490–4500.
[35]
W. Wang, C. Fu, J. Guo, D. Cai, X. He, COP: customized deep model compression via regularized correlation-based filter-level pruning, IJCAI, 2019, pp. 3785–3791.
[36]
Z. Liu, H. Mu, X. Zhang, Z. Guo, X. Yang, K.-T. Cheng, J. Sun, MetaPruning: meta learning for automatic neural network channel pruning, ICCV, 2019, pp. 3295–3304.
[37]
X. Ding, T. Hao, J. Han, Y. Guo, G. Ding, Manipulating identical filter redundancy for efficient pruning on deep and complicated CNN, arXiv:2107.14444, 2021.
[38]
M. Lin, R. Ji, Y. Zhang, B. Zhang, Y. Wu, Y. Tian, Channel pruning via automatic structure search, IJCAI, volume 1, 2020, pp. 673–679.
[39]
S. Guo, Y. Wang, Q. Li, J. Yan, DMCP: differentiable Markov channel pruning for neural networks, CVPR, 2020, pp. 1536–1544.

Published In

Pattern Recognition, Volume 140, Issue C
Aug 2023
775 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Filter pruning
  2. Saliency-based pruning
  3. End-to-end pruning framework
  4. Sampling bias

Qualifiers

  • Research-article
