Abstract
Data augmentation has proven effective for training deep models. Existing augmentation methods tackle the fine-grained recognition problem by blending image pairs and fusing the corresponding labels according to the statistics of the mixed pixels, which introduces additional label noise that harms network performance. Motivated by this, we present a simple yet effective cross ensemble knowledge distillation (CEKD) model for fine-grained feature learning. We propose a cross distillation module that provides additional supervision to alleviate the noise problem, and a collaborative ensemble module to overcome the target conflict problem. The proposed model can be trained end-to-end and requires only image-level label supervision. Extensive experiments on widely used fine-grained benchmarks demonstrate the effectiveness of the proposed model. Specifically, with a ResNet-101 backbone, CEKD achieves accuracies of 89.59%, 95.96% and 94.56% on the three datasets respectively, outperforming the state-of-the-art API-Net by 0.99%, 1.06% and 1.16%.
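To make the two ingredients of the abstract concrete, here is a minimal sketch (not the paper's implementation; function names and the temperature value are illustrative) of (a) Mixup-style pair blending, whose ratio-weighted fused label is exactly the noise source the motivation describes, and (b) a temperature-softened distillation loss in the style of Hinton et al., the kind of soft extra supervision a cross distillation module can provide:

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=1.0):
    """Blend an image pair and fuse their one-hot labels (Mixup-style).

    The fused label follows the pixel mixing ratio lam, so it only
    approximates the true semantic content of the mixed image -- the
    label-noise problem the paper sets out to alleviate."""
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y

def softmax(z, t=1.0):
    z = np.asarray(z, dtype=float) / t
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, t=4.0):
    """KL divergence between temperature-softened teacher and student
    distributions -- soft supervision that complements the noisy
    mixed hard labels."""
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    return float(np.sum(p * (np.log(p) - np.log(q))) * t * t)
```

For example, mixing a pair of one-hot labels yields a fused label whose entries still sum to one, and the distillation loss is zero only when student and teacher distributions agree.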
References
Wah C, Branson S, Welinder P, Perona P, Belongie S (2011) The Caltech-UCSD Birds-200-2011 dataset
Krause J, Stark M, Deng J, Fei-Fei L (2013) 3d object representations for fine-grained categorization. In: Proceedings of the IEEE international conference on computer vision workshops, pp 554–561
Maji S, Rahtu E, Kannala J, Blaschko M, Vedaldi A (2013) Fine-grained visual classification of aircraft. arXiv:1306.5151
Huang S, Xu Z, Tao D, Zhang Y (2016) Part-stacked cnn for fine-grained visual categorization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1173–1182
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141
Zhang L, Huang S, Liu W, Tao D (2019) Learning a mixture of granularity-specific experts for fine-grained categorization. In: Proceedings of the IEEE international conference on computer vision, pp 8331–8340
Du R, Chang D, Bhunia AK, Xie J, Ma Z, Song Y-Z, Guo J (2020) Fine-grained visual classification via progressive multi-granularity training of jigsaw patches. In: European conference on computer vision. Springer, pp 153–168
Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2017) Mixup: Beyond empirical risk minimization. arXiv:1710.09412
Yun S, Han D, Oh SJ, Chun S, Choe J, Yoo Y (2019) Cutmix: Regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6023–6032
Huang S, Wang X, Tao D (2020) Snapmix: Semantically proportional mixing for augmenting fine-grained data. arXiv:2012.04846
Gong C, Wang D, Li M, Chandra V, Liu Q (2021) Keepaugment: A simple information-preserving data augmentation approach. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1055–1064
Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv:1503.02531
Guo Q, Wang X, Wu Y, Yu Z, Liang D, Hu X, Luo P (2020) Online knowledge distillation via collaborative learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11020–11029
Wang H, Peng J, Zhao Y, Fu X (2020) Multi-path deep cnns for fine-grained car recognition. IEEE Trans Veh Technol 69(10):10484–10493
Wang H, Wang Y, Zhang Z, Fu X, Zhuo L, Xu M, Wang M (2020) Kernelized multiview subspace analysis by self-weighted learning. IEEE Trans Multimed
Wang H, Peng J, Chen D, Jiang G, Zhao T, Fu X (2020) Attribute-guided feature learning network for vehicle reidentification. IEEE MultiMedia 27(4):112–121
Wang H, Peng J, Jiang G, Xu F, Fu X (2021) Discriminative feature and dictionary learning with part-aware model for vehicle re-identification. Neurocomputing 438:55–62
Oquab M, Bottou L, Laptev I, Sivic J (2015) Is object localization for free?-weakly-supervised learning with convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 685–694
Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2921–2929
Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp 618–626
Zhang X, Wei Y, Feng J, Yang Y, Huang TS (2018) Adversarial complementary learning for weakly supervised object localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1325–1334
Zhuang P, Wang Y, Qiao Y (2020) Learning attentive pairwise interaction for fine-grained classification. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 13130–13137
Gao Y, Han X, Wang X, Huang W, Scott M (2020) Channel interaction networks for fine-grained image categorization. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 10818–10825
Simard PY, LeCun YA, Denker JS, Victorri B (1998) Transformation invariance in pattern recognition—tangent distance and tangent propagation. In: Neural networks: tricks of the trade. Springer, pp 239–274
DeVries T, Taylor GW (2017) Improved regularization of convolutional neural networks with cutout. arXiv:1708.04552
Summers C, Dinneen MJ (2019) Improved mixed-example data augmentation. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, pp 1262–1270
Ba LJ, Caruana R (2013) Do deep nets really need to be deep? arXiv:1312.6184
Passalis N, Tefas A (2018) Learning deep representations with probabilistic knowledge transfer. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 268–284
Heo B, Lee M, Yun S, Choi JY (2019) Knowledge distillation with adversarial samples supporting decision boundary. In: Proceedings of the AAAI conference on artificial intelligence, vol 33, pp 3771–3778
Mirzadeh SI, Farajtabar M, Li A, Levine N, Matsukawa A, Ghasemzadeh H (2020) Improved knowledge distillation via teacher assistant. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 5191–5198
Li T, Li J, Liu Z, Zhang C (2020) Few sample knowledge distillation for efficient network compression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14639–14647
Xie J, Lin S, Zhang Y, Luo L (2019) Training convolutional neural networks with cheap convolutions and online distillation. arXiv:1909.13063
Chen D, Mei J-P, Wang C, Feng Y, Chen C (2020) Online knowledge distillation with diverse peers. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 3430–3437
Chung I, Park S, Kim J, Kwak N (2020) Feature-map-level online adversarial knowledge distillation. In: International conference on machine learning, PMLR, pp 2006–2015
Zhou M, Bai Y, Zhang W, Zhao T, Mei T (2020) Look-into-object: Self-supervised structure modeling for object recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11774–11783
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Lin D, Shen X, Lu C, Jia J (2015) Deep lac: Deep localization, alignment and classification for fine-grained recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1666–1674
Zhang N, Donahue J, Girshick R, Darrell T (2014) Part-based r-cnns for fine-grained category detection. In: European conference on computer vision. Springer, pp 834–849
Fu J, Zheng H, Mei T (2017) Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4438–4446
Zheng H, Fu J, Mei T, Luo J (2017) Learning multi-attention convolutional neural network for fine-grained image recognition. In: Proceedings of the IEEE international conference on computer vision, pp 5209–5217
Sun M, Yuan Y, Zhou F, Ding E (2018) Multi-attention multi-class constraint for fine-grained image recognition. In: Proceedings of the european conference on computer vision (ECCV), pp 805–821
Yang Z, Luo T, Wang D, Hu Z, Gao J, Wang L (2018) Learning to navigate for fine-grained classification. In: Proceedings of the european conference on computer vision (ECCV), pp 420–435
Luo W, Yang X, Mo X, Lu Y, Davis LS, Li J, Yang J, Lim S-N (2019) Cross-x learning for fine-grained visual categorization. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 8242–8251
Chen Y, Bai Y, Zhang W, Mei T (2019) Destruction and construction learning for fine-grained image recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5157–5166
Zheng H, Fu J, Zha Z-J, Luo J (2019) Learning deep bilinear transformation for fine-grained image representation. arXiv:1911.03621
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
Acknowledgements
This work was supported by grants from the National Natural Science Foundation of China (61972121), the Zhejiang Provincial Natural Science Foundation of China (LY21F020015), and the Science and Technology Program of Zhejiang Province (No. 2021C01187).
Ethics declarations
Conflict of Interests
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhang, K., Fan, J., Huang, S. et al. CEKD: Cross ensemble knowledge distillation for augmented fine-grained data. Appl Intell 52, 16640–16650 (2022). https://doi.org/10.1007/s10489-022-03355-0