Abstract
General object detection has been widely studied and developed over the past few years, while few-shot object detection is still in an exploratory stage. Learning effective knowledge from a limited number of samples is challenging, as a model trained on few samples is prone to over-fitting to their biased feature distributions. Traditional few-shot object detection methods face two significant challenges: (1) the extreme scarcity of samples aggravates the bias of the proposal distribution, hindering the region-of-interest (RoI) head from adapting to novel categories; (2) owing to the scarcity of samples in novel categories, the region proposal network (RPN) becomes a key source of classification errors, resulting in a significant drop in detection performance on novel categories. To overcome these challenges, an effective knowledge transfer method based on distribution calibration and data augmentation is proposed. First, the biased novel category distributions are calibrated with the base category distributions; second, a drift compensation strategy is employed to reduce the negative impact on novel category classification during fine-tuning; third, synthetic features are sampled from the calibrated distributions of the novel categories and added to the subsequent training process. Furthermore, domain-aware data augmentation alleviates data scarcity by exploiting cross-image foreground-background mixing to increase the diversity and plausibility of the augmented data. Experimental results demonstrate the effectiveness and applicability of the proposed method.
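To make the calibration-and-sampling pipeline concrete, the sketch below illustrates the steps the abstract describes, under stated assumptions: a novel class's statistics are calibrated toward the k nearest base classes, synthetic features are drawn from the calibrated Gaussian, and foregrounds are pasted across images for augmentation. The function names, the choice of k, and the slack term alpha are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def calibrate_distribution(novel_feats, base_stats, k=2, alpha=0.3):
    """Calibrate a novel class's feature distribution with base class statistics.

    novel_feats: (n, d) array of the few novel-class features.
    base_stats:  list of (mean, cov) tuples, one per base class.
    Returns a calibrated (mean, cov) pair. k and alpha are illustrative.
    """
    mu_novel = novel_feats.mean(axis=0)
    # Pick the k base classes whose means lie closest to the novel mean.
    dists = [np.linalg.norm(mu - mu_novel) for mu, _ in base_stats]
    nearest = np.argsort(dists)[:k]
    mu_cal = np.mean([base_stats[i][0] for i in nearest] + [mu_novel], axis=0)
    # Average the neighbours' covariances and add a small slack term.
    cov_cal = np.mean([base_stats[i][1] for i in nearest], axis=0) + alpha
    return mu_cal, cov_cal

def sample_synthetic_features(mu, cov, n=100, seed=0):
    """Draw synthetic novel-class features from the calibrated Gaussian."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mu, cov, size=n)

def paste_foreground(background, fg_patch, mask, x, y):
    """Cross-image mixing: paste a masked foreground crop onto a background.

    background: (H, W, 3) image; fg_patch: (h, w, 3); mask: (h, w) binary.
    """
    h, w = fg_patch.shape[:2]
    region = background[y:y + h, x:x + w]
    region[mask > 0] = fg_patch[mask > 0]  # writes back through the view
    return background
```

In a fine-tuning run, one would compute base_stats once from the abundant base-class RoI features, then train the novel-class classifier on the real features together with the sampled synthetic ones.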
Data availability
All authors have agreed to authorship, have read and approved the manuscript, and have given consent for its submission and subsequent publication.
Acknowledgements
We are very grateful to the editor and reviewers for their time and effort in reviewing this manuscript. We also appreciate the support of the Natural Science Foundation of China (No. 221077).
Author information
Contributions
Kai Zhang wrote the main manuscript text, and Songhao Zhu prepared Figures 1, 2, and 3. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhu, S., Zhang, K. Few-shot object detection via data augmentation and distribution calibration. Machine Vision and Applications 35, 11 (2024). https://doi.org/10.1007/s00138-023-01486-z