Abstract
Extracting feature points and their descriptors from images is a fundamental technique in computer vision, with applications such as geometric fitting and camera calibration, and several deep learning models have been proposed for this task. However, existing feature descriptor networks have been developed with the aim of improving accuracy, while consideration for practical networks that can run on embedded devices has largely been deferred. The objective of this study is therefore to devise lightweight feature descriptor networks. To this end, we employ lightweight convolution operations developed for image classification networks (e.g., SqueezeNet and MobileNet) to replace the standard convolution operators in the state-of-the-art feature descriptor network RF-Net. Experimental results show that the model size of the detector can be reduced by up to 80% of the original, with at worst an 11% performance degradation in our final lightweight detector model on image matching tasks. Our study indicates that modern convolution techniques originally proposed for small image classification models can be effectively extended to designing tiny models for the feature descriptor extraction and matching portions of deep local feature learning networks.
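The parameter savings behind the approach sketched in the abstract come from depthwise separable convolution, the building block of MobileNet: a per-channel spatial convolution followed by a 1×1 pointwise convolution. A quick parameter count illustrates the reduction; the channel sizes below are illustrative examples, not layer sizes taken from RF-Net.

```python
def conv_params(c_in, c_out, k):
    """Parameters in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one k x k filter per input channel)
    followed by a 1x1 pointwise convolution mapping c_in -> c_out channels."""
    return c_in * k * k + c_in * c_out

# Example layer: 64 input channels, 128 output channels, 3x3 kernel.
c_in, c_out, k = 64, 128, 3
std = conv_params(c_in, c_out, k)          # 64 * 128 * 9 = 73728
sep = dw_separable_params(c_in, c_out, k)  # 64 * 9 + 64 * 128 = 8768
print(std, sep, round(1 - sep / std, 3))   # roughly 88% fewer parameters
```

This roughly (1/c_out + 1/k²)-fold reduction per layer is what makes the ~80% overall model-size reduction reported above plausible when such blocks replace standard convolutions throughout a network.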
References
Balntas, V., Lenc, K., Vedaldi, A., Mikolajczyk, K.: HPatches: a benchmark and evaluation of handcrafted and learned local descriptors. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861 (2017)
Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv:1602.07360 (2016)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94
Ono, Y., Trulls, E., Fua, P., Yi, K.M.: LF-Net: learning local features from images. In: Advances in Neural Information Processing Systems (NIPS) (2018)
Sandler, M., Chu, G., Chen, L.-C., et al.: Searching for MobileNetV3. In: IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Shen, X., et al.: RF-Net: an end-to-end image matching network based on receptive field. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Zitnick, C.L., Ramnath, K.: Edge foci interest points. In: IEEE International Conference on Computer Vision (ICCV) (2011)
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, Y.R., Kanemura, A. (2021). Designing Lightweight Feature Descriptor Networks with Depthwise Separable Convolution. In: Yada, K., et al. Advances in Artificial Intelligence. JSAI 2020. Advances in Intelligent Systems and Computing, vol 1357. Springer, Cham. https://doi.org/10.1007/978-3-030-73113-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73112-0
Online ISBN: 978-3-030-73113-7
eBook Packages: Intelligent Technologies and Robotics (R0)