Abstract
Transformational and spatial proximity are important cues for identifying inliers in an appearance-based match set, because correct matches generally stay close together in the input images and share similar local transformations. However, most existing approaches check only one type of proximity, or both types consecutively with manually set thresholds, which limits their matching accuracy and their flexibility in handling large-scale images. In this paper, we present an efficient clustering-based approach that identifies match inliers using both proximities simultaneously. It first projects the putative matches into a joint transformational-spatial space, where mismatches tend to scatter while correct matches gather together. A mode-seeking process based on joint kernel density estimation is then proposed to obtain significant clusters in this joint space, where each cluster contains, with high accuracy, the matches that map the same object across images. Moreover, the kernel bandwidths used to measure match proximity are set adaptively during density estimation, which enhances the method's applicability to different images. Experiments on three standard datasets show that the proposed approach delivers superior performance on a variety of feature matching tasks, including multi-object matching, duplicate-object matching, and object retrieval.
Acknowledgements
This work was co-supported by the National Natural Science Foundation of China (61502005) and the Anhui Science Foundation (1608085QF129).
Cite this article
Wang, L., Tan, L., Fang, X. et al. Adaptively feature matching via joint transformational-spatial clustering. Multimedia Systems 29, 1717–1727 (2023). https://doi.org/10.1007/s00530-021-00792-8