Abstract
3D object recognition is a longstanding task in computer vision with wide applications in computer-aided design, virtual reality, and other areas. Current state-of-the-art methods mainly focus on 3D object representation for recognition. In practice, 3D objects often come with multi-modal representations, and combining this multi-modal information effectively for recognition remains a challenging and pressing problem. In this paper, we conduct 3D object recognition using multi-modal information through a cross diffusion process on a multi-hypergraph structure. Given multi-modal representations of 3D objects, the correlation among these objects is formulated with a multi-hypergraph structure, in which one hypergraph is built for each representation separately; this structure is able to model complex relationships among objects. To combine the multi-modal representations, we propose a cross diffusion process on the multi-hypergraph, in which the label information is propagated across the multiple hypergraphs alternately. In this way, the multi-modal information is jointly combined through the cross diffusion process on the multi-hypergraph structure. We apply the proposed method to 3D object recognition using multiple representations. To evaluate the performance of the proposed cross diffusion method, we conduct extensive experiments on two public 3D object datasets. Experimental results demonstrate that the proposed method achieves satisfactory multi-modal combination performance and outperforms current state-of-the-art methods.
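The abstract does not give the update rule, so the following is only a minimal sketch of what alternating label propagation over several hypergraphs could look like, not the authors' formulation. It assumes k-nearest-neighbour hyperedges, unit hyperedge weights, the commonly used normalized hypergraph operator, and a simple rule in which each hypergraph diffuses the label scores currently held by the next one. The function names, the parameters `k`, `alpha`, and `n_iter`, and the toy data are all hypothetical.

```python
import numpy as np

def knn_incidence(X, k):
    """Incidence matrix H (n x n): hyperedge j joins sample j and its nearest neighbours."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k + 1]   # k + 1 closest samples (normally includes j itself)
    H = np.zeros((X.shape[0], X.shape[0]))
    for j, members in enumerate(nn):
        H[members, j] = 1.0
        H[j, j] = 1.0                       # make sure the centroid belongs to its own hyperedge
    return H

def hypergraph_operator(H):
    """Theta = Dv^-1/2 H De^-1 H^T Dv^-1/2, assuming unit hyperedge weights."""
    dv = H.sum(axis=1)                      # vertex degrees
    de = H.sum(axis=0)                      # hyperedge degrees
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    left = dv_inv_sqrt[:, None] * H / de[None, :]
    return (left @ H.T) * dv_inv_sqrt[None, :]

def cross_diffusion(thetas, Y, alpha=0.9, n_iter=50):
    """Assumed alternating rule: hypergraph i diffuses the scores held by hypergraph i + 1."""
    m = len(thetas)
    F = [Y.copy() for _ in thetas]
    for _ in range(n_iter):
        F = [alpha * (thetas[i] @ F[(i + 1) % m]) + (1 - alpha) * Y for i in range(m)]
    return sum(F) / m                       # average the per-hypergraph label scores

# Toy data: 12 objects, 2 classes, 2 synthetic "modalities" of different dimension.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 6)
views = [labels[:, None] * 10.0 + rng.normal(scale=0.5, size=(12, d)) for d in (4, 8)]

Y = np.zeros((12, 2))                       # one labelled object per class
Y[0, 0] = 1.0
Y[6, 1] = 1.0

thetas = [hypergraph_operator(knn_incidence(X, k=5)) for X in views]
pred = cross_diffusion(thetas, Y).argmax(axis=1)
print(pred)
```

Feeding each hypergraph the scores from a different hypergraph is what makes the diffusion "cross" in this sketch; replacing `F[(i + 1) % m]` with `F[i]` would reduce it to independent per-modality propagation followed by averaging.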
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2017YFC0113000) and the National Natural Science Funds of China (U1701262, 61671267).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Z., Lin, H., Zhu, J., Zhao, X., Gao, Y. (2018). Cross Diffusion on Multi-hypergraph for Multi-modal 3D Object Recognition. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_4
DOI: https://doi.org/10.1007/978-3-030-00776-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00775-1
Online ISBN: 978-3-030-00776-8
eBook Packages: Computer Science (R0)