Cross Diffusion on Multi-hypergraph for Multi-modal 3D Object Recognition

  • Conference paper
Advances in Multimedia Information Processing – PCM 2018 (PCM 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11164)

Abstract

3D object recognition is a longstanding task in computer vision with wide applications in computer-aided design, virtual reality, and other areas. Current state-of-the-art methods mainly focus on 3D object representation for recognition. Given the multi-modal representations available in practice, how to combine such multi-modal information effectively for recognition remains a challenging and pressing problem. In this paper, we conduct 3D object recognition from multi-modal information through a cross diffusion process on a multi-hypergraph structure. Given multi-modal representations of 3D objects, the correlation among these objects is formulated with a multi-hypergraph structure, with one hypergraph built for each representation separately, which is able to model complex relationships among objects. To combine the multi-modal representations, we propose a cross diffusion process on the multi-hypergraph, in which label information is propagated over the multiple hypergraphs alternately. In this way, the multi-modal information is jointly combined through the cross diffusion process on the multi-hypergraph structure. We have applied the proposed method to 3D object recognition using multiple representations. To evaluate the proposed cross diffusion method, we provide extensive experiments on two public 3D object datasets. Experimental results demonstrate that the proposed method achieves satisfactory multi-modal combination performance and outperforms the current state-of-the-art methods.
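The abstract does not state the update rule, so the following is only a minimal sketch of what alternating label propagation over several hypergraphs might look like, not the authors' formulation. It assumes a k-nearest-neighbour hyperedge construction, unit hyperedge weights, a random-walk transition matrix, and a propagation step of the form F ← αPF + (1 − α)Y. The function names (`knn_hypergraph`, `transition_matrix`, `cross_diffusion`), the parameters `k`, `alpha`, and `rounds`, and the toy features are all hypothetical.

```python
import math

def knn_hypergraph(features, k):
    """Incidence matrix H (n x n): hyperedge j joins object j and its k nearest
    neighbours. The kNN construction is an assumption made for this sketch."""
    n = len(features)
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        others = sorted((i for i in range(n) if i != j),
                        key=lambda i: math.dist(features[i], features[j]))
        for i in [j] + others[:k]:
            H[i][j] = 1.0
    return H

def transition_matrix(H):
    """Random-walk matrix P = Dv^-1 H De^-1 H^T with unit hyperedge weights."""
    n, m = len(H), len(H[0])
    dv = [sum(row) for row in H]                             # vertex degrees
    de = [sum(H[i][e] for i in range(n)) for e in range(m)]  # hyperedge degrees
    return [[sum(H[i][e] * H[l][e] / de[e] for e in range(m)) / dv[i]
             for l in range(n)] for i in range(n)]

def cross_diffusion(transitions, Y, alpha=0.9, rounds=20):
    """Propagate labels Y over each hypergraph in turn, feeding the result of
    one modality into the next (assumed rule: F <- alpha * P F + (1 - alpha) * Y)."""
    n, num_classes = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(rounds):
        for P in transitions:  # alternate over the modalities
            F = [[alpha * sum(P[i][l] * F[l][c] for l in range(n))
                  + (1 - alpha) * Y[i][c] for c in range(num_classes)]
                 for i in range(n)]
    return F

# Toy data (hypothetical): six objects described by two modalities.
shape_feats = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
view_feats = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
              [2.0, 2.0], [2.1, 2.2], [2.2, 2.1]]
# Only objects 0 and 3 are labelled (classes 0 and 1); the rest are unknown.
Y = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0],
     [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]

transitions = [transition_matrix(knn_hypergraph(f, k=2))
               for f in (shape_feats, view_feats)]
F = cross_diffusion(transitions, Y)
predictions = [max(range(len(row)), key=row.__getitem__) for row in F]
print(predictions)
```

In this toy setting both modalities separate the objects into the same two groups, so each labelled object spreads its label to the other members of its group. The only point of the sketch is the inner loop over `transitions`, which passes the label matrix from one hypergraph to the next rather than fusing the hypergraphs up front.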

Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2017YFC0113000) and the National Natural Science Funds of China (U1701262, 61671267).

Author information

Corresponding author

Correspondence to Yue Gao.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Z., Lin, H., Zhu, J., Zhao, X., Gao, Y. (2018). Cross Diffusion on Multi-hypergraph for Multi-modal 3D Object Recognition. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11164. Springer, Cham. https://doi.org/10.1007/978-3-030-00776-8_4

  • DOI: https://doi.org/10.1007/978-3-030-00776-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00775-1

  • Online ISBN: 978-3-030-00776-8

  • eBook Packages: Computer Science (R0)
