A robust framework for multi-view stereopsis

  • Original article
  • The Visual Computer

Abstract

Various neural-network approaches have been proposed for multi-view stereopsis, but most of them cannot handle large textureless regions well. We therefore construct a matching network that learns comprehensive information from stereo images and enforces smoothness constraints globally. Although it is trained on binocular stereo datasets only, we show that the network can directly handle the DTU multi-view stereo dataset. When multiple depth maps obtained through stereo matching are merged, an additional point consolidation procedure is often needed to remove outliers and better align individual patches. We propose a second network that consolidates 3D point clouds by directly projecting individual 3D points based on the point distributions in their neighborhoods. Unlike the matching network, this network is trained on local information; it scales to point clouds of any size and can also process selected areas of interest. Quantitative evaluation on the DTU dataset demonstrates that our two networks together can generate point clouds comparable to those of existing state-of-the-art approaches.
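
To make the idea of projecting a point according to the distribution of its neighbors concrete, the sketch below uses a classical, non-learned stand-in: each point is moved onto a plane fitted by PCA to its k nearest neighbors. This is not the consolidation network described in the abstract; the function name `consolidate_points`, the parameter `k`, and the brute-force neighbor search are all assumptions made for illustration.

```python
import numpy as np


def consolidate_points(points, k=8):
    """Project each point onto a plane fitted to its k nearest neighbours.

    Hypothetical illustration only: a PCA plane fit stands in for the
    learned projection that the abstract describes.
    """
    points = np.asarray(points, dtype=float)
    # Brute-force pairwise squared distances (suitable for small clouds only).
    diffs = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)
    # Indices of the k closest points; each point is its own nearest neighbour.
    neighbours = np.argsort(dist2, axis=1)[:, :k]

    projected = np.empty_like(points)
    for i, idx in enumerate(neighbours):
        patch = points[idx]
        centroid = patch.mean(axis=0)
        # 3x3 covariance of the neighbourhood; rows of the input are x, y, z.
        cov = np.cov((patch - centroid).T)
        # eigh returns eigenvalues in ascending order, so column 0 is the
        # direction of least variance, used here as the plane normal.
        _, eigvecs = np.linalg.eigh(cov)
        normal = eigvecs[:, 0]
        # Remove the component of the point that sticks out of the plane.
        offset = np.dot(points[i] - centroid, normal)
        projected[i] = points[i] - offset * normal
    return projected
```

Because only a local neighborhood is used for each point, a procedure of this kind can in principle be applied to a selected region of a cloud as well as to the whole cloud, which mirrors the locality property claimed for the second network.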

Author information

Corresponding author

Correspondence to Wendong Mao.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Cite this article

Mao, W., Wang, M., Huang, H. et al. A robust framework for multi-view stereopsis. Vis Comput 38, 1539–1551 (2022). https://doi.org/10.1007/s00371-021-02087-5

  • DOI: https://doi.org/10.1007/s00371-021-02087-5
