Abstract
Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today's best-performing stereo algorithms.
Cite this article
Scharstein, D., Szeliski, R. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. International Journal of Computer Vision 47, 7–42 (2002). https://doi.org/10.1023/A:1014573219977
DOI: https://doi.org/10.1023/A:1014573219977