Abstract
Building upon recent developments in optical flow estimation and stereo matching, we propose a variational framework for the estimation of stereoscopic scene flow, i.e., the motion of points in the three-dimensional world computed from stereo image sequences. The proposed algorithm takes stereo image pairs from two consecutive time steps and computes both depth and a 3D motion vector for each point in the image. In contrast to previous works, we partially decouple the depth estimation from the motion estimation, which has many practical advantages. The variational formulation is flexible and can handle both sparse and dense disparity maps. The proposed method is very efficient: with the depth map computed on an FPGA and the scene flow computed on the GPU, it runs at 20 frames per second on QVGA images (320×240 pixels). Furthermore, we present solutions to two important problems in scene flow estimation: violations of intensity consistency between the input images, and the derivation of uncertainty measures for the scene flow result.
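To make concrete what "depth and a 3D motion vector for each point" can mean, the following minimal sketch converts image-space quantities into a 3D displacement. It is an illustration under stated assumptions, not the algorithm of the article: it assumes a rectified pinhole stereo rig and a parametrization by optical flow `(u, v)` in the left image plus a disparity change `dd`, neither of which is spelled out in the abstract. The names `triangulate`, `scene_flow`, `f` (focal length in pixels), `b` (baseline in metres), and `(x0, y0)` (principal point) are hypothetical.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]


def triangulate(x: float, y: float, d: float,
                f: float, b: float, x0: float, y0: float) -> Point3D:
    """Back-project pixel (x, y) with disparity d to a 3D point.

    Assumes a rectified stereo pair with focal length f (pixels),
    baseline b (metres) and principal point (x0, y0).
    """
    if d <= 0:
        raise ValueError("disparity must be positive")
    Z = f * b / d            # depth from standard stereo triangulation
    X = (x - x0) * b / d     # lateral position
    Y = (y - y0) * b / d     # vertical position
    return (X, Y, Z)


def scene_flow(x: float, y: float, d: float,
               u: float, v: float, dd: float,
               f: float, b: float, x0: float, y0: float) -> Point3D:
    """3D displacement of one point between two consecutive time steps.

    (u, v) is the assumed optical flow in the left image and dd the
    assumed change in disparity; the result is the difference of the
    two triangulated positions.
    """
    p0 = triangulate(x, y, d, f, b, x0, y0)
    p1 = triangulate(x + u, y + v, d + dd, f, b, x0, y0)
    return (p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2])
```

In this sketch the disparity `d` at the first time step enters only as an input, so it could come from any stereo method, sparse or dense; that is one way to read the partial decoupling of depth and motion described above.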
Cite this article
Wedel, A., Brox, T., Vaudrey, T. et al. Stereoscopic Scene Flow Computation for 3D Motion Understanding. Int J Comput Vis 95, 29–51 (2011). https://doi.org/10.1007/s11263-010-0404-0
DOI: https://doi.org/10.1007/s11263-010-0404-0