Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Fast Approximate Energy Minimization via Graph Cuts

Published: 01 November 2001 Publication History

Abstract

Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. In this paper, we consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

References

[1]
R.K. Ahuja, T.L. Magnanti, and J.B. Orlin, Network Flows: Theory, Algorithms, and Applications. Prentice Hall, 1993.]]
[2]
A. Amini, T. Weymouth, and R. Jam, 'Using Dynamic Programming for Solving Variational Problems in Vision," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 9, pp. 855-867, Sept. 1990.]]
[3]
S.A. Barker and P.J.W. Rayner, "Unsupervised Image Segmentation," Proc. IEEE Int'l Conf. Acoustics, Speech and Signal Processing, vol. 5, pp. 2757-2760,1998.]]
[4]
J. Besag, "On the Statistical Analysis of Dirty Pictures," (with discussion), J. Royal Statistical Soc., Series B, vol. 48, no. 3, pp. 259302, 1986.]]
[5]
S. Birchfield and C. Tomasi, "Depth Discontinuities by Pixel-toPixel Stereo," Int'l J. Computer Vision, vol. 35, no. 3, pp. 1-25, Dec. 1999.]]
[6]
S. Birchfield and C. Tomasi, "A Pixel Dissimilarity Measure that Is Insensitive to Image Sampling," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 4, pp. 401406, Apr. 1998.]]
[7]
S. Birchfietd, "Depth and Motion Discontinuities," PhD thesis, Stanford Univ., June 1999. Available from http://vision. stanford.edu/-.birch/publications/. {8) A. Blake and A. Zisserman, Visual Reconstruction. MIT Press, 1987.]]
[8]
A. Blake and A. Zisserman, Visual Reconstruction. MIT Press, 1987.]]
[9]
A. Blake, "Comparison of the Efficiency of Deterministic and Stochastic Algorithms for Visual Reconstruction," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 1, pp. 2-12, Jan. 1989.]]
[10]
Y. Boykov and V. Kolmogorov, "An Experimental Comparison of Mm-Cut/Max-Flow Algorithms for Energy Minimization in Vision," Proc. Int'l Workshop Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 359-374, Sept. 2001.]]
[11]
I Boykov, 0. Veksler, and R. Zabih, "Markov Random Fields with Efficient Approximations." Poor. IEEE Conf. Computer Vision and Pattern Recognition, pp. 648-655, 1998.]]
[12]
P.8. Chou and C.M. Brown, "The Theory and Practice of Bayesian Image Labeling," Intl J. Computer Vision, vol. 4, no. 3, pp. 185-210, 1990.]]
[13]
E. Dahlhaus, D.S. Johnson, CU. Papadimitriou, P.O. Seymour, and M. Yannakalcis, "The Complexity of Multiway Cuts," ACM Symp. Theory of Computing, pp. 241-251, 1992.]]
[14]
P. Ferrari, A. Frigessi, and P. de So, "Fast Approximate Maximum A Posteriori Restoration of Multicolour Images," J. Royal Statistical Soc., Series B, vol. 57, no. 3, pp. 485-500, 1995.]]
[15]
L. Ford and D. Fulkerson, Flows in Networks. Princeton Univ. Press, 1%2. {16) Y. Gdalyahu, D. Weinshall, and M. Werman, "Self-Organization in Vision: Stochastic Clustering for Image Segmentation, Perceptual Grouping, and Image Database Organization," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 10, pp. 10531074, Oct. 2001.]]
[16]
0. Geiger and A. Yuille, "A Common Framework for Image Segmentation," Int'l J. Computer Vision, vol. 6, no. 3, pp. 227-243, 1991.]]
[17]
D. Geman, S. Geman, C. Graffigne, and P. Dong, "Boundary Detection by Constrained Optimization," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 609-628, July 1990.]]
[18]
5. Geman and D. Geman, "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, pp. 721741, 1984.]]
[19]
0. Greig, B. Porteous, and A. Seheult, "Exact Maximum A Posteriori Estimation for Binary Images," /. Royal Statistical Soc., Series B, vol. 51, no. 2, pp. 271279, 1989.]]
[20]
W.E.L. Grimson and T. Pavlidis, "Discontinuity Detection for Visual Surface Reconstruction," Computer Vision, Graphics and Image Processing, vol. 30, pp. 316-330, 1985.]]
[21]
B.K.P. Horn and B. Schunic, "Determining Optical Flow," Artificial Intelligence, vol. 17, pp. 185-203, 1981.]]
[22]
R Hummel and S. Zucker, "On the Foundations of Relaxation Labeling Processes," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 5, pp. 267287, 1983.]]
[23]
H. lshikawa and D. Geiger, "Occlusions, Discontinuities, and Epipolar Lines in Stereo," Proc. European Conf. Computer Vision, pp. 232-248, 1998.]]
[24]
H. Ishikawa, "Global Optimization Using Embedded Graphs," PhD thesis, New York Univ., May 2000. Available from http://csl.cs.nyu.edu/phstudents/ishikawa/index.html.]]
[25]
M. ICass, A. Witkmn, and 0. Terzopoulos, "Snakes: Active Contour Models," Int'l J. Computer Vision, vol. 1, no. 4, pp. 321-331, 1987.]]
[26]
J. Kleinberg and E. Tardos, "Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields," Proc. IEEE Symp. Foundations of Computer Science, pp. 14-24, 1999.]]
[27]
V. lColmogorov and R. Zabih, "Computing Visual Correspondence with Occlusions via Graph Cuts," Proc. Int'l Conf. Computer Vision, vol. II, pp. 508-515, 2001.]]
[28]
D. Lee and T. Pavlidis, "One Dimensional Regularization with Discontinuities,' IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 10, no. 6, pp. 822-829, Nov. 1988. p]]
[29]
S. Li, Markon Random Field Modeling in Computer Vision. SpringerVerlag, 1995.]]
[30]
G. Parisi, Statistical Field Theory. Reading, Mass.: Addison-Wesley, 1988]]
[31]
M. Pelillo, "The Dynamics of Nonlinear Relaxation Labeling Processes," f. Math. Imaging and Vision, vol. 7, pp. 309-323, 1997]]
[32]
T. Poggio, E. Gamble, and J: Little, ."Parallel Integration of Vision Modules," Science, vol. 242, pp. 436-440, Oct. 1988. See also E. Gamble and T. Poggio, MIT Al Memo 970.]]
[33]
1 1. Poggio, V. Torre, and C. Koch, "Computational Vision and Regularization Theory," Nature, vol. 317, pp. 314-319, 1985.]]
[34]
It Potts, "Some Generalized Order-Disorder Transformation," Proc. Cambridge Philosophical Soc., vol. 48, pp. 106-109, 1952.]]
[35]
A. Rosenfeld, R.A. Hummel, and S.W. Zucker, "Scene Labeling by Relaxation Operations," IEEE Trans. Systems, Man, and Cybernetics, vol. 6, no. 6, pp. 420-433, June 1976.]]
[36]
5. Roy and I. Cox, "A Maximum-Flow Formulation of the ti-Camera Stereo Correspondence Problem," Proc. Int'l Conf. Computer Vision, pp. 492-499, 1998.]]
[37]
J. Shi and J. Malik, "Normalized Cuts and Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug. 2000.]]
[38]
RH. Swendson and J. Wang, "Nonuniversal Critical Dynamics in Monte Carlo Simulations," Physical Rev. Letters, vol. 58, no. 2, pp. 86-88, 1987.]]
[39]
R. Szeliski and R. Zabih, "An Experimental Comparison of Stereo Algorithms," Vision Algorithms: Theory and Practice, B. Triggs, A. Zisserman, and R. Szeliski, eds., pp. 1-19, Sept. 1999.]]
[40]
R.S. Szeliski, "Bayesian Modeling of Uncertainty in Low-Level Vision," Int'l J. Computer Vision, vol. 5, no. 3, pp. 271-302, Dec. 1990.]]
[41]
D. Terzopoulos, 'Regularization of Inverse Visual Problems Involving Discontinuities," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 4, pp. 413-424, 1986.]]
[42]
0. Veksler, "Efficient Graph-Based Energy Minimization Methods in Computer Vision," PhD thesis, Cornell Univ., July 1999. Available from www.neci.nj.nec.com/homepages/olga.]]
[43]
0. Veksler, "Image Segmentation by Nested Cuts," Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 339-344, 2000.]]
[44]
Y. Weiss and E.H. Adelson, "A Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of Models," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 321-326, 1996.]]
[45]
G. Winder, Image Analysis, Random Fields and Dynamic Monte Carlo Methods. SpringerVerlag, 1995.]]
[46]
Z. Wu and R. Leahy, "An Optimal Graph Theoretic Approach to Data Clustering: Theory and Its Application to Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1101-1113, Nov. 1993.]]

Cited By

View all
  • (2024)Digital 3D Reconstruction of Ancient Chinese Great Wild Goose Pagoda by TLS Point Cloud Hierarchical RegistrationJournal on Computing and Cultural Heritage 10.1145/363993217:2(1-16)Online publication date: 26-Mar-2024
  • (2024)A Hardware Accelerator for the Semi-Global Matching Stereo Algorithm: An Efficient Implementation for the Stratix V and Zynq UltraScale+ FPGA TechnologyACM Transactions on Reconfigurable Technology and Systems10.1145/361586917:1(1-25)Online publication date: 27-Jan-2024
  • (2024)Visual-Preserving Mesh RepairIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.334882930:9(6586-6597)Online publication date: 1-Jan-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE Transactions on Pattern Analysis and Machine Intelligence
IEEE Transactions on Pattern Analysis and Machine Intelligence  Volume 23, Issue 11
November 2001
128 pages

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 November 2001

Author Tags

  1. Energy minimization
  2. Markov Random Fields
  3. Potts model
  4. early vision
  5. graph algorithms
  6. image restoration
  7. maximum flow
  8. minimum cut
  9. motion
  10. multiway cut.
  11. stereo

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Digital 3D Reconstruction of Ancient Chinese Great Wild Goose Pagoda by TLS Point Cloud Hierarchical RegistrationJournal on Computing and Cultural Heritage 10.1145/363993217:2(1-16)Online publication date: 26-Mar-2024
  • (2024)A Hardware Accelerator for the Semi-Global Matching Stereo Algorithm: An Efficient Implementation for the Stratix V and Zynq UltraScale+ FPGA TechnologyACM Transactions on Reconfigurable Technology and Systems10.1145/361586917:1(1-25)Online publication date: 27-Jan-2024
  • (2024)Visual-Preserving Mesh RepairIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.334882930:9(6586-6597)Online publication date: 1-Jan-2024
  • (2024)Stylizing Ribbons: Computing Surface Contours With Temporally Coherent OrientationsIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.330464130:8(5623-5634)Online publication date: 1-Aug-2024
  • (2024)Polarimetric Helmholtz StereopsisIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.335710046:7(4597-4611)Online publication date: 25-Jan-2024
  • (2024)Regularized Loss With Hyperparameter Estimation for Weakly Supervised Single Class SegmentationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.335045046:5(3923-3937)Online publication date: 5-Jan-2024
  • (2024)Graphical Modeling for Multi-Source Domain AdaptationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2022.317237246:3(1727-1741)Online publication date: 1-Mar-2024
  • (2024)Semi-Supervised 3D Shape Segmentation via Self RefiningIEEE Transactions on Image Processing10.1109/TIP.2024.337420033(2044-2057)Online publication date: 12-Mar-2024
  • (2024)Automatic Quaternion-Domain Color Image StitchingIEEE Transactions on Image Processing10.1109/TIP.2024.336168833(1299-1312)Online publication date: 1-Jan-2024
  • (2024)Quantum Annealing for Computer Vision minimization problemsFuture Generation Computer Systems10.1016/j.future.2024.05.037160:C(54-64)Online publication date: 1-Nov-2024
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media