research-article

Fast Approximate Energy Minimization via Graph Cuts

Authors:

Ramin ZabihAuthors Info & Claims

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 23, Issue 11

Pages 1222 - 1239

https://doi.org/10.1109/34.969114

Published: 01 November 2001 Publication History

Abstract

Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. In this paper, we consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

References

[1]

R.K. Ahuja, T.L. Magnanti, and J.B. Orlin, Network Flows: Theory, Algorithms, and Applications. Prentice Hall, 1993.]]

Digital Library

[2]

A. Amini, T. Weymouth, and R. Jam, 'Using Dynamic Programming for Solving Variational Problems in Vision," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 9, pp. 855-867, Sept. 1990.]]

Digital Library

[3]

S.A. Barker and P.J.W. Rayner, "Unsupervised Image Segmentation," Proc. IEEE Int'l Conf. Acoustics, Speech and Signal Processing, vol. 5, pp. 2757-2760,1998.]]

[4]

J. Besag, "On the Statistical Analysis of Dirty Pictures," (with discussion), J. Royal Statistical Soc., Series B, vol. 48, no. 3, pp. 259302, 1986.]]

[5]

S. Birchfield and C. Tomasi, "Depth Discontinuities by Pixel-toPixel Stereo," Int'l J. Computer Vision, vol. 35, no. 3, pp. 1-25, Dec. 1999.]]

Digital Library

[6]

S. Birchfield and C. Tomasi, "A Pixel Dissimilarity Measure that Is Insensitive to Image Sampling," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 4, pp. 401406, Apr. 1998.]]

Digital Library

[7]

S. Birchfietd, "Depth and Motion Discontinuities," PhD thesis, Stanford Univ., June 1999. Available from http://vision. stanford.edu/-.birch/publications/. {8) A. Blake and A. Zisserman, Visual Reconstruction. MIT Press, 1987.]]

Digital Library

[8]

A. Blake and A. Zisserman, Visual Reconstruction. MIT Press, 1987.]]

Digital Library

[9]

A. Blake, "Comparison of the Efficiency of Deterministic and Stochastic Algorithms for Visual Reconstruction," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 1, pp. 2-12, Jan. 1989.]]

Digital Library

[10]

Y. Boykov and V. Kolmogorov, "An Experimental Comparison of Mm-Cut/Max-Flow Algorithms for Energy Minimization in Vision," Proc. Int'l Workshop Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 359-374, Sept. 2001.]]

Digital Library

[11]

I Boykov, 0. Veksler, and R. Zabih, "Markov Random Fields with Efficient Approximations." Poor. IEEE Conf. Computer Vision and Pattern Recognition, pp. 648-655, 1998.]]

Digital Library

[12]

P.8. Chou and C.M. Brown, "The Theory and Practice of Bayesian Image Labeling," Intl J. Computer Vision, vol. 4, no. 3, pp. 185-210, 1990.]]

Digital Library

[13]

E. Dahlhaus, D.S. Johnson, CU. Papadimitriou, P.O. Seymour, and M. Yannakalcis, "The Complexity of Multiway Cuts," ACM Symp. Theory of Computing, pp. 241-251, 1992.]]

Digital Library

[14]

P. Ferrari, A. Frigessi, and P. de So, "Fast Approximate Maximum A Posteriori Restoration of Multicolour Images," J. Royal Statistical Soc., Series B, vol. 57, no. 3, pp. 485-500, 1995.]]

[15]

L. Ford and D. Fulkerson, Flows in Networks. Princeton Univ. Press, 1%2. {16) Y. Gdalyahu, D. Weinshall, and M. Werman, "Self-Organization in Vision: Stochastic Clustering for Image Segmentation, Perceptual Grouping, and Image Database Organization," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 10, pp. 10531074, Oct. 2001.]]

Digital Library

[16]

0. Geiger and A. Yuille, "A Common Framework for Image Segmentation," Int'l J. Computer Vision, vol. 6, no. 3, pp. 227-243, 1991.]]

Digital Library

[17]

D. Geman, S. Geman, C. Graffigne, and P. Dong, "Boundary Detection by Constrained Optimization," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 609-628, July 1990.]]

Digital Library

[18]

5. Geman and D. Geman, "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, pp. 721741, 1984.]]

[19]

0. Greig, B. Porteous, and A. Seheult, "Exact Maximum A Posteriori Estimation for Binary Images," /. Royal Statistical Soc., Series B, vol. 51, no. 2, pp. 271279, 1989.]]

[20]

W.E.L. Grimson and T. Pavlidis, "Discontinuity Detection for Visual Surface Reconstruction," Computer Vision, Graphics and Image Processing, vol. 30, pp. 316-330, 1985.]]

[21]

B.K.P. Horn and B. Schunic, "Determining Optical Flow," Artificial Intelligence, vol. 17, pp. 185-203, 1981.]]

Digital Library

[22]

R Hummel and S. Zucker, "On the Foundations of Relaxation Labeling Processes," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 5, pp. 267287, 1983.]]

[23]

H. lshikawa and D. Geiger, "Occlusions, Discontinuities, and Epipolar Lines in Stereo," Proc. European Conf. Computer Vision, pp. 232-248, 1998.]]

Digital Library

[24]

H. Ishikawa, "Global Optimization Using Embedded Graphs," PhD thesis, New York Univ., May 2000. Available from http://csl.cs.nyu.edu/phstudents/ishikawa/index.html.]]

Digital Library

[25]

M. ICass, A. Witkmn, and 0. Terzopoulos, "Snakes: Active Contour Models," Int'l J. Computer Vision, vol. 1, no. 4, pp. 321-331, 1987.]]

[26]

J. Kleinberg and E. Tardos, "Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields," Proc. IEEE Symp. Foundations of Computer Science, pp. 14-24, 1999.]]

Digital Library

[27]

V. lColmogorov and R. Zabih, "Computing Visual Correspondence with Occlusions via Graph Cuts," Proc. Int'l Conf. Computer Vision, vol. II, pp. 508-515, 2001.]]

[28]

D. Lee and T. Pavlidis, "One Dimensional Regularization with Discontinuities,' IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 10, no. 6, pp. 822-829, Nov. 1988. p]]

Digital Library

[29]

S. Li, Markon Random Field Modeling in Computer Vision. SpringerVerlag, 1995.]]

Digital Library

[30]

G. Parisi, Statistical Field Theory. Reading, Mass.: Addison-Wesley, 1988]]

[31]

M. Pelillo, "The Dynamics of Nonlinear Relaxation Labeling Processes," f. Math. Imaging and Vision, vol. 7, pp. 309-323, 1997]]

Digital Library

[32]

T. Poggio, E. Gamble, and J: Little, ."Parallel Integration of Vision Modules," Science, vol. 242, pp. 436-440, Oct. 1988. See also E. Gamble and T. Poggio, MIT Al Memo 970.]]

[33]

1 1. Poggio, V. Torre, and C. Koch, "Computational Vision and Regularization Theory," Nature, vol. 317, pp. 314-319, 1985.]]

Digital Library

[34]

It Potts, "Some Generalized Order-Disorder Transformation," Proc. Cambridge Philosophical Soc., vol. 48, pp. 106-109, 1952.]]

[35]

A. Rosenfeld, R.A. Hummel, and S.W. Zucker, "Scene Labeling by Relaxation Operations," IEEE Trans. Systems, Man, and Cybernetics, vol. 6, no. 6, pp. 420-433, June 1976.]]

[36]

5. Roy and I. Cox, "A Maximum-Flow Formulation of the ti-Camera Stereo Correspondence Problem," Proc. Int'l Conf. Computer Vision, pp. 492-499, 1998.]]

Digital Library

[37]

J. Shi and J. Malik, "Normalized Cuts and Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug. 2000.]]

Digital Library

[38]

RH. Swendson and J. Wang, "Nonuniversal Critical Dynamics in Monte Carlo Simulations," Physical Rev. Letters, vol. 58, no. 2, pp. 86-88, 1987.]]

[39]

R. Szeliski and R. Zabih, "An Experimental Comparison of Stereo Algorithms," Vision Algorithms: Theory and Practice, B. Triggs, A. Zisserman, and R. Szeliski, eds., pp. 1-19, Sept. 1999.]]

Digital Library

[40]

R.S. Szeliski, "Bayesian Modeling of Uncertainty in Low-Level Vision," Int'l J. Computer Vision, vol. 5, no. 3, pp. 271-302, Dec. 1990.]]

Digital Library

[41]

D. Terzopoulos, 'Regularization of Inverse Visual Problems Involving Discontinuities," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 8, no. 4, pp. 413-424, 1986.]]

Digital Library

[42]

0. Veksler, "Efficient Graph-Based Energy Minimization Methods in Computer Vision," PhD thesis, Cornell Univ., July 1999. Available from www.neci.nj.nec.com/homepages/olga.]]

Digital Library

[43]

0. Veksler, "Image Segmentation by Nested Cuts," Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 339-344, 2000.]]

[44]

Y. Weiss and E.H. Adelson, "A Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of Models," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 321-326, 1996.]]

Digital Library

[45]

G. Winder, Image Analysis, Random Fields and Dynamic Monte Carlo Methods. SpringerVerlag, 1995.]]

[46]

Z. Wu and R. Leahy, "An Optimal Graph Theoretic Approach to Data Clustering: Theory and Its Application to Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1101-1113, Nov. 1993.]]

Digital Library

Cited By

Lin XXue BWang X(2024)Digital 3D Reconstruction of Ancient Chinese Great Wild Goose Pagoda by TLS Point Cloud Hierarchical RegistrationJournal on Computing and Cultural Heritage 10.1145/363993217:2(1-16)Online publication date: 26-Mar-2024
https://dl.acm.org/doi/10.1145/3639932
Kalomiros JVourvoulakis JVologiannidis S(2024)A Hardware Accelerator for the Semi-Global Matching Stereo Algorithm: An Efficient Implementation for the Stratix V and Zynq UltraScale+ FPGA TechnologyACM Transactions on Reconfigurable Technology and Systems10.1145/361586917:1(1-25)Online publication date: 27-Jan-2024
https://dl.acm.org/doi/10.1145/3615869
Zheng ZGao XPan ZLi WWang PWang GWu K(2024)Visual-Preserving Mesh RepairIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.334882930:9(6586-6597)Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TVCG.2023.3348829
Show More Cited By

Index Terms

Fast Approximate Energy Minimization via Graph Cuts
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
2. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms

Recommendations

Approximate Labeling via Graph Cuts Based on Linear Programming

A new framework is presented for both understanding and developing graph-cut-based combinatorial algorithms suitable for the approximate optimization of a very wide class of Markov Random Fields (MRFs) that are frequently encountered in computer vision. ...
An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision

After [15], [31], [19], [8], [25], [5], minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-...
Minimization of monotonically levelable higher order MRF energies via graph cuts

A feature of minimizing images of submodular binary Markov random field (MRF) energies is introduced. Using this novel feature, the collection of minimizing images of levels of higher order, monotonically levelable multilabel MRF energies is shown to ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE Transactions on Pattern Analysis and Machine Intelligence

IEEE Transactions on Pattern Analysis and Machine Intelligence Volume 23, Issue 11

November 2001

128 pages

ISSN:0162-8828

Issue’s Table of Contents

Copyright © Copyright © 2001 IEEE. All Rights Reserved.

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 November 2001

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1,411
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 16 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Lin XXue BWang X(2024)Digital 3D Reconstruction of Ancient Chinese Great Wild Goose Pagoda by TLS Point Cloud Hierarchical RegistrationJournal on Computing and Cultural Heritage 10.1145/363993217:2(1-16)Online publication date: 26-Mar-2024
https://dl.acm.org/doi/10.1145/3639932
Kalomiros JVourvoulakis JVologiannidis S(2024)A Hardware Accelerator for the Semi-Global Matching Stereo Algorithm: An Efficient Implementation for the Stratix V and Zynq UltraScale+ FPGA TechnologyACM Transactions on Reconfigurable Technology and Systems10.1145/361586917:1(1-25)Online publication date: 27-Jan-2024
https://dl.acm.org/doi/10.1145/3615869
Zheng ZGao XPan ZLi WWang PWang GWu K(2024)Visual-Preserving Mesh RepairIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.334882930:9(6586-6597)Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TVCG.2023.3348829
Willett Nde Goes FFleischer KMeyer MBurrows C(2024)Stylizing Ribbons: Computing Surface Contours With Temporally Coherent OrientationsIEEE Transactions on Visualization and Computer Graphics10.1109/TVCG.2023.330464130:8(5623-5634)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.1109/TVCG.2023.3304641
Ding YJi YChen ZZhou MKang SYe J(2024)Polarimetric Helmholtz StereopsisIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.335710046:7(4597-4611)Online publication date: 25-Jan-2024
https://dl.acm.org/doi/10.1109/TPAMI.2024.3357100
Ji ZVeksler O(2024)Regularized Loss With Hyperparameter Estimation for Weakly Supervised Single Class SegmentationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2024.335045046:5(3923-3937)Online publication date: 5-Jan-2024
https://dl.acm.org/doi/10.1109/TPAMI.2024.3350450
Xu MWang HNi B(2024)Graphical Modeling for Multi-Source Domain AdaptationIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2022.317237246:3(1727-1741)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.1109/TPAMI.2022.3172372
Shu ZWu TShen JXin SLiu L(2024)Semi-Supervised 3D Shape Segmentation via Self RefiningIEEE Transactions on Image Processing10.1109/TIP.2024.337420033(2044-2057)Online publication date: 12-Mar-2024
https://dl.acm.org/doi/10.1109/TIP.2024.3374200
Li JZhou Y(2024)Automatic Quaternion-Domain Color Image StitchingIEEE Transactions on Image Processing10.1109/TIP.2024.336168833(1299-1312)Online publication date: 1-Jan-2024
https://dl.acm.org/doi/10.1109/TIP.2024.3361688
Heidari SDinneen MDelmas P(2024)Quantum Annealing for Computer Vision minimization problemsFuture Generation Computer Systems10.1016/j.future.2024.05.037160:C(54-64)Online publication date: 1-Nov-2024
https://dl.acm.org/doi/10.1016/j.future.2024.05.037
Show More Cited By

View Options

View options

Media

Figures

Other

Tables

View Issue’s Table of Contents