
3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints

Authors: Fred Rothganger, Svetlana Lazebnik, Cordelia Schmid, Jean Ponce

International Journal of Computer Vision, Volume 66, Issue 3

Pages 231 - 259

https://doi.org/10.1007/s11263-005-3674-1

Published: 01 March 2006

Abstract

This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented.
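As a rough illustration of what a "normalized representation" of patch appearance can look like, the sketch below describes an elliptical image patch by a 2x2 symmetric positive-definite shape matrix and maps it to the unit circle. This is an assumption made for exposition and is not taken from the article: the helper names, the sample matrices, and the choice of the inverse matrix square root as the normalizer are all hypothetical. The property it checks is standard linear algebra: if the same patch is seen in a second view related to the first by an affine map, the two normalized patches differ only by a rotation.

```python
import math

# Illustrative sketch only; names and sample values are hypothetical.

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    """Transpose of a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def inverse(m):
    """Inverse of a non-singular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def sqrt_spd(m):
    """Principal square root of a symmetric positive-definite 2x2 matrix,
    using the closed form (M + sqrt(det M) I) / sqrt(trace M + 2 sqrt(det M))."""
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

def normalizer(shape):
    """Linear map sending the ellipse x^T shape^-1 x = 1 to the unit circle."""
    return inverse(sqrt_spd(shape))

def residual_between_views(shape, affine):
    """Map relating the normalized patch in view 1 to the one in view 2.

    In view 2 the patch shape is affine * shape * affine^T. The residual is
    normalizer(shape2) * affine * normalizer(shape)^-1, and the inverse of
    normalizer(shape) is simply sqrt_spd(shape).
    """
    shape2 = matmul(matmul(affine, shape), transpose(affine))
    return matmul(matmul(normalizer(shape2), affine), sqrt_spd(shape))

SHAPE = [[4.0, 1.0], [1.0, 2.0]]     # hypothetical patch shape in view 1
AFFINE = [[1.5, 0.4], [-0.3, 0.8]]   # hypothetical affine change of view
R = residual_between_views(SHAPE, AFFINE)
RRT = matmul(R, transpose(R))        # identity (up to rounding) if R is a rotation
```

Because `RRT` comes out as the identity up to rounding, the affine distortion between the two views is absorbed by the normalization, leaving only an in-plane rotation to resolve when comparing appearance.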



Index Terms

3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
    2. Shape modeling
2. Theory of computation
  1. Randomness, geometry and discrete structures
    1. Computational geometry


Published In

International Journal of Computer Vision, Volume 66, Issue 3

March 2006

103 pages

ISSN: 0920-5691

Copyright © 2006 Springer Science + Business Media, Inc.

Publisher

Kluwer Academic Publishers

United States


