Nothing Special   »   [go: up one dir, main page]

Academia.eduAcademia.edu
~ zyxwvutsrqponmlkj zyxwvutsrqpon zyxwvutsrqponml zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA 536 [ 1I ] [I21 (131 [I41 [IS] [ 161 [17] [I81 [19] [20] [21] [22] 1231 IEEE Trans. Pattern Anal. Machine Intell.. vol. PAMI-9, no. 1. pp. 168-176, 1987. M. Oshima and S . Sato, ”Linear algebra,’’ Gakuzyututnsho-Syuppansya, Tokyo, Japan, 1968 (in Japanese). R . J. Scalkoff and E. S . McVey, “A model and tracking algorithm for a class of video targets,” IEEE Trans. Parrern Anal. Machine Intell., vol. PAMI-4, no. 1, pp. 2-10,-1982. M. Subbarao and A. M. Waxman, “Closed form solutions to image flow equations for planar surfaces in motion,” Comput. Vision Gruphics Image Processing, vol. 36, pp. 208-228, 1986. W. B. Thompson and S . T . Barnard, “Lower-level estimation and interpretation of visual motion,” Computer, vol. 14, no. 8, pp. 2028. 1981. R . Y. Tsai and T . S . Huang, “Estimating three-dimensional motion parameters of a rigid planar patch,” IEEE Trans. Acousr., Speech, Signal Processing, vol. ASSP-29, no. 6 , pp. 1147-1 152, 1981. -, “Uniqueness and estimation of three-dimensional motion parameters of a rigid objects with curved surfaces,” IEEE Trans. Partern Anal. Machine Intell.. vol. PAMI-6, no. 1, pp. 13-26, 1984. H. Tsukune, S. Sakane, T. Matsushita, and M. Kakikura, “On the image processing system view,” Bull. Electrotech. Lah., vol. 49, no. 4. pp. 299-315. 1985 (in Japanese). S . Ullnian, The Interpretation of Visual Motion. Cambridge, MA: MIT Press. 1979. M. Yamamoto, “A method of 3-dimensional velocity measurement.” Japanese Patent Appl. No. 197526/1984, 1984. -, “Direct estimation of three-dimensional motion parameters from image sequence and depth,” Trans. Inst. Electron. Commun. Eng. Japnn, vol. J68-D, no. 4 , pp. 562-569. 1985 (in Japanese). -, “Three-dimensional motion analysis of scene containing multiple moving objects from image sequence and depth,” Trans. Ins:. Electron. Commun. Eng. Japan, vol. J69-D, no. 5 , pp. 785-793, 1986 (in Japanese). -, “The image sequence analysis of three-dimensional dynamic scenes,” Electrotech. Lab., Ibaraki. Japan, Rep. 893, 1988. S. W. Zucker and R . A. Hummel, “A three dimensional edge operator,” IEEE Trans. Purtern Anal. Machine lnrell,, vol. PAMI-3. no. 3 , pp. 324-331, 1981. views are sufficient to determine motion and structure of the fourpoint rigid configuration. Furthermore, he proposed a nonlinear algorithm for finding the solution. More recently, Aloimonos and Brown [2] have reexamined the problem. In particular, they have studied the two-view as well as the three-view case. In this correspondence, we shall present two main results. 1) It is impossible to determine motion and structure uniquely from two orthographic views no matter how many point correspondences one may have. In fact, the number of solutions is uncountably infinite. 2) A linear algorithm for finding motion and structure in the threeorthographic-view case is presented. 11. PROBLEM FORMULATION: THE TWO-VIEWCASE We assume that the image plane is stationary and that two orthographic views at time instants rl and t 2 , respectively, are taken of a rigid object moving in the 3-D object space. By processing the two views, we intend to determine the motion and structure of the 3-D object. We shall use the following notation. Let ( x , y , z ) be the object space coordinates, and ( X , Y ) the image space coordinates. The X and Y axis coincide with the x and y axis (in particular, the origins of the x-y-z coordinate system and the X - Y coordinate system coincide). Let zyxw (x, y, z ) = object-space coordinates of a point P on the rigid y’, 2 ’ ) = (.XI, (X, Y ) ( X ’ , Y‘) = = object at t I object-space coordinates of the same point P at t2 image-space coordinates of the point P at t l image-space coordinates of the point P at t 2 . zy Then = R [ j + T where zyxwvutsrqpo Motion and Structure from Orthographic Projections T. S . HUANG AND C . H . LEE Abstract-Ullman’s classical results on motionlstructure from orthographic views are revisited. Specifically, we present two results: 1) two orthographic views allow an uncountably infinite number of solutions to motionlstructure of a rigid body, and 2) a linear algorithm for solving motion/structure from four point correspondences over three views. Index Terms-Motion analysis, orthographic projections, stucture from motion. is a rotation matrix and (3) zyxwvu zyxw is a translation vector. The problem we are trying to solve is: given N image point correspondences ++ I. INTRODUCTION In his classical book on motion [ I ] , Ullman showed that for the orthographic projection case, four point correspondences over three Manuscript received December 10, 1986; revised September 28, 1988. Recommended for acceptance by W. B. Thompson. This work was supported in part by Battelle under the Scientific Service Program, Contract DAAG29-81-D-0100 and in part by the National Science Foundation under Grant IRI-8605400. T . S : Huang is with the Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801. C . H. Lee is with the Department of Computer Science, Purdue University, West Lafayette, IN 47907. IEEE Log Number 8826291. r); i = 1, 2 , . * . , N ( X , , Y,) (X,’, determine R, T , and (xl, y I , zl), i = 1, 2, . . . , N . Note that with orthographic projections X=x,Y=j X’ = y’ = y‘ (4) and therefore it is obvious that Az can never be determined and we can hope to determine the 2,’s to only within an unknown additive constant. What we are trying to determine are then: R; A x , Ay; x,, y l , z , - z , ; i = 1, 2 , * . . , N . 111. TWO-VIEWNONUNIQUENESS zyxwvuts We show that no matter how large N is, the problem posed at the end of Section I1 allows an uncountably infinite number of solutions. 0162-8828/89/0500-0536$01 .OO 0 1989 IEEE zyxwvutsrq zyxwvuts zyxwvutsr zyxwvu zyxwvutsrq zyxwvutsr zyxwvutsrq 531 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 11, NO. 5 , MAY 1989 First of all, we can decompose the rigid body motion from t , to zI ) followed by a translation Note that (1 1) follows from the fact that t2 as a rotation R around the point ( x , , yI, (5) Second, to determine R and ( x , , y , , z , - zl), we can, for simplicity and without loss of generality, move ( x , y , z ) and ( x ’ , y ’ , y ’ ) to the origin: then the other points are related simply by a rotation (from t , to 22): It can be shown readily that for any of these infinitely many solutions, we can construct an R, and therefore our main result is proven. For each solution R, we can use (8) to find I , ; i = 2, 3, 4. And z,’ can be found by I: = r3,X, + r32Y,+ r3,z,. (13) We remark that as a byproduct of the above derivation, we have a way of testing whether a set of proposed point correspondences is legitimate. Let us assume that we have four points over two views, but the correspondences among the points are unknown. Potentially, there are 4! = 24 different mappings between the two four-point sets. For each mapping, we can go ahead and solve (10) to get r I 3 , r23, r31rr32 to within a scale factor. The mapping is permissible (i.e., it will lead to consistent solutions to motion and structure) if and only if To summarize, we have shown in this section that no matter how many point correspondences one may have, two orthographic views allow an uncountably infinite number of solutions to motion and structure of rigid objects. Furthermore, we have derived a simple test for the legitimacy of potential point correspondences. The nonuniqueness result is contained implicitly in Ullman’s work [l], and can probably be established by using a simple counting argument as in [2]. Now, we proceed to try to determine R. From (7) and (4), X,’ = rllXj + r I 2 Y i+ r I 3 z , + r22Y, + r 2 3 z s . Y: = r21Xj zyxwvutsrqp Or, in matrix notation, To eliminate I,,we premultiply both sides of (8) by [ r 2 3 ,- r 1 3 ] to get Using the identities rllr23 - r21r13 = -r32 r12123 - r22r13 = 131 (9) which follow from the fact that each row of R is the cross product of the other two, we get I IV. PROBLEMSTATEMENT: THE FOUR POINTSOVERTHREEVIEW CASE We assume that the image plane is stationary and that three orthographic views at time instants t l , t 2 , and f 3 , respectively, are taken of a rigid body moving in the 3-D object space. By processing the three views, we intend to find the object structure and the motions from tl to t2 and from t2 to t3. Ullman [ 11 proved that four point correspondences over three views yield a unique solution to motion and structure (up to a reflection). In this correspondence, we shall describe a linear algorithm for obtaining the solution. We shall use the same notations as those in Section 11, except that coordinates at t3 will be double primed, and that the rotation matrix from t2 to t3 is denoted by zyxwvutsrqp r23x,’ - r13r + r 3 2 x ~ - r31 y, = I (10) which is linear and homogeneous in the four unknowns r I 3 ,r Z 3 r, 3 , , and r32.From N point correspondences, we get ( N - 1 ) such equations. Therefore, if N 2 4 , and assuming that the points are not coplanar, we can solve the set of equations (10) to obtain r I 3 , rz3, r31,and r j 2to within a scale factor. In fact, in the absence of noise, four point correspondences are sufficient. Additional point correspondences are superfluous. The important point to notice here is that (10) contains all the information we can get on R from the point correspondence ( X , , Y i ) (X:, Therefore, no matter how many point correspondences we may have, the only thing we can determine about R is the values of rI3,r23rr 3 1 ,r32to within a scale factor. Obviously, by changing the scale factor, we can get an uncountably infinite number of solutions of ( r I 3 ,r Z 3 ,r31,r12) which satisfies r). + r:3 r53 = r:, which implies, in particular, + r& < 1 (11) s32 s33 and the rotation matrix from t l to t3 by Thus, W = SR. Assume that we are given four point correspondences over three views: (X,, Y,) ++ (x;,Y ) (x;,Yt”), i = 1, 2, 3. 4 Again, for simplicity and without loss of generality, we can let 538 zyxw zyxwvutsrqponmlkjih IEEE TRANSACTIONS ON PATTERN ANALYSIS A N D MACHINE INTELLIGENCE. VOL. 11, NO. 5, MAY 1989 Then zyxwvutsrqpon zyxwvuts [i] I:;[zyxwvuts . zyxwvutsrq zyxwvutsrqp (20) and Similarly, postmultiplying both sides of (28) by w31P32 -'31/)23 i = 2, 3, 4. = (21) Thus, a / y and = - w32P31 + u32P13 - r23s31 O' C . Step 3 Next, we premultiply both sides of (27) by [ s 1 3 s 2 3 ]to get 2, 3, 4. V . A LINEARALGORITHM A . Step yields are determined if r13s32 The problem is to determine R , S, z , , i [-E;] I Using the method of Section 111, we can determine ( ~ 1 3 , rz3, r31, ~ 3 2 )to within scale factors: ri2), (sI3, sz3, s31, ~ 3 2 1 ,and ( ~ 3 ,W Z ~ . + ( d 3 + or zyxwvutsrqponm zyxwvutsrqponm r 3 2 ) = a(P13, P233 P3l3 P 3 2 ) (r13, r23, r31, (s13, s233 s31* s32) (W13r W23r W31r W 3 2 ) = p(a13, u23, u31> u32) = Y(w13, U235 m i l r 0 3 2 ) (22) (23) (24) where p , ] , U , ] , and U,] are known, but a , j3,y are unknown constants (which we assume to be nonzero) yet to be determined. If we can find a , 6, y,then from Section 111, R , S, and W can be determined. Thus, we proceed to find these unknown scale factors. B. Step 2 The only constraint we have on rewrite here as r,,, s,,, and wlj is (17), which we yields Similarly, postmultiplying both sides of (28) by Multiplying out, we get 1 I bt'33 = [S3IS321 13 + S33"33. r ~( ~~ r / y ) scan ~ ~be deterA unique solution for ( b / ~ ) and mined from (33) and (34) if I + Sf3) -(s31r13 + S 3 2 f - 2 3 ) (29) As we shall see presently, by working with (27) and (28), we can determine a , 0, y. Then, we can find R and S. Premultiplying both sides of (27) by Isl3, -s13], we get (Sf3 -(s31r13 + s32r23) (41 + 4 3 ) I # 0. (35) have already been obtained, we can deThen, since a / y and termine r33and sj3 uniquely. We note that after some manipulation, it can be readily shown that condition (35) is equivalent to (32). D. Step 4 From or r:3 + rs3 + ri3 = 1 and (22), we have a2(P:3 + = 1 - 4 3 539 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. I I . NO. 5, MAY 1989 or zyxwvutsrqpon zyxwvutsrq zyxwvu zyxwvuts zyxwvutsrqpon zyxwvuts zyxwvutsrq zyxwvutsrqp zyxwvu (37) over the first two frames (at t l and t2).However, the point correspondences over these first two frames impose no constraints at all on the q’s. Therefore, the problem of finding motion/structure from t2 to t3 is decoupled from that from t l to t 2 . The former, being a two-frame problem, has infinitely many solutions as proven in Section 111. (38) C. Case 3: (42)Is Valid Thus, we have two solutions to a : a = where zyx +a0 In this case, both Step 2) and Step 3) [but not Step l ) ] of our algorithm fail. However, motion and structure can be determined by alternative methods, one of which is described in [3]. (39) 0,y ) . Let us denote the values of VII. THREE-POINT CASE In the remainder of this correspondence, we discuss briefly the case of three point correspondences. Given three point correspondences over two orthographic views, the technique of Section 111 still applies. However, (10) yields only two equations. Assuming these two equations are linearly independent, we can solve them and get E. Step 5 (rI3,r23r r 3 1 9 r32) = au + bv (43) where U and v are known four-vectors and a, b are unknown scalars. Substituting into the constraint (14), we get a second-order homogeneous polynomial equation in a and b . Solving this equation, we get two solutions for (bla).Assuming both of these solutions are real, we have from (43) two families of solutions for From (30) and (31), it follows that we have two solutions to ( a , and y corresponding to a. by Po and yo, respectively. Then the two solutions are ( a o ,Po, yo) and ( - a o , -PO, -YO). At this point, we have obtained two solutions to r 1 3 ,r2s,r 3 , ,rs2, r33; s13,s~~~s3,, sj2, s33.For each solution, we can determine the remaining elements of R and S by the procedure to be described below. To find rlI and r I 2 , we use the following two properties of the rotation matrix: (r13,r23, r31v r32): (r13, r23> rS1, ~ Equations (40) and (41) represent two linear equations with two unknowns ( r l and l rI2),and the coefficient matrix has a determinant which we assume to be nonzero. Therefore, we have a unique solution for r l l ,rI2. Similarly, a unique solution for rZ1,r22 can be found. Obviously, s l lsrI z ,s~~~ sz2can be found in an entirely analogous manner. F. Step 6 r3Z) = a ’ ( P ; 3 , d 3 3 d l 3 d 2 ) (44) ( r I 3 ,r2,, r 3 , , r32) = a ” ( p h , PG, pi’,, P & ) (45) where p> and p; are known and a’, a“ are unknown. If three point correspondences over three orthographic views are given, the algorithm of Section V can still be used. The difference is that now we have two families of solutions for each of ( r I 3 ,r13, r 3 1 , r32),(s13, s231 S3Ir s32), and ( w 1 3 , ~ 2 3 w, 3 1 , ~ 3 2 ) .In applying the constraint W = SR there are eight combinations. Thus, the algorithm will find eight solutions to motion. There are eight additional motion solutions corresponding to structure reflections. The total number of solutions to motion is therefore 16. The number of solutions to structure needs further elaboration. 1) First, we claim that for each three-point configuration at t l which is a structure solution, there are four motion solutions associated with it. The reason is as follows. Let Q be the three-point configuration at tl. Pick a motion solution ( R , S ) associated with it, and call the corresponding three-point configurations at t z and t 3 ,Q 2 and Q,, respectively. Let “*” denote reflection. Then Q , --t 0, Q:., 0, 0:: Q3, and Q , QT Q,* are also solutions because for a three-point configuration (which always lies in a plane), a reflection can be achieved by a rotation. 2) Second, since the total number of motion solutions is 16, the number of structure solutions at t l must be 1 6 / 4 = 4 . 3 ) Finally, since f2: is a solution if Q, is, two of the structure solutions (out of the four at t l ) must be reflections of the other two. zyxwvutsr Finally, (8) is used to determine I,; i = 2 , 3 , 4 . We note that as Ullman [ I ] pointed out, four point correspondences over three frames allow two solutions to motion and structure, and in these two solutions, the point configurations are reflections of each other with respect to the image plane. VI. DEGENERATE CASES The algorithm described in Section V fails under any of the following three conditions. 1 ) The four points under observation are coplanar in the object space. 2) R , S, or W represents a rotation around the z axis. (42) 3, r 1 3 s 3 2 - r23s31 = O. We now discuss each of these three cases. + + -+ + + VIII. SUMMARY With regard to solution uniqueness, the conclusions are as folA. Case I: The Four Points Are Coplanar lows. I ) Two orthographic views allow an uncountably infinite numThen (10) f o r i = 2, 3 , 4 are linearly dependent. Therefore, ( r I 3 , r23, r 3 , . r3*)cannot be determined to within a scale factor. In this ber of solutions to motion/structure of a rigid object. 2) Four point correspondences over three views ensure a unique case, as pointed out by Ullman [ l ] , four point correspondences solution to motionistructure (plus a reflection). over three frames allow infinitely many solutions. 3) Three point correspondences over three views yield 16 soB. Case 2: One of the Rotations Is Around the z Axis lutions to motion and four solutions to structure (two plus their Without loss of generality, let us assume that R represents a ro- reflections). A linear algorithm is given for finding the solutions in Cases 2) tation around the z axis. Then r I 3 = rI3 = r X I = r32 = 0. In this case, R can be easily determined by two point correspondences and 3 ) . zyxw zyxwvutsrqponm zyxwvutsrqponmlkjihgfe zyxwvutsrqpo IEEE TRANSACTIONS ON PATTERN ANALYSIS A N D MACHINE INTELLIGENCE, VOL. I I , NO. 5 . MAY 1989 540 REFERENCES [ I ] S . Ullman, The Interpreration of Visual Motion. Cambridge, MA: M.I.T. Press, 1979. 12) J . Aloimonos and C. M. Brown, “Perception of structure from motion,” in Proc. IEEE Con5 Compur. Vision and Pattern Recognirion, Miami Beach, FL, June 1986, pp. 510-517. [3] T. S . Huang and C. H. Lee, “Motion and structure from orthographic projections,” Coordinated Sci. Lab., Univ. Illinois, Urbana, Tech. Note ISP-101, Dec. 1986. A Comment on “On Kineopsis and Computation of Structure and Motion” AMAR MITICHE Abstract-This correspondence brings a correction to a previously published PAM1 correspondence by showing that five points rather than four are required to determine structure and motion from optical flow using the distance invariance method. We require that the number of equations be equal or greater than the number of equations be equal or greater than the number of unknowns, i.e., 7n 2 6n + 5. Which gives n 2 5. This counting argument is due to S . Maybank. It is an indication that the system of rigidity equations written for four points in [ 11 is underconstrained. Indeed, let four points be given, called P I ,P 2 , P,, P4. Consider the six equations such as (7) obtained from these four points [ 11. Assume that five of these six equations are satisfied. For clarity, and without loss of generality, let the sixth equation be the one written for the pair (PI,P 2 ) . Given the meaning of (7), the equations written for pairs ( P I , P 3 ) , ( P I , P 4 ) , and ( P 3 . P 4 ) indicate that body B , = { P,,P,, P , 1 is moving according to a rigid motion M I . Similarly, the equations involving pairs ( P 2 , P 3 ) , ( P 2 ,P 4 ) , and (P,,P4)indicate that body B2 = { P 2 , P,, P 4 } moves according to a rigid motion M 2 . Are motions MI and M , identical? If they are, then the sixth equation [the one written for pair ( P I ,P,)]will be satisfied and the six equations will therefore be dependent. Let MI be described by rotational component (Ql”,Q ; ’ ) , Q:”) and translational component ( t i ” , t i l ’ , t\”). Similarly, let M2 be described by ( Q : ” , 0i2’)and ( r:2’, t i 2 ’ , t i 2 ’ ) . Let, for i = 1, 3, At, = tl” - t;’), and AQ, = Q:” - Q ; ” . First, note that because equation (7) in [ I ] embodies the substitution of equation (6) in [ l ] (this is really the key point), we have zyxwvutsr zyxwvutsrqpo zyxwvutsrqponm zyxwvutsr zyxwvutsr Index Terms-Motion, optical flow, structure. I. INTRODUCTION To determine structure from rigid motion using optical flow, the principle of invariance of distances has been proposed in [ 11 which led to a system of equations that seemed to require only four points to solve. This correspondence brings a correction, showing that five points rather than four are required with the distance invariance formulation, Qi2), The velocity of P, E B , according to MI is 11. STRUCTURE A N D MOTIONFROM OPTICAL FLOW USING DISTANCEINVARIANCE Suppose n points move rigidly in space with translational velocity T, and rotational velocity Q taken about an axis through the center of projection. Let the ith point be at position P, = ( X ) , Y , , Z , ) , with velocity P,’ = ( X , ’, ,Z,‘). The movement of the ith point in space gives rise to a flow vector ( U , ,v i ) at the point ( x i , y , ) on the projection plane. We have the following basic equations: P,’ = T + Q x P, = + o;l)Y, (3c) z; = t y - Q:”x,. (3a) But because P3 E B 2 , its velocity can also be written according to M2: x; = t y + Q:”z,- Q:” Y, Y; = t i 2 ) + Q:2’x, -Qyz, Z ; = rh2’ + n12)y3- ni2)x3. (4a) (4b) n). (1) The first equation of (1) is a vector equation, and the remaining four are scalar. Let us compare the number of unknowns to the number of equations. There are seven equations associated with each point, giving 7n equations in all. The unknown quantities are P , , P: ( I 5 i I n ) , Q, and the direction of T, since 7‘can be determined only up to a scale factor. The total number of unknowns is thus: + 3n + 3 + 2 (4c 1 From (3a) and (4a), and from (3b) and (4b). considering ( 2 ) , we obtain Y,Zi x; = u,z, + x,z: = vizi + y,Z: 3n (3b) zyxwvutsrq x,= x j z , y, x; = ri” + Q:”z,- Q:”Y, Y; = r;” + Q:”x, - nyz, (1 Ii 5 = 6n +5 Manuscript received July 2 3 , 1987; revised August 2, 1988. Recommended for acceptance by W . B . Thompson. The author is with INRS-Telecommunications, 3 Place du Commerce, Ile-des-Soeurs, P.Q. H3E 1H6, Canada. IEEE Log Number 8927071. AR,Y, = 0 At, - At? + AQ,X, = 0. (5a) (5b) If one considers point P, then similar expressions can be written: AQ3Y4 = 0 At, - At2 + AQ,X, 0. (6a) (6b) Equations (5a) and (6a) form a homogeneous system of linear equations the solution of which is At, = 0, AQ, = 0 unless Y, = Y,. Similarly, (5b) and (6b) would yield At, = 0, AQ, = 0 unless X , = X , . Therefore, except for pathological cases, the sixth equation is satisfied when the other five are satisfied, and the system is underconstrained. At least five points are therefore needed [2], [3]. Although one can “patch” an underconstrained system by finding the smallest perturbation in an initial set of values T , Q necessary to obtain values compatible with the data, an underconstrained sys- 0162-8828/89/0500-0540$01.OO 0 1989 IEEE