Abstract
The Radon transform is a popular mathematical tool for shape analysis. However, it cannot handle affine deformation. Although its extended version, the trace transform, allows us to construct affine invariants, those invariants are less informative and computationally expensive, owing to the loss of the spatial relationship between trace lines and the extensive repeated computation of the transform. To address this issue, a novel line integral transform is proposed. We first use binding line pairs, which have the desirable property of being preserved under affine transform, as a reference frame to rewrite the diametrical dimension parameter of the lines in a relative manner, which makes it independent of the affine transform. Along the polar-angle dimension of the line parameters, a moment-based normalization is then conducted to reduce the affine transform to a similarity transform, which can be easily normalized by the Fourier transform. The proposed transform is not only invariant to affine transform, but also preserves the spatial relationship between line integrals, which makes it very informative. Another advantage of the proposed transform is that it is more efficient than the trace transform: conducting it once yields a 2D matrix of affine invariants, whereas conducting the trace transform once generates only a single feature, so multiple trace transforms with different functionals are needed to make the descriptors informative. The effectiveness of the proposed transform has been validated on two types of standard shape test cases: an affinely distorted contour shape dataset and a region shape dataset.
1 Introduction
Shape analysis is an active research area in the computer vision and pattern recognition community, with a large body of potential applications such as human activity recognition [28], target tracking [29], and medical image retrieval [30]. The object images captured by a camera are generally subject to various deformations. One of the typical distortions is the perspective transform, which occurs when pictures of objects are taken from arbitrary orientations. Computer vision systems are therefore expected to effectively handle the recognition of perspectively distorted shapes. Under certain circumstances, such as when the distance between the camera and the object is large enough, or when the object is thin enough to be approximated as a plane, the perspective transform can be well approximated by an affine transformation [20, 31]. In this paper, we focus on recognizing shapes distorted by affine transform.
Conducting line integrals over the image plane and transforming the image function into another 2D function of the line parameters \((\theta ,\lambda )\) is an effective approach to shape analysis. This is one of the significant applications of the well-known Radon transform [1]. The main appealing characteristic of the Radon transform is that it facilitates extracting features of the inner structure of the object, and the image function of the object can be fully reconstructed from its Radon transform.
Numerous efforts have been made to extract invariant shape features from the Radon transform for object recognition [32,33,34,35,36]. However, these methods can only handle similarity transforms (translation, rotation and scaling), which are shape-preserving and form a subset of the larger affine group. Although the affine transform is line-preserving, the complex behaviour of the line integrals of the image function under it makes it challenging to extract affine invariants from the domain of the line integral transform.
The trace transform [37] is a generalized version of the Radon transform. It extends the line integral to any 1D functional along lines (termed a trace functional), so different functionals yield different trace transforms. However, the trace transform is still sensitive to affine transform because the functionals applied along the line cannot yield invariants. To achieve affine invariants, Petrou and Kadyrov [5] proposed to further apply diametrical functionals along the dimension of \(\lambda \) and circus functionals along the dimension of \(\theta \). It is worth noting that in this method a set of functionals, comprising one trace functional, one diametrical functional and one circus functional, generates only one invariant feature. So, to make the shape descriptors informative, many triples of functionals have to be developed to yield more affine invariants. However, it is very difficult to find appropriate triples of functionals to construct the desired invariants, and the expensive computational cost also makes the approach unsuitable for real applications.
In this paper, we propose a novel integral transform that can effectively and efficiently handle affine-distorted shapes. The proposed transform has the following advantages over the trace transform: (1) It is much more discriminative than the trace-transform based method [5]. The latter uses diametrical functionals and circus functionals to achieve affine invariants, which cannot preserve the information along the diametrical and circus dimensions of the trace lines, whereas the proposed transform preserves it perfectly; (2) It yields affine invariants more efficiently than the trace transform: conducting it once provides a group of shape invariants, whereas conducting the trace transform once generates only a single feature, and multiple trace transforms with different functionals are needed to make the descriptors informative; (3) The features obtained by the proposed transform have a physical interpretation, while the features obtained from the trace transform generally have no clear physical meaning.
2 Related Work
The existing affine shape analysis methods can be categorized into two groups. One comprises contour based methods, which represent the object boundary as ordered points or parameterized curves. The other comprises region based methods, which attempt to generate affine invariants from the whole shape region.
A large body of methods attempt to model the silhouette of the shape to characterize its behavior under affine transform. A popular idea is to model the silhouette as a parameterized curve, so that mathematical tools can be used to analyze its geometric properties. The earlier work [20] applied the Fourier transform to the affine-length parameterized boundary description to eliminate its dependency on the affine transformation. Various wavelet transforms with different wavelet basis functions have also been utilized to generate affine shape invariants [22, 23].
B-spline is a kind of continuous curve representation, which makes it very suitable for shape analysis. Wang and Teoh [18] modeled the shape contour using B-splines to construct the Curvature Scale Space (CSS) image for affine invariant shape matching. Huang and Cohen [17] proposed a fast algorithm for estimating the B-spline control points and used a new class of weighted B-spline curve moments to handle the affine transformation between curves. Various algebraic curve models, such as quartic implicit polynomials [16] and conic curves [19], were proposed for extracting affine or perspective invariants. Zuliani et al. [10] used the area, centroid, and covariance of the domain enclosed by the curve to normalize the shape, removing the effect of the affine transform.
There are also many methods that treat the silhouette of the shape as a sequence of ordered points to construct shape descriptors or to build correspondences for matching shapes. Mai et al. [13] represented a shape as a sequence of chained edge points and proposed to project one shape onto the subspace spanned by the other. The two shapes are then matched by minimizing a subspace projection error. This method has a clear physical interpretation and works very fast for estimating the affine transform. Jia et al. [14] developed a new projective invariant, the characteristic number (CN), whose value is calculated on a series of five sample points along the shape contour. By varying the sample intervals, a coarse-to-fine strategy is developed for capturing both the global geometry described by projective invariants and the local contextual information. Xue et al. [25] proposed a fuzzy algorithm for aligning shapes under affine transform. This algorithm can efficiently estimate the point correspondences and the relevant affine parameters. Recently, Bryner et al. [7, 8] presented a contour-based shape analysis framework based on Riemannian geometry that is invariant to affine transform and re-parameterization of contours. Three shape spaces, landmark-affine, curve-affine and landmark-projective, are studied in their work.
Different from the contour based methods, the region based shape analysis approaches characterize the shape considering all the pixels over the shape domain. Moment invariants are popular region based shape descriptors. In earlier work, Flusser and Suk [12] introduced affine moment invariants for object recognition. Heikkilä [21] used the second and higher order moments of the image points for both recognition and alignment under affine transform. The advantage of this method is that it does not require knowing any point correspondences in advance. Yang and Cohen [26] proposed a framework for deriving a class of new global affine invariants based on a novel concept of cross-weighted moments with fractional weights.
Besides the moment descriptors, many recent works developed various strategies for region based shape recognition and matching. Domokos et al. [15] proposed a novel parametric estimation of affine deformations for planar shapes. Instead of finding correspondences between the landmarks of the template and target shapes to compute the affine parameters, they treat the affine parameter estimation as the solution of a system of polynomial equations in which all the information available in the input image is used. More recently, Ruiz et al. [24] proposed a fast and accurate affine canonicalization method for recognizing and matching planar shapes. Different from many shape analysis methods that extract invariant features for recognition, this work attempts to produce multiple canonical versions of the shape to provide a common reference frame for accurate shape comparison.
The works most relevant to our research are various transform based methods. Ruiz et al. [11] proposed a multiscale autoconvolution (MSA) transform based on a probabilistic interpretation of the image function. The MSA transform is a 2D function derived from the shape image function which can provide infinitely many affine invariant features by varying its two variables, so it can be directly applied to shape recognition. The trace transform is a generalization of the Radon transform. Petrou and Kadyrov [5] proposed to first apply several trace functionals to the original image function to obtain the same number of trace transforms; then, for each available trace transform, several diametrical functionals are used to transform it into the same number of 1D functions of the directional angle of the line. The affine invariants are finally achieved by further applying circus functionals to the available 1D functions. The number of available affine invariant features is the number of combinations of the three types of functionals used. The method can be directly used for shape recognition, and its theory can be extended to affine parameter estimation [27]. Recently, Zhang and Chu [38] developed a ray projection transform and applied it to recovering a geometric transformation and an affine lighting change between two objects.
3 Affine Theory of Line Integral
Given a 2D function f(x) and a straight line \(\ell \) with equation \(\lambda - \overrightarrow{\theta }^T \mathrm{x} = 0\), where \(\overrightarrow{\theta } = (\cos \theta ,\sin \theta )^T\) is a unit vector in the direction of the normal to the line \(\ell \) and x = \((x,y)^T \in \mathbb {R}^2\) denotes the coordinates of a point on the line \(\ell \), the integral of the function f(x) over the line \(\ell \) can be mathematically expressed as

\(\mathfrak {R}_{f}(\lambda ,\theta )=\int _{\mathbb {R}^2} f(\mathrm{x})\,\delta (\lambda -\overrightarrow{\theta }^{T}\mathrm{x})\,d\mathrm{x},\)   (1)
where \(\delta (\cdot )\) denotes a Dirac delta function. Through the line integral, the function f(x) is transformed into another 2D function \(\mathfrak {R}_{f }(\lambda , \theta )\) of line parameters \((\lambda , \theta )\) (in the later sections, we directly use the parameters \((\lambda , \theta )\) to denote a line). This is the well-known Radon transform [1] that has been widely applied for image analysis. In this section, we study the behavior of the line integral of the function f(x) under affine transform.
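As a concrete illustration, the line integral transform above can be approximated for a point-sampled binary shape by binning projections; this is only a coarse sketch (the function name, sampling grid and bin counts are illustrative choices, not the authors' implementation):

```python
import numpy as np

def radon_lines(points, n_theta=180, n_lambda=64):
    # Approximate R_f(lambda, theta) for a point-sampled binary shape by
    # accumulating the points whose projection theta^T x falls in a lambda
    # bin. A coarse sketch, not the authors' implementation; `n_theta` and
    # `n_lambda` are illustrative sampling choices.
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r = np.abs(points).sum(axis=1).max() + 1.0
    edges = np.linspace(-r, r, n_lambda + 1)
    sino = np.zeros((n_lambda, n_theta))
    for j, t in enumerate(thetas):
        proj = points @ np.array([np.cos(t), np.sin(t)])  # theta^T x
        idx = np.clip(np.digitize(proj, edges) - 1, 0, n_lambda - 1)
        np.add.at(sino[:, j], idx, 1.0)  # accumulate line "mass"
    return sino, edges, thetas

# a filled square, centred at the origin
ys, xs = np.nonzero(np.ones((16, 16)))
pts = np.stack([xs, ys], axis=1).astype(float) - 7.5
sino, edges, thetas = radon_lines(pts)
# every projection redistributes, but preserves, the total mass
print((sino.sum(axis=0) == pts.shape[0]).all())  # True
```

The mass-preservation check reflects a basic property of the Radon transform: integrating \(\mathfrak {R}_{f}(\lambda ,\theta )\) over \(\lambda \) recovers the total integral of f for every angle.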
An affine transform \(\mathcal {H}=\mathcal {H}(A,s)\) can be defined as x\(^{'}=A \mathrm{x}+s \), where A is a \(2\times 2\) nonsingular real matrix and \(s \in \mathbb {R}^2\). Under it, a function f(x) is transformed into another function \(g (\mathrm{x})=f \,(A ^{-1}\mathrm{x}-A ^{-1} s )\). The relationship between their Radon transformed versions, \(\mathfrak {R}_{f } (\lambda , \theta )\) and \(\mathfrak {R}_{g } (\lambda , \theta )\), can be deduced as

\(\mathfrak {R}_{g}(\lambda ,\theta )=\int _{\mathbb {R}^2} f(A^{-1}\mathrm{x}-A^{-1}s)\,\delta (\lambda -\overrightarrow{\theta }^{T}\mathrm{x})\,d\mathrm{x}.\)   (2)
Let \(\mathrm{y}=A ^{-1}\mathrm{x}-A ^{-1} s \); Eq. (2) can then be rewritten as

\(\mathfrak {R}_{g}(\lambda ,\theta )=|det(A)|\int _{\mathbb {R}^2} f(\mathrm{y})\,\delta \big (\lambda -\overrightarrow{\theta }^{T}s-(A^{T}\overrightarrow{\theta })^{T}\mathrm{y}\big )\,d\mathrm{y}.\)   (3)
Note that the vector \(A^{T} \overrightarrow{\theta }\) is a transformed version of the vector \(\overrightarrow{\theta }\) and may not be a unit vector. Here, we use the scaling property [2] of the Dirac delta function, namely \(\delta (\alpha x)=|\alpha |^{-1}\delta (x)\) for \(\alpha \ne 0\), to normalize it, and Eq. (3) can be rewritten as

\(\mathfrak {R}_{g}(\lambda ,\theta )=\frac{|det(A)|}{\left\| A^{T}\overrightarrow{\theta }\right\| }\int _{\mathbb {R}^2} f(\mathrm{y})\,\delta \bigg (\frac{\lambda -\overrightarrow{\theta }^{T}s}{\left\| A^{T}\overrightarrow{\theta }\right\| }-\frac{(A^{T}\overrightarrow{\theta })^{T}}{\left\| A^{T}\overrightarrow{\theta }\right\| }\,\mathrm{y}\bigg )\,d\mathrm{y},\)   (4)
where \(\left\| \cdot \right\| \) denotes the length of the vector. Defining following functions of variable \(\theta \):
\(\epsilon (\theta )=\frac{|det(A)|}{\left\| A^{T} \overrightarrow{\theta } \right\| },\zeta (\theta )=\frac{1}{\left\| A^{T} \overrightarrow{\theta } \right\| },\eta (\theta )=-\frac{\overrightarrow{\theta }^T s}{\left\| A^{T} \overrightarrow{\theta } \right\| }, \)
and

\(\rho (\theta )=\left\langle A^{T}\overrightarrow{\theta }\right\rangle ,\)   (5)
where \(\left\langle \cdot \right\rangle \) denotes the direction angle of the vector, we then have

\(\mathfrak {R}_{g}(\lambda ,\theta )=\epsilon (\theta )\cdot \mathfrak {R}_{f}\big (\zeta (\theta )\cdot \lambda +\eta (\theta ),\,\rho (\theta )\big ).\)   (6)
The above equation indicates the following effects of the affine transform \(\mathcal {H}(A,s)\) on the line integral of the function f(x): (1) its amplitude is scaled by \(\epsilon (\theta )\); (2) its parameter \(\lambda \) is scaled by \(\zeta (\theta )\) and shifted by \(\eta (\theta )\), and (3) its parameter \(\theta \) is transformed to be \(\rho (\theta )\).
Let the inverse transform of \(\mathcal {H}(A,s)\) be \(\mathcal {H}^{-1}(A^{-1},-A^{-1}s)\). The inverse transformed versions \(\epsilon ^{-1} (\theta )\), \( \zeta ^{-1} (\theta )\), \(\eta ^{-1} (\theta )\), and \(\rho ^{-1}(\theta )\) of the functions \(\epsilon (\theta )\), \( \zeta (\theta )\), \(\eta (\theta )\), and \(\rho (\theta )\) can be defined by replacing the A and s that appear in Eq. (5) with \(A^{-1}\) and \(-A^{-1} s\), respectively. From Eq. (6), we can also conclude that when a line \((\lambda ,\theta )\) is subject to the affine transform \(\mathcal {H}(A,s)\), it becomes the line \((\zeta ^{-1}(\theta )\cdot \lambda + \eta ^{-1}(\theta ),\rho ^{-1}(\theta ))\). In the next section, we will use this affine theory of the line and the line integral to construct a novel line integral transform for affine shape analysis.
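The functions \(\epsilon \), \(\zeta \), \(\eta \) and \(\rho \) of Eq. (5) are easy to evaluate numerically. The following is a minimal sketch (the function and variable names are illustrative), verified on the identity transform, which must leave every line parameter unchanged:

```python
import numpy as np

def line_affine_maps(A, s):
    # The functions epsilon, zeta, eta and rho of Eq. (5) as plain Python
    # callables. A sketch assuming A is a nonsingular 2x2 array and s a
    # length-2 translation vector; names are illustrative.
    A = np.asarray(A, float)
    s = np.asarray(s, float)
    def At(theta):  # A^T theta_vec and its length
        v = A.T @ np.array([np.cos(theta), np.sin(theta)])
        return v, np.linalg.norm(v)
    eps  = lambda t: abs(np.linalg.det(A)) / At(t)[1]
    zeta = lambda t: 1.0 / At(t)[1]
    eta  = lambda t: -(np.array([np.cos(t), np.sin(t)]) @ s) / At(t)[1]
    rho  = lambda t: np.arctan2(At(t)[0][1], At(t)[0][0])  # direction angle
    return eps, zeta, eta, rho

# the identity transform must leave every line parameter unchanged
eps, zeta, eta, rho = line_affine_maps(np.eye(2), np.zeros(2))
ok = (abs(eps(0.3) - 1.0) < 1e-12 and abs(zeta(0.3) - 1.0) < 1e-12
      and abs(eta(0.3)) < 1e-12 and abs(rho(0.3) - 0.3) < 1e-12)
print(ok)  # True
```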
4 The Proposed Line Integral Transform
A shape is defined as a region D that is a subset of pixels in the image plane \(\mathbb {R}^{2}\) [3]. The shape image function f(x) can then be defined as an indicator function: f(x) = 1 if \(x \in D\) and f(x) = 0 otherwise.
4.1 Binding Line Pair and Its Affine Property
Using the line integral \(\mathfrak {R}_{f } \, (\lambda , \theta )\) of the function f(x), we define a 1D function of the variable \(\theta \) as

\(\sigma _{f}(\theta )=\min \big \{\lambda \,\big |\,\mathfrak {R}_{f}(\lambda ,\theta )>0\big \}.\)   (7)
Then, for an angle \(\theta \in [0,2\pi ]\), we can uniquely derive a line pair \((\sigma _{f} (\theta ),\theta )\) and \((-\sigma _{f} (\theta +\pi ),\theta )\). It can be easily concluded that they have the following relationships with the shape region D:

\(\min _{\mathrm{x}\in D}\overrightarrow{\theta }^{T}\mathrm{x}=\sigma _{f}(\theta )\)   (8)
and

\(\max _{\mathrm{x}\in D}\overrightarrow{\theta }^{T}\mathrm{x}=-\sigma _{f}(\theta +\pi ).\)   (9)
Equations (8) and (9) indicate that the shape region D is located between the line pair \((\sigma _{f} (\theta ),\theta )\) and \((-\sigma _{f} (\theta +\pi ),\theta )\) and has at least one intersection point with each of them, which means that the shape region D is bounded by the line pair. We term the lines \((\sigma _{f} (\theta ),\theta )\) and \((-\sigma _{f} (\theta +\pi ),\theta )\) a binding line pair. An example of a binding line pair is presented in Fig. 1.
Now, we analyze the property of the binding line pair under the affine transform \(\mathcal {H}(A,s)\). According to Eq. (7), we can conclude that the function \(\sigma _{f}(\theta )\) has the following relationship with the function \(\sigma _{g}(\theta )\):

\(\sigma _{g}(\rho ^{-1}(\theta ))=\zeta ^{-1}(\theta )\cdot \sigma _{f}(\theta )+\eta ^{-1}(\theta ).\)   (10)
Under the affine transform \(\mathcal {H}(A,s)\), the line \((\sigma _{f}(\theta ),\theta )\) is transformed to be the line \((\zeta ^{-1}(\theta ) \cdot \sigma _{f}(\theta )+\eta ^{-1}(\theta ),\rho ^{-1} (\theta ))\) which can be rewritten as \((\sigma _{g}(\rho ^{-1} (\theta )),\rho ^{-1} (\theta ))\) in terms of Eq. (10). Therefore, we can conclude that the affine transformed version of a binding line of the shape is the binding line of the affine transformed version of the shape. Note that the parameters \((-\sigma _{f}(\theta +\pi ),\theta )\) and the parameters \((\sigma _{f}(\theta +\pi ),\theta +\pi )\) represent the same line. Thus, another binding line \((-\sigma _{f}(\theta +\pi ),\theta )\) also keeps its binding property under the affine transform. A graphical illustration of this property is shown in Fig. 2.
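The binding property is easy to check numerically. In the sketch below (a hypothetical discretization, not the authors' implementation), \(\sigma _{f}(\theta )\) is taken as the smallest projection of the region's points onto the unit normal \(\overrightarrow{\theta }\), so that \((\sigma _{f}(\theta ),\theta )\) and \((-\sigma _{f}(\theta +\pi ),\theta )\) bound the region:

```python
import numpy as np

def sigma(points, theta):
    # sigma_f(theta): smallest projection of the region's points onto the
    # unit normal direction theta, so (sigma_f(theta), theta) is a binding
    # line. Hypothetical discretization: `points` is an (N, 2) array of
    # pixel coordinates of the region D.
    d = np.array([np.cos(theta), np.sin(theta)])
    return (points @ d).min()

# unit square [0,1]^2 sampled on a grid
g = np.linspace(0.0, 1.0, 21)
pts = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)

t = 0.0
lower = sigma(pts, t)              # binding line (sigma_f(t), t)
upper = -sigma(pts, t + np.pi)     # binding line (-sigma_f(t+pi), t)
proj = pts @ np.array([np.cos(t), np.sin(t)])
# the region lies between the pair, touching both lines
print(lower <= proj.min() and proj.max() <= upper)  # True
```

The distance between the two lines, `upper - lower`, equals \(-\sigma _{f}(\theta +\pi )-\sigma _{f}(\theta )\), matching the expression used in the next subsection.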
4.2 The Proposed Transform
Given an image function f(x) and a direction angle \(\theta \in [0,2\pi )\), a binding line pair \((\sigma _{f}(\theta ),\theta )\) and \((-\sigma _{f}(\theta +\pi ),\theta )\) can be calculated. The distance between them is \(-\sigma _{f}(\theta +\pi )-\sigma _{f}(\theta )\). The line that has equal distance to the binding line pair, i.e., the center line, can be represented as \(((\sigma _{f}(\theta )-\sigma _{f}(\theta +\pi ))/2,\theta )\). Taking the center line and the binding line \((\sigma _{f}(\theta ),\theta )\) as the reference lines, each line located between them can be uniquely represented in the parameter form \((\varGamma _{f}(\mu ,\theta ),\theta )\), where \(\theta \in [0,2\pi ]\) is the direction angle of the line (which is also the direction angle of the reference lines) and \(\varGamma _{f}(\mu ,\theta )\) is a 2D function of the variables \((\mu ,\theta )\) defined by

\(\varGamma _{f}(\mu ,\theta )=(1-\mu )\cdot \sigma _{f}(\theta )+\mu \cdot \frac{\sigma _{f}(\theta )-\sigma _{f}(\theta +\pi )}{2},\)   (11)
where \(\mu \in [0,1]\) is the ratio of the distance between the line and the reference line \((\sigma _{f}(\theta ),\theta )\) to the distance between the two reference lines, \((\sigma _{f}(\theta ),\theta )\) and \(((\sigma _{f}(\theta )-\sigma _{f}(\theta +\pi ))/2,\theta )\).
Letting the parameter \(\theta \) vary from 0 to \(2\pi \) and the parameter \(\mu \) vary from 0 to 1, all the lines that pass through the shape region D become available and form a line set denoted by

\(L_{f}=\big \{(\varGamma _{f}(\mu ,\theta ),\theta )\,\big |\,\mu \in [0,1],\,\theta \in [0,2\pi )\big \}.\)   (12)
We integrate the shape image function f(x) over each line in the above set and obtain a novel integral transform of the function f(x) as follows:

\(\varPsi _{f}(\mu ,\theta )=\int _{\mathbb {R}^2} f(\mathrm{x})\,\delta \big (\varGamma _{f}(\mu ,\theta )-\overrightarrow{\theta }^{T}\mathrm{x}\big )\,d\mathrm{x}=\mathfrak {R}_{f}(\varGamma _{f}(\mu ,\theta ),\theta ),\)   (13)
which is a 2D function of the parameters \(\mu \in [0,1]\) and \(\theta \in [0,2\pi ]\).
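A coarse numerical sketch of \(\varPsi _{f}(\mu ,\theta )\) for a point-sampled region follows; the function name `psi` and the bin half-width `band` are illustrative assumptions, and the bin counting only approximates the line integral:

```python
import numpy as np

def psi(points, n_mu=18, n_theta=180, band=0.05):
    # Sketch of the proposed transform Psi_f(mu, theta): for each direction
    # theta, integrate the shape over lines interpolated between the binding
    # line (mu = 0) and the center line (mu = 1). `band` is a hypothetical
    # bin half-width approximating the line integral on discrete points.
    thetas = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    mus = np.linspace(0.0, 1.0, n_mu)
    out = np.zeros((n_mu, n_theta))
    for j, t in enumerate(thetas):
        proj = points @ np.array([np.cos(t), np.sin(t)])
        lo, hi = proj.min(), proj.max()    # sigma_f(t) and -sigma_f(t+pi)
        center = (lo + hi) / 2.0           # offset of the center line
        for i, mu in enumerate(mus):
            lam = (1.0 - mu) * lo + mu * center
            out[i, j] = np.count_nonzero(np.abs(proj - lam) < band)
    return out

g = np.linspace(0.0, 1.0, 41)
pts = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
P = psi(pts)
print(P.shape, (P >= 0).all())  # (18, 180) True
```

Note how each row of `P` (fixed \(\mu \)) records the line integrals along one relative offset for all directions, which is exactly the spatial relationship between line integrals that the transform preserves.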
The proposed transform \(\varPsi _{f}(\mu ,\theta )\) has the following property under the affine transform \(\mathcal {H}=\mathcal {H}(A,s)\) of the shape image function f(x):

\(\varPsi _{g}(\mu ,\theta )=\epsilon (\theta )\cdot \varPsi _{f}(\mu ,\rho (\theta )).\)   (14)
Comparing the above equation with Eq. (6), we can see the difference between the proposed transform and the Radon transform under affine transform: the first parameter \(\lambda \) of the Radon transform embeds the parameters of the affine transform, while the first parameter \(\mu \) of the proposed transform is independent of the affine transform. However, Eq. (14) also indicates that the second parameter \(\theta \) and the amplitude of the proposed transform still encode the affine transform parameters. Since the parameter \(\mu \) of the proposed transform is independent of the affine transform, we fix it and rewrite the function \(\varPsi _{f}(\mu ,\theta )\) as the function \(\varPsi _{f}^{\mu }(\theta )\) of the single variable \(\theta \). Delimited by the function \(\varPsi _{f}^{\mu }(\theta )\), a region \(W_{f}^{\mu }\) can be derived and mathematically defined as

\(W_{f}^{\mu }=\big \{r\,\overrightarrow{\theta }\,\big |\,0\le r\le \varPsi _{f}^{\mu }(\theta ),\,\theta \in [0,2\pi ]\big \}.\)   (15)
Similarly, a region \(W_{g}^{\mu }\) that is delimited by the function \(\varPsi _{g}^{\mu }(\theta )\) can be defined by replacing f with g in Eq. (15). We are interested in the relationship between the available regions \(W_{f}^{\mu }\) and \(W_{g}^{\mu }\). For any \(w^{'} \in W_{g}^{\mu }\), there exists \(w \in W_{f}^{\mu }\) such that

\(w^{'}=|det(A)|A^{-T}w,\)   (16)
and on the other hand, for any \(w \in W_{f}^{\mu }\), there exists \(w^{'} \in W_{g}^{\mu }\) which also makes Eq. (16) hold, where \(A^{-T}\) denotes the transpose matrix of \(A^{-1}\). Therefore, we can deduce that the region \(W_{f}^{\mu }\) and the region \(W_{g}^{\mu }\) are correlated by the affine transform \(\mathcal {H}^{'}=\mathcal {H}^{'}(|det(A)|A^{-T},0)\). This is a desirable property which allows us to use their respective second-order geometric moment to derive a transformation for reducing the relation between them from an affinity into a similarity. Similar ideas can be found in [4, 5].
For the region \(W_{f}^{\mu }\), we construct its second-order geometric moment matrix as

\(M_{f}^{\mu }=\int _{W_{f}^{\mu }}ww^{T}\,dw.\)   (17)
Similarly, the second-order geometric moment matrix \(M_{g}^{\mu }\) for the region \(W_{g}^{\mu }\) is available. In the integral expression for the moment matrix \(M_{g}^{\mu }\), making the following changes, the variable \(w^{'}=|det(A)|A^{-T}w\), the derivative \(dw^{'}=|det(A)|dw\), and the integration region \(W_{g}^{\mu } \rightarrow W_{f}^{\mu }\), we can easily deduce that

\(M_{g}^{\mu }=|det(A)|^{3}A^{-T}M_{f}^{\mu }A^{-1},\)   (18)
which indicates the relationship between the moment matrices \(M_{f}^{\mu }\) and \(M_{g}^{\mu }\). Define \(M^{1/2}\) as any matrix that satisfies \((M^{1/2})(M^{1/2})^{T}=M\). Since \(det(M)>0\), the matrix \(M^{1/2}\) can always be obtained; it can be calculated using an eigenvalue method [5, 6]. Let \(\varDelta _{f}=(M_{f}^{\mu })^{1/2}\) and \(\varDelta _{g}=(M_{g}^{\mu })^{1/2}\); we accordingly have \(M_{f}^{\mu }=\varDelta _{f} \varDelta _{f}^T\) and \(M_{g}^{\mu }=\varDelta _{g} \varDelta _{g}^T\). Using the matrices \(\varDelta _{f}\) and \(\varDelta _{g}\), we can produce a transformation matrix as

\(E=\varDelta _{g}^{-1}\big (|det(A)|A^{-T}\big )\varDelta _{f}.\)   (19)
Then we have

\(EE^{T}=|det(A)|^{2}\varDelta _{g}^{-1}A^{-T}M_{f}^{\mu }A^{-1}\varDelta _{g}^{-T}.\)   (20)
Using Eq. (18), the above equation can be rewritten as

\(EE^{T}=|det(A)|^{-1}I,\)   (21)
where I is the identity matrix. The above equation indicates that E is a similarity matrix which encodes the relation among the matrices \(A,\varDelta _{f}\, \) and \(\, \varDelta _{g}\). According to Eq. (21), the transform matrix E can be denoted by \(E=\alpha R_{\theta _{0}}\) for \(det(E)>0\) and \(E=\alpha \tilde{R} _{\theta _{0}}\) for \(det(E)<0\), where \(\alpha =|det(A)|^{-1/2}\) plays the role of a scale factor, and \(R_{\theta _{0}}\) and \(\tilde{R} _{\theta _{0}}\) are

\(R_{\theta _{0}}=\begin{pmatrix} \cos \theta _{0} & -\sin \theta _{0} \\ \sin \theta _{0} & \cos \theta _{0} \end{pmatrix}\)   (22)
and

\(\tilde{R}_{\theta _{0}}=\begin{pmatrix} \cos \theta _{0} & \sin \theta _{0} \\ \sin \theta _{0} & -\cos \theta _{0} \end{pmatrix},\)   (23)
which transform a vector \(\overrightarrow{\theta }\) to \(\overrightarrow{\theta +\theta _{0}}\) and \(\overrightarrow{\theta _{0}-\theta }\), respectively. It is worth noting that for Eq. (19), since \(det(\varDelta _{f})>0\) and \(det(\varDelta _{g})>0\) [5, 6], det(E) takes the same sign as det(A), and \(det(A)<0\) indicates that besides scaling, rotation and shearing, the affine transform with affine matrix A also includes a mirror transform.
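The matrix square root \(M^{1/2}\) used above can be computed by the eigenvalue method mentioned in the text. A minimal sketch, which produces the symmetric square root (one of the admissible choices, since \(M^{1/2}\) is only required to satisfy \((M^{1/2})(M^{1/2})^{T}=M\)):

```python
import numpy as np

def sqrt_spd(M):
    # M^{1/2} with (M^{1/2})(M^{1/2})^T = M, computed by the eigenvalue
    # method mentioned in the text, assuming M is symmetric positive
    # definite (as the second-order moment matrices are).
    w, V = np.linalg.eigh(M)               # M = V diag(w) V^T
    return V @ np.diag(np.sqrt(w)) @ V.T   # symmetric square root

# sanity check on a toy moment matrix
M = np.array([[4.0, 1.0],
              [1.0, 3.0]])
D = sqrt_spd(M)
print(np.allclose(D @ D.T, M))  # True
```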
Having obtained the matrices \(\varDelta _{f}\) and \(\varDelta _{g}\), we use them to normalize the function \(\varPsi _{f}^{\mu }(\theta )\) and its affine transformed version \(\varPsi _{g}^{\mu }(\theta )\), respectively. We first normalize the function \(\varPsi _{f}^{\mu }(\theta )\) as

\(\dot{\varPsi }_{f}^{\mu }\big (\left\langle \varDelta _{f}^{-1}\overrightarrow{\theta }\right\rangle \big )=\left\| \varDelta _{f}^{-1}\overrightarrow{\theta }\right\| \cdot \varPsi _{f}^{\mu }(\theta ).\)   (24)
Similarly, the normalized version of \(\varPsi _{g}^{\mu }(\theta )\) is defined as \(\dot{\varPsi } _{g}^{\mu }(\theta )\). The relationship between them can be obtained as

\(\dot{\varPsi }_{g}^{\mu }(\theta )=\alpha \cdot \dot{\varPsi }_{f}^{\mu }(\theta -\theta _{0})\quad \mathrm{for}\ det(E)>0\)   (25)
and

\(\dot{\varPsi }_{g}^{\mu }(\theta )=\alpha \cdot \dot{\varPsi }_{f}^{\mu }(\theta _{0}-\theta )\quad \mathrm{for}\ det(E)<0,\)   (26)
which indicate that the normalized versions \(\dot{\varPsi } _{f}^{\mu }(\theta )\) and \(\dot{\varPsi } _{g}^{\mu }(\theta )\) are correlated only by a similarity transform. Eq. (26) also indicates that when the affine transform includes a mirror transform, the function \(\dot{\varPsi } _{f}^{\mu }(\theta )\) is additionally subject to a mirror transform.
We can thus see that for any \(\mu \in [0,1]\), the corresponding 1D function \(\dot{\varPsi } _{f}^{\mu }(\theta )\) of the shape f suffers only from scaling, translation and mirror distortions when the shape f is subject to affine transform. In the preceding discussion, we fixed the variable \(\mu \) for convenience of presentation. Now, we set it free and rewrite the 1D function \(\dot{\varPsi } _{f}^{\mu }(\theta )\) as a 2D function \(\dot{\varPsi } _{f}(\mu ,\theta )\). Obviously, the affine transform causes only shifting and mirror effects along the dimension of \(\theta \) of the 2D function \(\dot{\varPsi } _{f}(\mu ,\theta )\), together with a scaling of the amplitude of the function.
4.3 Affine Invariants
Here we apply the proposed transform \(\dot{\varPsi } _{f}(\mu ,\theta )\) to affine invariant shape recognition. As discussed in the former section, the transform \(\dot{\varPsi } _{f}(\mu ,\theta )\) is subject only to a shift and a mirror transform along the dimension of \(\theta \) and a scaling of its amplitude, so it is easy to remove these effects. To do so, we apply the 1D Fourier transform along the dimension of \(\theta \) of the proposed transform \(\dot{\varPsi } _{f}(\mu ,\theta )\). Assume that \(\dot{\varPsi } _{f}(\mu _{i},\theta _{j})\), \(i=1,...,K\), \(j=1,...,N\), is the digital form of the proposed transform \(\dot{\varPsi } _{f}(\mu ,\theta )\), where K and N are the numbers of points uniformly sampled from the range [0, 1] and the range \([0,2\pi ]\), respectively. We then obtain a matrix of size \(K \times N\). For each row of the matrix, we perform a discrete 1D Fourier transform and keep the magnitudes of the Fourier coefficients. According to the theory of the Fourier transform, the shifting and mirror effects are thereby removed from the transform \(\dot{\varPsi } _{f}(\mu _{i},\theta _{j})\). As for the scaling effect, we normalize each row of the matrix by its respective \(0^{th}\)-order Fourier coefficient. We then obtain a completely affine invariant version of the transform \(\dot{\varPsi } _{f}(\mu _{i},\theta _{j})\) which can be directly utilized for shape recognition. The dissimilarity between two shapes can be measured by calculating the L1 norm between their transforms \(\dot{\varPsi }(\mu _{i},\theta _{j})\).
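The steps above (row-wise DFT magnitudes, then division by the 0th-order coefficient) can be sketched as follows; the function name is illustrative, and the check uses a synthetic matrix rather than a real transform:

```python
import numpy as np

def affine_invariant_descriptor(psi_matrix):
    # Sketch of the invariant construction: per row (fixed mu), take the
    # magnitudes of the 1D DFT over theta, which removes circular shift and
    # mirror effects, then divide by the 0th-order coefficient to remove
    # amplitude scaling. `psi_matrix` is the K x N sampled transform.
    mag = np.abs(np.fft.fft(psi_matrix, axis=1))
    return mag / mag[:, :1]

# a circular shift plus a global scaling leaves the descriptor unchanged
rng = np.random.default_rng(0)
P = rng.random((18, 180)) + 1.0
Q = 3.7 * np.roll(P, 25, axis=1)  # "similarity-distorted" version
d1 = affine_invariant_descriptor(P)
d2 = affine_invariant_descriptor(Q)
print(np.allclose(d1, d2))  # True; the L1 distance |d1 - d2| is ~0
```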
5 Experimental Results and Discussions
To examine the feasibility and effectiveness of the proposed method on shape retrieval, we evaluate it on two groups of shape image datasets: (1) a contour shape dataset, in which each shape is enclosed by a single silhouette and contains no content inside, and (2) a region shape dataset, in which each sample either has several separate regions or, though its whole region is enclosed by a single silhouette, contains some content inside, usually with complex structure. For all the experiments, we uniformly sample 18 values from the range [0, 1] and 180 values from the range \([0,2\pi ]\) when conducting the proposed transform.
To quantify the retrieval performance of the algorithms, a standard metric for information retrieval, the knee-point score [7, 8], is used in our experiments. For any query shape \(Q_{i}\), we calculate the distances of all the dataset samples to it and rank them in ascending order. Let H be the number of dataset samples and \(V_{i}\) the number of samples relevant to the query shape \(Q_{i}\) in the dataset. For each integer \(1 \le h \le H\), count the number \(v_{i,h}\) of relevant samples among the top h best matches in the sequence. Then, for the given query \(Q_{i}\), its precision and recall at the top h best matches are defined as

\(P_{i,h}=\frac{v_{i,h}}{h}\quad \mathrm{and}\quad R_{i,h}=\frac{v_{i,h}}{V_{i}},\)
respectively. We calculate their average values over all the queries \(Q_{i}\). In our experiments, each sample from the dataset is taken in turn as a query. So, there are a total of H queries. When \(V_{i}=h\), precision and recall will take the same value which is termed knee-point score as measurement [7, 8].
Multiview Curve Database (MCD): To evaluate the performance of the algorithms on recognizing curved shapes in the presence of affine distortion, Zuliani et al. [10] chose 40 samples from the MPEG-7 CE-1 contour shape dataset [9], each selected from one shape class. Each of them was then printed on white paper, and 7 pictures of it were taken from different view angles using a digital camera. Another 7 images were obtained by randomly rotating and reflecting the available seven samples. So, there are a total of \(40 \times 14 = 560\) samples in the MCD, which consists of MPEG-7 shapes affected by natural perspective skew due to the manner of extraction from real images. Some typical samples from the MCD are shown in Fig. 3. This dataset is publicly available and has been used as a test case in many works [7, 10, 14].
To make a fair comparison, we choose approaches that are particularly designed for affine shape recognition. Three region based methods, the trace transform based method [5], multiscale autoconvolution (MSA) [11] and affine moment invariants [12], which are state-of-the-art descriptors for affine shape recognition, are used as benchmarks in our experiments. Since the template samples of the shapes in the MCD database are from the MPEG-7 CE-1 dataset, which is a contour based test case, four recently published contour based methods, including the affine-invariant elastic metric [7], hierarchical projective invariant contexts [14], the subspace approach [13] and affine-invariant curve matching [10], are selected as benchmarks for a wide comparison. All of them take the MCD dataset as the test case in their experiments. The knee-point scores of the proposed method together with the benchmark methods are summarized in Table 1. It can be seen that the proposed method achieves 97.07% retrieval accuracy, which is about 17% higher than the trace transform based method [5] and much higher than the other two region based methods [12] and [11]. Compared with the four contour based methods, which generally perform better on contour based test cases than the region based methods, the proposed method still achieves about 7% higher retrieval accuracy.
MPEG-7 CE-2 Perspective Transform Test: The MPEG-7 CE-2 database is developed for evaluating the performance of region based shape analysis methods. Here, we perform the MPEG-7 CE-2 perspective test to validate the effectiveness of the proposed method on the retrieval of region shapes in the presence of perspective transform. In this test protocol, all the 3101 region-based shape images in the database are used as gallery images (some typical samples are shown in Fig. 4(a)). Among them, 330 images of 30 classes, with 11 images in each class, are labeled as queries for the retrieval experiment. In each query class, one image is the original shape, and the other ten images are its perspective transformed versions (example images are shown in Fig. 4(b)). As can be seen, different from the CE-1 shapes, the CE-2 samples usually have complex interior structures and some samples even have separate shape regions.
In our experiments, we follow the protocol of the CE-2 perspective transform test. Since the contour based shape recognition methods used in the former experiments cannot handle region shapes, we only compare the proposed method with the other three region based methods. The knee-point scores for all the compared methods are summarized in Table 2. It can be seen that the proposed method achieves an accuracy of 97.49%, much higher than those of all the benchmarks (2.48% higher than the second best approach), on retrieving shapes with various perspective transformations. On the CE-2 perspective test, the proposed method thus keeps the best retrieval performance over the benchmark methods. The encouraging experimental results demonstrate the effectiveness of the proposed transform in describing shapes in the presence of perspective transform and its superior discriminability over the existing region based methods in handling shapes with complex interior structures.
6 Conclusions
A novel line integral transform has been presented for affine-invariant shape recognition. It is a 2D function that is not only invariant to affine transform but also preserves the spatial relationship between the line integrals, which makes it more discriminative than the shape descriptors derived from the trace transform. In addition, the proposed method is more efficient than the trace transform: a 2D matrix of affine invariants can be generated by conducting the proposed transform once, whereas conducting the trace transform once generates only a single feature, so multiple transforms with different functionals must be performed to make the resulting shape descriptors discriminative. The proposed transform has been tested on the standard affinely distorted contour shape database and region shape database, and compared with state-of-the-art shape descriptors designed for affine shape analysis. The encouraging experimental results show that the proposed method is effective for affine shape recognition.
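The proposed transform itself is defined in the earlier sections; purely to give a flavor of what a discrete line-integral (Radon-style) transform computes — a generic sketch, not the authors' construction — the following pure-NumPy code accumulates image values along lines parameterized by a polar angle and a signed offset from the image centre, using nearest-neighbour sampling:

```python
import numpy as np

def line_integrals(image, thetas, n_rho=64, n_t=128):
    """Crude discrete line-integral (Radon-like) transform: for each angle
    theta and signed offset rho, sum the image along the line with normal
    direction (cos theta, sin theta) at distance rho from the image centre.
    Output shape: (len(thetas), n_rho)."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    half = min(h, w) / 2.0
    rhos = np.linspace(-half, half, n_rho)
    ts = np.linspace(-half, half, n_t)
    out = np.zeros((len(thetas), n_rho))
    for i, th in enumerate(thetas):
        c, s = np.cos(th), np.sin(th)
        # points on each line: centre + rho * normal + t * direction
        xs = cx + rhos[:, None] * c - ts[None, :] * s
        ys = cy + rhos[:, None] * s + ts[None, :] * c
        xi = np.clip(np.rint(xs).astype(int), 0, w - 1)
        yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
        out[i] = image[yi, xi].sum(axis=1)
    return out

# toy example: a centred disk, whose profiles peak at zero offset
yy, xx = np.mgrid[0:64, 0:64]
img = ((yy - 31.5) ** 2 + (xx - 31.5) ** 2 <= 10 ** 2).astype(float)
profiles = line_integrals(img, thetas=np.array([0.0, np.pi / 4, np.pi / 2]))
```

The (angle, offset) parameterization here is the standard Radon one; the proposed transform instead rewrites the offset dimension relative to binding line pairs and normalizes the angle dimension by moments, which is what yields the affine invariance that a plain Radon transform lacks.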
References
1. Deans, S.R.: The Radon Transform and Some of Its Applications. Wiley, New York (1983)
2. Strichartz, R.: A Guide to Distribution Theory and Fourier Transforms. CRC Press, Boca Raton (1994)
3. Hong, B., Soatto, S.: Shape matching using multiscale integral invariants. IEEE Trans. Pattern Anal. Mach. Intell. 37(1), 151–160 (2015)
4. Baumberg, A.: Reliable feature matching across widely separated views. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 774–781 (2000)
5. Petrou, M., Kadyrov, A.: Affine invariant features from the trace transform. IEEE Trans. Pattern Anal. Mach. Intell. 26(1), 30–44 (2004)
6. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
7. Bryner, D., Srivastava, A., Klassen, E.: Affine-invariant, elastic shape analysis of planar contours. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 390–397 (2012)
8. Bryner, D., Klassen, E., Le, H., Srivastava, A.: 2D affine and projective shape analysis. IEEE Trans. Pattern Anal. Mach. Intell. 36(5), 998–1011 (2014)
9. Latecki, L.J., Lakamper, R., Eckhardt, T.: Shape descriptors for non-rigid shapes with a single closed contour. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 424–429 (2000)
10. Zuliani, M., Bhagavathy, S., Manjunath, B.S., Kenney, C.: Affine-invariant curve matching. In: Proceedings of the IEEE International Conference on Image Processing (2004)
11. Rahtu, E., Salo, M., Heikkila, J.: Affine invariant pattern recognition using multiscale autoconvolution. IEEE Trans. Pattern Anal. Mach. Intell. 27(6), 908–918 (2005)
12. Flusser, J., Suk, T.: Pattern recognition by affine moment invariants. Pattern Recogn. 26(1), 167–174 (1993)
13. Mai, F., Chang, C., Hung, Y.S.: A subspace approach for matching 2D shapes under affine distortions. Pattern Recogn. 44(2), 210–221 (2011)
14. Jia, Q., Fan, X., Liu, Y., Li, H., Luo, Z., Guo, H.: Hierarchical projective invariant contexts for shape recognition. Pattern Recogn. 52, 358–374 (2016)
15. Domokos, C., Kato, Z.: Parametric estimation of affine deformations of planar shapes. Pattern Recogn. 43(3), 569–578 (2010)
16. Tarel, J., Wolovich, W., Cooper, D.B.: Covariant-conics decomposition of quartics for 2D shape recognition and alignment. J. Math. Imaging Vis. 19(3), 255–273 (2003). https://doi.org/10.1023/A:1026285105653
17. Huang, Z., Cohen, F.S.: Affine-invariant B-spline moments for curve matching. IEEE Trans. Image Process. 5(10), 1473–1480 (1996)
18. Wang, Y., Teoh, E.K.: 2D affine-invariant contour matching using B-spline model. IEEE Trans. Pattern Anal. Mach. Intell. 29(10), 1853–1858 (2007)
19. Srestasathiern, P., Yilmaz, A.: Planar shape representation and matching under projective transformation. Comput. Vis. Image Underst. 115(11), 1525–1535 (2011)
20. Arbter, K., Snyder, W.E., Burkhardt, H., Hirzinger, G.: Application of affine-invariant Fourier descriptors to recognition of 3-D objects. IEEE Trans. Pattern Anal. Mach. Intell. 12(7), 640–647 (1990)
21. Heikkilä, J.: Pattern matching with affine moment descriptors. Pattern Recogn. 37(9), 1825–1834 (2004)
22. Khalil, M.I., Bayoumi, M.M.: A dyadic wavelet affine invariant function for 2D shape recognition. IEEE Trans. Pattern Anal. Mach. Intell. 23(10), 1152–1164 (2001)
23. Rube, I.E., Ahmed, M., Kamel, M.S.: Wavelet approximation-based affine invariant shape representation functions. IEEE Trans. Pattern Anal. Mach. Intell. 28(2), 323–327 (2006)
24. Ruiz, A., Lopez-de-Teruel, P.E., Fernandez-Maimo, L.: Efficient planar affine canonicalization. Pattern Recogn. 72, 236–253 (2017)
25. Xue, Z., Shen, D., Teoh, E.K.: An efficient fuzzy algorithm for aligning shapes under affine transformations. Pattern Recogn. 34(6), 1171–1180 (2001)
26. Yang, Z., Cohen, F.S.: Cross-weighted moments and affine invariants for image registration and matching. IEEE Trans. Pattern Anal. Mach. Intell. 21(8), 804–814 (1999)
27. Kadyrov, A., Petrou, M.: Affine parameter estimation from the trace transform. IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1631–1645 (2006)
28. Wang, Y., Huang, K., Tan, T.: Human activity recognition based on R transform. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
29. Andriluka, M., Roth, S., Schiele, B.: People-tracking-by-detection and people-detection-by-tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2008)
30. Korn, P., Sidiropoulos, N.D., Faloutsos, C., Siegel, E.L., Protopapas, Z.: Fast and effective retrieval of medical tumor shapes. IEEE Trans. Knowl. Data Eng. 10(6), 889–904 (1998)
31. Suk, T., Flusser, J.: Affine moment invariants generated by graph method. Pattern Recogn. 44(9), 2047–2056 (2011)
32. Chen, Y.W., Chen, Y.Q.: Invariant description and retrieval of planar shapes using Radon composite features. IEEE Trans. Signal Process. 56(10), 4762–4771 (2008)
33. Hjouj, F., Kammler, D.W.: Identification of reflected, scaled, translated, and rotated objects from their Radon projections. IEEE Trans. Image Process. 17(3), 301–310 (2008)
34. Tabbone, S., Wendling, L., Salmon, J.: A new shape descriptor defined on the Radon transform. Comput. Vis. Image Underst. 102(1), 42–51 (2006)
35. Hoang, T.V., Tabbone, S.: The generalization of the R-transform for invariant pattern representation. Pattern Recogn. 45(6), 2145–2163 (2012)
36. Hasegawa, M., Tabbone, S.: Amplitude-only log Radon transform for geometric invariant shape descriptor. Pattern Recogn. 47(2), 643–658 (2014)
37. Kadyrov, A., Petrou, M.: The trace transform and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 23(8), 811–828 (2001)
38. Zhang, Y., Chu, C.H.: IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 446–458 (2011)
Acknowledgement
This work was supported in part by the Australian Research Council (ARC) under Discovery Grant DP140101075 and the Natural Science Foundation of China under Grant 61372158.
© 2020 Springer Nature Switzerland AG
Wang, B., Gao, Y. (2020). A Novel Line Integral Transform for 2D Affine-Invariant Shape Retrieval. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12373. Springer, Cham. https://doi.org/10.1007/978-3-030-58604-1_36
Print ISBN: 978-3-030-58603-4
Online ISBN: 978-3-030-58604-1