HARRIS AFFINE REGION DETECTOR-1998 - Lindeberg - Automatic - Scale - Selec - IJCV
TONY LINDEBERG
Computational Vision and Active Perception Laboratory (CVAP), Department of Numerical Analysis
and Computing Science, KTH (Royal Institute of Technology), S-100 44 Stockholm, Sweden
tony@nada.kth.se
Received February 1, 1994; Revised June 1, 1996; Accepted July 30, 1998
Abstract. The fact that objects in the world appear in different ways depending on the scale of observation has
important implications if one aims at describing them. It shows that the notion of scale is of utmost importance
when processing unknown measurement data by automatic methods. In their seminal works, Witkin (1983) and
Koenderink (1984) proposed to approach this problem by representing image structures at different scales in a
so-called scale-space representation. Traditional scale-space theory building on this work, however, does not
address the problem of how to select locally appropriate scales for further analysis. This article proposes a systematic
methodology for dealing with this problem. A framework is presented for generating hypotheses about interesting
scale levels in image data, based on a general principle stating that local extrema over scales of different combinations
of γ -normalized derivatives are likely candidates to correspond to interesting structures. Specifically, it is shown
how this idea can be used as a major mechanism in algorithms for automatic scale selection, which adapt the local
scales of processing to the local image structure.
Support for the proposed approach is given in terms of a general theoretical investigation of the behaviour of the
scale selection method under rescalings of the input pattern and by integration with different types of early visual
modules, including experiments on real-world and synthetic data. Support is also given by a detailed analysis of
how different types of feature detectors perform when integrated with a scale selection mechanism and then applied
to characteristic model patterns. Specifically, it is described in detail how the proposed methodology applies to the
problems of blob detection, junction detection, edge detection, ridge detection and local frequency estimation.
In many computer vision applications, the poor performance of the low-level vision modules constitutes a major
bottleneck. It is argued that the inclusion of mechanisms for automatic scale selection is essential if we are to
construct vision systems to automatically analyse complex unknown environments.
Keywords: scale, scale selection, normalized derivative, feature detection, blob detection, corner detection,
frequency estimation, Gaussian derivative, scale-space, multi-scale representation, computer vision
80 Lindeberg
In certain controlled situations, appropriate scales for analysis may be known a priori. For example, a desirable property of a physicist is his intuitive ability to select appropriate scales to model a given situation. Under other circumstances, however, it may not be obvious at all how to determine in advance what the proper scales are. One such example is a vision system with the task of analysing unknown scenes. Besides the inherent multi-scale properties of real-world objects (which, in general, are unknown), such a system has to face the problems that the perspective mapping gives rise to size variations, that noise is introduced in the image formation process, and that the available data are two-dimensional data sets reflecting only indirect properties of a three-dimensional world. To be able to cope with these problems, an image representation that explicitly incorporates the notion of scale is a crucially important tool whenever interpreting sensory data, such as images, by automatic methods.

In computer vision and image processing, these insights have led to the construction of multi-scale representations of image data, obtained by embedding any given signal into a one-parameter family of derived signals (Burt, 1981; Crowley, 1981; Witkin, 1983; Koenderink, 1984; Yuille and Poggio, 1986; Florack et al., 1992; Lindeberg, 1994a; Haar Romeny, 1994). This family should be parameterized by a scale parameter and be generated in such a way that fine-scale structures are successively suppressed when the scale parameter is increased. A main intention behind this construction is to obtain a separation of the image structures in the original image, such that fine-scale image structures only exist at the finest scales in the multi-scale representation. Thereby, the task of operating on the image data will be simplified, provided that the operations are performed at sufficiently coarse scales where unnecessary and irrelevant fine-scale structures have been suppressed. Empirically, this idea has proved to be extremely useful, and multi-scale representations such as pyramids, scale-space representation and non-linear diffusion methods are commonly used as preprocessing steps to a large number of early visual operations, including feature detection, stereo matching, optic flow, and the computation of shape cues.

A multi-scale representation by itself, however, contains no explicit information about what image structures should be regarded as significant or what scales are appropriate for treating those. Hence, unless early judgements can be made about what image structures should be regarded as important, we obtain a substantial expansion of the amount of data to be interpreted by later stage processes. In most previous works, this problem has been handled by formulating algorithms which rely on the information present in the data at a small set of manually chosen scales (or even a single scale). Alternatively, coarse-to-fine algorithms have been expressed, which start at a given coarse scale and propagate down to a given finer scale. Determining such scales in advance, however, leads to the introduction of free parameters. If one aims at autonomous algorithms which are to operate in a complex environment without need for external parameter tuning, we therefore argue that it is essential to complement traditional multi-scale processing by explicit mechanisms for automatic scale selection. Notably, image descriptors can be highly unstable if computed at inappropriately chosen scales, whereas a proper tuning of the scale parameter can improve the quality of an image descriptor substantially. As will be demonstrated later, local scale information can also constitute an important cue in its own right.

Early work addressing this problem was presented in (Lindeberg, 1991, 1993a) for blob-like image structures. The basic idea was to study the behaviour of image structures over scales, and to measure the saliency of image structures from the stability properties and the lifetime of these structures in scale-space. Scale levels were selected from the scales at which a measure of blob strength assumed local maxima over scales and significant image structures were determined from the stability of the blob structures in scale-space. Experimentally, it was shown that this approach could be used for extracting regions of interest with associated scale levels, which in turn could serve to guide various early visual processes.

The subject of this article is to address the problem of automatic scale selection in a more general setting, for wider classes of image descriptors. We shall be concerned with the extraction of image features and the computation of filter-like image descriptors, and present a scale selection principle for image descriptors which can be expressed in terms of Gaussian derivative filters. The general idea that will be proposed is to study the evolution properties over scales of normalized differential descriptors. Specifically, it will be suggested that local extrema over scales of normalized differential entities, which arise in this way, are likely to correspond to interesting image structures. By theoretical considerations and experiments it will be shown that this approach gives rise to intuitively reasonable results in different situations and that it provides a unified framework for scale selection for detecting image features such as blobs, corners, edges and ridges.

P1: VIY/MDR-BNY-BIS P2: VIY/RCK P3: SUK/RCK QC: SUK
International Journal of Computer Vision KL660-01-Linderberg-I October 20, 1998 16:21

2. Scale-Space Representation: Review

Given any continuous signal $f : \mathbb{R}^D \to \mathbb{R}$, its linear scale-space representation $L : \mathbb{R}^D \times \mathbb{R}_+ \to \mathbb{R}$ is defined as the solution to the diffusion equation
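
For reference, a hedged reconstruction of this definition in its standard form. The factor 1/2 and the initial condition are assumptions. They follow the usual convention and agree with the sinusoidal solution in Eq. (5) and with the convolution form L(·; t) = g(·; t) ∗ f used in Eq. (12).

```latex
% Assumed standard form of the linear scale-space definition:
\partial_t L = \tfrac{1}{2} \nabla^2 L, \qquad L(\cdot\,; 0) = f,
% which is equivalent to convolution with a Gaussian kernel of variance t:
L(\cdot\,; t) = g(\cdot\,; t) * f, \qquad
g(x; t) = \frac{1}{(2\pi t)^{D/2}} \, e^{-|x|^2/(2t)} .
```

As a quick check, inserting $L(x; t) = e^{-\omega_0^2 t/2} \sin \omega_0 x$ gives $\partial_t L = -\tfrac{1}{2}\omega_0^2 L = \tfrac{1}{2} L_{xx}$.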
3. Normalized Derivatives and Intuitive Idea for Scale Selection

A well-known property of the scale-space representation is that the amplitude of spatial derivatives in general decreases with scale, i.e., if a signal is subject to scale-space smoothing, then the numerical values of spatial derivatives computed from the smoothed data can be expected to decrease. This is a direct consequence of the non-enhancement property of local extrema, which means that the value at a local maximum cannot increase, and the value at a local minimum cannot decrease. Hence, the amplitude of the variations in a signal will always decrease with scale.

As a simple example of this, consider a sinusoidal input signal¹ of some given frequency $\omega_0$; for simplicity in one dimension,

$$f(x) = \sin \omega_0 x. \qquad (4)$$

It is straightforward to show that the solution of the diffusion equation is given by

$$L(x; t) = e^{-\omega_0^2 t/2} \sin \omega_0 x. \qquad (5)$$

Thus, the amplitude of the scale-space representation, $L_{\max}$, as well as the amplitude of the $m$th order smoothed derivative, $L_{x^m,\max}$, decrease exponentially with scale

$$L_{\max}(t) = e^{-\omega_0^2 t/2}, \qquad L_{x^m,\max}(t) = \omega_0^m e^{-\omega_0^2 t/2}.$$

Let us next introduce a γ-normalized derivative operator defined by

$$\partial_\xi = t^{\gamma/2} \partial_x. \qquad (6)$$

[...] useful when formulating scale selection mechanisms for edge detection and ridge detection.

For the sinusoidal signal, the amplitude of an $m$th order normalized derivative as function of scale is given by

$$L_{\xi^m,\max}(t) = t^{m\gamma/2} \omega_0^m e^{-\omega_0^2 t/2}, \qquad (8)$$

i.e., it first increases and then decreases. Moreover, it assumes a unique maximum at $t_{\max,L_{\xi^m}} = \gamma m / \omega_0^2$. If we define a scale parameter $\sigma$ of dimension length by $\sigma = \sqrt{t}$ and introduce the wavelength $\lambda_0$ of the signal by $\lambda_0 = 2\pi/\omega_0$, we can see that the scale at which the amplitude of the γ-normalized derivative assumes its maximum over scales is proportional to the wavelength, $\lambda_0$, of the signal:

$$\sigma_{\max,L_{\xi^m}} = \frac{\sqrt{\gamma m}}{2\pi} \lambda_0. \qquad (9)$$

The maximum value over scales is

$$L_{\xi^m,\max}\left(t_{\max,L_{\xi^m}}\right) = \frac{(\gamma m)^{\gamma m/2}}{e^{\gamma m/2}} \, \omega_0^{(1-\gamma) m}. \qquad (10)$$

In the case when $\gamma = 1$, this maximum value is independent of the frequency of the signal (see Fig. 1), and the situation is highly symmetric, i.e., given any scale $t_0$, the maximally amplified frequency is given by $\omega_{\max} = \sqrt{m/t_0}$, and for any $\omega_0$ the scale with maximum amplification is $t_{\max} = m/\omega_0^2$. In other words, for normalized derivatives with $\gamma = 1$ it holds that sinusoidal signals are treated in a similar (scale invariant) way independent of their frequency (see Fig. 1).

The situation is a bit different when $\gamma \neq 1$. We shall return to this subject in Section 4.1.
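
The closed-form results of this section for the sinusoidal signal can be checked numerically. The sketch below assumes the amplitude of the $m$th order γ-normalized derivative is $t^{m\gamma/2}\omega_0^m e^{-\omega_0^2 t/2}$, which is what follows from applying $t^{\gamma/2}\partial_x$ to Eq. (5) $m$ times. The function names, the grid range and the grid resolution are illustrative choices, not part of the original text.

```python
import math


def normalized_amplitude(t, omega0, m, gamma):
    """Amplitude t^(m*gamma/2) * omega0^m * exp(-omega0^2 * t / 2) (assumed form)."""
    return t ** (m * gamma / 2.0) * omega0 ** m * math.exp(-omega0 ** 2 * t / 2.0)


def argmax_over_scales(omega0, m, gamma, t_max=20.0, steps=100_000):
    """Brute-force search on a uniform grid in (0, t_max] for the maximizing scale."""
    best_t, best_val = None, -1.0
    for k in range(1, steps + 1):
        t = t_max * k / steps
        val = normalized_amplitude(t, omega0, m, gamma)
        if val > best_val:
            best_t, best_val = t, val
    return best_t, best_val


def closed_form_t_max(omega0, m, gamma):
    """Scale of the maximum: gamma * m / omega0^2."""
    return gamma * m / omega0 ** 2


def closed_form_max_value(omega0, m, gamma):
    """Maximum value over scales, as in Eq. (10)."""
    gm = gamma * m
    return gm ** (gm / 2.0) / math.exp(gm / 2.0) * omega0 ** ((1.0 - gamma) * m)


def sigma_max(omega0, m, gamma):
    """Selected scale in units of length, as in Eq. (9): sqrt(gamma*m)/(2*pi) * wavelength."""
    wavelength = 2.0 * math.pi / omega0
    return math.sqrt(gamma * m) / (2.0 * math.pi) * wavelength
```

Calling `argmax_over_scales(0.5, 2, 1.0)` and comparing with `closed_form_t_max(0.5, 2, 1.0)` illustrates that the grid maximum falls at the predicted scale. With `gamma = 1.0`, `closed_form_max_value` returns the same number for any `omega0`, which is the frequency independence noted in the text.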
4. Proposed Scale Selection Methodology

The example shows that the scale at which a normalized derivative assumes its maximum over scales is, for a sinusoidal signal, proportional to the wavelength of the signal. In this respect, maxima over scales of normalized derivatives reflect the scales over which spatial variations take place in the signal.

This operation corresponds to an interesting computational structure, since it constitutes a way of estimating length based on measurements performed at only a single spatial point in the scale-space representation, and without explicitly laying out a ruler. Moreover, compared to a local windowed Fourier transform there is no need for making any explicit settings of window size for computing the Fourier transform. Instead, the propagation of length information over space is performed via the diffusion equation, and the decisions about the contents in the data are made by studying the output of derivative operators as the diffusion process evolves.

Alternatively, we can view such a measurement procedure as a pattern matcher, which matches Gaussian derivative kernels of different size to the given image pattern, based on a specific normalization of the primitive templates. By using the proposed γ-normalized derivative concept, we obtain one-to-one correspondence between the matching response of the Gaussian derivative kernels and the wavelength of the signal. Selecting the scale at which the maximum over scale is assumed corresponds to selecting the pattern (or the scale) for which the operator response is strongest.

This property is, however, not restricted to sine wave patterns or to image measurements in terms of linear derivative operators of a certain order. On the contrary, it applies to a large class of image descriptors which can be formulated as multi-scale differential invariants expressed in terms of Gaussian derivatives (this notion will be made more precise next). A main message of this article is that this property can be used as a major mechanism in algorithms for automatic scale selection, which automatically adapt the local scales of processing to image data. Let us hence generalize the abovementioned observation to more complex signals and state the following principle for scale selection, to be applied in situations when no other information is available. In its most general form, it can be expressed as follows:

    A scale level, at which some (possibly non-linear) combination of normalized derivatives assumes a local maximum over scales, can be treated as reflecting a characteristic length of a corresponding structure in the data.

This principle is closely related to, although not equivalent to, the method for scale selection previously proposed in (Lindeberg, 1991, 1993a), where interesting scale levels were determined from maxima over scales of a normalized blob measure. It can be theoretically justified under a number of different assumptions and for a number of specific brightness models (see next). Its general usefulness, however, must be verified empirically, and with respect to the type of problem it is to be applied to.

4.1. General Scaling Property of Local Maxima over Scales

A basic justification for the abovementioned arguments can be obtained from the fact that for a large class of (possibly non-linear) combinations of normalized derivatives it holds that maxima over scales have a nice behaviour under rescalings of the intensity pattern. If the input image is rescaled by a constant scaling factor $s$, then the scale at which the maximum is assumed will be multiplied by the same factor (if measured in units of $\sigma = \sqrt{t}$). This is a fundamental requirement on a scale selection mechanism, since it guarantees that image operations commute with size variations.

Transformation Properties under Rescalings. To give a formal characterization of this scaling property, consider two signals $f$ and $f'$ related by

$$f(x) = f'(sx), \qquad (11)$$

and define the scale-space representations of $f$ and $f'$ in the two domains by

$$L(\cdot\,; t) = g(\cdot\,; t) * f, \qquad (12)$$
$$L'(\cdot\,; t') = g(\cdot\,; t') * f', \qquad (13)$$

where the spatial variables and the scale parameters are transformed according to

$$x' = s x, \qquad (14)$$
$$t' = s^2 t. \qquad (15)$$

Then, $L$ and $L'$ are related by

$$L(x; t) = L'(x'; t'), \qquad (16)$$

and the $m$th order spatial derivatives satisfy

$$\partial_{x^m} L(x; t) = s^m \, \partial_{x'^m} L'(x'; t'). \qquad (17)$$

For γ-normalized derivatives in the two domains

$$\partial_\xi = t^{\gamma/2} \partial_x, \qquad (18)$$
$$\partial_{\xi'} = t'^{\gamma/2} \partial_{x'}, \qquad (19)$$

this implies

$$\partial_{\xi^m} L(x; t) = s^{m(1-\gamma)} \, \partial_{\xi'^m} L'(x'; t'). \qquad (20)$$

[...] does not depend on the index $i$ of that term. For a differential expression of this form, the corresponding normalized differential expression in each domain is given by

$$\mathcal{D}_{\gamma\text{-norm}} L = t^{M\gamma/2} \, \mathcal{D} L, \qquad (23)$$
$$\mathcal{D}'_{\gamma\text{-norm}} L' = t'^{M\gamma/2} \, \mathcal{D}' L'. \qquad (24)$$

From (20) it follows that these normalized differential expressions are related by

$$\mathcal{D}_{\gamma\text{-norm}} L = s^{M(1-\gamma)} \, \mathcal{D}'_{\gamma\text{-norm}} L'. \qquad (25)$$
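
The scaling relation (25) can be checked on the sinusoidal model signal. The sketch below takes $f'(x) = \sin(\omega' x)$, so that $f(x) = f'(sx)$ has frequency $s\omega'$, and uses the scale transformation $t' = s^2 t$ stated in the summary of this section. The single $m$th order normalized derivative plays the role of the differential expression, so $M = m$. The amplitude formula and all names are assumptions made for illustration.

```python
import math


def normalized_derivative_amplitude(omega, m, gamma, t):
    """Amplitude of the m-th order gamma-normalized derivative of sin(omega * x) at scale t."""
    return t ** (m * gamma / 2.0) * omega ** m * math.exp(-omega ** 2 * t / 2.0)


def check_rescaling(omega_prime, s, m, gamma, t):
    """Return (lhs, rhs) of D_norm L = s^(M(1-gamma)) * D'_norm L' with M = m."""
    omega = s * omega_prime      # f(x) = f'(s x) = sin(s * omega' * x)
    t_prime = s ** 2 * t         # corresponding scale in the primed domain
    lhs = normalized_derivative_amplitude(omega, m, gamma, t)
    rhs = s ** (m * (1.0 - gamma)) * normalized_derivative_amplitude(
        omega_prime, m, gamma, t_prime
    )
    return lhs, rhs
```

For `gamma = 1.0` the factor `s ** (m * (1 - gamma))` equals one, so the two normalized responses coincide exactly. For other values of `gamma` they differ only by a factor that does not depend on `t`, which is why the location of the maximum over scales is preserved.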
Necessity of the γ-Normalization. More generally, one may ask what choices of normalization factors are possible, provided that we would like to state this scaling property as a fundamental constraint on a scale selection mechanism based on local maxima over scales of normalized differential entities. Then, in fact, it can be shown that the γ-normalized derivative concept according to (6) arises by necessity.

In other words, the γ-normalized derivative concept comprises the most general class of normalization factors for which detection of local maxima over scales commutes with rescalings of the input pattern. A more precise formulation of this statement as well as the details of the necessity proof are given in Appendix A.1.

Summary: General Properties of Scale Selection Framework. To conclude, this analysis shows that if a γ-normalized homogeneous differential expression assumes a maximum over scales at $(x_0; t_0)$ in the scale-space representation of $f$, then there will be a corresponding maximum over scales in the scale-space representation of $f'$ at $(s x_0; s^2 t_0)$. Moreover, although the magnitude of a normalized derivative at a local maximum over scales is not scale invariant unless $\gamma = 1$, it is possible to compensate for this phenomenon and to define scale invariant magnitude descriptors also when $\gamma \neq 1$.

4.2. The Scale Selection Mechanism in Practice

So far, we have proposed a general methodology for scale selection by detecting local maxima in feature responses over scales. A fundamental problem that remains to be solved in this context concerns what differential expressions to use. Is any differential invariant feasible? Here, we shall not attempt to answer this question. Let us instead contend that the differential expression should at least be determined so as to capture the types of image structures under consideration.

The general approach to scale selection that is proposed is to use these locally maximal responses over scales in the stage of detecting image features, i.e., when establishing the existence of different types of image structures. Basically, the scale at which a maximum over scales is attained will be assumed to give information about how large a feature is, in analogy with the common approach of taking the spatial position at which the maximum operator response is assumed as an estimate of the spatial location of a feature. In certain situations, this implies that image features may be detected at quite coarse scales, and the localization properties may not be the best. Therefore, we propose to complement this framework by a second processing stage, in which more refined processing is invoked for computing more accurate localization estimates. In this respect, the suggested framework naturally gives rise to two-stage algorithms, with feature detection at coarse scales followed by feature localization at finer scales. Whereas coarse-to-fine approaches are common practice in several computer vision problems, this framework leads to explicit mechanisms for automatic selection of all the scale parameters involved.

In the following, a series of theoretical and experimental results will be presented showing how the abovementioned general approach applies to different types of feature detectors expressed as polynomial combinations of Gaussian derivatives.

4.3. Scale-Space Signatures from Real-World Data

Figure 2 shows the variations over scales of two simple differential expressions formulated in terms of normalized derivatives. It shows the result of computing the trace and the determinant of the normalized Hessian matrix by (with $\gamma = 1$)

$$\operatorname{trace} \mathcal{H}_{norm} L = t^{\gamma} \nabla^2 L = t^{\gamma} (L_{xx} + L_{yy}), \qquad (30)$$
$$\det \mathcal{H}_{norm} L = t^{2\gamma} \left( L_{xx} L_{yy} - L_{xy}^2 \right), \qquad (31)$$

for two details in an image of a field of sunflowers.² These graphs are called the scale-space signatures of $\operatorname{trace} \mathcal{H}_{norm} L$ and $\det \mathcal{H}_{norm} L$, respectively.

As can be seen, the maximum over scales in the top row of Fig. 2 is assumed at a finer scale than in the bottom row. A more detailed examination of the ratio between the scale values³ where the graphs attain their maxima over scales shows that when the scale parameter is measured in dimension length, this scale ratio is roughly equal to the ratio of the diameters of the sunflowers in the centers of the two images, respectively. This example illustrates that results in agreement with the proposed scale selection principle can be obtained for real-world data (i.e., signals having a much richer frequency content than a single sine wave).

The reason why these particular differential expressions have been selected here is that they constitute useful differential entities for blob detection; see e.g. (Marr, 1982; Voorhees and Poggio, 1987; Blostein and Ahuja, 1989). Before we turn to the problem of expressing an integrated blob detector with automatic scale selection, however, let us describe a further extension of the general scale selection idea.

Figure 2. Scale-space signatures of the trace and the determinant of the normalized Hessian matrix computed for two details of a sunflower image; (left) grey-level image, (middle) signature of $(\operatorname{trace} \mathcal{H}_{norm} L)^2$, (right) signature of $(\det \mathcal{H}_{norm} L)^2$. (The signatures have been computed at the central point in each image. The horizontal axis shows effective scale, essentially the logarithm of the scale parameter, whereas the scaling of the vertical axis is linear in the normalized operator response.)

4.4. Simultaneous Detection of Interesting Points and Scales

In Fig. 2, the signatures of the normalized differential entities were computed at the central point in each image. These points were deliberately chosen to coincide with the centers of the sunflowers, where the blob response can be expected to be maximal under spatial perturbations. In a real-world vision situation, however, we cannot assume such points to be known a priori. Moreover, we can expect that the spatial maximum of the operator response is assumed at different positions at different scales. This is one example of the well-known fact that scale-space smoothing leads to shape distortions.

Therefore, a more general approach to scale selection from local extrema in the scale-space signature is to accumulate the signature of any normalized differential entity $\mathcal{D}_{norm} L$ along the path $r : \mathbb{R}_+ \to \mathbb{R}^N$ that a local spatial extremum in $\mathcal{D}_{norm} L$ describes across scales. The mathematical framework for describing such paths is given in (Lindeberg, 1994a). Formally, an extremum path of a differential entity $\mathcal{D}_{norm} L$ is defined (from the implicit function theorem) as a set of points $(r(t); t) \in \mathbb{R}^N \times \mathbb{R}_+$ in scale-space such that for any $t \in \mathbb{R}_+$ the point $r(t)$ is a local extremum of the mapping $x \mapsto (\mathcal{D}_{norm} L)(x; t)$,

$$\left\{ (x; t) \in \mathbb{R}^N \times \mathbb{R}_+ \right\} = \left\{ (r(t); t) \in \mathbb{R}^N \times \mathbb{R}_+ : (\nabla(\mathcal{D}_{norm} L))(r(t); t) = 0 \right\}.$$

At points at which extrema in the scale-space signature are assumed, the derivative along the scale direction is zero as well. Hence, it is natural to define a normalized scale-space extremum of a differential entity $\mathcal{D}_{norm} L$ as a point $(x_0; t_0) \in \mathbb{R}^N \times \mathbb{R}_+$ in scale-space which is simultaneously a local extremum with respect to both the spatial coordinate and the scale parameter.⁴ In terms of derivatives, such points satisfy

$$(\nabla(\mathcal{D}_{norm} L))(x_0; t_0) = 0, \qquad (\partial_t(\mathcal{D}_{norm} L))(x_0; t_0) = 0. \qquad (32)$$

These normalized scale-space extrema constitute natural generalizations of extrema in the spatial domain, and can serve as natural interest points for feature detectors formulated in terms of spatial maxima of differential operators, such as blob detectors, junction
detectors, symmetry detectors, etc. Specific examples of this idea will be worked out in more detail in the following sections.⁵

Trivially, the invariance properties of local maxima over scales under rescalings of the input signal transfer to scale-space maxima. Hence, if a normalized scale-space maximum is assumed at $(x_0; t_0)$ in the scale-space representation of a signal $f$, then in a rescaled signal $f'$ defined by $f'(sx) = f(x)$, a corresponding scale-space maximum is assumed at $(s x_0; s^2 t_0)$ in the scale-space representation of $f'$.

5. Blob Detection with Automatic Scale Selection

Figure 3 shows the result of detecting scale-space extrema of the normalized Laplacian in an image of a sunflower field. Every scale-space maximum has been graphically illustrated by a circle centered at the point at which the spatial maximum is assumed and with the size determined such that the radius (measured in pixel units) is proportional to the scale at which the maximum over scales is assumed (measured in dimension length). To reduce the number of blobs, a threshold on the maximum normalized response has been selected such that the 250 blobs having the maximum normalized responses according to (28) remain.

The bottom row shows the result of superimposing these circles onto a bright copy of the original image, as well as corresponding results for the normalized scale-space extrema of the square of the determinant of the Hessian matrix. Corresponding experiments for a synthetic pattern (analysed in Section 5.1) are given in Fig. 4. Observe how these conceptually very simple differential geometric descriptors give a very reasonable description of the blob-like structures in the image (in particular concerning the blob size) considering how little information is used in the processing.

Figure 3. Normalized scale-space maxima computed from an image of a sunflower field: (top left): Original image. (top right): Circles representing the 250 normalized scale-space maxima of $(\operatorname{trace} \mathcal{H}_{norm} L)^2$ having the strongest normalized response. (bottom left): Circles representing scale-space maxima of $(\operatorname{trace} \mathcal{H}_{norm} L)^2$ superimposed onto a bright copy of the original image. (bottom right): Corresponding results for scale-space maxima of $(\det \mathcal{H}_{norm} L)^2$.

Figure 4. The 250 most significant normalized scale-space extrema detected from the perspective projection of a sine wave (with 10% added Gaussian noise).

Figure 5 shows a three-dimensional illustration of the multi-scale blob descriptors computed from the sunflower image. Here, each scale-space maximum has been visualized by a sphere centered at the position $(x_0; t_0)$ in scale-space at which the maximum is assumed, with the radius proportional to the selected scale, and the brightness increasing with the significance of the blob. Observe how the size variations in the image data are reflected in the spatial variations of the image descriptors.

Figure 5. Three-dimensional view of the 150 strongest scale-space maxima of the square of the normalized Laplacian of the Gaussian computed from the sunflower image. (A dark copy of the original grey-level image is shown in the ground plane, and the vertical dimension represents scale.)

5.1. Analysis for Idealized Model Patterns

Whereas the theoretical analysis in Section 4.1 applies generally to large classes of differential invariants and input signals, one may ask how the scale selection method for blob detection performs in specific situations. In this section, we shall study two model patterns for which a closed-form solution of the diffusion equation can be calculated and a complete analytical study hence is feasible.

[...]

$$g(x_i; t_i) = \frac{1}{\sqrt{2\pi t_i}} \, e^{-x_i^2/(2 t_i)} \qquad (34)$$

is a model of a two-dimensional blob with characteristic lengths $\sqrt{t_1}$ and $\sqrt{t_2}$ in the coordinate directions. From the semi-group property of the Gaussian, $g(\cdot\,; t_A) * g(\cdot\,; t_B) = g(\cdot\,; t_A + t_B)$, it follows that the scale-space representation of $f$ is

$$L(x_1, x_2; t) = g(x_1; t_1 + t) \, g(x_2; t_2 + t). \qquad (35)$$

After a few algebraic manipulations it can be shown that for any $t_1, t_2 > 0$ there is a unique maximum over scales in

$$\left| (\nabla^2_{norm} L)(0, 0; t) \right| = \frac{t \, (t_1 + t_2 + 2t)}{2\pi (t_1 + t)^{3/2} (t_2 + t)^{3/2}}.$$

[...] is simple. It is assumed at

$$t_{\det \mathcal{H}L} = \sqrt{t_1 t_2}, \qquad (37)$$

and for both $\operatorname{trace} \mathcal{H}_{norm} L$ and $\det \mathcal{H}_{norm} L$ the scale at which the scale-space maximum is assumed reflects a characteristic size of the blob.

Example 2. Another interesting special case to consider is a periodic signal defined as the sum of two perpendicular sine waves,

$$f(x, y) = \sin \omega_1 x + \sin \omega_2 y. \qquad (38)$$

Its scale-space representation is

$$L(x, y; t) = e^{-\omega_1^2 t/2} \sin \omega_1 x + e^{-\omega_2^2 t/2} \sin \omega_2 y, \qquad (39)$$

and setting $\partial_t |(\operatorname{trace} \mathcal{H}_{norm} L)(\pi/2, \pi/2; t)| = 0$ gives

$$\omega_1^2 \left( 2 - \omega_1^2 t \right) e^{-\omega_1^2 t/2} + \omega_2^2 \left( 2 - \omega_2^2 t \right) e^{-\omega_2^2 t/2} = 0.$$

There is a unique solution when the ratio $\omega_2/\omega_1$ is close to one, and three solutions when the ratio is sufficiently large. Hence, there is a unique maximum over scales when $\omega_2/\omega_1$ is close to one, and two maxima when the ratio is sufficiently large. (The bifurcation occurs when $\omega_2/\omega_1 \approx 2.4$.) In the special case when $\omega_1 = \omega_2 = \omega_0$, the maximum over scales is assumed at

$$t_{\operatorname{trace} \mathcal{H}L} = \frac{2}{\omega_0^2}. \qquad (40)$$

Similarly, setting $\partial_t |(\det \mathcal{H}_{norm} L)(\pi/2, \pi/2; t)| = \partial_t \left( t^2 \, \omega_1^2 e^{-\omega_1^2 t/2} \, \omega_2^2 e^{-\omega_2^2 t/2} \right) = 0$ gives that the maximum over scales in $\det \mathcal{H}_{norm} L$ is at

$$t_{\det \mathcal{H}L} = \frac{4}{\omega_1^2 + \omega_2^2}. \qquad (41)$$
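
The statements about the two model patterns can be probed numerically. The sketch below rests on two assumptions that are not spelled out in the surviving text: the Gaussian blob model is taken to be $f(x_1, x_2) = g(x_1; t_1)\,g(x_2; t_2)$, which is what Eq. (35) implies, and $\gamma = 1$ throughout. Function names and grid settings are illustrative.

```python
import math


def det_hessian_norm_blob(t, t1, t2):
    """t^2 * det(Hessian of L) at the origin for L = g(x1; t1+t) g(x2; t2+t), gamma = 1."""
    return t ** 2 / (4.0 * math.pi ** 2 * (t1 + t) ** 2 * (t2 + t) ** 2)


def argmax_on_grid(func, t_max=10.0, steps=100_000):
    """Return the grid point in (0, t_max] at which func is largest."""
    grid = (t_max * k / steps for k in range(1, steps + 1))
    return max(grid, key=func)


def trace_condition(t, w1, w2):
    """Left-hand side of the stationarity condition for the normalized Laplacian response."""
    return (w1 ** 2 * (2.0 - w1 ** 2 * t) * math.exp(-w1 ** 2 * t / 2.0)
            + w2 ** 2 * (2.0 - w2 ** 2 * t) * math.exp(-w2 ** 2 * t / 2.0))


def count_sign_changes(w1, w2, t_max=10.0, steps=100_000):
    """Count sign changes of trace_condition on a grid, skipping exact zeros."""
    changes, prev_sign = 0, 0
    for k in range(1, steps + 1):
        val = trace_condition(t_max * k / steps, w1, w2)
        sign = (val > 0.0) - (val < 0.0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                changes += 1
            prev_sign = sign
    return changes
```

With `t1 = 1.0` and `t2 = 4.0`, `argmax_on_grid(lambda t: det_hessian_norm_blob(t, 1.0, 4.0))` can be compared against $\sqrt{t_1 t_2}$ from Eq. (37). `count_sign_changes(1.0, w2)` gives the number of solutions of the stationarity condition for a frequency ratio `w2`, so evaluating it for ratios on either side of 2.4 illustrates the transition from one solution to three.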
90 Lindeberg
Figure 6. The 50 strongest spatial responses to the Laplacian operator computed at the scale levels: (a) t = 4.0, (b) t = 16.0, and (c) t = 64.0.
Observe how this blob detector leads to a bias towards image structures of a certain size.
5.2. Comparisons with Fixed-Scale Blob Detection In (Lindeberg and Gårding, 1993; Gårding and
Lindeberg, 1996) an application to estimation of surface shape is presented, where: (i) scale-space maxima guide the computation of regional image texture descriptors and (ii) scale information serves as a cue to three-dimensional surface shape.

In (Wiltschi et al., 1997) it is demonstrated how scale information from a blob detection step can be incorporated into a pattern classifier.

In view of these results, it is interesting to compare this blob detector with automatic scale selection to a standard multi-scale blob detector operating at a fixed scale. Figure 6 shows the result of computing spatial maxima at different scales in the response of the Laplacian operator from the sine wave pattern in Fig. 4. At each scale, the 50 strongest responses have been extracted. As can be seen, small blobs are given the highest relative ranking at fine scales, whereas large blobs are given the highest relative ranking at coarse scales. Hence, a blob detector operating at a single predetermined scale induces a bias towards image structures of a corresponding size. On the other hand, if we use the proposed methodology for blob detection based on the detection of scale-space maxima, we obtain a conceptually clean way of handling image structures of all sizes (within a given scale range) in a similar manner.

5.3. Applications of the Scale Selection Method

Following the previously presented arguments, we argue that a scale selection mechanism is an essential complement to any blob detector aimed at handling large size variations in the image structures. In addition, scale information associated with such adaptively computed image descriptors may serve as an important cue in its own right.

In (Bretzner and Lindeberg, 1996, 1997) a feature tracker is presented, where (i) the scale information is a key component for matching image features over time, and (ii) the scale selection mechanism is essential to capture objects under large size variations.

6. Junction Detection with Automatic Scale Selection

A similar approach can be used for detecting corners in grey-level images. In this section, it will be shown how a multi-scale junction detector can be formulated in terms of the scale-space maxima of a normalized differential invariant.

6.1. Detection Scales from Scale-Space Maxima

A commonly used entity for junction detection is the curvature of level curves in intensity data multiplied by the gradient magnitude (Kitchen and Rosenfeld, 1982; Dreschler and Nagel, 1982; Koenderink and Richards, 1988; Noble, 1988; Deriche and Giraudon, 1990; Blom, 1992; Florack et al., 1992; Lindeberg, 1994a). A special choice is to multiply the level curve curvature by the gradient magnitude raised to the power of three. This is the smallest value of the exponent that leads to a polynomial expression

\[ \tilde{\kappa} = L_{x_2}^2 L_{x_1 x_1} - 2 L_{x_1} L_{x_2} L_{x_1 x_2} + L_{x_1}^2 L_{x_2 x_2}. \tag{41} \]
P1: VIY/MDR-BNY-BIS P2: VIY/RCK P3: SUK/RCK QC: SUK
International Journal of Computer Vision KL660-01-Linderberg-I October 20, 1998 16:21
Moreover, spatial maxima of this operator are preserved under affine transformations. The corresponding normalized differential expression is obtained by replacing each derivative operator ∂_{x_i} by t^{γ/2} ∂_{x_i}, which gives

\[ \tilde{\kappa}_{norm} = t^{2\gamma} \tilde{\kappa}. \tag{42} \]

Figure 7 shows the result of applying this operator to a grey-level image at a number of different scales. The results are displayed in two ways: (i) as grey-level images showing the scale-space representation L as well as the junction response κ̃² at each scale, and (ii) in terms of the 50 strongest spatial maxima of κ̃², respectively, extracted at the same scale levels. As can be
Figure 7. Junction responses at different scales computed from a noisy image containing a number of ideal sharp corners as well as rounded
and diffuse corners. As can be seen, different types of junction structures give rise to different types of responses at different scales. Notably,
certain diffuse junction structures fail to give rise to dominant responses at the finest levels of scale. (Image size: 320 × 240 pixels.)
Figure 8. Junction detection with automatic scale selection: The result of computing the 50 most significant normalized scale-space extrema of κ̃²norm from a grey-level image containing sharp straight edges as well as diffuse and rounded edges (with γ = 1). Compare with Fig. 7 and observe how corner structures at different scales are captured by this operation.
Diffuse Step Junction. Consider

\[ f(x_1, x_2) = \Phi(x_1; t_0)\, \Phi(x_2; t_0) \tag{43} \]

After differentiation, and using the fact that L_{x1x1} = L_{x2x2} = 0 at the origin, as well as Φ(0; t) = 1/2 and g(0; t) = 1/√(2πt), we obtain

\[ |\tilde{\kappa}_{norm}(0, 0; t)| = \frac{t^{2\gamma}}{8\pi^2 (t_0 + t)^2}. \tag{46} \]

When γ = 1, this entity increases monotonically with scale, whereas for γ ∈ ]0, 1[, κ̃norm(0, 0; t) assumes a unique maximum over scales at

\[ t_{\tilde{\kappa}} = \frac{\gamma}{1 - \gamma}\, t_0. \tag{47} \]

Non-Uniform Gaussian Blob. A limitation of the abovementioned analysis is that the signature is computed at a fixed point, whereas the maximum in κ̃² can be expected to drift due to scale-space smoothing. Unfortunately, the equation that determines the position of the spatial maximum in κ̃² over scales is non-trivial to handle (it contains a non-linear combination of the Gaussian function, the primitive function of the Gaussian, and polynomials). Carrying out a closed-form analysis along non-vertical extremum paths is, however, straightforward for the previously treated non-uniform Gaussian blob model. This function can be regarded as an approximate model of the behaviour at so coarse scales in scale-space that the shape distortions are substantial and the overall shape of a finite-size object is severely affected. From (35) we have that the scale-space representation of the non-uniform

Differentiation and insertion into (41) shows that the absolute value of the rescaled level curve curvature assumes its spatial maximum

\[ t_{\tilde{\kappa}} = \frac{2\gamma}{5 - 2\gamma}\, t_0. \tag{51} \]

Interpretation of the Qualitative Behaviour. To conclude, the junction response κ̃²norm can for γ = 1 be expected to increase with scales when a single corner model of infinite extent constitutes a reasonable approximation. On the other hand, κ̃²norm can be expected to decrease with scales when so much smoothing is applied that the overall shape of the object is substantially distorted (and neighbouring junctions interfere with each other or disappear altogether).

Hence, selecting scale levels (and spatial points) where κ̃²norm assumes maxima over scales can be expected to give rise to scale levels in the intermediate scale range, where a finite extent junction model constitutes a reasonable approximation. In particular, this approach will lead to larger scale values for corners having large spatial extent, and prevent too fine scales from being selected at rounded junctions.

6.3. Experiments: Scale-Space Signatures in Junction Detection

Figure 9 illustrates these effects for synthetic L-junctions with varying degrees of diffuseness. It shows simulation experiments with scale-space signatures of κ̃norm accumulated in two different ways: (i) a vertical signature obtained by computing κ̃norm at the fixed central point at different scales, and (ii) a path signature obtained by tracking the spatial extremum in κ̃norm
Figure 9. Scale-space signatures of κ̃norm for synthetic L-junctions with different degrees of diffuseness (top t = 4.0, bottom t = 64.0). (left)
original grey-level image, (middle) path signature of κ̃norm accumulated by tracking a spatial maximum in κ̃norm across scales, (right) vertical
signature of κ̃norm accumulated at the central point.
Figure 10. Scale-space signatures of κ̃norm for diffuse L-junctions (t0 = 1.0) of different spatial extent (1/4 and 1/16 of the image size). (left)
original grey-level image, (middle) path signature of κ̃norm accumulated by tracking a spatial maximum in κ̃norm across scales, (right) vertical
signature of κ̃norm accumulated at the central point.
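As a numerical cross-check of the closed-form signature (46) and the selected scale (47) for the diffuse step junction, the following sketch evaluates κ̃norm at the origin directly from (41) and (42). It assumes that smoothing Φ(·; t0) to scale t yields Φ(·; t0 + t) (the semigroup property of Gaussian smoothing); the function names and the brute-force grid search are illustrative and are not taken from the paper.

```python
import math

def gauss(x, s):
    """Gaussian kernel g(x; s) with variance s."""
    return math.exp(-x * x / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)

def Phi(x, s):
    """Primitive function of the Gaussian kernel, Phi(x; s)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * s)))

def kappa_norm_origin(t, t0, gamma):
    """kappa_norm at the origin for f = Phi(x1; t0) * Phi(x2; t0)."""
    s = t0 + t                      # assumed: variances add under smoothing
    g0, P0 = gauss(0.0, s), Phi(0.0, s)
    Lx1, Lx2 = g0 * P0, P0 * g0     # first-order derivatives at the origin
    Lx1x2 = g0 * g0                 # mixed second-order derivative
    Lx1x1 = Lx2x2 = 0.0             # the Gaussian derivative vanishes at 0
    kappa = Lx2**2 * Lx1x1 - 2.0 * Lx1 * Lx2 * Lx1x2 + Lx1**2 * Lx2x2  # (41)
    return t ** (2.0 * gamma) * kappa                                  # (42)

def closed_form(t, t0, gamma):
    """Right-hand side of (46)."""
    return t ** (2.0 * gamma) / (8.0 * math.pi**2 * (t0 + t) ** 2)

def argmax_over_scales(t0, gamma, step=0.01, n=2000):
    """Brute-force search for the scale maximizing |kappa_norm(0, 0; t)|."""
    scales = [step * k for k in range(1, n + 1)]
    return max(scales, key=lambda t: abs(kappa_norm_origin(t, t0, gamma)))

t0, gamma = 4.0, 0.5
print(argmax_over_scales(t0, gamma), gamma / (1.0 - gamma) * t0)
```

For γ = 0.5 and t0 = 4, the grid search should land close to the value γ/(1 − γ) t0 predicted by (47), up to the grid resolution.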
across scales. As can be seen, the qualitative behaviour is in agreement with the approximate analysis in the previous section: with increasing degree of diffuseness, the values of κ̃norm become smaller at fine scales.

Figure 10 shows the result of replacing the infinite extent L-junction model by junction models of finite size. Observe that the peak in the signature is assumed at finer scales when the spatial extent of the junction is decreased. In other words, the scale at which the maximum over scales is assumed indicates the spatial extent (the size) of the region for which a junction model is consistent with the local grey-level pattern (in agreement with the suggested scale selection principle).

Figure 11 gives a three-dimensional illustration of this junction detector with automatic scale selection. It shows scale-space maxima of κ̃²norm computed from a synthetic image containing corner structures at different scales. The original grey-level image is shown in the ground plane, and each scale-space maximum has been graphically visualized by a sphere centered at the position (x0; t0) in scale-space at which the scale-space maximum was assumed. (The height over the image plane reflects the selected scale.) Observe how the large scale corner as a whole gives rise to a response at coarse scales, whereas the superimposed corner structures of smaller size give rise to scale-space maxima at finer scales.

More results on corner detection, including a complementary mechanism for accurate corner localization, are presented in Section 7.

7. Feature Localization with Automatic Scale Selection

The scale selection methodology presented so far applies to the detection of image features, and the role of the scale selection mechanism is to estimate the approximate size of the image structures the feature detector responds to. Whereas this approach provides a conceptually simple way to express various feature detectors, such as a junction detector, which automatically adapts its scale levels to the local image structure, it is not guaranteed that the spatial positions of the scale-space maxima constitute accurate estimates of the corner locations. The local maxima over scales may be assumed at rather coarse scales, where the drift due to scale-space smoothing is substantial and adjacent features may interfere with each other. For this
Figure 11. Three-dimensional view of scale-space maxima of κ̃²norm computed for a large scale corner with superimposed corner structures at
finer scales. Observe that a coarse scale response is obtained for the large scale corner structure as a whole, whereas the superimposed corner
structures of smaller size give rise to scale-space maxima at finer scales.
Given an approximate estimate x0 of the location and the size s of a corner (computed according to Section 6), an improved estimate of the corner position can be computed as follows: Following (Förstner and Gülch, 1987), consider at every point x′ ∈ R² in a neighbourhood of a junction candidate x0, the line l_{x′} perpendicular to the gradient vector (∇L)(x′) = (L_{x1}, L_{x2})^T (x′) at that point:

\[ D_{x'}(x) = ((\nabla L)(x'))^T (x - x') = 0. \tag{52} \]

Figure 12. Minimizing (53) corresponds to finding the point x that minimizes the distance to all edge tangents in a neighbourhood of the given candidate junction point x0.

and the minimization problem be expressed as a standard least squares problem

\[ \min_{x \in \mathbb{R}^2} x^T A x - 2 x^T b + c \iff A x = b, \tag{54} \]
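A minimal sketch of the least squares problem (54) in discrete form may help here. Expanding the squared distances |(∇L(x′))^T (x − x′)|², weighted by a window function and summed over sample points x′, gives x^T A x − 2 x^T b + c with A = Σ w ∇L ∇L^T, b = Σ w ∇L ∇L^T x′ and c = Σ w x′^T ∇L ∇L^T x′. The discrete sums, the function names and the toy data below are assumptions made for illustration; they are not quoted from the paper.

```python
def accumulate(points, grads, weights):
    """Accumulate A (stored as a11, a12, a22), b and c from gradient samples."""
    a11 = a12 = a22 = b1 = b2 = c = 0.0
    for (px, py), (gx, gy), w in zip(points, grads, weights):
        # Entries of the weighted outer product w * grad * grad^T.
        m11, m12, m22 = w * gx * gx, w * gx * gy, w * gy * gy
        a11 += m11
        a12 += m12
        a22 += m22
        # (w * grad * grad^T) applied to the sample position x'.
        v1 = m11 * px + m12 * py
        v2 = m12 * px + m22 * py
        b1 += v1
        b2 += v2
        c += px * v1 + py * v2
    return (a11, a12, a22), (b1, b2), c

def solve_2x2(A, b):
    """Solve A x = b for a symmetric 2x2 matrix A = (a11, a12, a22)."""
    a11, a12, a22 = A
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("gradient directions do not constrain a unique point")
    return ((a22 * b[0] - a12 * b[1]) / det, (a11 * b[1] - a12 * b[0]) / det)

# Toy data: a vertical edge along x1 = 3 and a horizontal edge along x2 = 5.
points = [(3.0, 0.0), (3.0, 2.0), (0.0, 5.0), (1.0, 5.0)]
grads = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
weights = [1.0, 1.0, 1.0, 1.0]
A, b, c = accumulate(points, grads, weights)
print(solve_2x2(A, b))
```

In the toy data the two edge tangents intersect in a single point, so the solution of A x = b is that intersection point.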
• The problem of choosing the weighting function is a special case of a common scale problem in least squares estimation: Over what spatial region should the fitting be performed? Clearly, it should be large enough such that statistics of gradient directions are accumulated over a sufficiently large neighbourhood around the candidate junction. Nevertheless, the region must not be so large that interfering structures corresponding to other junctions are included.
• The second scale problem, on the other hand, is of a slightly different nature than the previous ones: it concerns what scales should be used for localizing image structures. Previously, in this paper, only the problem of detecting image structures has been treated.

Here, the following solutions are proposed:

Selection of Window Function and Spatial Points from the Detection Step. When computing A, b, and c above, let the window function w_{x0} be a Gaussian function centered at the point x0 at which κ̃²norm assumed its scale-space maximum, and let the scale value of this window function be proportional to the detection scale t0 at which the maximum over scales in κ̃²norm was assumed.

The idea behind this approach is that the detection scale should reflect a representative region around the candidate junction, such that larger regions are selected for corners with large spatial extent than for corners with small extent. Experimentally, this has been demonstrated to be the case in a large number of situations (compare with the qualitative results in Sections 6.2 and 6.3) (footnote 6).

Selection of Localization Scale: Minimize the Normalized Residual. Clearly, the gradient estimates used for computing A, b, and c must be computed at a certain scale. To determine this localization scale, it is natural to select the scale that minimizes the normalized residual d̃min in (60) over scales.

This scale selection criterion corresponds to extending the minimization problem (54) from a single scale to optimization over multiple scales

\[ \min_{t \in \mathbb{R}_+} \min_{x \in \mathbb{R}^2} \frac{x^T A x - 2 x^T b + c}{\mathrm{norm}(t)} = \min_{t \in \mathbb{R}_+} \min_{x \in \mathbb{R}^2} \frac{c - b^T A^{-1} b}{\mathrm{trace}\, A} \tag{59} \]

where the normalization factor norm(t) has been introduced to relate minimizations at different scales. The particular choice of norm(t) = trace A implies that the normalized residual

\[ \tilde{d}_{min} = \min_{x \in \mathbb{R}^2} \frac{\int_{x' \in \mathbb{R}^2} |((\nabla L)(x'))^T (x - x')|^2 \, w_{x_0}(x') \, dx'}{\int_{x' \in \mathbb{R}^2} |(\nabla L)(x')|^2 \, w_{x_0}(x') \, dx'} \tag{60} \]

has dimension [length]² and can be interpreted as a weighted estimate of the localization error. Specifically, scale selection by minimizing the normalized residual d̃min (60) over scales corresponds to selecting the scale that minimizes the estimated inaccuracy in the localization estimate.

This principle for selecting localization scales implies that we take as localization scale the scale that gives the maximum consistency between the distribution of gradient directions in a neighbourhood of x0 and a local (qualitative) junction model. More specific motivations behind this choice can also be expressed as follows:

At very fine scales, where a large amount of noise and interfering fine-scale structures can be expected to be present, the first-order derivative operators will respond mainly to such structures. Hence, the gradient directions can be expected to be roughly randomly distributed, and the residual d̃min will, in general, be large. At coarser scales in scale-space, such fine-scale structures will be suppressed and the locally computed gradient directions will be better aligned to the underlying corner structure. Thus, when smoothing is necessary, the residual will decrease. On the other hand, if too much smoothing is applied, then the shape distorting effects of scale-space smoothing will be dominant and the residual can be expected to increase again. Hence, selecting the minimum gives a natural trade-off between these two effects.

Behaviour at Ideal Sharp Polygon-Type Junctions. Note, in particular, that for an ideal (sharp) step junction, the localization scale given by this method will always be zero in the noise free case. This can be easily understood by observing that for an ideal polygon-type junction (consisting of regions of uniform grey-level delimited by straight edges), all edge tangents meet at the junction point, which means that the residual d̃min is exactly zero. Thus, any amount of smoothing increases the residual, and the minimum value will be assumed at zero scale.
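The choice of localization scale can be sketched as follows, assuming that A, b and c have already been accumulated at each candidate scale (how they are accumulated is outside this sketch). For each scale, the minimum over x in (59) is attained at x = A⁻¹b, which gives the normalized residual (c − b^T A⁻¹ b)/trace A, and the scale with the smallest residual is selected. The function names and the two hard-coded scale levels are hypothetical.

```python
def normalized_residual(A, b, c):
    """Return ((c - b^T A^-1 b) / trace A, x) with x = A^-1 b.

    A is a symmetric 2x2 matrix stored as (a11, a12, a22) and is assumed
    to be non-singular.
    """
    a11, a12, a22 = A
    det = a11 * a22 - a12 * a12
    x1 = (a22 * b[0] - a12 * b[1]) / det
    x2 = (a11 * b[1] - a12 * b[0]) / det
    b_Ainv_b = b[0] * x1 + b[1] * x2
    return (c - b_Ainv_b) / (a11 + a22), (x1, x2)

def select_localization_scale(per_scale):
    """per_scale: iterable of (t, A, b, c); returns (t, residual, x) at the minimum."""
    best = None
    for t, A, b, c in per_scale:
        d, x = normalized_residual(A, b, c)
        if best is None or d < best[1]:
            best = (t, d, x)
    return best

# Hypothetical values: at t = 1 all edge tangents meet in one point (zero
# residual); at t = 4 the tangents of one edge are displaced by +/- 0.5.
per_scale = [
    (1.0, (2.0, 0.0, 2.0), (6.0, 10.0), 68.0),
    (4.0, (2.0, 0.0, 2.0), (6.0, 10.0), 68.5),
]
t_loc, d_min, x_hat = select_localization_scale(per_scale)
print(t_loc, d_min, x_hat)
```

Because the residual is divided by trace A, values obtained at different scales are directly comparable, which is what allows a single minimum over scales to be taken.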
Figure 13. Scale-space signatures of the normalized residual at a synthetic sharp T-junction (t0 = 0.0) for different amounts of added white Gaussian noise (ν = 1.0 and 30.0): (left) grey-level image, (middle left) signature of normalized residual d̃min accumulated at the central point, (middle right) signature of the error in the localization estimate relative to the true corner position in the unsmoothed image, (right) localization estimate computed at the scale at which d̃min assumes its minimum over scales (illustrated by a circle overlaid onto a bright copy of the image smoothed to that scale).
7.3. Experiments: Choice of Localization Scale

Figures 13 and 14 show the result of applying this scale selection mechanism to a sharp and a diffuse corner with different amounts of added white Gaussian noise. As can be seen, the results agree with the qualitative discussion above. There is a clear minimum over scales in each scale-space signature, and the minimum over scales is assumed at coarser scales when the noise level is increased. The position of the minimum in d̃min over scales agrees well with the position of the minimum over scales in the absolute error measure (defined as
Figure 14. Corresponding results for a synthetic diffuse T -junction (t0 = 64.0).
Table 1. The result of applying the junction localization method to a synthetic T -junction
with different amounts of added white Gaussian noise. For each noise level, this table
gives the scale at which the normalized residual assumes its minimum over scales, as
well as the scale at which the estimate with the minimum absolute error over scales is
obtained. Moreover, numerical values of the two error measures are given at these scales.
As can be seen, the selected scales increase with the noise level, and the scale at which
the normalized residual assumes its minimum over scales serves as a reasonable estimate
of a scale at which a near optimal localization estimate over scales is obtained.
the distance between the estimated and the true position of the corner) (footnote 7). Moreover, slightly coarser scales are selected for the diffuse junction than for the sharp one.

Table 1 gives a numerical illustration of basic properties of this scale selection method for junction localization. It shows the result of applying one iteration of the junction localization method to a T-junction with 90 degree opening angles, and the results are shown in terms of the following six measures as functions of the noise level:

• the selected scale t_dmin obtained by minimizing the normalized residual over scales,
• the normalized residual at the selected scale,
• the absolute error in the localization estimate at the selected scale,
• the scale t_abs at which the localization estimate with the minimum absolute error is obtained,
• the normalized residual at t_abs,
• the minimum actual error of the localization estimates computed at all scales.

All descriptors have been computed at a position with 10 pixels horizontal and vertical offset from the true corner position.

The results show that the normalized residual serves as an estimate of the inaccuracy in the corner localization estimate, and specifically that the scale at which the minimum over scales in d̃min is assumed is a reasonable estimate of the scale at which we have the localization estimate with the minimum absolute error. Whereas the correspondence between the two error measures is not perfect, the absolute error computed at the scale at which the normalized residual assumes its minimum over scales is only slightly higher than the minimum absolute error over all scales. In this respect, minimization of d̃min over scales gives a near-optimal localization estimate.

Moreover, whereas the error estimates assume rather high values for a single application of the junction localization scheme when the noise level is high, the localization error can be decreased substantially by applying the junction localization scheme iteratively.

Figure 15 shows the result of applying such an iterative junction localization stage to the junction candidates in Fig. 8. For each scale-space maximum, an individual scale selection process has been invoked consisting of the following processing steps:

• The signature of the normalized residual has been accumulated using a window function with scale value equal to the detection scale of the scale-space maximum.
• The minimum over scales in the signature of d̃min has been detected, and a new localization estimate has been computed using x = A⁻¹b.
• This procedure has been repeated iteratively until either the difference between two successive localization estimates is less than one pixel or the number of iterations has reached an upper bound (here 3 iterations).
• Junction candidates for which the localization estimates fall outside the support region of the original scale-space maximum have been classified as “diverged” and been suppressed.
Figure 15. Improved localization estimates for the junction candidates in Fig. 8. Each junction has been graphically illustrated by a circle
centered at the new location estimate. In the left image, the size reflects the detection scale, whereas in the right image, the size reflects the
localization scale.
• Each remaining junction candidate has been illustrated by a circle with radius proportional to the detection scale or the localization scale.

scale. Then, at the scale at which the minimum in d̃min is assumed, compute an improved localization estimate using
Figure 16. Results of composed two-stage junction detection followed by junction localization for two different grey-level images. (top row)
original grey-level image, (middle and bottom rows) the N strongest junction candidates for different values of N .
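The iterative localization steps listed for Fig. 15 can be sketched as a small loop. The callable `localize_once`, the circular support test and all parameter names are hypothetical stand-ins; only the stopping rule (a shift below one pixel, or at most 3 iterations) and the suppression of diverged candidates are taken from the text.

```python
import math

def refine_junction(x0, t_det, localize_once, support_radius,
                    tol=1.0, max_iter=3):
    """Iterate a one-step localizer starting from the detected position x0.

    localize_once(x, t_det) is assumed to return the estimate x = A^-1 b
    computed at the scale minimizing the normalized residual, using a window
    whose scale is set from the detection scale t_det.
    Returns the refined position, or None if the candidate diverged.
    """
    x = x0
    for _ in range(max_iter):
        x_new = localize_once(x, t_det)
        shift = math.hypot(x_new[0] - x[0], x_new[1] - x[1])
        x = x_new
        if shift < tol:          # successive estimates differ by < one pixel
            break
    # Assumed divergence test: the estimate left a disc of radius
    # support_radius around the original scale-space maximum.
    if math.hypot(x[0] - x0[0], x[1] - x0[1]) > support_radius:
        return None
    return x

def halfway_to_corner(x, t_det):
    """Toy localizer for demonstration: moves halfway towards (10, 10)."""
    return ((x[0] + 10.0) / 2.0, (x[1] + 10.0) / 2.0)

print(refine_junction((0.0, 0.0), 4.0, halfway_to_corner, 20.0))
print(refine_junction((0.0, 0.0), 4.0, halfway_to_corner, 5.0))
```

Modelling the support region as a disc is a simplification chosen for the sketch; any other membership test for the support region of the original maximum could be substituted.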
Figure 17. Results of composed two-stage junction detection followed by junction localization for two different grey-level images. (top row)
original grey-level image, (middle and bottom rows) the N strongest junction candidates for different values of N .
L-junction 3.81 0.78 0.43
Y-junction 2.12 0.35 0.35
4-junction 3.53 0.28 0.07

The main purpose of this section has been to make explicit how a scale selection mechanism can be incorporated in junction detection. When building a stand-alone
points. In certain situations, however, one is also interested in computing dense image descriptors.

An obvious problem that arises if we would base a scale selection mechanism for computing dense image descriptors on a partial derivative of the intensity function, such as the Laplacian operator, is that there would be large spatial variations in the operator response and the selected scales. A common methodology in signal processing for reducing such a so-called phase dependency is by using quadrature filter pairs defined (from a Hilbert transform) in such a way that the Euclidean sum of the filter responses is constant for any sine wave. Since the Hilbert transform of a Gaussian derivative kernel, however, is not within the Gaussian derivative family, one may be interested in operators of small support which can be expressed within the scale-space framework.

Given the normalized derivative concept, there is a straightforward way of combining Gaussian derivatives into an entity that gives an approximately constant operator response at the scale given by the scale selection mechanism. At any scale t in the scale-space representation L of a one-dimensional signal f, define the following quasi quadrature entity in terms of normalized derivatives based on γ = 1 (footnote 10) by

\[ \mathcal{Q}L = L_{\xi}^2 + C L_{\xi\xi}^2 = t L_x^2 + C t^2 L_{xx}^2 \tag{61} \]

(62).

As can be seen, the spatial variations in QL will be large when tω0² is either much smaller or much larger than one, whereas the relative oscillations decrease to zero when t approaches 1/(Cω0²). (As will be shown below, this scale value is of the same order of magnitude as the scales that maximize QL over scales; compare also with Section 3.) In this respect, QL serves as an approximate quadrature pair leading to small relative spatial variations near the scales given by the scale selection procedure. For this reason, we propose to use QL as an entity to maximize over scales when computing dense image descriptors. For two-dimensional data, we can instead consider

\[ \mathcal{Q}L = L_{\xi}^2 + L_{\eta}^2 + C \left( L_{\xi\xi}^2 + 2 L_{\xi\eta}^2 + L_{\eta\eta}^2 \right) = t \left( L_x^2 + L_y^2 \right) + C t^2 \left( L_{xx}^2 + 2 L_{xy}^2 + L_{yy}^2 \right) \tag{63} \]

defined to be rotationally symmetric and equal to the one-dimensional quadrature measure in any direction.

Analysis for Sine Wave Patterns. The free parameter C determines the relative weight between the information in the first- and second-order derivatives. To obtain an intuitive understanding of how this parameter affects the scale selection procedure, it is instructive to analyse what scales are obtained by maximizing QL over scales for different values of C. Straightforward differentiation of (62) gives that the selected scale as function of spatial position is given by

\[ t_{\mathcal{Q}L}(x) = \frac{1}{\omega_0^2} \left( 1 + \frac{2 C s^2}{c^2 + \sqrt{c^4 + 4 C^2 s^4}} \right) \tag{64} \]

\[ t_{\mathcal{Q},0} = t_{\mathcal{Q}L} \big|_{\omega_0 x = 0} = \frac{1}{\omega_0^2}, \tag{66} \]

\[ t_{\mathcal{Q},\frac{\pi}{2}} = t_{\mathcal{Q}L} \big|_{\omega_0 x = \frac{\pi}{2}} = \frac{2}{\omega_0^2}. \tag{67} \]

Graphs showing this variation for a few values of C are displayed in Fig. 20. Given the form of these curves, a
Figure 20. Spatial variation of the selected scale levels when maximizing the quasi quadrature entity (61) over scales for different values of
the free parameter C using a one-dimensional sine wave of unit input frequency as input pattern. Observe that C = 2/3 gives rise to the most
symmetric variations in the selected scale values.
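The selected scale (64) can be checked numerically. The sketch below assumes the input f(x) = sin(ω0 x), whose scale-space representation is L(x; t) = exp(−ω0² t/2) sin(ω0 x), and it assumes that s and c in (64) denote sin(ω0 x) and cos(ω0 x); with that reading, (64) reproduces the special cases (66) and (67). The function names and the brute-force search over scales are illustrative only.

```python
import math

def quasi_quadrature(x, t, C, w0=1.0):
    """QL = t*Lx^2 + C*t^2*Lxx^2 for L = exp(-w0^2 t / 2) * sin(w0 x)."""
    damp = math.exp(-w0 * w0 * t / 2.0)
    Lx = w0 * damp * math.cos(w0 * x)
    Lxx = -w0 * w0 * damp * math.sin(w0 * x)
    return t * Lx**2 + C * t**2 * Lxx**2

def selected_scale_closed_form(x, C, w0=1.0):
    """Selected scale according to (64), with s = sin(w0 x), c = cos(w0 x)."""
    s2, c2 = math.sin(w0 * x) ** 2, math.cos(w0 * x) ** 2
    ratio = 2.0 * C * s2 / (c2 + math.sqrt(c2 * c2 + 4.0 * C * C * s2 * s2))
    return (1.0 + ratio) / (w0 * w0)

def selected_scale_numeric(x, C, w0=1.0, step=1e-3, t_max=5.0):
    """Brute-force maximization of QL over a regular grid of scales."""
    n = int(t_max / step)
    scales = [step * k for k in range(1, n + 1)]
    return max(scales, key=lambda t: quasi_quadrature(x, t, C, w0))

C = 2.0 / 3.0
for phase in (0.0, math.pi / 4.0, math.pi / 2.0):
    print(phase,
          selected_scale_closed_form(phase, C),
          selected_scale_numeric(phase, C))
```

The closed-form and grid-search values should agree to within the grid step at every phase, which gives some confidence in the reconstruction of (64).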
Figure 21. Spatial variation of the maximum value over scales of the quasi quadrature entity (61) computed for different values of the free
parameter C using a one-dimensional sine wave of unit input frequency as input pattern. As can be seen, the smallest spatial variations in the
amplitude of the maximum response are obtained for C = e/4.
natural symmetry requirement can be stated as

\[ t_{\mathcal{Q}L} \big|_{\omega_0 x = \frac{\pi}{4}} = \frac{1}{2} \left( t_{\mathcal{Q}L} \big|_{\omega_0 x = 0} + t_{\mathcal{Q}L} \big|_{\omega_0 x = \frac{\pi}{2}} \right) \quad \Rightarrow \quad C = \frac{2}{3} \approx 0.6667. \tag{68} \]

In this respect, C = 2/3 gives the most symmetric variation of selected scale levels with respect to the information contents in the first-order and second-order derivatives (footnote 11).

Another interesting factor to analyse is the variation in magnitude at the selected scales. Insertion of the scale values according to (64) into the quasi quadrature measure (61) gives spatial variations as displayed in Fig. 21. To determine C, a simple minimum-ripple condition is to require that

\[ \mathcal{Q}L \big|_{\omega_0 x = 0,\; t = t_{\mathcal{Q},0}} = \mathcal{Q}L \big|_{\omega_0 x = \frac{\pi}{2},\; t = t_{\mathcal{Q},\frac{\pi}{2}}} \quad \Rightarrow \quad C = \frac{e}{4} \approx 0.6796. \tag{69} \]

In other words, the determination of C based on small spatial variations in the magnitude measure computed at the selected scales gives rise to an approximately similar value of C as the abovementioned symmetry requirement.

This choice of C also corresponds to normalizing the Gaussian derivative operators of first and second order to having the same L1-norm (compare with the explicit expressions in (74) and (75)).

Experimental Results. Figure 22 gives a three-dimensional illustration of the result of applying this operation to the perspective image of a sine wave pattern with large size variations. The results are shown as a surface plot of the magnitude of QL computed at different positions and scales along a vertical cross-section of the image. Moreover, the position of the first local maximum over scales has been indicated at each spatial point. Observe how the size variations in the vertical direction are captured and that the spatial variations in QL at the selected scales are minor
Figure 22. Dense scale selection by maximizing the quasi quadrature measure (63) over scales: (left) Original grey-level image. (right) The
variations over scales of the quasi quadrature measure QL computed along a vertical cross-section through the center of the image. The result is
visualized as a surface plot showing the variations over scale of the quasi quadrature measure as well as the position of the first local maximum
over scales.
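The two conditions on the free parameter C, (68) and (69), can be verified numerically under a sine wave assumption. The sketch uses a unit-frequency input, for which the quasi quadrature entity (61) evaluates to QL = t e^{−t} (cos² x + C t sin² x); this closed form is derived here for illustration rather than quoted from the paper, and the bisection helper and all names are hypothetical.

```python
import math

def t_sel(x, C):
    """Selected scale (64) for unit frequency, s = sin(x), c = cos(x)."""
    s2, c2 = math.sin(x) ** 2, math.cos(x) ** 2
    return 1.0 + 2.0 * C * s2 / (c2 + math.sqrt(c2 * c2 + 4.0 * C * C * s2 * s2))

def QL(x, t, C):
    """Quasi quadrature response to sin(x) at scale t (assumed closed form)."""
    return t * math.exp(-t) * (math.cos(x) ** 2 + C * t * math.sin(x) ** 2)

def bisect(f, lo, hi, iters=80):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def symmetry_gap(C):
    """Left side minus right side of the symmetry requirement (68)."""
    return t_sel(math.pi / 4, C) - 0.5 * (t_sel(0.0, C) + t_sel(math.pi / 2, C))

def ripple_gap(C):
    """Difference of the two responses equated in (69)."""
    return QL(0.0, t_sel(0.0, C), C) - QL(math.pi / 2, t_sel(math.pi / 2, C), C)

C_sym = bisect(symmetry_gap, 0.1, 2.0)
C_ripple = bisect(ripple_gap, 0.1, 2.0)
print(C_sym, 2.0 / 3.0)
print(C_ripple, math.e / 4.0)
```

Solving the two conditions independently makes it easy to see how close the symmetry-based and ripple-based values of C are to each other.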
9.2. Interpretation of γ-Normalized Derivatives in Terms of Self-similar Fourier Spectrum

and the following class of energy measures concerning the amount of information in the mth order γ-normalized Gaussian derivatives

\[ E_m = \int_{x \in \mathbb{R}^D} \sum_{|\alpha| = m} t^{m\gamma} |L_{x^\alpha}|^2 \, dx. \tag{79} \]

where α represents multi-index notation. These differential energy measures are related to mth order spectral moments by

\[ E_m = \frac{t^{m\gamma}}{(2\pi)^D} \int_{\omega \in \mathbb{R}^D} |\omega|^{2m} \, |\hat{L}(\omega; t)|^2 \, d\omega. \tag{80} \]

Specifically, for derivatives up to order three in the two-dimensional case, this class of energy measures includes the following descriptors

\[ E_0 = \int_{x \in \mathbb{R}^2} L(x; t)^2 \, dx, \tag{81} \]

\[ E_1 = \int_{x \in \mathbb{R}^2} t^{\gamma} \left( L_{\xi}^2 + L_{\eta}^2 \right) dx, \tag{82} \]

\[ E_2 = \int_{x \in \mathbb{R}^2} t^{2\gamma} \left( L_{\xi\xi}^2 + 2 L_{\xi\eta}^2 + L_{\eta\eta}^2 \right) dx, \tag{83} \]

\[ E_3 = \int_{x \in \mathbb{R}^2} t^{3\gamma} \left( L_{\xi\xi\xi}^2 + 3 L_{\xi\xi\eta}^2 + 3 L_{\xi\eta\eta}^2 + L_{\eta\eta\eta}^2 \right) dx. \tag{84} \]

It is rather straightforward to show (see Appendix A.3) that the variation over scales of these γ-normalized energy measures is given by

\[ E_m(\cdot; t) \sim t^{\beta - D/2 - m(1 - \gamma)}. \tag{85} \]

This expression is scale independent if and only if

\[ \beta = \frac{D}{2} + m(1 - \gamma). \tag{86} \]

In other words, the normalized derivative model is neutral with respect to power spectra of the form

\[ S(\omega) \sim |\omega|^{-D}, \tag{88} \]

where D is the dimension of the signal, can be easily derived (footnote 12) directly from the assumption that the power spectrum should contain the same amount of energy for all frequencies.

9.3. Relations to Previous Work

L1-normalized Gaussian derivative kernels of first order have been used, for example, in edge detection and edge classification by Korn (1988), Mallat and Zhong (1992), and Zhang and Bergholm (1993), and in pyramids by Crowley and Parker (1984). More generally, evolution properties across scales of wavelet transforms have been used by Mallat and Hwang (1992) for characterizing local Lipschitz exponents of singularities. Mallat and Hwang (1992) also proposed the notion of “general maxima” of wavelet transforms for estimating the frequency of local oscillations. This idea is closely related to the notion of scale-space maximum considered here and to the scale selection mechanism in (Lindeberg, 1991, 1993a) based on local maxima over scales of blob responses computed along extremum paths in scale-space. There is also a connection to the “top point” representation proposed by Johansen et al. (1986) in the sense that the points in scale-space at which bifurcations occur serve to delimit extremum paths with different topology. A main difference between the scale selection mechanism suggested here and the work in (Lindeberg, 1991; Mallat and Hwang, 1992) is, however, that here it is shown how these notions can be applied to large classes of non-linear differential invariants computed in a scale-space representation. Moreover, feature detection algorithms have been formulated with integrated scale selection mechanisms and it has been shown how different derivative normalization approaches lead to different classes of
differential expressions for which the scale selection mechanism commutes with rescalings of the input pattern. Specifically, it has been shown how L1-normalization is special in terms of scale invariance properties (footnote 13).

10. Summary and Discussion

We have argued that the subject of scale selection is essential to many problems in computer vision and automated image analysis. Specifically, we have outlined how the evolution properties over scales of normalized Gaussian derivatives provide important cues in this context, for generating hypotheses about interesting scales (and associated spatial points or regions) for further analysis. A general scale selection principle has been presented stating that in the absence of other evidence, coarse estimates of the size of image structures can be computed from the scales at which normalized differential geometric descriptors assume maxima over scales. In particular, it has been suggested that this approach can be used for adaptively choosing the scales for feature detection.

Support for the overall approach has been provided in terms of a theoretical analysis of the general scaling property of local maxima over scales in the scale-space signature, and by a detailed analysis of the behaviour of the scale selection method when integrated with feature detection algorithms and applied to characteristic model patterns; see Table 3 for an overview. The main support of the methodology is, however, experimental; it has been demonstrated that intuitively reasonable and

detection algorithms, where features are first detected at locally adapted coarse scales, and then localized to finer scales in a second processing stage. Whereas the general advantages of such a two-stage approach to feature detection are well-known in the literature, a major contribution here is that explicit mechanisms are provided for automatic selection of the detection scales as well as the localization scales. Moreover, these processing stages are integrated into algorithms which are essentially free from other tuning parameters than the number of features of interest.

Of course, the task of selecting “the best scale” for handling real-world image data (about which usually no or very little a priori information is available) is intractable if treated as a pure mathematical problem. Therefore, the proposed heuristic principle should not be interpreted as any “optimal solution,” but rather as a systematic method for generating initial hypotheses in situations where no or very little information is available about what can be expected to be in the scene.

10.1. Technical Contributions

At a technically more detailed level some of the main contributions are that:

• It is emphasized how the evolution properties over scales of normalized scale-space derivatives differ from those of traditional spatial derivatives. Whereas the magnitude of a traditional scale-space derivative always decreases with scale, peaks over scales can be expected in the scale-signatures of normalized derivatives computed from data containing
quantitatively accurate results can be obtained by ap- dominant information at certain scales.
plying the proposed scheme to the problems of blob de- • A general heuristic principle for scale selection has
tection, junction detection and frequency estimation.14 been proposed stating that extrema over scales in
For a problem such as junction detection, the the signature of normalized differential entities are
methodology naturally gives rise to two-stage feature useful in the stage of detecting image features. In
Table 3. Measures of feature strength and normalization parameters used for different types of feature detectors with automatic scale selection (including results from a companion paper (Lindeberg, 1996b, 1996c)). For each feature detector, a preferred γ-value is specified as well as the p-value for which the $L_p$-norm of the Gaussian derivatives is constant over scales (73) and the β-value for which the energy of a self-similar Fourier spectrum is constant over scales (86).

Feature type    Differential entity for scale selection    γ-value    $L_p$-norm    Fourier β
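The two derived columns of Table 3 follow from closed-form relations, so they can be evaluated for any choice of γ. The sketch below is only an illustration: it assumes the $L_p$ constancy condition in the form $m(\gamma - 1) + D(1/p - 1) = 0$ given in Appendix A.2 and the relation $\beta = D/2 + m(1 - \gamma)$ from (86). The function names and the sample values of γ, m and D are hypothetical and are not taken from the table.

```python
from fractions import Fraction


def constant_norm_p(gamma, m, D):
    """p for which the L_p-norm of the gamma-normalized kernel is constant.

    Solves m*(gamma - 1) + D*(1/p - 1) = 0 for p (assumed form of the
    constancy condition; m is the total order of differentiation).
    """
    return 1 / (1 + Fraction(m) * (1 - gamma) / D)


def scale_invariant_beta(gamma, m, D):
    """beta for which the energy of a self-similar spectrum is constant, eq. (86)."""
    return Fraction(D, 2) + m * (1 - gamma)


# Hypothetical example: second-order derivatives (m = 2) of a 2-D image (D = 2).
for gamma in (Fraction(1), Fraction(3, 4)):
    p = constant_norm_p(gamma, 2, 2)
    beta = scale_invariant_beta(gamma, 2, 2)
    print(f"gamma={gamma}: p={p}, beta={beta}")
```

Exact rational arithmetic is used so that the special case γ = 1 gives exactly p = 1, independent of m and D.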
Acknowledgments

This work was partially performed under the ESPRIT-BRA project INSIGHT and the ESPRIT-NSF collaboration DIFFUSION. The support from the Swedish Research Council for Engineering Sciences, TFR, is gratefully acknowledged.

The three-dimensional illustrations in Figs. 5 and 11 have been generated with the kind assistance of Pascal Grostabussiat, and the illustration in Fig. 22 is from a collaboration work with Andrés Almansa.

$$\partial_{\xi^m} = \varphi(t)\,\partial_{x^m} \tag{2}$$

where $\varphi : R_+ \to R$ is a smooth function.

• To allow for scale selection based on local maxima over scales, the normalized derivative concept must preserve local maxima over scales. Specifically, if a normalized differential entity assumes a local maximum over scales at a certain point $(x_0, t_0)$ in scale-space, then after a rescaling of the input signal by a factor of s, the maximum over scales should be assumed at $(s x_0, s^2 t_0)$.
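The requirement that a maximum over scales at $t_0$ moves to $s^2 t_0$ under a rescaling by s can be checked numerically on a simple model signal. The sketch below assumes a one-dimensional Gaussian blob $f = g(\cdot; t_0)$, for which $L(\cdot; t) = g(\cdot; t_0 + t)$ and hence $|t^{\gamma} L_{xx}(0; t)| = t^{\gamma} / (\sqrt{2\pi}\,(t_0 + t)^{3/2})$. Stretching the blob by a factor s gives $s\,g(\cdot; s^2 t_0)$. The function names, the grid search and the parameter values are illustrative assumptions, not the implementation used in the paper.

```python
import math


def normalized_response(t, t0, gamma, amplitude=1.0):
    """|t^gamma * L_xx(0; t)| for the 1-D blob f = amplitude * g(.; t0)."""
    return amplitude * t ** gamma / (math.sqrt(2.0 * math.pi) * (t0 + t) ** 1.5)


def selected_scale(t0, gamma, amplitude=1.0, t_min=1e-2, t_max=1e4, n=20001):
    """Scale at which the normalized response is largest (dense geometric grid)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    scales = [t_min * ratio ** k for k in range(n)]
    return max(scales, key=lambda t: normalized_response(t, t0, gamma, amplitude))


s = 3.0
t_original = selected_scale(t0=4.0, gamma=1.0)
# Rescaled blob: variance s^2 * t0 and amplitude s.
t_rescaled = selected_scale(t0=s * s * 4.0, gamma=1.0, amplitude=s)
print(t_original, t_rescaled, t_rescaled / t_original)  # ratio close to s^2
```

For this model the maximum can also be found analytically at $t = 2\gamma t_0 / (3 - 2\gamma)$, which is proportional to $t_0$ and therefore scales with $s^2$.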
Given these requirements, it follows that the normalization must be of the form

$$L_{\xi^m} = t^{\gamma/2} L_{x^m}, \tag{4}$$

$$L'_{\xi'^m} = t'^{\gamma/2} L'_{x'^m}. \tag{5}$$

From (17) we have that these derivatives at corresponding points $(x'; t') = (sx; s^2 t)$ are related by

$$L_{\xi^m}(x; t) = s^m \frac{\varphi(t)}{\varphi(s^2 t)} L'_{\xi'^m}(sx; s^2 t). \tag{6}$$

If a local maximum over scales is to be preserved, we must require that

$$\partial_t (L_{\xi^m}(x; t)) = 0 \Leftrightarrow \partial_{t'} (L'_{\xi'^m}(x'; t')) = 0. \tag{7}$$

Differentiating (6) with respect to t gives

$$\partial_t (L_{\xi^m}(x; t)) = s^m \left( \partial_t \left( \frac{\varphi(t)}{\varphi(s^2 t)} \right) + \frac{\varphi(t)}{\varphi(s^2 t)}\, \partial_t \right) \left( L'_{\xi'^m}(sx; s^2 t) \right) \tag{8}$$

and application of (7) results in the necessary require-

for some other (arbitrary) function h. If we differentiate with respect to u and v we find that $\psi'$ (and $h'$) must be constant. In other words, $\psi(u) = C_1 u + C_2$ and

$$\varphi(t) = e^{C_1 \log t + C_2} = A t^B. \tag{15}$$

A.2. $L_p$-Normalization Interpretation of γ-Normalized Derivatives

In this appendix it is shown how the γ-normalized derivative concept can be interpreted in terms of $L_p$-normalization of the Gaussian derivative kernels over scales. The $L_p$-norm of the mth order γ-normalized Gaussian derivative kernel is

$$\| g_{\xi^m}(\cdot; t) \|_p^p = \int_{x=-\infty}^{\infty} \left| t^{m\gamma/2} g_{x^m}(x; t) \right|^p dx. \tag{16}$$

From the well-known relation between the derivatives of the unnormalized Gaussian kernel and the Hermite polynomials $H_n$, $\partial_{x^m}(e^{-x^2}) = (-1)^m H_m(x)\, e^{-x^2}$
In other words, the $L_p$-norm of the Gaussian derivative kernel at scale level t is related to the $L_p$-norm at unit scale by

$$\| g_{\xi^m}(\cdot; t) \|_p = \sqrt{t}^{\,m(\gamma - 1) - 1 + 1/p}\, \| g_{\xi^m}(\cdot; 1) \|_p. \tag{19}$$

Concerning $L_p$-norms of Gaussian derivatives in higher dimensions, we can make use of the separability of the normalized Gaussian derivative to separate the D-dimensional integral in the $L_p$-norm definition into a product of one-dimensional integrals of the form (18). Hence, if we let $|m| = m_1 + \cdots + m_D$ denote the total order of differentiation, it follows that the variation over scales of the $L_p$-norm of the D-dimensional normalized Gaussian derivative kernel will be of the form

$$\| g_{\xi^m}(\cdot; t) \|_p = \sqrt{t}^{\,|m|(\gamma - 1) + D(1/p - 1)}\, \| g_{\xi^m}(\cdot; 1) \|_p. \tag{20}$$

In other words, the $L_p$-norm of this kernel is constant over scales if and only if

$$m(\gamma - 1) + D(1/p - 1) = 0. \tag{21}$$

Specifically, p is independent of m if and only if γ = 1.

A.3. Normalized Derivative Responses to Self-similar Power Spectra

In this appendix, a closed-form expression is derived for the variation over scales of the following class of energy measures

$$E_m = \int_{x \in R^D} \sum_{|\alpha| = m} t^{m\gamma} |L_{x^\alpha}|^2 \, dx \tag{22}$$

when computed at different scales t in the scale-space representation L of a D-dimensional signal f with a self-similar power spectrum of the form

$$S_f(\omega_1, \ldots, \omega_D) = (\hat{f} \hat{f}^*)(\omega) = |\omega|^{-2\beta} = \left( \omega_1^2 + \cdots + \omega_D^2 \right)^{-\beta}. \tag{23}$$

Using Plancherel's relation

$$\int_{\omega \in R^D} \hat{h}_1(\omega)\, \hat{h}_2^*(\omega) \, d\omega = (2\pi)^D \int_{x \in R^D} h_1(x)\, h_2^*(x) \, dx, \tag{24}$$

where $\hat{h}_i(\omega)$ is the Fourier transform of $h_i(x)$, and by letting $h_1 = h_2 = L_{x^\alpha} = L_{x_1^{\alpha_1} \cdots x_D^{\alpha_D}}$, we obtain

$$\int_{x \in R^D} L^2_{x_1^{\alpha_1} \cdots x_D^{\alpha_D}}(\cdot; t) \, dx = \frac{1}{(2\pi)^D} \int_{\omega \in R^D} \omega_1^{2\alpha_1} \cdots \omega_D^{2\alpha_D}\, \hat{g}^2(\omega; t)\, |\omega|^{-2\beta} \, d\omega \tag{25}$$

where $\hat{g}$ denotes the Fourier transform of the Gaussian kernel. By adding (25) over all possible multi-indices α with $|\alpha| = \alpha_1 + \cdots + \alpha_D = m$, and by using the definition (79) we obtain the rotationally invariant descriptor

$$E_m = \frac{t^{m\gamma}}{(2\pi)^D} \int_{\omega \in R^D} |\omega|^{2m}\, \hat{g}^2(\omega; t)\, |\omega|^{-2\beta} \, d\omega. \tag{26}$$

Let us next introduce the D-dimensional correspondence to spherical coordinates, $(\rho; \varphi_1, \ldots, \varphi_{D-1})$, with the volumetric element $d\omega = C_D\, \rho^{D-1}\, d\rho$

$$E_m = \frac{C_D\, t^{m\gamma}}{(2\pi)^D} \int_{\rho = 0}^{\infty} \rho^{2m - 2\beta + D - 1}\, e^{-\rho^2 t} \, d\rho. \tag{27}$$

Then, using $\int_0^\infty x^m e^{-a x^2} dx = \frac{\Gamma((m+1)/2)}{2 a^{(m+1)/2}}$, we get

$$E_m = \frac{C_D\, t^{m\gamma}}{(2\pi)^D}\, \frac{\Gamma(m - \beta + D/2)}{2\, t^{m - \beta + D/2}}, \tag{28}$$

and the variation over scales is of the form

$$E_m(t) \sim t^{\beta - D/2 - m(1 - \gamma)}. \tag{29}$$

A.4. Discrete Implementation of the Scale Selection Mechanisms

Discretizing the normalized derivative operators leads to two discretization problems: (i) how to discretize the scale-space derivatives such that the scale-space properties are preserved, and (ii) how to discretize the normalization factor.

A.4.1. Computing Discrete Derivative Approximations. The first problem can be solved by using the scale-space concept for discrete signals (Lindeberg, 1994a), given by $L(\cdot, \cdot; t) = T(\cdot, \cdot; t) * f(\cdot, \cdot)$, where $T(m, n; t) = T_1(m; t)\, T_1(n; t)$ and $T_1(m; t) = e^{-t} I_m(t)$ is the one-dimensional discrete analogue of the Gaussian kernel defined from the modified Bessel functions $I_n$.
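As an illustration of the kernel $T_1(n; t) = e^{-t} I_n(t)$, the following sketch evaluates it with the standard power series $I_n(t) = \sum_k (t/2)^{2k+n} / (k!\,(k+n)!)$, summed in log-space so that the factor $e^{-t}$ does not underflow separately. This is a minimal pure-Python sketch; the function names, the truncation rule for the series and the kernel radius are assumptions made for the example, and a library routine for exponentially scaled Bessel functions could be used instead.

```python
import math


def discrete_gaussian(n, t):
    """T_1(n; t) = exp(-t) * I_n(t) for integer n and scale t > 0."""
    n = abs(n)  # I_{-n}(t) = I_n(t) for integer order
    log_half_t = math.log(t / 2.0)
    total = 0.0
    # Terms decay rapidly once k exceeds about t/2; this bound is generous.
    for k in range(int(2 * t) + n + 50):
        log_term = ((2 * k + n) * log_half_t
                    - math.lgamma(k + 1) - math.lgamma(k + n + 1) - t)
        total += math.exp(log_term)
    return total


def discrete_gaussian_kernel(t, radius):
    """Samples T_1(n; t) for n = -radius, ..., radius."""
    return [discrete_gaussian(n, t) for n in range(-radius, radius + 1)]


kernel = discrete_gaussian_kernel(t=2.0, radius=20)
mass = sum(kernel)
variance = sum(n * n * w for n, w in zip(range(-20, 21), kernel))
print(mass, variance)  # close to 1 and to t, respectively
```

The kernel sums to one and has variance t, which gives two quick sanity checks on a truncated kernel before it is used for separable smoothing.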
The scale-space properties of L transfer to any discrete derivative approximations $L_{x_1^i x_2^j}$ defined as the result of applying difference operators $\delta_{x_1^i x_2^j}$ to L. In the implementations presented here, the first-order and second-order derivatives are approximated by the operators $(\delta_x L)(x; t) = \frac{1}{2}(L(x + 1; t) - L(x - 1; t))$ and $(\delta_{xx} L)(x; t) = L(x + 1; t) - 2L(x; t) + L(x - 1; t)$, respectively.

A.4.2. Normalization in the Discrete Case. In view of the results presented in Section 9.1, it is natural to normalize the discrete derivative approximation kernels $\delta_{x_1^i x_2^j} T$ such that their discrete $l_1$-norms will be constant over scales. Of course, it is not necessary to construct the normalized derivative approximation kernels explicitly. Concerning e.g. first-order derivatives, discrete approximations to $L_{x_1}$ and $L_{x_2}$ can first be computed according to Section A.4.1. Then, the result can be multiplied by the discrete normalization factor

$$\alpha_1(t) = \frac{\sqrt{2}}{\sqrt{\pi}\, (T_1(0; t) + T_1(1; t))}, \tag{30}$$

which has been determined such that the discrete $l_1$-norm of the discrete derivative approximation kernel is equal to the continuous $L_1$-norm of the Gaussian kernel of the same order

$$\sum_{x=-\infty}^{0} \alpha_1(t)\, (\delta_x T)(x; t) = \int_{x=-\infty}^{0} \sqrt{t}\, (\partial_x g)(x; t) \, dx = \frac{1}{\sqrt{2\pi}}. \tag{31}$$

Using an asymptotic expression for the modified Bessel functions of integer order (Abramowitz and Stegun, 1964) $I_n(t) = \frac{e^t}{\sqrt{2\pi t}} \left( 1 - \frac{4n^2 - 1}{8t} + O\!\left(\frac{1}{t^2}\right) \right)$, it can be ver-

Normalization factors $\alpha_m(t)$ for discrete approximations $(\delta_{x^m} T)(x; t)$ to higher order Gaussian derivative operators $(\partial_{x^m} g)(x; t)$ can be determined in an analogous way, such that the discrete $l_1$-norm is equal to the continuous $L_1$-norm

$$\alpha_m(t) \sum_{x=-\infty}^{\infty} |(\delta_{x^m} T)(x; t)| = t^{m/2} \int_{x=-\infty}^{\infty} |(\partial_{x^m} g)(x; t)| \, dx. \tag{32}$$

A.4.3. Detection of Scale-Space Maxima in Discrete Data. Given the abovementioned discretization methods for computing normalized differential descriptors based on the local N-jet representation, it is straightforward to express algorithms for detecting scale-space maxima. In summary, the implementations underlying this work have been performed as follows:

1. Given a discrete image f (here: of size between 128 × 128 and 256 × 256 pixels), select a scale range for the analysis (here: $t_{min} = 2$ and $t_{max} = 256$). Within this range, distribute a set of scale levels $t_k$ (here: 20 or 40 levels) such that the ratio between successive scale levels $t_{k+1}/t_k$ is approximately constant (Note 15).
2. For each scale $t_k$, compute the scale-space representation of f according to Section A.4.1. Then, for each point at each scale, compute discrete derivative approximations according to Section A.4.1 and normalize them according to Section A.4.2. Finally, compute the normalized differential expression by pointwise combination of these entities.
3. In the three-dimensional volume so generated, detect local maxima (as points whose values are greater than or equal to the values of their 26 dis-
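Steps 1 and 3 of the procedure above can be sketched in a few lines. The example below assumes an exactly geometric distribution of scale levels (the paper states only that the ratio is approximately constant and refers to an effective-scale construction in Note 15) and uses the 26-neighbour comparison with "greater than or equal" as stated in step 3 and Note 16. The function names are hypothetical, the volume is a plain nested list indexed as `volume[k][y][x]`, and boundary points are skipped, which is an assumption of this sketch.

```python
def geometric_scale_levels(t_min, t_max, n_levels):
    """Scale levels t_k with a constant ratio t_{k+1} / t_k between neighbours."""
    ratio = (t_max / t_min) ** (1.0 / (n_levels - 1))
    return [t_min * ratio ** k for k in range(n_levels)]


def scale_space_maxima(volume):
    """Interior points (k, y, x) whose value is >= all 26 neighbours.

    volume[k][y][x] holds the normalized differential expression at scale
    index k and pixel (x, y). Points on the border of the volume are skipped.
    """
    n_scales, height, width = len(volume), len(volume[0]), len(volume[0][0])
    offsets = [(dk, dy, dx)
               for dk in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dk, dy, dx) != (0, 0, 0)]
    maxima = []
    for k in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                value = volume[k][y][x]
                if all(value >= volume[k + dk][y + dy][x + dx]
                       for dk, dy, dx in offsets):
                    maxima.append((k, y, x))
    return maxima


levels = geometric_scale_levels(2.0, 256.0, 8)
volume = [[[0.0] * 5 for _ in range(5)] for _ in range(5)]
volume[2][2][2] = 1.0  # single synthetic peak
print(levels)
print(scale_space_maxima(volume))
```

Because the comparison is non-strict, every point of a constant plateau qualifies as a maximum; a practical implementation may want to add a strength threshold or keep only one representative per plateau.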
uniformly distributed between a lower scale (here: 0.01) and the detection scale $t_0$, vary the local scale at which derivatives are computed and select the local scale that minimizes the normalized residual (60) over scales.
• At this scale, the new localization estimate is $\hat{x} = A^{-1} b$.
• Iterate the abovementioned steps until either the increment is sufficiently small (here: within the same pixel) or an upper bound (here: 3 iterations) has been reached.

Suppress all points for which the scheme diverges (here: when the total update is larger than the detection scale measured in dimension [length]).

Notes

1. An analysis of scale-space like responses to sine waves corresponding to the case when γ = 1 in this section has also been performed in wavelet analysis by (Mallat and Hwang, 1992); see Section 9.3.
2. To avoid the sensitivity to sign of these entities, and hence the polarity of the signal, trace $H_{norm} L$ and det $H_{norm} L$ have been squared before presentation.
3. In the graphs in Fig. 2 the scale parameter (on the horizontal axis) is measured in terms of effective scale, τ. For continuous signals, this parameter is essentially the logarithm of the scale parameter, $\tau = C_1 \log t + C_2$ for some $C_1, C_2 > 0$. To avoid the singularity at zero scale, however, all experiments are based on an effective scale concept especially developed for discrete signals and defined such that $\tau \sim \log t$ at coarse scales and $\tau \sim t$ at fine scales; see (Lindeberg, 1994a).
4. When detecting scale-space maxima in practice, there is, of course, no need to explicitly track the extrema along the extremum path in scale-space. It is sufficient to detect three-dimensional maxima over space and scale (as described in more detail in Section A.4.3).
5. Further extensions of this idea are also explored in (Lindeberg, 1996b), where differential definitions of edges and ridges are expressed in such a way that scale selection constitutes an integrated part of the feature definition.
6. A more general approach for defining the support region of the junction feature is described in Section 7.7.
7. The difference in positions in the last row in Fig. 14 is mainly due to the fact that for the purpose of numerical evaluation, the reference position of the corner is defined as the position in the ideal corner image before the smoothing operation.
8. Besides the general descriptions given in previous sections, further details concerning algorithms and discrete implementation can be found in Appendix A.1.4 and in (Lindeberg, 1994b).
9. An integrated vision system for analysing junctions by actively zooming in to interesting structures is presented in (Brunnström et al., 1992; Lindeberg, 1993a).
10. Since $QL$ is an inhomogeneous differential expression, γ = 1 is a necessary requirement for the scale selection procedure to commute with size variations in the input pattern (see Section 4.1).
11. If we redefine the quasi quadrature measure as $QL = (1 - \alpha) L_\xi^2 + \alpha L_{\xi\xi}^2$, then C = 2/3 corresponds to the relative weights 1 − α = 3/5 and α = 2/5.
12. To derive the self-similar power spectrum, consider a D-dimensional signal with power spectrum $S(\omega)$, and parameterize the D-dimensional frequency space using the D-dimensional correspondence to spherical coordinates, $(r; \varphi_1, \ldots, \varphi_{D-1})$, where $r = |\omega| \in [0, \infty]$ and $\varphi_1, \ldots, \varphi_{D-1}$ are suitably selected angles in some domain Ω. To analyse the energy contribution from each range of frequencies, consider a volume element $dV$ defined by $r_0 \le |\omega| \le r_0 (1 + d\rho)$ for some $r_0$. Since the area of a D-dimensional hypersphere of radius $r_0$ is proportional to $r_0^{D-1}$, the volume of a scale-invariant element can be written $dV = C_D\, r_0^{D-1}\, r_0\, d\rho$ for some constant $C_D$. If we want the signal to contain the same amount of energy for all frequencies (where the total energy is measured by $E = \int_{\omega \in R^D} S(\omega)\, d\omega$), it follows by necessity that $dE = \int_{dV} \left( \int_{\varphi \in \Omega} S(r; \varphi)\, d\varphi \right) dV$ must be independent of ω, which in turn implies that $\left( \int_{\varphi \in \Omega} S(r; \varphi)\, d\varphi \right) dV$ must be proportional to $d\rho$, and the power spectrum must be of the form $S(\omega) \sim |\omega|^{-D}$.
13. Applications of scale selection based on $L_p$-normalization with p < 1 are developed in more detail in (Lindeberg, 1996b).
14. See (Lindeberg, 1996b) concerning scale selection mechanisms for detecting edges and ridges.
15. Specifically, the scale levels have been determined such that the difference $\tau_{k+1} - \tau_k$ in effective scale between adjacent scales $t_{k+1}$ and $t_k$ is constant (see Appendix A.4.4).
16. In other words, a point $(x_0, y_0, t_{k_0})$ is regarded as a discrete scale-space maximum of a normalized differential entity $D_{norm} L$ if and only if $(D_{norm} L)(x_0, y_0, t_{k_0}) \ge (D_{norm} L)(x_0 + i, y_0 + j, t_{k_0 + k})$ holds for all 26 neighbouring points (i, j, k) ∈ {−1, 0, 1}.

References

Abramowitz, M. and Stegun, I.A. (Eds.). 1964. Handbook of Mathematical Functions. Applied Mathematics Series. National Bureau of Standards, 55th edition.
Almansa, A. and Lindeberg, T. 1996. Enhancement of fingerprint images by shape-adapted scale-space operators. In Gaussian Scale-Space Theory: Proc. Ph.D. School on Scale-Space Theory, Copenhagen, Denmark. J. Sporring, M. Nielsen, L. Florack, and P. Johansen (Eds.), Kluwer Academic Publishers.
Almansa, A. and Lindeberg, T. 1998. Fingerprint enhancement by shape adaptation of scale-space operators with automatic scale selection. Technical Report ISRN KTH/NA/P-98/03-SE, Dept. of Numerical Analysis and Computing Science, KTH, Stockholm, Sweden.
Babaud, J., Witkin, A.P., Baudin, M., and Duda, R.O. 1986. Uniqueness of the Gaussian kernel for scale-space filtering. IEEE Trans. Pattern Analysis and Machine Intell., 8(1):26–33.
Blom, J. 1992. Topological and geometrical aspects of image structure. Ph.D. Thesis. Dept. Med. Phys. Physics, Univ. Utrecht, NL-3508 Utrecht, Netherlands.
Blostein, D. and Ahuja, N. 1989. A multiscale region detector. Computer Vision, Graphics, and Image Processing, 45:22–41.
Bretzner, L. and Lindeberg, T. 1996. Feature tracking with automatic selection of spatial scales. Technical Report ISRN KTH/NA/P-96/21-SE, Dept. of Numerical Analysis and Computing Science,
KTH, Stockholm, Sweden. Revised version in Computer Vision and Image Understanding.
Bretzner, L. and Lindeberg, T. 1997. On the handling of spatial and temporal scales in feature tracking. In Proc. 1st Int. Conf. on Scale-Space Theory in Computer Vision, Utrecht, The Netherlands. ter Haar Romeny et al. (Ed.), LNCS, Springer Verlag: New York, Vol. 1252, pp. 128–139.
Bretzner, L. and Lindeberg, T. 1998. Feature tracking with automatic selection of spatial scales. Computer Vision and Image Understanding, 71(3):385–392.
Brunnström, K., Lindeberg, T., and Eklundh, J.-O. 1992. Active detection and classification of junctions by foveation with a head-eye system guided by the scale-space primal sketch. In Proc. 2nd European Conf. on Computer Vision, Santa Margherita Ligure, Italy. G. Sandini (Ed.), Vol. 588 of Lecture Notes in Computer Science, Springer-Verlag, pp. 701–709.
Burt, P.J. 1981. Fast filter transforms for image processing. Computer Vision, Graphics, and Image Processing, 16:20–51.
Crowley, J.L. 1981. A representation for visual information. Ph.D. Thesis, Carnegie-Mellon University, Robotics Institute, Pittsburgh, Pennsylvania.
Crowley, J.L. and Parker, A.C. 1984. A representation for shape based on peaks and ridges in the difference of low-pass transform. IEEE Trans. Pattern Analysis and Machine Intell., 6(2):156–170.
Deriche, R. and Giraudon, G. 1990. Accurate corner detection: An analytical study. In Proc. 3rd Int. Conf. on Computer Vision, Osaka, Japan, pp. 66–70.
Dreschler, L. and Nagel, H.-H. 1982. Volumetric model and 3D-trajectory of a moving car derived from monocular TV-frame sequences of a street scene. Computer Vision, Graphics, and Image Processing, 20(3):199–228.
Field, D.J. 1987. Relations between the statistics of natural images and the response properties of cortical cells. J. of the Optical Society of America, 4:2379–2394.
Florack, L.M.J. 1993. The syntactical structure of scalar images. Ph.D. Thesis. Dept. Med. Phys. Physics, Univ. Utrecht, NL-3508 Utrecht, Netherlands.
Florack, L.M.J., ter Haar Romeny, B.M., Koenderink, J.J., and Viergever, M.A. 1992. Scale and the differential structure of images. Image and Vision Computing, 10(6):376–388.
Florack, L.M.J., ter Haar Romeny, B.M., Koenderink, J.J., and Viergever, M.A. 1994. Linear scale-space. J. of Mathematical Imaging and Vision, 4(4):325–351.
Förstner, W.A. and Gülch, E. 1987. A fast operator for detection and precise location of distinct points, corners and centers of circular features. In Proc. Intercommission Workshop of the Int. Soc. for Photogrammetry and Remote Sensing, Interlaken, Switzerland.
Gårding, J. and Lindeberg, T. 1996. Direct computation of shape cues using scale-adapted spatial derivative operators. Int. J. of Computer Vision, 17(2):163–191.
ter Haar Romeny, B. (Ed.). 1994. Geometry-Driven Diffusion in Computer Vision. Kluwer Academic Publishers, Netherlands.
Johansen, P., Skelboe, S., Grue, K., and Andersen, J.D. 1986. Representing signals by their top points in scale-space. In Proc. 8th Int. Conf. on Pattern Recognition, Paris, France, pp. 215–217.
Kitchen, L. and Rosenfeld, A. 1982. Gray-level corner detection. Pattern Recognition Letters, 1(2):95–102.
Koenderink, J.J. 1984. The structure of images. Biological Cybernetics, 50:363–370.
Koenderink, J.J. and Richards, W. 1988. Two-dimensional curvature operators. J. of the Optical Society of America, 5:7:1136–1141.
Koenderink, J.J. and van Doorn, A.J. 1990. Receptive field families. Biological Cybernetics, 63:291–298.
Koenderink, J.J. and van Doorn, A.J. 1992. Generic neighborhood operators. IEEE Trans. Pattern Analysis and Machine Intell., 14(6):597–605.
Korn, A.F. 1988. Toward a symbolic representation of intensity changes in images. IEEE Trans. Pattern Analysis and Machine Intell., 10(5):610–625.
Lindeberg, T. 1990. Scale-space for discrete signals. IEEE Trans. Pattern Analysis and Machine Intell., 12(3):234–254.
Lindeberg, T. 1991. Discrete scale-space theory and the scale-space primal sketch. Ph.D. Thesis. ISRN KTH/NA/P-91/08-SE. Dept. of Numerical Analysis and Computing Science, KTH, Stockholm, Sweden.
Lindeberg, T. 1993. Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention. Int. J. of Computer Vision, 11(3):283–318.
Lindeberg, T. 1993. On scale selection for differential operators. In Proc. 8th Scandinavian Conf. on Image Analysis, Tromsø, Norway, K. Heia, K.A. Høgdra, B. Braathen (Eds.), pp. 857–866.
Lindeberg, T. 1994a. Scale-Space Theory in Computer Vision. Kluwer Academic Publishers: Netherlands.
Lindeberg, T. 1994b. Scale selection for differential operators. Technical Report ISRN KTH/NA/P-94/03-SE, Dept. of Numerical Analysis and Computing Science, KTH, Stockholm, Sweden.
Lindeberg, T. 1994c. On the axiomatic foundations of linear scale-space: Combining semi-group structure with causality vs. scale invariance. Technical Report ISRN KTH/NA/P-94/20-SE, Dept. of Numerical Analysis and Computing Science, KTH, Stockholm, Sweden. Revised version in J. Sporring, M. Nielsen, L. Florack, and P. Johansen (Eds.), Gaussian Scale-Space Theory: Proc. Ph.D. School on Scale-Space Theory, Copenhagen, Denmark, Kluwer Academic Publishers.
Lindeberg, T. 1994d. Junction detection with automatic selection of detection scales and localization scales. In Proc. 1st International Conference on Image Processing, Austin, Texas, Vol. 1, pp. 924–928.
Lindeberg, T. 1995a. Direct estimation of affine deformations of brightness patterns using visual front-end operators with automatic scale selection. In Proc. 5th International Conference on Computer Vision, Cambridge, MA, pp. 134–141.
Lindeberg, T. 1995b. On scale selection in subsampled (hybrid) multi-scale representations. Draft manuscript.
Lindeberg, T. 1996a. Feature detection with automatic scale selection. Technical Report ISRN KTH/NA/P-96/18-SE, Dept. of Numerical Analysis and Computing Science, KTH, Stockholm, Sweden.
Lindeberg, T. 1996b. Edge detection and ridge detection with automatic scale selection. Technical Report ISRN KTH/NA/P-96/06-SE, Dept. of Numerical Analysis and Computing Science, KTH, Stockholm, Sweden. Revised version in Intl. J. of Computer Vision, 30(2), 1998.
Lindeberg, T. 1996c. Edge detection and ridge detection with automatic scale selection. In Proc. IEEE Comp. Soc. Conf. on
Computer Vision and Pattern Recognition, 1996, San Francisco, California, pp. 465–470.
Lindeberg, T. 1996d. A scale selection principle for estimating image deformations. Technical Report ISRN KTH/NA/P-96/16-SE, Dept. of Numerical Analysis and Computing Science, KTH. Image and Vision Computing, in press.
Lindeberg, T. 1996e. On the axiomatic foundations of linear scale-space. In Gaussian Scale-Space Theory: Proc. Ph.D. School on Scale-Space Theory, Copenhagen, Denmark, J. Sporring, M. Nielsen, L. Florack, and P. Johansen (Eds.), Kluwer Academic Publishers.
Lindeberg, T. 1997. On automatic selection of temporal scales in time-causal scale-space. In Proc. AFPAC’97: Algebraic Frames for the Perception-Action Cycle, G. Sommer and J.J. Koenderink (Eds.), Vol. 1315 of Lecture Notes in Computer Science, Springer Verlag, Berlin, pp. 94–113.
Lindeberg, T. and Gårding, J. 1993. Shape from texture from a multi-scale perspective. In Proc. 4th Int. Conf. on Computer Vision, Berlin, Germany, H.-H. Nagel et al. (Eds.), pp. 683–691.
Lindeberg, T. and Gårding, J. 1997. Shape-adapted smoothing in estimation of 3D depth cues from affine distortions of local 2D structure. Image and Vision Computing, 15:415–434.
Lindeberg, T. and Li, M. 1995. Segmentation and classification of edges using minimum description length approximation and complementary junction cues. In Proc. 9th Scandinavian Conference on Image Analysis, Uppsala, Sweden, G. Borgefors (Ed.), pp. 767–776.
Lindeberg, T. and Li, M. 1997. Segmentation and classification of edges using minimum description length approximation and complementary junction cues. Computer Vision and Image Understanding, 67(1):88–98.
Lindeberg, T. and Olofsson, G. 1995. The aspect feature graph in recognition by parts. Draft manuscript.
Mallat, S.G. and Hwang, W.L. 1992. Singularity detection and processing with wavelets. IEEE Trans. Information Theory, 38(2):617–643.
Mallat, S.G. and Zhong, S. 1992. Characterization of signals from multi-scale edges. IEEE Trans. Pattern Analysis and Machine Intell., 14(7):710–723.
Marr, D. 1982. Vision. W.H. Freeman: New York.
Noble, J.A. 1988. Finding corners. Image and Vision Computing, 6(2):121–128.
Pauwels, E.J., Fiddelaers, P., Moons, T., and van Gool, L.J. 1995. An extended class of scale-invariant and recursive scale-space filters. IEEE Trans. Pattern Analysis and Machine Intell., 17(7):691–701.
Tsotsos, J.K., Culhane, S.M., Wai, W.Y.K., Lai, Y., Davis, N., and Nufflo, F. 1995. Modeling visual attention via selective tuning. Artificial Intelligence, 78:507–545.
Voorhees, H. and Poggio, T. 1987. Detecting textons and texture boundaries in natural images. In Proc. 1st Int. Conf. on Computer Vision, London, England.
Wiltschi, K., Pinz, A., and Lindeberg, T. 1997. Classification of carbide distributions using scale selection and directional distributions. In Proc. 4th International Conference on Image Processing, Santa Barbara, CA.
Witkin, A.P. 1983. Scale-space filtering. In Proc. 8th Int. Joint Conf. Art. Intell., Karlsruhe, West Germany, pp. 1019–1022.
Young, R.A. 1985. The Gaussian derivative theory of spatial vision: Analysis of cortical cell receptive field line-weighting profiles. Technical Report GMR-4920, Computer Science Department, General Motors Research Lab., Warren, Michigan.
Young, R.A. 1987. The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spatial Vision, 2:273–293.
Yuille, A.L. and Poggio, T.A. 1986. Scaling theorems for zero-crossings. IEEE Trans. Pattern Analysis and Machine Intell., 8:15–25.
Zhang, W. and Bergholm, F. 1993. An extension of Marr’s signature based edge classification and other methods for determination of diffuseness and height of edges, as well as line width. In Proc. 4th Int. Conf. on Computer Vision, Berlin, Germany, H.-H. Nagel et al. (Eds.), IEEE Computer Society Press, pp. 183–191.