



A new statistical approach for assessing compositional similarity based on incidence and abundance data

Article  in  Ecology Letters · January 2005


4 authors:

Anne Chao, National Tsing Hua University
Robin L Chazdon, University of Connecticut
Robert K Colwell, University of Connecticut
Tsung-Jen Shen, National Chung Hsing University


All content following this page was uploaded by Anne Chao on 10 September 2015.



Ecology Letters, (2005) 8: 148–159 doi: 10.1111/j.1461-0248.2004.00707.x

LETTER
A new statistical approach for assessing similarity
of species composition with incidence and
abundance data

Anne Chao,1 Robin L. Chazdon,2* Robert K. Colwell2 and Tsung-Jen Shen1
1 Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan
2 Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
*Correspondence: E-mail: chazdon@uconn.edu

Abstract
The classic Jaccard and Sørensen indices of compositional similarity (and other indices that depend upon the same variables) are notoriously sensitive to sample size, especially for assemblages with numerous rare species. Further, because these indices are based solely on presence–absence data, accurate estimators for them are unattainable. We provide a probabilistic derivation for the classic, incidence-based forms of these indices and extend this approach to formulate new Jaccard-type or Sørensen-type indices based on species abundance data. We then propose estimators for these indices that include the effect of unseen shared species, based on either (replicated) incidence- or abundance-based sample data. In sampling simulations, these new estimators prove to be considerably less biased than classic indices when a substantial proportion of species are missing from samples. Based on species-rich empirical datasets, we show how incorporating the effect of unseen shared species not only increases accuracy but also can change the interpretation of results.

Keywords
Abundance data, beta diversity, biodiversity, complementarity, incidence data, shared
species, similarity estimators, similarity index, species overlap, succession.


INTRODUCTION

Ecologists who conduct field surveys of species richness have long recognized that it is virtually impossible to detect all species and their relative abundances with a limited number or intensity of samples. Sampling limitations create challenges for making accurate estimates of alpha diversity, the number of species within local, approximately homogeneous assemblages, particularly for assemblages with high species richness and a large fraction of rare species (Colwell & Coddington 1994; Chazdon et al. 1998; Colwell et al. 2004; Magurran 2004). To meet this challenge, several methods have been developed for estimating species richness from sample data, either through extrapolation of species accumulation curves, or through application of non-parametric methods (see reviews by Bunge & Fitzpatrick 1993; Colwell & Coddington 1994; Magurran 2004; Chao, in press). The latter approach involves the estimation of unseen species (species that are likely to be present in a larger homogeneous sample of the assemblage, but that are missing from actual sample data). Because estimates of unseen species are based on the number of rare species observed within samples (Colwell & Coddington 1994; Chazdon et al. 1998), either abundance data or replicated incidence samples are required for richness estimation. In the simplest richness estimators (e.g. Chao1, Chao2, or jackknife estimators), rare species are classified as species with a total abundance of 1 (singletons) or 2 (doubletons) in an abundance-based sample or that occur in only one sampling unit (uniques) or in exactly two sampling units (duplicates) in replicated incidence data. The abundance-based coverage estimator (ACE) uses additional information based on those species with 10 or fewer individuals in the sample (Chao et al. 1993) and the corresponding incidence-based coverage estimator (ICE) is based on species found in 10 or fewer sampling units (Lee & Chao 1994; Chazdon et al. 1998; Magurran 2004).

The same limitations that apply to estimating the alpha diversity of species assemblages equally apply to estimating the beta diversity or dissimilarity (complementarity, turnover or distance) between two assemblages. The Jaccard index of similarity and the closely related Sørensen index are the two

© 2004 Blackwell Publishing Ltd/CNRS



oldest and most widely used similarity indices for assessing compositional similarity of assemblages (sometimes called ‘species overlap’) and hence, its complement, dissimilarity. Both measures are based on the presence/absence of species in paired assemblages and are simple to compute (Magurran 2004). Many other similarity indices exist that are based on the same information: the number of species shared by two samples and the number of species unique to each of them (Legendre & Legendre 1998), and new indices continue to appear (e.g. Lennon et al. 2001). A modified version of the Sørensen index was developed by Bray & Curtis (1957), based on abundance data (also known as the Sørensen abundance index; Magurran 2004), and a large number of other abundance-based indices have been developed (Legendre & Legendre 1998), including the widely applied Morisita–Horn index (Magurran 2004).

Despite their wide application in ecological studies, the classic Jaccard and Sørensen indices, when computed for sample data, perform poorly as measures of similarity between diverse assemblages that include a substantial fraction of rare species (Wolda 1981; Colwell & Coddington 1994; Plotkin & Muller-Landau 2002), because the sample data are (usually wrongly) assumed to be true and complete representations of assemblage composition. [Indeed, with very few exceptions (e.g. Grassle & Smith 1976; MacKenzie et al. 2004), nearly all existing approaches to measuring similarity make this assumption.] In general, as we will show with simulations, these measures are likely to severely underestimate true similarity between two (genuinely similar) assemblages that contain numerous rare species. Because many species are missed by the samples, the rare species that appear in one sample are likely to be different than the rare species that show up in the other sample, even if all are actually present in both assemblages. Similar problems arise from comparing two samples of substantially different size: simply because it contains fewer individuals or sampling units, the smaller sample may lack species that appear in the larger sample. In short, the underestimation of similarity occurs because of the failure to account for unseen shared species.

In principle, overestimation of similarity can also occur when comparing undersampled, high-dominance communities in which the common species are widespread and rare ones tend to be locally endemic. In this case, two samples might yield the same few common species, but fail to reveal rare species that would differentiate the assemblages in larger samples (Colwell & Coddington 1994; Ruokolainen & Tuomisto 2002 discuss a possible example). In nearly all cases we have examined quantitatively, however, rarity (either in nature or because of small sample size) increases the chance that a species will be spuriously absent from one sample but not from the other, thus negatively biasing similarity indices. [Fisher (1999, Fig. 8) comes to the same conclusion for several datasets, based on rarefaction tests.] Moreover, for the new indices we present here, it can be shown theoretically that sampling bias, when present, is always negative. [The authors demonstrate the expected negative bias mathematically (A. Chao, R. L. Chazdon, R. K. Colwell & T.-J. Shen, unpublished data); it can be proved for any abundance models given in Magurran (2004) and Plotkin & Muller-Landau (2002).]

Recently, interest has intensified in the development and evaluation of indices to measure beta diversity, or turnover rate, of species assemblages (Duivenvoorden 1995; Lennon et al. 2001; Arita & Rodríguez 2002, 2004; Condit et al. 2002; Plotkin & Muller-Landau 2002; Koleff et al. 2003; Rodríguez & Arita 2004), underscoring the need for robust statistical estimators for inferring compositional similarity from sample data. Increasing species turnover (decreasing similarity) with increasing distance between sites may reflect spatial patterns of dispersal or may be driven by increasing environmental heterogeneity at greater scales (Harte et al. 1999; Hubbell 2001; Balvanera et al. 2002; Chave & Leigh 2002; Condit et al. 2002; Duivenvoorden et al. 2002; Ruokolainen & Tuomisto 2002; Rodríguez & Arita 2004; Valencia et al. 2004). Unfortunately, most indices of beta diversity rely on the same information as the classic Jaccard and Sørensen indices and share the limitations discussed above.

With this problem in mind, Plotkin & Muller-Landau (2002) developed a Sørensen-type similarity index for abundance counts using a ‘parametric’ approach that relies on a gamma distribution to characterize species abundance structure. Condit et al. (2002) adopt an approach to measuring beta diversity using Leigh et al.’s (1993) ‘codominance’ index F, the probability that two individuals chosen randomly from each of two assemblages are the same species. Although this measure is based on abundance data, F, itself, is not a statistically valid index of similarity. For two identical assemblages with many species, F tends to 0. Moreover, it is possible for any two identical assemblages to have any value of F from 0 to 1, depending on how many species are present and patterns of relative abundance. It is possible, however, to normalize F to produce a valid similarity index. Chave & Leigh (2002) point out that the Morisita–Horn index is a normalized version of F.

We begin by developing a new, probabilistic approach for the classic Jaccard and Sørensen incidence-based indices. We then extend this approach to formulate Jaccard-type and Sørensen-type indices that consider species abundances. In contrast to Plotkin & Muller-Landau (2002), we adopt a non-parametric approach that does not require any assumptions about species abundance distributions. We then propose a method to estimate both incidence-based and abundance-based Jaccard and Sørensen indices from sample data, incorporating the effect of unseen shared species.


We then carry out sampling simulations with empirical data sets to assess the relative performance of the classic Jaccard and Sørensen indices; their new, abundance-based Jaccard and Sørensen counterparts; and the corresponding Jaccard and Sørensen estimators. We show that incorporating the effect of unseen species substantially reduces the sample-size bias of these estimators and improves their suitability for inferring similarity (or its complement, dissimilarity) between hyper-diverse assemblages for which a large proportion of species are missing from samples. Finally, we illustrate an application of the new abundance-based Jaccard index and the Jaccard abundance-based estimator, using data from a successional study of tree, sapling and seedling abundance of canopy species. Based on data sets for rich, tropical insect and plant assemblages, we show how incorporating the effect of unseen shared species not only increases accuracy, but also can change the interpretation of results.

DEVELOPING THE NEW INDICES AND ESTIMATORS

The classic Sørensen and Jaccard similarity indices

The classic Sørensen and Jaccard indices depend on three simple incidence counts: the number of species shared by two assemblages and the number of species unique to each of them. It has become traditional to refer to these counts as A, B and C, respectively (Table 1). The classic Jaccard and Sørensen indices for incidence counts are then

$$J_{\mathrm{clas}} = \frac{A}{A + B + C} \tag{1}$$

and

$$L_{\mathrm{clas}} = \frac{2A}{2A + B + C} \tag{2}$$

(We use L for the Sørensen index to avoid confusion with S for species.) There is a close, monotonic relation between the two indices: $L_{\mathrm{clas}} = 2J_{\mathrm{clas}}/(J_{\mathrm{clas}} + 1)$ and $J_{\mathrm{clas}} = 1/(2/L_{\mathrm{clas}} - 1)$.

Assume that there are $S_1$ species in Assemblage 1 and $S_2$ species in Assemblage 2. Let the number of shared species be $S_{12}$. Then, the incidence counts A, B, C in Table 1 correspond to $A = S_{12}$, $B = S_1 - S_{12}$ and $C = S_2 - S_{12}$. Substituting these expressions in eqns 1 and 2, we have an alternate way to write the classic indices that will be required for the next steps in developing the new indices:

$$J_{\mathrm{clas}} = \frac{A}{A + B + C} = \frac{S_{12}}{S_1 + S_2 - S_{12}} \tag{3}$$

and

$$L_{\mathrm{clas}} = \frac{2A}{2A + B + C} = \frac{2S_{12}}{S_1 + S_2}. \tag{4}$$

Table 1 Species classification counts used in the classic indices

                          Assemblage 2
                          Present    Absent
Assemblage 1   Present    A          B
               Absent     C          –

A probabilistic approach to the classic Jaccard and Sørensen indices

The classic Jaccard and Sørensen indices consider only the presence or absence (incidence) of species. Two pairs of assemblages, one pair sharing abundant species but not rare ones and the other pair sharing rare species, but not common ones, will yield the same index value. From the point of view of overall assemblage similarity, taking similarity of assemblage composition to the level of individuals often makes more sense (Magurran 2004). Our next objective is to extend the incidence indices to take account of the relative abundance of species, a prerequisite for developing index estimators for sampling data that take account of unseen rare species.

We must first provide a probabilistic derivation of the classic Jaccard and Sørensen incidence indices. Suppose we randomly select a species from Assemblage 1 and a species from Assemblage 2 and then classify each member of the pair according to whether it is a shared species or not. The corresponding probabilities are shown graphically in Fig. 1 and specified in Table 2.

Although the probabilities in Table 2 are not counts, they can be thought of as ‘normalized counts,’ because they sum to unity. Substituting these probabilities into eqns 1 and 2, we then have

$$J_{\mathrm{clas}} = \frac{A}{A + B + C} = \frac{\frac{S_{12}}{S_1}\frac{S_{12}}{S_2}}{\frac{S_{12}}{S_1}\frac{S_{12}}{S_2} + \frac{S_{12}}{S_1}\left(1 - \frac{S_{12}}{S_2}\right) + \left(1 - \frac{S_{12}}{S_1}\right)\frac{S_{12}}{S_2}} = \frac{S_{12}}{S_1 + S_2 - S_{12}}$$

which is exactly eqn 3. Likewise, we have

$$L_{\mathrm{clas}} = \frac{2A}{2A + B + C} = \frac{2\frac{S_{12}}{S_1}\frac{S_{12}}{S_2}}{2\frac{S_{12}}{S_1}\frac{S_{12}}{S_2} + \frac{S_{12}}{S_1}\left(1 - \frac{S_{12}}{S_2}\right) + \left(1 - \frac{S_{12}}{S_1}\right)\frac{S_{12}}{S_2}} = \frac{2S_{12}}{S_1 + S_2}$$

which is the same as eqn 4.
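The equivalence between the count form (eqns 3 and 4) and the probabilistic form (Table 2) can be verified with exact arithmetic. The following is a minimal sketch; the function names and the two species lists are hypothetical, and the probabilistic version assumes at least one shared species so that the denominator is non-zero.

```python
from fractions import Fraction

def classic_indices(species_1, species_2):
    """Classic Jaccard and Sorensen indices from two species lists (eqns 3 and 4)."""
    set_1, set_2 = set(species_1), set(species_2)
    s1, s2 = len(set_1), len(set_2)
    s12 = len(set_1 & set_2)                      # shared species (count A)
    jaccard = Fraction(s12, s1 + s2 - s12)
    sorensen = Fraction(2 * s12, s1 + s2)
    return jaccard, sorensen

def probabilistic_indices(s1, s2, s12):
    """Same indices from the Table 2 probabilities (Cases 1-3); needs s12 > 0."""
    a = Fraction(s12, s1) * Fraction(s12, s2)         # Case 1: both shared
    b = Fraction(s12, s1) * (1 - Fraction(s12, s2))   # Case 2: only the first shared
    c = (1 - Fraction(s12, s1)) * Fraction(s12, s2)   # Case 3: only the second shared
    return a / (a + b + c), 2 * a / (2 * a + b + c)

# Hypothetical assemblages: S1 = 4, S2 = 6, S12 = 2.
assemblage_1 = ["a", "b", "c", "d"]
assemblage_2 = ["c", "d", "e", "f", "g", "h"]
print(classic_indices(assemblage_1, assemblage_2))   # Jaccard 1/4, Sorensen 2/5
print(probabilistic_indices(4, 6, 2))                # same two values
```

Both routes give J = 1/4 and L = 2/5 for this example, and the values satisfy the relation L = 2J/(J + 1) stated after eqn 2.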


[Figure 1: 2 × 2 grid of diagrams of Assemblage 1 (a1) and Assemblage 2 (a2). Columns: species from a2 is shared / is not shared. Rows: species from a1 is shared / is not shared. Panels: Case 1 (top left), Case 2 (top right), Case 3 (bottom left), Case 4 (bottom right).]

Figure 1 A graphical representation of the meaning of shared species for two assemblages. Assemblage 1 (a1) is grey, Assemblage 2 (a2) is white. The grey dot represents a species selected at random from Assemblage 1 and the white dot represents a species selected at random from Assemblage 2. Case 1 is the only case in which both species are shared species (but not necessarily the same species). In Case 2, the species chosen at random from Assemblage 1 is a shared species, but the species chosen from Assemblage 2 is not shared with Assemblage 1. The reverse is true for Case 3. In Case 4, neither of the chosen species is a shared species. These patterns are described mathematically in Table 2.

Table 2 Probabilistic derivation of species counts for the classic indices

                          Select any species from Assemblage 2
Select any species
from Assemblage 1         Shared                                Non-shared
Shared                    A = (S12/S1)(S12/S2)   (Case 1)       B = (S12/S1)(1 − S12/S2)   (Case 2)
Non-shared                C = (1 − S12/S1)(S12/S2)   (Case 3)   (1 − S12/S1)(1 − S12/S2)   (Case 4)

It might appear that we have made no progress, but this probabilistic approach lays the groundwork for developing abundance-based indices, which in turn allow for the estimation of indices that take into account the effect of unseen shared species. Note that, using this approach, we can also calculate the chance that both randomly chosen species are non-shared species (Case 4 as shown in Fig. 1 and Table 2). However, the basic concept for the Jaccard and Sørensen indices is based only on information for the other three cells (Cases 1–3).

Extending the probabilistic approach to abundance-based indices

Let the probabilities of species discovery (which depend primarily on relative abundance, assuming random mixing and equivalent detectability) in Assemblages 1 and 2 be denoted, respectively, by $(p_1, p_2, \ldots, p_{S_1})$ and $(\pi_1, \pi_2, \ldots, \pi_{S_2})$, where $p_i > 0$, $\pi_i > 0$ and $\sum_{i=1}^{S_1} p_i = \sum_{i=1}^{S_2} \pi_i = 1$. We no longer treat all species equally because some species are common and some are rare. Instead, the basic idea for handling abundance counts is that we treat all individuals equally. Adapting the approach from the previous section, we randomly select one individual from Assemblage 1 and one individual from Assemblage 2. For each individual of the pair, note whether it belongs to a shared species or not.

We now derive the general formulas for the abundance-based versions of the Jaccard and Sørensen indices. Without loss of generality, we assume the first $S_{12}$ species are shared species, that is, the shared species are indexed by $1, 2, \ldots, S_{12}$. In Assemblage 1, let U denote the total relative abundances of individuals belonging to the shared species, $U = p_1 + p_2 + \cdots + p_{S_{12}}$. Likewise in Assemblage 2, let V denote the total relative abundances of individuals belonging to shared species, $V = \pi_1 + \pi_2 + \cdots + \pi_{S_{12}}$. Table 3 shows the probabilities that two individuals, one from each assemblage, represent each of the usual four categories.

Table 3 Probabilities for individual-based species counts

                          Select any individual from Assemblage 2
Select any individual
from Assemblage 1         Shared              Non-shared
Shared                    A = UV              B = U(1 − V)
Non-shared                C = (1 − U)V        D = (1 − U)(1 − V)

Based on eqns 1 and 2 for the three probabilities (A, B and C in Table 3), we obtain the following abundance-based indices in terms of U and V:

$$J_{\mathrm{abd}} = \frac{A}{A + B + C} = \frac{UV}{U + V - UV} \tag{5}$$
longer treat all species equally because some species are


and

$$L_{\mathrm{abd}} = \frac{2A}{2A + B + C} = \frac{2UV}{U + V} \tag{6}$$

As U and V represent the total abundances of the shared species in Assemblages 1 and 2, respectively, we see that both indices reach 1 for identical assemblages and tend to 0 for disjoint assemblages. In the latter case, for example, $L_{\mathrm{abd}} = 2/[(1/U) + (1/V)]$ tends to 0 as both U and V approach 0.

Estimation of the abundance-based indices from sample data

Up to now, we have considered only the species and individuals observed in two assemblages. Both the classic Jaccard and Sørensen and the new, abundance-based versions assume full and complete knowledge of the two assemblages being contrasted. In practice, we need to estimate similarity indices from sample data, the task that we turn to now. Our approach is non-parametric in the sense that we do not need to postulate any particular species abundance distribution to derive the estimators, which are therefore valid under many statistical abundance models (e.g. log-normal, broken stick, gamma, etc.). The derivation does assume that the number of species is finite so that species discovery probabilities are bounded below. [The authors show that the estimators are valid under many of the statistical abundance models (A. Chao, R. L. Chazdon, R. K. Colwell & T.-J. Shen, unpublished data) (e.g. log-normal, exponential, gamma, negative binomial, Zipf–Mandelbrot, broken-stick models, etc.) that appear in Magurran (2004, Table 2.1) or in Plotkin & Muller-Landau (2002, Table 1).]

A random sample of n individuals (Sample 1) is taken from Assemblage 1 and a random sample of m individuals (Sample 2) is taken from Assemblage 2. Denote the species frequencies in the samples by $(X_1, X_2, \ldots, X_{S_1})$ and $(Y_1, Y_2, \ldots, Y_{S_2})$, respectively. (Note that if a species is missing from a sample, $X_i$ or $Y_i$ will equal zero.) Thus, the pair of frequencies for the $S_{12}$ species truly shared by the two assemblages are $(X_1, Y_1), (X_2, Y_2), \ldots, (X_{S_{12}}, Y_{S_{12}})$. Assume that $D_{12}$ of the $S_{12}$ shared species available are actually observed in both samples, and their frequencies are the first $D_{12}$ pairs. Thus, an additional $S_{12} - D_{12}$ species are shared by the two assemblages, but absent from one or both of the samples. The greater the frequencies of rare, shared species observed in one of the two samples, the more probable it is that additional shared species are present in both assemblages, but are absent from one or both samples. We refer to these as unseen shared species.

To incorporate the effect of unseen shared species on the probabilities of Table 3, we use the frequencies of observed rare, shared species to estimate an appropriate adjustment term for U and V to account for unseen shared species. We first define the indicator function I(expression) such that I = 1 if ‘expression’ is true and I = 0 if ‘expression’ is false. Let $f_{1+} = \sum_{i=1}^{D_{12}} I[X_i = 1, Y_i \ge 1]$ be the observed number of shared species that are singletons ($X_i = 1$) in Sample 1 (these species must be present in Sample 2, but may have any abundance). Now, let $f_{2+}$ be the observed number of shared species that are doubletons ($X_i = 2$) in Sample 1. Similarly, we define $f_{+1}$ and $f_{+2}$ to be the observed number of shared species that are, respectively, singletons ($Y_i = 1$) and doubletons ($Y_i = 2$) in Sample 2.

Then the proposed estimator for U is

$$\hat{U} = \sum_{i=1}^{D_{12}} \frac{X_i}{n} + \frac{(m - 1)}{m}\,\frac{f_{+1}}{2f_{+2}} \sum_{i=1}^{D_{12}} \frac{X_i}{n}\, I(Y_i = 1) \tag{7}$$

Notice that the first term in the right-hand side of eqn 7 denotes the observed total of frequencies associated with the observed shared species; the second term accounts for the estimated effect of unseen shared species. Similarly, we have

$$\hat{V} = \sum_{i=1}^{D_{12}} \frac{Y_i}{m} + \frac{(n - 1)}{n}\,\frac{f_{1+}}{2f_{2+}} \sum_{i=1}^{D_{12}} \frac{Y_i}{m}\, I(X_i = 1) \tag{8}$$

When $f_{+2} = 0$ or $f_{2+} = 0$, replace $f_{+2}$ and $f_{2+}$ in the denominators by $f_{+2} + 1$ or $f_{2+} + 1$, respectively. If the value of $\hat{U}$ or $\hat{V}$ is greater than 1 (which rarely happens), then it is replaced by 1. Our proposed abundance-based Jaccard and Sørensen estimators are

$$\hat{J}_{\mathrm{abd}} = \frac{\hat{U}\hat{V}}{\hat{U} + \hat{V} - \hat{U}\hat{V}} \tag{9}$$

and

$$\hat{L}_{\mathrm{abd}} = \frac{2\hat{U}\hat{V}}{\hat{U} + \hat{V}} \tag{10}$$

The variances for these two estimators can be derived by a bootstrap method. (The complete derivation of eqns 7 and 8 and details on the bootstrap procedure for computing variance estimators for eqns 9 and 10 are available upon request from the first author.)

Estimation of similarity indices from incidence frequencies

Because information about the frequencies and identities of rare species provides the critical information for adjusting similarity indices to account for the effect of unseen shared species, a simple pair of lists of the species present in two assemblages (incidence data) cannot be used, even in principle, to adjust similarity indices for the effect of unseen species. On the other hand, the estimation-based approach


can be extended to replicated incidence (presence–absence) data.

Suppose we take a set of w replicated incidence samples from Assemblage X and a set of z replicated incidence samples from Assemblage Y. For both sets of samples combined, there are S species. The number of samples in which a species is found in Assemblage X or Y is the frequency for that species in that sample set. The frequencies for species i are thus defined as

$$X_i = \sum_{j=1}^{w} x_{ij} \quad \text{and} \quad Y_i = \sum_{j=1}^{z} y_{ij},$$

where $x_{ij}$ and $y_{ij}$ represent the presence (1) or absence (0) of species i in sample j. Note that $X_i$ or $Y_i$ will be zero for some species, unless all species are shared and observed.

Under the assumption that replicate incidence samples are statistically homogeneous (within each assemblage), the chance of a species being present in a particular sample is proportional to its relative abundance in the assemblage, and the frequency vectors $X_i$ or $Y_i$ are thus statistical proxies for the relative abundance of species in Assemblages X and Y (e.g. Chao 2004; Colwell et al. 2004). Thus, with minor changes, eqns 7 and 8 can be used to compute adjusted probabilities that a randomly chosen incidence (species detection) from each of the two assemblages will both represent shared species (though not necessarily the same shared species).

For replicated incidence data, $f_{1+}$ is the number of observed shared species that occur in exactly one sample ($X_i = 1$) in X and $f_{2+}$ is the number of observed shared species that occur in exactly two samples ($X_i = 2$) in X; $f_{+1}$ and $f_{+2}$ are the corresponding numbers for sample matrix Y. Define the sum of the incidence frequencies for the matrices as

$$n = \sum_{i=1}^{S} X_i \quad \text{and} \quad m = \sum_{i=1}^{S} Y_i.$$

Then the proposed estimators are

$$\hat{U}_{\mathrm{inc}} = \sum_{i=1}^{D_{12}} \frac{X_i}{n} + \frac{(z - 1)}{z}\,\frac{f_{+1}}{2f_{+2}} \sum_{i=1}^{D_{12}} \frac{X_i}{n}\, I(Y_i = 1) \tag{11}$$

and

$$\hat{V}_{\mathrm{inc}} = \sum_{i=1}^{D_{12}} \frac{Y_i}{m} + \frac{(w - 1)}{w}\,\frac{f_{1+}}{2f_{2+}} \sum_{i=1}^{D_{12}} \frac{Y_i}{m}\, I(X_i = 1) \tag{12}$$

(The same modifications described for eqns 7 and 8 may be applied here if $f_{+2} = 0$ or $f_{2+} = 0$.) Thus, our proposed incidence-based Jaccard and Sørensen estimators are

$$\hat{J}_{\mathrm{inc}} = \frac{\hat{U}_{\mathrm{inc}}\hat{V}_{\mathrm{inc}}}{\hat{U}_{\mathrm{inc}} + \hat{V}_{\mathrm{inc}} - \hat{U}_{\mathrm{inc}}\hat{V}_{\mathrm{inc}}} \tag{13}$$

and

$$\hat{L}_{\mathrm{inc}} = \frac{2\hat{U}_{\mathrm{inc}}\hat{V}_{\mathrm{inc}}}{\hat{U}_{\mathrm{inc}} + \hat{V}_{\mathrm{inc}}}. \tag{14}$$

PERFORMANCE TESTS: CLASSIC VS. NEW INDICES

Indices tested

We carried out performance tests for: (1) the classic Jaccard and Sørensen indices (eqns 1 and 2); (2) the new, abundance-based Jaccard and Sørensen indices (eqns 5 and 6); (3) the estimators for the abundance-based indices (eqns 9 and 10); and (4) the replicated-incidence estimators for the abundance-based indices (eqns 13 and 14).

Data sets used in the tests

We conducted the performance tests on a large, species-rich data set for tropical rainforest ants (Longino et al. 2002), collected using several replicated, mass-collecting techniques at La Selva Biological Station in Costa Rica. Here, we present representative results for three collection methods: Berlese extraction of soil samples (217 samples, 4318 individuals, 117 species, of which 19 were singletons), Malaise trap samples for flying and crawling insects (62 samples, 1660 individuals, 103 species, of which 35 were singletons), and Fogging samples from canopy fogging (459 samples, 26302 individuals, 165 species of which 19 were singletons). [Relative abundance diagrams appear in Longino et al. (2002).] As Longino et al. (2002) point out, these three methods intentionally sample different, but overlapping segments of the local ant fauna. Whereas the raw species sum for the three methods would be 117 + 103 + 165 = 385 species, the actual number of species captured by the three methods together was only 276 species. Parallel tests for other high-richness data sets, including the rainforest tree data discussed later in this paper, yielded concordant results (A. Chao, R. L. Chazdon, R. K. Colwell & T.-J. Shen, unpublished data).

The tests

Although the classic Jaccard and Sørensen indices and our new indices all measure ‘similarity,’ they are intended to measure different aspects of this construct: the classic indices ostensibly measure similarity in species composition while ignoring relative abundance (although they are strongly affected by it, when sampling is involved), whereas our new indices [and many others (Legendre & Legendre 1998; Magurran 2004)] explicitly consider relative abundance. Thus, for any particular data set, differences in the absolute magnitude of incidence- vs. abundance-based Jaccard or Sørensen values (or indeed, differences between most other indices of similarity) are meaningless, in themselves.


Nevertheless, indices of compositional similarity can be compared in terms of their performance in tests of sensitivity to undersampling. Using the ant data, we illustrate three tests: (1) Test 1: equal-sized samples from a single data set (within-assemblage rarefaction); (2) Test 2: unequal-sized samples from a single data set; and (3) Test 3: equal-proportion samples from two data sets (between-assemblage rarefaction). For purposes of these tests, we treated the ant data from each collecting method (Berlese, Malaise, or Fogging) as a separate, complete ‘assemblage,’ referred to here as a sampling pool. Samples of specified sizes (in terms of numbers of individuals) were then selected, at random, with replacement, from these pools. Of course, not all species present in a sampling pool are represented in smaller samples. However, because sampling was done with replacement, not all species are present even when the entire number of individuals selected is the same as the number of individuals in the pool.

RESULTS

Test 1: Equal-sized samples from a single data set

All similarity indices yield a true value of 1 when a complete sampling pool (assemblage) is compared with itself. What happens when a similarity index is computed for two random samples of a single sampling pool? If an index is unbiased by sample size, it should yield a value of 1 when applied to samples of any size. First, we randomly sampled individuals (with replacement) from the pooled ant data for a single collecting method to produce pairs of samples having the same number of individuals as the pools themselves (full samples). Next, we randomly selected smaller samples, each totalling one-half the number of individuals in the original sampling pool, then computed similarity indices for this sample pair. We then repeated this procedure for a pair of samples each 1/4 the size of the original pool, then a pair 1/8 the size of the pool, and so on, successively halving sample size, down to 1/64 the original number of individuals. (Note that this is quite a severe test of undersampling bias, even for these very large pools.) This process was repeated 1000 times and means taken, for each test of each index, and for each of the three ant collecting methods.

Figure 2 shows representative results of this test for the classic Jaccard and Sørensen indices (first column of panels, Test 1: Berlese rarefaction). Clearly both of these indices were quite sensitive to undersampling. Figure 3 (first column of panels) shows the corresponding results for the new indices for this test. The new abundance-based Jaccard and Sørensen indices, without adjustment for unseen shared species (Jabd and Labd), were also sensitive to sample size. In

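[Figure 2: grid of line graphs. Columns: Test 1, Berlese rarefaction; Test 2, Berlese unequal; Test 3, Malaise–Fog rarefaction; Test 3, Mal.–Berlese rarefaction. Rows: Jaccard (Jclas); Sørensen (Lclas). Vertical axis: index value, 0 to 1.0. Horizontal axis: sample fraction (Full, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64).]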

Figure 2 Random sampling tests of the classic Jaccard (Jclas, eqn 1) and Sørensen (Lclas, eqn 2) overlap indices. The graphs show the effect on each index of considering random samples composed of 1/1 (Full), 1/2, 1/4, …, 1/64 of the abundances or incidence-equivalents in the sampling pools, sampled with replacement. (The labels on the lower left graph are the same for all graphs.) Column 1 (Test 1: Berlese rarefaction) shows similarity index values for equal-sized, paired samples from the Berlese ant data set. Column 2 (Test 2: Berlese unequal) shows index values for comparisons of samples of decreasing size vs. a sample of the same size as the full Berlese ant data set. Column 3 (Malaise–Fog rarefaction) shows similarity index values for equal-proportion, paired samples (Test 3) from the Malaise vs. the Fogging ant data set, a high-similarity comparison. Column 4 (Malaise–Berlese rarefaction) shows similarity index values for equal-proportion, paired samples (Test 3) from the Berlese vs. the Malaise ant data set, a low-similarity comparison. The true values of each index for the sampling pools considered are shown by horizontal dotted lines in the columns for Test 3 (Malaise–Fog and Malaise–Berlese rarefaction). The true index value for Test 1 and Test 2 is 1.0, the top of the graphs.
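
The rarefaction procedure summarized in this caption can be illustrated with a small simulation. The following Python sketch is not the authors' code: the function names, the toy pool and the random seed are hypothetical, and it covers only the classic incidence-based indices under the equal-sized design of Test 1 (two samples of the same size drawn with replacement from one pool, with index values averaged over repetitions).

```python
import random


def jaccard_classic(a, b):
    """Classic Jaccard index: shared species / species in either sample."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def sorensen_classic(a, b):
    """Classic Sorensen index: 2 * shared / (richness 1 + richness 2)."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def draw_sample(pool, size, rng):
    """Draw `size` individuals with replacement from `pool`
    (a dict species -> abundance) and return the set of species seen."""
    species = list(pool)
    weights = [pool[s] for s in species]
    return set(rng.choices(species, weights=weights, k=size))


def rarefaction_test(pool, fractions, reps=1000, seed=0):
    """Mean classic Jaccard and Sorensen values for paired, equal-sized
    samples taken from the same pool at each sampling fraction."""
    rng = random.Random(seed)
    n_total = sum(pool.values())
    results = {}
    for frac in fractions:
        size = max(1, round(n_total * frac))
        j_sum = s_sum = 0.0
        for _ in range(reps):
            s1 = draw_sample(pool, size, rng)
            s2 = draw_sample(pool, size, rng)
            j_sum += jaccard_classic(s1, s2)
            s_sum += sorensen_classic(s1, s2)
        results[frac] = (j_sum / reps, s_sum / reps)
    return results


if __name__ == "__main__":
    # Hypothetical pool: a few common species and several rare ones.
    toy_pool = {f"sp{i}": max(1, 256 >> i) for i in range(12)}
    fractions = [1, 1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32, 1 / 64]
    for frac, (j, s) in rarefaction_test(toy_pool, fractions, reps=200).items():
        print(f"fraction {frac:.4f}: mean Jaccard {j:.3f}, mean Sorensen {s:.3f}")
```

Because both samples come from the same pool, the true value of each index is 1; any shortfall in the printed means is due to sampling alone, which is the bias the figure illustrates.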


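[Figure 3 graphic. Panel columns: Test 1 (Berlese rarefaction), Test 2 (Berlese unequal), Test 3 (Malaise–Fog rarefaction and Malaise–Berlese rarefaction). Panel rows: Jaccard (Jabd, Ĵabd, Ĵinc) and Sørensen (Labd, L̂abd, L̂inc). Y-axis: index value, 0 to 1.0. X-axis: sample fraction, Full, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64.]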

Figure 3 Random sampling tests of the new overlap indices. The graphs show the effect on each index of considering random samples composed of 1/1 (Full), 1/2, 1/4, …, 1/64 of the abundances or incidence-equivalents in the sampling pools, sampled with replacement. (The labels on the lower left graph are the same for all graphs.) Columns are described in the caption for Fig. 2. Jaccard indices: Jabd is the new abundance-based Jaccard index, not adjusted for unseen species, computed by eqn 5. Ĵabd is the corresponding abundance-based estimator that takes unseen species into account, computed by eqn 9. The estimator based on replicated incidence data, Ĵinc, is computed by eqn 13. Sørensen indices: Labd is the new abundance-based Sørensen index, not adjusted for unseen species, computed by eqn 6. L̂abd is the corresponding abundance-based estimator that takes unseen species into account, computed by eqn 10. The estimator based on replicated incidence data, L̂inc, is computed by eqn 14. The true values of each index for the sampling pools considered are shown by horizontal dotted lines in the columns for Test 3 (Malaise–Fog and Malaise–Berlese rarefaction). The true index value for Test 1 and Test 2 is 1.0, the top of the graphs. To allow a valid comparison of the incidence-based estimators (Ĵinc and L̂inc) with the corresponding abundance-based estimators (Ĵabd and L̂abd, respectively), the X-axis for each incidence-based estimator was re-scaled so that the minimum number of incidences matches the minimum abundance of the corresponding abundance-based estimator, thus equalizing the amount of statistical information.
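
The distinction this caption draws between the unadjusted abundance-based indices and the estimators that account for unseen species can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes that eqns 5 and 6 take the form J = UV/(U + V - UV) and L = 2UV/(U + V), where U and V are the total relative abundances of shared species in samples 1 and 2, and that the adjustment adds to U (and symmetrically to V) a term built from the shared species that are singletons or doubletons in the other sample. The function names, the fallback to 1 when no doubletons are observed and the cap of the adjusted values at 1 are assumptions that should be checked against eqns 5 to 10.

```python
def _shared(x, y):
    """Species observed in both samples (dicts species -> count)."""
    return [s for s in x if x[s] > 0 and y.get(s, 0) > 0]


def observed_uv(x, y):
    """Unadjusted U and V: total relative abundance of observed shared
    species in sample 1 and in sample 2. Both samples must be non-empty."""
    n, m = sum(x.values()), sum(y.values())
    shared = _shared(x, y)
    u = sum(x[s] for s in shared) / n
    v = sum(y[s] for s in shared) / m
    return u, v


def adjusted_uv(x, y):
    """U and V with an assumed singleton/doubleton correction for
    shared species that went unseen in one of the samples."""
    n, m = sum(x.values()), sum(y.values())
    shared = _shared(x, y)
    u, v = observed_uv(x, y)
    # Shared species that are singletons / doubletons in sample 2.
    f_p1 = sum(1 for s in shared if y[s] == 1)
    f_p2 = sum(1 for s in shared if y[s] == 2) or 1  # assumed fallback
    # Shared species that are singletons / doubletons in sample 1.
    f_1p = sum(1 for s in shared if x[s] == 1)
    f_2p = sum(1 for s in shared if x[s] == 2) or 1  # assumed fallback
    x_single = sum(x[s] for s in shared if y[s] == 1) / n
    y_single = sum(y[s] for s in shared if x[s] == 1) / m
    u_hat = u + (m - 1) / m * f_p1 / (2 * f_p2) * x_single
    v_hat = v + (n - 1) / n * f_1p / (2 * f_2p) * y_single
    return min(u_hat, 1.0), min(v_hat, 1.0)  # assumed cap at 1


def jaccard_from_uv(u, v):
    """Jaccard-type index from U and V."""
    denom = u + v - u * v
    return u * v / denom if denom > 0 else 0.0


def sorensen_from_uv(u, v):
    """Sorensen-type index from U and V."""
    return 2 * u * v / (u + v) if (u + v) > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical counts for two samples.
    sample1 = {"a": 10, "b": 5, "c": 1, "d": 4}
    sample2 = {"a": 8, "b": 1, "c": 2, "e": 9}
    print(jaccard_from_uv(*observed_uv(sample1, sample2)))
    print(jaccard_from_uv(*adjusted_uv(sample1, sample2)))
```

In this toy example the adjusted U and V are larger than the observed ones, so the adjusted Jaccard-type value exceeds the unadjusted one, which is the direction of correction the caption describes.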


contrast, the Jaccard and Sørensen estimators, which include the estimated effect of unseen shared species, proved to be less sensitive to undersampling, remaining substantially closer to 1 even for small samples (Fig. 3). This was true for both the abundance-based estimators (Ĵabd and L̂abd) and the estimators based on replicated incidence data (Ĵinc and L̂inc).

Test 2: Unequal-sized samples from a single data set

A similarity index should ideally be robust to sample size not only for equal-sized samples, but also for samples of unequal size. To test for this property we computed similarity indices for samples of successively smaller size, vs. ‘full’ samples, equal in number of individuals to the number in the corresponding sampling pool. As with the first test, an ideal index should remain at 1, regardless of the discrepancy in sample sizes. Figures 2 and 3 (second column, Test 2: Berlese unequal) show such a test for the Berlese sample ant data, using samples created by the same scheme outlined for the first method. Even more than in the first test, the classic Jaccard and Sørensen indices (Fig. 2) were strongly affected by the size of the sample, leading to a severe negative bias when one sample was markedly smaller than the full sample. In contrast, the new Jaccard and Sørensen estimators (Fig. 3, second column) were strikingly resistant to undersampling, including both abundance-based estimators (Ĵabd and L̂abd) and the estimators based on replicated incidence data (Ĵinc and L̂inc).

Test 3: Equal-proportion samples from two data sets

It is all very well for a similarity index to be robust to sample size in comparing paired samples from the same pool, but an index is of little use if it does not retain that robustness in comparing different data sets, while successfully detecting compositional differences between them. We performed the same sample size comparison procedures described for the first set of tests, but instead of comparing sample pairs from the same sampling pool, we compared successively smaller sample pairs from the Malaise and Fogging [high similarity (Longino et al. 2002)], and from the Malaise and Berlese (low similarity) data sets. The results for the classic Jaccard and Sørensen indices appear in the third and fourth columns of Fig. 2. An ideal index would yield and maintain the true value computed for the full pools (the dotted horizontal line in each panel) in the face of rarefaction. The classic Jaccard and Sørensen indices proved quite sensitive to undersampling in this test (Fig. 2). The new abundance-based Jaccard and Sørensen indices, uncorrected for unseen species (Jabd and Labd in third and fourth columns of Fig. 3), also suffer from undersampling bias, but the bias is quite substantially reduced for their abundance-based counterparts corrected for unseen species (Ĵabd and L̂abd in third and fourth columns of Fig. 3) as well as for the corresponding estimators based on replicated incidence data (Ĵinc and L̂inc in third and fourth columns of Fig. 3).

APPLICATION

As an example of the application of the new indices, we apply the classic Jaccard index (eqn 1), the new abundance-based Jaccard index (eqn 5) and its estimator (eqn 9) to data from two mature and four second-growth rainforest sites in Costa Rica. We examine compositional similarity between species of trees ≥ 25 cm diameter at breast height (DBH; canopy individuals), canopy tree saplings (1–5 cm DBH) and canopy tree seedlings (> 20 cm height, but < 1 cm DBH) within four second-growth forests of different age since pasture abandonment and in two old-growth forests in the same study area. During early stages of succession, when the forest canopy is first beginning to close, fast-growing, shade-intolerant colonizing tree species are present as canopy trees and are also found as smaller individuals in the understory, as seedlings and saplings. As time progresses and the understory becomes more shaded, these shade-intolerant tree species are eliminated from the seedling and sapling pool and shade-tolerant species readily colonize these small size classes. These shade-tolerant species are represented by seedlings and saplings, but have few or no canopy trees present, gradually augmenting tree species richness as the forest matures (Guariguata et al. 1997; Table 4).

Table 4 Observed patterns of species richness of tree seedlings, saplings and canopy individuals in 1 ha plots in four second-growth and two old-growth forests in year 2000

Site               Age (years)   Sobs seedlings   Sobs saplings   Sobs canopy trees
LSUR               15            45               68              12
TIR                18            49               74              16
LEP                23            47               67              24
CR                 28            57               91              33
LSUR old-growth    > 200         47               101             37
LEP old-growth     > 200         69               102             43

All trees and saplings were marked and measured for diameter within a 1 ha plot in each forest. Seedlings were sampled in 144 1 × 5 m quadrats within the 1 ha plot, for a total area sampled of 0.072 ha. In these analyses, we included only canopy tree species; shrubs, treelets and midstory trees were excluded. Note that young sites show a low number of canopy tree species per ha (individuals ≥ 25 cm DBH) and fewer sapling species compared with old-growth forests, but differences in seedling species richness were less pronounced.

Thus, we would predict that, as secondary forests mature, compositional similarity between tree species


and seedlings or saplings would initially be high, but would quickly decline to a minimum during intermediate stages of succession and then begin to increase later in succession as shade-tolerant trees reach reproductive maturity and produce seedlings that can establish, grow and survive.

The classic Jaccard index (eqn 1) showed low compositional similarity between trees and seedlings for the four second-growth forests compared with the old-growth forests, with similarity decreasing slightly with age among the four second-growth forests (Fig. 4). Similarity between trees and saplings, in contrast, showed gradual increases from the youngest forest to the older second-growth forest, continuing the trend to old-growth forests (Fig. 4).

The abundance-based Jaccard index (eqn 5) showed a strikingly different pattern across the six forest stands. Compositional similarity between seedling and tree assemblages and between sapling and tree assemblages was initially high in the youngest stand, as we had predicted. As the forest matures, tree seedling and sapling pools become enriched by shade-tolerant species not represented as canopy trees, resulting in a decreasing compositional similarity that reached a minimum in the 23-year-old LEP stand (Fig. 4). This minimum similarity represents a point in forest succession of maximum recruitment limitation for both seedlings and saplings. In the oldest second-growth plot, CR, the abundance-based Jaccard index began to increase, reflecting recruitment of shade-tolerant species in all three size classes (Fig. 4). The similarity index continued to increase and stabilized at 0.4–0.5 in the two old-growth stands. With the exception of one old-growth stand, similarity indices were higher for seedlings vs. trees than for saplings vs. trees. At the scale of 1 ha plots, compositional similarity between canopy trees and seedling and sapling size classes in old-growth forests was comparable to that observed within a 15-year-old second-growth forest, but greater than that observed in second-growth forests of intermediate age. By design, the abundance-based Jaccard index responds sensitively to changes in total relative abundances of shared species during forest succession.

The abundance-based Jaccard estimator (eqn 9), which incorporates the effects of unseen shared species, showed similar general trends across stands when compared with the abundance-based Jaccard index (Fig. 4). The 28-year-old second-growth stand, however, had nearly comparable estimates of similarity compared with the two old-growth stands, suggesting that the estimator is responding to rare or infrequent species that are shared between the size classes (Fig. 4). The estimator for sapling vs. tree similarity was higher than for seedling vs. trees in the TIR second-growth site, indicating that this stand has more rare species of shared saplings than seedlings.

[Figure 4 graphic. Three panels, each plotting similarity for seedlings vs. trees and saplings vs. trees: top, Jclas (y-axis 0 to 0.3); middle, Jabd (y-axis 0 to 0.5); bottom, Ĵabd (y-axis 0 to 0.7). X-axis: forest site and forest age (year): LSUR 15, TIR 18, LEP 23, CR 28, LSUR old growth, LEP old growth.]

Figure 4 Compositional similarity between canopy trees and seedlings and canopy trees and saplings in four second-growth forests of increasing age and in two old-growth forests. Results are shown for Jclas, the classic Jaccard index (eqn 1; top panel), for the new abundance-based Jaccard index, Jabd (eqn 5) not adjusted for unseen species (middle panel), and for Ĵabd, the new abundance-based Jaccard estimator that takes unseen species into account (eqn 9; error bars are 1 SE, computed by a bootstrapping procedure; details available from the first author; A. Chao, R. L. Chazdon, R. K. Colwell & T.-J. Shen, unpublished data). These analyses include only canopy tree species; shrubs, treelets and midstory tree species were excluded.

CONCLUSIONS

Because similarity is a qualitative human construct, it has no precise mathematical definition. Nevertheless, measuring ‘similarity’ relies on quantitative indices devised for the purpose, and in practice, we may expect that similarity indices fulfil reasonable criteria for their mathematical behaviour (Legendre & Legendre 1998). Given indices that make sense mathematically, it is their statistical performance under the realities of field sampling that we have concerned ourselves with here, particularly for species-rich taxa for which complete inventories are impractical or even impossible.


Using sampling simulations applied to representative field data sets, we confirmed that two of the most widely used classic indices, Jaccard and Sørensen, are negatively biased under conditions of undersampling, often quite substantially (Fig. 2). Our objective was to develop new, probability-based indices that reduce undersampling bias by estimating and compensating for the effects of unseen, shared species. We based a new similarity index on the probability that two randomly chosen individuals, one from each of two samples, both belong to any of the species shared by the two samples [not necessarily to the same shared species, the basis of F (Chave & Leigh 2002; Condit et al. 2002) and the Morisita–Horn index]. This approach opened the way to the crucial step, adjusting this probability to account for the chance that larger samples would reveal a larger proportion of shared species. As anticipated, the new indices consistently reduced undersampling bias in the performance tests, in most circumstances quite substantially. Inevitably some bias remains, especially under severe undersampling and for highly dissimilar samples. Under such conditions, relatively little information exists to guide bias reduction.

Ecologists distinguish two aspects of the compositional similarity of species assemblages: similarity of species lists (incidence) and similarity of species’ relative abundances. Classic abundance-based indices (e.g. Morisita–Horn or Bray–Curtis) match abundances, species-by-species. Our new indices take an intermediate path, by assessing the probability that individuals belong to shared vs. unshared species, without regard to which species they belong to. Unfortunately for many studies, unreplicated, pure incidence data (pairs of species lists) provide no information that can be used to estimate the number of unseen, shared species. In principle, it may be possible to derive estimators that use abundance data to correct pure incidence similarity indices for unseen species, but it is currently statistically difficult for biologically realistic data. However, we recommend the new indices for any application in which not only species matching but similarity of relative abundance is of interest. Moreover, these new indices are better suited than the corresponding classic indices for assessing compositional similarity between samples that differ in size, are known or suspected to be undersampled, or are likely to contain numerous rare species.

ACKNOWLEDGEMENTS

We thank three anonymous referees for their comments and suggestions. This work was supported by Taiwan National Science Council Contract NSC92-2118-M007-013 to A. Chao and T.-J. Shen, by a grant from the Andrew W. Mellon Foundation to R. L. Chazdon, and by US-NSF grant DEB-0072702 to R. K. Colwell. We thank Jorge Leiva for sharing vegetation data for tree species in mature forests. The new estimators presented in this paper are included in version 7.5 of ESTIMATES (Colwell 2004) and the program SPADE (Chao & Shen 2003), to be released upon publication of this paper. The complete derivation of eqns 7 and 8 and the variance estimators for eqns 9 and 10 are available upon request from the first author. The complete ant data sets are available from RKC.

REFERENCES

Arita, H.T. & Rodríguez, P. (2002). Geographic range, turnover rate and the scaling of species diversity. Ecography, 25, 541–550.
Arita, H.T. & Rodríguez, P. (2004). Local–regional relationships and the geographical distribution of species. Global Ecol. Biogeogr., 13, 15–21.
Balvanera, P., Lott, E., Segura, G., Siebe, C. & Islas, A. (2002). Beta diversity patterns and correlates in a tropical dry forest of Mexico. J. Veg. Sci., 13, 145–158.
Bray, J.R. & Curtis, J.T. (1957). An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr., 27, 325–349.
Bunge, J. & Fitzpatrick, M. (1993). Estimating the number of species: a review. J. Am. Stat. Assoc., 88, 364–373.
Chao, A. (in press). Species richness estimation. In: Encyclopedia of Statistical Sciences, 2nd edn (eds Balakrishnan, N., Read, C.B. & Vidakovic, B.). Wiley Press, New York, NY, USA.
Chao, A. & Shen, T.J. (2003). Program SPADE (Species Prediction and Diversity Estimation). Program and User’s Guide available at http://chao.stat.nthu.edu.tw.
Chao, A., Ma, M.-C. & Yang, M.C.K. (1993). Stopping rules and estimation for recapture debugging with unequal failure rates. Biometrika, 80, 193–201.
Chave, J. & Leigh, E.G. (2002). A spatially explicit neutral model of beta-diversity in tropical forests. Theor. Pop. Biol., 62, 153–168.
Chazdon, R.L., Colwell, R.K., Denslow, J.S. & Guariguata, M.R. (1998). Statistical methods for estimating species richness of woody regeneration in primary and secondary rain forests of NE Costa Rica. In: Forest Biodiversity Research, Monitoring and Modeling: Conceptual Background and Old World Case Studies (eds Dallmeier, F. & Comiskey, J.). Parthenon Publishing, Paris, France, pp. 285–309.
Colwell, R.K. (2004). ESTIMATES: Statistical Estimation of Species Richness and Shared Species from Samples, Version 7.5. Available at http://viceroy.eeb.uconn.edu/estimates. Persistent URL http://purl.oclc.org/estimates.
Colwell, R.K. & Coddington, J.A. (1994). Estimating terrestrial biodiversity through extrapolation. Phil. Trans. R. Soc. Lond. B Biol. Sci., 345, 101–118.
Colwell, R.K., Mao, C.X. & Chang, J. (2004). Interpolating, extrapolating, and comparing incidence-based species accumulation curves. Ecology, 85, 2717–2727.
Condit, R., Pitman, N., Leigh, E.G., Jr, Chave, J., Terborgh, J., Foster, R.B. et al. (2002). Beta-diversity in tropical forest trees. Science, 295, 666–669.
Duivenvoorden, J.F. (1995). Tree species composition and rain forest-environment relationships in the middle Caquetá area, Colombia, NW Amazonia. Vegetatio, 120, 91–113.
Duivenvoorden, J.F., Svenning, J.-C. & Wright, S.J. (2002). Beta diversity in tropical forests. Science, 295, 636–637.


Fisher, B.L. (1999). Improving inventory efficiency: a case study of leaf-litter ant diversity in Madagascar. Ecol. Appl., 9, 714–731.
Grassle, J.F. & Smith, W. (1976). A similarity measure sensitive to the contribution of rare species and its use in investigation of variation in marine benthic communities. Oecologia, 25, 13–22.
Guariguata, M.R., Chazdon, R.L., Denslow, J.S., Dupuy, J.M. & Anderson, L. (1997). Structure and floristics of secondary and old-growth forest stands in lowland Costa Rica. Plant Ecology, 132, 107–120.
Harte, J., Kinzig, A. & Green, J. (1999). Self-similarity in the distribution and abundance of species. Science, 284, 334–336.
Hubbell, S.P. (2001). A Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press, Princeton, NJ.
Koleff, P., Gaston, K.J. & Lennon, J.J. (2003). Measuring beta diversity for presence–absence data. J. Anim. Ecol., 72, 367–382.
Lee, S.-M. & Chao, A. (1994). Estimating population size via sample coverage for closed capture–recapture models. Biometrics, 50, 88–97.
Legendre, P. & Legendre, L. (1998). Numerical Ecology. Elsevier, Amsterdam.
Leigh, E.G., Wright, S.J., Putz, F.E. & Herre, E.A. (1993). The decline of tree diversity on newly isolated tropical islands: a test of a null hypothesis and some implications. Evol. Ecol., 7, 76–102.
Lennon, J.J., Koleff, P., Greenwood, J.J.D. & Gaston, K.J. (2001). The geographical structure of British bird distributions: diversity, spatial turnover and scale. J. Anim. Ecol., 70, 966–979.
Longino, J.T., Coddington, J. & Colwell, R.K. (2002). The ant fauna of a tropical rain forest: estimating species richness three different ways. Ecology, 83, 689–702.
MacKenzie, D.I., Bailey, L.L. & Nichols, J.D. (2004). Investigating species co-occurrence patterns when species are detected imperfectly. J. Anim. Ecol., 73, 546–555.
Magurran, A.E. (2004). Measuring Biological Diversity. Blackwell, Oxford.
Plotkin, J.B. & Muller-Landau, H.C. (2002). Sampling the species composition of a landscape. Ecology, 83, 3344–3356.
Rodríguez, P. & Arita, H.T. (2004). Beta diversity and latitude in North American mammals: testing the hypothesis of covariation. Ecography, 27, 1–11.
Ruokolainen, K. & Tuomisto, H. (2002). Beta-diversity in tropical forests. Science, 297, 1439a.
Valencia, R., Foster, R.B., Villa, G., Condit, R., Svenning, J.-C., Hernández, C. et al. (2004). Tree species distributions and local habitat variation in the Amazon: large forest plot in eastern Ecuador. J. Ecol., 92, 214–229.
Wolda, H. (1981). Similarity indices, sample size and diversity. Oecologia, 50, 296–302.

Editor, Nicholas Gotelli
Manuscript received 30 June 2004
First decision made 6 August 2004
Manuscript accepted 20 October 2004
