McCune 2002 AnalysisEcologicalCommunities

Analysis of Ecological

Communities

Bruce McCune
Oregon State University, Corvallis

James B. Grace
USGS National Wetlands Research Center, Lafayette, Louisiana

with a contribution from


Dean L. Urban
Duke University, Chapel Hill, North Carolina

MjM Software Design


Gleneden Beach, Oregon
Contents

PART 2. DATA ADJUSTMENTS

23. MULTIVARIATE EXPERIMENTS
24. MRPP (MULTI-RESPONSE PERMUTATION PROCEDURES)
25. INDICATOR SPECIES ANALYSIS
26. DISCRIMINANT ANALYSIS
27. MANTEL TEST
28. NESTED DESIGNS
Cover: The dust bunny distribution in ecological community data, with three levels of abstraction
(see Chapter 5).
Background: A dust bunny is the accumulation of fluff, lint, and dirt particles in the
corner of a room.
Middle: Sample units in a 3-D species space, the three species forming a series of
unimodal distributions along a single environmental gradient. Each axis represents
abundance of one of the three species; each ball represents a sample unit. The vertical
axis and the axis coming forward represent the two species peaking on the extremes of
the gradient. The species peaking in the middle of the gradient is represented by the
horizontal axis.
Foreground: The environmental gradient forms a strongly nonlinear shape in species
space. The species represented by the vertical axis dominates one end of the
environmental gradient, the species shown by the horizontal axis dominates the middle,
and the species represented by the axis coming forward dominates the other end of the
environmental gradient. Successful representation of the environmental gradient
requires a technique that can recover the underlying 1-D gradient from its contorted
path through species space. This is a central problem addressed in one section of this
book.

Cover graphics: Amy Charron

Copyright © 2002 MjM Software Design    © 2002 Bruce McCune


PO Box 129
Gleneden Beach, Oregon 97388 USA

All rights reserved. No part of the material protected by this copyright notice may be reproduced
or used in any form or by any means, electronic or mechanical, including photocopying,
recording, or by any information storage and retrieval system, without written permission from
the copyright holders.

McCune, Bruce
Analysis of Ecological Communities / Bruce McCune, James B. Grace
Includes bibliographical references and index.
ISBN 0-9721290-0-6

Printed in the United States of America


Contents

PART 2. DATA ADJUSTMENTS
PART 3. DEFINING GROUPS WITH MULTIVARIATE DATA
PART 4. ORDINATION
PART 5. COMPARING GROUPS
PART 6. STRUCTURAL MODELS
APPENDIX 1. ELEMENTARY MATRIX ALGEBRA
REFERENCES
Preface
We wanted to write a book on community analysis with practical utility for ecologists. We
chose to emphasize multivariate techniques for community ecology, because there is a great
demand for accessible, practical, and current information in this area.
We hope this book will be useful for researchers, academicians, and students of community
ecology. We envision its use as both a reference book and a textbook. By publishing the book as
a paperback through MjM Software, rather than a major publisher, we hope to keep the price
reasonable.
This book shares some of the first author's experience gained from many years of teaching a
course in community analysis and his participation in developing the software package PC-ORD.
This book is not a manual for PC-ORD - that already exists. The PC-ORD manual describes
the mechanics for using the software, along with a very brief statement of the purpose of each
analytical tool. The current book, in contrast, provides a foundation for better understanding
community analysis. This book also expands the logic behind choosing one technique over the
other and explains the assumptions implicit in that decision. We also illustrate many of the
methods with examples.
We have tried to write a book for all community analysts, not just PC-ORD users. Many of
the techniques described in this book are not currently available in PC-ORD. The reader will
find, however, numerous references to PC-ORD, simply because so many readers will want to
immediately try out ideas generated from reading this book.
Because of the need to relate community properties to environmental factors, Part 6 of the
book deals with newly emerging methods that are well suited for this purpose. Dean Urban
contributed an overview of the methods of classification and regression trees. The final chapter
in Part 6 introduces structural equations, a constantly evolving body of methods for multivariate
hypothesis testing that is widely used in many fields outside of the environmental sciences. The
second author has spent a number of years appraising the value of structural equations and
introducing these methods to the uninitiated. Our treatment of this topic is brief, but we hope to
suggest some of the power of structural equation modeling for understanding ecological
communities.

Bruce McCune dedicates his efforts on this book to Edward W. Beals, for his insight and
teaching on many of these techniques, for his geometric view of community analysis, and for his
willingness to look standard practice in the eye. Bruce also feels forever indebted to Paul L.
Farber and Stella M. Coakley for graciously giving space in their gardens for this project to grow.
Jim Grace acknowledges his appreciation to his major professor, Bob Wetzel, for giving him
an example of the power of perseverance and dedication, and to his mother and sister for their
unwavering support.
We thank the numerous graduate students in McCune's Community Analysis class. Their
collective contribution to the book is tremendous. We apologize for subjecting them to variously
half-baked versions of this document, and we thank them for their patience, corrections, and
suggestions.
We also thank the many people who have provided comments, insights, and encouragement
to Jim Grace during his journey through the world of complex data.
The manuscript greatly benefited from careful readings by Michelle Bolda, Michael Mefford,
JeriLynn Peck, and especially Patricia Muir. Selected chapters were improved by Mark V.
Wilson. Portions of the chapter on cluster analysis were derived from unpublished written
materials by W. M. Post and J. Sheperd. We thank Amy Charron for the cover design and its
central graphic.
CHAPTER 1

Part 1. Overview
Introduction
Who lives with whom and why? In one form or another this is a common question of naturalists,
farmers, natural resource managers, academics, and anyone who is just curious about nature. This
book describes statistical tools to help answer that question.

Species come and go on their own but interact with each other and their environments. Not only do
they interact, but there is limited space to fill. If space is occupied by one species, it is usually
unavailable to another. So, if we take abundance of species as our basic response variable in
community ecology, then we must work from the understanding that species responses are not
independent and that a cogent analysis of community data must consider this lack of independence.

We confront this interdependence among response variables by studying their correlation structure.
We also summarize how our sample units are related to each other in terms of this correlation
structure. This is one form of "data reduction." Data reduction takes various forms, but it has two
basic parts: (1) summarizing a large number of observations into a few numbers and (2) expressing
many interrelated response variables in a more compact way.

Many people realize the need for multivariate data reduction after collecting masses of community
data. They become frustrated with analyzing the data one species at a time. Although this is
practical for very simple communities, it is inefficient, awkward, and unsatisfying for even
moderate-sized data sets, which may easily contain 100 or more species.

We can approach data reduction by categorization (or classification), a natural human approach to
organizing complex systems. Or we can approach it by summarizing continuous change in a large
number of variables as a synthetic continuous variable (ordination). The synthetic variable
represents the combined variation in a group of response variables. Data reduction by
categorization or classification is perhaps the most intuitive, natural approach. It is the first
solution to which the human mind will gravitate when faced with a complex problem, especially when
we are trying to elucidate relationships among objects, and those objects have many relevant
characteristics.

For example, consider a community data set consisting of 100 sample units and the 80 species found
in those sample units. This can be organized as a table with 100 objects (rows) and 80 variables
(columns). Faced with the problem of summarizing the information in such a data set, our first
reaction might be to construct some kind of classification of the sample units. Such a
classification boils down to assigning a category to each of the sample units. In so doing, we have
taken a data matrix with 80 variables and reduced it to a single variable with one value for each of
the objects (sample units).

The other fundamental method of data reduction is to construct a small number of continuous
variables representing a large number of the original variables. This is possible and effective
only if the original response variables covary. It is not as intuitive as classification, because
we must abandon the comfortable typological model. But what we get is the capacity to represent
continuous change as a quantitative synthetic variable, rather than forcing continuous change into
a set of pigeonholes.

So data reduction is summarization, and summarization can result in categories or quantitative
variables. It is obvious that the need for data reduction is not unique to community ecology. It
shows up in many disciplines including sociology, psychology, medicine, economics, market analysis,
meteorology, etc. Given this broad need, it is no surprise that many of the basic tools of data
reduction - multivariate analysis - have been widely written about and are available in all major
statistical software packages.

In community ecology, our response variables usually have distinct and unwelcome properties
compared with the variables expected by traditional multivariate analyses. These are not just minor
violations. These are fundamental problems with the data that seriously weaken the effectiveness of
traditional multivariate tools.

This book is about how species abundance as a response variable differs from the ideal, how this
creates problems, and how to deal effectively with those problems. This book is also about how to
relate species abundance to environmental conditions, the various challenges to analysis, and ways
to extract the most information from a set of correlated predictors.

Definition of community

What is a "community" in ecology? The word has been used many different ways and it is unlikely
that it will ever be used consistently. Some use "community" as an abstract group of organisms that
recurs on the landscape. This can be called the abstract community concept, and it usually carries
with it an implication of a level of integration among its parts that could be called organismal or
quasi-organismal. Others, including us, use the concrete community concept, meaning simply the
collection of organisms found at a specific place and time. The concrete community is formalized by
a sample unit which arbitrarily bounds and compartmentalizes variation in species composition in
space and time. The content of a sample unit is the operational definition of a community.

The word "assemblage" has often been used in the sense of a concrete community. Not only is this an
awkward word for a simple concept, but the word also carries unwanted connotations. It implies to
some that species are independent and noninteracting. In this book, we use the term "community" in
the concrete sense, without any conceptual or theoretical implications in itself.

Why study biological communities?

People have been interested in natural communities of organisms for a long time. Prehistoric people
(and many animals, perhaps) can be considered community ecologists, since their ability to survive
depended in part on their ability to recognize habitats and to understand some of the environmental
implications of species they encountered. What differentiates community ecology as a scientific
endeavor is that we systematically collect data to answer the question "why" in the "who lives with
whom and why."

Another fundamental question of community ecology is "What controls species diversity?" This
springs from the more basic question, "What species are here?" We keep backyard bird lists. We note
which species of fish occur in each place where we go fishing. We have mental inventories of our
gardens. Inventorying species is perhaps the most fundamental activity in community ecology. Few
ecologists can resist, however, going beyond that to try to understand which species associate with
which other species and why, how they respond to environmental changes, how they respond to
disturbance, and how they respond to our attempts to manipulate species.

It is not possible now, nor is it ever likely to be possible, to make reliable, specific, long-term
predictions of community dynamics for specific sites based on general ecological theory. This is
not to say we should not try. But we face the same problems as long-term weather forecasters. Most
of our predictive success will come from short-term predictions applying local knowledge of species
and environment to specific sites and questions.

Purpose and structure of this book

The primary purpose of this book is to describe the most important tools for data analysis in
community ecology. Most of the tools described in this book can be used either in the description
of communities or the analysis of manipulative experiments. The topics of community sampling and
measuring diversity each deserve a book in themselves. Rather than completely ignoring those
topics, we briefly present some of the most important issues relevant to community ecology.
Explicitly spatial statistics as applied to community ecology likewise deserve a whole book. We
excluded this topic here, except for a few tangential references.

Each analytical method in this book is described with a standard format: Background, When to use
it, How it works, What to report, Examples, and Variations. The Background section briefly
describes the development of the technique, with emphasis on the development of its use in
community ecology. It also describes the general purpose of the method. When to use it describes
more explicitly the conditions and assumptions needed to apply the method. Knowing How it works
will also help most readers appreciate when to use a particular method. Depending on the utility of
the method to ecologists, the level of detail varies from an overview to a full step-by-step
description of the method. What to report lists the methodological options and key portions of the
numerical results that should be given to a reader. It does not include items that should be
reported from any analysis, such as data transformations (if any) and detection and handling of
outliers. Examples provide further guidance on how to use the methods and what to report.
Variations are available for most techniques. Describing all of them would result in a much more
expensive book. Instead, we emphasize the most useful and basic techniques. The references in each
section provide additional information about the variants.
Chapter 2

Table 2.1. Examples of objects and attributes in ecological matrices.

Type of study               Objects                    Attributes

Community analysis          Sample plots               Species
                            Stands                     Molecular markers
                            Community types            Structures or functions
                                                       Environmental factors
                                                       Time of sample
Niche-space analysis        Individuals                Resources used or provided
                            Populations                Environmental optima, limits, or responses
                            Species                    Physicochemical characteristics of resources
                            Guilds                     Habitats
Behavioral analysis         Individuals                Activities
                            Populations                Response to stimuli
                            Species                    Test scores
Taxonomic analysis          Individuals (specimens)    Morphological characters
                            Populations                Nucleotide positions
                            Species                    Isozyme presence
                                                       Secondary chemicals
Functional or guild         Individuals                Life history characteristics
analysis                    Populations                Morphology
                            Species                    Ecological functions
                            Higher taxa                Ecological preferences

Table 2.2. Examples of molecular markers used in lieu of species in community ecology. BIOLOG = carbon
source utilization profiles; cpDNA = chloroplast DNA; FAME = fatty acid methyl esters; LH-PCR = length
heterogeneity polymerase chain reaction; T-RFLP = terminal restriction fragment length polymorphisms.

Organisms                       Kind of marker                            Reference

microbial communities           BIOLOG microplates                        Ellis et al. 1995, Garland & Mills 1991,
(fungal or bacterial)                                                     Myers et al. 2001, Zak et al. 1994
fine roots of trees             cpDNA, restriction fragments              Brunner et al. 2001
soil microbes                   FAME                                      Cavigelli et al. 1995;
                                                                          Schutter & Dick 2000, 2001
soil bacteria                   FAME and LH-PCR of 16S rDNA               Ritchie et al. 2000
biosolids in wastewater         FAME                                      Werker & Hall 2001
treatment
aquatic microbes                LH-PCR of 16S rDNA                        Bernhard et al. 2002
nitrogen-fixing microbes        nitrogen fixing gene sequences (nifH)     Affourtit et al. 2001
mycorrhizal fine roots and      soil phospholipid fatty acids             Myers et al. 2001, Wiemken et al. 2001
microbes
microbes in soil                T-RFLP of 16S rDNA and 16S rRNA-          Buckley & Schmidt 2001
                                targeted oligonucleotide probes
Analyzing the data matrix

See Appendix 1 for conventions of simple matrix notation and matrix operations. In the data matrix,
in its normal orientation (matrices A and E, Table 2.3), rows represent sample units, objects,
entities, or cases. Columns represent variables or attributes of the objects. The analysis can take
several general forms, defined below:

Normal analysis: The grouping or ordering of objects (Tables 2.1 and 2.4).

Transpose analysis: The grouping or ordering of attributes (Table 2.1). The matrix can be
transposed, then analyzed (Table 2.4).

Q route: Arriving at a grouping or ordering of either objects or attributes by analyzing a matrix of
relationships among objects. For example, with a matrix of sample units by species, we would
analyze a matrix of relationships among sample units.

R route: Arriving at a grouping or ordering of either objects or attributes by analyzing a matrix of
relationships among attributes. For example, with a matrix of sample units by species, we would
analyze a matrix of relationships among species.

The choice between Q route and R route is usually inherent in the choice of analytical method. For
example, a normal analysis with Bray-Curtis ordination is always by the Q route. A few methods,
however, can be done by either route. For example, identical results from principal components
analysis can be obtained for a normal analysis either by the Q route or the R route.
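
The statement that principal components analysis gives identical results by either route can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up 15 x 8 abundance matrix; the variable names are illustrative and do not come from the book or from PC-ORD. The R route takes eigenvectors of the cross-products matrix among species, and the Q route takes eigenvectors of the cross-products matrix among sample units.

```python
import numpy as np

# Made-up data: 15 sample units (rows) x 8 species (columns).
rng = np.random.default_rng(0)
A = rng.random((15, 8))
Ac = A - A.mean(axis=0)          # center each species (column)

# R route: eigenanalysis of the 8 x 8 cross-products matrix among species.
evals_r, evecs_r = np.linalg.eigh(Ac.T @ Ac)
order_r = np.argsort(evals_r)[::-1][:2]      # two largest eigenvalues
scores_r = Ac @ evecs_r[:, order_r]          # sample-unit scores on axes 1 and 2

# Q route: eigenanalysis of the 15 x 15 cross-products matrix among sample units.
evals_q, evecs_q = np.linalg.eigh(Ac @ Ac.T)
order_q = np.argsort(evals_q)[::-1][:2]
scores_q = evecs_q[:, order_q] * np.sqrt(evals_q[order_q])

# The sign of each axis is arbitrary, so compare absolute values.
same_scores = np.allclose(np.abs(scores_r), np.abs(scores_q))
same_eigenvalues = np.allclose(evals_r[order_r], evals_q[order_q])
```

Both routes recover the same sample-unit scores up to the sign of each axis, so the choice here is one of computational convenience: the R route works with the smaller matrix when there are fewer species than sample units.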

Table 2.3. Example data matrices in community ecology. A: 15 sample units (plots) x 8 species; each species
is indicated by a 4-letter acronym. Each element represents the abundance of a species in a plot. E: 15 sample
units x 3 environmental variables. For the first two variables each element represents a measured value, while
the third variable, "Group," represents assignments to two treatments and a control group. S: 3 ecological
traits x 8 species. Each element represents the characteristic value for an ecological trait for a given species.

        A: Species                                              E: Environmental variables

        ALSA   CAHU   CECH   HYDU   HYEN   HYIN   HYVI   PLHE   Elev   Moisture   Group
Plot01  1.51   0.11   0      0.35   0.21   0      0      0.24   31 1 9 1
Plot02  1.73   0      0      0.25   0.23   0      0      0.53   323    17         1
Plot03  1.20   0      0.02   0.03   0.05   0      0      0.05   12     10         1
Plot04  1.42   0.05   0      0.99   0.13   0      0.09   0.08   15     8          1
Plot05  1.14   0      0      0.14   0.17   0      0      0      183    12         1
Plot06  1.39   0.07   0.07   0      0      0      0      0.04   12     26         2
Plot07  2.26   0.11   0.03   0.02   0      0      0      0.07   46     29         2
Plot08  1.01   0.32   0.03   0      0      0      0      0.48   220    19         2
Plot09  1.09   0.09   0      0      0      0      0      0      61     22         2
Plot10  2.90   0      0.04   0.19   0.71   0.30   0.15   0.39   43     34         2
Plot11  3.22   0.03   0.06   0      0.56   0.41   0      0.63   256    21         3
Plot12  3.42   0.12   0      0.38   0.26   0      0      0.06   46     17         3
Plot13  2.55   0.08   0      0      0.27   0.43   0.08   0.15   76     22         3
Plot14  2.72   0.13   0      0.28   1.11   0      0.11   0      488    36         3
Plot15  3.01   0      0      0.28   0.67   0.86   0      0.25   274    23         3

S
MaxAge
RootDpth
Fecundity
Metaobjects          Species    Environment    Species traits    SU scores

Sample units (SU)    A          E              (AS')             (X)
Species traits       S          (SA'E)
Environment          (A'E)'
Species scores       (Y)

Figure 2.1. The complete community data set. Bold uppercase letters represent different matrices.
Calculated matrices are in parentheses. Sample unit (SU) scores and species scores are based on
ordination or classification of sample units or species. The prime mark (') indicates a transposed matrix
(see Appendix 1).

The difference between the Q route vs. R route is one of mechanics. There are, however, very
important conceptual differences between the normal analysis and transpose analysis. Those
differences have not been thoroughly compared and explained in the literature. Table 2.4 contrasts
normal and transpose analyses, based on the literature (e.g., Clarke 1993, p. 118) and our own
experience. Understanding these differences requires some experience with community analysis, so
we recommend that beginners return to this table as a reference.

Community data sets

Community data sets take many forms, but most of them can be fit into a concept of basic matrices
and their relationships (Fig. 2.1, Tables 2.5 & 2.6). One can think of the sample, species,
environment, and species traits as the basic "metaobjects" in community ecology. In other words, a
sample is a metaobject composed of the objects that are sample units. The environment is a
metaobject composed of environmental variables.

You can measure or calculate a matrix for each pair of metaobjects (Fig. 2.1, Table 2.5).
Furthermore, you can represent each metaobject as points in a space defined by another metaobject
(Table 2.6). In other words, sample units can be represented as points in species space and
vice-versa; if this concept is not immediately clear, we hope it will become clear when distance
measures are introduced (Ch. 6). Similarly, species can be represented as points in environmental
space, environmental variables can be represented as points in species space, etc. Although not all
of these combinations are conceptually appealing, all are mathematically possible.
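
To make the bookkeeping of these matrices concrete, here is a small sketch, assuming NumPy and random placeholder values. The dimensions loosely follow Table 2.3 (15 sample units, 8 species, 3 traits), with E limited to two quantitative variables for simplicity; none of the numbers are real data. It forms the three calculated matrices of Figure 2.1 and records their shapes.

```python
import numpy as np

rng = np.random.default_rng(1)
n_su, n_sp, n_env, n_tr = 15, 8, 2, 3    # illustrative sizes only

A = rng.random((n_su, n_sp))     # sample units x species (placeholder abundances)
E = rng.random((n_su, n_env))    # sample units x environmental variables
S = rng.random((n_tr, n_sp))     # traits x species

AS_t = A @ S.T        # AS'  : sample units x traits
AtE = A.T @ E         # A'E  : species x environment
SAtE = S @ A.T @ E    # SA'E : traits x environment

shapes = {"AS'": AS_t.shape, "A'E": AtE.shape, "SA'E": SAtE.shape}
```

Each product is defined only because the inner dimensions agree (species for AS', sample units for A'E), which is a quick check that the rows and columns of the inputs are oriented as intended.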
Table 2.5. Sources and types of multivariate ecological data used in studies of species composition.
Categorical (= nominal) variables are qualitative rather than quantitative, indicating membership in one
of two or more categories.

Matrix                                    Matrix fullness   Categorical* vs. ordinal   Usual data source

Species composition                       sparse**                                     field
Environment                               full                                         field
Species traits                            full                                         field/literature
Sample ordination or group membership     full                                         calculated
Species ordination or group membership    full                                         calculated
AS'   samples x traits                    full                                         calculated***
A'E   species x environment               full                                         calculated***
SA'E  traits x environment                full                                         calculated***

* Categorical (nominal) variables cannot be used in calculating new matrices, unless they are converted to a
series of binary (0/1) variables.
** A sparse matrix contains many zero values.
*** Rare in the literature.
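
The conversion mentioned in the first footnote can be sketched as follows, assuming NumPy. The input is the Group column of matrix E in Table 2.3 (codes 1 to 3 for 15 plots); the variable names are illustrative.

```python
import numpy as np

# Group column of matrix E in Table 2.3: one categorical code per plot.
group = np.array([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3])

levels = np.unique(group)        # the distinct categories: 1, 2, 3

# One binary (0/1) column per category; each plot has a 1 in exactly one column.
dummies = (group[:, None] == levels[None, :]).astype(int)   # 15 x 3
```

The resulting 0/1 columns can enter matrix products, whereas the original codes 1, 2, 3 would be treated as if they were measurements on a numeric scale.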

Table 2.6. Matrices required for representing ecological entities in spatial
coordinates defined by other sets of ecological variables. You can represent any
of these entities in any of the coordinate systems. All are appropriate for
multivariate analysis. The prime mark (') indicates a transposed matrix (see
Appendix 1).

                    Coordinate system
Entities            Species     Samples     Environment     Species traits

Species             -           A'          A'E             S'
Samples             A           -           E               AS'
Environment         (A'E)'      E'          -               (SA'E)'
Species traits      S           (AS')'      SA'E            -
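
Placing species in environment coordinates (the A'E entry in Table 2.6) is sensitive to the measurement scale of each environmental variable. The sketch below, assuming NumPy and made-up values, compares A'E computed from raw variables with A'E computed after expressing each environmental value as standard deviations from its mean, the relativization the text attributes to Brazner and Beals (1997). The variable names `elev` and `cover` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((15, 8))                      # placeholder abundances, 15 plots x 8 species
elev = rng.uniform(1000, 3000, size=15)      # hypothetical variable on a large scale
cover = rng.uniform(0, 1, size=15)           # hypothetical variable on a 0-1 scale
E = np.column_stack([elev, cover])

raw = A.T @ E                                # species x environment, unscaled

# Express each value as standard deviations away from the mean of its variable.
Ez = (E - E.mean(axis=0)) / E.std(axis=0, ddof=1)
scaled = A.T @ Ez                            # species x environment, equal weighting

# Fraction of each species' total absolute score that comes from the first variable.
raw_share = np.abs(raw[:, 0]) / np.abs(raw).sum(axis=1)
```

In the unscaled product nearly all of each species' score comes from the large-scale variable, while the standardized version gives the two variables equal weight.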
The most common data sets are either the matrix A (sample units in rows, species in columns) or the
combination of matrices A and E (sample units in rows, environmental variables in columns). The
matrix S is rarely used, but may be of primary interest where the focus is on variation in life
history strategies or traits as it relates to environments or communities. Interest has increased
in using S as a basis for assigning species to "functional groups."

The matrices X and Y are very commonly used. Typically X has only a few columns, each column
representing placement on an ordination axis (the sample unit "score"). Alternatively, X may
contain a categorization of the sample units into community types. Similarly, Y may contain species
scores on ordination axes and/or a categorization of species into species types.

The other matrices are rarely used, but can be produced by matrix multiplication. For example,
suppose you are doing a standard community analysis on species composition and environmental data.
Your main matrix contains the abundance of each species in each plot and your secondary matrix
contains the environmental parameters for each plot. The most common analytical approach is to
examine the relationships among plots in species space. That is, plots are grouped, ordered, or
otherwise arranged by their similarities in species composition, A. Relationships between the
environmental variables, E, and the community patterns are then sought. The purpose of this
procedure is to see how species distributions are related to environmental factors.

The species x environment matrix

Why not examine species directly in environmental space? First you would need a matrix of species
scores for each environmental variable. One way to produce such a matrix is to multiply your main
matrix, A, by your environmental matrix, E. (See Appendix 1 for matrix multiplication.) Actually,
to make the matrix algebra come out right, you must postmultiply the transpose of the main matrix
(rows and columns switched) by the environmental matrix:

    species x environment matrix = A'E

(The transpose of A is indicated by A'; see Appendix 1 for elementary matrix algebra.) The
resulting matrix contains scores for each species on each environmental variable. This matrix can
now be summarized further by analyzing its correlation structure (using correlation in the broad
sense). A good example of this approach is in Brazner and Beals (1997).

Serious potential pitfalls in calculating A'E must be carefully avoided. For example, all of the
environmental variables are included, whether or not they are related to species distribution and
abundance. By multiplying the two matrices you implicitly assume that all of the environmental
variables are important. Brazner and Beals reduced this problem by first restricting the
environmental variables to those shown to be important in relation to an ordination of sample units
in species space.

It is important to give some thought to prior data transformations (Ch. 9). Because matrix
multiplication involves adding the products of many numbers, the resulting numbers can become very
small or very large.

A worse problem is that the resulting matrix can be nearly meaningless if careful attention is not
paid to scaling your variables. For example, if one of the environmental variables ranges from 1000
to 3000 and the other environmental variables range from 0 to 1, the large numbers will completely
obscure the small numbers, resulting in a matrix dominated by a single environmental variable. This
problem could be avoided by standardizing or relativizing the variables in the two matrices, so
that each variable is given equal weight (see Chapter 9). Brazner and Beals (1997) relativized all
of their environmental variables, expressing each data point as the number of standard deviations
away from the mean for that variable.

If you do this kind of matrix manipulation, it is important that you examine the resulting matrices
critically to be sure they contain what you think they contain. See Greig-Smith (1983 pp. 229 and
278) for further cautions about ordinations based on environmental data.

The sample unit x species traits matrix

Given the standard community matrix, A, and a matrix of species traits, S, one can multiply these
to yield a matrix of sample units by species traits:

    sample units x species traits = AS'

Analysis of this matrix (e.g., Feoli & Scimone 1984, Diaz et al. 1992, Diaz & Cabido 1997, Diaz et
al. 1999, Diaz Barradas et al. 1999, Lavorel et al. 1999, Landsberg et al. 1999) would reveal how
sample units are related to each other in terms of species traits. One might wish to contrast the
blend of species traits in different groups of sample units (e.g., treatment vs. control). For
example, Lavorel et al. (1999) analyzed traits from AS', one trait at a time, comparing
experimental treatments with univariate ANOVA. Or one might wish to study how species traits covary
in a sample along a gradient. For example, Diaz et al. (1999) used ordination to extract the main
gradients (X) from the AS' matrix, then analyzed the relationship of those gradients to climate and
disturbance (E).

If A and S are binary (A contains presence-absence data and S specifies 1=yes or 0=no for each
trait for each species), then AS' gives the total number of species in a sample unit with a given
trait. If either A or S or both are quantitative, the elements of the AS' matrix can be considered
to represent the abundance or magnitude of a particular trait in a particular sample unit. If S is
quantitative and different traits are measured on different scales (the usual case), then the
traits must be relativized in some way (for example, as a proportion of the maximum value or as
number of standard deviations away from the mean; see Chapter 9).

Functional groups

We can add another matrix, G, that assigns species to functional groups. The groups may be
different states of a single categorical variable, or the groups may be multiple variables that are
not necessarily mutually exclusive. For example, the Common Loon could be assigned to both the
groups "divers" and "predators."

G can be thought of as a summary of S, where G must contain categorical or binary variables rather
than measurement variables. If G is derived from S by multivariate analysis (Pillar 1999a), and
different traits are measured on different scales (the usual case), then the traits must be
relativized in some way. This can happen as part of the analysis itself (e.g., the choice of
correlation coefficients in the cross-products matrix for PCA; see Chapter 14) or as a
relativization prior to analysis (see Chapter 9).

The functional group matrix, G, can be analyzed in relation to a community sample, A, and
environmental variables, E. Kleyer (1999), for example, first derived G by a cluster analysis of S.
In this case, G contained a single variable, representing groups of species with similar traits.
The number of individuals in each species group in each sample unit is conveniently calculated by
first separating the group membership variable in G into a series of mutually exclusive binary
variables representing membership or not (1/0) in each species group, then calculating AG'
(analogous to AS'). This yields the representation of each species group in each sample unit.
Because A consisted of counts of each species in Kleyer's case, each element of AG' specified the
number of individuals of each functional group in each sample unit. Multivariate analysis of the
AG' matrix is possible, but in this case, Kleyer used Gaussian logistic regression to fit the
probability of co-occurrence of all species in a group along a gradient plane of resource supply
and disturbance intensity.

The traits x environment matrix


Traits Functional
groups Calculating the traits x environment matrix .
SA'E. is one solution to the "fourth-corner problem"
(Legendre et al. 1997). This refers to the fourth corner
(lower right) in the square arrangement of matrices in
Figure 2.1. This matrix can be used to represent ho\v
species traits (morphological, physiological. phylo-
genetic. and behavioral) are related to habitat or site
characteristics. Three data matrices are required: S. A.
and E.
With presence-absence species data and nominal
variables in S and E. then SATEis a contingenc! table.
Legendre et al. (1997) pointed out that one cannot use
a G statistic to test for independence of traits and
environment in this contingenc? table. because several
Figure 2.2. Derivation of functional groups, G , from a species are obsened in each sample urut. This makes
species trait matrix, S. the obsenations norundependent The!- propose a per-
mutation method for staumcal significance.
Functional groups can be assigned a priori or \Vith quanutatne data all of the cautions about
based on an analysis of the trait matrix. S. G can be calculating .4'E q p h kre as \bell. Carefull! done. the
Chapter 2

Table 2.7. Comparison of a focal species-centered approach to a general habitat-centered approach for analysis of a focal species in relation to its habitat. F, H, and S are defined in Figure 2.3.

Goal
    Focal species-centered approach: Predict focal species (i.e., predict presence or abundance or performance of focal species based on habitat variables).
    General habitat-centered approach: Describe variation in available habitats, then describe position of the focal species within that variation.

Matrix concepts
    Focal species-centered approach: F = f(subset of H)
    General habitat-centered approach: Derive S from H, then see how F is related to X.

Example tools
    Focal species-centered approach: Logistic regression. Multiple regression. Discriminant analysis.
    General habitat-centered approach: Extract primary gradients X in habitat by ordination or classification of habitat matrix H. Overlay F on this ordination and calculate correlations between F and X.

Advantages
    Focal species-centered approach: Better predictive power for focal species.
    General habitat-centered approach: Better description of habitat variation in general. Potentially applicable to a wider range of species.

Disadvantages
    Focal species-centered approach: Little or no description of habitat variation in general. Not applicable to other focal species.
    General habitat-centered approach: Worse predictive power for focal species.

Cautions
    Focal species-centered approach: Need to evaluate reliability of predictions, preferably using an independent data set, an unused subset of the data set, or a resampling procedure.
    General habitat-centered approach: Need to carefully consider (1) relative weights of variables that are on different scales and (2) which variables to include.
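
The matrix products used in this chapter (AS' and SA'E) can be illustrated with a small worked example. This is only a sketch under stated assumptions: the toy values, the helper names `transpose` and `matmul`, and the layout (A as sample units x species, S as traits x species, E as sample units x environmental classes) are chosen for illustration and are not data from any study cited here.

```python
# Toy presence-absence data; every value below is hypothetical.
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols] for row in x]

A = [[1, 0, 1],   # sample unit 1 contains species 1 and 3
     [0, 1, 1],   # sample unit 2 contains species 2 and 3
     [1, 1, 0]]   # sample unit 3 contains species 1 and 2
S = [[1, 1, 0],   # trait 1 is present in species 1 and 2
     [0, 1, 1]]   # trait 2 is present in species 2 and 3
E = [[1, 0],      # sample unit 1 belongs to site class a
     [0, 1],      # sample units 2 and 3 belong to site class b
     [0, 1]]

# AS': number of species with each trait in each sample unit.
AS = matmul(A, transpose(S))
# SA'E: traits x environment contingency table.
SAE = matmul(matmul(S, transpose(A)), E)

print(AS)   # sample units x traits
print(SAE)  # traits x environmental classes
```

With binary A and S, each element of AS' is a count of species, as described in the text. As the text notes, the cells of SA'E should not be tested with an ordinary G statistic, because species within a sample unit are not independent observations.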
CHAPTER 3

Community Sampling and Measurements


Sampling is the process of selecting objects of study from a larger number of those objects (the population). Each object is then subjected to one or more measurements or observations. Although the word "sampling" is often used in a broad sense to include a discussion of the measurements, the two concepts are distinct.

This book does not include sampling theory, nor does it contain a comprehensive survey of sampling methods. It does include a few sampling basics and a discussion of some recurrent issues in community sampling and measurement.

To be perfectly clear, we will use the word "sample" to refer to a collection of sampling units or sample units (SUs). In casual conversation, a "sample" is often used to mean a single sample unit or a collection of sample units.

Measures of species abundance

Cover is the percentage of some surface covered by a vertical projection of the perimeter of an organism. Note that when summed for a given sample unit, percent cover can exceed 100%, because of multiple layering. Cover excels as an abundance measure in speed, repeatability, comparability between different estimation methods, and because it can be measured nondestructively. Conversion to biomass estimates is possible but requires additional data collection for calibration (e.g., Forman 1969, McCune 1990). Cover is the most commonly used abundance measure for plants.

Percent cover is often scored as cover classes, rather than estimates to the nearest percent (although these too are classes, just much narrower). Using cover classes rather than attempting to estimate cover to the nearest one percent tends to speed sampling and data entry. Cover classes do not pretend to achieve more accuracy than is realistic. Furthermore, they yield statistical results that are similar to unclassed data, provided that the classes are not too broad (Sneath & Sokal 1973). Cover classes have been shown to be effective surrogates for direct biomass measurement (Hermy 1988, McCune 1990) and detectors of community changes through time (Mitchell et al. 1988). Although most analysts treat cover classes as if they were continuous data, some methods are available that explicitly recognize their ordered, multistate nature (e.g., Guisan & Harrell 2000). A disadvantage of cover classes is the potential for consistent differences (bias) between observers (Sykes et al. 1983).

The most useful and most commonly used cover classes are narrow at the extremes and broad in the middle. These approximate an arcsine-squareroot transformation, which is generally desirable for proportion data. The cover classes can thus be analyzed directly, improving normality and homogeneity of variances among groups without converting to percentages. Unless transformed (Chapter 9), multivariate analysis of raw percent cover data tends to emphasize the dominant species at the expense of species with medium to low abundance. Cover class data seldom have this problem. If, however, cover classes are transformed into midpoints of the ranges, then the problem reappears.

Many cover class schemes have been devised. Some of the most common and/or logical are listed in Table 3.1. Note the high degree of similarity among the systems.

Table 3.1. Cutoff points for cover classes. Question marks for cutoff points represent classes that are not exactly defined as percentages. Instead, another criterion is applied, such as number of individuals. Cutoffs in parentheses are additional cutoff points used by some authors.

Name: Arcsine squareroot
    Cutoff points, %: 0 1 5 25 50 75 95 99
    Notes: Designed to approximate an arcsine squareroot transformation of percent cover.
    References: Muir & McCune (1987, 1988)

Name: Braun-Blanquet
    Cutoff points, %: 0 ? ? 5 25 50 75
    Notes: Uses two categories of low cover not exactly defined as percents. Commonly used in Europe.
    References: Braun-Blanquet (1965), Mueller-Dombois & Ellenberg (1974)

Name: [missing in source]
    Cutoff points, %: [missing in source]
    Notes: Widely used in western U.S. in habitat-typing efforts by U.S. Forest Service and many other studies.
    References: [missing in source]

Name: [missing in source]
    Cutoff points, %: [missing in source]
    Notes: One category of low cover not exactly defined as percent.
    References: Krajina (1933); Mueller-Dombois & Ellenberg (1974)

Name: Hult-Sernander (modified)
    Cutoff points, %: 0 (.02 .05 .10 .19 .39 .78) 1.56 3.13 6.25 12.5 25 50 75 ...
    Notes: Based on successive halving of the quadrat.
    References: Oksanen (1976)

Frequency is the proportion (or percentage) of sample units in which a species occurs. The best traits of frequency are that it is relatively sensitive to infrequent species and it is fast to score in the field. Frequency measures should, however, generally be avoided because frequency, unlike cover or density, is highly dependent on the size of the sample unit. Because there is little standardization in size of sample units, use of frequency measures restricts opportunities for comparison with other studies.

Frequency does, however, carry information about spatial distribution. For example, consider two populations of equal density, one highly aggregated and the other dispersed. The second population is more frequent.

If individuals are randomly located then frequency is an asymptotic function of density (Fig. 3.1). A little known fact: an average density of two individuals/SU gives about 86% frequency. Why? What underlying distribution is this based on? (Answer: the Poisson distribution, the same distribution as is used to describe the number of chocolate chips in chocolate chip cookies.) As SU size increases, frequency loses sensitivity and plateaus at 100%.
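
The 86% figure can be checked directly. If individuals are located at random, the number per sample unit follows a Poisson distribution, so the chance that a unit is empty is exp(-m) for a mean density of m individuals/SU, and the expected frequency is 1 - exp(-m). The sketch below assumes exactly that model; the function name is hypothetical.

```python
import math

def expected_frequency(density_per_su):
    """Expected proportion of occupied sample units under random (Poisson)
    placement: 1 minus the probability of a count of zero."""
    return 1.0 - math.exp(-density_per_su)

# Frequency rises quickly with density and then flattens toward 100%.
for density in (0.5, 1, 2, 4, 8):
    print(density, round(100 * expected_frequency(density), 1))
```

Because density per SU grows with SU size, the same curve shows why frequency saturates when sample units are large.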
Counts (density). Density is the number of individuals per unit area. Density is not dependent on size of the SU. Relative density of species j is the proportion of all individuals (summed over the p species) that belong to species j:

\[ \text{Relative density}_j = \frac{\text{Density}_j}{\sum_{j=1}^{p} \text{Density}_j} \]

Figure 3.1. Expected percent frequency of presence in sample units (SU) as a function of density (individuals/SU).

Density is useful if the target organisms have readily distinguishable individuals (these may be ramets of clonal plants) and the individuals do not vary much in size. Density is of questionable utility when applied to organisms that vary greatly in size, such as trees, unless applied to restricted size classes (e.g., seedling density).

Biomass. Biomass values are usually relatively destructive and tedious to obtain directly. Most often, biomass is estimated by regression equations based on more easily measured "predictor" variables. Of course, someone at sometime had to go through the tedium of obtaining data on both the biomass and the predictors. Biomass is often chosen as a measure of abundance when the functional role of the organisms is important. For example, it may be important to estimate available forage for animals.

Basal area is a measure applied to individual trees (units usually cm2), then aggregated to the stand level, where it is sometimes referred to as dominance. The usual units for stand-level basal area are m2/hectare (or ft2/acre historically). We like basal area as a descriptor of forests because it is more closely proportional to leaf area and foliage mass than are the other common measures, such as density or frequency. Thus, we would argue that it has more functional significance than most other simple descriptors of forest structure. See Box 3.1 for an example description of forest composition based on individual-tree data.

Relative measures: Density, frequency, and dominance of species may all be expressed as relative proportions. For example, relative density of species j is the ratio of its density to the overall density, the sum of the densities of the p species. These relative measures are commonly expressed as percents, by multiplying the proportions by 100, for example:

\[ \text{Relative density}_j\,\% = \frac{100 \cdot \text{Density}_j}{\sum_{j=1}^{p} \text{Density}_j} \]

Importance values are averages of two or more of the above parameters, each of which is expressed on a relative basis. For example, a measure often used for trees in eastern North American forests is:

IV% = (Relative frequency + relative dominance + relative density) / 3

where relative dominance is based on basal area. Some ecologists feel that importance values muddle interpretation, while others appreciate the simplification of several measures into one overall measure of abundance.

The advantage of importance values is that they are not overwhelmingly influenced either by large tree size (as is relative dominance) or large numbers of small trees (as are relative frequency and relative density). The importance values add to 100 when summed across species in a stand.

The chief disadvantage of importance values is that you never know quite what a number represents. For example, consider an IV based on relative density and relative dominance. The following two species have the same IV, yet they will look very different in the stand (Table 3.2).

Table 3.2. Example of identical importance values representing different community structures.

                      Species 1   Species 2
Relative density          42           8
Relative dominance        10          44
Sum                       52          52
IV%                       26          26

Presence-absence is a very useful measure in large-scale studies. It is also what is recorded in point-intercept sampling. In large regional studies or any other study in which the heterogeneity of the SUs is large, most of the information in the data will be carried in the presence-absence of species. But presence-absence is not useful in detecting more subtle differences in more homogeneous areas. For example, in a comparison of old-growth and managed second-growth forests in Montana, the lichen Alectoria sarmentosa was present in all stands but was consistently much more abundant in old growth than in second growth (Lesica et al. 1991).

Size-class data (or age classes or life history stages) can give useful insight into the history and future of a species. For example, an ideal "climax" species should be well represented in all size classes, indicating that the species is reproducing and replacing itself at a site. On the other hand, a species that is present in only the largest size classes may gradually be lost from a population as the large, old individuals die.

Size-structured populations are sometimes incorporated into community analysis by treating each size class (or age class or stage) as a separate "species" in the analysis. This has a desirable effect of incorporating information that may better integrate life history patterns into an analysis of community patterns. On the other hand, the goals of a study may be sufficiently broad that including this additional detail contributes little or nothing to the results.

Box 3.1. Example of stand description, based on individual tree data from fixed-area plots. The variance-to-mean ratio, V/M, is a descriptor of aggregation, values larger than one indicating aggregation and values smaller than one indicating a more even distribution than random. The variance and mean refer to the number of trees per plot. IV and other measures are defined in the text.

Raw data for three tree species in each of four 0.03 hectare plots. Each number represents the diameter (cm) of an individual tree.

Species          Plot 1  Plot 2  Plot 3  Plot 4
Carya glabra     23 22 31
                 24
Cornus florida   10 10 12
                 10 11
                 12
Quercus alba     13 20 11 10
                 17 30 32
                 44

Frequencies, counts, total basal areas, stand densities, and stand basal areas.

Species          Freq.   No. trees   BA (dm2)   Freq. %   Density (trees/ha)   BA (dm2/ha)
Carya glabra       3         4         20.0        75            33.3             166.9
Cornus florida     3         6          5.6        75            50.0              46.4
Quercus alba       4         8         38.8       100            66.7             323.3
Totals            10        18         64.4                     150.0             536.56

Relative abundances, importance values, and variance statistics.

                 Relative abundance
Species          Frequency   Density   Dominance   IV (%)   Variance, no. trees   V/M
Carya glabra       30.0        22.2       31.1       27.8          0.67           0.67
Cornus florida     30.0        33.3        8.7       24.0          1.67           1.11
Quercus alba       40.0        44.4       60.3       48.2          0.67           0.33
All species                                                        1.00           0.22

Number of quadrats = 4
Empty quadrats = 0
Quadrat size = 0.030 hectares
Area sampled = 0.120 hectares
Average BA/tree = 3.577 dm2
BA/hectare = 5.366 m2/hectare
Trees/hectare = 150
Trees/quadrat = 4.5
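
The relative measures and importance values in Box 3.1 can be recomputed from the tree diameters. The sketch below is illustrative only: the variable and function names are hypothetical, diameters are pooled across plots, and the diameter of the last Quercus alba tree (44 cm) is an assumption inferred from the reported basal area of 38.8 dm2 rather than read directly from the raw data.

```python
import math

# Diameters (cm) pooled over the four plots. The final Quercus alba value (44)
# is an assumption back-calculated from the reported total basal area.
diameters = {
    "Carya glabra": [23, 22, 31, 24],
    "Cornus florida": [10, 10, 12, 10, 11, 12],
    "Quercus alba": [13, 20, 11, 10, 17, 30, 32, 44],
}
# Number of plots (out of four) in which each species occurred.
plots_occupied = {"Carya glabra": 3, "Cornus florida": 3, "Quercus alba": 4}

def basal_area_dm2(diameter_cm):
    """Cross-sectional area of one stem, converted from cm2 to dm2."""
    return math.pi * (diameter_cm / 2.0) ** 2 / 100.0

def relative_percent(values):
    """Express each species' value as a percentage of the total over species."""
    total = sum(values.values())
    return {sp: 100.0 * v / total for sp, v in values.items()}

basal_area = {sp: sum(basal_area_dm2(d) for d in ds) for sp, ds in diameters.items()}
tree_counts = {sp: len(ds) for sp, ds in diameters.items()}

rel_frequency = relative_percent(plots_occupied)
rel_density = relative_percent(tree_counts)
rel_dominance = relative_percent(basal_area)

# Importance value: mean of the three relative measures.
importance = {
    sp: (rel_frequency[sp] + rel_density[sp] + rel_dominance[sp]) / 3.0
    for sp in diameters
}

for sp in diameters:
    print(sp, round(basal_area[sp], 1), round(importance[sp], 1))
```

Relative density computed from counts equals relative density computed from trees/ha, because every species is divided by the same sampled area.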

Defining the population

Write down the definition of your population. It is important that the population in the statistical sense be defined in writing during the planning stages. Revise the definition in the field as you encounter unexpected situations. Perhaps the clearest practical way of defining the population is to make a list of criteria used to reject SUs. This list should be reported in the resulting publication. For example, one might include the sentence, "Plots were selected on southeast to west aspects between 1000 and 1500 m in elevation. Only stands with the dominant cohort between 70 and 120 years in age were included. Rock outcrop, talus, and riparian areas were excluded."

Homogeneity within sample units. Very often we apply a criterion of homogeneity to sample units, the idea being that SUs are internally more-or-less homogeneous. Typically, sites or "stands" are considered to be areas that, at the scale of the dominant vegetation, are essentially homogeneous in vegetation, environment, and history. This is almost always applied in a very loose way. One rule of thumb is that the leading dominant should vary no more than by chance. This could be evaluated by subsampling, but in practice, homogeneity is usually assessed by eye.

Placement of sample units

An anonymous early ecologist: "The most important decision an ecologist makes is where to stop the car."

1. Random sampling requires the application of two criteria: each point has an equal probability of being included and points are chosen independently of each other.

2. Stratified random sampling has the additional feature that a population (or area) is divided into strata: subpopulations (or subareas) with known proportions of the whole. Sample units are selected at random within strata. Stratification allows sampling intensity to vary among different strata, yet you can still calculate overall population estimates for your parameters.

3. Regular (systematic) sampling has sample units that are spaced at regular intervals. In most cases, the consequences of sampling in a regular pattern are not severe unless the target organisms are patterned at a scale similar to the distance between SUs. It has repeatedly been shown, however, that the p-values emerging from hypothesis tests based on systematic sampling are NOT accurate (e.g., see Whysong and Miller 1987). In many cases, though, the practical advantages of systematic sampling (say with a grid of points) outweigh the reduction in faith in our p-values. One practical advantage is that it is usually much easier to relocate systematically placed permanent plots than randomly placed plots.

4. Arbitrary but without preconceived bias. We find this to be an apt phrase for what biologists usually do when they are claiming "random sampling." Unless you are strictly following a randomization scheme that is applied to your WHOLE population, you cannot really say you are sampling at random. Very often we try hard not to bias the sample but do not carefully randomize. What are the consequences of this? Surely they depend on the goal of the study. The more important it is to make a statement with known error about the population as a whole, the more important it is that sampling be truly random. When describing your sampling method, consider calling it "arbitrary but without preconceived bias" or "haphazard" instead of bending the truth by calling it random sampling if it is, in fact, not random.

5. Subjective. In some cases it makes sense to locate samples subjectively, but you should be very cautious about using subjective sampling; it has a long and partly unpleasant history. Sampling in Europe (especially the Braun-Blanquet school of phytosociology) and North America was often based on subjectively placed SUs, the criteria being whether or not the community was "typical" of a preconceived community type. Clearly, one cannot use such data to make objective statements about topics such as the existence of continuous vs. discrete variation in communities.

In other cases, subjectivity is a necessary and integral part of a study design. For example, if you want to find or study the most diverse spots in the landscape, then it is reasonable to use a subjective visual assessment of diversity to choose SUs. Of course, the price is that you immediately reduce the scope of inference from the study. In this example, you would obviously be remiss to use the resulting diversity estimates to make a statement about average diversity in the landscape as a whole.

Sources of random digits. Often one needs a source of random numbers in the field. Some people copy a page from the random numbers tables contained in most compilations of statistical tables. Some people have used a die in a clear container. Random digits can also be assigned in advance and entered on data forms, before going into the field. Fulton (1996) pointed out that the digital stopwatch built into most people's watches is a good source of random digits, when the digit appearing in one of the most rapidly changing positions is used.

Types of sample units

Fixed-area. Fixed-area sample units are of a set size and shape. Usually these are called quadrats or plots. Quadrats need not be four-sided. The ease of use and statistical efficiency of quadrats depend on their shape.

Circles are very fast to lay out, only one marker is needed for permanent relocation, and they minimize the number of edge decisions (lowest perimeter to area ratio), but they have the poorest shape for estimation from aggregated distributions, yielding a high variance among SUs.

Squares are slow to lay out when large, two or four markers are needed for permanent relocation, and they have a poor shape for aggregated distributions.

Rectangles are slow to lay out when they are large and two or four markers are needed for permanent relocation, but they have a better shape for aggregated distributions (the narrower the better for that; i.e., lower variances among SUs). Rectangles require more edge decisions than do other shapes.

Point intercept: Percent cover is calculated as proportion of hits by (theoretically) dimensionless points. Points are usually arrayed in pin frames, ticks on tapes, etc. With multilayered vegetation more than one hit can be recorded per pin. Some details follow. Size of pin makes a difference in rarer species (Goodall 1952, 1953b). The more abundant the species, the less the pin size matters. Doubling the pin diameter from 2 mm to 4 mm makes about a 5-10% difference in % cover. Point sampling is difficult to apply to grasses or tall vegetation, and easiest to apply in low-growing vegetation such as tundra. Clustering pins in frames is convenient but reduces the quality of estimates for a given number of points, because the points are not independent.

Line intercept. Percent cover is calculated as a proportion of a line directly superimposed on a species (by vertical projection). This method was developed for use on desert shrubs. It is difficult or impossible to use if the highest plants are taller than your eye level. It is a relatively good method for both common and rare species. The line intercept method allows an estimate of cover but not density, unless individuals are of uniform size.

Distance methods. Distance-based methods have been most frequently used for sampling forest structure (Cottam & Curtis 1956) as well as for animal populations (Buckland et al. 1993). If you are sampling forests, you measure distances from randomly chosen points to the nearest trees. The diameter (or basal area) and species of those trees are then recorded. The most commonly used of these methods is the point-centered quarter method (Cottam & Curtis 1956). In the point-quarter method the observer measures the distance to the nearest individual in each of four quadrants around randomly chosen points.

Distance methods are based on the concept that the distances can be used to calculate a "mean area" occupied by the objects. The average of the distances is equal to the square root of the mean area. Mean area is the reciprocal of density. Density is converted into some measure of dominance or biomass by multiplying the density by the average size of each object. Usually this means multiplying tree density by the average basal area of the trees to arrive at a total basal area per unit land area.

Distance methods can also be used to estimate the quantity of any discrete objects in any area. For example, Peck and McCune (1998) used the point-centered quarter method to estimate the biomass of harvestable epiphytic moss mats in forests in western Oregon. Batcheler (1971) used point-to-object distances to estimate the density of animal pellet groups and introduced a correction factor for a fixed maximum search distance.

The main drawback to distance methods is that their effectiveness diminishes as the objects become increasingly aggregated. Distance methods perform well when the objects are distributed at random. Because plots need not be laid out, distance methods are usually more rapid to apply than area-based methods. Using a laser-based tool for measuring distances makes the method even more rapid and accurate.

Another criticism of distance methods is that they require judgements by the analyst that can lead to differences in estimates of density. Anderson and Southwell (1995) compared results from a panel of students and experts. All participants used the same data, but they were not restricted to particular software. The authors concluded that "the subjective aspects of the analysis of distance sampling may be overcome with some education, reading and experience with computer software."
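
The arithmetic behind distance methods, as described above, can be sketched in a few lines: the squared mean distance gives the mean area per object, its reciprocal gives density, and density times mean basal area gives basal area per unit land area. The function name, the units (distances in meters, diameters in centimeters), and the per-hectare scaling are assumptions made for this example; no correction factors, such as the one for a fixed maximum search distance, are applied.

```python
import math

def distance_method_estimates(distances_m, diameters_cm):
    """Return (trees per hectare, basal area in m2 per hectare) from
    point-to-tree distances and the diameters of the measured trees."""
    mean_distance = sum(distances_m) / len(distances_m)
    mean_area_m2 = mean_distance ** 2          # mean area occupied by one tree
    density_per_ha = 10000.0 / mean_area_m2    # reciprocal of mean area, per hectare
    # Basal area of each stem in m2: pi * radius^2, with radius = d / 200 m.
    mean_basal_area_m2 = sum(
        math.pi * (d / 200.0) ** 2 for d in diameters_cm
    ) / len(diameters_cm)
    return density_per_ha, density_per_ha * mean_basal_area_m2

# Hypothetical data: one sample point, nearest tree in each of four quadrants.
density, basal_area = distance_method_estimates([4.0, 6.0, 5.5, 4.5], [20, 35, 18, 27])
print(round(density), round(basal_area, 1))
```

In practice the distances from many points would be pooled before averaging, and the estimate is most trustworthy when objects are close to randomly distributed.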

Table 3.3. Average accuracy and bias of estimates of lichen species richness and gradient scores in the southeastern United States. Results are given separately for experts and trainees in the multiple-expert study. Extracted from McCune et al. (1997). N = sample size.

                                                            % Deviation from expert
                                       Species richness     Score on climatic   Score on air
                                                            gradient            quality gradient
Activity                           N   % of expert   Bias   Acc.     Bias       Acc.     Bias
Reference plots                   16       61        -39     4.4     +2.4       11.1    -10.5
Multiple-expert study, experts     3       95         -5     3.6     +3.6        4.7     -4.7
Multiple-expert study, trainees    3       54        -46     8.0     +8.0        5.0     -5.0
Certifications                     7       74        -26     2.7     +2.4        2.1     -2.1
Audits                             3       50        -50    10.3     +3.7        6.0     +2.7
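
The species richness columns of Table 3.3 are linked by simple arithmetic. The sketch below assumes that percent capture is 100 x observed / true and that percent bias and accuracy are the signed and absolute deviations from the true value, expressed as a percentage of the true value. The function names are hypothetical. Gradient scores in this chapter are scaled by the length of the gradient instead, which these helpers do not do.

```python
def species_capture_pct(s_obs, s_true):
    """Observed species richness as a percentage of the 'true' richness."""
    return 100.0 * s_obs / s_true

def bias_pct(x_obs, x_true):
    """Signed deviation from the true value, as a percentage of the true value."""
    return 100.0 * (x_obs - x_true) / x_true

def accuracy_pct(x_obs, x_true):
    """Absolute deviation from the true value, as a percentage of the true value."""
    return 100.0 * abs(x_obs - x_true) / x_true

# Hypothetical plot: an observer records 61 of the 100 species an expert found.
print(species_capture_pct(61, 100), bias_pct(61, 100), accuracy_pct(61, 100))
```

For species richness, bias is simply percent capture minus 100, which matches the pairs of values in the table (for example, 61 and -39).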

How do we apply these to community ecology? First, we usually express the parameters of interest in a univariate way. Some univariate measures are species richness, ordination scores, and abundances of dominant species. Second, we need "true" values for comparison. This can be done with computer simulation, but in field studies, our best approximation of the "truth" can be obtained by resampling an area intensely, using multiple observers and a large number of sample units.

For an extended example, we will list some results from the Lichen Community component of the Forest Health Monitoring program (McCune et al. 1997a). Data quality was assessed for each plot-level summary statistic (air quality index, climatic index, and species richness) with several criteria: species capture, bias, and accuracy (Table 3.3). The indices were scores on major gradients, as determined by ordination methods. "Species capture" was the proportion of the "true" number of species (S_true) in a plot that was captured in the sampling. Accuracy was the absolute deviation from "true" gradient scores, as determined from expert data. Bias was the signed deviation from "true" gradient scores, as determined from expert data. An expert was considered to be a person with extensive experience with the local lichen flora, in most cases with two or more peer-reviewed publications in which the person contributed floristic knowledge of lichens. Percent deviation in gradient score is calculated as 100 x (observer's score - expert's score) / length of the gradient.

If S_obs is the observed number of species, x_obs is the observed value of variable x, and x_true is the true value of parameter x, then:

\[ \text{Species capture, \%} = 100 \, \frac{S_{obs}}{S_{true}} \]

\[ \text{Accuracy, \%} = 100 \, \frac{|x_{obs} - x_{true}|}{x_{true}} \]

\[ \text{Bias, \%} = 100 \, \frac{x_{obs} - x_{true}}{x_{true}} \]

To quote from the report (McCune et al. 1997) from which Table 3.3 was extracted:

    Two results of this study seem most important. First, species richness is a very difficult parameter to estimate, being strongly dependent on the skill, experience, and training of the observer. Second, scores on compositional gradients are relatively consistent across observers, even in cases where there is considerable variation in species capture by the different observers. Each of these points is discussed further below.

    With the concept of "biodiversity" becoming deeply entrenched in the management plans of government agencies and conservation organization, there comes a great need to inventory,

may be subsampled (three stage). From a statistical


standpoint, it is desirable to randomize the sampling at
each stage in the design. In forest inventories, clusters
Recurrent issues in commi~nity of plots are often placed randomly but with a
sampling systematic pattern of plots within clusters. According
Tradeoffs between size and number of sample to Husch et al. (1972), "Fixed clusters of tlus type do
units. All studies with fixed-area plots face tradeoffs not permit a valid measure of within-cluster variation ...
between size and number of sample units (Table 3.6). The entire cluster would have to be considered as the
The "manv-but-small" strategy will yield relatively ultimate sampling unit. I'

accurate abundance estimates for the most common Pairing. Paired designs are potentially very
species but will yield a very incomplete species list. powerful ways to isolate single factors in a field set-
The "large-but-few" strategy will yield a relatively complete species list but will tend to overestimate the cover of rarer species and yields imprecise estimates of the more common species (McCune and Lesica 1992).

Kenkel and Podani (1991) recommended that a plot size somewhat larger than the mean patch size will likely provide the most efficient sampling design. They also recommended that, to maximize efficiency, plots be as large as possible, given the constraints on sampling effort.

Pseudoreplication. Experimental units are often subsampled and the subsamples analyzed as independent replicates. This is pseudoreplication because subsample units are not independent. Designs that include subsampling (nested designs) are fine, but it is crucial to identify the level in the design at which treatments are applied and analyze the data accordingly.

Repeated measures. Successive samples from permanent sample units are usually correlated, so analyzing successive dates as if they were independent replicates of a treatment is invalid. Permanent sample units are excellent for detecting temporal change (Lesica and Steele 1997) because they allow you to separate spatial variation from the temporal change. However, seldom can "time" or "date" be included as main effects in a factorial ANOVA. In most cases, if you are following an experiment with permanent plots, you should be aware that you are using a repeated measures design.

Most permanent plot studies are subject to some degree of error from changes in observers and inexact relocation of sample units (Ketchledge & Leonard 1984, McCune & Menges 1986). This was called "pseudo-turnover" of species by Fischer and Stöcklin (1997). Although rarely measured directly, it is important to know the size of this error relative to the size of observed changes in community composition.

Nesting. Ecologists frequently subsample their sample units, thus creating nested designs. Nested designs are also called multistage sampling. Sample units are subsampled (two-staged) and the subsamples

ting. In theory, pairing has the potential to isolate the ever-present site-to-site variation from variation related to the factor of interest. Pairs are selected such that the members are "identical" except that one member receives (or received) a treatment or disturbance and one member does not. The most serious problem with this design in practice is that it is usually very difficult to find two adjacent spots that are identical. This is particularly true if one is studying historical disturbances. Many disturbances, such as fire, tend to leave edges at natural topographic breaks, such that the areas inside and outside the disturbance are likely to differ in some important way.

Topographic variables
Although the topic of environmental measures is not covered here, a couple of issues about topography recur so frequently that a short discussion of them is worthwhile.

Aspect of a slope (the direction or azimuth that a slope faces) is commonly measured in field studies. Untransformed, aspect is a poor variable for quantitative analysis. For example, 1° is adjacent to 360°, and although the numbers suggest a large difference, the aspect is about the same. So aspect needs to be transformed in one of several ways, depending on the precision with which it was measured and the environmental factor(s) you would like it to represent.

Heat load. Heat load is not symmetrical about the north-south axis. A slope with afternoon sun will be warmer than an equivalent slope with morning sun. A reasonable approximation of heat load for slopes in the northern hemisphere is to make the scale symmetrical about the northeast-southwest line. The following formula rescales aspect to a scale of zero to one, with zero being the coolest slope (northeast) and one being the warmest slope (southwest).
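The rescaling described under "Heat load" can be sketched in a few lines. The formula itself did not survive in the text here, so this sketch assumes the commonly used cosine form, (1 - cos(aspect - 45°)) / 2, with aspect in degrees east of north. That form and the function name `heat_load_index` are assumptions for illustration. The form does satisfy the stated properties: zero at northeast, one at southwest, and symmetry about the northeast-southwest line.

```python
import math

def heat_load_index(aspect_deg):
    """Rescale aspect (degrees east of north) to the range 0..1.

    Assumed form: (1 - cos(aspect - 45 deg)) / 2, so that
    northeast (45) gives 0 and southwest (225) gives 1.
    """
    theta = math.radians(aspect_deg - 45.0)
    return (1.0 - math.cos(theta)) / 2.0

# Aspects of 1 and 360 degrees now receive nearly the same value,
# and southeast (135) matches northwest (315).
for aspect in (1, 45, 135, 225, 315, 360):
    print(aspect, round(heat_load_index(aspect), 3))
```

Because the cosine is periodic, the discontinuity between 1° and 360° disappears without any special handling.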
CHAPTER 4

Species Diversity

Background
Species richness is defined as the number of species in a sample unit or other specified area. According to Whittaker (1972), "Diversity in the strict sense is richness in species, and is appropriately measured as the number of species in a sample of standard size."

Although species richness is an intuitive measure of diversity, including inequality in relative abundance as a component of diversity can be intuitive as well. Consider, for example, two plots, each with three species (Table 4.1). Plot 1 has equal amounts of the three species, but Plot 2 has mostly one species and just a bit of the other two. Most people agree that Plot 1 seems more diverse than Plot 2. This intuitive notion of diversity incorporates the evenness (equitability) of abundance. A sample unit with more even abundances is, all else being equal, more diverse than a sample unit with abundant and sparse species.

Table 4.1. Which plot is more diverse?

          species 1   species 2   species 3
plot 1        10          10          10
plot 2        28           1           1

Whittaker (1960, 1965, 1972) defined three levels of diversity.
Alpha diversity: diversity in individual sample units
Beta diversity: amount of compositional variation in a sample (a collection of sample units)
Gamma diversity: overall diversity in a collection of sample units, often "landscape-level" diversity

Each of these can be measured in various ways. There have been numerous reviews of the pros and cons of various diversity measures. Some of the better known and more complete references are Auclair and Goff (1971), Hill (1973a), Hurlbert (1971), Magurran (1988), Peet (1974), Pielou (1966, 1975), Rosenzweig (1995), and Whittaker (1972). A selection of the most popular diversity measures follows.

Alpha diversity

Proportionate diversity measures
Many diversity measures are special cases of a general equation proposed by Hill (1973a) and Renyi (1961). For an observed abundance $x_i$ (numbers, biomass, cover, etc.) of species i in a sample unit, let
$p_i$ = proportion of individuals belonging to species i: $p_i = x_i / \sum_{i=1}^{S} x_i$
a = constant that can be assigned and alters the property of the measure
S = number of species
$D_a$ = diversity measure based on the constant a. The units are "effective number of species"

Figure 4.1. Influence of equitability on Hill's (1973a) generalized diversity index. Diversity is shown as a function of the parameter a for two cases: a sample unit with strong inequitability in abundance and a sample unit with perfect equitability in abundance (all species present have equal abundance; see Table 4.1).
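As an illustration of the general equation attributed to Hill (1973a), the sketch below assumes the standard Hill form, $D_a = (\sum_i p_i^a)^{1/(1-a)}$, with the limit at a = 1 taken as the exponential of the Shannon index. The function name `hill_diversity` is hypothetical, and the data are the two plots of Table 4.1.

```python
import math

def hill_diversity(abundances, a):
    """Effective number of species, D_a, for one sample unit (assumed Hill form)."""
    total = sum(abundances)
    p = [x / total for x in abundances if x > 0]  # proportions p_i
    if abs(a - 1.0) < 1e-12:
        # Limit as a -> 1: exponential of the Shannon index (natural logs)
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** a for pi in p) ** (1.0 / (1.0 - a))

plot1 = [10, 10, 10]   # Table 4.1, plot 1 (perfectly even)
plot2 = [28, 1, 1]     # Table 4.1, plot 2 (strongly uneven)

for a in (0, 1, 2):
    print(a, round(hill_diversity(plot1, a), 3), round(hill_diversity(plot2, a), 3))
```

With a = 0 both plots give the species richness, 3. As a increases, the even plot stays at 3 effective species while the uneven plot falls toward 1, which is the contrast the equitability figure describes.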

Box 4.1. How is information related to uncertainty?

You are blindfolded next to two plots, one with equal numbers of two species and one with many individuals of one species and few of the other species. Your partner calls out species names of individuals as they are encountered. After 100 individuals have been tallied from each plot, your data are:

          species A   species B    p_A     p_B
Plot 1        99           1       0.99    0.01
Plot 2        50          50       0.50    0.50

For the 101st individual, in which plot is your uncertainty greater? In which plot does the next individual provide more information?

information content = $H' = -\sum_{i=1}^{S} p_i \log p_i$

For plot 1,
$H' = -1[0.99 \cdot \log(0.99) + 0.01 \cdot \log(0.01)] = 0.024$
For plot 2,
$H' = -1[0.5 \cdot \log(0.5) + 0.5 \cdot \log(0.5)] = 0.301$

Clearly, the information content of the next individual chosen from plot 2 is much higher than for plot 1, because it resolves more uncertainty. In plot 1 you are fairly certain that the next individual chosen will be species A, but it is more uncertain in plot 2. The more uncertainty is relieved, the more information you have obtained. This much is clear. But Hurlbert (1971) and others question how the concept of information is relevant to biological diversity.
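The two values in Box 4.1 can be checked with a short calculation. This is a minimal sketch, assuming base-10 logarithms (which reproduce the 0.024 and 0.301 quoted in the box) and counts of 99 and 1 for plot 1, as implied by the proportions 0.99 and 0.01. The function name `shannon_information` is hypothetical.

```python
import math

def shannon_information(counts, base=10):
    """H' = -sum(p_i * log(p_i)) over species with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)

print(round(shannon_information([99, 1]), 3))    # plot 1: 0.024
print(round(shannon_information([50, 50]), 3))   # plot 2: 0.301
```

Changing `base` rescales H' by a constant factor, so the ranking of the two plots does not depend on the choice of logarithm.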

Species richness
Species richness is simply calculated as the number of species in a sample unit (SU), whether the sample unit is defined as a specific number of individuals, area, or biomass. If expressed per unit area, it is called species density (Hurlbert 1971). SUs of different sizes cannot, however, be compared directly, because the relation between species richness and SU size is nonlinear (see species-area curves below).

Species richness as a measure of diversity is very attractive to ecologists because it is simple, easily calculated, readily appreciated, and easy to communicate to policy makers and other lay people (Purvis & Hector 2000). For example, consider trying to explain H' = 2.12 versus S = 33. Peters (1991) said we should try harder to present results in a way that does not obfuscate simple underlying units.

On the other hand, species richness is very sensitive to the sample unit area and the skill of the observer. Measurement error is high for small, cryptic, mobile, or taxonomically difficult organisms (e.g., Coddington et al. 1996; McCune et al. 1997a).

Whittaker's bottom line (1972) was: "For the measurement of alpha diversity relations I suggest, first, use of a direct diversity expression, [species richness], as a basic measurement wherever possible; second, accompaniment of this by a suitable slope expression [a measure incorporating equitability] when the data permit."

Beta diversity
Beta diversity is the amount of compositional change represented in a sample (a set of sample units). There are various ways of measuring beta diversity, depending on our concepts or measurements of

underlying sources of the compositional variation (Table 4.2). The phrase "beta diversity" need not invoke specific gradients. Species turnover, on the other hand, is a special case of beta diversity applied to changes in species composition along explicit environmental gradients (Vellend 2001). The term "beta diversity" has also recently been applied in a different way (e.g., Condit et al. 2002), as a rate of decay in species similarity with increasing distance, without respect to explicit environmental gradients.

Three applications of beta diversity in the usual sense are:
1. Direct gradient - beta diversity is the amount of change in species composition along a directly measured gradient in environment or time,
2. Indirect gradients - beta diversity is the length of a presumed environmental or temporal gradient as measured by the species, and
3. No specific gradient - beta diversity measures compositional heterogeneity without reference to a specific gradient.

Measures of beta diversity depend on the underlying gradient model and the data type (Table 4.2). After a brief summary of the usefulness of measuring beta diversity, we describe measures of beta diversity for each class of underlying gradient model.

Usefulness of beta diversity
Greig-Smith (1983) pointed out that beta diversity is a property of the sample, not an inherent property of the community. This is well illustrated by the results of Økland et al. (1990): "Beta diversity, measured as the length of the first DCA axis, invariably increased upon lowering of sample plot size. ... This is explained as a consequence of the weakening of structure in the data matrices when the fine-grained patterns of the vegetation are emphasized." Thus, beta diversity is controlled by a combination of biological and sampling processes.

The performance of multivariate methods strongly depends on beta diversity. Estimates of beta diversity help to inform us about which ordination methods might be appropriate for a particular data set and whether differences between ordination methods should be anticipated. The greater the beta diversity, the more ordination methods are challenged and the more results will differ among methods.

Beta diversity can be used to compare responsiveness of different groups of organisms to environmental differences in a sample (e.g., McCune & Antos 1981). Such comparisons are not, however, biologically meaningful between studies using different methods.

Beta diversity along a direct gradient
Beta diversity integrates the rate of change. Do not confuse the rate of species change with the amount of change. The rate of change, R, refers to steepness of species response curves along gradients (not to be confused with Minchin's R, which is a measure of the amount of change). For example, if you sample with a series of plots along a gradient and calculate the dissimilarity among adjacent plots, then you can graph dissimilarity as a measure of R against position on the gradient (Fig. 4.2). This concept of "rate of species change" has historically been used to address theoretical questions about sharpness of boundaries between communities.

Table 4.2. Some measures of beta diversity. See Wilson and Mohler (1983) and Wilson and Shmida (1984) for other published methods. "DCA" is detrended correspondence analysis. A direct gradient refers to sample units taken along an explicitly measured environmental or temporal gradient. Indirect gradients are gradients in species composition along presumed environmental gradients.

Underlying              Data type
gradient model          Quantitative                     Presence-absence
Direct gradient         HC (Whittaker's half changes)    $\beta_T$ (beta turnover)
                        $\beta_G$ (gleasons)             Minchin's R
                        Minchin's R
                        $\Delta$ (total gradient length)
Indirect gradient       Axis length in DCA               Axis length in DCA
No specific gradient    $\beta_D$ (half changes)         $\beta_w$ (Whittaker's beta, $\gamma/\alpha - 1$)
                        Dissimilarity
Figure 4.2. Example of rate of change, R, measured as proportional dissimilarity in species composition at different sampling positions along an environmental gradient. Peaks represent relatively abrupt change in species composition. This data set is a series of vegetation plots over a low mountain range. In more homogeneous vegetation, the curve and peaks would be lower. (Horizontal axis: position on gradient.)

Figure 4.3. Hypothetical decline in similarity in species composition as a function of separation of sample units along an environmental gradient, measured in half changes. Sample units one half change apart have a similarity of 50%. (Horizontal axis: separation along gradient, half changes.)
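The idea behind Figure 4.2, dissimilarity between adjacent plots used as a measure of the rate of change R, can be sketched as follows. The abundance values are hypothetical, and proportional dissimilarity is assumed to take the Sørensen (Bray-Curtis) form, 1 - 2 Σ min(x, y) / (Σ x + Σ y). The function name is also an assumption.

```python
def proportional_dissimilarity(u, v):
    """1 - 2*sum(min)/(sum(u)+sum(v)), the Sorensen / Bray-Curtis form."""
    shared = sum(min(a, b) for a, b in zip(u, v))
    return 1.0 - 2.0 * shared / (sum(u) + sum(v))

# Hypothetical abundances: rows are plots in gradient order, columns are species
plots = [
    [10, 2, 0, 0],
    [9, 3, 0, 0],
    [8, 4, 1, 0],
    [1, 2, 9, 3],   # abrupt compositional change between plots 3 and 4
    [0, 1, 10, 4],
]

# Dissimilarity of each plot to its neighbor, in gradient order
rate = [proportional_dissimilarity(plots[i], plots[i + 1])
        for i in range(len(plots) - 1)]

for i, r in enumerate(rate, start=1):
    print(f"plots {i}-{i + 1}: {r:.2f}")
```

Graphing `rate` against position would show a single peak between plots 3 and 4, the kind of abrupt change the figure caption refers to.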

We seldom try to estimate R along gradients because we rarely have the appropriate quality of information. Usually the primary interest is in the total amount of change along the gradient, in other words, the integral of R. Measures of the total amount of change, beta diversity, are discussed below. One of them, Wilson and Mohler's (1983), is based on estimates of R using similarity measures. Other methods for estimating R proposed by Oksanen and Tonteri (1995) assume Gaussian (bell-shaped) response functions, summing the absolute values of the slopes of individual species' response functions.

The amount of change, $\beta$, is the integral of the rate of change:

$\beta = \int_a^b R(x)\,dx$

where a and b refer to the ends of an ecological gradient x.

Beta diversity can be calculated in various ways, depending on the available data. First consider species abundance data collected in sequential sample units along one directly measured gradient. Wilson and Mohler (1983) introduced "gleasons" as a unit of species change. This measures the steepness of species response curves. It is the sum of the slopes of individual species at each point along the gradient.

$R(x) = \sum_{i=1}^{S} \left| \frac{dY_i(x)}{dx} \right|$

where $Y_i$ is the abundance of species i at position x along the gradient. This can be integrated into an estimate of beta diversity along a whole gradient with

$\beta_G = 2 \sum_{i=1}^{n-1} [IA - PS(i, i+1)]$

where PS(a,b) is the percentage similarity of sample units a and b and IA is the expected similarity of replicate samples (the similarity intercept on Fig. 4.3).

For simulated data, Minchin (1987) defined a measure of beta diversity as "R units." To reduce confusion with Wilson's use of R as a rate of change, we refer to Minchin's measure of the amount of change as "Minchin's R." Minchin measured beta diversity using the mean range of the species' physiological response function:

$\text{Minchin's } R = \frac{L}{\left(\sum_{i=1}^{S} r_i\right) / S}$

where $r_i$ is the range of species i along the gradient, L is the length of the gradient, and r and L are measured
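The two direct-gradient measures just described can be illustrated with a small calculation. This is a sketch under stated assumptions: similarity is expressed as a proportion (0 to 1) in the Sørensen form, IA is supplied by the user (0.95 here is purely hypothetical), and Minchin's R is taken as gradient length divided by the mean species range. The function names and all data are invented for illustration.

```python
def percentage_similarity(u, v):
    """Similarity as a proportion (0-1): 2*sum(min)/(sum(u)+sum(v))."""
    return 2.0 * sum(min(a, b) for a, b in zip(u, v)) / (sum(u) + sum(v))

def beta_gleasons(plots, ia):
    """beta_G = 2 * sum over adjacent pairs of [IA - PS(i, i+1)]."""
    return 2.0 * sum(ia - percentage_similarity(plots[i], plots[i + 1])
                     for i in range(len(plots) - 1))

def minchin_r(ranges, gradient_length):
    """Gradient length divided by the mean species range (same units)."""
    return gradient_length / (sum(ranges) / len(ranges))

# Hypothetical plots in gradient order (rows) by species (columns)
plots = [
    [10, 2, 0, 0],
    [9, 3, 0, 0],
    [8, 4, 1, 0],
    [1, 2, 9, 3],
    [0, 1, 10, 4],
]

print(round(beta_gleasons(plots, ia=0.95), 3))
print(minchin_r([20, 40, 60], gradient_length=100))
```

In `minchin_r` the species ranges and the gradient length must share the same units, so the result is a count of mean species ranges spanned by the gradient.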
CHAPTER 5

Species on Environmental Gradients


The ideal and the real
Robert H. Whittaker's writings brought to the fore the concept of hump-shaped species responses to environmental gradients, through his ecological monographs (Whittaker 1956, 1960), review papers (1967, 1973), and his textbooks. Whittaker drew species responses as smooth, hump-shaped lines, some narrow, some broad, and varying in amplitude (Fig. 5.1). These smooth, noiseless curves represent the Gaussian ideal response of species to environmental gradients. Under the Gaussian ideal, a species response is completely described by its mean position on the environmental gradient, its standard deviation along that gradient, and its peak abundance. Even if species followed the Gaussian ideal, community analysis would be difficult because two species following the Gaussian ideal will have a nonlinear relationship to each other, challenging our usual statistics based on linear models.

Figure 5.1. Hypothetical species abundance in response to an environmental gradient. Lettered curves represent different species. Figure adapted from Whittaker (1954). (Horizontal axis: environmental gradient.)

An even more idealistic model would be linear responses to environment (Fig. 5.2). The linear ideal has species rising and falling in straight lines in response to environmental gradients. Although the linear model is blatantly inappropriate for all but very short gradients, many of the most popular multivariate tools (e.g., principal components and discriminant analysis) assume linear responses.

Figure 5.2. Hypothetical linear responses of species abundance to an environmental gradient. Lettered lines represent different species. (Horizontal axis: environmental gradient.)

Even the Gaussian model has several critical shortcomings when compared with actual community data. Three major problems are common in community data:

1. Species responses have the zero truncation problem.
2. Curves are "solid" due to the action of many other factors.
3. Response curves can be complex: polymodal, asymmetric, or discontinuous.

Each of these is explained below.

The zero truncation problem
Beals (1984, p. 6) introduced the term "zero truncation problem." Beyond the extremes of a species tolerance on an environmental gradient only zeros are possible (Fig. 5.3). Therefore, once a species is absent, we have no information on how unfavorable the environment is for that species. The dashed lines in Figure 5.3 indicate the ideal response curve, if the zero truncation problem did not exist.
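The claim that two species following the Gaussian ideal have a nonlinear relationship to each other can be seen in a small numerical sketch. The optima, standard deviations, peak abundances, and gradient positions below are hypothetical, and `gaussian_response` is an illustrative name rather than anything defined in the text.

```python
import math

def gaussian_response(x, optimum, sd, peak):
    """Abundance at gradient position x under the Gaussian ideal."""
    return peak * math.exp(-((x - optimum) ** 2) / (2.0 * sd ** 2))

positions = list(range(0, 101, 10))   # hypothetical gradient, 0 to 100

# Two species with different optima but the same width and peak abundance
sp_a = [gaussian_response(x, 30, 12, 50) for x in positions]
sp_b = [gaussian_response(x, 70, 12, 50) for x in positions]

for x, a, b in zip(positions, sp_a, sp_b):
    print(f"{x:3d}  A={a:6.2f}  B={b:6.2f}")
```

From position 0 to 70, species B rises steadily while species A rises and then falls, so no straight line relates the two abundances. Rounding the small tail values to zero, as field counts effectively do, produces the runs of zeros that the zero truncation problem describes.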

Figure 5.8. The dust bunny distribution in ecological community data, with three levels of abstraction.
Background: a dust bunny is the accumulation of fluff, lint, and dirt particles in the corner of a room.
Middle: sample units in a 3-D species space, the three species forming a series of unimodal distributions
along a single environmental gradient. Each axis represents abundance of one of the three species: each
ball represents a sample unit. The vertical axis and the axis coming forward represent the two species
peaking on the extremes of the gradient. The species peaking in the middle of the gradient is represented by
the horizontal axis. Foreground: The environmental gradient forms a strongly nonlinear shape in species
space. The species represented by the vertical axis dominates one end of the environmental gradient, the
species shown by the horizontal axis dominates the middle, and the species represented by the axis coming
forward dominates the other end of the environmental gradient. Successful representation of the
environmental gradient requires a technique that can recover the underlying 1-D gradient from its contorted
path through species space.

Figure 5.10. Plotting abundance of one species against another reveals the bivariate dust bunny distribution. Note the dense array of points near the origin and along the two axes. This bivariate distribution is typical of community data. Note the extreme departure from bivariate normality. (Horizontal axis: Species 2.)

Figure 5.12. The consequence for the correlation coefficient of adding (0,0) values between species (Horizontal axis: number of added 0,0's.)

As the heterogeneity of our sample increases, our distance measures lose sensitivity. This is discussed further under "Distance Measures" (Chapter 6).
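The effect named in the caption of Figure 5.12, what happens to a correlation coefficient when shared (0,0) values are added, can be reproduced with a small sketch. The abundances below are hypothetical and are not the data behind the figure. `pearson_r` is a plain implementation of the Pearson correlation coefficient.

```python
import math

def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Hypothetical abundances of two species that rarely co-occur
sp1 = [5, 0, 3, 0, 1]
sp2 = [0, 4, 0, 2, 1]

# Append k sample units in which both species are absent
results = []
for k in (0, 2, 4, 6):
    r = pearson_r(sp1 + [0] * k, sp2 + [0] * k)
    results.append(r)
    print(f"added (0,0) pairs: {k}   r = {r:.3f}")
```

In this example each added pair of joint absences moves r in the positive direction, even though nothing about the relationship between the two species where they occur has changed.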
Summary
Box 5.1 summarizes the basic properties of ecological community data. These properties influence the choice of data transformations, analytical methods, and interpretation of results.

Partial solutions to the zero-truncation problem can be found at every level in the analysis (data adjustments and transformations, distance measures, methods of data reduction, and hypothesis testing; Table 5.1), yet most of these solutions are not available in the major statistical packages. Multivariate data sets in most other fields do not usually have this problem; hence their lack of consideration by the major statistical packages. This is changing, however. Recent versions of SPSS and SAS include some tools that are useful with this kind of matrix.

Figure 5.11. Nature abhors a vacuum. A sample unit with all species removed is usually soon colonized. The vector shows a trajectory through species space. The sample unit moves away from the origin (an empty sample unit) as it is colonized. In this case, species B and a bit of species C colonized the sample unit. As in this example, successional trajectories tend to follow the corners of species space.
