GLMM in Agriculture and Biology
Generalized Linear Mixed Models with Applications in Agriculture and Biology
Josafhat Salinas Ruíz • Osval Antonio Montesinos López • Gabriela Hernández Ramírez • Jose Crossa Hiriart
The translation was done with the help of artificial intelligence (machine translation by the service
DeepL.com). A subsequent human revision was done primarily in terms of content.
© The Editor(s) (if applicable) and The Author(s) 2023. This book is an open access publication.
Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate
credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate
if changes were made.
The images or other third party material in this book are included in the book's Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative
Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
Acknowledgments
The work presented in this book describes models and methods for improving the statistical analyses of continuous, ordinal, and count data usually collected in agriculture and biology. The authors are grateful to the past and present CIMMYT Directors General, Deputy Directors of Research, Deputy Directors of Administration, Directors of Research Programs, and other Administration offices and Laboratories of CIMMYT for their continuous and firm support of biometrical genetics and statistics research, training, and service in support of CIMMYT’s mission: “maize and wheat science for improved livelihoods.”
This work was made possible with support from the CGIAR Research Programs
on Wheat and Maize (wheat.org, maize.org), and many funders including Australia,
United Kingdom (DFID), USA (USAID), South Africa, China, Mexico
(SAGARPA), Canada, India, Korea, Norway, Switzerland, France, Japan, New
Zealand, Sweden, and the World Bank. We are grateful for the financial support of the Mexican Government through MASAGRO and several other regional projects, and for the close collaboration with numerous Mexican researchers.
We acknowledge the financial support provided by the (1) Bill and Melinda Gates
Foundation (INV-003439 BMGF/FCDO Accelerating Genetic Gains in Maize and
Wheat for Improved Livelihoods [AG2MW]) as well as (2) USAID projects
(Amend. No. 9 MTO 069033, USAID-CIMMYT Wheat/AGGMW, AGG-Maize
Supplementary Project, AGG [Stress Tolerant Maize for Africa]).
Very special recognition is given to the Bill and Melinda Gates Foundation for covering the Open Access fee of this book.
We are also thankful for the financial support provided by the (1) Foundations for Research Levy on Agricultural Products (FFL) and the Agricultural Agreement Research Fund (JA) in Norway through NFR grant 267806, (2) Sveriges Lantbruksuniversitet (Swedish University of Agricultural Sciences) Department of Plant Breeding, Sundsvägen 10, 23053 Alnarp, Sweden, (3) CIMMYT CRP, (4) the Consejo Nacional de Ciencia y Tecnología (CONACYT) of México, and (5) Universidad de Colima of Mexico.
The original version of this book has been revised. The Acknowledgment section which was inadvertently omitted after the Foreword has now been included.
We highly appreciate and thank the students and professors at the Universidad de Colima and at the Colegio de Postgraduados (COLPOS) who tested early versions of the material covered in the book; their many useful suggestions had a direct impact on the current organization and the technical content of the book.
Chapter 1
Elements of Generalized Linear Mixed Models
Linear models are commonly used to describe and analyze datasets from different research areas, such as biology, agriculture, and the social sciences. A linear model aims to best describe the nature of a dataset. A model is usually made up of a factor or a series of factors, which can be nominal or discrete variables (sex, year, etc.) or continuous variables (age, height, etc.) and which have an effect on the observed data. Linear models are the most commonly used statistical models for estimating and predicting a response based on a set of observations.
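Linear models get their name because they are linear in the model parameters. The general form of a linear model is given by

$$ y = X\beta + \varepsilon \tag{1.1} $$

where y is the vector of observations, X is the design matrix, β is the vector of unknown parameters, and ε is the vector of random errors. In regression models, the design matrix X is of full column rank, whereas in analysis of variance models, the design matrix X is not of full column rank. Some linear models are briefly described in the following sections.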
Linear models are often used to model the relationship between a variable, known as
the response or dependent variable, y, and one or more predictors, known as
independent or explanatory variables, X1, X2, ⋯, Xp.
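$$ y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i $$

In matrix form:

$$ y_{n \times 1} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X_{n \times 2} = \begin{pmatrix} 1 & X_{11} \\ 1 & X_{12} \\ \vdots & \vdots \\ 1 & X_{1n} \end{pmatrix}, \quad \beta_{2 \times 1} = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad \varepsilon_{n \times 1} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix} $$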
Example Let us consider the relationship between the performance test scores and tissue concentration of lysergic acid diethylamide, commonly known as LSD (from German Lysergsäure-diethylamid), in a group of volunteers who received the drug (Wagner et al. 1968). The average scores on the mathematical test and the LSD tissue concentrations are shown in Table 1.1.

Table 1.2 Results of the simple regression analysis
(a) Type III tests of fixed effects
Effect      Num DF   Den DF   F-value   Pr > F
Conc        1        5        35.93     0.0019
(b) Parameter estimates
Effect      Estimate   Standard error   DF   t-value   Pr > |t|
Intercept   89.1239    7.0475           5    12.65     <0.0001
Conc        -9.0095    1.5031           5    -5.99     0.0019
Scale (σ²)  50.7763    32.1137          .    .         .
The components of this regression model are as follows:
Distribution: $y_i \sim N(\eta_i, \sigma^2)$
Linear predictor: $\eta_i = \beta_0 + \beta_1 \times Conc_i$
Link function: $\mu_i = \eta_i$ (identity)
The syntax for performing a simple linear regression using the GLIMMIX
procedure in Statistical Analysis Software (SAS) is as follows:
proc glimmix;
   /* y: response; X1: predictor; "solution" prints the parameter estimates */
   model y = X1 / solution;
run;
Part of the results is shown in Table 1.2. The analysis of variance (item a) indicates that drug concentration has a significant effect on average mathematical performance (P = 0.0019). The estimates of the regression model parameters β0 and β1 and the mean squared error (MSE, labeled “Scale”) are shown in Table 1.2(b) under “Parameter estimates.”
With these results, the linear predictor ($\hat{\eta}_i$) that predicts the average mathematical performance as a function of LSD concentration is as follows:

$$ \hat{\eta}_i = 89.1239 - 9.0095 \times Conc_i $$

[Figure: scatter plot of average score versus concentration (LSD), with the fitted line y = 89.124 - 9.0095*Conc, R² = 0.8778]
Fig. 1.1 Relationship between applied drug concentration and the mathematical score of the youth. Fitted model of the relationship between the average score and LSD concentration.
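As a quick consistency check on Table 1.2 and Fig. 1.1, the following Python sketch (Python is used here only for illustration; the analyses in the text are run in SAS) recomputes the t statistic, the F statistic, and R² from the reported slope and its standard error. It relies on two standard identities for simple linear regression: with one numerator degree of freedom, F = t², and R² = F / (F + Den DF). The variable names are illustrative.

```python
# Values reported in Table 1.2 for the effect of Conc
slope = -9.0095
std_error = 1.5031
den_df = 5  # denominator degrees of freedom from Table 1.2(a)

t_value = slope / std_error                # t statistic for the slope
f_value = t_value ** 2                     # F = t^2 when Num DF = 1
r_squared = f_value / (f_value + den_df)   # R^2 recovered from F and Den DF

print(f"t = {t_value:.2f}")
print(f"F = {f_value:.2f}")
print(f"R^2 = {r_squared:.4f}")
```

The three printed values can be compared with the t-value and F-value in Table 1.2 and with the R² shown in Fig. 1.1.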
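$$ y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \varepsilon_i $$

In matrix form:

$$ y_{n \times 1} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X_{n \times (p+1)} = \begin{pmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1p} \\ 1 & X_{21} & X_{22} & \cdots & X_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{np} \end{pmatrix}, \quad \beta_{(p+1) \times 1} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix}, \quad \varepsilon_{n \times 1} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix} $$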
1.2 Regression Models
Table 1.3 Body weight (kilograms) and its relationship with heart girth (circumference, centimeters) and length (centimeters) of seven young bulls
Bull                          1     2     3     4     5     6     7
Weight (kilograms)            480   450   480   500   520   510   500
Circumference (centimeters)   175   177   178   175   186   183   185
Length (centimeters)          128   122   124   131   131   130   124
Distribution: $y_i \sim N(\eta_i, \sigma^2)$
Linear predictor: $\eta_i = \beta_0 + X_{i1}\beta_1 + X_{i2}\beta_2$
Link function: $\mu_i = \eta_i$ (identity)
The syntax for performing a multiple regression using the GLIMMIX procedure
in SAS, assuming that there is no interaction between bull heart girth (X1) and length
(X2), is shown below:
proc glimmix;
   /* X1: heart girth; X2: length; "cl" adds confidence limits to the estimates */
   model y = X1 X2 / solution cl;
run;
Based on the regression model specifications, the option “solution cl” prompts GLIMMIX to provide the estimated parameters and their respective confidence intervals. Other useful options are “htype = 1, 2, and 3,” which request the type I, II, and III sums of squares. The type III tests of fixed effects in (a) of Table 1.4 indicate that there is a linear relationship between heart length (size) and weight in young bulls. The estimated parameters ($\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$) with their respective confidence intervals, as well as the MSE (scale) of the fitted regression model, are listed in (b).
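To make the estimation step concrete, here is a minimal Python sketch (not the GLIMMIX implementation) that fits the same two-predictor model to the Table 1.3 data by solving the normal equations (X'X)β = X'y with ordinary least squares. The helper `solve` and all variable names are introduced only for this illustration.

```python
# Data from Table 1.3 (seven young bulls)
weight = [480, 450, 480, 500, 520, 510, 500]   # y, kilograms
girth = [175, 177, 178, 175, 186, 183, 185]    # X1, centimeters
length = [128, 122, 124, 131, 131, 130, 124]   # X2, centimeters


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Design matrix with an intercept column: rows are (1, X_i1, X_i2)
X = [[1.0, g, ln] for g, ln in zip(girth, length)]
p = len(X[0])

# Normal equations: (X'X) beta = X'y
xtx = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
xty = [sum(row[j] * y for row, y in zip(X, weight)) for j in range(p)]
beta = solve(xtx, xty)

fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
residuals = [y - f for y, f in zip(weight, fitted)]
mse = sum(r * r for r in residuals) / (len(weight) - p)   # residual DF = n - 3

print("beta0, beta1, beta2 =", [round(b, 4) for b in beta])
print("MSE =", round(mse, 4))
```

A useful sanity check on any least squares fit is that the residuals are orthogonal to every column of X, which is exactly what the normal equations state.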
Note that in a linear model, the parameters enter linearly, but the variables do not necessarily have to be linear. For example, consider the following two models:

$$ y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 \log(X_{i2}) + \cdots + \beta_k X_{ik} + \varepsilon_i $$

$$ y_i = \beta_0 + X_{i1}^{\beta_1} + \beta_2 X_{i2} + \cdots + \beta_k \exp(X_{ik}) + \varepsilon_i $$

The first example is a linear model because the derivatives of its predictor with respect to the parameters do not depend on the beta coefficients. The second one is not: its derivatives are also free of the beta coefficients, with the exception of the term $X_{i1}^{\beta_1}$, whose derivative with respect to β1 is equal to $X_{i1}^{\beta_1}\log(X_{i1})$. This clearly shows that the second example is a nonlinear model, because the derivative of the predictor depends on β1.
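One way to make this criterion concrete is through second derivatives: a predictor is linear in the parameters when every second partial derivative with respect to the parameters is zero. Writing η_i for the right-hand side of each model without the error term (a notational assumption made for this sketch), and assuming X_{i1} > 0:

```latex
% First model: every first derivative is free of the parameters
\frac{\partial \eta_i}{\partial \beta_2} = \log(X_{i2}),
\qquad
\frac{\partial^2 \eta_i}{\partial \beta_2^2} = 0

% Second model: the derivative with respect to \beta_1 still contains \beta_1
\frac{\partial \eta_i}{\partial \beta_1} = X_{i1}^{\beta_1} \log(X_{i1}),
\qquad
\frac{\partial^2 \eta_i}{\partial \beta_1^2} = X_{i1}^{\beta_1} \left[\log(X_{i1})\right]^2 \neq 0
\quad \text{whenever } X_{i1} \neq 1
```

Note that the term β_k exp(X_ik) in the second model is nonlinear in the variable but still linear in β_k; only the exponent β1 makes the model nonlinear.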
Consider an experiment to compare t treatments (t > 2), in which $n_i$ experimental units are selected and randomly assigned to the ith treatment. The model describing this experiment is as follows:

$$ y_{ij} = \mu + \tau_i + \varepsilon_{ij} $$

for i = 1, 2, ⋯, t and j = 1, 2, ⋯, $n_i$. Here, $\varepsilon_{ij}$ are uncorrelated random errors that are normally distributed with zero mean and constant variance σ² ($\varepsilon_{ij} \sim N(0, \sigma^2)$). If the treatment effects are considered as fixed constants (drawn from a finite number), then this model is a special case of the general linear model (1.1), with the total number of experimental units $n = \sum_{i=1}^{t} n_i$.
1.3 Analysis of Variance Models
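In matrix terms, the information under this experimental design is equal to:

$$ y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_t \end{pmatrix}, \quad X_{n \times (t+1)} = \begin{pmatrix} 1_{n_1} & 1_{n_1} & 0_{n_1} & \cdots & 0_{n_1} \\ 1_{n_2} & 0_{n_2} & 1_{n_2} & \cdots & 0_{n_2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1_{n_t} & 0_{n_t} & 0_{n_t} & \cdots & 1_{n_t} \end{pmatrix}, \quad \beta = \begin{pmatrix} \mu \\ \tau_1 \\ \vdots \\ \tau_t \end{pmatrix} $$

where $y_i$ is the vector of the $n_i$ observations of the ith treatment, $1_{n_i}$ is the vector of ones of order $n_i$, and $0_{n_i}$ is the vector of zeros of order $n_i$. Note that the matrix $X_{n \times (t+1)}$ is not of full column rank because its first column can be obtained as a linear combination (the sum) of its remaining columns.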
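The rank deficiency can be checked numerically. The following Python sketch builds the one-way design matrix for t = 3 treatments with n_i = 3 units each (sizes chosen for illustration) and computes its rank by Gaussian elimination; `design_matrix` and `rank` are helper names introduced here.

```python
def design_matrix(group_sizes):
    """One-way ANOVA design matrix: intercept column plus one indicator per treatment."""
    t = len(group_sizes)
    rows = []
    for i, n_i in enumerate(group_sizes):
        for _ in range(n_i):
            rows.append([1] + [1 if k == i else 0 for k in range(t)])
    return rows


def rank(matrix, tol=1e-12):
    """Rank via Gaussian elimination to row echelon form."""
    m = [[float(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, n_cols):
                m[i][j] -= factor * m[r][j]
        r += 1
        if r == n_rows:
            break
    return r


X = design_matrix([3, 3, 3])
first_col_is_sum = all(row[0] == sum(row[1:]) for row in X)

print("columns:", len(X[0]), "rank:", rank(X))
print("first column equals the sum of the indicator columns:", first_col_is_sum)
```

The matrix has t + 1 = 4 columns but rank t = 3, which is why software such as GLIMMIX has to impose a constraint (or use a generalized inverse) to obtain a solution.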
Example Assume that measurements of the biomass produced by three different
types of bacteria are collected in three separate Petri dishes (replicates) in a glucose
broth culture medium for each bacterium (Table 1.5).
The sources of variation and degrees of freedom (DFs) for this experiment are
shown in Table 1.6.
The components for this one-way model, assuming that each response variable $y_{ij}$ is normally distributed, are as follows:
Distribution: $y_{ij} \sim N(\eta_i, \sigma^2)$
Linear predictor: $\eta_i = \alpha + \tau_i$
Link function: $\mu_i = \eta_i$ (identity)
where $y_{ij}$ is the response observed at the jth repetition in the ith bacterium, $\eta_i$ is the linear predictor, α is the intercept (the grand mean), and $\tau_i$ is the fixed effect due to the type of bacterium.
Similar to “proc glm” or “proc mixed,” the “class” command declares the class variables (categorical or nominal) to be included in the model, in this case the class variable “bacteria.” The “model” command declares the response variable “y” and all the class or continuous variables that enter the model, whereas the “lsmeans” command asks GLIMMIX to estimate the treatment means and the “lines” option requests a comparison of means. Part of the results is presented below.
By default, “proc GLIMMIX” provides the fit statistics (information criteria), which are extremely useful for comparing models and choosing the one that explains the largest possible proportion of the variation present in a dataset, i.e., the best-fit model (part (a) of Table 1.7). The statistic “-2 res log likelihood” is most useful when comparing nested models, and the rest of the statistics are useful for comparing models that are not necessarily nested. The mean squared error (MSE) in GLIMMIX is given as the statistic “Pearson's chi-square/DF.” In this analysis, this value is 8.78, that is, $\hat{\sigma}^2 = MSE = 8.78$. In part (b), the analysis of variance indicates that at least one type of bacterium produces a different biomass (P < 0.0001). That is, the null hypothesis $H_0: \tau_A = \tau_B = \tau_C$ is rejected at a significance level of 5%.
The estimated least squares (LS) means obtained with “lsmeans” are tabulated under the “Estimate” column with their standard errors in the “Standard error” column of Table 1.8. By default, these estimated means are compared with Fisher’s LSD (least significant difference).
Table 1.8 Means and estimated standard errors of the one-way model
Least squares means of bacteria
Bacteria   Estimate   Standard error   DF   t-value   Pr > |t|
A          12.0000    1.7105                7.02      0.0004
B          20.6667    1.7105                12.08     <0.0001
C          39.0000    1.7105                22.80     <0.0001

Table 1.9 Comparison of the means (LSD) in the one-way model
T grouping of the least squares means of bacteria (α = 0.05)
LS means with the same letter are not significantly different
Bacteria   Estimate   Group
C          39.0000    A
B          20.6667    B
A          12.0000    C
Finally, Table 1.9 presents the comparison of means obtained with “lines” and indicates that bacteria type C has a better fermentative conversion of glucose to lactic acid than bacteria types B and A. Means that share a letter are statistically equal.
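The letter grouping in Table 1.9 can be reproduced by hand from the reported MSE and means. The Python sketch below applies the usual formula for Fisher's LSD with equal replication, LSD = t × sqrt(2·MSE/n). The critical value 2.447 (two-sided 5%, 6 error degrees of freedom) is taken from a standard t table and is an assumption of this sketch, not a number given in the text.

```python
from math import sqrt

# Values reported in the text and in Table 1.8
mse = 8.78            # Pearson's chi-square / DF
n_per_group = 3       # Petri dishes (replicates) per bacterium
means = {"A": 12.0000, "B": 20.6667, "C": 39.0000}

# Assumption: t critical value for alpha = 0.05 (two-sided) with 9 - 3 = 6 error DF
t_crit = 2.447

se_mean = sqrt(mse / n_per_group)            # standard error of one LS mean
lsd = t_crit * sqrt(2 * mse / n_per_group)   # least significant difference

# Compare adjacent means after sorting from largest to smallest
ordered = sorted(means.items(), key=lambda kv: kv[1], reverse=True)
print(f"SE of a mean = {se_mean:.2f}, LSD = {lsd:.2f}")
for (name_hi, m_hi), (name_lo, m_lo) in zip(ordered, ordered[1:]):
    diff = m_hi - m_lo
    verdict = "different" if diff > lsd else "not different"
    print(f"{name_hi} vs {name_lo}: difference = {diff:.2f} -> {verdict}")
```

Both adjacent differences exceed the LSD, so each bacterium receives its own letter, in agreement with Table 1.9. The standard error differs slightly from Table 1.8 only because the MSE is rounded to two decimals here.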
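Let us consider an experiment with two factors, A and B, in which each level of B is nested within a level of factor A, that is, each level of factor B appears within only one level of factor A. Then, the model that describes this experiment is as follows:

$$ y_{ijk} = \mu + \alpha_i + \beta_{ij} + \varepsilon_{ijk} $$

where $\beta_{ij}$ is the effect of level j of factor B nested within level i of factor A. For three levels of A, two levels of B within each level of A, and two replicates, the matrices are:

$$ y = \begin{pmatrix} y_{111} \\ y_{112} \\ y_{121} \\ y_{122} \\ y_{211} \\ y_{212} \\ y_{221} \\ y_{222} \\ y_{311} \\ y_{312} \\ y_{321} \\ y_{322} \end{pmatrix}, \quad X = \begin{pmatrix} 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, $$

$$ \beta = \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_{11} \\ \beta_{12} \\ \beta_{21} \\ \beta_{22} \\ \beta_{31} \\ \beta_{32} \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{221} \\ \varepsilon_{222} \\ \varepsilon_{311} \\ \varepsilon_{312} \\ \varepsilon_{321} \\ \varepsilon_{322} \end{pmatrix}. $$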
where $y_{ij}$ is the level of assimilation of the fluorescent protein obtained from rat j by technician i, α is the intercept, $\tau_i$ is the fixed effect due to the technician, and $\beta(\tau)_{j(i)}$ is the nested effect of rat j within technician i.
The SAS commands for the main effects of factor A and factor B nested within A
are as follows:
Part of the results is shown in Table 1.12. The results indicate that there is minimal variability among the technicians, since the value of the mean squared error (Pearson's chi-square/DF) is 0.04 (part (a)). This means that the variance between group means is smaller than would be expected. The analysis of variance in part (b) indicates that there is no difference between technicians in the measurement of fluorescent proteins in the rats (P = 0.3065). Since there is variation between rats in the average protein uptake, mean differences in protein uptake are to be expected between rats within technicians (P = 0.0067).
In Table 1.13 part (a), the values of the least squares means tabulated under the
“Estimate” column are shown with their respective “Standard errors.” It can be seen
that rats under technician A have statistically the same mean protein uptake as do rats
under technician B (part (b)).
Comparison of means for rat subgroups under both technicians showed similar
means for rats under technician A but different means for rats under technician B
(part (a) and (b), Table 1.14).
Table 1.14 Comparison of the means (LSD) of the subgroups nested within technicians
(a) Least squares means of rats (technician)
Technician   Rat   Estimate   Standard error   DF   t-value   Pr > |t|
A            5     1.2187     0.06003               20.30     <0.0001
A                  1.2302     0.06003               20.49     <0.0001
A                  1.1841     0.06003               19.72     <0.0001
B            1     1.3540     0.06003               22.56     <0.0001
B                  1.0670     0.06003               17.77     <0.0001
B                  1.0602     0.06003               17.66     <0.0001
(b) T grouping of the least squares means (α = 0.05) of rats (technician)
LS means with the same letter are not significantly different
Technician   Rat   Estimate   Group
B            1     1.3540     A
A                  1.2302     B A
A            5     1.2187     B A
A                  1.1841     B A
B                  1.0670     B
B                  1.0602     B
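This experiment is used when one wishes to test two factors A and B, with a levels of factor A and b levels of factor B. In this experiment, both factors are crossed, which means that each level of A occurs in combination with each level of factor B. The model with interaction is given by:

$$ y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk} $$

For a = 3, b = 2, and three replicates, the matrices are:

$$ y = \begin{pmatrix} y_{111} \\ y_{112} \\ y_{113} \\ y_{121} \\ y_{122} \\ y_{123} \\ y_{211} \\ y_{212} \\ y_{213} \\ y_{221} \\ y_{222} \\ y_{223} \\ y_{311} \\ y_{312} \\ y_{313} \\ y_{321} \\ y_{322} \\ y_{323} \end{pmatrix}, \quad X = \left( \begin{array}{cccccccccccc} 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{array} \right), $$

$$ \beta = \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_1 \\ \beta_2 \\ \gamma_{11} \\ \gamma_{12} \\ \gamma_{21} \\ \gamma_{22} \\ \gamma_{31} \\ \gamma_{32} \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \varepsilon_{113} \\ \varepsilon_{121} \\ \varepsilon_{122} \\ \varepsilon_{123} \\ \varepsilon_{211} \\ \varepsilon_{212} \\ \varepsilon_{213} \\ \varepsilon_{221} \\ \varepsilon_{222} \\ \varepsilon_{223} \\ \varepsilon_{311} \\ \varepsilon_{312} \\ \varepsilon_{313} \\ \varepsilon_{321} \\ \varepsilon_{322} \\ \varepsilon_{323} \end{pmatrix} $$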
Example This experiment consisted of developing an in vitro efficacy test for self-
tanning formulations. Two brands, 1 = erythrulose, 2 = dihydroxyacetone (factor
A), and three formulations, 1 = solution, 2 = gel, and 3 = cream (factor B), were
tested with four replicates for each condition according to Jermann et al. (2001).
Total color change was measured for each of the combination conditions. The
dataset is shown in Table 1.15.
For this two-way model, assuming that the response variable $y_{ijk}$ has a normal distribution, the components are as follows:
Distribution: $y_{ijk} \sim N(\eta_{ij}, \sigma^2)$
Linear predictor: $\eta_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$
Link function: $\mu_{ij} = \eta_{ij}$ (identity)
where $y_{ijk}$ is the color change observed at the kth repetition at the ith level of factor A and at the jth level of factor B, μ is the intercept (the overall mean), $\alpha_i$ is the fixed effect due to the level of factor A (brand), $\beta_j$ represents the fixed effect of the level of factor B (type of formulation), and $\gamma_{ij}$ is the fixed effect due to the interaction between the brand and formulation. Table 1.16 shows the sources of variation and degrees of freedom.
The following code in GLIMMIX in SAS allows us to estimate the main effects
and the interaction:
proc glimmix;
   class brand formulation;
   /* brand|formulation expands to brand, formulation, and brand*formulation */
   model y = brand|formulation;
   lsmeans brand|formulation / lines;
run;
Table 1.17 Results of the analysis of variance of the two-way model with interaction
(a) Fit statistics
-2 Res log likelihood         90.20
AIC (smaller is better)       104.20
AICC (smaller is better)      115.40
BIC (smaller is better)       110.43
CAIC (smaller is better)      117.43
HQIC (smaller is better)      105.06
Pearson’s chi-square          99.61
Pearson’s chi-square / DF     5.53
(b) Type III tests of fixed effects
Effect                Num DF   Den DF   F-value   Pr > F
Brand                 1                 257.04    <0.0001
Formulation                             22.99     <0.0001
Brand × formulation                     4.68      0.0231
Part of the results is shown in Table 1.17. Of all the fit statistics in (a) of Table 1.17, the value that we are interested in highlighting in this analysis is “Pearson's chi-square/DF,” which corresponds to the mean squared error (MSE); the information criteria matter mainly when comparing alternative models for a given dataset. The value of the MSE is 5.53. The type III tests of fixed effects, in part (b) of Table 1.17, indicate that the brand (P < 0.0001), the formulation (P < 0.0001), and the interaction between both factors (P = 0.0231) all have a significant effect on the change of self-tanning color.
The least squares means obtained with “lsmeans” are shown in Table 1.18 for the levels of the tanning brand factor, in Table 1.19 for the levels of the tanning brand formulation, and in Table 1.20 for the interaction of both factors. The “lines” option allows us to make a comparison of means using the LSD method.
Table 1.19 Means and standard errors of the tanning brand formulation
Least squares means for the tanning brand formulation
Formulation   Estimate   Standard error   DF   t-value   Pr > |t|
1             22.9000    0.8317                27.53     <0.0001
              15.4000    0.8317                18.52     <0.0001
              16.7988    0.8317                20.20     <0.0001
T grouping of the least squares means (α = 0.05) of the tanning brand formulation
LS means with the same letter are not significantly different
Formulation   Estimate   Group
1             22.9000    A
              16.7988    B
              15.4000    B
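The numbers in Tables 1.17 and 1.19 are linked by simple arithmetic, which the following Python sketch retraces: the MSE is Pearson's chi-square divided by the error degrees of freedom, and the standard error of a formulation mean is sqrt(MSE/8), because each formulation mean averages 2 brands × 4 replicates. The critical value 2.101 (two-sided 5%, 18 DF) comes from a standard t table and is an assumption of this sketch.

```python
from math import sqrt

pearson_chi2 = 99.61            # Table 1.17(a)
n_obs = 2 * 3 * 4               # brands x formulations x replicates
n_cells = 2 * 3                 # one fitted mean per brand-formulation cell
error_df = n_obs - n_cells      # 24 - 6 = 18

mse = pearson_chi2 / error_df
n_per_formulation = 2 * 4       # observations behind each formulation mean
se_formulation = sqrt(mse / n_per_formulation)

# Assumption: t critical value for alpha = 0.05 (two-sided) with 18 DF
t_crit = 2.101
lsd = t_crit * sqrt(2 * mse / n_per_formulation)

means = [22.9000, 16.7988, 15.4000]   # Table 1.19, in decreasing order
print(f"MSE = {mse:.2f}, SE = {se_formulation:.4f}, LSD = {lsd:.2f}")
print("largest vs middle differ:", means[0] - means[1] > lsd)
print("middle vs smallest differ:", means[1] - means[2] > lsd)
```

The largest mean is separated from the other two, while the two smaller means differ by less than the LSD, which matches the letters A, B, B in Table 1.19.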
The hypothesis test for the interaction should be carried out first, and the main effects should be tested only if the interaction effect is not significant. If the interaction is significant, then tests for the main effects are meaningless. The interaction analysis shows that brand 2 (dihydroxyacetone) shows a greater change than brand 1 (erythrulose) in all three formulations.
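Now, considering the previous model without interaction ($\gamma_{11} = \gamma_{12} = \cdots = \gamma_{32} = 0$), where factor A has a levels and factor B has b levels, the model without interaction is given by:

$$ y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk} $$

$$ y = \begin{pmatrix} y_{111} \\ y_{112} \\ y_{113} \\ y_{121} \\ y_{122} \\ y_{123} \\ y_{211} \\ y_{212} \\ y_{213} \\ y_{221} \\ y_{222} \\ y_{223} \\ y_{311} \\ y_{312} \\ y_{313} \\ y_{321} \\ y_{322} \\ y_{323} \end{pmatrix}, \quad X = \begin{pmatrix} 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 & 0 & 1 \end{pmatrix}, \quad \beta = \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_1 \\ \beta_2 \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_{111} \\ \varepsilon_{112} \\ \vdots \\ \varepsilon_{322} \\ \varepsilon_{323} \end{pmatrix} $$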
Note that the design matrix for the model without interaction is the same as that
for the model with interaction, except that the last six columns are removed.
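The relationship between the two design matrices can be verified with a short Python sketch. The function `two_way_design` is a helper introduced here for illustration; it uses the same dimensions as the matrices above (a = 3, b = 2, three replicates) and the same column order (μ, α, β, γ).

```python
def two_way_design(a, b, reps, interaction=True):
    """Two-way design matrix; rows follow y_ijk with k varying fastest, then j, then i."""
    rows = []
    for i in range(a):
        for j in range(b):
            row = [1]                                         # mu
            row += [1 if i == u else 0 for u in range(a)]     # alpha_1 .. alpha_a
            row += [1 if j == v else 0 for v in range(b)]     # beta_1 .. beta_b
            if interaction:                                   # gamma_11 .. gamma_ab
                row += [1 if (i, j) == (u, v) else 0
                        for u in range(a) for v in range(b)]
            rows.extend([row[:] for _ in range(reps)])
    return rows


X_full = two_way_design(a=3, b=2, reps=3)
X_main = two_way_design(a=3, b=2, reps=3, interaction=False)

n_interaction_cols = 3 * 2
same_after_dropping = [r[:-n_interaction_cols] for r in X_full] == X_main

print(len(X_full), "x", len(X_full[0]))   # 18 x 12
print(len(X_main), "x", len(X_main[0]))   # 18 x 6
print(same_after_dropping)                # True
```

Dropping the last a × b = 6 columns of the interaction design matrix leaves exactly the main-effects design matrix.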
Let us assume that the interaction effect is not significant. The following SAS
code estimates the main effects of both factors. Running the program and analysis is
left as practice for the readers.
proc glimmix;
   class brand formulation;
   /* main effects only: no brand*formulation term */
   model y = brand formulation;
   lsmeans brand formulation / lines;
run;
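$$ y = \begin{pmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{23} \\ y_{31} \\ y_{32} \\ y_{33} \end{pmatrix}, \quad X = \begin{pmatrix} 1 & 1 & 0 & 0 & x_{11} & 0 & 0 \\ 1 & 1 & 0 & 0 & x_{12} & 0 & 0 \\ 1 & 1 & 0 & 0 & x_{13} & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & x_{21} & 0 \\ 1 & 0 & 1 & 0 & 0 & x_{22} & 0 \\ 1 & 0 & 1 & 0 & 0 & x_{23} & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 & x_{31} \\ 1 & 0 & 0 & 1 & 0 & 0 & x_{32} \\ 1 & 0 & 0 & 1 & 0 & 0 & x_{33} \end{pmatrix}, \quad \beta = \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{13} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{23} \\ \varepsilon_{31} \\ \varepsilon_{32} \\ \varepsilon_{33} \end{pmatrix} $$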
An ANCOVA is appropriate for this study to test the following three null
hypotheses from these data:
(a) There is no difference in the average number of ovules per flower between the
two populations (the main effect).
(b) There is no effect of plant size on the average number of ovules per flower (the
covariate effect).
(c) The effect of plant size on the mean number of ovules per flower does not differ between the study sites (the interaction effect).
The components of the ANCOVA model, assuming that the response variable $y_{ij}$ is normally distributed, are as follows:
Distributions: $y_{ij} \mid plant(\tau)_{j(i)} \sim N(\eta_{ij}, \sigma^2)$ and $plant(\tau)_{j(i)} \sim N(0, \sigma^2_{plant(population)})$
Linear predictor: $\eta_{ij} = \mu + \tau_i + plant(\tau)_{j(i)} + \beta_i (X_{ij} - \bar{X}_{..})$
Link function: $\mu_{ij} = \eta_{ij}$ (identity)
where $y_{ij}$ is the number of ovules observed in the jth plant of the ith population, μ is the overall mean, $\tau_i$ is the fixed effect due to population i, $plant(\tau)_{j(i)}$ is the random effect due to plant j in population i, $\beta_i$ is the slope of population i, $\bar{X}_{..}$ is the overall mean of the size of all plants, and $X_{ij}$ is the size of plant j in population i. The ANCOVA results (sources of variation and degrees of freedom) are shown in Table 1.21.
The basic syntax in GLIMMIX for analysis of covariance with different slopes is
as follows:
proc glimmix;
   class population plant;
   /* population*xbar fits a separate slope for each population */
   model ovules = population xbar population*xbar / ddfm=satterthwaite;
   random plant(population);
   lsmeans population / lines;
run;
In the above syntax, the “class” command lists all class (categorical) variables; the covariate (a continuous variable) is not listed. In this case, the covariate is centered on the average size of all plants, $xbar = X_{ij} - \bar{X}_{..}$. The options “ddfm” and “lines” ask proc GLIMMIX to apply a degrees-of-freedom correction using the Satterthwaite method and to compare the means using the LSD method, respectively. Part of the results is shown in Table 1.22.
The estimates of the variance components (part (a)) due to plant and within-treatment variability are $\hat{\sigma}^2_{plant(population)} = 12.795$ and $\hat{\sigma}^2 = MSE = 0.9321$, respectively. The analysis of variance in (b) shows significant effects of population (P = 0.0084), plant size (P = 0.0001), and the interaction between population and plant size (P = 0.0066) on the average number of ovules per flower. The estimated means of the average number of ovules for both populations are tabulated in the “Estimate” column in part (c), together with their standard errors, and the comparison of the means is given in part (d).
If in the previous model we assume that the slopes are equal ($\beta_1 = \beta_2$), then the ANCOVA reduces to:

$$ y_{ij} = \mu + \tau_i + \beta (X_{ij} - \bar{X}_{..}) + \varepsilon_{ij} $$
The basic syntax using GLIMMIX for an analysis of covariance with equal slopes
is as follows:
proc glimmix;
   class population plant;
   /* common slope: no population*xbar term */
   model ovules = population xbar / ddfm=satterthwaite;
   random plant(population);
   lsmeans population / lines;
run;
So far, we have exemplified the general linear model of the form y = Xβ +e. In the
following, some characteristics of a linear mixed model (LMM) will be described.
1.5 Mixed Models
1.5.1 Introduction
Linear mixed models (LMMs) are appropriate for analyzing continuous response variables in which the residuals are normally distributed. These types of models are well suited for studies of (1) grouped datasets, such as students in classrooms, animals in herds, people grouped by municipality or geographic region, or randomized block experimental designs such as batches of raw materials for an industrial process, and (2) longitudinal or repeated measures studies, in which subjects are measured repeatedly over time or under different conditions. These designs occur in a wide variety of settings: biology, agriculture, industry, and socioeconomic sciences. LMMs provide researchers with powerful and flexible analytical tools for these types of data.
The name linear mixed models comes from the fact that these models are linear in the parameters and that the covariates, or independent variables, may involve a combination of fixed and random effects. “Fixed effects” can be associated with continuous covariates, such as the weight in kilograms of an animal, maize yield in tons per hectare, a reference test score, or socioeconomic status, which take a continuous range of values, or with factors, such as gender, variety, or group treatment, which are categorical. Fixed effects are unknown constant parameters associated with continuous covariates or with the levels of the categorical factors in an LMM. The estimation of these parameters in LMMs is generally of intrinsic interest because they indicate the relationship of the covariates with the continuous response variable.
When the levels of a factor are sampled at random from a large population of levels, such that each particular level is not of interest in itself (e.g., classrooms, regions, herds, or clinics that are randomly sampled from a population), the effects associated with the levels of those factors can be modeled as random effects in an LMM. “Random effects” are represented by random (unobserved) variables that we generally assume to have a particular distribution, with the normal distribution being the most common.
Mixed models are extremely useful because they allow us to work on (address)
two important aspects:
1. From a statistical point of view, biological data are often structured in a way that
does not satisfy the assumption of independence of the dataset. Examples include
the following:
(a) Multiple measurements of the same subject/organism
(b) Experiments organized into spatial blocks
(c) Observational data in which multiple investigations were conducted in different locations
(d) Synthesis of data from similar experiments that were performed by different
researchers
2. From a biological perspective, the processes being measured can be affected by
multiple sources of variation, often occurring at different spatial or temporal
scales. We are interested in using statistical methods that can model multiple
sources of stochasticity, at multiple scales, so that we can measure the relative
magnitude of the different sources of variation and determine which predictors
explain variation at different scales.
The matrix notation for a mixed model is very similar to that for a fixed effects (systematic) model. The main difference is that, instead of a single design matrix for the systematic part, a mixed model uses at least two design matrices: a design matrix X that describes the fixed effects and a design matrix Z that describes the random effects. The fixed effects design matrix X is constructed in the same way as in a general linear fixed effects model (y = Xβ + ε). X has a dimension of n × (p + 1), where n is the number of observations in the dataset and p + 1 is the number of fixed effects parameters to be estimated. The design matrix Z is constructed in the same way, but for the random effects. The Z matrix has a dimension of n × q, where q is the number of random effects coefficients in the model.
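With b denoting the vector of random effects, the linear mixed model is $y = X\beta + Zb + \varepsilon$, with

$$ E(b) = 0, \quad E(\varepsilon) = 0 $$
$$ Var(b) = G, \quad Var(\varepsilon) = R $$
$$ Cov(b, \varepsilon) = 0 $$
$$ Var(y) = ZGZ' + R = V $$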
Then, when b and ε are normally distributed, the vector of observations y will have a normal distribution, that is, y ~ N(Xβ, V). The same model can be written in probability distribution form in two different but equivalent ways. The first is the marginal model

$$ y \sim N(X\beta, \; ZGZ' + R) \tag{1.3} $$

In this marginal model, the mean is based only on the fixed effects, and the parameters describing the random effects are contained in the variance-covariance matrix V (Littell et al. 2006). In general, a structure is imposed on b in terms of Var(b) = G, and, therefore, marginally, the components of y depend on the structure in V = ZGZ′ + R.
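To see how the random effects shape V, the following Python sketch builds V = ZGZ′ + R for a deliberately small, hypothetical layout: two groups with random intercepts and two observations per group. The variance components (2 for the random intercepts, 1 for the residual) are assumed values chosen only for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Hypothetical random-intercept layout: 2 groups with 2 observations each
Z = [[1, 0], [1, 0], [0, 1], [0, 1]]
sigma2_b, sigma2 = 2.0, 1.0                      # assumed variance components
G = [[sigma2_b, 0.0], [0.0, sigma2_b]]           # Var(b)
R = [[sigma2 if i == j else 0.0 for j in range(4)] for i in range(4)]  # Var(eps)

ZGZt = matmul(matmul(Z, G), transpose(Z))
V = [[ZGZt[i][j] + R[i][j] for j in range(4)] for i in range(4)]

for row in V:
    print(row)
# [3.0, 2.0, 0.0, 0.0]
# [2.0, 3.0, 0.0, 0.0]
# [0.0, 0.0, 3.0, 2.0]
# [0.0, 0.0, 2.0, 3.0]
```

Observations in the same group share the covariance 2 (the random intercept variance), while observations from different groups are uncorrelated, so V is block diagonal.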
The second is the conditional model

$$ y \mid b \sim N(X\beta + Zb, \; R) \tag{1.4} $$

In this conditional model, b is distributed as shown in Eq. (1.2) for this parameter. For LMMs, the two models are equivalent; but if the response variable is modeled under a non-normal distribution, then the models are different (Stroup, 2012) and generalized linear mixed models are required.
The fixed effects estimator $\hat{\beta}$ gives the best linear unbiased estimators (commonly known as BLUEs) of β, whereas $\hat{b}$ gives the best linear unbiased predictors (commonly known as BLUPs) of the random effects b. Estimating the expected value of the marginal LMM (1.3) yields the BLUEs, and estimating that of the conditional LMM (1.4) yields the BLUPs. The estimators for the BLUEs of β and the BLUPs of b are as follows:

$$ \hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y $$
$$ \hat{b} = G Z^T V^{-1} (y - X\hat{\beta}) $$
This solution is practical when working with small datasets, but in the context of big data it is computationally highly demanding, since the matrix V has to be inverted. For this reason, the BLUEs of β and the BLUPs of b are normally obtained by solving Henderson’s mixed model equations, which are presented later in this chapter.
The distribution selected by the researcher for the population under study should
be the true distribution of the response variable or a good approximation to it. A
good representation of the population distribution of a response variable should
not only take into account the nature of the response variable (e.g., continuous,
discrete, etc.) and the shape of the distribution but should also provide a good
model for the relationship between the mean and variance. In this chapter, we
assume that the data are normally distributed with mean μ and variance σ²,
$y_{ij} \sim N(\mu, \sigma^2)$, and that the random effects are normally distributed
with mean 0 and constant variance $\sigma_b^2$, $b_j \sim N(0, \sigma_b^2)$.
In an LMM, there are two types of factors, namely, fixed factors that make up the
systematic part and random factors that are the stochastic part, and their related
effects on the dependent variable (response). In the following sections, we provide a
brief description of these factors and their implications in the context of an LMM.
A random factor is a classification variable whose levels can be regarded as a
random sample from a population of possible levels. Not all possible levels of a
random factor are present in the dataset, but the intention of the researcher is to
make inferences about the entire population of levels from the selected sample of
these factor levels. Random factors are included in an analysis so that the
variation in the dependent variable across random factor levels can be evaluated and
the results of the data analysis can be generalized to all random factor levels in the
population.
In contrast to fixed factor levels, random factor levels do not represent conditions
specifically chosen to meet the objectives of the study. However, depending on the
objectives of the study, the same factor may be considered as either a fixed factor or a
random factor.
Fixed effects, commonly referred to as regression coefficients or fixed effect
parameters, describe the relationships between the dependent variable and predictor
variables (i.e., fixed factors or continuous covariates) for an entire population of
units of analysis or for a relatively small number of subpopulations defined by the
levels of a fixed factor. Fixed effects may describe the contrasts or differences
between levels of a fixed factor (e.g., between males and females for the factor sex) in the mean
responses for a continuous dependent variable or may describe the effect of a
continuous covariate on the dependent variable. Fixed effects are assumed to be
unknown fixed quantities in an LMM and are estimated based on analysis of the data
collected in a study.
Random effects are random values associated with the levels of a random factor
(or factors) in an LMM. These values, which are specific to a given level of a random
factor, generally represent random deviations from the relationships described by
fixed effects. For example, random effects associated with levels of a random factor
may enter an LMM as random intercepts (random deviations of a given subject or
group from an overall intercept) or as random coefficients (random deviations of a
given subject or group from the overall fixed effects) in the model. In contrast to fixed
effects, random effects are represented as random variables in an LMM.
When a given level of one factor (random or fixed) can be measured only at a single
level of another factor and not across multiple levels, then the levels of the first factor
are said to be nested within the levels of the second factor. The effects of the nested
factor on the response variable are known as nested effects. For example, suppose
that you want to conduct a study at the primary level in a school zone and you
select schools, and classrooms within schools, at random. Classroom levels (one of
the random factors) are nested within school levels (another random factor), since
each classroom can appear within only a single school.
When a given level of one factor (random or fixed) can be measured across
multiple levels of another factor, one factor is said to be crossed with the other and
the effects of these factors on the dependent variable are known as crossed effects.
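$$y = X\beta + Zb + \varepsilon$$
$$V = \mathrm{Var}(y) = ZGZ' + \sigma^2 I =
\begin{bmatrix}
\sigma^2 + \sigma_b^2 & \sigma_b^2 & 0 & 0 & 0 & 0 \\
\sigma_b^2 & \sigma^2 + \sigma_b^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 + \sigma_b^2 & \sigma_b^2 & 0 & 0 \\
0 & 0 & \sigma_b^2 & \sigma^2 + \sigma_b^2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^2 + \sigma_b^2 & \sigma_b^2 \\
0 & 0 & 0 & 0 & \sigma_b^2 & \sigma^2 + \sigma_b^2
\end{bmatrix}$$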
The variance of y11 is $V_{11} = \sigma^2 + \sigma_b^2$ and the covariance between y11 and y21 is
$V_{12} = V_{21} = \sigma_b^2$. These two observations come from the same block. The covariance
between y11 and the remaining observations is zero. In matrix V, all possible covariances can
be found.
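The block structure of V can be checked numerically. The following is a minimal sketch, assuming six observations ordered y11, y21, y12, y22, y13, y23 (two per block) and hypothetical variance values chosen only for illustration:

```python
# Build V = Z G Z' + sigma^2 I for 6 observations in 3 blocks (2 per block).
# The variance values below are hypothetical, not estimates from the book.
sigma2 = 3.0    # residual variance (assumed value)
sigma2_b = 5.0  # block variance (assumed value)

# Observation order: y11, y21, y12, y22, y13, y23.
block_of_obs = [0, 0, 1, 1, 2, 2]
n, q = len(block_of_obs), 3

# Z has a 1 in column j when the observation belongs to block j.
Z = [[1 if block_of_obs[i] == j else 0 for j in range(q)] for i in range(n)]

# With G = sigma2_b * I, (Z G Z')[i][k] = sigma2_b * sum_j Z[i][j] * Z[k][j].
V = [[sigma2_b * sum(Z[i][j] * Z[k][j] for j in range(q))
      + (sigma2 if i == k else 0.0)
      for k in range(n)] for i in range(n)]

for row in V:
    print(row)
```

Each diagonal entry equals the sum of the two variances, entries for two observations in the same block equal the block variance, and all other entries are zero.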
The likelihood function l is a function of the observations and the model parameters.
It gives us a measure of how probable a particular observation y is, given
a set of model parameters β and b. The log likelihood functions for y | b and for b in a mixed
model are given by:
$$l(y \mid b) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log|R| - \frac{1}{2}(y - X\beta - Zb)^T R^{-1} (y - X\beta - Zb)$$
and
$$l(b) = -\frac{N_b}{2}\log(2\pi) - \frac{1}{2}\log|G| - \frac{1}{2} b^T G^{-1} b$$
where $N_b$ represents the total number of random effect levels. Therefore, omitting the
terms that do not depend on β or b, the joint log likelihood of y and b is equal to:
$$l(y, b) = -\frac{1}{2}(y - X\beta - Zb)^T R^{-1} (y - X\beta - Zb) - \frac{1}{2} b^T G^{-1} b$$
Now, differentiating the above expression with respect to β and b, setting the
derivatives to zero, and solving the resulting equations for β and b, the maximum
likelihood estimators are obtained:
$$\frac{\partial l(y, b)}{\partial \beta} = X^T R^{-1} y - X^T R^{-1} X\beta - X^T R^{-1} Z b$$
$$\frac{\partial l(y, b)}{\partial b} = Z^T R^{-1} y - Z^T R^{-1} X\beta - Z^T R^{-1} Z b - G^{-1} b$$
Setting them to zero and solving for β and b, we obtain the following linear mixed
equations:
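$$\begin{bmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1} \end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{b} \end{bmatrix} =
\begin{bmatrix} X^T R^{-1} y \\ Z^T R^{-1} y \end{bmatrix}$$
$$\begin{bmatrix} \hat{\beta} \\ \hat{b} \end{bmatrix} =
\begin{bmatrix} X^T R^{-1} X & X^T R^{-1} Z \\ Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1} \end{bmatrix}^{-1}
\begin{bmatrix} X^T R^{-1} y \\ Z^T R^{-1} y \end{bmatrix}$$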
Here, β is the vector of fixed effects parameters and b is the vector of random
effects. The solution depends on the two covariance matrices G and R, and it no
longer depends directly on V as in the previous solution.
Moreover, this solution, which is known as Henderson's (1950) mixed model
equations, is computationally much more efficient than the previous one given for
β and b since it does not require the inverse of the matrix
V = ZGZ′ + R. The solution to these mixed model equations is based on the
assumption that we know the components of G and R, which, in practice, need to be
estimated. The following is a popular, extremely versatile, and powerful method for
estimating the variance components of G and R.
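To make the mixed model equations concrete, here is a minimal sketch that builds and solves them for a small random-intercept model. The data, the variance values (treated as known), and the helper `solve` are all hypothetical choices for illustration; they are not taken from the book. For balanced data such as these, the solution can be compared with the familiar closed form, in which the estimated mean is the grand mean and each predicted effect is a shrunken deviation of the group mean.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


# Hypothetical data: 3 groups (random effect levels) with 2 observations each.
y = [10.0, 12.0, 14.0, 15.0, 9.0, 11.0]
group = [0, 0, 1, 1, 2, 2]
sigma2, sigma2_b = 3.0, 5.0   # assumed known: R = sigma2 * I, G = sigma2_b * I
n_obs, p, q = len(y), 1, 3

# W = [X | Z]: X is a column of ones (overall mean), Z holds group indicators.
W = [[1.0] + [1.0 if g == j else 0.0 for j in range(q)] for g in group]

# Coefficient matrix W' R^{-1} W, then add G^{-1} to the random-effects block.
C = [[sum(W[i][r] * W[i][c] for i in range(n_obs)) / sigma2
      for c in range(p + q)] for r in range(p + q)]
for j in range(p, p + q):
    C[j][j] += 1.0 / sigma2_b

# Right-hand side W' R^{-1} y.
rhs = [sum(W[i][r] * y[i] for i in range(n_obs)) / sigma2 for r in range(p + q)]

solution = solve(C, rhs)
mu_hat, b_hat = solution[0], solution[1:]
print(mu_hat, b_hat)
```

Note that only a (p + q) × (p + q) system is solved; the n × n matrix V is never formed or inverted.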
The restricted maximum likelihood method, also known as the residual maximum
likelihood method, is extremely useful, among other things, for estimating
variance components. It is also based on the maximum likelihood
method, but, instead of maximizing the likelihood function of the original data, it
maximizes the likelihood function of a set of residuals (error contrasts) obtained by
removing the fixed effects from the original response.
That is, instead of the likelihood of y, the likelihood of Ky is maximized, where
K is a matrix of constants such that KX = 0,
which implies that
$$E(Ky) = KX\beta = 0.$$
Consequently, Ky is distributed as $N(0, K^T V K)$, and the likelihood of Ky is
called the restricted (or residual) likelihood; maximizing it gives the restricted
maximum likelihood (REML) estimates. There are many options for choosing K; a
typical choice is $K = I - X(X^T X)^{-1} X^T$, the ordinary least squares
residual operator. Therefore, the log likelihood of Ky is equal to
$$l(V \mid Ky) = -\frac{n-p}{2}\log(2\pi) - \frac{1}{2}\log\left|K^T V K\right| - \frac{1}{2}\, y^T K^T \left(K^T V K\right)^{-1} (Ky)$$
This log likelihood after some algebra, according to Stroup (2012), is equal to:
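$$l(V \mid Ky) = -\frac{n-p}{2}\log(2\pi) - \frac{1}{2}\log|V| - \frac{1}{2}\log\left|X^T V^{-1} X\right| - \frac{1}{2}\, r^T V^{-1} r$$
where p = rank(X) and $r = y - X\hat{\beta}_{ML}$, with $\hat{\beta}_{ML} = \left(X^T V^{-1} X\right)^{-} X^T V^{-1} y$.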
The variance components of G and R are estimated with iterative methods such as
the Newton–Raphson or Fisher's scoring method, which maximize the likelihood
function l(V | Ky) with respect to the variance components. The maximization process
begins with initial values for the variance components in G and R; with
these values of G and R, it is possible to obtain a new, more refined estimate of the
parameters β and b; then, these values are used to update the estimates of the
variance components of the matrices G and R, and this process continues until the
convergence criterion is met.
Suppose that we randomly select $a$ levels from a sufficiently large set of possible
levels of the factor of interest. In this case, we say that the factor is random. Random
factors are usually categorical. Continuous covariates that cannot be measured at
random levels are generally known as “systematic” or “fixed” effects (e.g., linear,
quadratic, or even exponential terms). Random effects are not systematic. Let us
assume a simple one-way model:
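$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}; \quad i = 1, 2, \cdots, a; \; j = 1, 2, \cdots, n_i$$
However, in this case, the treatment effects and the error term are both random
variables, i.e., $\tau_i \sim N(0, \sigma_\tau^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$, respectively. The terms τi and εij
are uncorrelated, and their variances, $\sigma_\tau^2$ and $\sigma^2$, are commonly referred to as "variance components."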
There can be some confusion about the differences between noise factors and
random factors. Noise factors can be fixed or random.
Factors are random when we think of their levels as a random sample from a larger
population, and their effect is not systematic. It is not always clear
when a factor is random. For example, suppose that the vice president of a chain of
stores is interested in the effects of implementing a management policy in the stores.
If the experiment includes all five existing stores, then "store" might be considered a
fixed factor because its levels do not come from a random
sample. However, if the chain has 100 stores and 5 of them are taken for the
experiment, with the company considering rapid expansion and planning to implement
the selected policy at the new locations, then "store" could be considered a
random factor.
In fixed effects models, the researcher’s interest would focus on testing the
equality of means of treatments (stores). This would not be appropriate, however,
for the case in which 5 stores are randomly selected out of 100 because the
treatments are randomly selected and we are interested in the population of treat-
ments (stores), not in a particular store or group of stores. The appropriate hypothesis
test for this random effect model would be
$$H_0: \sigma_\tau^2 = 0 \quad \text{vs} \quad H_a: \sigma_\tau^2 > 0$$
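The standard analysis of variance partition of the total sum of squares still
applies; however, the form of the appropriate test statistic depends on the expected
mean squares. In this case, the appropriate test statistic would be
$$F_c = \frac{\text{Mean Square}_{\text{Treatments}}}{\text{Mean Square}_{\text{Error}}},$$
and, equating the observed mean squares to their expected values,
$$\hat{\sigma}^2 + n\hat{\sigma}_\tau^2 = \text{Mean Square}_{\text{Treatments}}, \qquad \hat{\sigma}^2 = \text{Mean Square}_{\text{Error}},$$
the variance component due to treatments is estimated by
$$\hat{\sigma}_\tau^2 = \frac{\text{Mean Square}_{\text{Treatments}} - \text{Mean Square}_{\text{Error}}}{n}$$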
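The calculation above can be sketched in a few lines. This is an illustration only: the data are hypothetical (four randomly chosen treatment levels with three observations each), and the design is assumed to be balanced so that n is the common number of observations per level.

```python
# Hypothetical balanced data: a = 4 random treatment levels, n = 3 observations each.
data = [
    [12.1, 11.4, 12.9],
    [15.2, 14.8, 16.1],
    [10.3, 11.0, 9.8],
    [13.5, 12.7, 13.9],
]
a, n = len(data), len(data[0])
grand_mean = sum(sum(g) for g in data) / (a * n)
group_means = [sum(g) / n for g in data]

# Sums of squares and mean squares of the one-way analysis of variance.
ss_treat = n * sum((m - grand_mean) ** 2 for m in group_means)
ss_error = sum((v - m) ** 2 for g, m in zip(data, group_means) for v in g)
ms_treat = ss_treat / (a - 1)
ms_error = ss_error / (a * (n - 1))

# Test statistic for H0: sigma_tau^2 = 0 and moment estimates of the components.
f_c = ms_treat / ms_error
sigma2_hat = ms_error
sigma2_tau_hat = (ms_treat - ms_error) / n
print(f_c, sigma2_hat, sigma2_tau_hat)
```

This moment estimator can be negative when the treatment mean square is smaller than the error mean square, which is one reason likelihood-based methods such as REML are often preferred.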
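$$y_{ij} = \underbrace{\mu + \tau_i}_{\text{Systematic}} + \underbrace{b_j + \varepsilon_{ij}}_{\text{Random}}$$
$$\underbrace{\begin{bmatrix} y_{11} \\ y_{21} \\ y_{12} \\ y_{22} \\ y_{13} \\ y_{23} \end{bmatrix}}_{y} =
\underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}}_{X}
\underbrace{\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}}_{\beta} +
\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}}_{Z}
\underbrace{\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}}_{b} +
\underbrace{\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{21} \\ \varepsilon_{12} \\ \varepsilon_{22} \\ \varepsilon_{13} \\ \varepsilon_{23} \end{bmatrix}}_{\varepsilon}$$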
where b ~ N(0, G) and ε ~ N(0, R). The variance–covariance matrix G for the random
effects in this case is a 3 × 3 diagonal matrix with diagonal elements $\sigma_b^2$. Note how
the matrix representation of this model exactly corresponds to the mixed model
formulation, that is, $y = X\beta + Zb + \varepsilon$.
where $y_{ij}$ is the weight observed in the ijth piglet, μ is the overall mean, $\tau_i$ is the fixed
effect due to the ith diet, $b_j$ is the random effect due to the jth block (litter), assuming
$b_j \sim N(0, \sigma_b^2)$, and $\varepsilon_{ij}$ is the independent and identically distributed
normal error term with mean 0 and variance σ², i.e., $\varepsilon_{ij} \sim N(0, \sigma^2)$.
The random effects $b_j$ and $\varepsilon_{ij}$ are assumed to be independent of each other.
Table 1.24 shows an outline of the analysis of variance for this dataset.
The SAS program to analyze this dataset is as follows:
In the previous syntax, two options are of great importance in this
example: (1) the "ddfm = satterthwaite" option applies a correction to
the degrees of freedom, which is particularly important when the number
of experimental units (EU) differs among the treatments, and (2) the
"lines" option of "lsmeans" displays the estimated means grouped by
letters; means that do not share a letter are significantly different.
The output for this code is shown in Table 1.25. Subsection (a) of this table shows
the estimated variance due to litter ($\hat{\sigma}^2_{litter} = 5.3117$) and the mean squared error
($\hat{\sigma}^2 = 3.2961$). The analysis of variance, part (b), shows that there is a highly
significant effect of diet on piglet weight gain (P = 0.0091). In part (c),
we also observe the estimated means and their standard errors (obtained with "lsmeans
diet/lines") and, in part (d), the grouping of means that are statistically different. In these
last results, we can observe that the weight gains of piglets under treatments I and II
are not statistically different from each other, but they are statistically different with
respect to treatment III.
Since the researcher wishes to make an inference about the entire population of
litters, the factor “litter” must be entered as a random effect; otherwise, the ability of
the F-test to detect differences between treatments is diminished because the P-value
changes from 0.0091 to 0.0248. Another way to see the importance of including
random effects in an ANOVA is to calculate the relative efficiency (RE) between the
two models.
Table 1.26 shows the results of the analysis of variance under a completely
randomized design (CRD), i.e., under the model yij = μ + litteri + eij:
In this case, if the experiment had been analyzed under a CRD, then the relative
efficiency (RE) between an RCBD and a CRD would be:
$$RE = \frac{CME_{CRD}}{CME_{RCBD}} = \frac{\left(SSB_{RCBD} + SSE_{RCBD}\right) / \left[t(b-1)\right]}{CME_{RCBD}} = \frac{(b-1)\,MSB_{RCBD} + b(t-1)\,CME_{RCBD}}{(bt-1)\,CME_{RCBD}}$$
where $CME_{CRD}$ is the mean squared error under a CRD, $CME_{RCBD}$ is the mean
squared error under an RCBD, $SSB_{RCBD}$ is the sum of squares due to blocks in an
RCBD, $SSE_{RCBD}$ is the sum of squares of errors in an RCBD, $MSB_{RCBD}$ is the mean
square due to blocks, and t and b are the number of treatments and blocks,
respectively. If blocks are not useful, then the RE would be equal to 1. The higher
the RE, the more effective the blocking is in reducing the error variance. This value
can be interpreted as the ratio r/b, where r is the number of experimental units
that would have to be assigned to each treatment if a CRD were used instead of
an RCBD.
In Table 1.27, we can observe the mean squared error (MSE) of a CRD and
RCBD (Pearson’s chi-square / DF) obtained with the GLIMMIX procedure in SAS
as well as a series of fit statistics.
The MSE for a CRD and an RCBD are 8.61 and 3.3, respectively. Substituting
these values into the above equation, we obtain
$$RE = \frac{CME_{CRD}}{CME_{RCBD}} = \frac{8.61}{3.3} = 2.609.$$
This value indicates that an RCBD is 2.609 times more efficient than a CRD. In
other words, a CRD would have required at least 8 (2.609 × 3 ≈ 8)
experimental units per treatment to obtain the same MSE as that
obtained in an RCBD.
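The arithmetic can be reproduced with a short sketch. The two mean squared errors are the values quoted in the text; the multiplier 3 is read off the expression "2.609 × 3" and is an assumption here, since the design tables are not reproduced in this section.

```python
import math

cme_crd = 8.61    # mean squared error under the CRD (from the text)
cme_rcbd = 3.3    # mean squared error under the RCBD (from the text)
b = 3             # assumption: the multiplier used in "2.609 x 3"

rel_eff = cme_crd / cme_rcbd   # relative efficiency of the RCBD versus the CRD
units = rel_eff * b            # r = RE * b, units per treatment needed in a CRD

print(round(rel_eff, 3), round(units, 2), math.ceil(units))
```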
1.6 Exercises
Exercise 1.6.1 The following dataset corresponds to the growth of pea plants, in
ocular units (1 ocular unit = 0.114 mm), in tissue culture with auxins. The purpose of this
experiment was to test the effects of adding various types of sugars to the culture medium
on growth in length. Pea plants were randomly assigned to one of five treatments:
control (no sugar), 2% glucose, 2% fructose, 1% glucose + 1% fructose,
and 2% sucrose. A total of 10 observations were taken in each of the treatments,
and the measurements are assumed to be approximately normally distributed with
constant variance. Here, the individual plants to which the treatments were applied are
the experimental units. The data from this experiment are shown below (Table 1.28):
Table 1.28 Growth of pea plants in the culture medium with auxins with different types of sugars
Plant | Control | 2% glucose | 2% fructose | 1% glucose + 1% fructose | 2% sucrose
1 | 75 | 57 | 58 | 58 | 62
2 | 67 | 58 | 61 | 59 | 66
3 | 70 | 60 | 56 | 58 | 65
4 | 75 | 59 | 58 | 61 | 63
5 | 65 | 62 | 57 | 57 | 64
6 | 71 | 60 | 56 | 56 | 62
7 | 67 | 60 | 61 | 58 | 65
8 | 67 | 57 | 60 | 57 | 62
9 | 76 | 59 | 57 | 57 | 62
10 | 68 | 61 | 58 | 59 | 67
(a) Write the statistical model that best describes this dataset, indicating its
components.
(b) Calculate the analysis of variance for this experiment.
(c) Is there any significant difference between treatments on average plant growth?
Exercise 1.6.2 A forage company wants to test three different types of fertilizers
(F1, F2, and F3) for the production of two forage species (A and B) for cattle and
compare them with the fertilizer it usually applies, which we will call the control. For
this, the company decides to use 48 pots in the greenhouse, with 6 replications of each
combination of fertilizer and forage species. The data from this experiment are
shown in Table 1.29:
(a) Write and describe the statistical model of the experimental design with all its
components.
(b) Calculate the analysis of variance for this experiment.
(c) Is there any significant difference between treatments on average plant growth?
Exercise 1.6.3 The data in this experiment are the number of plants regrown after
grazing with sheep–goats. The initial size of the plant at the top of its rootstock is
recorded, and the weight of seeds (g) that it produces at the end of the season is the
response or dependent variable. The data for this experiment are as follows
(Table 1.30):
(a) List and describe all the components of the linear mixed model.
(b) Calculate the ANOVA for this dataset and answer the following questions:
Is seed weight influenced by the type of grazing?
Is seed weight influenced by the plant size?
Is the effect of grazing type on seed weight influenced by the initial plant size?
Table 1.31 Supplementation trial in Dorper (breed 1) and Red Maasai (breed 2) lambs
Id Race Sex Supplement Block IW FW PRBC FEC WG
349 1 2 1 1 8 8.9 10 6500 0.9
326 1 2 1 1 9 10.1 11 2650 1.1
393 1 1 1 2 12 12.6 22 750 0.6
71 1 1 1 2 12.3 14.6 15 5200 2.3
271 1 1 1 3 13 13.7 19 4800 0.7
382 1 2 1 3 15.5 16.8 24 2450 1.3
85 1 2 1 4 16.3 18.2 27 200 1.9
176 1 2 1 4 15.9 17.7 21 3000 1.8
286 1 2 2 1 11 13.6 21 1600 2.6
183 1 1 2 1 9.9 11.7 21 450 1.8
21 1 2 2 2 11.6 13.1 25 2900 1.5
122 1 1 2 2 12.5 14.8 25 300 2.3
374 1 1 2 3 14.6 17.9 19 2250 3.3
32 1 2 2 3 14.2 16.9 22 2800 2.7
282 1 2 2 4 16.3 20.2 20 750 3.9
94 1 1 2 4 16.7 17.7 13 5600 1
127 2 2 1 1 7.5 8.1 26 1350 0.6
216 2 2 1 1 8.2 9.3 19 1150 1.1
133 2 1 1 2 10.1 11.7 30 200 1.6
249 2 1 1 2 8.8 10.4 28 0 1.6
123 2 2 1 3 11.6 12.6 23 600 1
222 2 2 1 3 11.3 13.5 24 1500 2.2
290 2 2 1 4 12.3 14.3 22 1950 2
148 2 1 1 4 13.1 14.9 26 500 1.8
142 2 2 2 1 8.2 11.5 25 850 3.3
154 2 2 2 1 9.5 12.2 35 700 3.7
166 2 1 2 2 9.7 12.8 29 400 3.1
322 2 1 2 2 8.6 12 26 800 3.4
156 2 1 2 3 10.2 13 28 1550 2.8
161 2 2 2 3 11.2 14.6 22 550 3.4
321 2 1 2 4 12.1 15.9 25 1250 3.8
324 2 1 2 4 13.8 18.1 24 1100 4.3
IW initial weight, FW final weight, PRBC percentage of red blood cells, FEC fecal egg count, WG
weight gain
Appendix
Chapter 2
Generalized Linear Models
2.1 Introduction
In the general linear model y = Xβ + e (which, despite its name, is not highly
general), the response variables are normally distributed, with constant variance
across the values of all the predictor variables, and are linear functions of the
predictor variables. Traditionally, data transformations were used to try to force the
data into a normal linear regression model, or to find a transformation of a non-normal
response variable (discrete, categorical, positive continuous, etc.) that is linearly
related to the predictor variables; however, this is no longer necessary. For example,
instead of assuming a normal distribution, a positively skewed distribution whose
values are positive real numbers can be selected. Generalized linear models (GLMs)
go beyond the general linear model: the response variable need not be continuous or
normally distributed, its variance may change with the mean (heteroscedasticity), and
a function of the mean of the response variable, rather than the mean itself, is
linearly related to the predictor or explanatory variables.
Nelder and Wedderburn (1972) implemented a unified methodology for linear
models, thus opening a window for researchers to design models that can explain the
variation of the phenomenon under study. Later, McCullagh and Nelder (1989)
proposed an extension of linear models, called generalized linear models (GLMs).
They pointed out that the key elements of a classical linear model are as follows:
(i) the observations are independent, (ii) the mean of the observation is a linear
function of some covariates, and (iii) the variance of the observation is a constant. To
further extend these, points (ii) and (iii) are modified as follows: (ii’) the mean of the
observation is associated with a linear function of some covariates via a link function
and (iii’) the variance of the observation is a function of the mean. For more details,
see the study by McCullagh and Nelder (1989). GLMs can be adapted to a wide
variety of response variables. Special cases of GLMs include not only regression and
analysis of variance (ANOVA) but also logistic regression, probit models, Poisson
regression, log-linear models, and many more.
The construction of a GLM begins with choosing the distribution of the response
variable, the predictor or explanatory variables to include in the systematic compo-
nent, and how to connect the mean of the response to the systematic component. The
three important components are described in the following sections:
The first component to specify is the random component, which consists of choosing
a probability distribution for the response variable. This can be any member of the
exponential family of distributions, such as normal, binomial, Poisson, gamma, and
so on.
Finally, we will look at the specification of the link function that maps the mean of
the response variable to the linear predictor. The link function allows a nonlinear
relationship between the mean of the response variable and the linear predictor;
the link g(·) connects the mean of the response variable with the linear predictor.
That is,
$$g(\mu) = \eta$$
The function must be monotonic (and differentiable). The mean is equal, in turn,
to the inverse transformation g⁻¹(·) applied to the linear predictor, that is,
$$\mu = g^{-1}(\eta)$$
The most natural and meaningful way to interpret the model parameters is in
terms of the scale of the data. In other words,
It is important to note that the link relates the mean of the response to the linear
predictor and that this is different from transforming the response variable
directly. If the response variable is transformed (i.e., the y's), then a distribution
must be selected that describes the population distribution of the transformed
data, which makes interpretation on the original scale of the data more difficult.
A transformation of the mean is generally not equal to the mean of the transformed
values, that is, g(E[y]) ≠ E[g(y)]. For example, suppose we have a distribution with
the following values (and probabilities):
yi 1 2 3 4
prob(Y = yi) 0.125 0.375 0.375 0.125
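The inequality can be checked directly with the four values and probabilities listed above. The sketch below assumes the natural logarithm as the function g; this choice is made here for illustration and is not stated in the text.

```python
import math

values = [1, 2, 3, 4]
probs = [0.125, 0.375, 0.375, 0.125]

# E[y]: the mean on the original scale.
mean_y = sum(v * p for v, p in zip(values, probs))

# g(E[y]): transform the mean (g = log is an assumption for this example).
g_of_mean = math.log(mean_y)

# E[g(y)]: transform each value first, then average.
mean_of_g = sum(math.log(v) * p for v, p in zip(values, probs))

print(mean_y, round(g_of_mean, 4), round(mean_of_g, 4))
```

With these numbers, the mean is 2.5, the log of the mean is about 0.916, and the mean of the logs is about 0.845, so the two quantities differ.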
μ = η:
$$\eta = \log\left(\frac{\pi}{1 - \pi}\right) \;\rightarrow\; \pi = \frac{e^{\eta}}{1 + e^{\eta}}$$
Another useful alternative for these types of data is the probit link function:
$$\eta = \Phi^{-1}(\pi) \;\rightarrow\; \pi = \Phi(\eta)$$
$$\eta = \log(\lambda) \;\rightarrow\; \lambda = e^{\eta}$$
The variance function has the form V(λ) = λ, and, as for the binomial
distribution, the scale parameter is 1. Poisson models with a log link function are
often referred to as log-linear models, commonly used for contingency
(frequency) tables with two or more classification factors.
$$\eta = \frac{1}{\mu} \;\rightarrow\; \mu = \frac{1}{\eta}$$
The variance function is V(μ) = μ², and the scale parameter ϕ is
usually unknown. In some cases, the log link function is used instead, whose
inverse is the exponential function. It should be noted that the inverse link
does not restrict the mean to positive values for every value of the linear predictor.
Therefore, given its limitations, the theory only provides reasonable approximations
for most applications. The exponential distribution is a special case of the
gamma distribution.
Previously, the classical methods for working with non-normal data – before the
advances in computational methods – consisted of using direct transformation of the
response variable, that is, the data were transformed using the function t( y) before
being analyzed. The goal of the transformation was to obtain a simple connection
between the mean and the linear predictor. However, obtaining a consistent scale of
variation when selecting a transformation is vitally important. The usual way for
selecting a suitable transformation is based on the assumption that, within the region
of variation of the random variable, the transformation can capture the variability
adequately through a simple linear approximation of the mean. That is, if the random
variable y has a distribution with a mean μ and variance σ 2(μ), we want to find a
transformation t( y) such that it is forced to have a constant variance (stabilizes the
variance). The commonly used functions to stabilize variance are the square root,
$\sqrt{y}$, when the data have a Poisson distribution; the arcsine square root when
the data are binomial; and the logarithmic transformation for data with a constant
coefficient of variation.
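Why these three transformations stabilize the variance can be seen from a first-order (delta method) approximation. The derivation below is a sketch added for illustration; it assumes that t is differentiable at μ, and, in the binomial case, that y denotes the observed proportion out of n trials.

```latex
% First-order Taylor expansion of t(y) around the mean \mu:
\mathrm{Var}[t(y)] \approx [t'(\mu)]^2 \, \sigma^2(\mu)

% Poisson: \sigma^2(\mu) = \mu, \; t(y) = \sqrt{y}, \; t'(\mu) = \frac{1}{2\sqrt{\mu}}
\mathrm{Var}[\sqrt{y}] \approx \frac{1}{4\mu} \, \mu = \frac{1}{4}

% Binomial proportion: \sigma^2(\pi) = \frac{\pi(1-\pi)}{n}, \;
% t(y) = \arcsin\sqrt{y}, \; t'(\pi) = \frac{1}{2\sqrt{\pi(1-\pi)}}
\mathrm{Var}[\arcsin\sqrt{y}] \approx \frac{1}{4\pi(1-\pi)} \cdot \frac{\pi(1-\pi)}{n} = \frac{1}{4n}

% Constant coefficient of variation c: \sigma^2(\mu) = c^2\mu^2, \;
% t(y) = \log y, \; t'(\mu) = \frac{1}{\mu}
\mathrm{Var}[\log y] \approx \frac{1}{\mu^2} \, c^2 \mu^2 = c^2
```

In each case, the approximate variance of the transformed variable no longer depends on the mean.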
Table 2.1 provides an overview of the most common link functions that will give
admissible values for certain types of response variables and the corresponding
inverse of the link function.
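As an illustration of how a link and its inverse work together, the sketch below codes the links discussed in this section (identity, logit, probit, log, and inverse) as pairs of Python functions. The dictionary name `LINKS` and the test value 0.3 are arbitrary choices made here; the selection of links follows the surrounding text and is not a transcription of Table 2.1.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # standard normal distribution, used by the probit link

# Each entry maps a link name to the pair (g, g_inverse).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda p: math.log(p / (1 - p)),
              lambda eta: 1 / (1 + math.exp(-eta))),
    "probit": (_std_normal.inv_cdf, _std_normal.cdf),
    "log": (math.log, math.exp),
    "inverse": (lambda mu: 1 / mu, lambda eta: 1 / eta),
}

# Applying a link and then its inverse should return the original mean.
mu = 0.3
round_trip = {name: g_inv(g(mu)) for name, (g, g_inv) in LINKS.items()}
for name, value in round_trip.items():
    print(name, value)
```

The logit inverse is written as 1 / (1 + exp(-η)), which is algebraically the same as exp(η) / (1 + exp(η)).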
According to McCullagh and Nelder (1989) and Agresti (2013) in Chap. 4, a GLM is
defined under the following assumptions:
(a) The data y1, y2, ⋯, yn are independent.
(b) The response variable yi does not necessarily have to have a normal distribution,
but we usually assume a distribution from an exponential family (e.g., binomial,
Poisson, multinomial, gamma, etc.).
(c) A GLM does not assume a linear relationship between the dependent variable
and the independent variables, but it does assume a linear relationship between
the response transformed in terms of the link function and the explanatory
variables; for example, for logit(π) from a binary logistic regression, logit
(π) = β0 + βx.
(d) The predictor (explanatory) variables may be in terms of power or some other
nonlinear transformations of the original independent variables.
(e) The assumption of homogeneity of variance need not be satisfied. In fact, it is not
possible in many cases, given the structure of the model and the presence of
overdispersion (when the observed variance is larger than what the model
assumes).
(f) Errors are independent but are not normally distributed.
(g) The estimation method is maximum likelihood (ML) or other methods instead of
ordinary least squares (OLS) to estimate the parameters.
Estimators of the regression coefficients for linear models with a normal response are
obtained using least squares or ML, and significance tests generally compare
residual sums of squares under different hypotheses using the F-test. It
is worth mentioning that these tests are exact, so no approximations are
required for their implementation.
GLMs offer a natural extension of this situation in the sense that: (1) The
computational calculations used to determine the ML estimations of the regression
parameters/coefficients are highly similar to those used in cases when the response is
normal, with the difference being that the estimation process is iterative, which
produces successive approximations that converge to the ML estimates. (2) In the
inference procedures, the test statistic commonly used is the likelihood ratio test,
which is parallel to the F-tests in linear models with a normal response. Thus, GLMs
provide a uniform method of estimation and inference. The parameter β is
estimated by ML, whereas the inference methods are generally
approximations since they are based on the theory of the distribution of a sufficiently
large sample, as in the case of the likelihood ratio test. There are several
alternative tests such as the Wald test, the score test, and the likelihood ratio test.
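The iterative nature of ML estimation in a GLM can be illustrated with iteratively reweighted least squares (IRLS). The sketch below fits a Poisson regression with a log link and a single covariate. The counts are hypothetical, and the algorithm shown is the textbook IRLS scheme, not necessarily the exact routine used by any particular software.

```python
import math

# Hypothetical counts y observed at covariate values x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 3, 4, 8, 14, 20]

b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
for iteration in range(25):
    eta = [b0 + b1 * xi for xi in x]           # linear predictor
    mu = [math.exp(e) for e in eta]            # inverse log link
    w = mu                                     # IRLS weights for the log link
    z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]  # working response

    # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_wz = sum(wi * zi for wi, zi in zip(w, z))
    s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
    det = s_w * s_wxx - s_wx ** 2
    new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
    new_b1 = (s_w * s_wxz - s_wx * s_wz) / det

    change = abs(new_b0 - b0) + abs(new_b1 - b1)
    b0, b1 = new_b0, new_b1
    if change < 1e-10:   # successive approximations have converged
        break

# At the ML estimates, the score equations are (numerically) zero.
mu = [math.exp(b0 + b1 * xi) for xi in x]
score_b0 = sum(yi - m for yi, m in zip(y, mu))
score_b1 = sum(xi * (yi - m) for xi, yi, m in zip(x, y, mu))
print(b0, b1, score_b0, score_b1)
```

Each pass is an ordinary weighted least squares fit, which is why the computations resemble those of the normal linear model.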
In the following examples, we will describe the components of a GLM for some
normal, gamma, binomial, and Poisson regression models.
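$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)$$
Equivalently,
$$E(y_i \mid x_i) = \beta_0 + \beta_1 x_i$$
Distribution: $y_i \sim N(\mu_i, \sigma^2)$, with $E(y_i) = \mu_i$ and $\mathrm{Var}(y_i) = \sigma^2$
Linear predictor: $\eta_i = \beta_0 + \beta_1 x_i$
Link function: $\eta_i = \mu_i$ (identity link)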
where β0 and β1 are the intercept and slope, respectively. This means that we are
expressing the linear model as a GLM.
Example 1 A simple linear regression analysis was performed on the diamond price
( y) as a function of the number of carats (Table 2.2) and assuming that the response
variable “y” has a normal distribution with a mean β0 + β1xi and variance σ 2.
The basic Statistical Analysis Software (SAS) syntax for simple linear regression
is as follows:
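proc reg;
  /* regress price on weight; clb = confidence limits, p = predicted values, r = residual analysis */
  model price=weight / clb p r;
  /* save the predicted values and residuals in the dataset "diag" */
  output out=diag p=pred r=resid;
  /* label each row of the fitted-values output with its weight */
  id weight;
run;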
In the above program, “proc reg” invokes a linear regression procedure in SAS.
The “clb” option generates a confidence interval for the slope and intercept. The “p”
option generates fitted values and standard errors. The “r” option performs a residual
analysis (i.e., checks assumptions). The “output out” statement generates a new
dataset called “diag” containing the residuals and the predicted/adjusted values. The
“id weight” statement adds the specified variable to the fitted values output.
Part of the results is shown in Table 2.3. The estimated parameters, obtained from
“proc reg,” are shown below:
Note that the estimated parameters are all statistically significantly different from
zero. Then, the linear predictor takes the form:
If the model does not fit the data well, then the normal distribution
may poorly represent the response distribution (that is, it would weakly explain the
variability of the data), the "identity" may not be the best link
function, the linear predictor may not include all the relevant information, or
some combination of these three components of the GLM may be inadequate. Although
other fit statistics exist in the linear regression model, such as the coefficient of
determination (R²), residual analysis is used to determine whether the model fits
well and whether the assumptions of a Gaussian model are met. In this example,
R² = 0.9783, which indicates that the model explains
97.83% of the total variability of the dataset. In Fig. 2.1, we can see that the simple
linear regression model provides a good fit to this dataset.
Fig. 2.1 A dot plot of price vs. weight (carat) and fitted model
Logistic regression and other binomial response models are widely used in research
areas such as the biological sciences and agriculture. Given their importance, some
relevant features of these models are described in this section.
Let $y_i$ be the observed response, with p explanatory variables x1, x2, ⋯, xp, and
suppose that $y_i$ is binomial with $n_i$ independent Bernoulli trials and probability
of success $\pi_i$ on each trial, i.e.,
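$$y_i \sim \mathrm{Binomial}(n_i, \pi_i)$$
Then, we can model the response using a GLM with a binomial response. The
linear predictor in this case is
$$\log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi}$$
Here,
$$\operatorname{logit}(\pi_i) = \log\left(\frac{\pi_i}{1-\pi_i}\right)$$
is the logit function, which models the logarithm of the odds, $\pi_i/(1-\pi_i)$, as a
function of the predictor variables. The components of this GLM for binomial data are:
Distribution: $y_i \sim \mathrm{Binomial}(n_i, \pi_i)$
Linear predictor: $\eta_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi}$
Link function: $\eta_i = \operatorname{logit}(\pi_i) = \log\left(\frac{\pi_i}{1-\pi_i}\right)$ (logit link)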
Another useful link function for experimental data is the “probit” link
$\eta_i = \Phi^{-1}(\pi_i)$, which was mentioned before.
The basic GLM for this dataset, under the probit link, is almost identical to that
under the logit link, as seen below:
Distribution: $y_i \sim \mathrm{Binomial}(n_i, \pi_i)$
Linear predictor: $\eta_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi}$
Link function: $\eta_i = \Phi^{-1}(\pi_i)$ (probit link)
where $\eta_{ij}$ is the linear predictor and $\beta_0$, $\beta_1$, and $\beta_2$ are the
parameters to be estimated. In this GLM, the link function is
$$\eta_{ij} = \operatorname{logit}(\pi_{ij}) = \log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)$$
and the probability in the interval (0, 1) is computed through the inverse of the link
function
$$\pi_{ij} = \frac{1}{1+\exp(-\eta_{ij})} = g^{-1}(\eta_{ij})$$
This last expression allows us to estimate the probability of germination ($\pi_{ij}$)
under different temperature conditions (°C) and time periods (days). Note that the
nonlinear relationship between the response probability $\pi_{ij}$ and the linear
predictor $\eta_{ij}$ is modeled by the inverse of the link function. In this particular
case, the link function is the logit:
$$\eta_{ij} = \log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) = g(\pi_{ij})$$
To illustrate this example, a dataset was simulated using the values
$\beta_0 = -8$, $\beta_1 = 0.19$, and $\beta_2 = 0.37$ in the linear predictor and the
inverse of the link function, varying the temperature from 0 to 40 °C and time from 0 to
15 days, i.e.,
$$\hat{\pi}_{ij} = \frac{1}{1 + e^{-(-8 + 0.19 \times \mathrm{Temp}_i + 0.37 \times \mathrm{Day}_j)}} = \frac{1}{1 + e^{\,8 - 0.19 \times \mathrm{Temp}_i - 0.37 \times \mathrm{Day}_j}}$$
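As a quick numeric illustration of the inverse link, the following minimal Python sketch evaluates the simulated germination probability for a few temperature and day combinations. The function name and the chosen combinations are hypothetical; only the exponent 8 - 0.19 × Temp - 0.37 × Day is taken from the expression above.

```python
import math


def germination_probability(temp_c, day):
    """Inverse logit applied to the simulated linear predictor from the text."""
    # Exponent taken directly from the expression above: 8 - 0.19*Temp - 0.37*Day
    exponent = 8.0 - 0.19 * temp_c - 0.37 * day
    return 1.0 / (1.0 + math.exp(exponent))


# Hypothetical temperature/day combinations inside the simulated ranges
for temp_c, day in [(10, 5), (20, 10), (30, 15)]:
    p = germination_probability(temp_c, day)
    print(f"Temp = {temp_c} C, Day = {day}: probability = {p:.4f}")
```

The probability rises with both temperature and time, which is the nonlinear (S-shaped) behavior produced by the inverse of the logit link.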
Probit Regression
proc glimmix data=germ;
model p = t d/solution dist=binomial link=probit cl;
output out=probitout pred(noblup ilink)=predicted resid=residual;
run;
“proc GLIMMIX” in SAS fits such models without transforming the response variable
directly. Instead, GLIMMIX applies a link function to the mean of the response, and
the linked mean is modeled as having a linear relationship with the explanatory
variables. The “model” statement specifies the response variable p as a function of
the explanatory variables t and d, which define Xβ. The “solution” option in the
model statement requests a listing of the fixed effects parameter estimates of the
model (β0, β1, and β2). The “dist” option specifies the distribution of the response
variable, and the “link” option specifies the link function.
To get predicted probability values for each observation, the “output” statement in
proc GLIMMIX is used. Two types of predicted values can be obtained with the
“output” statement. The first type includes the solution for the random effects (best
linear unbiased predictors (BLUPs)) in the linearized model, and the second type is
based on the fixed effects only (best linear unbiased estimators (BLUEs)); the latter
is requested with “pred(noblup ilink) = predicted.” The “ilink” suboption applies the
inverse link function to the predicted values, so that the predictions are
probabilities, which are stored in the variable named predicted. Finally, the “resid”
option requests the residuals of the regression, which are stored in the variable
named residual.
Table 2.4 shows part of the output of the regression procedure using the logit link
function: the analysis of variance (part (a)) and the estimation and significance of
the fixed effects (part (b)).
Table 2.4 Estimation and significance of fixed effects using the logit link function
(a) Type III tests of fixed effects
Effect      Num DF   Den DF   F-value   Pr > F
T           1        2508     551.28    <0.0001
D           1        2508     407.19    <0.0001
(b) Parameter estimates
Effect      Estimate   Standard error   DF     t-value   Pr > |t|
Intercept   -8.0000    0.3189           2508   -25.08    <0.0001
T           0.1900     0.008092         2508   23.48     <0.0001
D           0.3700     0.01834          2508   20.18     <0.0001
Table 2.5 Parameter estimates, linear predictor, and probability of linear, logit, and probit models
Link function   Parameter   Estimated value   η*      π*
Linear          β0          -0.417            1.149   0.873
                β1          0.022
                β2          0.0412
Logit           β0          -8.00             5.95    0.962
                β1          0.190
                β2          0.370
Probit          β0          -4.483            3.362   0.965
                β1          0.106
                β2          0.207
*The linear predictor η and the probability π were estimated using D = 15 and T = 30
Fig. 2.2 (a, b) Probability of seed germination as a function of temperature and day
Table 2.5 presents the parameter estimates of the linear predictor for the linear
(identity link), logit, and probit models. The probabilities estimated by the probit
and logit models are almost identical to each other, but those of the linear
probability model are different. This is because the data were generated from a
binomial distribution, and the linear predictor estimated under the identity link
differs substantially from the linear predictors under the probit and logit links.
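The closeness of the logit and probit probabilities can be checked by hand. The sketch below plugs the rounded estimates from Table 2.5 into each inverse link at T = 30 and D = 15; the variable names are illustrative, and because the tabulated coefficients are rounded, the results may differ slightly in the third decimal from the values printed in the table.

```python
import math
from statistics import NormalDist

T, D = 30, 15  # temperature (C) and day from the footnote of Table 2.5

# Logit model: probability is the inverse logit of the linear predictor
eta_logit = -8.00 + 0.190 * T + 0.370 * D
pi_logit = 1.0 / (1.0 + math.exp(-eta_logit))

# Probit model: probability is the standard normal CDF of the linear predictor
eta_probit = -4.483 + 0.106 * T + 0.207 * D
pi_probit = NormalDist().cdf(eta_probit)

print(f"logit : pi = {pi_logit:.4f}")
print(f"probit: pi = {pi_probit:.4f}")
```

Both links give a germination probability of roughly 0.96 at these settings, in line with the statement that the two models are almost identical.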
In Fig. 2.2a, b, we observe that between 3 and 7 days and between 0 and 15 °C, there
is approximately 20% seed germination, but as both factors increase, the germination
percentage also increases substantially.
For a linear model, a plot of the predicted values against the residuals is probably the
simplest way to decide whether the model used provides a good fit to the data; but,
for a GLM, we must decide on the appropriate scale to use for the fitted values.
Fig. 2.3 Predicted vs. residual values using the logit link
Generally, it is better to use linear predictors η in the plot rather than the predicted
responses μ. If there is no linear relationship between the linear predictors and the
residuals, then it could indicate a lack of fit in the model. For a linear model, we
could perform a transformation of the response variable, but this is not highly
recommended for a GLM as this could change the response distribution. Another
alternative would be to change the link function, but since there are not many link
functions that allow interpreting a model easily, this is not a good option. Moreover,
changing the linear predictor or transforming the predictor variables would not be the
best way to go.
Figures 2.3, 2.4, and 2.5 show the linear predictor versus residual (we can also see
the predicted value versus the residual). By investigating the nature of the relation-
ship between the predictors and the residuals in Fig. 2.3, we can see that there is a
linear relationship between the predictor and the residual, using the logit function,
whereas the probit and identity functions do not show this linear relationship.
However, with the probit link function, we observe a curvilinear relationship
between the predictor and the residual, which may be because homogeneity of
variance is not satisfied under this link function. Therefore, the logit link is shown
to be the best choice.
Example 2 Fruit flies can be a year-round problem in fruit-growing areas in many
regions of the world, such as Mexico, and are most common in late summer and fall
because ripe or fermented fruits and vegetables attract the insects by serving as
natural hosts. If these insects are not controlled, economic losses in fruit-growing
areas could be large and devastating to the producers. In response to this,
entomologists have implemented experiments to help mitigate the damages caused
by these insects. One such experiment attempted to establish the relationship
between the concentration of a toxic agent (nicotine) for 5 hours and the number
Fig. 2.4 Predicted values vs. residuals using the probit link
Fig. 2.5 Predicted vs. residual values using the identity link
of insects killed (common fruit fly); the data are shown in Table 2.6, and, for more
information, see the study by Myers et al. (2002).
The number of dead insects can be modeled under a binomial distribution (n, π).
Let yi denote the number of dead insects at a concentration i. The GLM components
for this dataset are:
Table 2.6 Relationship between the concentration of a toxic agent and the number of fruit flies killed
Concentration (g/100 cc)   Number of insects (n)   Dead insects (y)   Proportion of dead insects
0.10                       47                      8                  0.17
0.15                       53                      14                 0.264
0.20                       55                      24                 0.436
0.30                       52                      32                 0.615
0.50                       46                      38                 0.826
0.70                       54                      50                 0.926
0.95                       52                      50                 0.962
Note that we are using conci to denote the independent variable nicotine toxicant
concentration. The following SAS code allows us to perform a binomial regression
for the fruit fly dataset:
data fly;
input conc n y;
datalines;
0.1 47 8
0.15 53 14
0.2 55 24
0.3 52 32
0.5 46 38
0.7 54 50
0.95 52 50
;
proc glimmix data=fly nobound;
model y/n = conc/dist=binomial link=logit solution;
run;
Table 2.7 Results of the analysis of variance with the logit link
(a) Type III tests of fixed effects
Effect      Num DF   Den DF   F-value   Pr > F
Conc        1        5        71.94     0.0004
(b) Parameter estimates
Effect      Estimate   Standard error   DF   t-value   Pr > |t|
Intercept   -1.7361    0.2420           5    -7.17     0.0008
Conc        6.2954     0.7422           5    8.48      0.0004
Therefore, with the logistic regression model, we can estimate the probability that
an insect dies when exposed to a certain concentration i of nicotine using the
following expression:
$$\pi(\mathrm{conc}_i) = \frac{e^{\eta_i}}{1 + e^{\eta_i}} = \frac{e^{-1.7361 + 6.2954 \times \mathrm{conc}_i}}{1 + e^{-1.7361 + 6.2954 \times \mathrm{conc}_i}}$$
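To see how this expression behaves, the following Python sketch evaluates the fitted probability at each concentration in Table 2.6 and prints it next to the observed proportion of dead insects. The estimates come from Table 2.7; the function and variable names are illustrative only.

```python
import math

B0, B1 = -1.7361, 6.2954  # intercept and slope estimates from Table 2.7


def death_probability(conc):
    """Fitted probability of death at a given nicotine concentration."""
    eta = B0 + B1 * conc                       # linear predictor
    return math.exp(eta) / (1.0 + math.exp(eta))  # inverse logit


# (concentration, insects exposed, dead insects) from Table 2.6
fly_data = [
    (0.10, 47, 8), (0.15, 53, 14), (0.20, 55, 24), (0.30, 52, 32),
    (0.50, 46, 38), (0.70, 54, 50), (0.95, 52, 50),
]

for conc, n, y in fly_data:
    print(f"conc = {conc:.2f}: observed = {y / n:.3f}, fitted = {death_probability(conc):.3f}")
```

Comparing the two columns gives an informal check of how closely the fitted logistic curve follows the observed proportions.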
certain disease in a population over a period of time, (2) the number of insects
surviving after the application of an insecticide over time, (3) the number of dead fish
found per cubic kilometer due to a certain pollutant, (4) the number of sick animals
occurring in a given month in a given country, and so on. The Poisson probability
distribution is perhaps the most widely used for modeling count-type response
variables. As λ (the average count) increases, the Poisson distribution becomes more
symmetric and eventually approaches a normal distribution.
The Poisson likelihood function is appropriate for nonnegative integer data and
this process assumes that events occur randomly over time, so the following
conditions must be met:
(a) The probability of at least one occurrence of an event in a given time interval is
proportional to the length of the interval.
(b) The probability of two or more occurrences of an event within an extremely
small interval is negligible.
(c) The number of occurrences of an event in disjoint time intervals are mutually
independent.
The probability distribution of a Poisson random variable “y,” which represents
the number of events occurring in a given time interval or in a given region of
space, is given by the expression
$$P(y = k) = \frac{e^{-\lambda}\lambda^{k}}{k!}, \quad \lambda > 0, \; k = 0, 1, 2, \cdots$$
where λ is the average number of events (the average count) in a time or space
interval. The mean and variance of this distribution are the same, that is,
$$E(y) = \mathrm{Var}(y) = \lambda$$
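The equality of the mean and the variance can be verified numerically. The sketch below codes the Poisson probability function and sums over a truncated range of counts; the value λ = 3 and the truncation point are arbitrary choices made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(y = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 3.0          # hypothetical average count
ks = range(0, 60)  # truncation point; the remaining tail mass is negligible here

total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```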
Poisson regression is a GLM and is appropriate for analyzing count data
or contingency tables. A Poisson regression assumes that the response variable “y”
has a Poisson distribution and that the logarithm of its expected value can be
modeled by a linear combination of unknown parameters and independent variables.
As in a standard linear regression, the predictors x1, x2, ⋯, xP, weighted by their
coefficients, are summed to form the linear predictor,
$$\eta_i = \beta_0 + \sum_{p=1}^{P} x_{pi}\beta_p$$
where β0 is the intercept and βp is the slope of the covariate xp ( p = 1, ⋯, P). Thus,
the expected value of yi and the linear predictor ηi are related through the link
function. The components of a GLM with a Poisson response (yi ~ Poisson(λi)),
where λi is the expected value of yi, are as follows:
For the purposes of implementation, we use days to denote elapsed days and
students to denote infected students. We can employ the Poisson regression model
using GLIMMIX in SAS, as shown below:
The “proc GLIMMIX” statement invokes the SAS generalized linear mixed
model (GLMM) procedure. The “model” statement specifies the response variable
and the predictor variable, whereas the “solution” option in the model statement
requests a listing of the fixed effects parameter estimates. The “dist = poisson”
option specifies the distribution of the data, and the “link = log” option declares the
link function to be used in the model. The default estimation technique in generalized
linear mixed models is restricted pseudo-likelihood (the “RSPL” method); in
this example, we use “method = laplace.” The “output” statement creates a dataset
containing predicted values and diagnostic residuals, calculated after fitting the
model. By default, all variables in the original dataset are included in the output
dataset, and the “out = sal_infection” option specifies the name of the output
dataset. The “pred(noblup ilink) = predicted” option calculates the predicted values
without taking into account the random effects of the model, and “ilink” calculates
the predicted values on the scale of the data. Finally, the “resid = residual”
option calculates the residuals.
The likelihood of a GLMM involves an integral, which, in general,
cannot be calculated explicitly. “GLIMMIX,” by default, uses the RSPL method, but
it also offers other options such as the quadrature and Laplace integration
methods, among others. These integral approximation methods approximate the
likelihood function of a GLMM, and the approximated function is then optimized
numerically. These methods provide an actual objective function for optimization.
For more details, see the SAS manual. However, in a GLM, this approximation
of the integral is not necessary, since there are no random effects and the
likelihood can be evaluated exactly to estimate the parameters. The results of this
analysis are shown below (Table 2.8).
The fit statistics in part (a) (“Fit statistics”) indicate the goodness of fit of the
model; these statistics are very useful when we are comparing different models to
find the best model for the data. In this case,
the value of the generalized chi-squared statistic divided by its degrees of freedom
Table 2.8 Results of the analysis of variance
(a) Fit statistics (Akaike’s information criterion (AIC), a small sample bias corrected
Akaike’s information criterion (AICC), Bozdogan Akaike’s information criterion (CAIC),
Schwarz’s Bayesian information criterion (BIC), Hannan and Quinn information criterion (HQIC))
-2 Log likelihood           389.11
AIC (smaller is better)     393.11
AICC (smaller is better)    393.22
BIC (smaller is better)     398.49
CAIC (smaller is better)    400.49
HQIC (smaller is better)    395.29
Pearson’s chi-square        84.95
Pearson’s chi-square / DF   0.78
(b) Type III tests of fixed effects
Effect      Num DF   Den DF   F-value   Pr > F
Days        1                 102.28    <0.0001
(c) Parameter estimates
Effect      Estimate   Standard error   DF   t-value   Pr > |t|
Intercept   1.9902     0.08394               23.71     <0.0001
Days        -0.01746   0.001727              -10.11    <0.0001
is close to 1. This indicates that the variability of these data has been reasonably
modeled and that there is no residual overdispersion. The value of the generalized
chi-squared statistic divided by its degrees of freedom (Pearson’s chi-square/DF)
is the experimental error of the analysis.
The “Type III tests of fixed effects” (part (b)) and the solution for the intercept
and the days effect (“Parameter estimates,” part (c)) are shown in Table 2.8. The
negative coefficient of the covariate days indicates that as the number of days
increases, the average number of students diagnosed with the disease decreases.
That is, we reject the null hypothesis (P < 0.0001) that the expected number of
infected students stays the same as the number of days increases.
We see that with a 1-day increase in the infection period, the expected
(or average) number of students diagnosed with the disease decreases by a factor
of $e^{-0.01746} = 0.9827$.
The estimated linear predictor for this GLM is:
$$\hat{\eta}_i = 1.9902 - 0.01746 \times \mathrm{days}_i$$
For example, we can calculate the probability of diagnosing “k = 2” infected
students in a period of 2 days, i.e., “Days = 2,” as follows:
$$P(Y_i = k) = \frac{\exp(-\lambda_i)\,\lambda_i^{k}}{k!}$$
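This calculation can be sketched in a few lines of Python using the parameter estimates from Table 2.8 (intercept 1.9902 and slope -0.01746). The function names are illustrative, and the printed values depend on the rounding of the tabulated estimates.

```python
import math

B0, B1 = 1.9902, -0.01746  # estimates from Table 2.8


def expected_count(days):
    """Expected number of infected students: inverse of the log link."""
    return math.exp(B0 + B1 * days)


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


rate_ratio = math.exp(B1)  # multiplicative change in the mean per extra day
lam_2 = expected_count(2)  # expected count when Days = 2
p_2 = poisson_pmf(2, lam_2)

print(f"rate ratio per day      = {rate_ratio:.4f}")
print(f"expected count, Days=2  = {lam_2:.3f}")
print(f"P(Y = 2 | Days = 2)     = {p_2:.4f}")
```

The rate ratio reproduces the factor 0.9827 quoted above for a 1-day increase.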
Distribution: $y_i \sim \mathrm{Poisson}(\lambda_i)$
Linear predictor: $\eta_i = \beta_0 + \beta_1 \times \mathrm{Age}_i + \beta_2 \times \mathrm{height}_i$
Link function: $\eta_i = \log(\lambda_i) = g(\lambda_i)$ (log link)
For this example, a dataset was simulated using the following parameter values:
β0 = - 2, β1 = - 0.03, and β2 = - 0.04. In addition, in order to obtain the linear
predictor, the variable age (years) varied from 0 to 50 and height (meters) from 0 to
30, both in increments of one unit. Thus, the values of yij were simulated with the
following expression:
Fig. 2.9 (a, b) Probability of tree infection as a function of tree height and age in years
In Fig. 2.9a, b, we can see that at a young age, between 1 and 10 years, and at a
height of no more than 10 meters, trees are more susceptible to infection by the
virus. However, as their age increases, trees show greater resistance.
The following SAS code fits a Poisson regression model with two predictor
variables, assuming that there is no interaction between the two explanatory
variables.
In Table 2.9 part (a), the analysis of variance shows that age and tree height are
highly significant (P < 0.0001), indicating that both variables help explain the
infection mechanism of the trees through a Poisson model.
The estimated linear predictor for this GLM, with a Poisson distribution for the
response variable, is:
$$\hat{\eta}_i = -2 - 0.03 \times \mathrm{Age}_i - 0.04 \times \mathrm{Height}_i$$
Table 2.9 Part of the results of the analysis of variance under a Poisson distribution
(a) Type III tests of fixed effects
Effect      Num DF   Den DF   F-value   Pr > F
Age         1        6158     43.20     <0.0001
Height      1        6158     29.10     <0.0001
(b) Parameter estimates
Effect      Estimate   Standard error   DF     t-value   Pr > |t|
Intercept   -2.0000    0.1388           6158   -14.41    <0.0001
Age         -0.03000   0.004564         6158   -6.57     <0.0001
Height      -0.04000   0.007415         6158   -5.39     <0.0001
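$$P(Y_i = k) = \frac{\exp(-\lambda_i)\,\lambda_i^{k}}{k!}$$
$$\hat{P}(Y_i = 3) = \frac{\exp\left(-\exp[-2 - 0.03 \times \mathrm{Age} - 0.04 \times \mathrm{Height}]\right)\left(\exp[-2 - 0.03 \times \mathrm{Age} - 0.04 \times \mathrm{Height}]\right)^{3}}{3!}$$
$$= \frac{\exp\left(-\exp[-2 - 0.03 \times 2 - 0.04 \times 3]\right)\left(\exp[-2 - 0.03 \times 2 - 0.04 \times 3]\right)^{3}}{3!} = 0.000215$$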
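The value 0.000215 can be reproduced with a short Python sketch that follows the same two steps: the inverse log link gives the expected count, and the Poisson probability function is then evaluated at k = 3. The function name is illustrative; the coefficients are those of Table 2.9.

```python
import math


def infection_probability(k, age, height):
    """P(Y = k) under the fitted Poisson model with a log link."""
    eta = -2.0 - 0.03 * age - 0.04 * height  # linear predictor (Table 2.9)
    lam = math.exp(eta)                       # expected count
    return math.exp(-lam) * lam**k / math.factorial(k)


p3 = infection_probability(3, age=2, height=3)
print(f"P(Y = 3 | Age = 2, Height = 3) = {p3:.6f}")
```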
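$$f(y) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, y^{\alpha-1} e^{-y/\beta}; \quad y, \alpha, \beta > 0$$
where Γ(∙) is the gamma function, α is the shape parameter, and β is the scale
parameter. A gamma regression uses the input variables X and their coefficients to
make a prediction about the mean of “y,” but it actually focuses more of its attention
on the scale parameter β. The mean and variance of a gamma random variable are:
$$E(y) = \alpha\beta, \quad \mathrm{Var}(y) = \alpha\beta^{2}$$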
The gamma probability density function can be rewritten in terms of the mean
μ = αβ and the shape parameter α as follows:
$$f(y) = \frac{1}{\Gamma(\alpha)\,y}\left(\frac{y\alpha}{\mu}\right)^{\alpha}\exp\left(-\frac{y\alpha}{\mu}\right), \quad y > 0$$
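A quick way to confirm that the two forms of the density describe the same distribution is to evaluate both at a few points. The following Python sketch does this under the assumption μ = αβ; the parameter values and function names are arbitrary illustrations.

```python
import math


def gamma_pdf_shape_scale(y, alpha, beta):
    """Gamma density written with shape alpha and scale beta."""
    return y ** (alpha - 1) * math.exp(-y / beta) / (math.gamma(alpha) * beta**alpha)


def gamma_pdf_mean_shape(y, mu, alpha):
    """Same density written with mean mu and shape alpha."""
    z = y * alpha / mu
    return z**alpha * math.exp(-z) / (math.gamma(alpha) * y)


alpha, beta = 2.0, 1.5  # hypothetical shape and scale
mu = alpha * beta       # mean implied by the shape-scale form

for y in (0.5, 1.0, 4.0):
    a = gamma_pdf_shape_scale(y, alpha, beta)
    b = gamma_pdf_mean_shape(y, mu, alpha)
    print(f"y = {y}: shape-scale = {a:.6f}, mean-shape = {b:.6f}")
```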
Figure 2.10 plots the gamma distribution for three different values of the shape
parameter, α = 0.75, 1, and 2; the mean μ has only a multiplicative (scaling) effect.
In the first panel (α = 0.75), we see that the density is infinite at 0; in the
second panel (α = 1), it corresponds to the exponential density; and, in the third
panel (α = 2), we see a skewed distribution.
A gamma distribution can arise in different ways. The sum of “n” independent
and identically distributed exponential random variables with scale parameter β has a
gamma distribution with parameters (n, β). The chi-squared distribution χ2 with ν
degrees of freedom is a special case of a gamma distribution with shape α = ν/2 and
scale β = 2.
Theoretically, a gamma distribution is a natural choice when the response
variable takes positive real values (from zero to infinity) and a fixed relationship
between the mean and the variance is suspected. If we expect the values of “y” to be
small, then we should expect a small amount of variability in the observed values.
Conversely, if we expect large values of “y,” then we should expect to observe a lot
of variability.
Gamma models with multiplicative covariate effects, such as a gamma response with
the log link function, are well suited for modeling nonnegative, right-skewed
continuous responses. Whether the data are modeled with an inverse or a logarithmic
link function will depend on whether the rate of change or the logarithm of the rate
of change is the more meaningful measure. For example, in studies of yield density,
it is commonly assumed that yield per plant is inversely proportional to plant
density (Shinozaki and Kira 1956), so that the mean response is the inverse of the
linear predictor $\eta_i = \beta_0 + \beta_1 x_i$:
$$\mu_i = \eta_i^{-1} = (\beta_0 + \beta_1 x_i)^{-1}$$
data coagu;
input num conc y;
datalines;
1 5 118
2 10 58
3 15 42
4 20 35
5 30 27
6 40 25
7 60 21
8 80 19
9 100 18
;
proc glimmix data = coagu;
model y = conc / dist=gamma link=power(-1) solution;
output out=salgamm1 pred(noblup ilink)=predicted resid=residual;
run;
Most of the syntax has already been described in the previous examples; the only
new one is the link = power(-1) option. This statement invokes the inverse
link function in the GLIMMIX procedure.
Some of the output from this analysis is shown in Table 2.10.
Part (a) of Table 2.10 shows that the dilution percentage of the blood plasma
concentration significantly affects the clotting time (P = 0.0004). The values for
constructing the fitted linear predictor are tabulated in part (b) of Table 2.10.
Table 2.10 Results of the regression analysis under a gamma distribution
(a) Type III tests of fixed effects
Effect      Num DF   Den DF   F-value   Pr > F
Conc        1                 41.01     0.0004
(b) Parameter estimates
Effect      Estimate   Standard error   DF   t-value   Pr > |t|
Intercept   0.008686   0.002294              3.79      0.0068
Conc        0.000658   0.000103              6.40      0.0004
Scale       0.05213    0.02436          .    .         .
$$y = \mu = \frac{1}{0.008686 + 0.000658 \times \mathrm{conc}} = \frac{1}{0.008686 + 0.000658 \times 10} = 65.505$$
$$\mathrm{Var}(y) = \frac{\mu^{2}}{\alpha} = \frac{65.505^{2}}{0.052} = 85818.215$$
The average time it takes for blood to clot – when a thromboplastin concentration
of 10% is added – is 65.505 seconds with a variance of 85818.215.
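The fitted mean can be reproduced for every concentration in the dataset with the inverse (power(-1)) link. The sketch below uses the estimates from Table 2.10 and the data lines of the SAS program; the function name is illustrative, and the fitted values depend on the rounding of the tabulated estimates.

```python
B0, B1 = 0.008686, 0.000658  # intercept and slope estimates from Table 2.10


def mean_clotting_time(conc):
    """Fitted mean clotting time: inverse of the linear predictor."""
    return 1.0 / (B0 + B1 * conc)


# (concentration in %, observed clotting time) from the data lines above
clotting = [(5, 118), (10, 58), (15, 42), (20, 35), (30, 27),
            (40, 25), (60, 21), (80, 19), (100, 18)]

for conc, y in clotting:
    print(f"conc = {conc:3d}: observed = {y:3d}, fitted mean = {mean_clotting_time(conc):7.3f}")
```

At a concentration of 10%, the fitted mean is the 65.505 seconds reported above.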
Selecting, from a set of candidate models, the one that provides the best fit and
explains most of the variability in the data is a necessary but complex task. This
process involves trying to minimize information loss. From the field of information
theory, several information criteria have been proposed to quantify information, or
the expected value of information, and, among these, the most widely used are the
Akaike information criterion (AIC) (Akaike 1973, 1974) and the Bayesian information
criterion (BIC) (Schwarz 1978). Both AIC and BIC are based on the ML
estimates of the model parameters. In a regression fit, the estimates of the β’s under
the ordinary least squares method and the ML method are identical. The difference
between the two methods comes from estimating the common variance σ 2 of the normal
distribution of the errors around the true mean.
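As an illustration of how these criteria are built from the log likelihood, the following Python sketch applies commonly used definitions to the -2 log likelihood reported in Table 2.8 (389.11). The number of parameters k = 2 (intercept and slope) follows from that table; the sample size n = 109 is an assumption, chosen here because it reproduces the tabulated values, and the function name is hypothetical.

```python
import math


def information_criteria(neg2_loglik, k, n):
    """Common information criteria from -2 log L, k parameters, and n observations."""
    return {
        "AIC": neg2_loglik + 2 * k,
        "AICC": neg2_loglik + 2 * k * n / (n - k - 1),
        "BIC": neg2_loglik + k * math.log(n),
        "CAIC": neg2_loglik + k * (math.log(n) + 1),
        "HQIC": neg2_loglik + 2 * k * math.log(math.log(n)),
    }


# -2 log likelihood from Table 2.8; k = 2 parameters; n = 109 is an assumed sample size
criteria = information_criteria(389.11, k=2, n=109)
for name, value in criteria.items():
    print(f"{name:5s} = {value:.2f}")
```

Every criterion adds a penalty to -2 log L that grows with the number of parameters, so a more complex model is preferred only when it improves the likelihood enough to offset the penalty.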
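To get an idea of how to use these fit statistics, let us compare three
candidate models for the effect of the plasma dilution percentage:
Model 1: $\eta_i = \beta_0 + \beta_1 \times \mathrm{conc}_i$
Model 2: $\eta_i = \beta_0 + \beta_1 \times \mathrm{conc}_i + \beta_2 \times \mathrm{conc}_i^{2}$
Model 3: $\eta_i = \beta_0 + \beta_1 \times \mathrm{conc}_i + \beta_2 \times \mathrm{conc}_i^{2} + \beta_3 \times \mathrm{conc}_i^{3}$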
Since the proposed models have a gamma error structure, the commonly used fit
statistic (R2) in a simple linear regression model is not reported. Part of the results of
this analysis is shown below with various metrics as goodness-of-fit measures:
With regard to the values of the goodness-of-fit metrics (Table 2.11 part (a)), the
smaller they are, the better the fit. On this basis, the quality of the fit of the
three regression models improved as the degree of the polynomial in the linear
predictor increased. That is, model three best explained the variability of the plasma
clotting time. The Type III tests of fixed effects and the estimated parameters under
model three are tabulated in parts (b) and (c) of Table 2.11, respectively.
Fig. 2.12 Fitting the gamma regression model with three predictors
Parameter estimates under the linear predictor with linear, quadratic, and cubic
effects are highly significant. The results suggest that a cubic effect for the percent-
age dilution in plasma concentration in the linear predictor is more efficient in
explaining the clotting time than taking only a linear predictor with only linear or
both linear and quadratic effects (Fig. 2.12).
Studies in various areas of knowledge, including agriculture, often face the need to
explain a variable expressed as a proportion, percentage, rate, or fraction in the
continuous range (0,1). In economics, for example, the factors that influence the
proportion of households that do not have a cement floor have been studied.
Similarly, in plant breeding, it is desired to investigate the factors that influence
the proportion of plant leaves damaged by a certain disease. In parallel, the propor-
tion of impurities in chemical compounds is of everyday interest in physics and
chemistry. Studies on electoral preferences analyze citizen participation rates
and the variables that can explain them, whereas, in the field of education and
academic performance, researchers try to explain the proportion of success in
standardized tests. Such models are also used to identify the factors associated with
the proportion of credit used by credit card users. The public health field has also
been confronted with the need to model the proportion of coverage in health programs
in order to identify the sociodemographic and economic characteristics associated with
whether a woman is covered. Johnson et al. (1995) presented the properties of the probability
distribution of this type of variable; these researchers showed that the beta distribu-
tion can be used to model proportions, since its density can take different forms
depending on the values of the two shape parameters that index the distribution.
However, beta regression, which results from using the beta distribution for the
response variable in the context of generalized linear models, is not very well known,
but its use is increasing every day, thanks to user-friendly software that allows its
implementation in an extremely easy manner.
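Definition Let Y be a continuous random variable defined on the interval (0, 1) and
let α, β > 0. Then, Y has a beta distribution with shape parameters α and β if and
only if its density is
$$f_Y(y) = \frac{1}{B(\alpha, \beta)}\, y^{\alpha-1}(1-y)^{\beta-1}, \quad 0 < y < 1$$
where B(α, β) is the beta function. Its mean and variance are
$$E(Y) = \frac{\alpha}{\alpha+\beta} \quad \text{and} \quad \mathrm{Var}(Y) = \frac{\alpha\beta}{(\alpha+\beta+1)(\alpha+\beta)^{2}}.$$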
In the context of regression analysis, the density of the beta distribution provided
above is not very useful for modeling the mean of the response. Therefore, this
density is reparametrized so that it contains a precision (or dispersion) parameter.
This reparameterization consists of defining $\mu = \alpha/(\alpha+\beta)$ and
$\phi = \alpha + \beta$, i.e., $\alpha = \mu\phi$ and $\beta = (1-\mu)\phi$, which means that:
$$E(y) = \mu$$
and
$$\mathrm{Var}(y) = \frac{\mu(1-\mu)}{1+\phi}$$
So, μ is the mean of the response variable and ϕ can be interpreted as a parameter
of precision in the sense that, for a fixed μ, the higher the value of ϕ, the smaller the
variance of y. The density function of y can be written as:
$$f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\, y^{\mu\phi-1}(1-y)^{(1-\mu)\phi-1}, \quad 0 < y < 1$$
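The reparameterization can be checked numerically: converting a mean and precision back to the two shape parameters and applying the original mean and variance formulas should return μ and μ(1 - μ)/(1 + ϕ). The Python sketch below does this for arbitrary illustrative values of μ and ϕ; the function names are hypothetical.

```python
import math


def beta_mean_var(alpha, beta):
    """Mean and variance of a beta distribution with shape parameters alpha, beta."""
    s = alpha + beta
    return alpha / s, alpha * beta / ((s + 1) * s**2)


def shapes_from_mean_precision(mu, phi):
    """Map the mean mu and precision phi back to the shape parameters."""
    return mu * phi, (1 - mu) * phi


mu, phi = 0.3, 10.0  # hypothetical mean and precision
alpha, beta = shapes_from_mean_precision(mu, phi)
mean, var = beta_mean_var(alpha, beta)

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
print(f"mean = {mean:.4f}, variance = {var:.5f}")
print(f"mu(1 - mu)/(1 + phi) = {mu * (1 - mu) / (1 + phi):.5f}")
```

Increasing ϕ while holding μ fixed shrinks the variance, which is why ϕ is read as a precision parameter.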
$$g(\mu_i) = \sum_{j=1}^{k} x_{ij}\beta_j = \eta_i$$
where β1, β2, . . ., βk are unknown regression parameters and xij ( j = 1, . . ., k) are
the values of the k covariates (k < n), which are fixed and known. Finally, g(∙) is a
strictly monotone and differentiable link function that maps the interval (0, 1) onto
the real numbers.
There are several possible options for the link function g(∙). For example, we can
use the logit link function $g(\mu) = \log\{\mu/(1-\mu)\}$, which is considered the most
popular and asymptotically efficient, but it is also feasible to use the probit
function $g(\mu) = \Phi^{-1}(\mu)$, where Φ(∙) is the cumulative distribution function of
a standard normal random variable, and the complementary log-log link function
$g(\mu) = \log\{-\log(1-\mu)\}$, among others (McCullagh and Nelder 1989).
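The three link functions and their inverses are simple to code, which makes it easy to see how each one maps a mean in (0, 1) to the real line and back. The following Python sketch is only an illustration; the function names are hypothetical and the probit uses the standard normal distribution from the Python standard library.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def logit(mu):
    return math.log(mu / (1 - mu))


def probit(mu):
    return _STD_NORMAL.inv_cdf(mu)


def cloglog(mu):
    return math.log(-math.log(1 - mu))


def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))


def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)


def inv_cloglog(eta):
    return 1 - math.exp(-math.exp(eta))


for mu in (0.1, 0.5, 0.9):
    print(f"mu = {mu}: logit = {logit(mu):+.4f}, "
          f"probit = {probit(mu):+.4f}, cloglog = {cloglog(mu):+.4f}")
```

The logit and probit links are symmetric around μ = 0.5, where both equal zero, whereas the complementary log-log link is not.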
Example 1 The objective of this experiment was to evaluate the effect of the
concentration of a chemical compound on the proportion of damage ( y) in the fruits
(Table 2.12). This compound is known to inhibit the growth of an insect, but, at a
certain concentration, it can cause damage to the fruits.
The proportion of damage to the fruits can be modeled under a beta distribution
(μ, ϕ). Let yi be the proportion of damage to the fruits due to the ith concentration.
The GLM components for this dataset are as follows:
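Distribution: $y_i \sim \mathrm{beta}(\mu_i, \phi)$, with $E(y_i) = \mu_i$ and $\mathrm{Var}(y_i) = \frac{\mu_i(1-\mu_i)}{1+\phi}$
Linear predictor: $\eta_i = \beta_0 + \beta_1 \times \mathrm{conc}_i$
Link function: $\eta_i = \operatorname{logit}(\mu_i) = \log\left(\frac{\mu_i}{1-\mu_i}\right)$ (logit link)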
Note that we are using conc to denote the independent variable concentration of
the chemical compound. The following SAS code allows us to perform a beta
regression for the dataset:
The “method = Laplace” statement asks SAS for the estimation method to be
Laplace integration, and the “dist = beta” and “s” options invoke GLIMMIX to
perform beta regression and provide fixed parameter estimation, respectively.
To see which linear predictor (linear, quadratic, or cubic) best explains the
observed variability in a dataset, we make use of the fit statistics (-2 log
likelihood, AIC, etc.). Part of the output is shown below in Table 2.13. According to
the fit statistics in part (a), the predictor that best models this experiment is the
quadratic predictor.
In Fig. 2.13, we can see that the cubic predictor follows the data most closely, but
because the t-values in the hypothesis tests of its estimated parameters were
indeterminate (infinite; not shown here), it was decided to use the quadratic
predictor. Both the quadratic and cubic predictors model the proportion
(percentage = proportion × 100) of fruit damage caused by the concentration of
the applied chemical better than the linear predictor.
2.6 Exercises
various countries. The extract is diluted to 20%, which is the maximum concentra-
tion commercially available in the United States. Pyrethrin oxidizes on exposure to
air but has been shown to be stable for long periods in water-based emulsions and oil
concentrates. Synergistic compounds (such as piperonyl butoxide or N-octyl
bicycloheptene dicarboximide), which enhance the effect of pyrethrin on insects,
are present in commercially available pyrethrin formulations. The results of this
study are shown below (Table 2.16).
(a) List and describe the components of the GLM (distribution, systematic compo-
nent (predictor), and the link function).
(b) Fit the model according to part (a).
(c) Interpret your results.
Exercise 2.6.4 The objective of this experiment was to model the probability of
mortality of beetles due to the toxic effect of carbon disulfide (CS2) gas. The insects
were exposed to various concentrations of this gas (in mg/L) for 5 hours (Bliss 1935),
and then the number of dead beetles (Y ) was counted. The data are shown below
(Table 2.17).
(a) List and describe the components of the GLM (distribution, systematic compo-
nent (predictor), and the link function).
(b) Fit the model according to part (a).
(c) Interpret your results.
Exercise 2.6.5 A study was conducted to assess the fowlpox virus in the chorioallantois
by the Pock counting technique. Pock counts on the membrane were recorded for 50
embryos, each exposed to one of four dilutions of the virus (multiples of 10^(-3.86)).
The FD column heading corresponds to the dilution factor, and the counts are the
numbers of Pocks observed (Table 2.18).
(a) List and describe the components of the GLM (distribution, systematic compo-
nent (predictor), and the link function).
(b) Fit the model according to part (a).
(c) Interpret your findings.
Exercise 2.6.6 Data were provided by Margolin et al. (1981) from an Ames Salmonella reverse mutagenicity assay. The table shows the number of revertant colonies observed on each of the three plates (replicates) tested at each of the six quinoline dose levels. The focus is on testing for mutagenic effects in the presence of the excess variation typically observed between counts (Table 2.19).
(a) List and describe the components of the GLM (distribution, systematic compo-
nent (predictor), and the link function).
(b) Fit the model according to part (a).
(c) Interpret your results.
Appendix
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate
credit to the original author(s) and the source, provide a link to the Creative Commons license and
indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative
Commons license, unless indicated otherwise in a credit line to the material. If material is not
included in the chapter's Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder.
Chapter 3
Objectives of Inference for Stochastic
Models
Throughout this book, we use the acronym GLMM to denote generalized linear mixed models. The common denominator among all these models is that they contain a linear model (LM) part, which refers to the fixed effects component of the linear predictor, Xβ. In a GLMM, the "G" (generalized) indicates that the distribution of the observations need not be normal, and the first "M" (mixed) indicates that the linear predictor contains random effects, expressed by the term "Zb," in addition to fixed effects. The fixed component of the linear predictor, Xβ, is important because the fixed effects describe the treatment design, which, in turn, is determined by the objectives or initial research questions of the study. Therefore, a researcher who proposes a reasonable model to analyze an experiment must be able to express each objective as a question about a model parameter or about a linear combination of model parameters.
Example Assume a factorial 2 × 2 model, with two levels in both factors A and B,
in which all possible combinations are tested. In this case, Xβ corresponds to a
two-way model with interaction and a predictor given by
$$\eta_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}; \quad i, j = 1, 2$$
As in all the statistical models studied so far, the linear predictor is expressed on the scale of the link function: ηij estimates the mean μij of a treatment combination directly if the data follow a normal distribution (identity link) and indirectly, through the link function, if they do not. For this example, inference should focus on one or more of the following estimable functions: a treatment combination mean; a main effect mean, that is, the mean of a level of factor A averaged over the levels of factor B, or vice versa; a difference between main effect means; a simple effect, i.e., the difference between two levels of factor A at a given level of B or between two levels of factor B at a given level of A; and so on. Each of these options can be expressed in terms of the parameters of the linear predictor, as shown in Table 3.1.
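Table 3.1 Estimable functions in a factorial 2 × 2 treatment structure using the identity link function (here η denotes the intercept of the linear predictor)

| Target estimate | Parameter estimator of the linear predictor | Target estimation in terms of the expected value |
|---|---|---|
| Combination A × B | $\eta + \alpha_i + \beta_j + (\alpha\beta)_{ij}$ | $\mu_{ij}$ |
| Main effect of factor A | $\bar{\eta}_{i\cdot} = \eta + \alpha_i + \frac{1}{2}\sum_j \beta_j + \frac{1}{2}\sum_j (\alpha\beta)_{ij}$ | $\bar{\mu}_{i\cdot} = \frac{1}{2}\sum_j \mu_{ij}$ |
| Main effect of factor B | $\bar{\eta}_{\cdot j} = \eta + \frac{1}{2}\sum_i \alpha_i + \beta_j + \frac{1}{2}\sum_i (\alpha\beta)_{ij}$ | $\bar{\mu}_{\cdot j} = \frac{1}{2}\sum_i \mu_{ij}$ |
| Difference between level 1 and level 2 of factor A | $\bar{\eta}_{1\cdot} - \bar{\eta}_{2\cdot} = \alpha_1 - \alpha_2 + \frac{1}{2}\sum_j \left[(\alpha\beta)_{1j} - (\alpha\beta)_{2j}\right]$ | $\bar{\mu}_{1\cdot} - \bar{\mu}_{2\cdot}$ |
| Difference between level 1 and level 2 of factor B | $\bar{\eta}_{\cdot 1} - \bar{\eta}_{\cdot 2} = \beta_1 - \beta_2 + \frac{1}{2}\sum_i \left[(\alpha\beta)_{i1} - (\alpha\beta)_{i2}\right]$ | $\bar{\mu}_{\cdot 1} - \bar{\mu}_{\cdot 2}$ |
| Simple effect of A given $B_j$ ($A \mid B_j$) | $\eta_{1j} - \eta_{2j} = \alpha_1 - \alpha_2 + (\alpha\beta)_{1j} - (\alpha\beta)_{2j}$ | $\mu_{1j} - \mu_{2j}$ |
| Simple effect of B given $A_i$ ($B \mid A_i$) | $\eta_{i1} - \eta_{i2} = \beta_1 - \beta_2 + (\alpha\beta)_{i1} - (\alpha\beta)_{i2}$ | $\mu_{i1} - \mu_{i2}$ |
| Interaction between factors A and B (A × B) | $(\eta_{11} - \eta_{21}) - (\eta_{12} - \eta_{22}) = (\alpha\beta)_{11} - (\alpha\beta)_{21} - (\alpha\beta)_{12} + (\alpha\beta)_{22} = (\eta_{11} - \eta_{12}) - (\eta_{21} - \eta_{22})$, that is, $A \mid B_1 - A \mid B_2 = B \mid A_1 - B \mid A_2$ | $\mu_{11} - \mu_{12} - \mu_{21} + \mu_{22}$ |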
Assuming that the data have a normal distribution, which corresponds to using an identity link function, the estimators in terms of the linear predictor (column 2) directly estimate the expected values in column 3. If the data do not follow a normal distribution, column 2 estimates the expected values in column 3 only indirectly, and the inverse link function is required to obtain them. For link functions other than the identity, the estimates in column 2 therefore require more careful handling. In an experimental design with a factorial treatment structure, the analysis should first focus on the interaction between the two factors. If the interaction is not significant, the main effects provide useful information. If the interaction is significant, the simple effects are not equal and the main effects are confounded with the interaction; in this case, it is better to focus on the simple effects.
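The estimable functions in Table 3.1 can be checked numerically. The following Python sketch assumes an identity link and uses hypothetical cell means (the values in `mu` are invented for illustration and do not come from any dataset in this book); all function names are likewise illustrative.

```python
# Hypothetical 2 x 2 cell means mu[i][j] for level i of factor A and
# level j of factor B (illustrative values only, identity link assumed).
mu = [[10.0, 14.0],
      [12.0, 22.0]]


def marginal_mean_a(i):
    """Mean of level i of A averaged over the levels of B."""
    return sum(mu[i]) / len(mu[i])


def marginal_mean_b(j):
    """Mean of level j of B averaged over the levels of A."""
    return sum(row[j] for row in mu) / len(mu)


def simple_effect_a(j):
    """Simple effect of A at level j of B: mu_1j - mu_2j."""
    return mu[0][j] - mu[1][j]


def simple_effect_b(i):
    """Simple effect of B at level i of A: mu_i1 - mu_i2."""
    return mu[i][0] - mu[i][1]


# Differences between main effect means.
main_effect_a = marginal_mean_a(0) - marginal_mean_a(1)
main_effect_b = marginal_mean_b(0) - marginal_mean_b(1)

# The interaction is the difference between the two simple effects,
# whichever factor is used to define them.
interaction_from_a = simple_effect_a(0) - simple_effect_a(1)
interaction_from_b = simple_effect_b(0) - simple_effect_b(1)

print("main effect A:", main_effect_a)
print("main effect B:", main_effect_b)
print("simple effects of A:", simple_effect_a(0), simple_effect_a(1))
print("interaction:", interaction_from_a, interaction_from_b)
```

With these invented means, the two simple effects of A differ, so the main effect of A (their average) hides the fact that the effect of A depends on the level of B.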
When constructing a model, the researcher must decide whether the effects are fixed
or random. This decision has important implications with respect to the estimation
criteria and in the interpretation of the tests and estimates obtained. Given these
implications, three important aspects, described in the following sections, must be
taken into consideration in statistical modeling.
3.1 Three Aspects to Consider for an Inference
The distinction between the model scale and the data scale is particularly relevant for models with a link function other than "identity," since the scale on which the model is fitted is not the same as the scale of the original data when the response variable is no longer assumed to be normal. When the data are normally distributed, the estimable function directly estimates the expected value. However, this is not true if the data follow a non-normal distribution. For example, in a logistic model for binomial data in a completely randomized design, the estimable function η + τi estimates a logit, i.e., the log odds. The expected value for individuals receiving the ith treatment, however, is a probability, so η + τi is usually reported as a probability rather than as a logit. This requires converting the estimate to a probability using the inverse link; that is:
$$\pi_i = \frac{1}{1 + e^{-(\eta + \tau_i)}}$$
Thus, for functions other than “identity,” there are two ways of expressing the
estimates: (1) in terms of the parameters directly estimated from the GLMM (model
scale) or (2) in terms of the expected value of the response variable (data scale).
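As a minimal sketch of the two scales, the following Python code defines the logit link and its inverse and moves a single estimate from the model scale to the data scale. The values of `eta_hat` and `tau_hat` are hypothetical and serve only to illustrate the conversion.

```python
import math


def logit(p):
    """Logit link: maps a probability to the log odds (model scale)."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Inverse logit link: maps a log odds back to a probability (data scale)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical estimates of the intercept and of one treatment effect.
eta_hat = 0.4
tau_hat = 0.8

model_scale = eta_hat + tau_hat          # estimable function: a log odds
data_scale = inverse_logit(model_scale)  # the same estimate as a probability

print("model scale (logit):", model_scale)
print("data scale (probability):", round(data_scale, 4))
```

Applying `logit` to the probability returns the original model-scale value, which shows that the two expressions carry the same information on different scales.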
The question of the inference space arises only when the linear predictor contains random effects. In these models, estimates are obtained through a linear combination of the fixed effects (an estimable function), even though the linear predictor also contains random effects. K′β denotes the estimable function, where K is a matrix of order [(p + 1) × k] and β is the vector of fixed effects parameters of order [(p + 1) × 1]. The estimable function K′β supports broad inference because it generalizes the results to the entire population represented by the random effects.
The linear combination K′β + Z′b, in which Z′ is a matrix of coefficients for the random effects with at least one nonzero coefficient, is called a predictable function, and its inference is limited to the levels of the random effects defined in b. Suppose that you are conducting an experiment with three treatments at different locations (L). The estimable function τ1 - τ2 then provides inference about the difference between treatments 1 and 2 in the whole population under study, whereas the predictable function [τ1 - τ2 + (Lτ)1j - (Lτ)2j] restricts the inference about this difference to location Lj. The type of inference produced by predictable functions is called "narrow inference" because the nonzero coefficients in matrix Z reduce the scope of inference from the entire population to the levels identified in Z. Thus, the predictable function K′β + Z′b should be used for estimates specific to particular levels, whereas the estimable function K′β should be used for estimates that are valid for the entire population under study.
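The contrast between broad and narrow inference can be sketched numerically. In the Python code below, the fixed treatment effects `tau_hat` and the predicted location-by-treatment effects `ltau_hat` are hypothetical values chosen for illustration; they are not estimates from any example in this chapter.

```python
# Hypothetical fixed treatment effects tau_1, tau_2, tau_3.
tau_hat = [2.0, 0.5, 0.0]

# Hypothetical predicted random effects (L*tau)_ij: row = treatment,
# column = location (three locations).
ltau_hat = [[0.3, -0.1, -0.2],
            [-0.2, 0.4, -0.2],
            [0.0, 0.0, 0.0]]


def broad_difference(i, k):
    """Estimable function K'beta: tau_i - tau_k, valid for all locations."""
    return tau_hat[i] - tau_hat[k]


def narrow_difference(i, k, j):
    """Predictable function K'beta + Z'b: the same difference at location j."""
    return tau_hat[i] - tau_hat[k] + ltau_hat[i][j] - ltau_hat[k][j]


print("broad inference, trt 1 vs trt 2:", broad_difference(0, 1))
for j in range(3):
    print("narrow inference at location", j + 1, ":",
          round(narrow_difference(0, 1, j), 4))
```

The broad difference is a single value for the whole population of locations, whereas the narrow difference changes from one location to another.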
In linear models, inference begins with the estimable function K′β. These models, in turn, are defined in terms of the linear predictor η = g(μ) = Xβ (if there are no random effects) or η = Xβ + Zb (if there are random effects in the model), so K′β produces results on the scale of the link function.
For linear normal response models such as LMs and LMMs, the link function is
not visible because they use the “identity” function as the link. Linear combinations
of model parameters directly estimate desired values such as differences between
treatments and many other hypothesis tests of interest. Inference for an LM is
straightforward.
For GLMs and GLMMs with a non-normal response, the estimation of K′β yields a linear combination of the elements of the linear predictor η = g(μ), which is typically a nonlinear function of μ. For example, with Poisson data, K′β is on the logarithmic (log) scale, and, for binomial data, it is on the logit or probit scale. However, most of the time, the researcher wants to see binomial results expressed in terms of the probability of the outcome of interest and Poisson results expressed in terms of counts. Since both GLMs and GLMMs carry out the estimation process on the model scale (which depends on the link used), reporting the results of interest on the data scale requires applying the inverse link to the estimates obtained on the model scale. To exemplify the model scale and the data scale, an example is shown below.
3.2 Illustrative Examples of the Data Scale and the Model Scale
Example 3.1 Consider the following experiment in which three chemical seed coat softeners were tested to study their effect on the germination of tomato seeds in Styrofoam trays (Table 3.2).
To illustrate the above two concepts, we first analyze these data as a completely randomized design (CRD), assuming the response variable to be normal, and then analyze the same experimental design with a binomial response variable. We are interested in comparing the treatment means. Note that, for demonstration purposes, we first assume that Y has a normal distribution, when, in fact, it has a binomial distribution.
The components of this model are defined as follows:
Distribution: $y_{ij} \sim N(\mu_i, \sigma^2)$
Linear predictor: $\eta_i = \eta + \tau_i$; $(i = 1, 2, 3)$
Link function: $\eta_i = \mu_i$ (identity link)
The analysis of variance (ANOVA) (part (a)) and the estimated parameters (part (b)) of this experimental design indicate that there is a highly significant difference between the treatments (P = 0.0033) in the germination of tomato seeds. Table 3.3 shows part of the results.
The table above shows the estimated parameter values of the model (obtained with the "solution" option). Because the model is over-parameterized, the effect of treatment three is set to zero and only the remaining parameters are estimated. The estimable functions K′β for the treatment means are as follows:
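$$K' = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}; \quad \beta = \begin{bmatrix} \eta \\ \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix}$$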
From the estimated parameters, $\hat{\mu}_i = \hat{\eta} + \hat{\tau}_i$, we can obtain the estimated mean for each of the treatments (i = 1, 2, 3) as follows: for treatment 1, $\hat{\mu}_1 = \hat{\eta} + \hat{\tau}_1 = 41.6667 + 7.3333 = 49$; for treatment 2, $\hat{\mu}_2 = \hat{\eta} + \hat{\tau}_2 = 41.6667 - 18 = 23.6667$; and for treatment 3, $\hat{\mu}_3 = \hat{\eta} + \hat{\tau}_3 = 41.6667 + 0 = 41.6667$. The value of the mean squared error ($\hat{\sigma}^2$), which appears in the table as "Scale," is 29.5556.
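For the differences between treatments, the $\hat{\tau}_i - \hat{\tau}_{i'}$ values for $i \neq i'$ are as follows: $\hat{\mu}_1 - \hat{\mu}_2 = (\hat{\eta} + \hat{\tau}_1) - (\hat{\eta} + \hat{\tau}_2) = \hat{\tau}_1 - \hat{\tau}_2 = 7.3333 - (-18) = 25.3333$, $\hat{\mu}_1 - \hat{\mu}_3 = (\hat{\eta} + \hat{\tau}_1) - (\hat{\eta} + \hat{\tau}_3) = \hat{\tau}_1 - \hat{\tau}_3 = 7.3333 - 0 = 7.3333$, and $\hat{\mu}_2 - \hat{\mu}_3 = (\hat{\eta} + \hat{\tau}_2) - (\hat{\eta} + \hat{\tau}_3) = \hat{\tau}_2 - \hat{\tau}_3 = -18 - 0 = -18$. These estimates can be obtained using the Statistical Analysis Software (SAS) "estimate" and "lsmeans" commands, as shown below: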
The "estimate" statement requires us to specify what we wish to estimate: the keyword "intercept" refers to the intercept (η) and "Trt" to the treatment effects (τi) under evaluation; the coefficients needed for the estimates are shown above. The "lsmeans" statement asks GLIMMIX to estimate the treatment means, "diff" requests the differences between pairs of treatments, and "E" displays the coefficients of the estimable functions used in "lsmeans." Some of the outputs of the above code are shown in Table 3.4.

Table 3.4 Results obtained using the "estimate" and "lsmeans" commands

(a) Differences of Trt least squares means

| Trt | Trt | Estimate | Standard error | DF | t-value | Pr > \|t\| |
|---|---|---|---|---|---|---|
| Trt1 | Trt2 | 25.3333 | 4.4389 | 6 | 5.71 | 0.0013 |
| Trt1 | Trt3 | 7.3333 | 4.4389 | 6 | 1.65 | 0.1496 |
| Trt2 | Trt3 | -18.0000 | 4.4389 | 6 | -4.06 | 0.0067 |

(b) Estimates

| Label | Estimate | Standard error | DF | t-value | Pr > \|t\| |
|---|---|---|---|---|---|
| LSM Trt 1 | 49.0000 | 3.1388 | 6 | 15.61 | <0.0001 |
| LSM Trt 2 | 23.6667 | 3.1388 | 6 | 7.54 | 0.0003 |
| LSM Trt 3 | 41.6667 | 3.1388 | 6 | 13.27 | <0.0001 |
| Overall mean | 38.1111 | 1.8122 | 6 | 21.03 | <0.0001 |
| Overall mean | 38.1111 | 1.8122 | 6 | 21.03 | <0.0001 |
| Trt diff 1&2 | 25.3333 | 4.4389 | 6 | 5.71 | 0.0013 |
| Trt diff 2&3 | -18.0000 | 4.4389 | 6 | -4.06 | 0.0067 |
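The estimates in Table 3.4 can be reproduced, up to rounding, by applying the rows of K′ to the parameter solution. The Python sketch below uses the estimates quoted in the text; the number of replicates per treatment (`n_reps = 3`) is an assumption that is consistent with the 6 error degrees of freedom, and the helper `apply_k` is an illustrative name.

```python
import math

# Parameter solution (eta, tau1, tau2, tau3) quoted in the text;
# tau3 is set to zero because the model is over-parameterized.
beta_hat = [41.6667, 7.3333, -18.0, 0.0]

# Rows of K' for the three treatment means and the pairwise differences.
K_means = [[1, 1, 0, 0],
           [1, 0, 1, 0],
           [1, 0, 0, 1]]
K_diffs = [[0, 1, -1, 0],
           [0, 1, 0, -1],
           [0, 0, 1, -1]]


def apply_k(K, beta):
    """Compute K'beta row by row."""
    return [sum(k * b for k, b in zip(row, beta)) for row in K]


means = apply_k(K_means, beta_hat)
diffs = apply_k(K_diffs, beta_hat)

# Standard errors under the normal model, assuming three replicates
# per treatment (an assumption, not stated explicitly in the text).
sigma2_hat = 29.5556
n_reps = 3
se_mean = math.sqrt(sigma2_hat / n_reps)
se_diff = math.sqrt(2.0 * sigma2_hat / n_reps)

print("treatment means:", [round(m, 4) for m in means])
print("differences:", [round(d, 4) for d in diffs])
print("SE of a mean:", round(se_mean, 4))
print("SE of a difference:", round(se_diff, 4))
```

Because the link is the identity, these model-scale values are already on the data scale and need no further conversion.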
Next, we analyze the same data, also using a CRD, but now assuming a binomial distribution for the response variable. Here, $N_{ij}$ denotes the number of independent Bernoulli trials in the ijth observation. The components of the model are as follows:
Distribution: $y_{ij} \sim \text{Binomial}(N_{ij}, \pi_i)$
Linear predictor: $\eta_i = \eta + \tau_i$; $(i = 1, 2, 3)$
Link function: $\eta_i = \text{logit}(\pi_i) = \log\left(\frac{\pi_i}{1 - \pi_i}\right)$ (logit link)
Fitting these data in a binomial model, the fixed effects solution of the parameters
obtained in terms of the model scale are tabulated in Table 3.5.
The above results were obtained using the following SAS code:
Similar to the previous example, we can estimate the treatment means and the differences between pairs of treatments. The linear predictors for the treatments are as follows: $\hat{\eta}_1 = \hat{\eta} + \hat{\tau}_1 = 0.5108 + 0.5093 = 1.0201$, $\hat{\eta}_2 = \hat{\eta} + \hat{\tau}_2 = 0.5108 - 1.108 \approx -0.5971$, and $\hat{\eta}_3 = \hat{\eta} + \hat{\tau}_3 = 0.5108 + 0.0 = 0.5108$, and, for the differences between treatments (1 and 2, 1 and 3, and 2 and 3), they are as follows: $\hat{\eta}_1 - \hat{\eta}_2 = 1.0201 - (-0.5971) \approx 1.6173$, $\hat{\eta}_1 - \hat{\eta}_3 = 1.0201 - 0.5108 = 0.5093$, and $\hat{\eta}_2 - \hat{\eta}_3 = -0.5971 - 0.5108 = -1.1079$, respectively.
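Using the relationship between the linear predictor and the link function, $\eta_i = \text{logit}(\pi_i) = \log\left(\frac{\pi_i}{1 - \pi_i}\right)$, we can estimate the probability of observing a favorable outcome for each of the treatments, that is, π1, π2, and π3, respectively. Applying the inverse link, we obtain:
$$\hat{\pi}_1 = \frac{1}{1 + e^{-1.0201}} = 0.7350, \quad \hat{\pi}_2 = \frac{1}{1 + e^{0.5971}} = 0.3550, \quad \hat{\pi}_3 = \frac{1}{1 + e^{-0.5108}} = 0.6250$$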
Here, we can see that the treatment with the highest probability of success is
treatment one, followed by treatment three, whereas treatment two has the lowest
probability of success. Now, for the difference between two treatments, τi - τi′ for
i ≠ i′, we can estimate the logarithm of the odds ratio as
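$$\tau_i - \tau_{i'} = \log\left(\frac{\pi_i}{1 - \pi_i}\right) - \log\left(\frac{\pi_{i'}}{1 - \pi_{i'}}\right) = \log\left(\frac{\pi_i / (1 - \pi_i)}{\pi_{i'} / (1 - \pi_{i'})}\right)$$
where, in this particular case, $\text{odds}_i = \frac{\pi_i}{1 - \pi_i}$ is the odds of treatment i and
$$\text{odds ratio} = \frac{\pi_i / (1 - \pi_i)}{\pi_{i'} / (1 - \pi_{i'})}$$
is the odds ratio for treatments i and i′, for i ≠ i′.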
When the inverse link is applied to the above difference (the log odds ratio), we get
$$\frac{1}{1 + e^{-(\hat{\tau}_i - \hat{\tau}_{i'})}}$$
For treatments 1 and 3, this gives 0.6246; similarly, for the pairs of treatments 1–2 and 2–3, the resulting values are 0.8344 and 0.2483, respectively. It is important to mention that these values are neither odds ratios nor estimates of the difference π i - π i′ for i ≠ i′.
From the previous example, it is clear that when the response variable is not normal, parameter estimation and inference occur at two levels. The linear predictor Xβ and the estimable function K′β are expressed in terms of the link function (here, the logit), so they yield estimates on the model scale (the scale of the link function), as in the above example. Under the logit link, the log odds and the difference between two estimates (the log odds ratio) are very common and useful quantities in categorical data analysis, and they serve as a basis for expressing treatment means or treatment differences on the data scale.
Estimates on the model scale in GLMs are often not easy to interpret, and, as such, the data scale plays a very important role. Working on the data scale involves applying the inverse of the link function to the estimable function K′β, as we did in the previous example to convert the log odds of each treatment to a probability. In general, we use the inverse of the link function to transform estimates from the model scale to the data scale. However, the inverse link should not be applied to differences between treatments: because link functions are generally nonlinear, doing so produces meaningless results.
Thus, in the logit model, we have two approaches for the difference. First, we can apply the inverse of the link function to the linear predictor of each treatment and then take the difference between the probabilities, $\hat{\pi}_i - \hat{\pi}_{i'}$. That is, we estimate the difference $\pi_i - \pi_{i'}$ through $\frac{1}{1 + e^{-(\hat{\eta} + \hat{\tau}_i)}} - \frac{1}{1 + e^{-(\hat{\eta} + \hat{\tau}_{i'})}}$ and not as $\frac{1}{1 + e^{-(\hat{\tau}_i - \hat{\tau}_{i'})}}$. Second, since $\tau_i - \tau_{i'}$ is the logarithm of the odds ratio, $e^{\hat{\tau}_i - \hat{\tau}_{i'}}$ produces an estimate of the odds ratio. Both approaches are valid, and the use of one approach or the other depends on the requirements of the particular study.
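The two valid approaches, and the invalid shortcut, can be compared side by side. The Python sketch below uses the linear predictors of Example 3.1 quoted in the text; the function names are illustrative.

```python
import math


def inverse_logit(eta):
    """Inverse logit link: log odds to probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Linear predictors of the three treatments (model scale), from the text.
eta_hat = {1: 1.0201, 2: -0.5971, 3: 0.5108}


def probability_difference(i, k):
    """Approach 1: inverse link per treatment, then the difference."""
    return inverse_logit(eta_hat[i]) - inverse_logit(eta_hat[k])


def odds_ratio(i, k):
    """Approach 2: exponentiate the difference of linear predictors."""
    return math.exp(eta_hat[i] - eta_hat[k])


def inverse_link_of_difference(i, k):
    """Not recommended: inverse link applied to a difference.
    The result is neither a probability difference nor an odds ratio."""
    return inverse_logit(eta_hat[i] - eta_hat[k])


for i, k in [(1, 2), (1, 3), (2, 3)]:
    print(f"trt {i} vs trt {k}:",
          "prob. difference =", round(probability_difference(i, k), 4),
          "| odds ratio =", round(odds_ratio(i, k), 4),
          "| inverse link of difference =",
          round(inverse_link_of_difference(i, k), 4))
```

For treatments 1 and 2, the difference in probabilities is about 0.38 and the odds ratio is about 5.04, whereas the inverse link of the difference (about 0.83) matches neither quantity.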
With the GLIMMIX procedure, we can implement the solution in terms of the
data scale with the “ilink,” “exp,” and “oddsratio” commands, as shown in the
following SAS code:
/* Binomial CRD: y successes out of n trials per observation */
proc glimmix;
class trt;
model y/n = trt / solution oddsratio;
lsmeans trt / diff oddsratio ilink;
/* treatment means; ilink also reports them on the data scale */
estimate 'lsm trt 1' intercept 1 trt 1 0 0 / ilink;
estimate 'lsm trt 2' intercept 1 trt 0 1 0 / ilink;
estimate 'lsm trt 3' intercept 1 trt 0 0 1 / ilink;
/* two equivalent ways of writing the overall mean */
estimate 'overall mean' intercept 1 trt 0.33333 0.33333 0.33333 / ilink;
estimate 'overall mean' intercept 3 trt 1 1 1 / divisor=3 ilink;
run;
Part of the output of “proc GLIMMIX” is shown in Table 3.6. The “Odds ratio
estimates” (part (a)) are the result of the “oddsratio” command in the previous
program, whereas the confidence intervals are provided by default.
What appears under "Estimate" (in part (b)) is on the model scale, $\hat{\eta}_i = \hat{\eta} + \hat{\tau}_i$, and what appears under "Mean" (in part (b)) is the inverse of the link function applied to this estimate, $\hat{\pi}_i = 1 / (1 + e^{-(\hat{\eta} + \hat{\tau}_i)})$, which, in this case, is a probability on the data scale. Similarly, for treatment differences, what appears under "Estimate" is on the model scale, $\hat{\tau}_i - \hat{\tau}_{i'}$, whereas the "Odds ratio" values were computed as $e^{\hat{\tau}_i - \hat{\tau}_{i'}}$ and are on the data scale.
In the "Estimates" part of Table 3.7, the exponentiated log odds ratio appears as "Exponentiated estimate" regardless of whether we use the "oddsratio" or "exp" option in the "estimate" statement. For the overall mean, the inverse of the link function applied to $\hat{\eta} + \frac{1}{3}(\hat{\tau}_1 + \hat{\tau}_2 + \hat{\tau}_3)$ is 0.5772, which differs from the average of the $\hat{\pi}_i$ values, that is, $\frac{1}{3}(\hat{\pi}_1 + \hat{\pi}_2 + \hat{\pi}_3) = \frac{1}{3}(0.735 + 0.355 + 0.625) = 0.5717$. This illustrates that we have to be extremely careful when using the output of proc GLIMMIX: it can produce results on both the model scale and the data scale through the inverse of the link function, but the inverse link has to be applied appropriately; otherwise, we will get meaningless results.
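The following Python sketch checks this point using the model-scale estimates and standard errors reported in Table 3.7. It also approximates the standard errors on the data scale with the delta method, SE(π̂) ≈ π̂(1 - π̂)·SE(η̂); that this is how the "Standard error Mean" column is obtained is an assumption, although the formula reproduces the tabulated values to rounding.

```python
import math


def inverse_logit(eta):
    """Inverse logit link: log odds to probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Model-scale estimates and standard errors for treatments 1-3 (Table 3.7).
eta_hat = [1.0201, -0.5971, 0.5108]
se_eta = [0.1602, 0.1478, 0.1461]

# Data-scale probabilities, one per treatment.
probs = [inverse_logit(e) for e in eta_hat]

# Inverse link of the averaged linear predictor versus the average of
# the probabilities: the two do not coincide for a nonlinear link.
mean_eta = sum(eta_hat) / len(eta_hat)
ilink_of_mean = inverse_logit(mean_eta)
mean_of_probs = sum(probs) / len(probs)

# Delta-method approximation of the data-scale standard errors
# (assumed here, see the lead-in).
se_probs = [p * (1.0 - p) * s for p, s in zip(probs, se_eta)]

print("probabilities:", [round(p, 4) for p in probs])
print("inverse link of the mean predictor:", round(ilink_of_mean, 4))
print("mean of the probabilities:", round(mean_of_probs, 4))
print("delta-method SEs:", [round(s, 5) for s in se_probs])
```

The inverse link of the averaged predictor is about 0.577, while the average of the three probabilities is about 0.572, so the order in which averaging and back-transformation are applied matters.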
Example 3.2: Randomized complete block design (RCBD) with normal
and binomial responses
Now, assume that we have the same example but in an RCBD. The three treatments
were tested in each of the blocks, as shown in Table 3.8.
In this example, first, the data are analyzed assuming a normal response and
assuming that the block effect is fixed; then, they are analyzed assuming a binomial
response.
The model components under a Gaussian response variable are as follows:
Distribution: $y_{ij} \sim N(\mu_{ij}, \sigma^2)$
Linear predictor: $\eta_{ij} = \eta + \tau_i + \text{block}_j$; $(i, j = 1, 2, 3)$
Link function: $\eta_{ij} = \mu_{ij}$ (identity link)
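From the theory of linear models, we know that we can estimate the ith treatment mean through
$$\bar{\eta}_{i\cdot} = \frac{1}{3}\sum_{j=1}^{3} \eta_{ij} = \eta + \tau_i + \frac{1}{3}\sum_{j=1}^{3} \text{block}_j = \eta + \tau_i + \overline{\text{block}}_{\cdot}$$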
Table 3.7 Estimates at the model scale and at the data scale

Estimates

| Label | Estimate | Standard error | DF | t Value | Pr > \|t\| | Mean | Standard error Mean | Exponentiated estimate |
|---|---|---|---|---|---|---|---|---|
| LSM Trt 1 | 1.0201 | 0.1602 | 6 | 6.37 | 0.0007 | 0.7350 | 0.03121 | |
| LSM Trt 2 | -0.5971 | 0.1478 | 6 | -4.04 | 0.0068 | 0.3550 | 0.03384 | |
| LSM Trt 3 | 0.5108 | 0.1461 | 6 | 3.50 | 0.0129 | 0.6250 | 0.03423 | |
| Overall Mean | 0.3113 | 0.08746 | 6 | 3.56 | 0.0119 | 0.5772 | 0.02134 | |
| Trt diff 1&2 | 1.6173 | 0.2180 | 6 | 7.42 | 0.0003 | 0.8344 | 0.03011 | 5.0393 |
| Trt diff 1&3 | 0.5093 | 0.2168 | 6 | 2.35 | 0.0571 | 0.6246 | 0.05083 | 1.6642 |
| Trt diff 2&3 | -1.1080 | 0.2078 | 6 | -5.33 | 0.0018 | 0.2483 | 0.03878 | 0.3302 |
| Trt diff 1&2 | 1.6173 | 0.2180 | 6 | 7.42 | 0.0003 | | | 5.0393 |
| Trt diff 1&3 | 0.5093 | 0.2168 | 6 | 2.35 | 0.0571 | | | 1.6642 |
| Trt diff 2&3 | -1.1080 | 0.2078 | 6 | -5.33 | 0.0018 | | | 0.3302 |
Table 3.8 Percentage of germinated seeds (Y ) out of total seeds (N ) in a randomized complete
block design
Treatment Block Y (no. of germinated seeds) N (total no. of seeds)
Trt1 Block1 54 70
Trt1 Block2 41 60
Trt1 Block3 52 70
Trt2 Block1 28 70
Trt2 Block2 22 60
Trt2 Block3 21 70
Trt3 Block1 41 70
Trt3 Block2 37 60
Trt3 Block3 47 70
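where $\overline{\text{block}}_{\cdot} = \frac{1}{3}\sum_{j=1}^{3} \text{block}_j$.
The mean difference between two treatments i and i′ is estimated as
$$\bar{\eta}_{i\cdot} - \bar{\eta}_{i'\cdot} = \left(\eta + \tau_i + \overline{\text{block}}_{\cdot}\right) - \left(\eta + \tau_{i'} + \overline{\text{block}}_{\cdot}\right) = \tau_i - \tau_{i'}$$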
The goal of this experiment could be to compare the treatment means, that is, to test $\bar{\eta}_{1\cdot} = \bar{\eta}_{2\cdot} = \bar{\eta}_{3\cdot}$, which can equivalently be expressed as τ1 = τ2 = τ3, or to compare one treatment with the average of the other treatments: for example, treatment 1 with the average of treatments 2 and 3 (Trt1.vs.average.Trt2.and.Trt3). For the hypothesis test of the equality of treatments (τ1 = τ2 = τ3), the estimable function K′β is given by:
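$$K' = \begin{bmatrix} 0 & 1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 0 \end{bmatrix}; \quad \beta = \begin{bmatrix} \eta \\ \tau_1 \\ \tau_2 \\ \tau_3 \\ \text{block}_1 \\ \text{block}_2 \\ \text{block}_3 \end{bmatrix}$$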
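To compare one treatment with the average of the other two, the estimable functions are
$$\text{Trt1.vs.average.Trt2.and.Trt3} = \bar{\eta}_{1\cdot} - \tfrac{1}{2}\left(\bar{\eta}_{2\cdot} + \bar{\eta}_{3\cdot}\right) = \tau_1 - \frac{\tau_2 + \tau_3}{2}$$
$$\text{Trt2.vs.average.Trt1.and.Trt3} = \bar{\eta}_{2\cdot} - \tfrac{1}{2}\left(\bar{\eta}_{1\cdot} + \bar{\eta}_{3\cdot}\right) = \tau_2 - \frac{\tau_1 + \tau_3}{2}$$
which, multiplied by 2 to avoid fractional coefficients, correspond to
$$K' = \begin{bmatrix} 0 & 2 & -1 & -1 & 0 & 0 & 0 \\ 0 & -1 & 2 & -1 & 0 & 0 & 0 \end{bmatrix}; \quad \beta = \begin{bmatrix} \eta \\ \tau_1 \\ \tau_2 \\ \tau_3 \\ \text{block}_1 \\ \text{block}_2 \\ \text{block}_3 \end{bmatrix}$$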
/* RCBD with a normal response and fixed block effects */
proc glimmix;
class trt block;
model y = trt block / solution;
lsmeans trt / diff e;
/* means: the block coefficients average over the three blocks */
estimate 'lsm trt1' intercept 3 trt 3 0 0 block 1 1 1 / divisor=3;
estimate 'overall mean' intercept 3 trt 1 1 1 block 1 1 1 / divisor=3;
estimate 'average trt1&trt2' intercept 6 trt 3 3 0 block 2 2 2 / divisor=6;
estimate 'average trt1&trt2&trt3' intercept 9 trt 3 3 3 block 3 3 3 / divisor=9;
/* pairwise differences, one at a time */
estimate 'trt1 vs trt2' trt 1 -1 0;
estimate 'trt1 vs trt3' trt 1 0 -1;
estimate 'trt2 vs trt3' trt 0 1 -1;
/* the same differences in one statement, with Sidak-adjusted P-values */
estimate 'trt1 vs trt2' trt 1 -1 0,
         'trt1 vs trt3' trt 1 0 -1,
         'trt2 vs trt3' trt 0 1 -1 / divisor=1,1,1 adjust=sidak;
contrast 'trt1 vs trt2' trt 1 -1 0;
contrast 'trt1 vs trt3' trt 1 0 -1;
contrast 'trt2 vs trt3' trt 0 1 -1;
contrast 'trt1 vs average trt2,trt3' trt 2 -1 -1;
contrast 'trt2 vs average trt1,trt3' trt -1 2 -1;
/* two-degree-of-freedom tests of the overall treatment effect */
contrast 'type 3 trt ss' trt 1 0 -1, trt 0 1 -1;
contrast 'type 3 trt test' trt 2 -1 -1, trt -1 2 -1;
run;
Part of the GLIMMIX output is shown in Table 3.9. Parameter estimates are given for treatments 1–2 and blocks 1–2 but not for treatment 3 and block 3. This is because the model is not of full rank: SAS computes a generalized inverse through the SWEEP operator, which, in this case, sets the last level of each class effect equal to zero.
"Coefficients" (part (a) of Table 3.10), obtained with option "E" for the treatment least squares means in "lsmeans," shows how SAS combines the parameter solutions to calculate the treatment means (part (b)). Part (c) shows the differences between means obtained with the "diff" option in "lsmeans."
The estimates obtained from the "estimate" statement with multiple estimable functions and the Sidak adjustment, together with the contrasts, are shown in Table 3.11. This adjustment allows us to control the type I error rate over the set of comparisons. With the "adjust=sidak" option in "estimate" (part (b)), we obtain the adjusted P-values, denoted as AdjP, in addition to Pr > |t|.
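As a rough illustration of what a Sidak adjustment does, the Python sketch below assumes the usual form of the adjustment for m comparisons, adjusted P = 1 - (1 - P)^m, and applies it to the three unadjusted P-values of the pairwise treatment differences reported in Table 3.10 (part (c)). The function name `sidak_adjust` is illustrative, and the results are not claimed to match the SAS output digit for digit, since SAS works with unrounded P-values.

```python
def sidak_adjust(p_values):
    """Sidak-adjusted P-values for a family of m comparisons,
    assuming the usual form 1 - (1 - p)^m."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Unadjusted P-values of the three pairwise differences (Table 3.10(c)).
raw_p = [0.0019, 0.1036, 0.0067]
adj_p = sidak_adjust(raw_p)

for raw, adj in zip(raw_p, adj_p):
    print("Pr > |t| =", raw, "-> Adj P =", round(adj, 4))
```

Each adjusted P-value is at least as large as the unadjusted one, which is how the adjustment keeps the error rate of the whole family of comparisons under control.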
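Table 3.10 Coefficients for treatment and block used in least squares means

(a) Coefficients for Trt least squares means

| Effect | Trt | BLOCK | Row1 | Row2 | Row3 |
|---|---|---|---|---|---|
| Intercept | | | 1 | 1 | 1 |
| Trt | 1 | | 1 | | |
| Trt | 2 | | | 1 | |
| Trt | 3 | | | | 1 |
| BLOCK | | 1 | 0.3333 | 0.3333 | 0.3333 |
| BLOCK | | 2 | 0.3333 | 0.3333 | 0.3333 |
| BLOCK | | 3 | 0.3333 | 0.3333 | 0.3333 |

(b) Trt least squares means

| Trt | Estimate | Standard error | DF | t-value | Pr > \|t\| |
|---|---|---|---|---|---|
| 1 | 49.0000 | 2.4683 | 4 | 19.85 | <0.0001 |
| 2 | 23.6667 | 2.4683 | 4 | 9.59 | 0.0007 |
| 3 | 41.6667 | 2.4683 | 4 | 16.88 | <0.0001 |

(c) Differences of Trt least squares means

| Trt | _Trt | Estimate | Standard error | DF | t-value | Pr > \|t\| |
|---|---|---|---|---|---|---|
| 1 | 2 | 25.3333 | 3.4907 | 4 | 7.26 | 0.0019 |
| 1 | 3 | 7.3333 | 3.4907 | 4 | 2.10 | 0.1036 |
| 2 | 3 | -18.0000 | 3.4907 | 4 | -5.16 | 0.0067 |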
The planned contrasts defined in matrix K′ and the F-tests obtained with the "contrast" statement produce the same results (part (c)).
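The planned comparisons of one treatment against the average of the other two can be evaluated directly from the treatment least squares means in Table 3.10 (part (b)). The Python sketch below is illustrative only: the dictionary labels and the helper `contrast_value` are assumed names, and the block terms are omitted because they cancel whenever the treatment coefficients sum to zero.

```python
# Treatment least squares means from Table 3.10 (part (b)).
ls_means = [49.0, 23.6667, 41.6667]

# Contrast coefficients on the treatment means (each row sums to zero).
contrasts = {
    "trt1 vs average of trt2 and trt3": [1.0, -0.5, -0.5],
    "trt2 vs average of trt1 and trt3": [-0.5, 1.0, -0.5],
}


def contrast_value(coefficients, means):
    """Linear combination of the means; the coefficients must sum to zero
    so that the intercept and block terms cancel."""
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefficients, means))


estimates = {label: contrast_value(coefs, ls_means)
             for label, coefs in contrasts.items()}

for label, value in estimates.items():
    print(label, "=", round(value, 4))
```

Multiplying a row of coefficients by a constant, as in the K′ matrix with integer entries, rescales the estimate but leaves the corresponding F-test unchanged.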
Now, the same dataset is fitted using the same predictor but assuming that the response variable is binomial. This analysis is intended to show the options available in the SAS statements for fitting non-normal responses, in this case binomial. Practically the same statements used in the previous program with normal data are used, but now some other options ("ilink," "oddsratio," or "exp") are illustrated, with details of the circumstances under which they should be used. This is because all estimable functions produce estimates on the model scale, and we must decide what conversions are necessary to obtain the results on the data scale. Below, the estimable functions and the appropriate conversion required to produce the results on the data scale are listed.
(a) Least squares means (“lsmeans”) for normal data and an inverse link (“ilink”) for
non-normal data
(b) Difference between pairs of treatment means of “lsmeans” for normal data and
“odds ratio” for non-normal data
(c) Estimation of the mean of a treatment (“estimate”) for normal data and an inverse
link (“ilink”) for non-normal data
(d) Estimation of a treatment i vs treatment i′: exponentiation (“exp”) (or odds ratio)
(e) Multiple estimates of treatment differences as “exp” (or odds ratio) for
non-normal data
(f) In “contrast estimation,” conversion to the data scale is not necessary, since it is
only an F-statistic test.
The following GLIMMIX program shows how to implement this model with a
binomial response.
/* RCBD with a binomial response (y events out of n trials) */
proc glimmix;
class trt block;
model y/n = trt block / solution oddsratio;
/* ilink adds the data-scale means reported in Table 3.14 */
lsmeans trt / diff e oddsratio ilink;
estimate 'lsm trt1' intercept 3 trt 3 0 0 block 1 1 1 / divisor=3 ilink;
estimate 'difference trt1 vs trt2' trt 1 -1 0 / exp;
estimate 'avg trt1&trt2&trt3' intercept 9 trt 3 3 3 block 3 3 3 / divisor=9;
/* exp converts each log odds ratio to an odds ratio */
estimate 'trt1 vs trt2' trt 1 -1 0 / exp;
estimate 'trt1 vs trt3' trt 1 0 -1 / exp;
estimate 'trt2 vs trt3' trt 0 1 -1 / exp;
estimate 'trt1 vs trt2' trt 1 -1 0,
         'trt1 vs trt3' trt 1 0 -1,
         'trt2 vs trt3' trt 0 1 -1 / exp adjust=sidak;
contrast 'trt1 vs trt2' trt 1 -1 0;
contrast 'trt1 vs trt3' trt 1 0 -1;
contrast 'trt2 vs trt3' trt 0 1 -1;
contrast 'trt1 vs average trt2,trt3' trt 2 -1 -1;
contrast 'trt2 vs average trt1,trt3' trt -1 2 -1;
contrast 'type 3 trt ss' trt 1 0 -1, trt 0 1 -1;
contrast 'type 3 trt test' trt 2 -1 -1, trt -1 2 -1;
run;
Part of the output is shown in Table 3.12. The estimated treatment and block parameters of the model are given in part (a) of Table 3.12; the last level of each of the two class effects was set to zero because the design matrix is not of full rank. Part (b) provides the type III tests of fixed effects and part (c) the odds ratio estimates. Note that $\hat{\sigma}^2$ does not appear in the output because the variance of the binomial distribution is not an independent parameter.
In Table 3.12 (parts (b) and (c)), which shows the type III tests of fixed effects as well as the odds ratios, it can be seen that the effect of treatments is significant but the effect of blocks is not, which indicates that it is valid to analyze these data using a completely randomized design. Two sets of odds ratios were estimated (part (c)): one for the treatment effects and the other for the block effects in the model. In the calculation of odds ratios, each level of a factor is generally compared with the last level of that same factor.
The values under "Estimate" in Table 3.13 (parts (a) and (b)), obtained with "estimate", are on the model scale, whereas the last column is obtained by exponentiation ("exp" option), $e^{\hat{\tau}_i - \hat{\tau}_{i'}}$.
The least squares means for treatments and the linear predictors of the treatment differences (parts (a) and (b) of Table 3.14, respectively) obtained with "lsmeans" are the values under the "Estimate" column. These, together with their corresponding standard errors, were obtained using the linear predictor
$$\hat{\eta}_{i\cdot} = \hat{\eta} + \hat{\tau}_i + \widehat{\overline{\text{block}}}_{\cdot}$$
These estimates are on the model scale, whereas the values under the "Mean" column and their respective standard errors were obtained by applying the inverse link to obtain the probability of success of each treatment ($\hat{\pi}_i$). For the treatment differences, the "Estimate" column contains the differences between linear predictors (log odds ratios), and the "oddsratio" option converts them to the data scale by exponentiation.
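The conversions behind Table 3.14 can be verified with a few lines of Python. The sketch below takes the model-scale estimates from that table, applies the inverse logit to the treatment predictors and the exponential to their differences, and then checks that the odds ratio computed from the probabilities agrees with the exponentiated difference up to rounding. Variable names are illustrative.

```python
import math


def inverse_logit(eta):
    """Inverse logit link: log odds to probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Odds of success for a probability p."""
    return p / (1.0 - p)


# Model-scale estimates from Table 3.14 (parts (a) and (b)).
lsmeans_eta = {1: 1.0174, 2: -0.6011, 3: 0.5077}
diff_eta = {(1, 2): 1.6184, (1, 3): 0.5097, (2, 3): -1.1088}

# Data scale: probabilities for the means, odds ratios for the differences.
probabilities = {trt: inverse_logit(eta) for trt, eta in lsmeans_eta.items()}
odds_ratios = {pair: math.exp(d) for pair, d in diff_eta.items()}

# The same odds ratios computed from the back-transformed probabilities.
odds_ratios_from_probs = {
    (i, k): odds(probabilities[i]) / odds(probabilities[k])
    for (i, k) in diff_eta
}

for pair in diff_eta:
    print("trt", pair[0], "vs trt", pair[1],
          "| exp(difference) =", round(odds_ratios[pair], 4),
          "| from probabilities =", round(odds_ratios_from_probs[pair], 4))
```

The two routes to the odds ratio agree because the logit is the log of the odds, so a difference of logits is exactly a log odds ratio.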
3.3 Fixed and Random Effects in the Inference Space
In practice, the random effects in a linear mixed model (LMM) should represent the
population from which the data were collected and should be included in studies as if
they came from a well-planned sample. In a model, random effects can be locations,
regions, states, blocks, and so on, and they have two very particular characteristics.
• Random effects represent the target population.
• Random effects have a probability distribution.
These two characteristics allow us to have a broad inference space in which we can calculate point estimates, construct interval estimates, and perform hypothesis tests that are applicable to the entire population represented by the random effects. Formally, an estimate or hypothesis test based on an LMM has a broad inference space, defined by the estimable function K′β, if Z is a matrix with all coefficients equal to zero; otherwise, the estimate or hypothesis test is defined by the predictable function K′β + Z′b, which is a specific (narrow) inference.

Table 3.13 Different estimates obtained with "estimate"

(a) Estimates

| Label | Estimate | Standard error | DF | t-value | Pr > \|t\| | Mean | Standard error Mean | Exponentiated estimate |
|---|---|---|---|---|---|---|---|---|
| LSM Trt1 | 1.0174 | 0.1603 | 4 | 6.35 | 0.0032 | 0.7345 | 0.03127 | . |
| Average Trt1&Trt2&Trt3 | 0.3080 | 0.08768 | 4 | 3.51 | 0.0246 | Non-est | | |
| Trt1 vs Trt2 | 1.6184 | 0.2181 | 4 | 7.42 | 0.0018 | Non-est | | 5.0452 |
| Trt1 vs Trt3 | 0.5097 | 0.2169 | 4 | 2.35 | 0.0785 | Non-est | | 1.6647 |
| Trt2 vs Trt3 | -1.1088 | 0.2079 | 4 | -5.33 | 0.0059 | Non-est | | 0.3300 |

(b) Estimate adjustment for multiplicity: Sidak

| Label | Estimate | Standard error | DF | t-value | Pr > \|t\| | Adj P | Exponentiated estimate |
|---|---|---|---|---|---|---|---|
| Trt1 vs Trt2 | 1.6184 | 0.2181 | 4 | 7.42 | 0.0018 | 0.0053 | 5.0452 |

Table 3.14 Estimated linear predictors for treatments and treatment differences with their respective inverse values

(a) Trt least squares means

| Trt | Estimate | Standard error | DF | t-value | Pr > \|t\| | Mean | Standard error Mean |
|---|---|---|---|---|---|---|---|
| 1 | 1.0174 | 0.1603 | 4 | 6.35 | 0.0032 | 0.7345 | 0.03127 |
| 2 | -0.6011 | 0.1480 | 4 | -4.06 | 0.0153 | 0.3541 | 0.03385 |
| 3 | 0.5077 | 0.1462 | 4 | 3.47 | 0.0255 | 0.6243 | 0.03429 |

(b) Differences of Trt least squares means

| Trt | _Trt | Estimate | Standard error | DF | t-value | Pr > \|t\| | Odds ratio |
|---|---|---|---|---|---|---|---|
| 1 | 2 | 1.6184 | 0.2181 | 4 | 7.42 | 0.0018 | 5.045 |
| 1 | 3 | 0.5097 | 0.2169 | 4 | 2.35 | 0.0785 | 1.665 |
| 2 | 3 | -1.1088 | 0.2079 | 4 | -5.33 | 0.0059 | 0.330 |
In Example 3.2, the response variable was modeled as a function of fixed effects due
to treatments and blocks, since block effects were also assumed to be fixed.
Now, suppose that the treatments were applied by three different people
(blocks). Treating the block effects as fixed would then be questionable,
since each person does the job according to their own experience, skill, and so forth.
Clearly, there is some variability between blocks that is not due to the treatments,
and this has to be accounted for, so the effects due to blocks must be considered random.
In this example, let us assume that the three blocks (persons) were randomly selected
from a population. Thus, the components of the model are defined as follows:
Distributions:
(a) $y_{ij} \mid block_j \sim N(\mu_{ij}, \sigma^2)$
(b) $block_j \sim N(0, \sigma^2_{block})$
proc glimmix;
  class trt block;
  model y = trt / solution;    /* fixed effects: treatments */
  random block / solution;     /* random effects: blocks */
  lsmeans trt / diff e;
  /* broad inference: zero coefficients on the three random block effects */
  estimate 'lsm trt1' intercept 1 trt 1 0 0 | block 0 0 0;
  estimate 'lsm trt2' intercept 1 trt 0 1 0 | block 0 0 0;
  estimate 'lsm trt3' intercept 1 trt 0 0 1 | block 0 0 0;
  /* specific inference (BLUPs): average over the three observed blocks */
  estimate 'blup trt1' intercept 3 trt 3 0 0 | block 1 1 1 / divisor=3;
  estimate 'blup trt2' intercept 3 trt 0 3 0 | block 1 1 1 / divisor=3;
  estimate 'blup trt3' intercept 3 trt 0 0 3 | block 1 1 1 / divisor=3;
run;
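In the previous SAS GLIMMIX code, the “estimate” statement lists the
coefficients associated with the fixed effects before the vertical bar (|) and the
coefficients for the random effects of the model after the vertical bar, that is:
$$K'\beta + Z'b = \underbrace{\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \eta \\ \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix}}_{\text{fixed effects}} + \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} block_1 \\ block_2 \\ block_3 \end{bmatrix}}_{\text{random effects}}$$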
Part of the output is shown in Table 3.15. Subsection (a) shows the estimated
variance components: the variance due to blocks is
$\hat{\sigma}^2_{block} = 11.2778$, whereas the mean squared error (MSE) for the
conditional observations is $\hat{\sigma}^2 = 18.2778$. On the other hand, the fixed
effects solution obtained with the
Table 3.15 Variance components, fixed effects, and fixed effects test
(a) Covariance parameter estimates
Cov Parm  Estimate  Standard error
BLOCK     11.2778   17.8966
Residual  18.2778   12.9243
(b) Solutions for fixed effects
Effect     Trt  Estimate  Standard error  DF  t-value  Pr > |t|
Intercept       41.6667   3.1388          2   13.27    0.0056
Trt        1    7.3333    3.4907          4   2.10     0.1036
Trt        2    -18.0000  3.4907          4   -5.16    0.0067
Trt        3    0         .               .   .        .
(c) Type III tests of fixed effects
Effect  Num DF  Den DF  F-value  Pr > F
Trt     2       4       27.89    0.0045
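The following Python sketch is an arithmetic check, not part of the original analysis. It assumes a balanced design with one observation per treatment in each of the three blocks and uses the usual closed-form standard errors for that layout to relate the variance components in Table 3.15 to the standard errors reported for the broad inference (“lsm”) and specific inference (“blup”) estimates in Table 3.16. The variable names are illustrative.

```python
import math

# Variance component estimates from Table 3.15(a).
sigma2_block = 11.2778
sigma2_resid = 18.2778
n_blocks = 3  # assumed: three blocks, one observation per treatment per block

# Fixed effects solutions from Table 3.15(b); treatment 3 is the reference level.
intercept, trt1, trt2 = 41.6667, 7.3333, -18.0000
lsm = {1: intercept + trt1, 2: intercept + trt2, 3: intercept}

# Broad inference: block variance contributes to the uncertainty of a treatment mean.
se_broad = math.sqrt((sigma2_block + sigma2_resid) / n_blocks)
# Specific inference (averaging over the observed blocks): only residual variance remains.
se_narrow = math.sqrt(sigma2_resid / n_blocks)
# Treatment differences: block effects cancel within blocks.
se_diff = math.sqrt(2.0 * sigma2_resid / n_blocks)

print({trt: round(value, 4) for trt, value in lsm.items()})
print(round(se_broad, 4), round(se_narrow, 4), round(se_diff, 4))
```

Under these assumptions, the three standard errors agree with the values 3.1388, 2.4683, and 3.4907 reported in Table 3.16, which illustrates why the specific inference is more precise than the broad one.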
3.4 Marginal and Conditional Models
The process of analyzing a dataset has two main objectives: the first is model
selection, which aims to find well-fitting parsimonious models for the responses
being measured, and the second is model prediction, where estimates from the
selected models are used to predict quantities of interest and their uncertainties.
The differences that may arise in this analysis process are mainly due to the
choice of unidentifiable constraints on random effects. To compare two different
models, we must compare analogous quantities. Different constraints can lead to
models that look extremely different but are inferentially identical. The conditional
model is regarded as the basic model, and any conditional model leads to a
specific marginal model. Lee and Nelder (2004) proposed and worked on
conditional models derived from hierarchical generalized linear models (HGLMs) and
marginal models derived from these conditional models. Marginal models have
often been fitted using generalized estimating equations (GEEs), the drawbacks of
which are also discussed.
Consider two models with a normal distribution: one is a random effects model
(a mixed model)
$$y_{ij} = \mu + \tau_i + b_j + \varepsilon_{ij}$$
and the other is a marginal model
$$E(y_{ij}) = \mu + \tau_i$$
where the elements in $V(y) = \Sigma$ are variances and covariances that have an arbitrary
correlation structure. Zeger et al. (1988) pointed out that given a marginal model, the
generalized estimating equations are consistent. An obvious advantage of using
random effects models is that they allow conditional inferences in addition to
marginal inferences (Robinson 1991). Using the model with random effects, we
can obtain not only the conditional mean
$$E(y_{ij} \mid b_j) = \mu + \tau_i + b_j$$
but also the marginal mean $\mu_{ij} = E(y_{ij}) = \mu + \tau_i$, whereas with the
marginal model, we can obtain only the marginal mean $\mu_{ij}$.

Table 3.16 Estimated means, best linear unbiased estimates (BLUEs), and BLUPs for treatment and the difference between two means
(a) Estimates
Label      Estimate  Standard error  DF  t-value  Pr > |t|
lsm trt1   49.0000   3.1388          4   15.61    <0.0001
lsm trt2   23.6667   3.1388          4   7.54     0.0017
lsm trt3   41.6667   3.1388          4   13.27    0.0002
blup trt1  49.0000   2.4683          4   19.85    <0.0001
blup trt2  23.6667   2.4683          4   9.59     0.0007
blup trt3  41.6667   2.4683          4   16.88    <0.0001
(b) Trt least squares means
Trt  Estimate  Standard error  DF  t-value  Pr > |t|
1    49.0000   3.1388          4   15.61    <0.0001
2    23.6667   3.1388          4   7.54     0.0017
3    41.6667   3.1388          4   13.27    0.0002
(c) Differences of Trt least squares means
Trt  _Trt  Estimate  Standard error  DF  t-value  Pr > |t|
1    2     25.3333   3.4907          4   7.26     0.0019
1    3     7.3333    3.4907          4   2.10     0.1036
2    3     -18.0000  3.4907          4   -5.16    0.0067
It may be reasonable to assume that the unobservable random effects of blocks
($b_j$) follow a certain distribution. However, the center of this
distribution cannot be identified because it is confounded with the intercept.
Therefore, in the random effects model, we impose the unidentifiable constraints
$E(b_j) = 0$ and $E(\varepsilon_{ij}) = 0$, as we do for error terms in linear models.
In the mixed model, these restrictions are $\sum_j \hat{b}_j = 0$ and
$\sum_j \hat{\varepsilon}_{ij} = 0$ in any estimation procedure. First, we
consider the case in which the data follow a normal distribution. We then briefly
discuss how the results differ for data with a non-normal distribution.

Table 3.17 Mortality of coffee seedling clones (C) in different substrates (S)
Block  S  C  Mortality  Pct     |  Block  S  C  Mortality  Pct
1      3  1  3.33       0.0333  |  3      3  1  6.6        0.066
1      3  3  16         0.16    |  3      3  2  10         0.1
1      3  2  16         0.16    |  3      3  3  56.6       0.566
1      1  1  3.33       0.0333  |  3      2  1  3.3        0.033
1      1  3  6.6        0.066   |  3      2  2  26.6       0.266
1      1  2  3.3        0.033   |  3      2  3  40         0.4
1      2  1  10         0.1     |  3      4  1  3.3        0.033
1      2  3  3.33       0.0333  |  3      4  2  46         0.46
1      2  2  3.33       0.0333  |  3      4  3  33.3       0.333
1      4  1  3.33       0.0333  |  3      1  1  6.6        0.066
1      4  3  16         0.16    |  3      1  2  43.3       0.433
1      4  2  13.3       0.133   |  3      1  3  50         0.5
2      4  3  3.3        0.033   |  4      4  1  33         0.33
2      4  1  3.3        0.033   |  4      4  2  10         0.1
2      4  2  20         0.2     |  4      4  3  23.3       0.233
2      1  3  10         0.1     |  4      2  3  50         0.5
2      1  1  3.33       0.0333  |  4      1  2  23.3       0.233
2      1  2  6.6        0.066   |  4      1  3  6.6        0.066
2      2  3  36.6       0.366   |  4      2  1  16         0.16
2      2  1  26.6       0.266   |  4      2  2  10         0.1
2      2  2  43.3       0.433   |  4      2  3  16         0.16
2      3  3  3.3        0.033   |
Example The effect of different substrates (factor A), i.e., three substrates made
from vermicompost and one from compost, on the development of physiological
variables and mortality of cuttings of three clones (factor B) of robusta coffee
(Coffea canephora p.) was evaluated. The levels of factor A are randomly assigned
to rows in each block, with the following restriction: each block receives levels A1,
A2, A3, and A4 and each level of factor B (B1, B2, and B3) is randomly assigned to
each level of factor A in each block. The data for this experiment are tabulated in
Table 3.17.
Note that while there are two randomization processes, there are effectively three
sizes of experimental units: rows for A levels, columns for B levels, and row–column
intersections for A × B combinations. Thus, the experimental design used was a
randomized complete block design with a strip-plot treatment arrangement.
The model for this design is
$$y_{ijk} = \mu + b_k + \alpha_i + (\alpha b)_{ik} + \beta_j + (\beta b)_{jk} + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
where $y_{ijk}$ is the response observed in the kth block at the ith level of factor A
and at the jth level of factor B, $\mu$ is the overall mean, $b_k$ is the random effect
due to blocks assuming $b_k \sim N(0, \sigma^2_b)$, $\alpha_i$ is the fixed effect due to
substrate type (S), $(\alpha b)_{ik}$ is the random effect due to the interaction of a
substrate with blocks assuming $(\alpha b)_{ik} \sim N(0, \sigma^2_{\alpha b})$,
$\beta_j$ is the fixed effect due to the coffee clone type (C), $(\beta b)_{jk}$ is the
random effect due to the interaction of a coffee clone with blocks assuming
$(\beta b)_{jk} \sim N(0, \sigma^2_{\beta b})$, $(\alpha\beta)_{ij}$ is the interaction
fixed effect between a substrate and a coffee clone, and $\varepsilon_{ijk}$ is the
normal random error, $\varepsilon_{ijk} \sim N(0, \sigma^2)$. The components of the model
for this dataset are as follows:
Linear predictor: $\eta_{ijk} = \mu + b_k + \alpha_i + (\alpha b)_{ik} + \beta_j + (\beta b)_{jk} + (\alpha\beta)_{ij}$
Distributions: $y_{ijk} \mid b_k, (\alpha b)_{ik}, (\beta b)_{jk} \sim N(\mu_{ijk}, \sigma^2)$
proc glimmix;
  class block s c;
  model y = s|c;
  /* random effects: block, block x substrate, and block x clone */
  random intercept s c / subject=block;
  lsmeans s*c / slicediff=s;
run;
Part of the results of this analysis is shown below. The estimated variance
components for blocks, block × substrate, and block × clone, and the MSE are
$\hat{\sigma}^2_b = 23.4714$, $\hat{\sigma}^2_{\alpha b} = 35.4995$,
$\hat{\sigma}^2_{\beta b} = 67.0160$, and $\hat{\sigma}^2 = MSE = 139.58$, respectively,
which are listed in part (a) of Table 3.18. However, the fixed effects tests for both
factors and the interaction (part (b)) are not statistically significant.
According to the “slicediff = s” option in the “lsmeans” statement, Table 3.19
shows the simple effects of each substrate level at varying clone levels.
Example Using the data in Table 3.17 but under a beta distribution, the components
of the GLMM change slightly:
Distributions: $y_{ijk} \mid b_k, (\alpha b)_{ik}, (\beta b)_{jk} \sim Beta(\mu_{ijk}, \phi)$, where $\phi$ is the scale parameter.
Some of the SAS output from this analysis is shown below. The variance
components estimated for blocks, block × substrate, and block × clone, and the scale
parameter are $\hat{\sigma}^2_b = 0.06723$, $\hat{\sigma}^2_{\alpha b} = 0.1594$,
$\hat{\sigma}^2_{\beta b} = 0.1932$, and $\hat{\phi} = 16.6041$, respectively, which are
listed in part (a) of Table 3.20. However, the
fixed effects tests for both factors and the interaction (part (b)) are not statistically
significant. In contrast to the normal model of the previous example, the variance
components (multiplied by 100) under the beta distribution are smaller, and the type III
fixed effects tests are closer to being significant.
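To make the role of the scale parameter concrete, the Python sketch below converts a mean and a scale parameter into the two shape parameters of a beta distribution and computes the implied variance. It assumes the mean/scale parameterization in which the variance is μ(1 − μ)/(1 + ϕ); the text above does not state the parameterization, so this is an assumption, and the mean value 0.10 is purely hypothetical.

```python
def beta_shapes(mu, phi):
    """Shape parameters (a, b) of a beta distribution with mean mu and scale phi."""
    return mu * phi, (1.0 - mu) * phi

def beta_variance(mu, phi):
    """Variance under the assumed mean/scale parameterization."""
    return mu * (1.0 - mu) / (1.0 + phi)

phi_hat = 16.6041   # scale parameter estimate reported in the text
mu_example = 0.10   # hypothetical mean mortality proportion, for illustration only

a, b = beta_shapes(mu_example, phi_hat)
# Textbook variance of Beta(a, b), used as a consistency check.
variance_from_shapes = a * b / ((a + b) ** 2 * (a + b + 1.0))

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"variance = {beta_variance(mu_example, phi_hat):.6f}")
```

A larger ϕ gives a smaller variance for the same mean, which is why ϕ is read as a precision-like scale parameter under this parameterization.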
Table 3.21 shows, for each substrate level at varying clone levels, the estimates
(linear predictors) of the simple effects. These effects differ from the previous
results, but this is mainly because in a GLMM, these values correspond to the linear
predictors estimated at the model scale and not to the estimated means at the data
scale.
3.5 Exercises
Exercise 3.5.1 The data in Table 3.22 show the yield of five barley
varieties in a randomized complete block experiment conducted in Minnesota
(Immer et al. 1934).
• Write a complete description of the statistical model associated with this study
and the assumptions of this model.
• Compute the ANOVA for the design model according to part (a) and determine
whether there is a significant difference in the varieties.
• Use the least significant difference (LSD) method to make pairwise
comparisons of variety mean yields.
(c) Analyze the data as a randomized complete block design, where the number of
trials represents a blocking factor.
(d) Is there any difference in the results obtained in (a) and (b)? If so, explain what
might be the cause of the difference in results and what method would you
recommend?
Chapter 4
Generalized Linear Mixed Models
for Non-normal Responses
4.1 Introduction
Generalized linear mixed models (GLMMs) have been recognized as one of the
major methodological developments in recent years, which is evidenced by the
increased use of such sophisticated statistical tools with broader applicability and
flexibility. This family of models can be applied to a wide range of different data
types (continuous, categorical (nominal or ordinal), percentages, and counts), and
each is appropriate for a specific type of data. This modern methodology allows data
to be described through a distribution of the exponential family that best fits the
response variable. These complex models were not computationally possible up until
recently when advances in statistical software have allowed users to apply GLMMs
(Zuur et al. 2009; Stroup 2012; Zuur et al. 2013). Researchers in fields other than
statistical science are also interested in modeling the structure of data. For example,
in the social sciences there have been applications in the field of education when
several tests are applied to students; in longitudinal personality studies when the
occurrence of an emotion is repeatedly observed over time over a set of people; and
in surveys to investigate the political preference of a population, among others.
Likewise, agriculture and the life sciences are other major areas in which the
response variables depart from the classical methodology based on “normality”
used to model or describe a dataset, because the data generally
fall within the nominal, ordinal, or interval (continuous) scales of measurement. In a
GLMM, the response itself does not undergo any transformation; instead, a function of
its expected value is modeled through a linear relationship
with the explanatory variables. GLMMs are a powerful tool that allows proper modeling of
variation between groups and across space and time, leading to accuracy in the
modeling of the observed data as well as in the estimation of variance components.
$$y = X\beta + Zb + \varepsilon \qquad (4.1)$$
Many datasets in the agricultural, biological, and social sciences fall outside the
scope of the traditional methods taught in introductory statistics and statistical
methods courses. Often, these data (response variables) are: (a) binary (the presence or
absence of a trait of interest, success or failure, the infection status of an individual,
or the expression of a genetic disorder); (b) proportions (the ratio of females to
males, infection or mortality rates within a group of individuals); or (c) counts (the
number of emerging seedlings, the number of sprouts, etc.), for which basic statistical
methods attempt to quantify the effects of each predictor variable. However,
such studies often involve random effects, the purpose of which is to
quantify variation among individuals or units. The most common random effects are
blocks in experimental or observational studies that are replicated across sites
(locations or environments) or over time. Random effects also encompass variations
4.3 Generalized Linear Mixed Models
$$E(y \mid b) = g^{-1}(\eta) = g^{-1}(X\beta + Zb),$$
where $g^{-1}(\cdot)$ is the inverse link and the other terms have already been mentioned
earlier. The fixed and random effects are combined to form the conditional linear
predictor
$$\eta = X\beta + Zb.$$
The relationship between the linear predictor and the vector of observations is
modeled as follows:
$$y \mid b \sim \left(g^{-1}(\eta), R\right) \qquad (4.4)$$
The above notation (4.4) expresses that the conditional distribution of $y$ given $b$ has
mean $g^{-1}(\eta)$ and variance $R$. Note that instead of specifying the distribution for $y$ as
in the case of a GLM, we specify a distribution for the conditional response $y \mid b$.
The variance and covariance matrix for the observations is given by:
$$Var(y \mid b) = A^{1/2} R A^{1/2}$$
where matrix $A$ is a diagonal matrix containing the variance functions of the model.
GLMMs cover an important group of statistical models, such as:
(a) Linear models (LMs): absence of random effects, identity link function and the
assumption of a normal distribution.
(b) Generalized linear models (GLMs): random effects are absent, link function is
different from the identity function, and the response variables are non-normally
distributed.
(c) Linear mixed models (LMMs): presence of random effects, identity link function
and normal distribution assumed for the response variable.
GLMMs have been formulated to correct the shortcomings of LMMs, as there are
many cases where the assumptions made in linear mixed models are inadequate.
First, an LMM assumes that the relationship between the mean of the dependent
variable (y) and the fixed and random effects (β, b) can be modeled through a linear
function. This assumption is questionable, for example, when a researcher wishes to model
the incidence of a disease or the success or failure of an event.
The second assumption of an LMM is that the variance is not a function of the mean
and that the random effects follow a normal distribution. The assumption of constant
variance is not met when the response variable is binary (1, 0). In this case, the
variance is π(1 - π), which is a function of the mean. The response is a random variable
that can take only two values (0, 1); in contrast, a normal random variable can take any
real value. Finally, the predictions from an LMM can take any real value, whereas
the predictions for a binary variable are bounded in the interval (0, 1), since they are
probabilities and can be neither negative nor greater than 1.
Historically, a number of options have been used to work around some
LMM problems, even though their use is not the most appropriate. These include
applying logarithmic transformations ($\log(y)$), square root transformations
($\sqrt{y}$), arcsine transformations ($\sin^{-1}(y)$), and so on. After transforming,
a linear mixed model is fitted even though the analyst is aware that the response
variable does not satisfy the assumption of normality, so the model is
not the most accurate one. These options are attractive because they are relatively
simple and easy to implement using the LMM machinery. However, they sidestep
the problem that a linear mixed model is not the best model for analyzing such data.
In a GLMM, the canonical link function maps the conditional mean of the data to the
linear predictor of the model, $g(\mu) = \eta = X\beta + Zb$. This linear predictor can be
transformed back to the scale of the observed data through an inverse link function. In
other words, the inverse link function is used to map the value of the linear predictor
for the ith observation, $\eta_i$, to the conditional mean at the data scale, $\mu_i$.
For example, suppose that we are conducting
an experiment in which we are assessing the number of undesirable weeds observed
in a crop of interest after the application of a certain number of treatments; the
response variable is assumed to have a Poisson distribution with a mean $\lambda_{ij}$,
the linear predictor of which is given by
$$\eta_{ij} = \eta + \tau_i + b_j$$
where $\eta$ is the intercept, $\tau_i$ is the fixed effect due to treatments, and $b_j$
is the random effect assuming $b_j \sim N(0, \sigma^2_b)$.
To obtain the inverse function of the following predictor
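As a hedged illustration of this step, and assuming the canonical log link for the Poisson distribution (the link is not stated explicitly at this point in the text), the linear predictor and its inverse can be written as:

```latex
\eta_{ij} = g(\lambda_{ij}) = \log(\lambda_{ij}) = \eta + \tau_i + b_j
\quad\Longrightarrow\quad
\lambda_{ij} = g^{-1}(\eta_{ij}) = \exp(\eta + \tau_i + b_j)
```

Under this assumption, every value of the linear predictor on the real line is mapped to a strictly positive conditional mean count.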
The variance function is used to model the non-constant variability of the
phenomenon under study. With GLMMs, the residual variability arises from two sources,
namely, the variability of the distribution of sampling units in an experimental
arrangement (blocks, plots, locations, etc.) and the variability due to overdispersion.
Overdispersion can be modeled in several ways. When dealing with a GLMM, the
scale parameter or dispersion parameter $\phi$ is extremely important since it can
either increase or decrease the variance in the model for each observation.
If overdispersion exists, one way to account for it is to add a random effect for each
observation (in SAS, _residual_) to the linear predictor. Another alternative is to use
another distribution to model the dataset; for example, the two-parameter negative
binomial (NB) distribution ($\lambda_{ij}$, $\phi$) instead of the single-parameter
Poisson distribution ($\lambda_{ij}$) in the case of count data.
4.6 Specification of a GLMM
A GLMM is composed of three parts: (1) fixed effects that convey systematic and
structural differences in responses; (2) random effects that convey stochastic
differences between blocks or other random factors, as these effects allow generalizations
to the population from which the sampling units have been (randomly) sampled; and
(3) the distribution of errors. Thus, a complete definition of a GLMM is as follows:
$$y \mid b \sim f(\mu), \qquad g(\mu) = X\beta + Zb$$
where the distribution function f(∙) is a member of the exponential family, g(μ) is the
link function, X and Z are the design matrices, and β and b are the unknown
parameters for fixed and random effects, respectively.
When fitting a GLMM, the data remain on the original measurement scale (data
scale). However, when means are estimated from a linear function of the explanatory
variables (the predictor), these means are on the model scale. A link function is used
to link the model scale back to the original data scale. This is not the same as
transforming the original measurements to a different measurement scale. For
example, applying the log transformation for counts followed by an analysis of
variance (ANOVA) under a normal distribution is not the same as fitting a general-
ized linear model, assuming a Poisson distribution and using a log link (Gbur et al.
2012). In the first case, the least squares means would normally be equal to the
arithmetic means of the transformed values, whereas in the second case, the means are
obtained by applying the inverse link to the model-scale estimates, and they
may not be equal to the arithmetic means of the original sample.
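The distinction can be seen in a small numerical sketch. The Python code below uses a hypothetical set of five counts (not taken from the book) and no random effects. Averaging the log-transformed counts and back-transforming gives the geometric mean, whereas a log link models the logarithm of the mean; for an intercept-only Poisson model, the maximum likelihood estimate of the mean is the arithmetic mean.

```python
import math

counts = [1, 2, 4, 8, 16]  # hypothetical counts, for illustration only

# Transform-then-analyze: average log(y), then back-transform.
mean_of_logs = sum(math.log(y) for y in counts) / len(counts)
back_transformed_mean = math.exp(mean_of_logs)  # this is the geometric mean

# Log link: the model-scale estimate is log(mean); the inverse link returns the mean.
arithmetic_mean = sum(counts) / len(counts)
model_scale_estimate = math.log(arithmetic_mean)
data_scale_mean = math.exp(model_scale_estimate)

print(f"back-transformed mean of log counts: {back_transformed_mean:.2f}")
print(f"inverse-linked mean under a log link: {data_scale_mean:.2f}")
```

Here the two approaches give 4.00 and 6.20, respectively, so transforming the response and using a link function answer different questions even on the same data.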
The distribution specifications in “proc GLIMMIX” have default link functions,
but it is always highly recommended to explicitly code the link function, since for
some type of response variable, more than one alternative exists. This way, there is
no doubt that an appropriate function was used. Using the wrong link function will
lead to totally meaningless and incorrect results. Table 4.1 shows some common
distributions, the appropriate link function, and the proper syntax for each.
For a complete list, see the online Statistical Analysis Software (SAS/STAT)
documentation for PROC GLIMMIX.
4.7 Estimation of the Dispersion Parameter
The overall measures of fit compare the observed values of the response variable
with fitted (predicted) values. The dispersion parameter is unknown and therefore
must be estimated. There are two methods for estimating the overdispersion
parameter. McCullagh (1983) proposed estimating overdispersion as follows:
$$\hat{\phi} = \frac{(y - \mu)' V_\mu^{-1} (y - \mu)}{N - p} = \frac{\text{Pearson's } \chi^2}{N - p}$$
where $V_\mu$ is the diagonal matrix of the variance functions and $N - p$ is the degrees
of freedom for lack of fit. Later, McCullagh and Nelder (1989) suggested using the
deviance:
$$\hat{\phi} = \frac{\text{Deviance}}{N - p}$$
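The Pearson-based estimator can be sketched in a few lines of Python. The observed counts, fitted means, and number of parameters below are hypothetical, and the variance function V(μ) = μ assumes a Poisson response; the function name `pearson_dispersion` is introduced here only for illustration.

```python
def pearson_dispersion(y, mu, n_params, variance=lambda m: m):
    """Pearson chi-square divided by the residual degrees of freedom N - p.

    `variance` is the variance function of the assumed distribution;
    the default V(mu) = mu corresponds to a Poisson response.
    """
    chi_square = sum((yi - mi) ** 2 / variance(mi) for yi, mi in zip(y, mu))
    return chi_square / (len(y) - n_params)

# Hypothetical observed counts and fitted means, for illustration only.
observed = [2, 5, 9, 1, 14, 30]
fitted = [3.0, 4.0, 8.0, 2.0, 12.0, 25.0]

phi_hat = pearson_dispersion(observed, fitted, n_params=2)
print(f"estimated dispersion: {phi_hat:.4f}")
```

Values well above 1 would point to overdispersion relative to the assumed variance function, and values near 1 would be consistent with it.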
Deviance is a global fit statistic that also compares fitted and observed values;
however, its exact form depends on the likelihood function of the random
component of the model. Deviance compares the maximum value of the likelihood
function of a model, say M1, with the maximum possible value of the likelihood
function, which is attained by a saturated model, say M2. The saturated model
has as many parameters as there are observations, so it reproduces the data and
gives the highest possible value for the likelihood.
If the overdispersion parameter is significantly greater than one, this indicates that
overdispersion exists; in other words, it indicates that the variance is greater than the
mean. Therefore, the parameter should be used to adjust the variance. If
overdispersion is not taken into account, inflated test statistics may be generated.
However, when the dispersion parameter is less than 1, the test statistics are more
conservative, which is not considered a big problem.
The following example is intended to show how GLIMMIX in SAS estimates the
dispersion parameter in a GLMM.
Example An agronomist wants to test the effectiveness of a new herbicide offered
on the market (we will denote this as herb_N) and compare it with the herbicide that
has been used for several cycles (herb_C). The experimental arrangement used was a
randomized complete block design as shown below (Table 4.2).
The components of a GLMM with a Poisson response variable are listed below:
This model assumes that the slopes are the same for each herbicide. The following
SAS code is used for the proposed model:
Explanation The “method =” option is used to specify the estimation method, that is,
how the logarithm of the likelihood function is approximated and optimized. In
“proc GLIMMIX,” there are two popular methods, adaptive quadrature (quad) and Laplace
(laplace), which are the preferred methods for categorical response variables. Both of
these methods fit a conditional model. When the quadrature method is used
(method = quad), subjects (individuals) must be declared in the random statement (e.g.,
for the above program, “random intercept/subject=block”). In addition, processing
random effects by subject is more efficient than using the syntax “random block”.
The “dist” option is where you specify the probability distribution that is appropriate
for the type of response; in this case, it is the Poisson distribution. The “link” option
is for specifying the link function of the distribution. The “ddfm” option is omitted
so that GLIMMIX uses its default method for calculating the denominator
degrees of freedom for the fixed effects tests. The “ilink”
option converts the estimates of the treatment means (lsmeans) from the model scale to
the data scale. Finally, “proc GLIMMIX” supports the “lines” option, which adds
letter groupings to the mean comparisons produced by “lsmeans.”
The most relevant parts of the SAS output, for the purposes of what we want to
show, are shown in Tables 4.3 and 4.4. The fit statistics of the fitted model are shown
in part (a) and part (b) of Table 4.3. The -2 log likelihood statistic is extremely
useful for comparing nested models, whereas the different versions of information
criteria that exist, such as Akaike’s information criterion (AIC), Akaike’s information
criterion with small sample bias correction (AICC), Bayesian information criterion
(BIC), Bozdogan Akaike’s information criterion (CAIC), and Hannan and Quinn
information criterion (HQIC), are useful when comparing models that are not
necessarily nested (part (a)). The table of fit statistics for the conditional distribution
(part (b)) shows the sum of the independent contributions to the conditional -2 log
likelihood, the value of which is 139.03, whereas the value of Pearson’s statistic
divided by the degrees of freedom for the conditional distribution (Pearson’s
chi-square/DF) is 4.85.

Table 4.3 Fit statistics and variance components
(a) Fit statistics (Akaike’s information criterion (AIC), a small sample bias corrected Akaike’s information criterion (AICC), Bozdogan Akaike’s information criterion (CAIC), Schwarz’s Bayesian information criterion (BIC), Hannan and Quinn information criterion (HQIC))
-2 Log likelihood         175.35
AIC (smaller is better)   181.35
AICC (smaller is better)  183.35
BIC (smaller is better)   181.59
CAIC (smaller is better)  184.59
HQIC (smaller is better)  179.74
(b) Fit statistics for conditional distribution
-2 Log L (count | r. effects)                 139.03
Pearson’s chi-square                          77.56
Pearson’s chi-square/degree of freedom (DF)   4.85
(c) Covariance parameter estimates
Cov Parm  Estimate  Standard error
Block     1.5590    0.8690

Table 4.4 Type III fixed effects tests and estimated least squares means
(a) Type III tests of fixed effects
Effect     Num DF  Den DF  F-value  Pr > F
Herbicide  1       7       101.34   <0.0001
(b) Trts least squares means
Trts    Estimate  Standard error  DF  t-value  Pr > |t|  Mean     Standard error mean
Herb_C  1.4604    0.4696          7   3.11     0.0171    4.3076   2.0227
Herb_N  2.8947    0.4561          7   6.35     0.0004    18.0778  8.2447
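The information criteria in part (a) of Table 4.3 are all penalized versions of the -2 log likelihood. The Python sketch below uses the commonly cited forms of these criteria and assumes d = 3 estimated parameters, n = 16 observations, and s = 8 subjects (blocks); these three values are not stated in the text and were chosen because they are consistent with the printed output. The function name `information_criteria` is illustrative.

```python
import math

def information_criteria(neg2_loglik, d, n, s):
    """Penalized fit statistics from -2 log likelihood.

    d: number of estimated parameters (assumed)
    n: number of observations, used in the AICC correction (assumed)
    s: number of subjects, used in BIC, CAIC, and HQIC (assumed)
    """
    return {
        "AIC": neg2_loglik + 2 * d,
        "AICC": neg2_loglik + 2 * d * n / (n - d - 1),
        "BIC": neg2_loglik + d * math.log(s),
        "CAIC": neg2_loglik + d * (math.log(s) + 1),
        "HQIC": neg2_loglik + 2 * d * math.log(math.log(s)),
    }

criteria = information_criteria(175.35, d=3, n=16, s=8)
for name, value in criteria.items():
    print(f"{name}: {value:.2f}")
```

With these assumed inputs, the computed values agree with the entries of Table 4.3(a) to two decimal places, which shows that the criteria differ only in how heavily they penalize the number of parameters.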
The estimated dispersion parameter ($\hat{\phi}$ = Pearson’s chi-square/DF) has a value
far from 1; in this case, it is $\hat{\phi} = 4.85$, which indicates that there is strong
overdispersion. This may be because the specified distribution of the data is not
appropriate, the counts are too small, or the variance function was not correctly
specified. The estimate of the variance component due to blocks is tabulated in part
(c) of Table 4.3, the estimated value of which is $\hat{\sigma}^2_{block} = 1.559$.
The fixed effects test and least squares means are shown in Table 4.4. The type III
fixed effects tests indicate that there is a highly significant difference (part (a)) in the
effectiveness of herbicides in weed suppression; the estimated means with their respec-
tive standard errors are tabulated under the “Mean” column (part (b)). The “Estimate”
column containing the estimates of the means of lsmeans is on the model scale. They
are derived from the log likelihood function. SAS always lists the means obtained with
lsmeans from the model scale when creating least squares means test tables. The
“Mean” column has been converted back to the data scale using the “ilink” inverse
link function. These values are estimates of the average counts for each treatment level
(in this case, the herbicide type) on the data scale. When we report the results, we
should report these data-scale means (the values in the “Mean” column) in place of the
corresponding model-scale least squares means in the test tables.
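The conversion from the “Estimate” column to the “Mean” column can be reproduced by hand. The Python sketch below assumes a log link and approximates the data-scale standard error with the delta method (mean multiplied by the model-scale standard error); the delta method is an assumption here that happens to agree with the printed values, and the function name `ilink_log` is illustrative.

```python
import math

def ilink_log(estimate, std_err):
    """Back-transform a log-link estimate and approximate its standard error.

    The data-scale standard error uses the delta method: exp(eta) * SE(eta).
    """
    mean = math.exp(estimate)
    return mean, mean * std_err

# Model-scale least squares means and standard errors from Table 4.4(b).
model_scale = {"Herb_C": (1.4604, 0.4696), "Herb_N": (2.8947, 0.4561)}

data_scale = {trt: ilink_log(est, se) for trt, (est, se) in model_scale.items()}
for trt, (mean, se_mean) in data_scale.items():
    print(f"{trt}: mean = {mean:.4f}, standard error = {se_mean:.4f}")
```

Up to rounding of the printed estimates, the results match the “Mean” and “Standard error mean” columns of Table 4.4.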
Since there is strong overdispersion ($\hat{\phi} > 1$), assuming that the data have a
Poisson distribution is risky because this distribution implies that the mean and
variance are equal. A useful alternative distribution might be a
negative binomial distribution; this distribution has a mean $\lambda$ and variance
$\lambda + \phi\lambda^2$, with $\phi > 0$ commonly known as the scale parameter.
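The effect of the extra parameter can be illustrated numerically. In the Python sketch below, the mean and the scale parameter are hypothetical values chosen for easy arithmetic; they are not estimates from this example.

```python
def poisson_variance(lam):
    """Poisson variance: equal to the mean."""
    return lam

def negative_binomial_variance(lam, phi):
    """Negative binomial variance: mean plus phi times the squared mean."""
    return lam + phi * lam ** 2

lam = 4.0   # hypothetical mean count
phi = 0.5   # hypothetical scale parameter

var_poisson = poisson_variance(lam)
var_nb = negative_binomial_variance(lam, phi)
print(f"Poisson variance: {var_poisson}, negative binomial variance: {var_nb}")
```

For these values, the negative binomial variance (12.0) is three times the Poisson variance (4.0), and the gap grows with the mean, which is how the distribution absorbs overdispersion.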
The following is the specification of the components of a GLMM with a negative
binomial (NB) response variable:
Part of the output is shown in Table 4.5. The fit statistics for the model compar-
ison (part (a)) and that for the conditional distribution (part (b)) are both provided by
the GLIMMIX procedure when a conditional distribution is specified. Since in the
previous analysis, it was observed that overdispersion exists when assuming a
Poisson distribution, the results – under a negative binomial distribution – indicate
that this overdispersion problem no longer exists; i.e., the binomial distribution is no
4.8 Estimation and Inference in Generalized Linear Mixed Models
4.8.1 Estimation
the squares that produce the same results as an ML estimation. However, this
equivalence is not obtained in models with more complex structures such as
LMMs or GLMMs. To find the ML estimators, in GLMMs, one must integrate
over all possible values of the random effects. For GLMMs, this computation is at
best slow and at worst (a large number of random effects) computationally
infeasible.
Statisticians have proposed several ways to approximate the parameter estimates
of a GLMM, including penalized quasi-likelihood (PQL) and pseudo-likelihood
methods (Schall 1991; Wolfinger and O'Connell 1993; Breslow and Clayton
1993), Laplace approximations (Raudenbush et al. 2000), Gauss–Hermite quadrature
(Pinheiro and Chao 2006), and Bayesian methods based on Markov chain
Monte Carlo (Gilks et al. 1996). In all these approaches, researchers must distinguish
between a standard ML estimation, which estimates the standard deviations of the
random effects assuming that the fixed effects estimates are precisely correct, and
restricted maximum likelihood (REML), a variant that averages over the uncertainty
in the fixed effects parameters (Pinheiro and Bates 2000; Littell et al. 2006).
The ML method underestimates the standard deviations of random effects, except
in extremely large datasets, but it is most useful for comparing models with different
fixed effects. Pseudo- and quasi-likelihood methods are the simplest and the most
widely used in approximating a GLMM. They are widely implemented in statistical
packages that promote the use of GLMMs in many areas of ecology, biology, and
quantitative and evolutionary genetics (Breslow 2004). Unfortunately, pseudo- and
quasi-likelihood methods produce biases in parameter estimation if the standard
deviations of the random effects are large, especially when using binary data
(Rodriguez and Goldman 2001; Goldstein and Rasbash 1996). Lee and Nelder
(2001) have implemented several improvements to the PQL version, but these are
not available in most common statistical software packages. As a rule of thumb, PQL
performs poorly for Poisson data when the average number of counts per treatment
combination is less than five or for binomial data when the expected numbers of
successes and failures for each observation are less than five (Breslow 2004).
Another disadvantage of PQL is that it calculates a quasi-likelihood rather than the
true likelihood. Because of this, many statisticians believe that PQL-based methods
should not be used for inference.
There are two more accurate approximations available, which also reduce bias.
One is the Laplace approximation (Raudenbush et al. 2000), which approximates the
true likelihood of a GLMM instead of a quasi-likelihood, allowing the maximum
likelihood method in the GLMM inference process. The other approach is called
Gauss–Hermite quadrature (Pinheiro and Chao 2006), which is more accurate than
the Laplace approximation but is slower (requires more computational resources).
Therefore, the procedures for parameter estimation of a GLMM that are approxima-
tions are as follows:
The penalized quasi-likelihood method performs the estimation process by alternat-
ing between (1) estimating the fixed parameters by fitting a GLM with a variance–
covariance matrix based on an LMM fit and (2) estimating the variances and
covariances by fitting an LMM with unequal variances calculated from the previous
GLM fit. Pseudo-likelihood, a close cousin of PQL, estimates
variances differently and estimates a scale parameter to account for
overdispersion (some authors use these terms interchangeably). In summary,
GLMMs require an iterative process in parameter estimation. Two categories of
iterative procedures are used by SAS: linearization and integral approximation.
The GLIMMIX procedure uses the pseudo-likelihood method in linearization,
and integral approximation uses the Laplace approximation or adaptive methods
such as Gauss–Hermite quadrature. These methods maximize the log likelihood
of the exponential distribution family, i.e., non-normal distributions. The pseudo-
likelihood method is the default procedure in the GLIMMIX procedure (Proc
GLIMMIX). The Laplace method and quadrature both approximate the
maximum likelihood, but the Laplace method is computationally simpler than
quadrature and also provides excellent estimates.
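To make the idea of integrating over the random effects concrete, here is a minimal Python sketch that computes the marginal likelihood of the counts from a single block under a Poisson model with a random intercept. It uses a fixed (non-adaptive) 5-point Gauss-Hermite rule and compares it with a brute-force trapezoidal integral. The counts, the value of eta, and sigma are hypothetical, and this is not how PROC GLIMMIX is implemented (its quadrature is adaptive); it only illustrates the integral being approximated.

```python
import math

# Standard 5-point Gauss-Hermite rule for integrals of the form
#   integral of exp(-x^2) * f(x) dx  ~  sum_i w_i * f(x_i)
GH_NODES = [-2.0201828705, -0.9585724646, 0.0, 0.9585724646, 2.0201828705]
GH_WEIGHTS = [0.0199532421, 0.3936193232, 0.9453087205, 0.3936193232, 0.0199532421]


def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass function."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def conditional_likelihood(counts, eta, b):
    """Likelihood of one block's counts given its random intercept b."""
    lam = math.exp(eta + b)  # log link: lambda = exp(eta + b)
    return math.exp(sum(poisson_logpmf(y, lam) for y in counts))


def marginal_likelihood_gh(counts, eta, sigma):
    """Integrate b ~ N(0, sigma^2) out with Gauss-Hermite quadrature."""
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        b = math.sqrt(2.0) * sigma * x  # change of variable b = sqrt(2) * sigma * x
        total += w * conditional_likelihood(counts, eta, b)
    return total / math.sqrt(math.pi)


def marginal_likelihood_bruteforce(counts, eta, sigma, steps=4000):
    """Reference value: trapezoidal rule on a wide grid of b values."""
    lo, hi = -8.0 * sigma, 8.0 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        b = lo + k * h
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * conditional_likelihood(counts, eta, b) * density
    return total * h


counts = [4, 6]      # hypothetical counts observed in one block
eta = math.log(5.0)  # hypothetical fixed-effect part of the linear predictor
sigma = 0.3          # hypothetical standard deviation of the random intercept

gh = marginal_likelihood_gh(counts, eta, sigma)
ref = marginal_likelihood_bruteforce(counts, eta, sigma)
print(f"Gauss-Hermite: {gh:.6e}   brute force: {ref:.6e}")
```

With only five nodes the quadrature is already close to the brute-force value in this small case. The cost of the brute-force approach grows quickly with the number of random effects, which is why approximations are needed in practice.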
4.8.2 Inference
After estimating the parameter values in a GLMM, the next step is to extract
information and draw statistical conclusions from a given dataset through careful
analysis of the parameter estimates (confidence intervals, hypothesis testing) and
select a model that best describes or explains the most variability in the dataset.
Inference can generally be based on three types: (a) hypothesis testing, (b) model
comparison, and (c) Bayesian approaches. Hypothesis testing compares test statistics
(F-test in ANOVA) to verify their expected distributions under the null hypothesis
(H0), estimating the value of P (P-value) to determine whether H0 can be rejected.
Model selection, in contrast, compares the fits of candidate models. Models can be
selected using hypothesis testing, that is, testing simpler nested models against more
complex ones (Stephens et al. 2005), or using information-theoretic approaches. Within
hypothesis testing, Wald tests (Z, χ², t, and F) and likelihood ratio (LR) tests can
assess the significance of factors or choose the better of a pair of nested candidate
models. Information criteria, on the other hand, allow multiple comparisons and
selection among non-nested models. Among these criteria are the Akaike information
criterion (AIC) and related information criteria that use deviance as a measure of fit,
adding a term to penalize more complex models. Information criteria can provide
better estimates. Variations
of AIC are highly common when sample sizes are not large (AICC), when there is
overdispersion in the data (quasi-AIC, QAIC), or when one wishes to identify/
determine the number of parameters in a model (Bayesian information criterion,
BIC).
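As a small illustration of how these criteria penalize complexity, the Python sketch below computes AIC, AICc, and BIC from a value of −2 log-likelihood using their textbook definitions. The function name and the numbers are hypothetical, and statistical packages may define the sample size used in BIC and AICc differently (for example, the number of subjects rather than the number of observations).

```python
import math


def information_criteria(neg2_loglik, n_params, n_obs):
    """Return AIC, AICc and BIC from -2 log-likelihood (smaller is better)."""
    aic = neg2_loglik + 2 * n_params
    # Small-sample correction; requires n_obs > n_params + 1.
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = neg2_loglik + n_params * math.log(n_obs)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical fits of two candidate models to the same 35 observations.
poisson_fit = information_criteria(neg2_loglik=360.0, n_params=8, n_obs=35)
negbin_fit = information_criteria(neg2_loglik=320.0, n_params=9, n_obs=35)

for name, fit in (("Poisson", poisson_fit), ("Negative binomial", negbin_fit)):
    print(name, {key: round(value, 2) for key, value in fit.items()})
```

In this made-up comparison the second model pays for one extra parameter but still has the smaller criteria, so it would be preferred.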
4.10 Exercises
Table 4.6 Number of seeds not germinating (out of 50)
      Trt1  Trt2  Trt3  Trt4  Trt5  Trt6
A     10    11    8     9     7     6
B     8     10    3     7     9     3
C     5     11    2     8     10    7
D     1     6     4     13    7     10
2. Table 4.7 shows the counts per sample area of a type of cockchafer larva
(two age groups, a and b). The experiment consisted of five treatments in eight
randomized blocks and two age groups to study the differential effects of
treatments on insect age.
(a) Considering the type of response in this exercise, what type(s) of probability
distribution(s) do you suggest for it?
(b) Construct a GLMM to study the effect of treatments and the age of cockchafer
larvae.
(c) Analyze the dataset according to the model proposed in (b).
(d) Is the model used in (c) sufficient? If so, discuss your findings.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate
credit to the original author(s) and the source, provide a link to the Creative Commons license and
indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative
Commons license, unless indicated otherwise in a credit line to the material. If material is not
included in the chapter's Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder.
Chapter 5
Generalized Linear Mixed Models for Counts
5.1 Introduction
Data in the form of counts regularly appear in studies in which the number of
occurrences is investigated, such as the number of insects, birds, or weeds in
agricultural or agroecological studies; the number of plants transformed or
regenerated using modern breeding techniques; the number of individuals with a
certain disease in a medical study; and the number of defective products in a quality
improvement study, among others. These counts can be recorded per unit of time,
area, or volume. When using a generalized linear model (GLM) with a Poisson
distribution, it is often found that there is excessive dispersion (extra variation) that
the Poisson model does not capture. In these cases, the data can be modeled
with a negative binomial distribution, which has the same mean as the Poisson
distribution but a variance greater than the mean. Most experiments have
some form of structure due to the experimental design (completely randomized
design (CRD), randomized complete block design (RCBD), incomplete block, or
split-plot design) or the sampling design, which must be incorporated into the
predictor to adequately model the data.
$$f(y) = \frac{e^{-\lambda}\lambda^{y}}{y!}, \qquad \lambda > 0,\; y = 0, 1, 2, \cdots$$
The mean and variance of a Poisson random variable are equal, i.e.,
E(y) = Var(y) = λ. A Poisson distribution is often used to model responses that are “counts.” As
λ increases, the Poisson distribution becomes more symmetric and eventually it can
be reasonably approximated by a normal distribution.
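These properties can be checked numerically. The following Python sketch (function names and λ values are illustrative choices, not taken from the chapter) evaluates the Poisson probability function on a truncated support and computes the mean, variance, and skewness. The skewness shrinks as λ grows, which is the sense in which the distribution becomes more symmetric.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability exp(-lam) * lam**y / y!, computed on the log scale."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def moments(lam, upper=500):
    """Mean, variance and skewness from the pmf, truncating the support at `upper`."""
    probs = [poisson_pmf(y, lam) for y in range(upper + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    skew = sum((y - mean) ** 3 * p for y, p in enumerate(probs)) / var ** 1.5
    return mean, var, skew


for lam in (2.0, 20.0, 50.0):  # illustrative rates
    mean, var, skew = moments(lam)
    print(f"lambda={lam:5.1f}  mean={mean:.4f}  variance={var:.4f}  skewness={skew:.4f}")
```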
Let yij be the value of the count variable associated with unit i at level one and
with unit j at level two, given a set of explanatory variables. Therefore, we can
express this as
$$f(y_{ij}) = \frac{e^{-\lambda_{ij}}\lambda_{ij}^{y_{ij}}}{y_{ij}!}, \qquad y_{ij} = 0, 1, 2, \cdots$$
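A Poisson distribution has very particular mathematical properties that are used
when we model “counts.” For example, the expected value of y is equal to the
variance of y, such that E(yij) = Var(yij) = λij. With a log link, the linear predictor
with a fixed treatment effect τi and a random effect bj is
$$\log \lambda_{ij} = \eta + \tau_i + b_j .$$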
This is a special case of a generalized linear mixed model (GLMM) in which the
link function of this family of distributions is g(λij) = log (λij). The dispersion
parameter ϕ, in this case, is equal to 1.
Sometimes, if the counts are extremely large, their distribution can be
approximated by a continuous distribution. If all the counts are large
enough, then fitting the model to the square root of the counts is viable, as this
transformation stabilizes the variance. However, as mentioned in previous chapters, the
estimation process under normality can be problematic, as it can produce negative
fitted values and predictions, which is illogical for counts.
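The variance-stabilizing effect of the square root can be seen numerically. In the Python sketch below (illustrative λ values and helper names), the variance of the raw count grows with λ, whereas the variance of its square root stays close to 1/4 once the counts are reasonably large.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability computed on the log scale for numerical stability."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def variance_of_transform(lam, transform, upper=1000):
    """Variance of transform(Y) for Y ~ Poisson(lam), support truncated at `upper`."""
    probs = [poisson_pmf(y, lam) for y in range(upper + 1)]
    mean = sum(transform(y) * p for y, p in enumerate(probs))
    return sum((transform(y) - mean) ** 2 * p for y, p in enumerate(probs))


for lam in (25.0, 100.0, 400.0):  # illustrative means
    raw = variance_of_transform(lam, float)
    root = variance_of_transform(lam, math.sqrt)
    print(f"lambda={lam:6.1f}  Var(Y)={raw:8.2f}  Var(sqrt(Y))={root:.4f}")
```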
5.2 The Poisson Model
$$\eta_{ij} = \eta + \tau_i$$
where ηij denotes the linear predictor for the jth observation of the ith treatment, η is
the intercept, and τi is the fixed effect due to treatment i (i = 1, 2, ⋯, t; j = 1, 2, ⋯, ri),
with t treatments and ri replicates in each treatment i.
Example Effect of a subculture on the number of shoots during micropropagation
of sugarcane.
The objective of micropropagation in sugarcane is to produce vegetative material
identical to the donor so that its genetic integrity is preserved. Despite this,
somaclonal variation has been observed in plants derived from in vitro culture
regardless of explant, variety, ploidy level, number of subcultures, and generation
route used, among others. A total of 8 explants were planted in temporary immersion
bioreactors (explant/bioreactor) to determine whether the number of subcultures
(10 subcultures) influences the number of shoots observed per explant. In this
example, we have ri observations ( j = 1, 2, . . ., ri) on each of the 10 subcultures
(i = 1, 2, . . ., 10) in a completely randomized design (Appendix 1: Data: Subcul-
tures). The analysis of variance (ANOVA) table (Table 5.1) for this model is given
below:
The components of the GLM are set out below:
While most of the commands used have been explained before, the options “dist,”
“s,” and “link” in the model statement tell SAS the type of data
distribution, to print the fixed effects solution, and the link function to use, respectively.
In addition, the “lines” option in the “lsmeans” (least squares means) statement asks
the GLIMMIX procedure for mean comparisons, and the “ilink” option applies the
inverse link function so that the means are reported on the data scale.
Part of the output is shown in Table 5.2, where part (a) shows the model and the
methods used to fit the statistical model, whereas part (b) lists the dimensions of the
relevant matrices in the model specification.
Due to the absence of random effects in this model, there are no columns in
matrix Z. The 11 columns in matrix X comprise an intercept and 10 columns for the
effect of subcultures.
The goodness-of-fit statistics of the model are shown in part (a) of Table 5.3. The
value of the generalized chi-squared statistic over its degrees of freedom (DFs) is
less than 1 (Pearson’s chi-square/DF = 0.79). This indicates that there is no
overdispersion and that the variability in the data has been adequately modeled with
the Poisson distribution.
Part (b) of Table 5.3 shows the maximum likelihood (ML) parameter estimates
(“Estimate”), their standard errors, and the t-tests of the hypothesis that each parameter equals zero.
Table 5.4 (part (a)) shows significance tests for the fixed effects in the model
“Type III fixed effects tests.” These tests are Wald tests and not likelihood ratio tests.
The effect of a subculture on the number of shoots is highly significant in this model
with a value of P < 0.0001, indicating that the 10 subcultures do not produce the
same number of shoots, that is, the number of subcultures affects the average shoot
production in the explant.
The least squares means obtained with “lsmeans” (part (b) in Table 5.4) are the
values under the column “Estimate,” which, along with their standard errors, were
calculated from the linear predictor $\eta_i = \eta + \tau_i$. These estimates are on the model
scale, whereas the “Mean” column values and their respective standard errors are on
the data scale; they were obtained by applying the inverse link to obtain the λi
values, i.e., $\lambda_i = \exp(\eta_i)$, with their respective standard errors.
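The relationship between the two scales can be reproduced by hand. The Python sketch below takes three rows of Table 5.4(b) and back-transforms them with the inverse log link. For the standard error it assumes a first-order (delta method) approximation, SE(λ) ≈ exp(η) × SE(η). This is an assumption about how the data-scale standard errors arise, although it matches the tabulated values to the printed precision.

```python
import math

# (subculture, estimate on the log scale, standard error) copied from Table 5.4(b)
model_scale = [
    (1, 2.5878, 0.06131),
    (5, 3.8864, 0.03699),
    (8, 4.0073, 0.03015),
]


def to_data_scale(estimate, std_error):
    """Inverse log link for the mean, delta-method approximation for its standard error."""
    mean = math.exp(estimate)
    return mean, mean * std_error


data_scale = {sub: to_data_scale(est, se) for sub, est, se in model_scale}

for sub, (mean, se_mean) in data_scale.items():
    print(f"subculture {sub}: mean = {mean:.4f}, standard error = {se_mean:.4f}")
```

The back-transformed means agree with the “Mean” column (13.3000, 48.7333, and 55.0000) up to rounding of the inputs.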
A comparison of means, using the option “lines,” is presented in Fig. 5.1. In this
figure, we can see that in the first subcultures, the average production is minimal but
it increases as subcultures increase from 5 to 8, and, in subculture 9, the average
number of shoots per explant begins to decrease.
Table 5.4 Type III tests of fixed effects and least squares means (means)
(a) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
sub1 9 164 120.14 <0.0001
(b) sub1 least squares means
sub1 Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 2.5878 0.06131 164 42.21 <0.0001 13.3000 0.8155
2 2.7644 0.05234 164 52.81 <0.0001 15.8696 0.8307
3 3.1091 0.05455 164 56.99 <0.0001 22.4000 1.2220
4 3.3274 0.04891 164 68.03 <0.0001 27.8667 1.3630
5 3.8864 0.03699 164 105.08 <0.0001 48.7333 1.8025
6 3.8944 0.03567 164 109.18 <0.0001 49.1250 1.7522
7 3.9318 0.03131 164 125.57 <0.0001 51.0000 1.5969
8 4.0073 0.03015 164 132.91 <0.0001 55.0000 1.6583
9 3.9370 0.03606 164 109.18 <0.0001 51.2667 1.8487
10 3.6687 0.04124 164 88.96 <0.0001 39.2000 1.6166
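Notation for the columns of part (b): Estimate = ηi, Standard error = std. error(ηi), Mean = λi, Standard error mean = std. error(λi)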
[Figure 5.1: bar chart of the average number of shoots per explant (vertical axis, 0 to 60) for subcultures 1 to 10 (horizontal axis), with letter groupings above the bars]
Fig. 5.1 Average number of shoots per subculture. Bars with different letters are statistically
different using α = 0.05
Table 5.5 Number of nuts per tree (yij) in each of the combinations of the two factors
Trt  yij   Trt  yij   Trt  yij   Trt  yij   Trt  yij
C    1     A3   79    A3   89    B1   50    B2   138
A2   118   B3   99    C    21    A2   69    B1   69
A1   69    A1   50    A2   79    A3   69    A3   138
B1   89    B3   118   B3   99    A1   21    C    11
B1   99    A3   99    A1   79    B2   118   B2   89
B2   158   C    50    B1   118   C    30    B3   158
A1   89    A2   127   A2   89    B2   99    B3   118
in two formulations (A and B) plus a control (C). In addition to the treatments (Trt)
there was a control, where no compound was applied. In total, 7 treatments were
randomly applied to the experimental units (trees), i.e., 35 trees, in a rectangular
arrangement (as shown below). The number of nuts yij observed on each tree, for each
combination of formulation and time of application, is provided in Table 5.5.
The components of the GLMM are listed below:
The options “dist,” “s,” and “ilink” in the model statement tell SAS the
type of data distribution, to print the fixed effects solution, and to compute the inverse link,
respectively. In addition, the “lines” option in the “lsmeans” (least squares means)
statement requests the mean comparisons from the GLIMMIX procedure, and the
“ilink” option applies the inverse of the link function.
Part of the results is presented in Table 5.6. The value of the fit statistic for the
conditional distribution (part (a)) indicates that there is strong overdispersion
($\chi^2/\text{df} = 3.62$), and the variance component estimate due to sampling in the
experimental units (trees) is $\sigma^2_{\text{tree}} = 0.035$ (part (b)).
Table 5.6 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (count | r. effects) 354.60
Pearson’s chi-square 126.54
Pearson’s chi-square/DF 3.62
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
rep 0.03573 0.02362
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Trt 6 24 59.55 <0.0001
In addition, Table 5.6 (part (c)) shows the type III tests of fixed effects, indicating
that there is a significant difference between treatments in the average number of
nuts per tree (P < 0.0001). However, it is not recommended to continue with the
inference and analysis of the experiment, because the extra variance
(commonly known as overdispersion; Pearson’s chi-square/DF = 3.62) in the
data strongly affects the F-test and the standard errors of the means.
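A rough version of this diagnostic can be computed directly from the counts in Table 5.5. The Python sketch below is a simplification: it ignores the random tree effect and uses a fixed-effects-only Poisson model, for which the fitted mean of each treatment is its sample mean. Its ratio is therefore not the conditional statistic reported in Table 5.6, but it shows how Pearson's chi-square/DF is assembled.

```python
# Counts of nuts per tree from Table 5.5, grouped by treatment.
nuts = {
    "A1": [69, 50, 21, 79, 89],
    "A2": [118, 69, 79, 127, 89],
    "A3": [79, 89, 69, 138, 99],
    "B1": [50, 69, 89, 99, 118],
    "B2": [138, 118, 89, 158, 99],
    "B3": [99, 118, 99, 158, 118],
    "C": [1, 21, 11, 50, 30],
}


def pearson_dispersion(groups):
    """Pearson chi-square, its degrees of freedom, and their ratio for a one-way Poisson fit."""
    chi_square = 0.0
    n_obs = 0
    for counts in groups.values():
        fitted = sum(counts) / len(counts)  # Poisson ML fitted mean for the group
        # Under the Poisson model Var(y) = fitted, so each term is (y - fitted)^2 / fitted.
        chi_square += sum((y - fitted) ** 2 / fitted for y in counts)
        n_obs += len(counts)
    df = n_obs - len(groups)  # observations minus estimated treatment means
    return chi_square, df, chi_square / df


chi_square, df, ratio = pearson_dispersion(nuts)
print(f"Pearson chi-square = {chi_square:.2f}, DF = {df}, ratio = {ratio:.2f}")
```

A ratio well above 1 points in the same direction as the conditional statistic in Table 5.6: the counts vary more than a Poisson model allows.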
A highly effective alternative to deal with the inconvenience of overdispersion in
the data is to use a different distribution to the Poisson distribution. A negative
binomial distribution is an excellent option for count data with overdispersion.
Assuming that the conditional distribution of the observations is given by:
where $\lambda_{ij} \sim \text{Gamma}(1/\phi, \phi)$, ϕ is the scale parameter, and $r_j \sim N(0, \sigma^2_{\text{tree}})$. The
resulting new GLMM is:
The GLIMMIX statements for fitting this model under a negative
binomial distribution in a CRD are provided next.
Part of the results is listed below. The information criteria in Table 5.7 part (a) are
helpful in choosing which model best fits the dataset. Clearly, the negative binomial
distribution provides the best fit to these data. On the other hand, in the conditional fit
statistics (part (b)), we observe that the Poisson model had strong overdispersion
(Pearson’s chi-square/DF = 3.62) and that, by fitting the data under a negative
binomial distribution, the overdispersion of the dataset was removed
(Pearson’s chi-square/DF = 0.91).
Table 5.8 shows the variance component estimates (part (a)) and the type III tests
of fixed effects (part (b)). The estimated variance parameter due to trees is
$\sigma^2_{\text{tree}} = 0.04288$, and the estimated scale parameter (Scale) is ϕ = 0.06141. The
type III tests of fixed effects (part (b)) show that there is a highly significant effect
of treatments on the average number of nuts (P < 0.0001).
The values under the column “Estimates” are the estimates of the linear predictor
ηi (the model scale), and the values under “Mean” are the means λi (the data scale)
with their respective standard errors obtained with the command “lsmeans” and
“ilink” (Table 5.9). The results show that the treatments implemented in this
experiment showed a higher average number of walnuts than did the “control”
treatment C. In general, formula B applied to the walnut trees at the full-flowering
stage showed a higher nut production.
Table 5.9 Estimates on the model scale (“Estimate”) and means on the data scale (“Mean”)
Trt least squares means
Trt Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
A1 4.0865 0.1560 24 26.19 <0.0001 59.5307 9.2890
A2 4.5624 0.1519 24 30.04 <0.0001 95.8162 14.5529
A3 4.5293 0.1519 24 29.82 <0.0001 92.6956 14.0783
B1 4.4349 0.1529 24 29.01 <0.0001 84.3417 12.8958
B2 4.7863 0.1504 24 31.82 <0.0001 119.86 18.0304
B3 4.7641 0.1504 24 31.67 <0.0001 117.23 17.6335
C 3.0499 0.1742 24 17.51 <0.0001 21.1140 3.6785
$$\eta_{ij} = \eta + \tau_i + b_j$$
where η is the intercept, τi is the fixed effect due to the ith treatment, and bj is the
random effect of block j, with $b_j \sim N(0, \sigma^2_{\text{block}})$.
One of the main problems when growing cereal crops is the competition
between weeds and seedlings. A field supervisor interested in testing five
treatments plus a control for weed control in cereal crops used a randomized
complete block design with four blocks. Table 5.10 shows the
number of weed plants observed in each plot (yij), with the treatment number in parentheses.
Table 5.11 shows the sources of variation and the degrees of freedom of a
randomized complete block design used in this experiment.
Since the response is a count, it will be modeled using a GLMM with a Poisson
response variable, which is stated below:
Table 5.10 Number of weeds in each treatment (the number in parentheses corresponds to the
treatment number)
Block
A (1) 438 (4) 17 (2) 538 (5) 18 (3) 77 (6) 115
B (3) 61 (2) 422 (6) 57 (1) 442 (5) 26 (4) 31
C (5) 77 (3) 157 (4) 87 (6) 100 (2) 377 (1) 319
D (2) 315 (1) 380 (5) 20 (3) 52 (4) 16 (6) 45
where yij denotes the number of weed plants observed in treatment i and block
j (i = 1, 2, ⋯, 6; j = 1, 2, 3, 4), ηij is the linear predictor, η is the intercept, τi is the
fixed effect due to treatment i, and bj is the random block effect, $b_j \sim N(0, \sigma^2_{\text{block}})$.
Using the GLIMMIX procedure, the following syntax specifies the analysis of a
GLMM with a Poisson response.
Note that in the above syntax, we use “method = laplace” (or we can also use
“method = quadrature”) to fit the mixed model and obtain the chi-squared/DF fit
statistic. If the method of integration is not specified, then a generalized chi-squared/
DF statistic is obtained. The auxiliary options after the “lsmeans” command are
described below: “diff” provides pairwise comparisons between treatments, “lines”
displays the pairwise comparison of means using letters, and “ilink” provides the value
of the inverse of the link function. Some of the outputs are listed below.
Table 5.12 (a) presents the basic information about the model and estimation
procedure used.
Subsection (b) of Table 5.12 lists the “Dimensions” of the relevant
matrices used in the model. The random effects matrix Z indicates that there are
four columns due to blocks, and the fixed effects matrix X indicates that there is one
column for the intercept plus six columns due to treatments.
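The reported dimensions can be reproduced by building the two design matrices explicitly. The Python sketch below assumes one plot per block and treatment combination (24 plots) and the overparameterized coding with one indicator column per treatment level. The variable names are illustrative.

```python
blocks = ["A", "B", "C", "D"]
treatments = [1, 2, 3, 4, 5, 6]

# One observation per block-treatment combination (24 plots).
plots = [(block, trt) for block in blocks for trt in treatments]


def indicator_row(value, levels):
    """Row of 0/1 indicators with a 1 in the position of `value`."""
    return [1 if value == level else 0 for level in levels]


# Fixed effects: an intercept column plus one column per treatment level.
X = [[1] + indicator_row(trt, treatments) for _, trt in plots]
# Random effects: one column per block.
Z = [indicator_row(block, blocks) for block, _ in plots]

print(f"X: {len(X)} rows x {len(X[0])} columns")
print(f"Z: {len(Z)} rows x {len(Z[0])} columns")
```

Because the six treatment columns add up to the intercept column, X is not of full rank, which is why the solution for the last treatment level is set to zero in the parameter estimates.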
The “Fit statistics” and “Fit statistics for conditional distribution” (parts (a) and
(b) of Table 5.13, respectively) show information about the fit of the GLMM. The
generalized chi-squared statistic measures the residual sum of squares in the final
model, and its ratio to the degrees of freedom is a measure of the
variability of the observations around the mean fitted by the model.
The value of Pearson’s chi-square/DF for the conditional distribution is 11.8, well
above 1. This value gives strong evidence of overdispersion in the dataset. In
other words, it calls our distribution and linear predictor assumptions into
question, which means that the variance function was not adequately specified.
Table 5.14 Variance component estimates, parameter estimates, and type III tests of fixed effects
Cov Parm Estimate Standard error
Block ($\hat{\sigma}^2_{\text{block}}$) 0.01840 0.01377
(a) Solutions for fixed effects
Effect Trt Estimate Standard error DF t-value Pr > |t|
Intercept η 4.3637 0.08808 3 49.54 <0.0001
Trt 1 τ1 1.6056 0.06155 15 26.09 <0.0001
Trt 2 τ2 1.6508 0.06132 15 26.92 <0.0001
Trt 3 τ3 0.09042 0.07769 15 1.16 0.2627
Trt 4 τ4 -0.7416 0.09888 15 -7.50 <0.0001
Trt 5 τ5 -0.8101 0.1012 15 -8.00 <0.0001
Trt 6 τ6 0 . . . .
(b) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Trt 5 15 523.57 <0.0001
Linear mixed models assume that the observations have a normal distribution
conditional on the fixed-effects parameters. In addition, the mean μ is independent
of the variance σ², whereas, in most GLMMs that assume a binomial or Poisson
distribution, the dispersion parameter is set to 1. That is, if the mean is known, then
we assume that the variance is also known. The extra variability not predicted
by a generalized linear model’s random component reflects overdispersion.
Overdispersion occurs because the mean and variance components of a GLM are
related and depend on the same parameter that is being predicted through the
predictor set. However, if overdispersion is present in a dataset, then the estimated
standard errors and test statistics of the overall goodness of fit will be distorted and
Fig. 5.3 Conditional residuals versus predicted values on the data scale
The first alternative is to add a scale parameter and replace Var(yij | bj) = λij by
Var(yij | bj) = ϕλij. This consists of replacing the logarithm of the conditional
likelihood, yij log(λij) − λij − log(yij!), by the quasi-likelihood [yij log(λij) − λij]/ϕ,
assuming that ϕ > 1 could adequately model the observed variance.
The following GLIMMIX syntax invokes this alternative of adding a scale
parameter under a Poisson response variable.
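proc glimmix;                      /* no method= option: default pseudo-likelihood */
class Block Trt;
model Count = Trt / dist=Poisson;
random intercept / subject=Block;  /* random block effect */
random _residual_;                 /* adds the scale (overdispersion) parameter */
lsmeans Trt / ilink;               /* treatment means on the data scale */
run;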
The SAS code is highly similar to that previously used with the addition of the
“random _residual_” command to the program. Note that the Laplace integration
method (“method = laplace”) has been removed, which causes the estimation to be
performed using the pseudo-likelihood (PL) method; the scale parameter is esti-
mated and used in the adjustment of the standard errors and test statistics. The
standard errors are multiplied by √ϕ, and all F-test values are divided by ϕ.
Table 5.16 shows part of the results.
In Table 5.16, we observe the fit statistics (part (a)), covariance parameter
estimates (part (b)), and the value of the scale parameter, which is equal to
ϕ = 19.4848 (Residual (VC)). The value of the F-statistic under the Poisson
distribution with the scale parameter is 26.87 (part (c)); this value is obtained by dividing the
F-value from the previous analysis by the scale parameter, 523.57/ϕ. The results indicate that, even
under this adjustment, overdispersion exists and that its value increases from 11.8
to 19.4848 (part (a)). The inclusion of the scale parameter affects the variance
estimate due to blocks, $\sigma^2_{\text{block}}$, as well as the estimates of treatment means (part (d)),
but the main impact is on the standard errors.
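The arithmetic behind the adjusted F-statistic is easy to verify. The Python sketch below divides the F-value of the unadjusted Poisson fit by the reported scale parameter. For the standard error it assumes the usual quasi-likelihood rescaling by the square root of ϕ, applied to one value from Table 5.14 purely as an illustration; the standard errors printed by the refitted model can differ because all parameters are re-estimated.

```python
import math

phi = 19.4848       # scale parameter reported in Table 5.16
f_poisson = 523.57  # F-value for Trt from the unadjusted Poisson fit (Table 5.14)

f_adjusted = f_poisson / phi
print(f"Adjusted F-value: {f_adjusted:.2f}")

# Illustration only: rescaling one model-scale standard error by sqrt(phi).
se_trt1 = 0.06155
se_trt1_scaled = se_trt1 * math.sqrt(phi)
print(f"Rescaled standard error for Trt 1: {se_trt1_scaled:.4f}")
```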
The inclusion of the scale parameter implies that there is a quasi-likelihood,
meaning that there is no true likelihood of the model and, therefore, there is no
true likelihood process that provides a true expected value of λ and a variance of ϕλ.
With count and binomial response variables, it is important to check whether the linear
predictor is correctly specified, that is, whether λij is being randomly affected by the
experimental units within blocks. If it is, then the ANOVA table should include the
block × treatment source of variation, and this effect must be specified in the linear
predictor of the GLMM. Thus,
the linear predictor is specified as
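$$\eta_{ij} = \eta + \tau_i + b_j + (b\tau)_{ij}$$
Distribution: $y_{ij} \mid b_j, (b\tau)_{ij} \sim \text{Poisson}(\lambda_{ij})$
$b_j \sim N(0, \sigma^2_{\text{block}})$
$(b\tau)_{ij} \sim N(0, \sigma^2_{\text{block} \times \tau})$
Linear predictor: $\eta_{ij} = \eta + \tau_i + b_j + (b\tau)_{ij}$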
Part of the output is shown in Table 5.17. The results tabulated in part (a) indicate
that the overdispersion has been eliminated (ϕ = 0.11), but there is a risk of
underestimating the variance. For this reason, it is highly recommended that the
value of ϕ be close to 1. The estimated variance components (part (b)) for
blocks and block × treatment are $\sigma^2_{\text{block}} = 0.05969$ and $\sigma^2_{\text{block} \times \text{Trt}} = 0.1152$,
respectively.
The type III tests of fixed effects are highly significant (P = 0.0001), indicating
that the six treatments are not equally effective in weed control (part (c)). The values
in part (d) under the “Mean” column are the means on the original scale of the data
for each of the treatments with their respective standard errors. Compared with the
previous means (obtained using the scale parameter), these means do not vary
much, but the standard errors differ more markedly.
Another way to account for the problem of overdispersion when using a Poisson
distribution is to change the assumed distribution of the response variable. Poisson
variables have the same mean and variance, but, in biological sciences, with vari-
ables such as counts, this assumption is not always true. A negative binomial
distribution is a good alternative (see Example 5.2), as previously discussed. A
negative binomial variable has mean λ > 0 and variance
λ + ϕλ², with ϕ > 0. That is, the expected value is E(y) = λ and the variance is Var(y) = λ + ϕλ²,
where ϕ is the scale parameter. The components of this model are shown below:
Given that $y_{ij} \mid b_j \sim \text{Poisson}(\lambda_{ij})$, it is assumed that $\lambda_{ij} \sim \text{Gamma}(1/\phi, \phi)$, with ϕ as
the scale parameter and $b_j \sim N(0, \sigma^2_{\text{block}})$. The new specification of the resulting
GLMM is as follows:
Some of the most relevant outputs from GLIMMIX are presented in Table 5.18.
Pearson’s chi-square/DF value of 0.88 (part (a)) shows
that overdispersion in the dataset has been removed. The estimated scale parameter
tabulated in part (b) (Scale) is ϕ = 0.1080. This value is not the same scale parameter
estimated using the Poisson model with the “random _residual_” command, since
the methodology for calculating them in these models is different. However, as
mentioned above, both scale parameters affect the relationship between the mean
and variance in the Poisson and negative binomial distributions.
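The two scale parameters enter different variance functions, which is why their values are not comparable. The Python sketch below evaluates both functions at a few illustrative means, using the two scale estimates reported for this example; the function names are hypothetical.

```python
PHI_QUASI_POISSON = 19.4848  # scale from the "random _residual_" fit (Table 5.16)
PHI_NEG_BINOMIAL = 0.1080    # scale from the negative binomial fit (Table 5.18)


def poisson_variance(mean):
    """Poisson: the variance equals the mean."""
    return mean


def quasi_poisson_variance(mean, phi=PHI_QUASI_POISSON):
    """Poisson with a scale parameter: the variance is proportional to the mean."""
    return phi * mean


def negative_binomial_variance(mean, phi=PHI_NEG_BINOMIAL):
    """Negative binomial: the variance grows quadratically with the mean."""
    return mean + phi * mean ** 2


for mean in (20.0, 100.0, 400.0):  # illustrative means on the data scale
    print(
        f"mean={mean:6.1f}  Poisson={poisson_variance(mean):8.1f}  "
        f"scaled Poisson={quasi_poisson_variance(mean):8.1f}  "
        f"negative binomial={negative_binomial_variance(mean):8.1f}"
    )
```

With these estimates, the scaled Poisson variance is the larger of the two at small means and the negative binomial variance is the larger at large means, so the two adjustments weight observations differently.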
The value of the test statistic shown in part (c) of Table 5.18, under the negative
binomial distribution for the effect of treatments, is highly similar to the value
obtained with the Poisson distribution when the effect of the block × treatment
interaction was added to the linear predictor. The values under “Estimate” are
estimates of the linear predictor on the model scale (part (d)), whereas those under
the “Mean” column are the treatment means on the data scale, using the negative
binomial distribution. Of the three proposed alternatives to fit these data, the last two
(including the block × treatment interaction in the predictor and assuming a negative
binomial distribution) provide a better fit.
Many experiments involve studying the effects of two or more factors. Factorial
designs are the most efficient for these types of experiments. In a factorial design, all
possible combinations of factor levels are investigated in each replicate. If there are
a levels of factor A and b levels of factor B, then each replicate contains all ab
treatment combinations.
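As a small illustration, the treatment combinations of one replicate can be enumerated programmatically. The Python sketch below uses hypothetical level labels for a factor A with a = 2 levels and a factor B with b = 4 levels.

```python
from itertools import product

genotypes = ["G1", "G2"]             # a = 2 hypothetical levels of factor A
cultures = ["C1", "C2", "C3", "C4"]  # b = 4 hypothetical levels of factor B

# Every level of A is combined with every level of B: a * b combinations.
combinations = list(product(genotypes, cultures))

for genotype, culture in combinations:
    print(genotype, culture)
print(f"{len(combinations)} treatment combinations per replicate")
```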
$$\text{explant(petri.dish)}_{l(k)} \sim N\left(0, \sigma^2_{\text{explant(petri.dish)}}\right)$$
Some of the SAS output is shown in Table 5.21. Part (a) shows the conditional fit
statistics for this dataset. Note that “method = laplace” was used for the
estimation process and to obtain Pearson’s fit statistic χ²/DF. The result indicates
that there is evidence of overdispersion (Pearson’s chi-square/DF = 1.84).
Overdispersion, as discussed before, implies more variability in the data than
would be expected, potentially explaining the lack of fit of a Poisson model. Part
(b) shows the variance component estimate due to Petri_dish, which is equal
to $\hat{\sigma}^2_{\text{Petri.dish}} = 0.003616$, and, for the explants within Petri.dish, it is
$\hat{\sigma}^2_{\text{explant(Petri.dish)}} = 0.01462$. However, the type III test of fixed effects indicates that
Table 5.21 Conditional fit statistics, variance component estimates, and type III tests of fixed
effects under the Poisson distribution
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects) 1168.16
Pearson’s chi-square 354.01
Pearson’s chi-square/DF 1.84
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
Petri.dish 0.003616 0.006014
Explant (Petri.dish) 0.01462 0.008798
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Genotype 1 161 11.01 0.0011
Culture 3 161 57.30 <0.0001
Genotype*culture 3 161 3.95 0.0095
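The Pearson statistic reported in part (a) can be sketched as follows for a Poisson model: each squared residual is divided by the fitted mean (the Poisson variance), and the sum is then divided by the degrees of freedom. The toy counts and fitted means below are hypothetical, and the divisor is passed explicitly because the convention for the degrees of freedom depends on the software; here the number of observations is assumed.

```python
def pearson_chi_square(counts, fitted_means):
    """Sum of squared Pearson residuals for a Poisson model: (y - mu)^2 / mu."""
    return sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))


def dispersion_ratio(chi_square, df):
    """Pearson chi-square divided by its degrees of freedom."""
    return chi_square / df


# Hypothetical observed counts and fitted (conditional) means
toy_counts = [2, 9, 4, 15, 1, 11]
toy_means = [5.0, 7.0, 6.0, 9.0, 4.0, 8.0]

toy_chi2 = pearson_chi_square(toy_counts, toy_means)
toy_ratio = dispersion_ratio(toy_chi2, df=len(toy_counts))
print(round(toy_chi2, 3), round(toy_ratio, 3))
```

A ratio well above 1, as in this toy example and in Table 5.21, points to more variability than the Poisson model allows.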
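Since there is overdispersion in the data, we fit the GLMM again using the negative
binomial distribution, with the following program:
proc glimmix method=laplace;  /* Laplace estimation, as in the Poisson fit */
  class genotype culture petri_dish explant;
  /* same linear predictor as before; only the distribution changes */
  model y = genotype|culture / dist=negbin link=log;
  random petri_dish explant(petri_dish);
  /* lines: mean groupings; ilink: means on the data scale */
  lsmeans genotype|culture / lines ilink;
run;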
Note that this program is very similar to the previous one; the only difference is
that a negative binomial distribution is now used (“dist = negbin”). Part of the
results is presented in Table 5.23. As already mentioned, the negative binomial
distribution is another model for count variables when there is overdispersion in
the dataset. If Pearson’s chi-square divided by its degrees of freedom is close to
or below 1, there is no evidence of remaining overdispersion, which means that the
model adequately captures the extra variability in the data.
Table 5.22 Estimates on the model scale and means on the data scale under the Poisson
distribution
(a) Genotype least squares means
Genotype Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 2.2979 0.05165 161 44.49 <0.0001 9.9533 0.5141
2 2.1345 0.05298 161 40.29 <0.0001 8.4531 0.4479
(b) Culture least squares means
Culture Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 2.1984 0.06180 161 35.57 <0.0001 9.0107 0.5569
2 2.5684 0.05607 161 45.81 <0.0001 13.0456 0.7314
3 2.4609 0.05738 161 42.89 <0.0001 11.7156 0.6723
4 1.6371 0.07445 161 21.99 <0.0001 5.1402 0.3827
(c) Genotype*culture least squares means
Genotype Culture Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 1 2.2465 0.07676 161 29.26 <0.0001 9.4547 0.7258
1 2 2.7789 0.06395 161 43.45 <0.0001 16.1018 1.0298
1 3 2.5331 0.06932 161 36.54 <0.0001 12.5925 0.8729
1 4 1.6331 0.09793 161 16.68 <0.0001 5.1196 0.5014
2 1 2.1503 0.07958 161 27.02 <0.0001 8.5877 0.6834
2 2 2.3580 0.07370 161 31.99 <0.0001 10.5694 0.7790
2 3 2.3887 0.07290 161 32.77 <0.0001 10.8997 0.7945
2 4 1.6411 0.09760 161 16.81 <0.0001 5.1609 0.5037
Table 5.23 Conditional fit statistics, variance component estimates, and type III tests of fixed
effects under the negative binomial distribution
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects) 1143.90
Pearson’s chi-square 159.95
Pearson’s chi-square/DF 0.83
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
Petri.dish -0.02717 .
Explant (Petri.dish) -0.04323 .
Scale 0.1712 0.03514
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Genotype 1 161 4.43 0.0369
Culture 3 161 25.91 <0.0001
Genotype*culture 3 161 1.44 0.0322
Table 5.24 Estimates on the model scale and means on the data scale under the negative binomial
distribution
(a) Genotype least squares means
Genotype Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 2.3054 0.05407 161 42.64 <0.0001 10.0287 0.5423
2 2.1426 0.05535 161 38.71 <0.0001 8.5219 0.4717
Estimate = ηi (model scale); Mean = λi (data scale)
(b) T grouping of genotype least squares means (α=0.05)
LS means with the same letter are not significantly different
Genotype Estimate
1 2.3054 A
2 2.1426 B
Table 5.25 Means estimates on the model scale and data scale for the culture medium
(a) Culture least squares means
Culture Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 2.2061 0.07653 161 28.82 <0.0001 9.0802 0.6950
2 2.5766 0.07198 161 35.80 <0.0001 13.1527 0.9468
3 2.4684 0.07300 161 33.81 <0.0001 11.8031 0.8617
4 1.6451 0.08708 161 18.89 <0.0001 5.1815 0.4512
Estimate = ηj (model scale); Mean = λj (data scale)
(b) T grouping of culture least squares means (α=0.05)
LS means with the same letter are not significantly different
Culture Estimate
2 2.5766 A
3 2.4684 A
1 2.2061 B
4 1.6451 C
For the culture medium (Table 5.25), the estimated values in this comparison of
means correspond to the values of the linear predictor ηj (on the model scale), but, by
applying the inverse link to ηj , we obtain the values under the “Mean” column that
provide the means on the data scale (part (a)). The mean comparisons on the model
scale are shown in part (b).
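The back-transformation described above can be checked numerically. The sketch below applies the inverse of the log link (the exponential) to the estimates in part (a) of Table 5.25 and approximates the standard error on the data scale with a first-order (delta-method) rule, mean × standard error on the model scale. The delta-method rule is an assumption here, although it agrees with the tabulated values up to rounding; the function and variable names are illustrative.

```python
import math

# (culture medium, estimate on the model scale, standard error) from Table 5.25(a)
culture_estimates = [
    (1, 2.2061, 0.07653),
    (2, 2.5766, 0.07198),
    (3, 2.4684, 0.07300),
    (4, 1.6451, 0.08708),
]


def to_data_scale(eta, se_eta):
    """Apply the inverse log link and a delta-method standard error."""
    mean = math.exp(eta)
    return mean, mean * se_eta


data_scale = {medium: to_data_scale(eta, se) for medium, eta, se in culture_estimates}

for medium, (mean, se_mean) in data_scale.items():
    print(f"culture medium {medium}: mean = {mean:.4f}, standard error = {se_mean:.4f}")
```

The printed means and standard errors can be compared with the “Mean” and “Standard error mean” columns of the table; small differences in the last digit come from rounding of the tabulated estimates.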
The results indicate that culture media 2 and 3 yielded statistically similar
average numbers of buds, both higher than those of culture media 1 and 4, which also
differ from each other (see Fig. 5.6).
The estimates for the interaction between both factors, including the average number
of buds for each genotype × culture medium combination, are shown in Table 5.26.
Table 5.26 Estimates on the model scale and means on the data scale for the interaction between
genotype and culture medium
Genotype*culture least squares means
Genotype Culture Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 1 2.2540 0.1072 161 21.02 <0.0001 9.5255 1.0212
1 2 2.7869 0.09844 161 28.31 <0.0001 16.2310 1.5978
1 3 2.5401 0.1020 161 24.91 <0.0001 12.6805 1.2933
1 4 1.6408 0.1233 161 13.31 <0.0001 5.1595 0.6360
2 1 2.1582 0.1093 161 19.75 <0.0001 8.6558 0.9457
2 2 2.3663 0.1050 161 22.53 <0.0001 10.6582 1.1196
2 3 2.3967 0.1045 161 22.94 <0.0001 10.9865 1.1478
2 4 1.6493 0.1230 161 13.41 <0.0001 5.2036 0.6401
Estimate = ηij (model scale); Mean = λij (data scale)
Fig. 5.6 Comparison of the average number of buds as a function of the type of culture medium
(LSD, α = 0.05). [Bar chart: average number of buds (vertical axis) for culture media 1 to 4
(horizontal axis); the letters a, b, and c mark the mean groupings.]
The values under “Estimates” (Table 5.26) correspond to those of the linear
predictor ηij (model scale), but the values under “Mean” correspond to the means
λij on the data scale.
Graphically, Fig. 5.7 shows that genotype 1 in culture medium 2 provides the
highest number of buds, whereas the lowest number of buds was observed in culture
medium 4. For genotype 2, the highest number of buds was observed in culture
media 2 and 3. Finally, culture medium 4 is less suitable for both genotypes.
Fig. 5.7 Effect of the cultivar × culture medium interaction on the average number of buds
(LSD, α = 0.05). [Bar chart: average number of buds (vertical axis) for culture media 1 to 4
(horizontal axis), with separate bars for genotype 1 and genotype 2; the letters a to d mark
the mean groupings.]
A Latin square (LS) is used where heterogeneity is associated with the crossing of
two factors, generally both with the same number of levels. This design was
originally used in agricultural experimentation with plots placed in a square
arrangement, with expected heterogeneity along the rows and columns of the square.
In this design, blocking is done in both directions, across rows and across columns.
An LS design is therefore a good option whenever blocking in two directions is
appropriate. Some examples are provided below to illustrate the use of this
experimental design:
• Field experiments on plots set in a square arrangement with rows and columns
that contribute to the heterogeneity between plots. For example, gradients of
fertility, moisture, management practices, and so on.
• Experiments in greenhouses, rooms with a controlled environment, or growth
chambers where the placement of shelves, trays, etc. with respect to walls or light
sources can introduce systematic variability related to temperature, humidity, or
light in different directions (e.g., left to right, back to front, or top to bottom).
• Laboratory experiments in which there are two potential sources of variability
(e.g., technicians, machines, etc.) and researchers are aware of the possible impact
of variation from both sources.
For an LS layout, the number of rows (r) and the number of columns (c) should both
equal the number of treatments (t), which is also the number of replicates of each
treatment. Treatments are assigned so that each treatment appears exactly once in
each row and in each column, so that every row and every column contains a full set
of treatments. Thus, the treatment effect estimates are independent of the
differences between rows or columns, and the rows, columns, and treatments are
orthogonal to each other.
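The defining property of the layout, each treatment exactly once per row and per column, can be sketched in code. The example below builds a cyclic Latin square, shuffles its rows and columns (which preserves the property), and then verifies the result. The four treatment labels, the seed, and the function names are illustrative assumptions, and this simple shuffle is only one of several possible randomization schemes.

```python
import random


def cyclic_latin_square(treatments):
    """Build a t x t square in which row r is the treatment list shifted by r."""
    t = len(treatments)
    return [[treatments[(row + col) % t] for col in range(t)] for row in range(t)]


def randomize_square(square, rng):
    """Shuffle whole rows and whole columns; the Latin property is preserved."""
    rows = square[:]
    rng.shuffle(rows)
    col_order = list(range(len(square)))
    rng.shuffle(col_order)
    return [[row[c] for c in col_order] for row in rows]


def is_latin_square(square, treatments):
    """Check that every row and every column holds each treatment exactly once."""
    expected = sorted(treatments)
    rows_ok = all(sorted(row) == expected for row in square)
    cols_ok = all(sorted(col) == expected for col in zip(*square))
    return rows_ok and cols_ok


treatments = ["A", "B", "C", "D"]  # hypothetical treatment labels
layout = randomize_square(cyclic_latin_square(treatments), random.Random(2023))

for row in layout:
    print(" ".join(row))
print(is_latin_square(layout, treatments))
```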
The analysis of variance for this experimental design, assuming that there are
r rows, c columns, and t treatments, with r = c = t, contains the following sources of
variability (Table 5.27).
From the analysis of variance table, the linear model for an LS design with
t treatments is as follows:
$$y_{ijk} = \mu + f_j + c_k + \tau_i + \varepsilon_{ijk}$$
where $y_{ijk}$ is the response observed for treatment i in row j and column k, μ is the
overall mean, $f_j$ is the random effect of row j, assuming $f_j \sim N(0, \sigma^2_f)$, $c_k$ is the
random effect of column k, with $c_k \sim N(0, \sigma^2_c)$, $\tau_i$ is the fixed effect of
treatment i, and $\varepsilon_{ijk}$ is the random error term, distributed $N(0, \sigma^2)$. Note that the
treatments are allocated to the (j, k)th cell (row j and column k).
where $\eta_{ijkl} = \eta + \tau_i + f_j + c_k$ is the linear predictor for repetition l (l = 1, 2) in
row j (j = 1, 2, ⋯, 6) and column k (k = 1, 2, ⋯, 6) when treatment i is applied
(i = 1, 2, 3), η is the intercept, $\tau_i$ is the fixed effect of treatment i, $f_j$ is the random
effect of row j, and $c_k$ is the random effect of column k, assuming that there is no
interaction between rows and columns, between treatments and rows, or between
treatments and columns. The assumed distributions for rows and columns are
$f_j \sim N(0, \sigma^2_f)$ and $c_k \sim N(0, \sigma^2_c)$, respectively. The model uses the
linear predictor ($\eta_{ijkl}$) to estimate the treatment means ($\lambda_{ijkl} = \mu_{ijkl}$).
The following GLIMMIX program fits a Latin square design with a Poisson
response:
Part of the output is shown in Table 5.28. In the fit statistics (part (a)), the value
of Pearson’s chi-square divided by the degrees of freedom is less than 1
($\chi^2/DF = 0.55$), indicating that there is no overdispersion in the data and that the
Poisson distribution adequately models the dataset.
The type III tests of fixed effects in part (b) indicate that there is no significant
evidence of differences between the treatments (P = 0.0621).
Part (c) of Table 5.28 shows the estimates of treatments on the model scale
(“Estimate”) and on the data scale (“Mean”) with their respective standard errors.
The values 4.6191, 6.9396, and 5.1561 (under the “Mean” column) correspond to
the treatment means for S, TR, and U, respectively.
factor (B) are applied to the secondary subunits formed within the primary unit in
which the first factor was applied. In other words, the primary experimental unit
(whole plot) was used for the application of the first factor; then, after this, it was
divided to form the secondary experimental units (subplots) for the application of the
levels of the second factor. Since the split-plot design has two levels of experimental
units, the whole plot portions (primary units) and subplots (secondary units) have
different experimental errors. Split-plot experiments were invented in agriculture by
Fisher (1925), and their importance in industrial experimentation has been widely
recognized (Yates 1935).
As a simple illustration, consider a study of the effect of three pulp preparation
methods (factor A) and four temperature levels (factor B) on paper tensile strength
(paper quality). A batch of pulp is produced by one of the three methods; it is then
divided into four equal portions (samples), and each portion is cooked at a specific
temperature. The assignment of treatments to plots and subplots is shown in
Table 5.29.
The standard ANOVA model for two factors in a split-plot design, in which there
are three levels of factor A and four levels of factor B nested within factor A, is
described below:
fixed effect at level i of factor A and at level j of factor B, and $\varepsilon_{ijk}$ is the normal
random experimental error ($\varepsilon_{ijk} \sim \text{iid } N(0, \sigma^2)$). The ANOVA table with the
sources of variation for this experimental design is shown in Table 5.30.
Example 5.1 A split-plot design in randomized complete block arrangement with a
Poisson response
A split plot is probably the most common design structure in plant and soil
research. Such experiments involve two or more treatment factors. Typically, large
units called whole plots are grouped into blocks. The levels of the first factor are
randomly assigned to whole plots. Each whole plot is divided into smaller units,
called subplots (split plots). Next, the levels of the second factor are randomly
assigned to units of split plots within each whole plot.
In this example, each of four blocks was divided into seven whole plots, to which the
seven levels of the first factor (A1, A2, A3, A4, A5, A6, and A7) were randomly
assigned. Each whole plot was then divided into four subplots, to which the four
levels of factor B (B1, B2, B3, and B4) were randomly assigned. Both factors were
used to control the growth of a particular weed. The randomization within each block
is shown below:
Block 1 Block 4
A1 A7 A3 A2 A5 A4 A6 ⋯ A6 A3 A7 A2 A1 A5 A4
B3 B3 B4 B1 B2 B1 B3 B3 B3 B4 B1 B2 B1 B3
B1 B2 B3 B3 B1 B2 B2 ⋯ B1 B2 B3 B3 B1 B2 B2
B2 B4 B1 B4 B3 B3 B4 B2 B4 B1 B4 B3 B3 B4
B4 B1 B2 B2 B4 B4 B1 ⋯ B4 B1 B2 B2 B4 B4 B1
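The two-stage randomization of this layout can be sketched as follows: within each block the seven levels of A are shuffled over the whole plots, and within each whole plot the four levels of B are shuffled over the subplots. The function name and the seed are hypothetical, and the resulting plan is only an example; it does not reproduce the arrangement shown above.

```python
import random


def randomize_split_plot(blocks, whole_plot_levels, subplot_levels, seed=None):
    """Randomize A over whole plots within blocks, then B within each whole plot."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        whole_order = list(whole_plot_levels)
        rng.shuffle(whole_order)  # first stage: levels of A within the block
        layout[block] = []
        for a_level in whole_order:
            sub_order = list(subplot_levels)
            rng.shuffle(sub_order)  # second stage: levels of B within the whole plot
            layout[block].append((a_level, sub_order))
    return layout


A_LEVELS = [f"A{i}" for i in range(1, 8)]
B_LEVELS = [f"B{j}" for j in range(1, 5)]

plan = randomize_split_plot([1, 2, 3, 4], A_LEVELS, B_LEVELS, seed=1)
for a_level, sub_order in plan[1]:
    print(a_level, sub_order)
```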
The sources of variation and degrees of freedom for this experiment are shown
below in Table 5.31:
In this experiment, the response variable was the number of weeds in each of the
plots (Appendix 1: Weed counts). The components that define this GLMM are as
shown below:
where $\eta_{ijk} = \eta + \alpha_i + \beta_j + (\alpha\beta)_{ij} + r_k + \alpha(r)_{ik}$ is the linear predictor for
level i of factor A (i = 1, 2, ⋯, 7) and level j of factor B (j = 1, 2, 3, 4) in block k
(k = 1, 2, 3, 4); η is the intercept, $\alpha_i$ is the fixed effect at level i of factor A, $\beta_j$ is
the fixed effect at level j of factor B, $(\alpha\beta)_{ij}$ is the fixed effect of the interaction
between level i of factor A and level j of factor B, $r_k$ is the random effect due to
block, and $\alpha(r)_{ik}$ is the random error effect of the whole plot, assuming
$r_k \sim N(0, \sigma^2_r)$ and $\alpha(r)_{ik} \sim N(0, \sigma^2_{AR})$, respectively. The model uses the
aforementioned linear predictor ($\eta_{ijk}$) to estimate the treatment means ($\lambda_{ijk} = \mu_{ijk}$).
The following GLIMMIX program fits a split-plot block design with a Poisson
response variable:
Table 5.32 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (Conteo | r. effects) 1053.96
Pearson’s chi-square 504.44
Pearson’s chi-square/DF 4.50
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
Bloque 0.01526 0.03867
Bloque*A 0.2454 0.07565
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
A 6 18 2.32 0.0775
B 3 63 22.91 <0.0001
A*B 18 63 10.06 <0.0001
In the fit statistics (part (a) of Table 5.32), the value of Pearson’s chi-square
divided by the degrees of freedom is greater than 1 ($\chi^2/df = 4.50$). This indicates
that we have probably misspecified either the conditional distribution of $y \mid b$ or the
linear predictor; in this case, the evidence suggests looking for another
distribution for this dataset. In addition, part (b) tabulates the variance component
estimates due to blocks and blocks × A ($\hat{\sigma}^2_r = 0.01526$; $\hat{\sigma}^2_{AR} = 0.2454$).
On the other hand, the type III tests of fixed effects (part (c)) show a significant
effect of factor B and of the interaction between both factors.
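The denominator degrees of freedom in part (c) of Table 5.32 (18 for A; 63 for B and A*B) agree with the classical partition for a split plot in randomized complete blocks. The sketch below computes that partition for 4 blocks, 7 levels of A, and 4 levels of B; the function name and the labels of the sources are illustrative.

```python
def split_plot_rcbd_df(n_blocks, a_levels, b_levels):
    """Degrees of freedom for a split plot in randomized complete blocks."""
    df = {
        "Blocks": n_blocks - 1,
        "A": a_levels - 1,
        "Blocks x A (whole-plot error)": (n_blocks - 1) * (a_levels - 1),
        "B": b_levels - 1,
        "A x B": (a_levels - 1) * (b_levels - 1),
        "Subplot error": a_levels * (n_blocks - 1) * (b_levels - 1),
    }
    df["Total"] = n_blocks * a_levels * b_levels - 1
    return df


weed_df = split_plot_rcbd_df(n_blocks=4, a_levels=7, b_levels=4)
for source, value in weed_df.items():
    print(f"{source}: {value}")
```

Factor A is tested against the whole-plot error, whereas B and A × B are tested against the subplot error, which is why the two denominators differ.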
An alternative for reducing the overdispersion is to keep the same linear predictor
and replace the Poisson distribution of the response variable with the negative
binomial distribution, that is:
Part of the output is shown below (Table 5.33). The results tabulated in part (a)
indicate that the overdispersion has been removed from the analysis
Table 5.33 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 log L (Conteo | r. effects) 838.51
Pearson’s chi-square 79.36
Pearson’s chi-square/DF 0.71
(b) Covariance parameter estimates
Cov Parm Subject Estimate Standard error
Intercept Bloque 0.002421 0.02768
A Bloque 0.1222 0.07102
Scale 0.3458 0.06875
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
A 6 18 2.71 0.0473
B 3 63 2.13 0.1054
A*B 18 63 1.18 0.3017
($\chi^2/df = 0.71$). The variance component estimates, tabulated in part (b), are
$\hat{\sigma}^2_r = 0.0024$ and $\hat{\sigma}^2_{AR} = 0.1222$ for blocks and blocks × A, respectively. The
estimated scale parameter is ϕ = 0.3458. Note that the results under the negative
binomial distribution differ from those obtained under the Poisson distribution
because the negative binomial distribution better captures overdispersion. The fixed
effects F-test for factor A is significant at the 5% significance level (part (c)),
whereas factor B and the interaction do not significantly influence the response
variable.
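As a quick arithmetic check of the two fits, the sketch below recomputes the dispersion ratios from the Pearson chi-square values in Tables 5.32 and 5.33. It assumes that the divisor is the number of observations, 4 blocks × 7 levels of A × 4 levels of B = 112, with no missing plots; this assumption reproduces the tabulated ratios but is not stated in the output itself.

```python
N_OBS = 4 * 7 * 4  # blocks x levels of A x levels of B (assumed complete data)

# Pearson chi-square values reported in part (a) of Tables 5.32 and 5.33
pearson_chi_square = {"Poisson": 504.44, "negative binomial": 79.36}

ratios = {name: chi2 / N_OBS for name, chi2 in pearson_chi_square.items()}

for name, ratio in ratios.items():
    print(f"{name}: chi-square/DF = {ratio:.2f}")
```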
Example 5.2 A split-split plot in time in a randomized complete block design with a
Poisson response.
The propagation of coffee seedlings through grafting in nurseries depends on
several factors, such as the type of substrate, the rootstock that will host the
graft, the type of graft, light intensity, the type and size of the container,
humidity, temperature, and so forth. The objective of this experiment was to
evaluate the effect of shade cloth (light intensity), type of container, and clone on
the number of leaves produced by Coffea canephora P. clones grafted with the Coffea
arabica L. variety Oro azteca.
The factors studied were the color of the shade cloth (black, pearl, and red),
container size (tubes of 0.5 kg and 1 kg), and five coffee clones of Coffea
canephora P. plus a franc foot, i.e., an own-rooted plant (Pf; Coffea arabica L. var.
Oro azteca), observed over a period of 11 months (Appendix 1: Coffee data). The
clones used in the experiment are listed below (Table 5.34). Different physiological
parameters were evaluated during these 11 months.
This work was implemented in four randomized complete blocks. The following
table exemplifies how a block was constructed.
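$$
\begin{aligned}
y_{ijklm} ={} & \mu + \alpha_i + r_m + (\alpha r)_{im} + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} \\
& + (r\alpha\beta\gamma)_{ijkm} + \tau_l + (\alpha\tau)_{il} + (\beta\tau)_{jl} + (\alpha\beta\tau)_{ijl} + (\gamma\tau)_{kl} + (\alpha\gamma\tau)_{ikl} \\
& + (\beta\gamma\tau)_{jkl} + (\alpha\beta\gamma\tau)_{ijkl} + \varepsilon_{ijklm}
\end{aligned}
$$
i = 1, 2, 3; j = 1, 2, 3, 4, 5; k = 1, 2, 3; l = 1, ⋯, 11; m = 1, 2, 3, 4
where $y_{ijklm}$ is the response variable in repetition m, shade cloth i, clone j, and tray
k at time l; μ is the overall mean; $\alpha_i$ is the fixed effect due to the type of shade
cloth; $\beta_j$, $\gamma_k$, and $\tau_l$ are the fixed effects due to clone type, tray, and sampling
time, respectively; $(\alpha\beta)_{ij}$, $(\alpha\gamma)_{ik}$, $(\beta\gamma)_{jk}$, $(\alpha\tau)_{il}$, $(\beta\tau)_{jl}$, and $(\gamma\tau)_{kl}$ are the
effects of the two-way interactions among shade cloth type, clone, tray, and sampling
time; $(\alpha\beta\gamma)_{ijk}$, $(\alpha\beta\tau)_{ijl}$, $(\alpha\gamma\tau)_{ikl}$, $(\beta\gamma\tau)_{jkl}$, and $(\alpha\beta\gamma\tau)_{ijkl}$ are the effects of
the three- and four-way interactions of the factors under study; $r_m$, $(\alpha r)_{im}$, and
$(r\alpha\beta\gamma)_{ijkm}$ are the random effects due to blocks, blocks × type of shade cloth, and
blocks × shade cloth × clone × tray, respectively, assuming $r_m \sim N(0, \sigma^2_r)$,
$(\alpha r)_{im} \sim N(0, \sigma^2_{r\alpha})$, and $(r\alpha\beta\gamma)_{ijkm} \sim N(0, \sigma^2_{\alpha\beta\gamma(rep)})$; and $\varepsilon_{ijklm}$ is
the random error, $\varepsilon_{ijklm} \sim N(0, \sigma^2)$.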
The following SAS program fits a GLMM in a split-split plot in time under a
randomized complete block design with a Poisson response.
Some of the results are listed below. To study which correlation structure best fits
this experimental design, five covariance structures were tested (Table 5.35):
compound symmetry (“CS”), first-order autoregressive (“AR(1)”), unstructured (“UN”),
Toeplitz of order 1 (“Toep(1)”), and first-order antedependence (“ANTE(1)”). To do
this, the structure to be tested is specified with the “type” option of the “random”
statement; this is the option that must be changed to switch between
variance–covariance structures. The fit statistics indicate that the
variance–covariance structure that best fits the model is the first-order
autoregressive structure (AR(1)). This can be seen in Table 5.35, which reports the
goodness-of-fit statistics used to choose among these variance–covariance structures.
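To make the difference between two of these structures concrete, the sketch below builds the correlation matrices implied by compound symmetry and by a first-order autoregressive structure for the 11 monthly measurements. The correlation value 0.6 is hypothetical and is not an estimate from this experiment; the function names are illustrative.

```python
def compound_symmetry(n_times, rho):
    """Equal correlation rho between any two distinct time points."""
    return [[1.0 if i == j else rho for j in range(n_times)] for i in range(n_times)]


def ar1(n_times, rho):
    """Correlation rho**|i - j|, which decays as time points move apart."""
    return [[rho ** abs(i - j) for j in range(n_times)] for i in range(n_times)]


N_MONTHS = 11
RHO = 0.6  # hypothetical correlation, for illustration only

cs = compound_symmetry(N_MONTHS, RHO)
ar = ar1(N_MONTHS, RHO)

print("CS,    month 1 vs months 1 to 4:", cs[0][:4])
print("AR(1), month 1 vs months 1 to 4:", [round(v, 3) for v in ar[0][:4]])
```

Under compound symmetry, measurements 1 and 11 months apart are as correlated as consecutive ones, whereas under AR(1) the correlation fades with the time lag, which is often more plausible for repeated counts on the same unit.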
Table 5.36 shows the conditional fit statistics and variance component estimates.
The fit statistic Pearson’s chi-square/DF = 0.57 in part (a) indicates that, for the
conditional model, there is no evidence that the distribution or the linear predictor
is misspecified. In other words, there is no overdispersion in the dataset, and it is
therefore reasonable to base the analysis and inference on the Poisson model.
The type III tests of fixed effects (Table 5.37) indicate highly significant main
effects of type of shade cloth (P = 0.0001), clone (P = 0.0001), and tray
(P = 0.0001), as well as of most of the interactions, except for the interactions
shade_cloth*clone (P = 0.3846),
Table 5.38 Estimated means on the model scale and on the data scale for the shade cloth
(a) Shade cloth least squares means
Shade cloth Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
Black 1.6221 0.01542 2 105.17 <0.0001 5.0638 0.07810
Pearl 1.5472 0.01533 2 100.94 <0.0001 4.6981 0.07201
Red 1.7184 0.01301 2 132.09 <0.0001 5.5757 0.07254
(b) T grouping of shade cloth least squares means (α=0.05)
LS means with the same letter are not significantly different
Shade cloth Estimate (τi)
Red 1.7184 A
Black 1.6221 B
Pearl 1.5472 B
Table 5.39 Estimated means on the model scale and on the data scale for the type of clone
(a) Clone least squares means
Clone Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
C1 1.5008 0.04989 153 30.08 <0.0001 4.4854 0.2238
C2 1.4250 0.05080 153 28.05 <0.0001 4.1578 0.2112
C3 1.5064 0.05019 153 30.02 <0.0001 4.5106 0.2264
C4 1.4750 0.05029 153 29.33 <0.0001 4.3709 0.2198
C5 1.5965 0.04970 153 32.12 <0.0001 4.9357 0.2453
Pf 1.6344 0.04943 153 33.07 <0.0001 5.1264 0.2534
(b) T grouping of clone least squares means (α=0.05)
LS means with the same letter are not significantly different
Clone Estimate
Pf 1.6344 A
C5 1.5965 A
C3 1.5064 B
C1 1.5008 B
C4 1.4750 C B
C2 1.4250 C
Table 5.40 Estimated means on the model scale and on the data scale for the tray factor
(a) Tray least squares means
Tray Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
CH1 1.3843 0.04859 28.49 <0.0001 3.9921 0.1940
CH2 1.5665 0.04838 32.38 <0.0001 4.7898 0.2317
CH3 1.6183 0.04819 33.58 <0.0001 5.0443 0.2431
(b) T grouping of tray least squares means (α=0.05)
LS means with the same letter are not significantly different
Tray Estimate
CH3 1.6183 A
CH2 1.5665 B
CH1 1.3843 C
Table 5.40 presents the estimates for the levels of the tray on both scales (part (a)).
Similarly, in this table (part (b)), the treatment mean comparisons are presented for
the levels of the tray.
Tables 5.41, 5.42, 5.43, and 5.44 show the means and standard errors on both
scales of the two-factor and three-factor interactions.
Interaction type of shade cloth*clone
Interaction type of shade cloth*tray
Interaction clone*tray
Interaction shade*clone*tray
Although it is not the objective of this book, some of the results are discussed
below. Fig. 5.8 shows that the red shade cloth significantly stimulates leaf
production in coffee grafts, followed by the black and pearl shade cloths. The
Table 5.41 Estimated means on the model scale and on the data scale for the type of shade
cloth*clone
Shade cloth*clone least squares means
Shade cloth Clone Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
Black C1 1.5109 0.06230 153 24.25 <0.0001 4.5307 0.2823
Black C2 1.3340 0.06507 153 20.50 <0.0001 3.7961 0.2470
Black C3 1.4990 0.06354 153 23.59 <0.0001 4.4771 0.2845
Black C4 1.4485 0.06425 153 22.54 <0.0001 4.2566 0.2735
Black C5 1.5916 0.06163 153 25.83 <0.0001 4.9118 0.3027
Black pf 1.6219 0.06114 153 26.53 <0.0001 5.0628 0.3095
Pearl C1 1.3835 0.07711 153 17.94 <0.0001 3.9889 0.3076
Pearl C2 1.3926 0.07781 153 17.90 <0.0001 4.0254 0.3132
Pearl C3 1.4028 0.07589 153 18.49 <0.0001 4.0666 0.3086
Pearl C4 1.4288 0.07575 153 18.86 <0.0001 4.1736 0.3161
Pearl C5 1.5216 0.07536 153 20.19 <0.0001 4.5797 0.3451
Pearl pf 1.5285 0.07458 153 20.50 <0.0001 4.6112 0.3439
Red C1 1.6081 0.06991 153 23.00 <0.0001 4.9933 0.3491
Red C2 1.5483 0.07100 153 21.81 <0.0001 4.7036 0.3339
Red C3 1.6175 0.07055 153 22.93 <0.0001 5.0404 0.3556
Red C4 1.5477 0.07072 153 21.88 <0.0001 4.7005 0.3324
Red C5 1.6762 0.06971 153 24.04 <0.0001 5.3451 0.3726
Red pf 1.7528 0.06923 153 25.32 <0.0001 5.7707 0.3995
Table 5.42 Estimated means on the model scale and on the data scale for the interaction type of
shade cloth*tray
Shade*tray least squares means
Shade cloth Tray Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
Black CH1 1.4274 0.05961 153 23.94 <0.0001 4.1679 0.2485
Black CH2 1.5523 0.05846 153 26.55 <0.0001 4.7224 0.2761
Black CH3 1.5232 0.05824 153 26.15 <0.0001 4.5869 0.2672
Pearl CH1 1.2070 0.07354 153 16.41 <0.0001 3.3434 0.2459
Pearl CH2 1.4972 0.07218 153 20.74 <0.0001 4.4691 0.3226
Pearl CH3 1.6247 0.07145 153 22.74 <0.0001 5.0771 0.3628
Red CH1 1.5185 0.06733 153 22.55 <0.0001 4.5655 0.3074
Red CH2 1.6499 0.06714 153 24.57 <0.0001 5.2066 0.3496
Red CH3 1.7068 0.06732 153 25.35 <0.0001 5.5114 0.3710
production of leaves in coffee grafts shows a bimodal pattern, which may be due to
factors such as humidity and temperature. Extreme conditions of both factors cause
stress at the growing points and, therefore, affect the appearance of leaves.
Regarding the type of clone used as rootstock, the clones showed their highest
average leaf production in months 5 and 6, whereas the lowest production was observed in
Table 5.43 Estimated means on the model scale and on the data scale for the clone–tray interaction
Clone*tray least squares means
Clone Tray Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
C1 CH1 1.3916 0.06112 153 22.77 <0.0001 4.0214 0.2458
C1 CH2 1.5502 0.05861 153 26.45 <0.0001 4.7122 0.2762
C1 CH3 1.5607 0.05861 153 26.63 <0.0001 4.7622 0.2791
C2 CH1 1.2242 0.06459 153 18.95 <0.0001 3.4014 0.2197
C2 CH2 1.4780 0.06029 153 24.51 <0.0001 4.3843 0.2644
C2 CH3 1.5727 0.05890 153 26.70 <0.0001 4.8196 0.2839
C3 CH1 1.2924 0.06114 153 21.14 <0.0001 3.6414 0.2226
C3 CH2 1.5433 0.05975 153 25.83 <0.0001 4.6799 0.2796
C3 CH3 1.6836 0.05841 153 28.83 <0.0001 5.3851 0.3145
C4 CH1 1.2982 0.06251 153 20.77 <0.0001 3.6626 0.2289
C4 CH2 1.5829 0.05815 153 27.22 <0.0001 4.8690 0.2831
C4 CH3 1.5439 0.05939 153 26.00 <0.0001 4.6828 0.2781
C5 CH1 1.5311 0.05843 153 26.20 <0.0001 4.6234 0.2702
C5 CH2 1.5981 0.05920 153 26.99 <0.0001 4.9438 0.2927
C5 CH3 1.6602 0.05803 153 28.61 <0.0001 5.2604 0.3053
pf CH1 1.5684 0.05794 153 27.07 <0.0001 4.7989 0.2781
pf CH2 1.6464 0.05833 153 28.23 <0.0001 5.1884 0.3026
pf CH3 1.6884 0.05728 153 29.48 <0.0001 5.4107 0.3099
months 1, 2, 8, and 9. The franc foot showed a higher average number of leaves than
the clones (Fig. 5.9).
5.3 Exercises
Exercise 5.3.1 A researcher in the area of plant sciences wants to know how the
number of shoots produced by an explant (yij) in an in vitro plant culture responds
to different concentrations (ppm) of a chemical compound. The data for this
experiment are given below (Table 5.45):
(a) Write down the analysis of variance table (sources of variation and degrees of
freedom).
(b) Write down the components of the GLMM.
(c) Analyze the dataset with the model proposed in (b).
(d) Compare and contrast the results of these analyses. If necessary, reanalyze the
dataset using the same model as above, but, now, assume that the data have a
negative binomial distribution.
(e) Summarize the relevant results.
Table 5.44 Estimated means on the model scale and on the data scale for the shade–clone–tray
interaction
Shade*clone*tray least squares means
Shade cloth Clone Tray Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
Black C1 CH1 1.2821 0.1528 153 8.39 <0.0001 3.6041 0.5509
Black C1 CH2 1.4143 0.1521 153 9.30 <0.0001 4.1136 0.6258
Black C1 CH3 1.2201 0.1538 153 7.93 <0.0001 3.3874 0.5209
Black C2 CH1 0.8131 0.1615 153 5.04 <0.0001 2.2549 0.3641
Black C2 CH2 1.1486 0.1543 153 7.45 <0.0001 3.1538 0.4866
Black C2 CH3 1.3376 0.1533 153 8.72 <0.0001 3.8100 0.5842
Black C3 CH1 1.1809 0.1548 153 7.63 <0.0001 3.2574 0.5041
Black C3 CH2 1.1105 0.1550 153 7.17 <0.0001 3.0359 0.4705
Black C3 CH3 1.3672 0.1528 153 8.95 <0.0001 3.9242 0.5996
Black C4 CH1 0.7672 0.1608 153 4.77 <0.0001 2.1538 0.3462
Black C4 CH2 1.4660 0.1517 153 9.66 <0.0001 4.3318 0.6573
Black C4 CH3 1.3925 0.1523 153 9.14 <0.0001 4.0250 0.6131
Black C5 CH1 1.2316 0.1538 153 8.01 <0.0001 3.4269 0.5270
Black C5 CH2 1.6090 0.1507 153 10.67 <0.0001 4.9979 0.7534
Black C5 CH3 1.4684 0.1515 153 9.70 <0.0001 4.3422 0.6577
Black Pf CH1 1.6751 0.1503 153 11.15 <0.0001 5.3393 0.8025
Black Pf CH2 1.3126 0.1548 153 8.48 <0.0001 3.7160 0.5753
Black Pf CH3 1.5092 0.1511 153 9.99 <0.0001 4.5231 0.6834
Pearl C1 CH1 0.6441 0.1741 153 3.70 0.0003 1.9043 0.3314
Pearl C1 CH2 1.3602 0.1639 153 8.30 <0.0001 3.8970 0.6387
Pearl C1 CH3 1.6030 0.1633 153 9.82 <0.0001 4.9678 0.8111
Pearl C2 CH1 0.6336 0.1741 153 3.64 0.0004 1.8844 0.3281
Pearl C2 CH2 1.2050 0.1672 153 7.21 <0.0001 3.3366 0.5579
Pearl C2 CH3 1.5547 0.1635 153 9.51 <0.0001 4.7335 0.7740
Pearl C3 CH1 0.8786 0.1684 153 5.22 <0.0001 2.4074 0.4053
Pearl C3 CH2 1.2777 0.1646 153 7.76 <0.0001 3.5885 0.5905
Pearl C3 CH3 1.5724 0.1637 153 9.60 <0.0001 4.8184 0.7889
Pearl C4 CH1 0.9893 0.1680 153 5.89 <0.0001 2.6893 0.4519
Pearl C4 CH2 1.4198 0.1636 153 8.68 <0.0001 4.1362 0.6769
Pearl C4 CH3 1.4357 0.1646 153 8.72 <0.0001 4.2026 0.6919
Pearl C5 CH1 1.4557 0.1631 153 8.93 <0.0001 4.2875 0.6992
Pearl C5 CH2 1.1672 0.1696 153 6.88 <0.0001 3.2130 0.5449
Pearl C5 CH3 1.6010 0.1633 153 9.80 <0.0001 4.9582 0.8098
Pearl Pf CH1 1.1901 0.1649 153 7.22 <0.0001 3.2875 0.5422
Pearl Pf CH2 1.4004 0.1643 153 8.52 <0.0001 4.0570 0.6665
Pearl Pf CH3 1.7623 0.1620 153 10.88 <0.0001 5.8260 0.9440
Red C1 CH1 1.5245 0.1606 153 9.49 <0.0001 4.5930 0.7379
Red C1 CH2 1.6004 0.1605 153 9.97 <0.0001 4.9548 0.7953
Red C1 CH3 1.6327 0.1607 153 10.16 <0.0001 5.1178 0.8224
Red C2 CH1 1.3462 0.1630 153 8.26 <0.0001 3.8430 0.6264
(continued)
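[Fig. 5.8: line plot of the average number of leaves (vertical axis) against time in months, 1 to 11 (horizontal axis), with one line per shade cloth: Black, Pearl, and Red.]
[Fig. 5.9: line plot against time in months, 1 to 11 (horizontal axis), with one line per clone: Clone 1, Clone 2, Clone 3, Clone 4, Clone 5, and Pf.]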
Exercise 5.3.2 Earthworms (Lumbricus terrestris L.) were counted in four replicates
of a factorial experiment at the W.K. Kellogg Biological Station in Battle Creek,
Michigan, in 1995. A $2^4$ factorial experiment was conducted. Factors and treatment
levels were plowing (chiseled and unplowed), input level (conventional and low),
manure application (yes/no), and crop (corn and soybean). The objective was to
determine whether L. terrestris density varies according to these management
protocols and how the various factors act and interact. The (unpooled) data in the
table are the total worm counts (per square foot), juvenile and adult worms combined,
for the 64 ($2^4 \times 4$) experimental units of the $2^4$ factorial design. The numbers in
each cell of the table correspond to the counts in the replicates (Table 5.46).
(a) Write down the analysis of variance table (sources of variation and degrees of
freedom).
(b) Write down the components of the GLMM.
(c) Analyze the dataset with the model proposed in (b).
(d) Summarize the relevant results.
(a) Write down the analysis of variance table (sources of variation and degrees of
freedom).
(b) Write down the components of the GLMM.
(c) Analyze the dataset with the model proposed in (b).
(d) Reanalyze the dataset using the same model as above, but, now, assume that the
data have a negative binomial distribution.
(e) Compare and contrast the results of these analyses.
(f) Summarize the relevant results.
Exercise 5.3.5 The following example deals with one of the most harmful insects attacking
the root system of the main crops, whose common name is “blind hen.” The
experiment consisted of six treatments (A, B, C, D, E, and F) formulated for larval
control and arranged in randomized blocks. The count per area shows the number of
larvae in two age groups (a and b) (Table 5.49).
Table 5.48 Factors T (Treatment: 1: Fecundin; 2: Control), A (Age in years: 1: ≤0.5; 2: >0.5 to ≤1.5; 3: >1.5 to ≤2.5; 4: ≥2.5); M (Mating period: 1: October
1; 2: October 22); n (number of ovulations), and x (number of fetuses)
T A M n x T A M n x T A M n x T A M n x T A M n x
1 1 1 2 1 1 2 2 2 1 1 4 1 3 2 2 2 1 3 2 2 3 1 3 2
1 1 1 2 2 1 2 2 2 2 1 4 1 3 3 2 2 1 2 1 2 3 1 2 2
1 1 1 2 2 1 2 2 2 1 1 4 1 2 2 2 2 1 1 1 2 3 2 2 1
1 1 1 1 1 1 2 2 3 2 1 4 1 3 3 2 2 1 2 1 2 3 2 2 2
1 1 1 2 1 1 2 2 2 2 1 4 2 2 2 2 2 1 2 2 2 3 2 2 1
1 1 1 2 2 1 2 2 3 3 1 4 2 4 4 2 2 1 2 1 2 3 2 2 2
1 1 2 2 1 1 2 2 3 2 1 4 2 2 2 2 2 1 2 2 2 3 2 2 2
1 1 2 2 1 1 2 2 3 1 1 4 2 4 3 2 2 1 3 2 2 3 2 2 2
1 1 2 1 1 1 2 2 3 2 1 4 2 2 2 2 2 1 2 2 2 4 1 2 1
1 1 2 2 1 1 2 2 2 1 1 4 2 2 2 2 2 2 2 1 2 4 1 2 2
1 1 2 2 1 1 3 1 3 3 1 4 2 5 2 2 2 2 2 2 2 4 1 2 2
1 1 2 2 1 1 3 1 3 2 2 1 1 1 1 2 2 2 2 2 2 4 1 2 2
1 1 2 1 1 1 3 1 2 1 2 1 1 2 1 2 2 2 2 2 2 4 1 2 2
1 2 1 2 1 1 3 1 3 3 2 1 1 2 1 2 2 2 2 2 2 4 1 3 3
1 2 1 2 2 1 3 1 2 2 2 1 1 1 1 2 2 2 2 2 2 4 1 2 2
1 2 1 2 2 1 3 1 2 1 2 1 1 1 1 2 2 2 2 2 2 4 1 2 2
1 2 1 4 2 1 3 2 4 4 2 1 2 1 1 2 2 2 1 1 2 4 2 2 2
1 2 1 3 2 1 3 2 4 3 2 1 2 1 1 2 2 2 2 2 2 4 2 2 2
1 2 1 2 2 1 3 2 4 1 2 1 2 1 1 2 2 2 1 1 2 4 2 2 1
1 2 1 2 2 1 3 2 3 3 2 1 2 1 1 2 2 2 1 1 2 4 2 2 2
1 2 1 2 1 1 3 2 3 2 2 1 2 1 1 2 3 1 2 2 2 4 2 2 2
1 2 1 2 2 1 3 2 2 2 2 1 2 1 1 2 3 1 3 3 2 4 2 3 3
1 2 1 3 1 1 4 1 3 2 2 2 1 2 2 2 3 1 2 1 2 4 2 3 3
1 2 2 2 2 1 4 1 2 1 2 2 1 1 1 2 3 1 2 2
(a) Write down the analysis of variance table (sources of variation and degrees of
freedom).
(b) Write down the components of the GLMM.
(c) Analyze the dataset with the model proposed in (b).
(d) Does the proposed model in (b) adequately describe the variation observed in the
dataset? Summarize the relevant results.
Appendix 1
Data: Subcultures
sub1 Rep1 NB sub1 Rep1 NB sub1 Rep1 NB sub1 Rep1 NB
1 1 18 3 2 24 6 1 45 8 9 53
1 2 16 3 3 24 6 2 44 8 10 59
1 3 15 3 4 19 6 3 45 8 11 57
1 4 15 3 5 25 6 4 44 8 12 65
1 5 11 3 6 24 6 5 52 8 13 63
1 6 17 3 7 20 6 6 47 8 14 55
1 7 10 3 8 24 6 7 46 8 15 50
1 8 8 3 9 20 6 8 45 8 16 52
1 9 17 3 10 19 6 9 48 8 17 55
1 10 13 3 11 26 6 10 56 8 18 50
1 11 16 3 12 22 6 11 54 8 19 53
1 12 15 3 13 23 6 12 44 8 20 52
1 13 12 3 14 24 6 13 54 9 1 48
1 14 15 3 15 23 6 14 62 9 2 44
1 15 8 4 1 24 6 15 55 9 3 54
1 16 8 4 2 28 6 16 45 9 4 55
1 17 15 4 3 29 7 1 56 9 5 51
1 18 15 4 4 34 7 2 62 9 6 58
1 19 14 4 5 24 7 3 45 9 7 47
1 20 8 4 6 24 7 4 45 9 8 42
2 1 15 4 7 25 7 5 46 9 9 50
2 2 11 4 8 28 7 6 48 9 10 48
2 3 12 4 9 24 7 7 55 9 11 48
2 4 18 4 10 32 7 8 45 9 12 53
Data: Beatles
Row Column Treatment Count
1 1 S 3
1 2 U 6
1 3 U 2
1 4 TR 7
1 5 S 1
1 6 TR 5
2 1 TR 5
2 2 S 4
2 3 TR 5
2 4 U 8
2 5 U 6
2 6 S 3
3 1 U 3
3 2 TR 6
3 3 U 4
3 4 S 3
3 5 S 4
3 6 TR 7
4 1 U 3
4 2 TR 4
Block A B Count
1 7 2 20
1 7 3 16
1 7 4 6
2 1 1 9
2 1 2 9
2 1 3 9
2 1 4 19
2 2 1 31
2 2 2 11
2 2 3 30
2 2 4 29
2 3 1 25
2 3 2 11
2 3 3 15
2 3 4 23
2 4 1 7
2 4 2 22
2 4 3 20
2 4 4 3
2 5 1 0
2 5 2 28
2 5 3 18
2 5 4 18
2 6 1 55
2 6 2 58
2 6 3 18
2 6 4 19
2 7 1 14
2 7 2 44
2 7 3 19
2 7 4 17
3 1 1 12
3 1 2 8
3 1 3 44
3 1 4 0
3 2 1 29
3 2 2 11
3 2 3 5
3 2 4 49
3 3 1 99
3 3 2 66
3 3 3 11
3 3 4 15
3 4 1 9
3 4 2 8
3 4 3 9
3 4 4 21
3 5 1 49
3 5 2 49
3 5 3 17
3 5 4 22
3 6 1 41
3 6 2 21
3 6 3 48
3 6 4 11
3 7 1 58
3 7 2 34
3 7 3 28
3 7 4 20
4 1 1 6
4 1 2 9
4 1 3 20
4 1 4 0
4 2 1 10
4 2 2 0
4 2 3 7
4 2 4 9
4 3 1 9
4 3 2 29
4 3 3 22
4 3 4 4
4 4 1 22
4 4 2 31
4 4 3 32
4 4 4 41
4 5 1 112
4 5 2 44
4 5 3 24
4 5 4 28
4 6 1 8
4 6 2 8
4 6 3 11
4 6 4 10
4 7 1 117
4 7 2 78
4 7 3 36
4 7 4 38
Coffee data
Shade Clone Tray Rep y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11
Roja C1 CH1 R1 2 2 4 4 6 4 5 5 5 4 7
Roja C1 CH1 R1 2 2 4 4 2 2 4 2 4 8 12
Roja C1 CH1 R1 2 0 4 5 8 7 5 4 4 6 5
Roja C2 CH1 R1 0 2 5 2 6 3 9 8 8 9 8
Roja C2 CH1 R1 0 0 3 2 6 6 7 6 7 5 6
Roja C2 CH1 R1 2 0 2 2 6 6 6 5 5 10 14
Roja C3 CH1 R1 2 0 4 5 8 3 3 6 2 11 9
Roja C3 CH1 R1 0 2 6 6 8 10 10 10 9 5 9
Roja C3 CH1 R1 2 2 4 5 6 7 7 5 5 5 7
Roja C4 CH1 R1 2 2 6 5 8 9 7 4 3 10 6
Roja C4 CH1 R1 0 2 4 4 8 8 4 4 4 5 7
Roja C4 CH1 R1 2 2 4 6 8 6 6 4 4 4 8
Roja C5 CH1 R1 2 1 5 7 8 6 4 4 2 11 13
Roja C5 CH1 R1 2 2 4 5 6 4 7 7 8 6 11
Roja C5 CH1 R1 0 2 6 8 10 10 8 8 7 4 2
Roja pf CH1 R1 2 2 6 5 8 7 6 2 2 6 12
Roja pf CH1 R1 2 2 4 5 8 8 5 3 2 9 12
Roja pf CH1 R1 2 4 6 7 8 10 11 11 10 11 13
Roja C1 CH1 R2 2 2 5 6 8 9 6 6 6 8 9
Roja C1 CH1 R2 2 2 4 5 8 10 6 4 2 2
Roja C1 CH1 R2 2 2 3 5 6 8 5 4 4 6 7
Roja C2 CH1 R2 0 2 4 6 8 10 6 7 7 9 8
Roja C2 CH1 R2 0 2 4 6 8 10 7 4 3 7 7
Roja C2 CH1 R2 2 2 6 5 2 8 4 5 4 4 4
Roja C3 CH1 R2 2 0 4 6 8 10 9 8 6 13 13
Roja pf CH1 R3 2 3 6 6 8 8 6 5 5 18 23
Roja C1 CH1 R4 2 2 4 5 8 8 2 3 3 2 4
Roja C1 CH1 R4 2 2 6 6 10 8 10 2 2 11 11
Roja C1 CH1 R4 0 2 6 6 8 8 12 3 2 2 10
Roja C2 CH1 R4 2 2 2 2 4 6 6 1
Roja C2 CH1 R4 2 0 2 2 4 3 3 2 6 3 3
Roja C2 CH1 R4 2 2 2 2 2 4 6 6 6 8 9
Roja C3 CH1 R4 2 0 3 2 4 4 3 1 1 1 2
Roja C3 CH1 R4 2 0 4 4 8 10 4 8 6 4 6
Roja C3 CH1 R4 2 0 2 4 6 7 7 6 7 4 5
Roja C4 CH1 R4 2 0 4 5 4 2 1 1 2 1 1
Roja C4 CH1 R4 2 0 2 2 2 4 5 2 2 6 7
Roja C4 CH1 R4 2 2 4 5 6 8 8 6 5 9 9
Roja C5 CH1 R4 2 4 4 5 10 10 6 5 7 10 5
Roja C5 CH1 R4 2 0 4 4 4 6 3 3 2 1 1
Roja C5 CH1 R4 2 0 5 6 8 10 9 4 3 3 2
Roja pf CH1 R4 0 4 6 5 4 8 2 3 11 11
Roja pf CH1 R4 0 2 4 5 8 8 7 4 4 3 6
Roja pf CH1 R4 2 4 6 6 8 10 8 3 3 16 16
Roja C1 CH2 R1 2 2 3 5 6 8 5 10 4 10 14
Roja C1 CH2 R1 2 2 4 6 10 10 11 11 9 9 8
Roja C1 CH2 R1 2 2 4 6 8 10 11 10 11 13 14
Roja C2 CH2 R1 2 2 4 4 8 8 2 1 1 7 11
Roja C2 CH2 R1 2 2 4 4 6 6 3 5 5 1 6
Roja pf CH2 R2 2 4 6 6 8 12 4 1 1 3 4
Roja pf CH2 R2 2 3 6 5 8 10 8 5 6 5 11
Roja pf CH2 R2 2 2 4 6 8 10 10 4 4 5 8
Roja C1 CH2 R3 2 2 6 6 10 10 4 2
Roja C1 CH2 R3 2 2 4 6 10 10 2 1 1 1 4
Roja C1 CH2 R3 2 2 4 5 8 10 6 3 2 1
Roja C2 CH2 R3 2 2 6 6 8 10 3 1 1
Roja C2 CH2 R3 0 2 4 5 8 12 10
Roja C2 CH2 R3 2 0 4 6 8 10 8 9 5 7 8
Roja C3 CH2 R3 2 2 4 6 8 12 10
Roja C3 CH2 R3 2 2 6 6 10 10 10 1
Roja C3 CH2 R3 2 2 6 6 10 10 10 2 1 8 11
Roja C4 CH2 R3 2 2 4 6 8 8 4
Roja C4 CH2 R3 2 2 6 5 8 6 6 2 1
Roja C4 CH2 R3 2 2 6 6 8 8 4 1 1 1
Roja C5 CH2 R3 2 2 6 5 8 10 6 3 3 12 10
Roja C5 CH2 R3 2 0 4 4 8 10 6 3 2 1 2
Roja C5 CH2 R3 2 2 6 8 8 10 6 4 3 3 2
Roja pf CH2 R3 2 3 5 6 10 12 11 7 7 11 11
Roja pf CH2 R3 2 3 5 6 10 12 6 1 1 12 5
Roja pf CH2 R3 2 2 6 6 10 10 5 6 7 12 14
Roja C1 CH2 R4 2 0 4 5 4 8 10 2 2 5 6
Roja C1 CH2 R4 2 2 4 6 8 10 8 6 5 6 9
Roja C1 CH2 R4 2 2 4 6 8 10 12 8 12 14 18
Roja C5 CH3 R1 2 2 4 4 6 8 5 5 7 10 9
Roja C5 CH3 R1 0 0 6 6 8 10 5 4
Roja pf CH3 R1 0 2 6 6 8 10 10 6 3 3 6
Roja pf CH3 R1 2 2 4 6 9 10 8 7 8 6 10
Roja pf CH3 R1 2 2 4 6 8 10 6 4 4 1 7
Roja C1 CH3 R2 2 0 4 6 8 8 10 5 4 4 10
Roja C1 CH3 R2 2 0 4 6 8 8 3 10 9 10 12
Roja C1 CH3 R2 2 0 4 6 8 10 12 7 7 14 17
Roja C2 CH3 R2 2 2 4 6 8 10 12 11 11 12 9
Roja C2 CH3 R2 2 2 2 2 6 8 8 9 8 8
Roja C2 CH3 R2 2 0 4 4 8 8 6 9
Roja C3 CH3 R2 2 2 4 6 10 8 8 1 3 5 5
Roja C3 CH3 R2 2 1 6 6 10 10 10 12 11 17 22
Roja C3 CH3 R2 2 2 6 8 10 10 9 11 8 12 18
Roja C4 CH3 R2 2 2 4 6 8 8 7
Roja C4 CH3 R2 2 2 6 6 9 8 2 2
Roja C4 CH3 R2 2 2 4 6 8 8 8 6 5 10 10
Roja C5 CH3 R2 0 2 6 6 10 10 12 8 8 12 15
Roja C5 CH3 R2 2 2 4 6 8 8 8
Roja C5 CH3 R2 2 2 6 6 8 10 10 10 8 9 12
Roja pf CH3 R2 2 2 4 6 8 10 10 11 12 15 14
Roja pf CH3 R2 2 0 4 6 8 10 8 9 10 5 17
Roja pf CH3 R2 2 2 4 6 8 10 10 9 10 14 20
Roja C1 CH3 R3 2 2 5 6 8 10 10 10 9 12 8
Roja C4 CH3 R4 2 1 6 8 9 10 8 3 2 2
Roja C5 CH3 R4 2 2 6 8 10 8 5 9 10 14 9
Roja C5 CH3 R4 2 2 6 8 10 10 7 5 10 12 14
Roja C5 CH3 R4 2 0 6 8 8 10 10 10 9 14 16
Roja pf CH3 R4 2 2 4 5 8 10 12 10 10 21 24
Roja pf CH3 R4 2 2 4 6 8 10 12 10 11 11 23
Roja pf CH3 R4 0 0 5 6 8 6 8 9 7 9 13
Perla C1 CH1 R1 2 0 4 3 2 2 1
Perla C1 CH1 R1 2 0 4 5 7 5 1
Perla C1 CH1 R1 2 0 2 2 2 4
Perla C2 CH1 R1 2 0 2 2 4 5 6 2 2 1 1
Perla C2 CH1 R1 2 2 6 4 4 4
Perla C2 CH1 R1 2 0 3 2 4 4
Perla C3 CH1 R1 2 2 4 5 8 6 2
Perla C3 CH1 R1 2 2 4 2 3 2 1 4
Perla C3 CH1 R1 2 0 4 5 7 4 3 2 1 2 6
Perla C4 CH1 R1 2 0 2 2 2 7 4 1
Perla C4 CH1 R1 2 3 4 3 6 3 3 3 3
Perla C4 CH1 R1 2 2 4 5 7 6 3 3
Perla C5 CH1 R1 2 2 4 6 5 6 4 2 1 4 8
Perla C5 CH1 R1 2 2 4 4 7 8 7 6 12
Perla C5 CH1 R1 2 2 4 5 7 9 9 4 7 4 14
Perla pf CH1 R1 2 0 6 4 6 4 2 5 14
Perla pf CH1 R1 2 2 4 6 8 6 2 2 1 4 6
Perla C4 CH1 R3 2 2 6 5 9 4 3 1 1 5 9
Perla C4 CH1 R3 2 2 4 7 7 5 5 5 8
Perla C4 CH1 R3 2 2 4 8 6 5 2 2 2 3 2
Perla C5 CH1 R3 2 2 4 6 7 9 5 1 1 5 12
Perla C5 CH1 R3 2 0 4 6 8 5 4 3 2 6 10
Perla C5 CH1 R3 2 4 4 5 7 3 5 3 3 6 7
Perla pf CH1 R3 2 2 6 7 5 4 3
Perla pf CH1 R3 2 4 3 6 7 7 6 6 6 10 11
Perla pf CH1 R3 2 4 6 5 5 4 3 1 1
Perla C1 CH1 R4 2 2 4 4 4 4 4
Perla C1 CH1 R4 2 0 4 3 5 3
Perla C1 CH1 R4 2 0 4 3 6 4 6
Perla C2 CH1 R4 2 0 2 2 2 2
Perla C2 CH1 R4 2 0 4 4 6 5 5 5 4 4
Perla C2 CH1 R4 2 2 4 4 7 4 3 5 2 2
Perla C3 CH1 R4 2 0 4 4 7 6
Perla C3 CH1 R4 2 0 2 2 4 2 3
Perla C3 CH1 R4 2 0 3 3 3 4
Perla C4 CH1 R4 2 0 2 2 2 4 4 2 9 12
Perla C4 CH1 R4 2 0 3 4 6 3
Perla C4 CH1 R4 2 2 4 4 7 3 1 2 4
Perla C5 CH1 R4 2 4 10 7 9 8 8 8 7 8 9
Perla C5 CH1 R4 2 2 4 4 7 8 6 4 3 3 3
Perla C5 CH1 R4 2 2 4 5 7 7 5 3 3 8 13
Perla C3 CH2 R2 2 2 6 6 10 10 8 6 5 8 10
Perla C3 CH2 R2 2 2 4 6 8 8 9 6 7 9 11
Perla C4 CH2 R2 2 2 6 6 9 9 8 8 8 6 9
Perla C4 CH2 R2 2 0 6 6 10 10 10 6 7 7 9
Perla C4 CH2 R2 2 1 4 7 8 10 7 4 4 5 7
Perla C5 CH2 R2 0 2 4 5 10 8 10 6 7 9 12
Perla C5 CH2 R2 2 2 4 6 10 6 5 5
Perla C5 CH2 R2 2 1 4 4 8 8 9 1
Perla pf CH2 R2 2 0 4 6 10 9 6 6 5 4 6
Perla pf CH2 R2 2 2 4 6 10 4 3 1 1 7 7
Perla pf CH2 R2 2 5 6 9 11 10 6 3 1 5 6
Perla C1 CH2 R3 2 0 4 6 8 7 5 5 3
Perla C1 CH2 R3 2 2 4 6 8 8 5 5 2 8 9
Perla C1 CH2 R3 2 2 6 6 8 8 6 4 2 4 6
Perla C2 CH2 R3 2 0 4 3 7 7
Perla C2 CH2 R3 0 0 4 4 8 7
Perla C2 CH2 R3 2 2 4 6 8 6 2
Perla C3 CH2 R3 2 2 4 4 8 8 5 3 3 3 6
Perla C3 CH2 R3 2 0 4 4 7 5 9 2 2 1 6
Perla C3 CH2 R3 2 0 4 4 8 8 8 4 3 9
Perla C4 CH2 R3 2 2 4 8 8 7 9 2 2
Perla C4 CH2 R3 2 2 4 8 8 8 1 2 1 5 6
Perla C4 CH2 R3 2 0 4 6 8 4 3 1 1 3 6
Perla C5 CH2 R3 2 0 4 4 8 8 10 5 5 5 11
Perla C2 CH3 R1 2 2 4 6 8 5 9 8 9 10 11
Perla C3 CH3 R1 2 2 4 6 8 10 11 3 3 2 4
Perla C3 CH3 R1 2 0 4 6 8 9 10 10 8 8 10
Perla C3 CH3 R1 2 0 4 6 8 9 11 7 7 8 12
Perla C4 CH3 R1 2 0 4 6 8 6 8 5 5 6 8
Perla C4 CH3 R1 2 2 4 4 8 8 8 6 2 8
Perla C4 CH3 R1 2 2 2 4 6 5 6 4 3 4 12
Perla C5 CH3 R1 2 0 4 6 10 7 11 12 8 10 16
Perla C5 CH3 R1 2 2 4 6 10 10 12 11 11 11 11
Perla C5 CH3 R1 2 2 4 6 10 8 9 5 4 10 11
Perla pf CH3 R1 2 2 4 6 8 8 6 10 11 11 17
Perla pf CH3 R1 2 2 4 6 8 10 12 7 6 14 20
Perla pf CH3 R1 2 2 2 6 8 8 9 3 3 3 5
Perla C1 CH3 R2 2 2 4 6 8 10 12 12 11 10 14
Perla C1 CH3 R2 2 2 4 6 9 5 5 8 8 9 14
Perla C1 CH3 R2 2 0 4 4 8 8 10 10 8 10 16
Perla C2 CH3 R2 2 4 4 6 10 8 5 9 8 12 16
Perla C2 CH3 R2 2 0 3 4 7 7 7 7 5 6 9
Perla C2 CH3 R2 2 0 4 4 8 8 10 9 10 10 15
Perla C3 CH3 R2 2 2 4 6 8 8 10 10 10 12 14
Perla C3 CH3 R2 2 0 4 4 8 8 10 6 6 9 10
Perla C3 CH3 R2 2 0 4 4 8 8 10 9 9 10 16
Perla C4 CH3 R2 2 2 4 4 8 7 9 8 10 14 9
Perla C4 CH3 R2 2 0 4 4 8 8 10 10 10 9 6
Perla C2 CH3 R4 2 2 3 5 7 8 9 5 5 3 7
Perla C2 CH3 R4 2 0 5 6 6 8 6 6 5 11 14
Perla C2 CH3 R4 2 0 4 4 6 2 4 6 5 5 6
Perla C3 CH3 R4 2 2 4 6 8 8 9 10
Perla C3 CH3 R4 2 0 4 6 8 7 5 4 4 4 11
Perla C3 CH3 R4 2 0 4 6 6 8 10 1 7 5 8
Perla C4 CH3 R4 2 2 4 4 6 4 2 1 1 3 7
Perla C4 CH3 R4 2 2 4 6 8 4 1 1
Perla C4 CH3 R4 2 0 6 6 8 6 5 1
Perla C5 CH3 R4 2 2 6 4 9 5 3
Perla C5 CH3 R4 2 2 5 5 8 4 1 1 2 5
Perla C5 CH3 R4 2 2 4 4 6 5 5 5 4 7 8
Perla pf CH3 R4 2 2 4 6 8 6 4 1 1 16 20
Perla pf CH3 R4 2 0 5 5 7 5 4 2 2 4 8
Perla pf CH3 R4 2 4 3 5 6 4 5 5 4 10 14
Negra C1 CH1 R1 2 2 4 6 6 8
Negra C1 CH1 R1 2 2 6 6 6 8
Negra C1 CH1 R1 2 0 4 6 8 8
Negra C2 CH1 R1 2 0 2 4 6 8 4 5 7 6
Negra C2 CH1 R1 0 0 2 2 2 4 4 6 5 4
Negra C2 CH1 R1 0 0 2 2 2 4 9 8 4 7
Negra C3 CH1 R1 2 2 4 4 6 8 7 3 2 6 10
Negra C3 CH1 R1 2 2 4 5 6 8 5 4 4 3 9
Negra C3 CH1 R1 2 0 2 3 4 6 3 3 4 9 11
Negra C1 CH1 R3 2 2 6 6 8 10 7 5 6 4 15
Negra C1 CH1 R3 2 0 4 6 8 10 6 6 4 7 14
Negra C2 CH1 R3 2 0 2 6 8 4 3 1 4 8
Negra C2 CH1 R3 2 0 2 2 2 2 4 1
Negra C2 CH1 R3 2 0 4 4 4 8 3
Negra C3 CH1 R3 2 2 6 6 8 6 3 2
Negra C3 CH1 R3 2 2 6 6 8 10 6 2 1 4
Negra C3 CH1 R3 2 2 4 6 8 6 2 1 7 10
Negra C4 CH1 R3 2 2 4 6 8 6 5 3
Negra C4 CH1 R3 2 2 6 6 8 10 5 2
Negra C4 CH1 R3 2 2 6 4 6 4 4 1
Negra C5 CH1 R3 2 2 6 6 8 6 6 13 20
Negra C5 CH1 R3 2 2 6 6 8 10 2 4 5 1 2
Negra C5 CH1 R3 2 2 6 6 8 9 8 4 1 7 10
Negra pf CH1 R3 2 2 6 6 8 10 8 6 6 10 18
Negra pf CH1 R3 2 4 6 6 8 8 6 6 6 13 17
Negra pf CH1 R3 2 4 6 8 8 8 4 2 1 18 18
Negra C1 CH1 R4 2 2 6 6 8 6 5 4 3 3 10
Negra C1 CH1 R4 2 0 4 6 8 4 5 3 3 9
Negra C1 CH1 R4 2 2 6 6 8 4 1 4 1 7 12
Negra C2 CH1 R4 2 2 4 5 8 8 6
Negra C2 CH1 R4 2 2 6 6 8 6 4 2 2 2 2
Negra C2 CH1 R4 2 2 4 6 8 6 4 1
Negra C3 CH1 R4 2 2 4 6 8 6 3 2 2 5 11
Negra pf CH2 R1 2 2 6 6 8 6 6 3 3 2 12
Negra C1 CH2 R2 2 0 4 6 8 7 8 8 8 3 8
Negra C1 CH2 R2 2 2 6 6 8 9 8 2 2 2 2
Negra C1 CH2 R2 2 2 4 6 6 8 6 2 2 7 14
Negra C2 CH2 R2 2 2 6 6 8 5 8 8 6
Negra C2 CH2 R2 2 2 4 4 8 7 9 9 7 4 10
Negra C2 CH2 R2 0 2 4 4 6 8 5
Negra C3 CH2 R2 2 2 4 6 8 6 3
Negra C3 CH2 R2 2 0 4 4 6 8 10 2 4
Negra C3 CH2 R2 2 2 4 6 8 8 10
Negra C4 CH2 R2 2 0 4 4 8 9 10 7 6 2
Negra C4 CH2 R2 2 0 6 6 8 7 9 10 7 7 4
Negra C4 CH2 R2 2 2 6 6 10 10 7 10 10 7 6
Negra C5 CH2 R2 2 2 6 6 8 10 9 7 6 7 7
Negra C5 CH2 R2 2 2 6 6 8 10 12 10 8 10 11
Negra C5 CH2 R2 2 2 4 8 8 10
Negra pf CH2 R2 2 2 4 6 8 10 11 8 9 10 20
Negra pf CH2 R2 2 0 4 6 8 10 3
Negra pf CH2 R2 2 0 4 6 8 7 6
Negra C1 CH2 R3 2 0 6 6 8 8 7 7 3 2 12
Negra C1 CH2 R3 2 2 6 6 10 4 10 7 5 8 14
Negra C1 CH2 R3 2 2 4 6 8 8 10 5 2 6 12
Negra C2 CH2 R3 2 0 4 6 8 8 2 5 2
Negra C2 CH2 R3 2 0 4 6 8 6 6 2
Negra pf CH2 R4 2 0 4 6 8 8 3 1
Negra pf CH2 R4 2 2 4 6 8 8
Negra pf CH2 R4 2 2 4 6 8 10 5 2
Negra C1 CH3 R1 2 0 6 6 8 9 7 2 1
Negra C1 CH3 R1 2 2 6 6 8 6 5 5 5 6 6
Negra C1 CH3 R1 2 2 4 4 4 4 6 6 4 4
Negra C2 CH3 R1 2 2 6 4 6 8 10 10 9 10 10
Negra C2 CH3 R1 2 2 6 6 8 8 12 12 9 12 15
Negra C2 CH3 R1 0 0 2 4 2 6 6 8 6 1 5
Negra C3 CH3 R1 2 2 6 6 8 10 4 3 3 2 7
Negra C3 CH3 R1 2 2 4 6 6 8 8 7 9 7 16
Negra C3 CH3 R1 2 0 4 6 8 10 8 10 10 11 6
Negra C4 CH3 R1 2 4 4 6 8 10 9 8 6 7 9
Negra C4 CH3 R1 2 2 4 4 6 8 6 6
Negra C4 CH3 R1 2 2 4 6 7 6 9 8 9 11 11
Negra C5 CH3 R1 2 2 6 6 8 10 9 8 7 4 3
Negra C5 CH3 R1 2 2 6 6 8 9 9 8 7 9 9
Negra C5 CH3 R1 2 2 6 8 8 5 6 8 7 10 12
Negra pf CH3 R1 2 2 6 8 6 8 9 7 9 3 9
Negra pf CH3 R1 2 2 6 6 8 10 7 7 10 10 10
Negra pf CH3 R1 2 0 4 6 8 10 8 5 3 1 7
Negra C1 CH3 R2 2 0 4 6 8 10 8 2
Negra C1 CH3 R2 2 0 4 6 8 6 6 4 4 1 2
Negra C1 CH3 R2 2 2 6 6 10 6 7 5 3 2 6
Negra C5 CH3 R3 2 2 4 6 8 6 5
Negra C5 CH3 R3 2 2 6 6 6 3 4 1
Negra pf CH3 R3 2 2 4 6 8 5 5 2 2 6 10
Negra pf CH3 R3 2 2 4 8 8 8 4 5 7 7 7
Negra pf CH3 R3 2 2 4 8 8 5 3 4 2 6 4
Negra C1 CH3 R4 2 2 4 4 8 8 9 8 8 8 9
Negra C1 CH3 R4 2 0 4 6 8 7 9 1 2 2
Negra C1 CH3 R4 2 0 4 6 8 8 1
Negra C2 CH3 R4 2 0 4 4 6 5 6 6
Negra C2 CH3 R4 2 2 2 6 6 7 7
Negra C2 CH3 R4 2 0 2 2 6 5 7 2 1 1
Negra C3 CH3 R4 2 0 4 4 6 6 6 10
Negra C3 CH3 R4 2 2 6 6 10 10 10
Negra C3 CH3 R4 2 0 4 4 6 7 7
Negra C4 CH3 R4 2 0 6 6 8 6 8 4 2 1 2
Negra C4 CH3 R4 2 0 4 4 6 8 10 7
Negra C4 CH3 R4 2 0 4 4 8 7 6 2 2 3 2
Negra C5 CH3 R4 2 2 6 8 8 6 4 3 2 2 2
Negra C5 CH3 R4 2 0 4 6 8 8 8 6 6 3 5
Negra C5 CH3 R4 2 2 4 6 8 8 8 7 8 5 8
Negra pf CH3 R4 2 2 4 6 8 10 11 4 4 4 6
Negra pf CH3 R4 2 4 4 8 8 8 3 1
Negra pf CH3 R4 2 4 6 8 10 10 5 2 2 4 6
Chapter 6
Generalized Linear Mixed Models
for Proportions and Percentages
In this chapter, we review generalized linear mixed models (GLMMs) whose
response is either a proportion or a percentage. By proportion and percentage
data, we mean data whose expected value lies between 0 and 1 or between 0 and 100.
For the remainder of this book, we refer to this type of data only in terms of
proportions, since a proportion is converted to a percentage simply by
multiplying it by 100. Proportions can be classified into two types: discrete and
continuous. Discrete proportions arise when the unit of observation consists of
N distinct entities, of which y individuals have the attribute of interest. N must
be a positive integer and y must be a nonnegative integer with y ≤ N. Therefore,
the observed proportion must be a discrete fraction, which can take the values
$0/N, 1/N, \ldots, N/N$. A binomial random variable is the sum of a series of N independent binary
trials (i.e., trials with only two possible outcomes: success or failure), where all trials
have the same probability of success. For binary and binomial distributions, the
target of inference is the parameter π, where $0 \le E(y/N) = \pi \le 1$. Continuous
proportions (ratios) arise when the researcher measures responses such as the
fraction of the area of a leaf infested with a fungus, the proportion of damaged cloth
in a square meter, the fraction of a contaminated area, and so on. As with the
binomial parameter π, continuous proportions take values between 0 and
1, but, unlike the binomial, they do not result from a set of
Bernoulli trials. Instead, the beta distribution is most often used when the response
variable is a continuous proportion. In the following sections, we first address
modeling issues for binary and binomial data. When the response
variable is binomial, we have the option of using a linearization method
(pseudo-likelihood (PL)) or the Laplace or quadrature integral approximation (Stroup 2012).
Fig. 6.1 Effect of the demethylating agent on the proportion of normal plants (horizontal axis: dose; vertical axis: observed proportion)
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$$
where $y_{ij}$ is the number of observed normal plants in tray j (j = 1, 2, 3, 4) at
dose i (i = 1, 2, ⋯, 6), μ is the overall mean, $\tau_i$ is the effect of dose i of the
demethylating agent, and $\varepsilon_{ij}$ are non-normal errors.
The number of normal plants $y_i$ observed in a set of $n_i$ trials follows a binomial
distribution, $y_i \sim \text{Binomial}(n_i, \pi_i)$, where $\pi_i$ is the probability of success in each
trial, with $0 \le \pi_i \le 1$, and is estimated by the observed proportion $y_i/n_i$. Thus, the
probability of observing an outcome $y_i$ can be written as
$$P(Y_i = y_i \mid n_i, \pi_i) = \binom{n_i}{y_i} \pi_i^{y_i} (1 - \pi_i)^{n_i - y_i}; \quad y_i = 0, 1, \ldots, n_i.$$
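The probability function above can be checked numerically. The following is a minimal sketch in Python (not the SAS workflow used in this book); the tray size n = 40 and the success probability 0.3 are hypothetical values chosen only for illustration.

```python
from math import comb

def binomial_pmf(y, n, pi):
    """P(Y = y | n, pi): probability of y successes in n independent trials."""
    return comb(n, y) * pi**y * (1 - pi) ** (n - y)

# Hypothetical values, not taken from the dataset in the text
n, pi = 40, 0.3

# The probabilities over y = 0, 1, ..., n form a complete distribution
probs = [binomial_pmf(y, n, pi) for y in range(n + 1)]
total = sum(probs)

# The mean of the distribution equals n * pi
mean = sum(y * p for y, p in enumerate(probs))
```

Here `total` should equal 1 and `mean` should equal n × π up to rounding error, which is the expected value used in the next paragraph.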
This probability depends on the known number of trials $n_i$, whereas the probability
of success ($\pi_i$) is an unknown parameter. In Fig. 6.1, we observe that the
probability of obtaining a normal plant depends on the applied dose of the
demethylating agent. Given that $y_i$ has a binomial distribution, the expected value
(the mean) is the product of the number of trials and the probability of success in
each trial, that is, $E(Y_i) = n_i \pi_i$. Since the number of trials is fixed (once the data have
been obtained), modeling the probability of success is equivalent to modeling the
expected value as well as the variance, since the variance is also a function of the number of
trials and the probability of success. So, the expected value and variance of $y_i$ are
$$E(y_i) = \mu_i = n_i \pi_i; \quad \text{Var}(y_i) = n_i \pi_i (1 - \pi_i).$$
This variance is small if the value of $\pi_i$ is close to 0 or 1, and it increases to its
maximum when $\pi_i = 0.5$. This can be seen in Fig. 6.1, where proportions close
to 0 or 1 show less variance than do proportions between 0.1 and 0.2 for a
demethylating agent dose of 0.5. This variance can also be written in terms of the
expected value as:
$$\text{Var}(y_i) = \frac{\mu_i}{n_i} (n_i - \mu_i).$$
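The behavior of the binomial variance can be illustrated with a short Python sketch. The number of trials (n = 40) is a hypothetical value; the two functions simply encode the two equivalent expressions for the variance given above.

```python
def binomial_variance(n, pi):
    """Var(y) = n * pi * (1 - pi)."""
    return n * pi * (1 - pi)

def variance_from_mean(n, mu):
    """Same variance written in terms of the mean mu = n * pi."""
    return mu / n * (n - mu)

n = 40  # hypothetical number of trials per experimental unit

# Evaluate the variance on a grid of probabilities from 0 to 1
grid = [k / 100 for k in range(101)]
variances = [binomial_variance(n, p) for p in grid]

# Probability at which the variance is largest
pi_at_max = grid[variances.index(max(variances))]
```

The variance is zero at the boundaries (π = 0 and π = 1) and largest at `pi_at_max`, which should be 0.5 on this grid.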
In this completely randomized design (CRD), the t treatments (doses) were randomly
assigned to the experimental units (trays). The linear predictor describing the structure of the mean
of this GLMM is
$$\eta_i = \eta + \tau_i$$
where $\eta_i$ denotes the ith linear predictor, η is the intercept, and $\tau_i$ is the fixed effect
due to treatment i (i = 1, 2, ⋯, t), with t treatments and $r_i$ replicates in each
treatment.
The components that define this GLMM are shown below:
Distribution: $y_{ij} \sim \text{Binomial}(N_{ij}, \pi_i)$
Linear predictor: $\eta_i = \eta + \tau_i$
Link function: $\text{logit}(\pi_i) = \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \eta_i$
In this example, the distribution was not specified in the GLIMMIX model
statement because, when the response is written in the form “y/N,” proc GLIMMIX
automatically assumes that the data have a binomial distribution. It is also important
to note that the variables dose and Rep were declared in the
“class” statement, so Statistical Analysis Software (SAS) treats them as
categorical factors rather than as numerical covariates. However, the variable “Rep” is
not used in the model specification.
6.2 Analysis of Discrete Proportions: Binary and Binomial Responses 213
Part of the results is shown in Table 6.3. The value of Pearson’s chi-squared statistic
divided by its degrees of freedom in part (a) (Pearson’s chi-square/DF = 0.5)
indicates that there is no evidence of overdispersion in the dataset. The analysis of
variance (ANOVA) tabulated in part (b) in Table 6.3, with the type III tests of fixed
effects, indicates that there is a highly significant difference (P = 0.0001) in the
average proportion of normal plants with respect to the dose applied to the seeds.
The output of the “lsmeans” statement with the “ilink”
option is in the “Mean” column (part (c) in Table 6.3). These values are the estimates of
the $\pi_i$, i.e., the estimated probabilities of normal plants are $\hat{\pi}_{0} = 0.9813$ and $\hat{\pi}_{0.01} = 0.9813$
for the treatments whose doses are 0 and 0.01, respectively. For treatments with
doses of 0.1 and 0.5, the estimated probabilities of normal plants are $\hat{\pi}_{0.1} = 0.8813$
and $\hat{\pi}_{0.5} = 0.1375$, respectively, whereas for the 1 and 1.5 doses, the estimated
probabilities of normal plants decrease dramatically to $\hat{\pi}_{1} = 0.02501$ and
$\hat{\pi}_{1.5} = 0.03126$, respectively.
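As a rough illustration of the dispersion check mentioned above, the sketch below computes a Pearson chi-square statistic divided by its degrees of freedom for binomial counts. It is a simplified Python version written for this note, not the exact computation performed by proc GLIMMIX, and the four tray counts are hypothetical.

```python
def pearson_chi2_over_df(y, n, pi_hat, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Each term is (observed - expected)^2 / binomial variance.
    """
    chi2 = sum(
        (yi - ni * pi) ** 2 / (ni * pi * (1 - pi))
        for yi, ni, pi in zip(y, n, pi_hat)
    )
    return chi2 / (len(y) - n_params)

# Hypothetical data: one treatment, four trays of 40 seeds each
y = [34, 36, 35, 37]          # normal plants per tray (made up)
n = [40, 40, 40, 40]          # seeds per tray (made up)

# A single common probability estimated from the pooled counts
pi_common = sum(y) / sum(n)

ratio = pearson_chi2_over_df(y, n, [pi_common] * len(y), n_params=1)
```

Values of `ratio` well above 1 would suggest overdispersion relative to the binomial model, whereas values near or below 1, as in Table 6.3, give no such indication.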
Figure 6.2 shows the mean comparisons (least significant difference (LSD)) of
the estimated probabilities according to the dose applied to the seeds in trays. In this
figure, we can observe that in the treatments with dose = 0 (control) and dose = 0.01,
the estimated proportions of normal plants are not statistically different from each
other, but they do differ from those at the other applied doses. At a dose of 0.1, the estimated
proportion of normal plants was 88.13%, and this was statistically different from all
the other doses used. Finally, at doses of 0.5, 1, and 1.5 of the demethylating agent, the
proportion of normal plants decreased drastically to 13.75%, 2.501%, and
3.126%, respectively. The doses of 1 and 1.5 produced statistically equal proportions
of normal plants.
Fig. 6.2 Comparison of the estimated probabilities per dose of the demethylating agent (horizontal axis: dose, with C = control; vertical axis: average proportion)
If the researcher wishes to model how dose levels of the demethylating agent
affect normal plant proportions, then the dose must be declared as a continuous
variable. The following SAS syntax with proc GLIMMIX runs a binomial
regression:
Most of the commands and options have already been discussed throughout this
book; the “model y/N” statement indicates that the response is given as the number of
successes y out of N individuals. Therefore, this dataset is modeled with a binomial
distribution, which accounts for the different number of individuals in each replicate.
proc GLIMMIX interprets the
distribution of the data as binomial, whereas the “solution” option requests the
parameter estimates of the model (intercept and slope).
The components that define this GLMM are shown below:
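Distribution: $y_{ij} \sim \text{Binomial}(N_{ij}, \pi_i)$
Linear predictor: $\eta_i = \eta + \beta\,\text{dose}_i$
Link function: $\text{logit}(\pi_i) = \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \eta_i$
In terms of the mean $\mu_i = n_i \pi_i$, the linear predictor is
$$\eta_i = \log\left(\frac{\mu_i}{n_i - \mu_i}\right) = \log\left(\frac{n_i \pi_i}{n_i - n_i \pi_i}\right) = \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \text{logit}(\pi_i) = \eta + \beta\,\text{dose}_i$$
and the inverse of the logit function gives the probability of success, $\pi_i$, as
$$\pi_i = \frac{1}{1 + \exp(-\eta_i)}$$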
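To make the logit regression concrete, the sketch below fits the intercept and slope of a binomial model by maximum likelihood using a damped Newton-Raphson iteration in plain Python. This is only an illustration of the model structure, not the algorithm or code used by proc GLIMMIX; the doses, tray sizes, and counts are hypothetical, and all function names are invented for this example.

```python
import math

def log1pexp(eta):
    # Numerically stable log(1 + exp(eta))
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))

def inv_logit(eta):
    # Inverse link pi = 1 / (1 + exp(-eta)), written to avoid overflow
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def loglik(b0, b1, x, y, n):
    # Binomial log-likelihood, omitting the constant binomial coefficient
    return sum(yi * (b0 + b1 * xi) - ni * log1pexp(b0 + b1 * xi)
               for xi, yi, ni in zip(x, y, n))

def score_and_info(b0, b1, x, y, n):
    # Gradient (score) and Fisher information of the log-likelihood
    s0 = s1 = i00 = i01 = i11 = 0.0
    for xi, yi, ni in zip(x, y, n):
        p = inv_logit(b0 + b1 * xi)
        r = yi - ni * p              # observed minus expected count
        w = ni * p * (1 - p)         # binomial variance
        s0 += r
        s1 += r * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (s0, s1), (i00, i01, i11)

def fit_binomial_logit(x, y, n, tol=1e-10, max_iter=100):
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (s0, s1), (i00, i01, i11) = score_and_info(b0, b1, x, y, n)
        det = i00 * i11 - i01 ** 2
        # Newton direction: solve the 2 x 2 system (information) d = score
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        # Halve the step while it would decrease the log-likelihood
        step, current = 1.0, loglik(b0, b1, x, y, n)
        while (step > 1e-8 and
               loglik(b0 + step * d0, b1 + step * d1, x, y, n) < current - 1e-12):
            step /= 2
        b0 += step * d0
        b1 += step * d1
        if abs(step * d0) + abs(step * d1) < tol:
            break
    return b0, b1

# Hypothetical data (not the dataset analyzed in the text)
dose = [0.0, 0.01, 0.1, 0.5, 1.0, 1.5]
n_seeds = [40] * 6
normal = [39, 39, 35, 6, 1, 1]

eta_hat, beta_hat = fit_binomial_logit(dose, normal, n_seeds)
fitted = [inv_logit(eta_hat + beta_hat * d) for d in dose]
```

With data of this shape, the slope `beta_hat` is negative, so the fitted probabilities in `fitted` decrease as the dose increases.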
Part of the SAS output of the GLIMMIX syntax is shown below. The goodness-
of-fit statistics, type III tests of fixed effects, and parameter estimates are shown in
Table 6.4. The analysis of variance indicates that the demethylating agent has a
highly significant effect on the observed proportion of normal plants (P < 0.0001)
(part (b)). The maximum likelihood estimates for the intercept and slope are
$\hat{\eta} = 2.7927$ and $\hat{\beta} = -7.6232$, respectively.
Figure 6.3 shows that as the value of the linear predictor increases (ηi), the value
of the residuals rapidly decreases. We can also see that the residuals plotted against
the quantiles clearly do not follow a normal distribution because this model is not a
linear function of the explanatory variable “dose.”
[Fig. 6.3: residual diagnostic panels: residuals versus linear predictor, histogram of residuals (percent), residuals versus quantiles, and a fourth residual panel in which observations 21 to 24 are labeled]
Figure 6.4 shows that the observed and fitted proportions are not far apart, and,
as such, the binomial model is suitable for this dataset. The estimated linear predictor
of this model is as follows:
$$\hat{\eta}_i = 2.7927 - 7.6232 \times \text{dose}_i$$
with
$$\pi_i = \frac{1}{1 + \exp(-\eta_i)}$$
[Fig. 6.4: probability ($\pi_i$) versus dose]
Given the parameter estimates, we can predict the probability of observing
a normal plant at a given concentration of the demethylating agent. This
estimated probability (computed from the estimated linear predictor) is plotted in
Fig. 6.4:
$$\hat{\pi}_i = \frac{1}{1 + \exp(-\hat{\eta}_i)} = \frac{1}{1 + \exp(-2.7927 + 7.6232 \times \text{dose}_i)}$$
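The prediction equation can be evaluated directly. The Python sketch below uses the intercept and slope estimates reported in the text; the dose at which the predicted probability equals 0.5 is a quantity derived here for illustration and is not reported in the source.

```python
import math

# Maximum likelihood estimates reported in the text
ETA_HAT = 2.7927     # intercept
BETA_HAT = -7.6232   # slope for dose

def predicted_probability(dose):
    """Estimated probability of a normal plant at a given dose."""
    eta = ETA_HAT + BETA_HAT * dose
    return 1.0 / (1.0 + math.exp(-eta))

# Doses used in the experiment
doses = [0, 0.01, 0.1, 0.5, 1, 1.5]
predictions = {d: predicted_probability(d) for d in doses}

# Derived quantity (not in the source): the dose where the linear
# predictor is zero, so that the predicted probability is 0.5
dose_half = -ETA_HAT / BETA_HAT
```

For example, `predictions[0]` is about 0.94, and `dose_half` is about 0.37. These are values on the fitted curve, so they need not coincide with the treatment means estimated when dose is treated as a class variable.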
A group of researchers wishes to study the toxic effect of certain treatments (Trts) on
two flea species (SP) (Daphnia magna and Ceriodaphnia dubia). To compare the
toxicity effect of treatments on both flea species, a randomized complete block
design (RCBD bioassay) was implemented with three replicates per treatment,
with each replicate consisting of 10 fleas (Appendix: Fleas). The linear predictor
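describing this experiment is given below:
$$\eta_{ijkl} = \eta + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \text{bioassay}_k + \text{rep(bioassay)}_{l(k)}$$
where η is the intercept, $\alpha_i$ is the fixed effect due to species i, $\beta_j$ is the fixed effect
of treatment j, $(\alpha\beta)_{ij}$ is the fixed effect of the interaction between the flea species and
treatment, $\text{bioassay}_k$ is the random effect due to bioassay k, assuming
$\text{bioassay}_k \sim N(0, \sigma^2_{\text{bioassay}})$, and $\text{rep(bioassay)}_{l(k)}$ is the random effect due to
replicate l nested within bioassay k, assuming $\text{rep(bioassay)}_{l(k)} \sim N(0, \sigma^2_{\text{rep(bioassay)}})$.
The remaining components of this GLMM with a binomial response are
described below:
Distribution: $y_{ijkl} \mid \text{bioassay}_k, \text{rep(bioassay)}_{l(k)} \sim \text{Binomial}(N_{ijkl}, \pi_{ijkl})$,
with $\text{bioassay}_k \sim N(0, \sigma^2_{\text{bioassay}})$ and $\text{rep(bioassay)}_{l(k)} \sim N(0, \sigma^2_{\text{rep(bioassay)}})$, where $N_{ijkl}$ is
the number of fleas and $y_{ijkl}$ is the number of dead fleas observed in species i in replicate l of bioassay k under
treatment j,
Link function: $\text{logit}(\pi_{ijkl}) = \log\left(\frac{\pi_{ijkl}}{1 - \pi_{ijkl}}\right) = \eta_{ijkl}$.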
The following SAS syntax allows us to fit the GLMM with a binomial response.
Part of the results is listed in Table 6.5. The fit statistics in part (a) and the
conditional statistics in part (b) are useful for model comparison, whereas the
variance component estimates are shown in part (c). The value of the statistic
Pearson’s chi-square/DF = 0.10 indicates that the binomial model gives a good
fit to the dataset. The variance component estimates for bioassays and for replicates
nested in bioassays are $\hat{\sigma}^2_{\text{bioassay}} = -0.1051$ and $\hat{\sigma}^2_{\text{rep(bioassay)}} = -0.1192$,
respectively. The type III tests of fixed effects (part (d)) show the significance tests of the
fixed effects in the model. The treatment effect and the interaction between the flea
species (SP) and treatment are clearly significant, with P < 0.0001 and P = 0.0009,
respectively.
Since survival was statistically similar in both flea species, we will focus on the
factors that were significant. Part (a) in Table 6.6 shows the means and standard
errors of treatments on the model scale (“Estimate” column) and on the data scale
(“Mean” column), obtained with “lsmeans” and the “ilink” option as well as the
mean comparisons, which are on the model scale (part (b)).
6.4 A Split-Plot Design in an RCBD with a Normal Response 219
Table 6.6 Means and standard errors on the model scale and on the data scale
(a) Trt least squares means
Trt Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
T1 8.1179 4.3180 80 1.88 0.0637 0.9997 0.001287
T2 4.3564 3.0554 80 1.43 0.1578 0.9873 0.03820
T3 1.0081 0.1924 80 5.24 <0.0001 0.7326 0.03768
T4 -1.0509 0.1712 80 -6.14 <0.0001 0.2591 0.03286
T5 -4.7187 3.0570 80 -1.54 0.1266 0.008848 0.02681
T6 -8.1182 4.3184 80 -1.88 0.0638 0.000298 0.001286
(b) Conservative T grouping of Trt least squares means (α=0.05)
LS means with the same letter are not significantly different
Trt Estimate
T1 8.1179 A
T2 4.3564 B A
T3 1.0081 B A C
T4 -1.0509 B D C
T5 -4.7187 D C
T6 -8.1182 D
The LINES display does not reflect all significant comparisons. The following additional pairs are
significantly different: (T3,T4)
Based on the fixed effects tests, the flea species × treatment interaction is
significant. The means on the model scale are listed under the “Estimate” column,
followed by their standard errors, “Standard error” (Table 6.7). The output of the
“ilink” option in “lsmeans” applies the inverse function of the link function to the
estimates on the model scale to obtain the estimates on the data scale. The proba-
bilities, on the data scale, are given under the “Mean” column with their respective
standard errors and correspond to the probability of insect (flea) survival.
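The back-transformation performed by “ilink” can be reproduced by hand. The Python sketch below applies the inverse logit to the model-scale estimates of Table 6.6 and approximates the data-scale standard errors with the delta method, $SE(\hat{\pi}) \approx \hat{\pi}(1 - \hat{\pi})\,SE(\hat{\eta})$. The use of the delta method is an assumption of this sketch, although the resulting values agree closely with the “Mean” and “Standard error mean” columns of the table.

```python
import math

def ilink(estimate, std_error):
    """Inverse logit of a model-scale estimate with a delta-method SE."""
    mean = 1.0 / (1.0 + math.exp(-estimate))
    # Derivative of the inverse logit is mean * (1 - mean)
    se_mean = mean * (1.0 - mean) * std_error
    return mean, se_mean

# Model-scale estimates and standard errors from Table 6.6, part (a)
table_6_6 = {
    "T1": (8.1179, 4.3180),
    "T2": (4.3564, 3.0554),
    "T3": (1.0081, 0.1924),
    "T4": (-1.0509, 0.1712),
    "T5": (-4.7187, 3.0570),
    "T6": (-8.1182, 4.3184),
}

# Data-scale means and approximate standard errors per treatment
data_scale = {trt: ilink(est, se) for trt, (est, se) in table_6_6.items()}
```

For treatment T3, for instance, `data_scale["T3"]` is approximately (0.7326, 0.0377), which matches the tabulated values to the reported precision.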
Figure 6.5 shows that the survival of both species is different in treatments 2–5;
the Daphnia species showed more resistance in treatments 2 and 3, whereas the
Ceriodaphnia species showed greater resistance in treatments 4 and 5. On the other
hand, in treatments 1 and 6, survival was similar in both species.
A split plot is the most common treatment structure design in agricultural and agro-
industrial research areas. These experiments generally involve two or more factors
under study. Typically, large or primary experimental units, commonly known as the
whole plot, are grouped into blocks. The levels of the first factor are randomly
assigned to the whole plots. Then, each whole plot is divided into smaller units,
known as split or secondary plots. The levels of the second factor are randomly
assigned to the subplots within each whole plot.
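The two-stage randomization just described can be sketched in a few lines. This is only an illustration: the block names, the factor levels, and the function `randomize_split_plot` are hypothetical and are not taken from any dataset in this chapter.

```python
import random

def randomize_split_plot(blocks, levels_a, levels_b, seed=0):
    """Return {block: [(whole_plot_level, [subplot_levels, ...]), ...]}."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        # Stage 1: randomize the whole-plot factor (A) within the block.
        whole_plots = rng.sample(levels_a, k=len(levels_a))
        plots = []
        for a in whole_plots:
            # Stage 2: randomize the subplot factor (B) within each whole plot.
            plots.append((a, rng.sample(levels_b, k=len(levels_b))))
        layout[block] = plots
    return layout

# Hypothetical design: 3 blocks, 2 whole-plot levels, 4 subplot levels.
layout = randomize_split_plot(
    blocks=["Block1", "Block2", "Block3"],
    levels_a=["A1", "A2"],
    levels_b=["B1", "B2", "B3", "B4"],
)
for block, plots in layout.items():
    print(block, plots)
```

Every level of A appears once per block and every level of B once per whole plot; only the order is random, which is what gives the design its two error strata.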
Table 6.7 Means and standard errors on the model scale and on the data scale of the interaction between both factors
[Fig. 6.5: plot with x-axis "Treatment" (Trt1 to Trt6) and one series per species, Daphnia and Ceriodaphnia]
The model equation for the analysis of variance assuming normality in the
response is
$$y_{ijk} = \mu + \alpha_i + r_k + (r\alpha)_{ik} + \beta_j + (\alpha\beta)_{ij} + e_{ijk},$$
i = 1, 2, ⋯, a; j = 1, 2, ⋯, b; k = 1, 2, ⋯, r
where $y_{ijk}$ is the observed response variable in the kth block at the ith level of factor A
and at the jth level of factor B, $\mu$ is the overall mean, $\alpha_i$ and $\beta_j$ are the fixed treatment
effects due to factors A and B, respectively, $(\alpha\beta)_{ij}$ is their interaction, $r_k$ is the random
effect due to the blocks, $(r\alpha)_{ik}$ is the random error term due to the whole plot, that is, the
interaction between the blocks and factor A, and $e_{ijk}$ is the random residual effect. The errors
and the other random terms are usually assumed to be normally distributed. This way of
specifying the model is therefore valid when the response variable is normal; when the
response variable is not normally distributed, it is not the most appropriate.
Data were obtained from an experiment that was designed to compare a number of
carrot genotypes with respect to their resistance to infestation by carrot fly larvae.
The experiment involved 16 genotypes that were compared at 2 levels of pest infestation
(the treatments). The experiment was conducted in three randomized blocks. Each block
consisted of 32 plots, 1 for each combination of genotype and pest infestation level. At
the end of the experiment, about 50 carrots were taken from each plot and assessed for
infestation by carrot fly larvae. The data obtained are shown in Table 6.8.
Table 6.8 The notation 44/53 denotes that 44 carrots were infected (y) out of a sample size of
53 studied (N)
          Treatment (level of infestation)
          1                          2
Genotype  Block1  Block2  Block3     Block1  Block2  Block3
G1        44/53   42/48   27/51      16/60   9/52    26/54
G2        24/48   35/42   45/52      13/44   20/48   16/53
G3        8/49    16/49   16/50      4/52    6/51    12/43
G4        4/51    5/42    12/46      15/52   10/56   6/48
G5        11/52   13/51   15/44      4/51    6/43    9/46
G6        15/50   5/49    7/50       1/51    8/49    3/54
G7        18/52   13/47   7/47       2/52    4/52    6/52
G8        5/47    15/49   8/50       6/56    4/50    6/42
G9        11/52   6/45    5/51       3/54    8/51    3/53
G10       0/51    10/39   14/48      3/50    0/50    10/51
G11       6/52    4/46    10/37      1/52    7/38    4/48
G12       0/52    4/55    1/40       1/50    3/50    1/45
G13       14/45   18/43   4/40       4/51    7/46    7/45
G14       3/52    12/53   4/55       3/52    7/48    12/49
G15       11/52   6/54    5/49       2/50    4/46    14/53
G16       4/53    1/40    4/52       4/56    1/44    3/42
Table 6.9 shows the analysis of variance summarizing the sources of variation
and degrees of freedom.
Rewriting in terms of the linear predictor,
$$\eta_{ijk} = \eta + \alpha_i + r_k + (r\alpha)_{ik} + \beta_j + (\alpha\beta)_{ij}.$$
Since the observations were taken at the subplot level, conditioned on the
structural effects of the design, these observations have a variance associated with
the subplot. Here, $\eta$ is the intercept; $\alpha_i$ and $\beta_j$ are the treatment fixed effects due to
factors A and B, respectively; $(\alpha\beta)_{ij}$ is the interaction of these factors; $r_k$ is the
random effect due to blocks; and the blocks × whole plot effect $(r\alpha)_{ik}$ is assumed to contribute
to the variation such that $r_k \sim N(0, \sigma^2_r)$ and $(r\alpha)_{ik} \sim N(0, \sigma^2_{block \times A})$. This model uses
the linear predictor $\eta_{ijk}$ to estimate the mean of the observations $\mu_{ijk}$.
The specification of this GLMM is as follows:
Distribution: $y_{ijk} \mid r_k, (r\alpha)_{ik} \sim \text{Binomial}(N_{ijk}, \pi_{ijk})$
$r_k \sim N(0, \sigma^2_r)$,
$(r\alpha)_{ik} \sim N(0, \sigma^2_{block \times A})$
Link function: $\text{logit}(\pi_{ijk}) = \eta_{ijk}$.
The following SAS GLIMMIX program allows the fitting of a GLMM with a
split-plot structure in a randomized complete block design with a binomial response.
Table 6.10 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects)   527.82
Pearson’s chi-square        189.09
Pearson’s chi-square/DF     1.97
(b) Covariance parameter estimates
Cov Parm   Subject  Estimate  Standard error
Intercept  Bloque   0.004272  0.02741
Trt        Bloque   0.03344   0.03545
(c) Type III tests of fixed effects
Effect        Num DF  Den DF  F-value  Pr > F
Genotype      15      60      28.28    <0.0001
Trt           1       2       16.24    0.0564
Genotype*Trt  15      60      4.45     <0.0001
The fixed effects tests (part (c) of Table 6.10) indicate that the effect of genotype and the
interaction between genotype and treatment are significant.
The appropriate method for model evaluation depends on whether or not there is
evidence of overdispersion, so we consider this issue below. The residual variance
incorporates systematic discrepancies between the model and the observed
responses, variation between replicates (observations in independent experimental
units with the same values of the explanatory variables) and sampling variation
arising from the distribution of the data; in this case, it is the binomial distribution. If
there are no replicate observations and the fitted model provides an adequate
description of the systematic trend, then only sampling variation contributes to the
residual variance. If this is true, then the residual deviance has an approximate
chi-squared distribution with degrees of freedom equal to the residual degrees of
freedom (those of the mean squared error, MSE).
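As a rough check for overdispersion, the Pearson chi-square statistic is divided by its degrees of freedom and compared with 1. The sketch below shows the arithmetic for binomial counts. The toy data are invented for illustration, and the divisor of 96 for Table 6.10 is an assumption (the 16 × 2 × 3 = 96 plots) that happens to reproduce the reported ratio.

```python
def pearson_chi_square(y, n, pi_hat):
    """Pearson chi-square for binomial counts y out of n with fitted probabilities pi_hat."""
    total = 0.0
    for y_i, n_i, p_i in zip(y, n, pi_hat):
        expected = n_i * p_i
        variance = n_i * p_i * (1.0 - p_i)
        total += (y_i - expected) ** 2 / variance
    return total

# Toy values, made up for illustration only (not taken from Table 6.8).
y = [12, 3, 30, 8]
n = [50, 50, 50, 50]
pi_hat = [0.20, 0.10, 0.50, 0.20]
toy_ratio = pearson_chi_square(y, n, pi_hat) / len(y)

# Ratio from Table 6.10(a): Pearson's chi-square = 189.09, DF assumed to be 96 plots.
table_ratio = 189.09 / 96

print(f"toy ratio = {toy_ratio:.2f}, Table 6.10 ratio = {table_ratio:.2f}")
```

A ratio close to 1 is what the binomial assumption predicts; values well above 1, such as the one in Table 6.10, point to extra variation.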
Since there is overdispersion in the data using the binomial distribution, there are
three alternatives we can explore: (1) review the linear predictor, which involves
carefully revising the analysis of variance table; (2) add a scale parameter; or (3) use
another distribution for the dataset. Each of these three possible alternatives is
discussed below, in this order.
If the proportion of infested carrots ($\pi_{ijk}$) is being affected by the genotype within each
infestation level (trt = $\alpha_i$) from plot to plot within each of the blocks, then a nested
effect of genotype within infestation levels (trt) could be included in the
analysis of variance. Thus, the linear predictor would be defined as
$$\eta_{ijk} = \eta + \alpha_i + \beta(\tau)_{j(i)} + r_k + (r\alpha)_{ik},$$
where $\alpha_i$, $\beta(\tau)_{j(i)}$, $r_k$, and $(r\alpha)_{ik}$ are the fixed effects due to treatments, the effect of
genotypes nested within a treatment, the random effects due to blocks, $r_k \sim N(0, \sigma^2_r)$,
and the interaction between blocks and treatment, $(r\alpha)_{ik} \sim N(0, \sigma^2_{RA})$, respectively.
The following GLIMMIX syntax estimates the above linear predictor:
The only difference between this proc GLIMMIX and the previous one is that, in
this program, we have included the nested effect of genotypes within treatment,
genotype (trt), and removed only the fixed effects of genotypes. Part of the results is
shown in Table 6.11. Neither the value of Pearson’s chi-square/DF statistic (part (a)) nor
the other fit statistics decreased when the linear predictor was modified.
However, the F-values calculated for treatments and genotypes within treatments
(part (c)) are smaller than those obtained in the split-plot design.
Table 6.11 Results of the analysis of variance, under a new linear predictor
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects)   527.82
Pearson’s chi-square        189.07
Pearson’s chi-square/DF     1.97
(b) Covariance parameter estimates
Cov Parm   Subject  Estimate  Standard error
Intercept  Bloque   0.004265  0.02740
Trt        Bloque   0.03343   0.03544
(c) Type III tests of fixed effects
Effect          Num DF  Den DF  F-value  Pr > F
Trt             1       2       16.20    0.0565
Genotype (Trt)  30      60      15.83    <0.0001
Since the overdispersion is still present (Pearson’s chi-square/DF = 1.97),
another alternative is to add a scale parameter to the model. This alternative is
presented below.
If the residual deviance is larger than expected when compared to critical values of
the appropriate chi-squared distribution, and if this cannot be corrected by redefining
the linear predictor of the model, then there is more variation present than can be
accounted for by the assumed distribution. In this case, we say that
the data show overdispersion. The simplest way to deal with overdispersion is to
extend the model by scaling the variance function. Adding the scale parameter
replaces Var(yij) = πij(1 - πij) with Var(yij) = ϕπij(1 - πij). The rationale for this
approach is discussed by Collett (2002). The parameter ϕ is a scale factor, called the
dispersion parameter, which is used to summarize the degree of overdispersion
present in the observations. Clearly, ϕ = 1 corresponds to the original distribution
model. This parameter can be estimated in several different ways. The logarithm of
the likelihood of the binomial distribution is given by
$$\log\binom{N_{ij}}{y_{ij}} + y_{ij}\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) + N_{ij}\log\left(1-\pi_{ij}\right).$$
In the logarithm of the likelihood, the term $y_{ij}\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)$ is very important; the
quantity that multiplies $y_{ij}$ is known as the natural or canonical parameter, and this
parameter is always a function of the mean. For the binomial distribution, the mean is
$N_{ij}\pi_{ij}$ and the natural parameter is $\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)$, which, in categorical data, is known as
the “log odds.” The generalized estimating equation (GEE) method provides a valid
analysis for marginal means, since under a binomial distribution, in the quasi-
likelihood, the variance of the distribution is given by ϕπij(1 - πij). This is achieved
by adding the “random _residual_” statement in the following SAS syntax.
The following GLIMMIX commands are used to invoke the scale parameter but
using the first predictor proposed for these data.
In this syntax, we still keep the binomial distribution (y/N tells GLIMMIX in SAS
that the response is binomial) but add the “random _residual_” statement. In this case,
we cannot obtain the maximum likelihood estimators because neither the Laplace
method (“method = laplace”) nor the adaptive quadrature (“method = quad”)
approximation can be used, so the estimation
is performed through the pseudo-likelihood (PL) method. This causes the scale
parameter to be estimated and, consequently, to be used in the adjustment of all
standard errors and statistical tests. Proc GLIMMIX uses the generalized chi-square statistic of
McCullagh and Nelder (1989), i.e., χ2/df, as the estimator of the scale parameter ($\hat{\phi}$).
All standard errors from the analysis under a binomial distribution are multiplied by
$\sqrt{\hat{\phi}}$, and all F-tests are divided by $\hat{\phi}$, to account for overdispersion. Part of the output
is shown below.
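The effect of the scale parameter on inference can be illustrated with the standard quasi-likelihood adjustment, in which standard errors are multiplied by the square root of the estimated scale parameter and F-statistics are divided by it. The sketch below only shows that arithmetic; the input numbers are hypothetical, and because GLIMMIX re-estimates the model by pseudo-likelihood, its reported values need not equal a simple rescaling of the earlier binomial output.

```python
import math

def adjust_for_overdispersion(std_error, f_value, phi_hat):
    """Quasi-likelihood adjustment: SE * sqrt(phi_hat) and F / phi_hat."""
    return std_error * math.sqrt(phi_hat), f_value / phi_hat

# Hypothetical inputs chosen only to make the arithmetic easy to follow.
se_adj, f_adj = adjust_for_overdispersion(std_error=0.10, f_value=12.0, phi_hat=4.0)
print(f"adjusted SE = {se_adj:.2f}, adjusted F = {f_adj:.2f}")
```

With a scale estimate above 1, confidence intervals widen and tests become more conservative, which is the intended correction for overdispersion.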
The value of Pearson’s statistic in part (a) indicates that overdispersion has not
been eliminated; on the contrary, chi-square/DF = 3.13 indicates that this value
has increased. This result indicates that adding a scale parameter to the model does
not reduce the extra variation present in the dataset, since the binomial assumption
forces a relationship between the mean and variance that might not
hold for the data being analyzed. On the other hand, the estimated scale parameter is
$\hat{\phi} = 3.1263$ (part (b)). The variance of the Pearson residuals is 3.6257,
which is considerably larger than 1, implying a large overdispersion. In addition, the
results of the fixed effects tests (part (c)) differ from those above (Table 6.12).
Therefore, the third option, based on assuming an alternative distribution (the beta
distribution) for the response variable, is discussed below.
Table 6.14 Results of the analysis of variance, assuming binomial and beta distributions
(a) Covariance parameter estimates
Binomial Beta
Cov Parm Subject Estimate Standard error Estimate Standard error
Intercept Bloque 0.004272 0.02741 -0.00524 .
Trt Bloque 0.03344 0.03545 0.02175 0.1475
Scale ϕ^ . 25.7070
(b) Type III tests of fixed effects
Binomial Beta
Effect Num DF Den DF F-value Pr > F F-value Pr > F
Trt 1 4 16.24 0.0564 9.98 0.0342
Genotype 15 60 28.28 <0.0001 13.25 <0.0001
Genotype*Trt 15 60 4.45 <0.0001 2.23 0.0146
Some of the SAS GLIMMIX output is listed below. Based on the fit statistics
under the binomial (first alternative) and beta distributions (Table 6.13), clearly the
values of the statistics related to the degree of overdispersion are lower in the beta
distribution than in the binomial distribution, indicating that the beta distribution
provides a better fit (part (a)). Looking at the fit statistics for the conditional model in
part (b), the values of the three fit statistics in the binomial model are higher than the
values in the beta model. The value of Pearson’s chi-square/DF under the beta
distribution is 1.01. This value indicates that the overdispersion has been virtually
eliminated from the data and that therefore the beta distribution is a better candidate
model for this dataset.
Adding the scale parameter (ϕ) to the model changes the variance component estimates
and their standard errors (Table 6.14, part (a)) and,
therefore, the F- and t-tests are affected (part (b)). The estimated value of the scale
Table 6.15 Estimated means and standard errors on the model scale and the data scale
(a) Trt least squares means
Trt Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
Trt1 -1.2362 0.01768 2 -69.94 0.0002 0.2251 0.003083
Trt2 -1.9327 0.01768 2 -109.34 <0.0001 0.1264 0.001952
(b) Genotype least squares means
Genotype Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
G1 0.1524 0 57 Infty <0.0001 0.5380 0
G10 -1.4143 0 57 -Infty <0.0001 0.1956 0
G11 -1.8698 0 57 -Infty <0.0001 0.1336 0
G12 -2.8971 0.03885 57 -74.58 <0.0001 0.05230 0.001925
G13 -1.4336 0 57 -Infty <0.0001 0.1925 0
G14 -1.8761 0.1304 57 -14.39 <0.0001 0.1328 0.01502
G15 -1.8618 0 57 -Infty <0.0001 0.1345 0
G16 -2.6686 0 57 -Infty <0.0001 0.06485 0
G2 0.2225 0 57 Infty <0.0001 0.5554 0
G3 -1.3329 0 57 -Infty <0.0001 0.2087 0
G4 -1.5897 0 57 -Infty <0.0001 0.1694 0
G5 -1.3696 0 57 -Infty <0.0001 0.2027 0
G6 -2.0173 0 57 -Infty <0.0001 0.1174 0
G7 -1.7001 0.1356 57 -12.53 <0.0001 0.1545 0.01771
G8 -1.7161 0 57 -Infty <0.0001 0.1524 0
G9 -1.9796 0 57 -Infty <0.0001 0.1214 0
Part of the results is shown below. The values of the fit statistics in part (a) of
Table 6.16 for the model are clearly lower than those estimated in the previous
options. This indicates that the normal distribution is reasonable, even though the
response is a proportion. The estimated variance components, tabulated in
part (b), due to blocks, blocks × treatment, and the mean squared error (MSE)
(Residual = Gener. chi-square/DF) are $\hat{\sigma}^2_{block} = 0.000123$, $\hat{\sigma}^2_{block \times trt} = 0.00039$, and
$\hat{\sigma}^2 = MSE = 0.009442 \approx 0.01$, respectively.
Table 6.17 Means and standard errors for genotypes and treatments
(a) Genotype least squares means
Genotype Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
G1 0.5260 0.04086 60 12.87 <0.0001 0.5260 0.04086
G10 0.1340 0.04086 60 3.28 0.0017 0.1340 0.04086
G11 0.1522 0.04086 60 3.73 0.0004 0.1522 0.04086
G12 0.03332 0.04086 60 0.82 0.4179 0.0333 0.04086
G13 0.2026 0.04086 60 4.96 <0.0001 0.2026 0.04086
G14 0.1342 0.04086 60 3.28 0.0017 0.1342 0.04086
G15 0.1360 0.04086 60 3.33 0.0015 0.1360 0.04086
G16 0.05625 0.04086 60 1.38 0.1737 0.0562 0.04086
G2 0.5355 0.04086 60 13.11 <0.0001 0.5355 0.04086
G3 0.2139 0.04086 60 5.24 <0.0001 0.2139 0.04086
G4 0.1751 0.04086 60 4.28 <0.0001 0.1751 0.04086
G5 0.2035 0.04086 60 4.98 <0.0001 0.2035 0.04086
G6 0.1301 0.04086 60 3.18 0.0023 0.1301 0.04086
G7 0.1671 0.04086 60 4.09 0.0001 0.1671 0.04086
G8 0.1504 0.04086 60 3.68 0.0005 0.1504 0.04086
G9 0.1187 0.04086 60 2.90 0.0051 0.1187 0.04086
(b) Trt least squares means
Trt Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
Trt1 0.2478 0.01863 2 13.30 0.0056 0.2478 0.01863
Trt2 0.1358 0.01863 2 7.29 0.0183 0.1358 0.01863
The F-statistics for the fixed effects of genotype, treatment, and the interaction
between both factors indicate statistically significant effects on the proportion of
infested carrots (part (c)). Overall, the least squares means
for genotypes and treatments are reported in Table 6.17 in parts (a) and (b). The
genotypes showing the highest fraction of infested carrots were 1, 2, 3, 5, and
13, whereas genotypes 12 and 16 showed the lowest percentage of infested carrots.
Now, for treatments, the highest proportion of infested carrots was observed in
treatment 1 with 24.78%, whereas in treatment 2, it was 13.58%.
Based on the fixed effects tests, the effect of the genotype × treatment interaction on
the proportion of infested carrots was statistically significant. Genotypes 9 and
16 showed higher susceptibility in treatment 1 followed by treatment 2, whereas
genotypes 5, 11, 13, and 15 showed the same proportions of infested carrots in both
treatments (Fig. 6.6). On the other hand, genotypes that showed higher resistance to
infestation levels were genotypes 1, 2, and 6 followed by genotypes 3, 4, 7, 8,
10, and 12.
[Fig. 6.6: plot with y-axis "Proportion of infested carrots" (0 to 1) and x-axis "Interaction" (genotypes G1 to G16)]
Fig. 6.6 The average proportion of infested carrots in genotypes as a function of treatment
Block 1
A3 A1 A2
B1 B2 B1 B2 B2 B1
C2 C2 C2 C2 C2 C2 C2 C2 C2 C2 C2 C2
C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1
Block 2
A2 A1 A3
B1 B2 B1 B2 B2 B1
C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1 C1
C2 C2 C2 C2 C2 C2 C2 C2 C2 C2 C2 C2
In each of the factor combinations, N orchid seeds were placed to germinate for a
period of time. Let yijk be the number of seeds germinated at the ith level of factor A,
at the jth level of factor B, and at the kth level of factor C. Since the observations are
made at the sub-subplot level, conditional on the structural effects of the design,
these observations have a variance associated with the subplot. Therefore, the
statistical model for this experiment is given below:
6.5 A Split-Split Plot in an RCBD:- In Vitro Germination of Seeds 233
Table 6.19 Sources of variation and degrees of freedom for the randomized block design with an
arrangement of treatments under the split-split-plot structure
Sources of variation Degrees of freedom
Blocks r-1=2-1=1
Factor A a-1=3-1=2
Errora(Bloque*A) (r - 1)(a - 1) = 2
Factor B b-1=2-1=1
A* B (a - 1)(b - 1) = 2
Errorb(A*B(Bloque)) a(b - 1)(r - 1) = 3 × 1 × 1 = 3
Factor C (c - 1) = 2 - 1 = 1
A*C (3 - 1)(2 - 1) = 2
B*C (b - 1)(c - 1) = 1
A*B* C (a - 1)(b - 1)(c - 1) = 2
Error ab(c - 1)(r - 1) = 3 × 2 × 1 × 1 = 6
Total r × a × b × c - 1 = 2 × 3 × 2 × 2 - 1 = 23
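The degrees of freedom in Table 6.19 can be recomputed from the design sizes as a consistency check. This is a small sketch using the values shown in the table (r = 2 blocks, a = 3, b = 2, c = 2); the dictionary keys simply mirror the row labels, with "Bloque" written as "Block".

```python
r, a, b, c = 2, 3, 2, 2  # blocks and levels of factors A, B, C (Table 6.19)

df = {
    "Blocks": r - 1,
    "Factor A": a - 1,
    "Error a (Block*A)": (r - 1) * (a - 1),
    "Factor B": b - 1,
    "A*B": (a - 1) * (b - 1),
    "Error b (A*B(Block))": a * (b - 1) * (r - 1),
    "Factor C": c - 1,
    "A*C": (a - 1) * (c - 1),
    "B*C": (b - 1) * (c - 1),
    "A*B*C": (a - 1) * (b - 1) * (c - 1),
    "Error": a * b * (c - 1) * (r - 1),
}
total = r * a * b * c - 1

for source, value in df.items():
    print(f"{source:22s} {value}")
print(f"{'Total':22s} {total}  (sum of rows: {sum(df.values())})")
```

The rows add up to the total, which confirms that the three error strata (whole plot, subplot, and sub-subplot) account for all remaining degrees of freedom.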
Part of the output is shown in Table 6.20. The value of the conditional statistic
Pearson’s chi-square/DF = 1.81 (part (a)) indicates that there is overdispersion in
the dataset, since this value is greater than 1. The estimated variance components
tabulated in part (b) correspond to blocks, blocks × factor A, and blocks × factor
A × factor B, which are $\hat{\sigma}^2_r = 0.07521$, $\hat{\sigma}^2_{rA} = 0.08847$, and $\hat{\sigma}^2_{rab} = 0.02205$,
respectively. The type III tests of fixed effects are shown in part (c). Here, we see that
the test of equality of treatments is not significant for factors A and B and the
interaction AB (A, P = 0.1917; B, P = 0.0897; AB, P = 0.6262), whereas for factor
C and the interactions AC, BC, and ABC, it is significant at a level of 5%.
Since there is overdispersion in the dataset, the binomial distribution does not
provide a good fit for the dataset (Pearson’s chi-square/DF = 1.81). An alternative
to model this dataset could be the beta distribution. Under this assumption, let the
response variable be $p_{ijk} = y_{ijk}/N_{ijk}$, the proportion of seeds that germinated; then $p_{ijk}$ is
assumed to have a beta distribution, rather than assuming a binomial distribution for the success
count $y_{ijk}$ out of a total of $N_{ijk}$ Bernoulli trials.
The components of the model are listed below:
Distribution: $p_{ijk} \mid r_l, (r\alpha)_{il}, (r\alpha\beta)_{ijl} \sim \text{Beta}(\pi_{ijk}, \phi)$, with ϕ as the scale parameter.
$r_l \sim N(0, \sigma^2_r)$, $(r\alpha)_{il} \sim N(0, \sigma^2_{rA})$, $(r\alpha\beta)_{ijl} \sim N(0, \sigma^2_{rab})$
Linear predictor:
ηijk = η + αi + rl + (rα)il + βj + (αβ)ij + (rαβ)ijl + γk + (αγ)ik + (βγ)jk + (αβγ)ijk
Link function: $\text{logit}(\pi_{ijk}) = \log\left(\frac{\pi_{ijk}}{1-\pi_{ijk}}\right) = \eta_{ijk}$
Table 6.20 Results of the analysis of variance of the RCBD in the split-split plot under the
binomial distribution
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects) 146.19
Pearson’s chi-square 43.49
Pearson’s chi-square/DF 1.81
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
Bloque 0.07521 0.1180
Bloque*A 0.08847 0.09319
Bloque*A*B 0.02205 0.04258
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
A 2 2 4.22 0.1917
B 1 3 6.12 0.0897
A*B 2 3 0.55 0.6262
C 1 6 65.73 0.0002
A*C 2 6 11.68 0.0085
B*C 1 6 29.38 0.0016
A*B*C 2 6 31.69 0.0006
Part of the results under a beta distribution is listed in Table 6.21. The value of the
fit statistic for the conditional model tabulated in (a) (Pearson’s chi-square/
DF = 1.01) indicates that overdispersion has been removed and that the
beta distribution is a good model to fit the dataset. Part (b) shows the variance
component estimates for blocks, block × A, and block × A × B,
$\hat{\sigma}^2_r = -0.157$, $\hat{\sigma}^2_{rA} = -0.05558$, and $\hat{\sigma}^2_{rab} = -0.227$, respectively, and the value
of the estimated scale parameter, $\hat{\phi} = 19.2789$. According to the type III tests of
fixed effects in part (c), the main effect of factor C (P = 0.0128) and the interaction
A×B×C (P = 0.0424) are statistically significant at a level of 5%.
The estimates of the interactions are shown in Table 6.22 on the model scale
under the “Estimate” column and as probabilities on the data scale under the “Mean”
column with its corresponding standard errors under the “Standard error mean”
column.
Table 6.21 Results of the analysis of variance of the RCBD in the split-split plot structure under
the beta distribution
(a) Fit statistics for conditional distribution
-2 Log L (p | r. effects) -37.51
Pearson’s chi-square 21.31
Pearson’s chi-square/DF 1.01
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
Bloque -0.1570 .
Bloque*A -0.05558 .
Bloque*A*B -0.2270 .
Scale 19.2789 5.8703
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
A 2 2 1.21 0.4521
B 1 2 0.00 0.9687
A*B 2 2 1.08 0.4799
C 1 4 18.34 0.0128
A*C 2 4 1.50 0.3257
B*C 1 4 6.56 0.0626
A*B*C 2 4 7.72 0.0424
Table 6.22 Estimated least squares means on the model scale (“Estimate” column) and the data
scale (“Mean” column)
A*B*C least squares means
A B C Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 1 1 -0.3769 0.3194 4 -1.18 0.3034 0.4069 0.07709
1 1 2 0.9506 0.3445 4 2.76 0.0509 0.7212 0.06927
1 2 1 0.1721 0.3147 4 0.55 0.6135 0.5429 0.07810
1 2 2 0.7010 0.3308 4 2.12 0.1014 0.6684 0.07331
2 1 1 -0.6521 0.3296 4 -1.98 0.1190 0.3425 0.07422
2 1 2 2.9148 0.8071 4 3.61 0.0225 0.9486 0.03937
2 2 1 0.7430 0.4699 4 1.58 0.1890 0.6776 0.1026
2 2 2 0.4056 0.4515 4 0.90 0.4198 0.6000 0.1084
3 1 1 0.2695 0.3161 4 0.85 0.4419 0.5670 0.07761
3 1 2 0.2752 0.3163 4 0.87 0.4334 0.5684 0.07759
3 2 1 0.1236 0.3143 4 0.39 0.7143 0.5309 0.07827
3 2 2 1.1726 0.3614 4 3.24 0.0315 0.7636 0.06523
[Fig. 6.7: plot with y-axis "Average germination rate" (0 to 1) and x-axis "Interaction" (C1 and C2 within B1 and B2 within A1 to A3), with letter groupings (a, ab, abc, bc, c) marking the mean comparisons]
The simple effects of factors show that the best combination of factor levels was
A2*B1*C2, which showed the highest seed germination proportion, followed by the
combinations A1*B1*C2 and A3*B2*C2; lower proportions were
observed in the combinations A1*B2*C2, A2*B2*C1, and A2*B2*C2
(Fig. 6.7). Finally, the combination of the factor levels A2 × B1 × C1 showed the
lowest proportion of seed germination.
6.6 Alternative Link Functions for Binomial Data
In previous chapters, we used proc GLIMMIX with binomial data and, by default, it
works with the “logit” link function. However, in certain applications with binomial
data, other link functions are acceptable, either because they make interpretation
easier or because, for certain binomial datasets, the “logit” link function cannot
accurately model the data and, as a result, produces biased (misleading) results. In this
section, we consider two alternative link functions to the logit for binomial data: the
“probit” link and the complementary log-log link.
The probit model is also used to model dichotomous (Bernoulli) or binomial (sum
of Bernoulli trials) responses. For this model, the link function, called the probit link,
uses the inverse of the cumulative distribution function of a standard normal
distribution to transform probabilities to the standard normal variable. That is,
$\Phi^{-1}(\pi_i) = \eta_i$, which implies that $\pi_i = \Phi(\eta_i)$, where
$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}t^2}\, dt.$$
The use of the probit regression model dates back to Bliss (1934). Bliss was
interested in finding an effective pesticide to control insects that fed on grape leaves.
He discovered that the relationship between the response and a dose of pesticide was
sigmoid, and he applied the probit link function to transform the dose–response
curve from a sigmoid to a linear relationship.
The complementary log-log function, defined as $\eta_i = \log(-\log(1 - \pi_i))$,
whose inverse is $\pi_i = 1 - e^{-e^{\eta_i}}$, is useful for data in which most of the probabilities
are near zero or near one. For small values of $\pi_i$, the complementary log-log transformation
produces results highly similar to those produced when using a logit link. As the probability
increases, the transformation approaches infinity more slowly than the probit or logit
link.
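The three link functions discussed in this section and their inverses can be written down directly. The sketch below uses only the Python standard library (`statistics.NormalDist` for the standard normal distribution function); the function names are illustrative and do not correspond to SAS options.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def logit(p):
    return math.log(p / (1.0 - p))

def probit(p):
    return _STD_NORMAL.inv_cdf(p)

def cloglog(p):
    return math.log(-math.log(1.0 - p))

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)

def inv_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))

# Compare the three transformations over a range of probabilities.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"p={p:.2f}  logit={logit(p):7.3f}  probit={probit(p):7.3f}  "
          f"cloglog={cloglog(p):7.3f}")
```

Printing the table makes the two remarks above concrete: for small probabilities the logit and complementary log-log values nearly coincide, and for probabilities near one the complementary log-log grows the slowest of the three.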
This example takes the dataset of the split-split plot in an RCBD (Exercise 6.8.5). In
that example, the data were modeled using the “logit” link function. In this exercise, we
will fit the dataset using the “probit” link function and compare the
results with those obtained using a logit link. The components of the GLMM are identical to those in
Example 6.5, except for the link function. That is, we replace:
Link function: $\text{logit}(\pi_{ijk}) = \log\left(\frac{\pi_{ijk}}{1-\pi_{ijk}}\right) = \eta_{ijk}$ by $\Phi^{-1}(\pi_{ijk}) = \eta_{ijk}$.
The following GLIMMIX syntax implements the fitting of the binomial data
using the “probit” link function.
Table 6.23 shows part of the results under the binomial distribution with the
“probit” link function. In parts (a) and (b), we see the mean squared error and the
variance component estimates for blocks, whole plot, and subplot,
where it can be observed that these values are positive and not negative, as were the
ones obtained with the “logit” link function (Table 6.21). Since the variance components are
positive, this analysis makes more sense than the one based on the logit link.
The type III tests of fixed effects are tabulated in part (c) of Table 6.23. The main
effects of factors A and B and the interaction A*B are not significant
under either link function, whereas the main effect of factor C and the interaction A*B*C
are statistically significant under both. The interactions A*C and B*C are significant
under the “probit” link (P = 0.0075 and P = 0.0017) but not in the results reported for the
“logit” link (P = 0.3257 and P = 0.0626).
The estimated probabilities $\hat{\pi}_{ijk}$ and their respective standard errors are
presented in Table 6.24 for each of the combinations of the three factors; they
are very similar in both link functions. However, the average standard error
is slightly higher with the “logit” link function (0.0711)
compared to the “probit” link (0.0693).
Table 6.23 Results of the analysis of variance of the RCBD in the split-split plot structure under
the binomial distribution using the “probit” link
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects)                        146.43
Pearson’s chi-square                             43.01
Pearson’s chi-square/DF (MSE = $\hat{\sigma}^2$)   1.09
(b) Covariance parameter estimates
Cov Parm                                      Estimate  Standard error
Block ($\hat{\sigma}^2_{block}$)                0.02411   0.03707
Block*A ($\hat{\sigma}^2_{block \times A}$)     0.02128   0.02830
Block*A*B ($\hat{\sigma}^2_{block \times AB}$)  0.01617   0.01896
(c) Type III tests of fixed effects
                        Probit             Logit
Effect  Num DF  Den DF  F-value  Pr > F    Pr > F
A       2       2       5.49     0.1541    0.4521
B       1       3       4.17     0.1339    0.9687
A*B     2       3       0.36     0.7226    0.4799
C       1       6       67.13    0.0002    0.0128
A*C     2       6       12.34    0.0075    0.3257
B*C     1       6       29.16    0.0017    0.0626
A*B*C   2       6       33.93    0.0005    0.0424
Table 6.24 Means and standard errors using the probit and logit link functions
A*B*C least squares means
Probit Logit
A B C Mean Standard error mean Mean Standard error mean
1 1 1 0.1543 0.05050 0.1494 0.04796
1 1 2 0.3723 0.08296 0.3780 0.08767
1 2 1 0.2724 0.06746 0.2694 0.06896
1 2 2 0.2954 0.07798 0.2953 0.08053
2 1 1 0.1023 0.03805 0.09593 0.03409
2 1 2 0.8255 0.06338 0.8292 0.06135
2 2 1 0.5684 0.08306 0.5703 0.08845
2 2 2 0.5529 0.08327 0.5530 0.08847
3 1 1 0.2844 0.07196 0.2844 0.07418
3 1 2 0.2751 0.06868 0.2733 0.07041
3 2 1 0.2568 0.06452 0.2563 0.06589
3 2 2 0.4612 0.08017 0.4608 0.08553
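The average standard errors quoted for the two link functions can be recomputed from the "Standard error mean" columns of Table 6.24. The following sketch simply averages the twelve values in each column; the variable names are illustrative.

```python
from statistics import mean

# "Standard error mean" columns copied from Table 6.24 (12 A*B*C combinations).
se_probit = [0.05050, 0.08296, 0.06746, 0.07798, 0.03805, 0.06338,
             0.08306, 0.08327, 0.07196, 0.06868, 0.06452, 0.08017]
se_logit = [0.04796, 0.08767, 0.06896, 0.08053, 0.03409, 0.06135,
            0.08845, 0.08847, 0.07418, 0.07041, 0.06589, 0.08553]

avg_probit = mean(se_probit)
avg_logit = mean(se_logit)
print(f"probit: {avg_probit:.4f}  logit: {avg_logit:.4f}")
```

The difference between the two averages is small, which agrees with the remark that both link functions give very similar estimates for this dataset.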
unit (nijkr). The data can be referred to in the Appendix (Data: Commercial crop
explant attachment).
The GLMM for this experiment is described below (log-log data):
Distribution: $y_{ijkl} \mid r_l, r(\alpha\beta)_{ijl} \sim \text{Binomial}(N_{ijkl}, \pi_{ijkl})$
$r_l \sim N(0, \sigma^2_r)$, $r(\alpha\beta)_{ijl} \sim N(0, \sigma^2_{rab})$
Linear predictor: $\eta_{ijkl} = \eta + r_l + \alpha_i + \beta_j + (\alpha\beta)_{ij} + r(\alpha\beta)_{ijl} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk}$,
where blocks ($r_l$) and blocks × (A × B) ($r(\alpha\beta)_{ijl}$) are assumed to contribute to the
variation such that $r_l \sim N(0, \sigma^2_r)$ and $r(\alpha\beta)_{ijl} \sim N(0, \sigma^2_{rab})$, respectively.
Link function: $\log(-\log(1 - \pi_{ijkl})) = \eta_{ijkl}$
The following GLIMMIX code fits the binomial proportions with a complementary
log-log link function in an RCBD.
The “link = ccll” option specifies that “proc GLIMMIX” will fit the model using
the complementary log-log link function. The “lsmeans A|B|C/lines ilink”
statement requests the least squares means of the linear predictors ηijk; the “lines”
option adds the comparisons between these means on the model scale, and the
“ilink” option applies the inverse link to report them on the data scale.
Part of the output is shown below. Table 6.25 shows the variance component
estimates of blocks and blocks (A×B) using alternative link functions. Under
the “probit” link, the variance components are smaller than those obtained
with the “log-log” and “logit” link functions.
Table 6.25 Variance component estimates using the same distribution but a different link function
Covariance parameter estimates
             Log-log                    Logit                      Probit
Cov Parm     Estimate  Standard error   Estimate  Standard error   Estimate  Standard error
Block        0.05808   0.07112          0.08144   0.1042           0.02676   0.03494
Block (A*B)  0.05065   0.03121          0.09203   0.05754          0.03374   0.02111
Table 6.26 Type III tests of fixed effects using the same distribution but with a different link
function
Type III tests of fixed effects
                        Log-log            Logit              Probit
Effect  Num DF  Den DF  F-value  Pr > F    F-value  Pr > F    F-value  Pr > F
A       2       5       6.27     0.0434    7.44     0.0318    8.17     0.0266
B       1       5       4.85     0.0789    3.13     0.1370    2.81     0.1543
A*B     2       5       0.65     0.5613    0.28     0.7693    0.24     0.7971
C       1       6       68.84    0.0002    65.29    0.0002    66.70    0.0002
A*C     2       6       11.94    0.0081    11.53    0.0088    12.12    0.0078
B*C     1       6       27.51    0.0019    28.88    0.0017    28.77    0.0017
A*B*C   2       6       32.44    0.0006    32.36    0.0006    33.93    0.0005
The values of the hypothesis tests for the fixed effects, both main effects and
interactions, are shown in Table 6.26. The three link functions behave similarly.
One tool that might be useful in choosing which link function provides a better fit,
or which best describes the variability of a dataset, is the model fit statistics. The fit
statistics indicate that the model with the complementary “log - log” link function
provides the best fit (Table 6.27).
Table 6.28 shows the maximum likelihood estimators π^ijk for each of the link
functions and the combination of factor levels, and it can be verified that they
provide very similar estimates. It is important to mention that the correct
Table 6.28 Means and standard errors using the same distribution but with a different link function
A*B*C least squares means
A | B | C | Log-log mean | Log-log standard error mean | Logit mean | Logit standard error mean | Probit mean | Probit standard error mean
1 | 1 | 1 | 0.1494 | 0.04259 | 0.1513 | 0.04732 | 0.1547 | 0.05030
1 | 1 | 2 | 0.3776 | 0.08554 | 0.3727 | 0.08510 | 0.3696 | 0.08223
1 | 2 | 1 | 0.2661 | 0.06257 | 0.2706 | 0.06744 | 0.2737 | 0.06718
1 | 2 | 2 | 0.3001 | 0.07718 | 0.2993 | 0.07951 | 0.2980 | 0.07789
2 | 1 | 1 | 0.1020 | 0.03079 | 0.1023 | 0.03451 | 0.1047 | 0.03829
2 | 1 | 2 | 0.8389 | 0.08212 | 0.8188 | 0.06189 | 0.8196 | 0.06375
2 | 2 | 1 | 0.5558 | 0.09578 | 0.5733 | 0.08633 | 0.5700 | 0.08251
2 | 2 | 2 | 0.5578 | 0.09596 | 0.5560 | 0.08635 | 0.5546 | 0.08273
3 | 1 | 1 | 0.2770 | 0.06780 | 0.2805 | 0.07192 | 0.2827 | 0.07131
3 | 1 | 2 | 0.2782 | 0.06574 | 0.2779 | 0.06929 | 0.2778 | 0.06855
3 | 2 | 1 | 0.2555 | 0.05987 | 0.2561 | 0.06416 | 0.2569 | 0.06410
3 | 2 | 2 | 0.4599 | 0.08735 | 0.4610 | 0.08331 | 0.4609 | 0.07965
specification of the linear predictor as well as the distribution of the response variable
are the most important elements for obtaining a good fit.
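To make the comparison of link functions concrete, the following Python sketch shows how the same linear predictor value maps to a probability under the three inverse links discussed in this section. It is only an illustration, not part of the GLIMMIX program: the function names are hypothetical, and the complementary log-log link is assumed to be log[-log(1 - π)].

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: pi = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse of the probit link: the standard normal CDF, written with erf."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse of the complementary log-log link (assumed form):
    eta = log(-log(1 - pi))  =>  pi = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


# Illustrative linear predictor values (not taken from the book's output).
for eta in (-2.0, -1.0, 0.0, 1.0):
    print(f"eta={eta:+.1f}  logit={inv_logit(eta):.4f}  "
          f"probit={inv_probit(eta):.4f}  cloglog={inv_cloglog(eta):.4f}")
```

The three links place the linear predictor on different scales, so the same value of η does not give the same probability. This is one reason why the variance components in Table 6.25 differ in magnitude across links while the fitted means in Table 6.28 remain close.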
6.7 Percentages
In this section, we consider proportions that have been calculated from discrete
counts, for example, the number of infected plants out of a total of Ni plants in
treatment i, a count that is likely to follow a binomial distribution. This class of
models allows the response to arise from different distributions and probabilities.
An experiment was designed to study the effect of conidial density on the transmis-
sion of a fungus that attacks aphids. Aphid carcasses killed by the fungus, and from
which the fungus released spores, were placed on bean plants at three densities
(A = 1, B = 5, or C = 10 carcasses per plant) to provide different doses of fungal
conidia. Densities were assigned to individual bean plants in a completely randomized
design with six replicates. A total of N = 20 live, uninfected aphids were placed
on each plant with a ladybug that was allowed to forage (feed on the bean plants) to
facilitate the transfer of conidia between the carcasses and the live aphids. For each
plant, the number of aphids infected with the fungus (nij) was counted and the
proportion of aphids infected with the fungus was calculated 7 days after the
inoculum was placed. Table 6.29 shows the proportion of infected aphids
(pij = nij/N; N = 20) observed at each of the conidial concentrations (densities)
tested.
The sources of variation and degrees of freedom for this experiment are shown in
Table 6.30.
The components of the GLMM having a beta response are listed below:
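Distributions: $p_{ij} \mid \text{density(plant)}_{i(j)} \sim \text{Beta}(\pi_{ij}, \phi)$
$\text{density(plant)}_{i(j)} \sim N(0, \sigma^2_{\text{density(plant)}})$
Linear predictor: $\eta_{ij} = \mu + \text{density}_i + \text{density(plant)}_{i(j)}$; $i = 1, 2, 3$; $j = 1, \cdots, 6$
Link function: $\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \text{logit}(\pi_{ij}) = \eta_{ij}$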
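The role of the scale parameter ϕ in Beta(π, ϕ) can be illustrated with a short Python sketch. It assumes a mean and precision parameterization in which the two shape parameters are πϕ and (1 - π)ϕ, so that the variance is π(1 - π)/(1 + ϕ). The text does not state the parameterization explicitly, so this is an assumption, and the function names and input values are hypothetical.

```python
def beta_shape_params(mean, phi):
    """Convert a mean and a scale (precision) parameter into the two
    shape parameters of a beta distribution (assumed parameterization)."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if phi <= 0.0:
        raise ValueError("phi must be positive")
    return mean * phi, (1.0 - mean) * phi


def beta_variance(mean, phi):
    """Variance implied by the assumed parameterization."""
    return mean * (1.0 - mean) / (1.0 + phi)


# Illustrative values only: a mean proportion of 0.30 and a scale of 13.
a, b = beta_shape_params(0.30, 13.0)
print(f"shape parameters: a={a:.3f}, b={b:.3f}")
print(f"variance: {beta_variance(0.30, 13.0):.5f}")
```

Under this assumption, a larger ϕ gives a smaller variance around the same mean, which is why ϕ is reported alongside the variance components in the covariance parameter tables.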
Table 6.31 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (P | r. effects) | -24.13
Pearson’s chi-square | 18.45
Pearson’s chi-square/DF | 1.02
(b) Covariance parameter estimates
Cov Parm | Estimate | Standard error
Conc_Ino (Planta) | -0.1833 | .
Scale | 12.9999 | 4.1954
(c) Type III tests of fixed effects
Effect | Num DF | Den DF | F-value | Pr > F
Conc_Ino | 2 | 15 | 8.25 | 0.0038
Table 6.32 Means and standard errors on the model scale and the data scale
Conc_Ino least squares means
Conc_Ino Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
A -1.0340 0.2438 15 -4.24 0.0007 0.2623 0.04717
B -0.5282 0.2246 15 -2.35 0.0328 0.3709 0.05241
C 0.2775 0.2197 15 1.26 0.2259 0.5689 0.05388
Part of the results is shown in Table 6.31. The value of the conditional fit statistic
in part (a), Pearson’s chi-square/DF = 1.02, indicates that there is no
overdispersion in the data and that the beta distribution is a good model for this
dataset. The estimated variance of plants nested within inoculum density is
$\hat{\sigma}^2_{\text{density(plant)}} = -0.1833$ and the estimated scale parameter is
$\hat{\phi} = 12.9999$; both are tabulated in part (b). In part (c) of the same table,
the type III tests of fixed effects are shown, indicating that the density
(concentration) of the inoculum has a significant effect (P = 0.0038) on the
proportion of aphids infected with the fungus.
The values under the column “Estimate” are the estimated means on the model (logit)
scale, whereas the column “Mean” shows the estimated mean proportions on
the data scale with their respective standard errors (Table 6.32). These estimates
were obtained with the “lsmeans” statement and its “ilink” option.
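The relationship between the two scales in Table 6.32 can be checked by hand. The Python sketch below applies the inverse logit to each model-scale estimate and approximates the data-scale standard error with the delta method, SE(mean) ≈ π(1 - π) × SE(estimate). The delta-method step is an assumption about how the data-scale standard errors were obtained, and the function names are hypothetical. The input values are those tabulated in Table 6.32.

```python
import math


def inv_logit(eta):
    """Inverse logit: maps the model scale to a proportion."""
    return 1.0 / (1.0 + math.exp(-eta))


def data_scale(estimate, std_error):
    """Return the mean on the data scale and a delta-method standard error.
    The derivative of the inverse logit is pi * (1 - pi)."""
    mean = inv_logit(estimate)
    return mean, mean * (1.0 - mean) * std_error


# (estimate, standard error) on the model scale, from Table 6.32.
table_6_32 = {
    "A": (-1.0340, 0.2438),
    "B": (-0.5282, 0.2246),
    "C": (0.2775, 0.2197),
}

for density, (estimate, std_error) in table_6_32.items():
    mean, se_mean = data_scale(estimate, std_error)
    print(f"density {density}: mean={mean:.4f}, standard error mean={se_mean:.5f}")
```

The computed values agree with the “Mean” and “Standard error mean” columns of Table 6.32 up to rounding of the tabulated inputs.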
Figure 6.8 shows a linear trend in the proportion of infested aphids as conidial
density increases. Conidial densities A and B showed statistically equal proportions
of infested aphids, and the highest proportion of infested aphids was observed at
density C.
[Fig. 6.8: proportion of infested aphids (y-axis, 0 to 0.7) by conidial density A, B, and C (x-axis)]
Table 6.33 Percentage of quality malt as a function of both factors (variety and germination time)
Variety Time Block y Variety Time Block y
Gambella T1 1 7.25 Red Swazi T2 1 21
Gambella T1 2 11.16 Red Swazi T2 2 15.09
Gambella T1 3 15.9 Red Swazi T2 3 24.84
Macia T1 1 10.91 Teshale T2 1 25.42
Macia T1 2 8.75 Teshale T2 2 26.86
Macia T1 3 10.87 Teshale T2 3 26.64
Meko T1 1 24.65 76 T1#23 T2 1 23.69
Meko T1 2 23.63 76 T1#23 T2 2 20.71
Meko T1 3 28.75 76 T1#23 T2 3 26.14
Red Swazi T1 1 20.95 Gambella T3 1 12.45
Red Swazi T1 2 15.82 Gambella T3 2 15.34
Red Swazi T1 3 25.24 Gambella T3 3 17.32
Teshale T1 1 25.92 Macia T3 1 8.51
Teshale T1 2 27.64 Macia T3 2 8.15
Teshale T1 3 28.03 Macia T3 3 13.07
76T1#23 T1 1 23.39 Meko T3 1 22.09
76T1#23 T1 2 19.43 Meko T3 2 24.11
76T1#23 T1 3 25.55 Meko T3 3 24.47
Gambella T2 1 10.03 Red Swazi T3 1 20.81
Gambella T2 2 12.9 Red Swazi T3 2 16.05
Gambella T2 3 17.84 Red Swazi T3 3 23.7
Macia T2 1 7.88 Teshale T3 1 26.42
Macia T2 2 9.14 Teshale T3 2 27.07
Macia T2 3 11.99 Teshale T3 3 28.01
Meko T2 1 22.97 76 T1#23 T3 1 24.18
Meko T2 2 25.37 76 T1#23 T3 2 19.58
Meko T2 3 25.71 76 T1#23 T3 3 25.74
Part of the results of the above program is shown in Table 6.35. In part (a), the
tabulated value of Pearson’s chi-square/DF, $\chi^2/df = 0.92$, indicates that the
beta distribution is a good distribution for modeling malt percentage since this
value is close to 1. The estimated variance due to blocks is
Table 6.35 Results of the analysis of variance of the RCBD with a beta distribution
(a) Fit statistics for conditional distribution
-2 Log L (p | r. effects) | -280.89
Pearson’s chi-square | 49.66
Pearson’s chi-square/DF | 0.92
(b) Covariance parameter estimates
Cov Parm | Estimate | Standard error
Block | 0.01210 | 0.01055
Scale | 431.54 | 85.4922
(c) Type III tests of fixed effects
Effect | Num DF | Den DF | F-value | Pr > F
Var_sorghum | 5 | 34 | 106.51 | <0.0001
Ger_time | 2 | 34 | 0.26 | 0.7722
Var_sorghum*ger_time | 10 | 34 | 1.08 | 0.4041
Table 6.36 Means and standard errors on the model scale and the data scale for sorghum varieties
Var_sorghum least squares means
Var_sorghum Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
76 T1#23 -1.2011 0.07401 34 -16.23 <0.0001 0.2313 0.01316
Gambella -1.8898 0.07929 34 -23.83 <0.0001 0.1313 0.009042
Macia -2.2067 0.08295 34 -26.60 <0.0001 0.09915 0.007409
Meko -1.1201 0.07364 34 -15.21 <0.0001 0.2460 0.01366
Red Swazi -1.3685 0.07493 34 -18.26 <0.0001 0.2029 0.01212
Teshale -1.0025 0.07314 34 -13.71 <0.0001 0.2685 0.01436
[Figure: average malt percentage (y-axis, 0 to 0.3) by sorghum variety (x-axis): 76T1#23, Gambella, Macia, Meko, Red Swazi, Teshale]
Table 6.37 Analysis of variance with sources of variation and degrees of freedom for this experiment
Sources of variation | Degrees of freedom
Blocks | r - 1 = 2 - 1 = 1
Isolation | a - 1 = 6 - 1 = 5
Block (isolation) | a(r - 1) = 6
Age | b - 1 = 3 - 1 = 2
Isolation*age | (a - 1)(b - 1) = 5 × 2 = 10
Error | (a - 1)(b - 1)(r - 1) = 5 × 2 × 1 = 10
Total | r × a × b - 1 = 2 × 6 × 3 - 1 = 35
Table 6.38 Results of the analysis of variance of the RCBD with a factorial structure in treatments
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects) | -74.53
Pearson’s chi-square | 34.02
Pearson’s chi-square/DF | 1.00
(b) Covariance parameter estimates
Cov Parm | Subject | Estimate | Standard error
Isolation | Block | -0.03125 | .
Scale | | 24.1882 | 5.7925
(c) Type III tests of fixed effects
Effect | Num DF | Den DF | F-value | Pr > F
Isolation | 5 | 6 | 16.48 | 0.0019
Age | 2 | 10 | 30.01 | <0.0001
Isolation*age | 10 | 10 | 4.83 | 0.0102
Some of the outputs are listed below (Table 6.38). The conditional statistic
Pearson’s chi-square/DF = 1 indicates that the distribution used is appropriate
for this dataset (part (a)). The variance component estimates are tabulated in part
(b); for blocks, the estimate is $\hat{\sigma}^2_r = -0.03125$, and the estimated scale
parameter is $\hat{\phi} = 24.1882$. The type III tests of fixed effects in part (c) test
the equality of means for type of isolate, age of the insect, and the interaction
between both factors. These outputs indicate that all three effects are significant
for insect mortality.
We see the expected proportions with their respective standard errors of both
factors on the data scale under the “Mean” column (Tables 6.39 and 6.40). These
values arise by applying the inverse link to the estimates under “Estimate” on the
model scale. Table 6.39 shows the estimated average mortality probabilities for the
isolates; for example, for isolate A1, applying the inverse link to the linear predictor
estimate $\hat{\eta}_{1\cdot} = 0.1722$ we get $\hat{\pi}_{1\cdot} = 1/(1 + e^{-0.1722}) = 0.5429$.
In this manner, we see that the expected proportions for isolates 2 and 4 are
$\hat{\pi}_{2\cdot} = 0.6557$ and $\hat{\pi}_{4\cdot} = 0.5762$, respectively, whereas for the
control $\hat{\pi}_{\text{control}\cdot} = 0.1157$.
Regarding the age of the insect (Table 6.40), the expected average probability of
mortality was highest at age three (E3, adults), $\hat{\pi}_{\cdot 3} = 0.6435$,
whereas insects at age two (E2) were the most resistant to the isolates, showing a
mortality of $\hat{\pi}_{\cdot 2} = 0.2598$.
In general, fungal isolates A1, A2, A3, and A4 showed an average mortality of
more than 75% for adult insects (E3), whereas isolates A1, A2, and A5 showed a
Table 6.39 Means and standard errors on the model scale and the data scale for isolation
Isolate least squares means
Isolate Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
A1 0.1722 0.1859 6 0.93 0.3900 0.5429 0.04614
A2 0.6442 0.2100 6 3.07 0.0220 0.6557 0.04740
A3 -0.1489 0.1952 6 -0.76 0.4746 0.4629 0.04853
A4 0.3073 0.2088 6 1.47 0.1915 0.5762 0.05098
A5 -0.2023 0.1806 6 -1.12 0.3053 0.4496 0.04468
Control -2.0339 0.2418 6 -8.41 0.0002 0.1157 0.02473
Table 6.40 Means and standard errors on the model scale and the data scale for insect age
Age least squares means
Age Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
E1 -0.1747 0.1310 -1.33 0.2120 0.4564 0.03251
E2 -1.0468 0.1374 -7.62 <0.0001 0.2598 0.02643
E3 0.5908 0.1634 3.61 0.0047 0.6435 0.03749
mortality rate of around 65% for cockroaches of age E1 (juvenile insects). On the
other hand, all isolates showed lower lethal effectiveness on insects of age E2
(Fig. 6.10).
A plant pathologist wishes to compare the response of two plant varieties to different
doses/amounts of a pesticide formulated to protect plants against a disease. Five
racks (blocks) were chosen to account for local variation within the greenhouse.
Each rack was divided into four sections or rooms, and one of the four pesticide
levels was randomly assigned to each section within a rack. The four pesticide levels
were 1, 2, 4, and 8 mg/L. One plant of each variety was placed in each section of the
rack. Of the two plant
varieties, one variety was susceptible, labeled S, and the other variety was resistant,
labeled R (Table 6.41). The response variable ( y) is the percentage of disease
inhibition in the plant.
The sources of variation and degrees of freedom for this experiment are shown in
Table 6.42.
Following the same reasoning used in the examples above, the components of the
GLMM with a beta response that models the observed disease inhibition proportion
($y_{ijk}$) under dose i with variety j in block k are listed as follows:
Distributions: $y_{ijk} \mid r_k, (r\alpha)_{ik} \sim \text{Beta}(\pi_{ijk}, \phi)$; $i = 1, \cdots, 4$; $j = 1, 2$; $k = 1, \cdots, 5$
$r_k \sim N(0, \sigma^2_r)$, $(r\alpha)_{ik} \sim N(0, \sigma^2_{rA})$
[Fig. 6.10: average mortality rate (y-axis, 0 to 1) for each isolate (A1 to A5 and control) at each insect age (E1, E2, E3); letters above the bars (a to f) mark the mean-comparison groupings]
The “contrast” command in the program tests which trend (linear, quadratic, or
cubic) the “dose” factor has on the percentage of disease inhibition. Part of the
output is shown in Table 6.43. The value of the conditional goodness-of-fit statistic,
Pearson’s chi-square/DF = 0.59, indicates that we have no evidence of
overdispersion, and, therefore, the beta distribution is adequate to model this
dataset (part (a)). The variance component estimates in part (b) for block and
block × dose are $\hat{\sigma}^2_r = 0.004898$ and $\hat{\sigma}^2_{r \cdot \text{dose}} = 0.002372$,
respectively. Finally, the F-test in part (c) provides sufficient statistical evidence of
an effect of dose on disease inhibition in plants (P = 0.0001), whereas there is not
sufficient evidence of an effect of variety (P = 0.2057) or of the dose × variety
interaction (P = 0.3337).
Table 6.44 shows the polynomial contrasts for the effect of “dose,” which
indicate that there is a significant quadratic effect on the percentage of disease
inhibition.
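Polynomial contrasts for a quantitative factor depend on the spacing of its levels, and the doses in this example (1, 2, 4, and 8) are not equally spaced. The Python sketch below shows one way to derive linear, quadratic, and cubic contrast coefficients for such levels by orthogonalizing the powers of the dose (a Gram-Schmidt procedure). It is an illustration under stated assumptions: the coefficients actually used in the book's “contrast” statements are not shown in this section, the function name is hypothetical, and an analyst might equally choose to work with log dose, which is equally spaced here.

```python
def orthogonal_polynomial_contrasts(levels, max_degree):
    """Return contrast coefficient vectors of degree 1..max_degree.

    Each power of the levels is made orthogonal to the constant vector
    and to all lower-degree contrasts (Gram-Schmidt orthogonalization).
    """
    n = len(levels)
    basis = [[1.0] * n]  # constant term, so every contrast sums to zero
    contrasts = []
    for degree in range(1, max_degree + 1):
        v = [float(x) ** degree for x in levels]
        for u in basis:
            coef = sum(vi * ui for vi, ui in zip(v, u)) / sum(ui * ui for ui in u)
            v = [vi - coef * ui for vi, ui in zip(v, u)]
        basis.append(v)
        contrasts.append(v)
    return contrasts


doses = [1, 2, 4, 8]  # dose levels as given in the text
for name, coefs in zip(("linear", "quadratic", "cubic"),
                       orthogonal_polynomial_contrasts(doses, 3)):
    print(name, [round(c, 4) for c in coefs])
```

The resulting vectors can be rescaled by any nonzero constant without changing the corresponding hypothesis test, since a contrast is defined only up to a multiplicative factor.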
The inhibition percentage has almost a linear trend as the dose increases from 1 to
4 ml/L in both varieties, but when the dose is higher than 4 ml/L, the inhibition of the
disease decreases in both varieties (Fig. 6.11).
Table 6.43 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects) | -184.32
Pearson’s chi-square | 23.63
Pearson’s chi-square/DF | 0.59
(b) Covariance parameter estimates
Cov Parm | Estimate | Standard error
Block | 0.004898 | .
Block*dose | 0.002372 | 0.007513
Scale | 205.52 | 67.7447
(c) Type III tests of fixed effects
Effect | Num DF | Den DF | F-value | Pr > F
Dose | 3 | 12 | 17.67 | 0.0001
Variety | 1 | 16 | 1.74 | 0.2057
Dose*variety | 3 | 16 | 1.22 | 0.3337
[Fig. 6.11: inhibition percentage (y-axis, 0.1 to 0.35) against dose in ml/L (x-axis, 1 to 8)]
where $p_{ijkl}$ is the $ijkl$th proportion of diseased leaves, η is the intercept, $\tau_i$ is the fixed
effect of treatment i, $b_j$ is the random effect of blocks assuming $b_j \sim N(0, \sigma^2_{\text{block}})$,
$(bv)_{jk}$ is the block–plant random effect assuming $(bv)_{jk} \sim N(0, \sigma^2_{\text{block} \times \text{plant}})$,
$(bvr)_{jkl}$ is the random effect due to block–plant–sprouts assuming
$(bvr)_{jkl} \sim N(0, \sigma^2_{\text{block} \times \text{plant} \times \text{sprout}})$, and $\varepsilon_{ijkl}$ is the
experimental error assuming $\varepsilon_{ijkl} \sim N(0, \sigma^2)$.
For the disease incidence data, the assumption of a normal distribution for pijkl is
not recommended. A good starting point for the analysis is to assume that the
observed number of diseased leaves in the sprouts (yijkl) follows a binomial distri-
bution with parameter π ijkl and nijkl, the total number of leaves on the sprout.
Therefore, the components of the GLMM with a binomial distribution in the
response variable are as follows:
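Distribution: $y_{ijkl} \mid b_j, (bv)_{jk}, (bvr)_{jkl} \sim \text{binomial}(\pi_{ijkl}, n_{ijkl})$
$b_j \sim N(0, \sigma^2_{\text{block}})$, $(bv)_{jk} \sim N(0, \sigma^2_{\text{block} \times \text{plant}})$, $(bvr)_{jkl} \sim N(0, \sigma^2_{\text{block} \times \text{plant} \times \text{sprout}})$
Linear predictor: $\eta_{ijkl} = \eta + \tau_i + b_j + (bv)_{jk} + (bvr)_{jkl}$
Link function: $\text{logit}(\pi_{ijkl}) = \eta_{ijkl}$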
The following GLIMMIX syntax fits a GLMM with a binomial response.
Part of the results based on the aforementioned model is shown in Table 6.45. By
default, proc GLIMMIX provides the fit statistics useful for selecting the best model
from a group of models (part (a)).
In addition to accuracy considerations, the Laplace (or quadrature) analysis
allows us to obtain the “conditional distribution fit statistics,” specifically
Pearson’s $\chi^2/df$. Recall that this statistic helps assess the goodness of fit of the
model. A value of $\chi^2/df \gg 1$ indicates overdispersion in the dataset, which may
arise because the linear predictor is incomplete or the assumed distribution is not
suitable (mis-specified) for this dataset. In part (b), we can see that the value of the
conditional distribution statistic is Pearson’s $\chi^2/df = 1.47$. This value
indicates that we have evidence of overdispersion. The variance component
estimates due to block, block × plant, and block × plant × sprout are tabulated in
part (c), whereas the type III tests of fixed effects (part (d)) indicate that there is a
significant difference (P < 0.0001) between treatments.
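The idea behind Pearson’s χ²/df can be sketched for plain binomial counts. The Python example below sums the squared Pearson residuals, (y - nπ)²/(nπ(1 - π)), and divides by the residual degrees of freedom. It is a simplified illustration: it ignores the random effects on which the GLIMMIX conditional statistic is based, the counts are hypothetical, the fitted probabilities are simply group means, and the choice of degrees of freedom (observations minus fitted means) is an assumption. The function name is also hypothetical.

```python
def pearson_chi_square_binomial(successes, trials, fitted_probs):
    """Sum of squared Pearson residuals for binomial counts."""
    chi_square = 0.0
    for y, n, p in zip(successes, trials, fitted_probs):
        expected = n * p
        variance = n * p * (1.0 - p)
        chi_square += (y - expected) ** 2 / variance
    return chi_square


# Hypothetical data: two groups with three replicates of 10 trials each.
successes = [5, 4, 4, 8, 7, 7]
trials = [10] * 6
group = [0, 0, 0, 1, 1, 1]

# Fitted probability for each observation: pooled proportion of its group.
fitted = []
for g in group:
    y_sum = sum(y for y, gi in zip(successes, group) if gi == g)
    n_sum = sum(n for n, gi in zip(trials, group) if gi == g)
    fitted.append(y_sum / n_sum)

chi_square = pearson_chi_square_binomial(successes, trials, fitted)
residual_df = len(successes) - len(set(group))  # assumed: observations minus group means
print(f"Pearson chi-square = {chi_square:.4f}")
print(f"Pearson chi-square/DF = {chi_square / residual_df:.4f}")
```

A ratio well above 1 would point to more variation than the binomial distribution allows, which is the situation described for the diseased-leaf data.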
Since there is overdispersion in the data in the binomial model, an alternative
distribution is the beta distribution. The components of the GLMM are as follows:
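Distribution: $p_{ijkl} \mid b_j, (bv)_{jk}, (bvr)_{jkl} \sim \text{beta}(\pi_{ijkl}, \phi)$;
$b_j \sim N(0, \sigma^2_{\text{block}})$, $(bv)_{jk} \sim N(0, \sigma^2_{\text{block} \times \text{plant}})$, $(bvr)_{jkl} \sim N(0, \sigma^2_{\text{block} \times \text{plant} \times \text{sprout}})$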
Some of the outputs are shown below. Table 6.46 shows that the values of the fit
statistics, as well as the conditional distribution statistics (parts (a) and (b)), are much
smaller than when the binomial distribution was used.
This indicates that the beta distribution is more appropriate for the dataset, as
the value of Pearson’s statistic is $\chi^2/df = 1.03$, indicating that the
overdispersion was almost entirely accounted for. The variance component estimates
as well as the estimated scale parameter $\hat{\phi}$ are tabulated in part (c). Similar to
the previous analysis, the type III tests of fixed effects (part (d)) indicate that there
is a highly significant difference among treatments in the average proportion of
leaves with fungal disease.
The least squares means on the model scale (column “Estimate”) and on
the data scale (column “Mean”) are tabulated in Table 6.47. The results indicate that
Table 6.47 Estimated means (least squares means) on the model scale and on the data scale
Least squares means
t Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 0.7223 0.09989 83 7.23 <0.0001 0.6731 0.02198
2 -1.7482 0.1543 83 -11.33 <0.0001 0.1483 0.01949
3 -2.0178 0.2214 83 -9.11 <0.0001 0.1174 0.02294
4 -1.9358 0.1873 83 -10.34 <0.0001 0.1261 0.02064
5 -1.7887 0.2173 83 -8.23 <0.0001 0.1432 0.02667
6 -1.5360 0.1665 83 -9.23 <0.0001 0.1771 0.02427
all proposed treatments in this study reduce the proportion of diseased leaves
compared to the control treatment (t = 1).
The mean comparison (LSD) obtained with the option “lines” indicates that the
proportion of diseased leaves in treatment one is statistically different from the rest
of the treatments (Table 6.48).
6.8 Exercises
Exercise 6.8.1 Seeds of a particular crop were stored at four different temperatures
(T1, T2, T3, and T4) under four different chemical concentrations (0, 0.1, 1.0, and 10).
To study the effects of temperature and chemical concentration, a completely
randomized experiment was conducted with a factorial treatment structure 4 × 4
and four replicates. For each of the 64 experimental units, 50 seeds were placed in a
dish and the number of seeds that germinated under standard conditions was
recorded. Germination data were obtained from Mead et al. (1993, p. 325)
(Table 6.49).
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for this
experiment.
(b) List all the components of the GLMM in (a).
(c) Analyze this dataset and summarize the relevant results.
Exercise 6.8.2 Data were obtained from an experiment in which separate sprouts of
apple trees were inoculated with macroconidia of the fungus Nectria galligena,
which causes apple canker. The experimental factors were inoculum density (three
levels: 200, 1000, and 5000 macroconidia per ml) and variety
(three levels: Jonagold, Golden Delicious, and Jonathan). The experiment was
carried out in 4 randomized blocks with 12 plots. Each plot consisted of one sprout
on which five inoculations were made. The numbers of successful inoculations per
plot on day 17 after inoculation are shown in the table below (Table 6.50).
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for this
experiment.
(b) List all the components of the GLMM from part (a).
(c) Analyze this dataset and summarize the relevant results.
(d) Is there extra variation in the dataset? What alternative distribution do you
propose? Reanalyze the data and compare the results.
Exercise 6.8.3 This experiment concerns the germination efficiencies of protoplasts
obtained from plants of seven species of the genera Lycopersicon (tomato) and
Solanum (potato). For each species, three or four protoplast isolates were used and,
depending on the availability of the protoplasts, a variable number of plates was
carried out. Per plate, approximately 105 protoplasts were placed in a Petri dish, and,
after 4 weeks, the proportion of dividing protoplasts was recorded. The results in
percentages are listed below (Table 6.51).
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for the
experimental design of this study.
(b) Write down a generalized linear mixed model based on (a), assuming a beta
distribution for the response variable.
(c) Implement an analysis of these data according to the linear predictor and model
in part (b). Summarize the relevant results.
Exercise 6.8.4 The data in this example are the results of a triangle test for 12 raters
tasting 10 pairs of coffee varieties (Table 6.52). In the triangle test, each rater
drank three cups, one of one variety and two of the other. Each rater had
12 triangles for each pair of varieties, 2 for each of the following sequences: AAB,
ABA, BAA, ABB, BAB, and BBA. The response is the number of times the rater
correctly identified the variety appearing once. The experiment was conducted in
two groups of six
Table 6.52 Triangle test (G = group, Eval = panelist, PdV = variety pair, V_A = variety A;
V_B = variety B; Y = number of correct discriminations, n = number of trials)
G Eval PdV V_A V_B Y n G Eval PdV V_A V_B Y n
1 1 1 8 9 2 12 2 7 1 8 9 4 12
1 1 2 5 9 11 12 2 7 2 5 9 12 12
1 1 3 9 6 9 12 2 7 3 9 6 7 12
1 1 4 6 5 6 12 2 7 4 6 5 9 12
1 1 5 6 8 8 12 2 7 5 6 8 10 12
1 1 6 5 8 9 12 2 7 6 5 8 5 12
1 1 7 7 8 6 12 2 7 7 7 8 9 12
1 1 8 7 9 8 12 2 7 8 7 9 9 12
1 1 9 7 5 11 12 2 7 9 7 5 7 12
1 1 10 7 6 5 12 2 7 10 7 6 5 12
1 2 1 8 9 5 12 2 8 1 8 9 2 12
1 2 2 5 9 8 12 2 8 2 5 9 10 12
1 2 3 9 6 8 12 2 8 3 9 6 8 12
1 2 4 6 5 9 12 2 8 4 6 5 9 12
1 2 5 6 8 10 12 2 8 5 6 8 8 12
1 2 6 5 8 11 12 2 8 6 5 8 9 12
1 2 7 7 8 8 12 2 8 7 7 8 4 12
1 2 8 7 9 9 12 2 8 8 7 9 6 12
1 2 9 7 5 8 12 2 8 9 7 5 10 12
1 2 10 7 6 7 12 2 8 10 7 6 7 12
1 3 1 8 9 4 12 2 9 1 8 9 3 12
1 3 2 5 9 9 12 2 9 2 5 9 11 12
1 3 3 9 6 9 12 2 9 3 9 6 11 12
1 3 4 6 5 11 12 2 9 4 6 5 8 12
1 3 5 6 8 8 12 2 9 5 6 8 8 12
1 3 6 5 8 10 12 2 9 6 5 8 11 12
1 3 7 7 8 3 12 2 9 7 7 8 5 12
1 3 8 7 9 7 12 2 9 8 7 9 4 12
1 3 9 7 5 10 12 2 9 9 7 5 11 12
1 3 10 7 6 9 12 2 9 10 7 6 8 12
1 4 1 8 9 7 12 2 10 1 8 9 7 12
1 4 2 5 9 10 12 2 10 2 5 9 9 12
1 4 3 9 6 7 12 2 10 3 9 6 5 12
1 4 4 6 5 8 12 2 10 4 6 5 11 12
1 4 5 6 8 7 12 2 10 5 6 8 5 12
1 4 6 5 8 8 12 2 10 6 5 8 10 12
1 4 7 7 8 7 12 2 10 7 7 8 7 12
1 4 8 7 9 6 12 2 10 8 7 9 8 12
1 4 9 7 5 10 12 2 10 9 7 5 6 12
1 4 10 7 6 7 12 2 10 10 7 6 9 12
1 5 1 8 9 6 12 2 11 1 8 9 7 12
1 5 2 5 9 10 12 2 11 2 5 9 9 12
evaluators each, with the aim of assessing the discriminating abilities of the
panelists for future evaluations. The data for this example are shown in Table 6.52.
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for this
experiment.
(b) List all the components of the GLMM according to part (a).
(c) Analyze this dataset and summarize the relevant results.
(d) Is there extra variation in the dataset? If so, what alternative distribution do
you propose? Reanalyze the data and compare the results.
Exercise 6.8.5 Several brewing techniques are used in the production of espresso
coffee. Among them, the most widespread are bar machines and single-dose pods,
designed in large numbers due to their commercial popularity. This experiment
compares the effects of three different brewing techniques on espresso quality,
measured as the foaming rate (Y, in percentage): method 1 = bar machine (BM),
method 2 = hyper-espresso method (HIP), and method 3 = I-espresso system (IT).
Nine replicates per
method were carried out (Table 6.53).
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for the
experimental design of this study.
(b) Describe the generalized linear mixed model in (a), assuming a beta distribution.
(c) Implement the analysis of these data according to the predictor and model in (b).
Summarize the relevant results.
Exercise 6.8.6 The decision to adopt a particular scale for data involving small
integers is not an easy one because any analysis should be as adequate as possible
in order to obtain estimates with as little uncertainty as possible. As a
simple example of this type of data, consider the following results from a potted
wheat germination experiment (Table 6.54).
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for this
experiment.
(b) List all components of the GLMM in (a), assuming a binomial response variable.
(c) Analyze this dataset and summarize the relevant results.
(d) Is there extra variation in the dataset? If so, reanalyze the data with an
alternative distribution. Summarize and compare your findings.
Exercise 6.8.7 A greenhouse experiment was carried out to investigate how a
disease spreads in two varieties of (agurkesyge) cucumber, which is supposed to
depend on the climate and the amount of fertilizers used for the two varieties. The
following data come from the Department of Plant Pathology. Two climates
were used: (1) change to day temperature 3 hours before sunrise and (2) normal
change to day temperature. Three amounts of fertilizer were applied, normal
(2.0 units), high (3.5 units), and very high (4.0 units). The two varieties were
Aminex and Dalibor. To have a better controlled experiment, the plants were
“standardized” to have the same number of leaves, and, then (on day 0, for example),
the plants were contaminated with the disease. Subsequently, 8 days after the plants
were contaminated, the amount of infection (in percentage) was recorded. From the
resulting infection curve, two measures were calculated (in a manner not specified
here), namely, the rate of spread of the disease (%) and the level of infection at the
end of the disease period. The experiment was implemented in three blocks, each of
which consisted of two sections. Each section consisted of three plots, which were
divided into two subplots, each of which had six to eight plants. Thus, there were a
total of 36 subplots. The results were recorded for each subplot. The experimental
factors were randomly assigned to the different units as follows: two climates to the
two sections within each block, three amounts of fertilizer to the three plots within
each section, and, finally, the two varieties to the two subplots within each plot. The
data are shown below (Table 6.55).
(a) Write down a statistical model of this experiment.
(b) List all the components of the GLMM in (a).
(c) Write down the null and alternative hypotheses associated with this experiment.
(d) Construct an ANOVA table indicating the sources of variation and degrees of
freedom.
(e) Analyze the rate of disease spread to investigate the effect of different factors.
(f) Comment on the results obtained.
Exercise 6.8.8 This example is an experiment to identify damage to the uterus in
laboratory rodents after exposure to boric acid, a compound widely used in pesti-
cides, pharmaceuticals, and other household products (Heindel et al. 1992). The
study design included four doses of boric acid. The compound was administered to
pregnant female mice during the first 17 days of gestation, and, then, the females
were sacrificed and their litters examined. The table below presents the resulting
counts of litters dying in utero (Y) out of the total number of trials conducted (N) at
each of the four doses tested: d1 = 0 (control), d2 = 0.1, d3 = 0.2, and d4 = 0.4
(as percentage of boric acid in the diet) (Table 6.56).
(a) Write down an ANOVA table (sources of variation, degrees of freedom) for this
experiment.
(b) List all the components of the GLMM in (a).
(c) Analyze this dataset and summarize the relevant results.
(d) Is there extra variation in the dataset? If so, what alternative distribution do
you propose? Reanalyze the data and compare your findings.
Appendix
Data: Fleas
Bioen SP Treat Rep Overvi Dead
B1 Daphnia T1 1 10 0
B1 Daphnia T1 2 10 0
B1 Daphnia T1 3 10 0
B1 Daphnia T2 1 10 0
B1 Daphnia T2 2 10 0
B1 Daphnia T2 3 10 0
B1 Daphnia T3 1 9 1
B1 Daphnia T3 2 9 1
B1 Daphnia T3 3 8 2
B1 Daphnia T4 1 2 8
B1 Daphnia T4 2 2 8
B1 Daphnia T4 3 3 7
B1 Daphnia T5 1 0 10
B1 Daphnia T5 2 0 10
B1 Daphnia T5 3 0 10
B1 Daphnia T6 1 0 10
B1 Daphnia T6 2 0 10
B1 Daphnia T6 3 0 10
B2 Daphnia T1 1 10 0
B2 Daphnia T1 2 10 0
B2 Daphnia T1 3 10 0
B2 Daphnia T2 1 10 0
B2 Daphnia T2 2 10 0
B2 Daphnia T2 3 10 0
B2 Daphnia T3 1 9 1
B2 Daphnia T3 2 9 1
B2 Daphnia T3 3 9 1
B2 Daphnia T4 1 2 8
B2 Daphnia T4 2 2 8
B2 Daphnia T4 3 2 8
B2 Daphnia T5 1 0 10
B2 Daphnia T5 2 0 10
B2 Daphnia T5 3 0 10
B2 Daphnia T6 1 0 10
B2 Daphnia T6 2 0 10
B2 Daphnia T6 3 0 10
B3 Daphnia T1 1 10 0
B3 Daphnia T1 2 10 0
B3 Daphnia T1 3 10 0
B3 Daphnia T2 1 10 0
B3 Daphnia T2 2 10 0
B3 Daphnia T2 3 10 0
B3 Daphnia T3 1 8 2
B3 Daphnia T3 2 9 1
B3 Daphnia T3 3 9 1
B3 Daphnia T4 1 3 7
B3 Daphnia T4 2 2 8
B3 Daphnia T4 3 2 8
B3 Daphnia T5 1 0 10
B3 Daphnia T5 2 0 10
B3 Daphnia T5 3 0 10
B3 Daphnia T6 1 0 10
B3 Daphnia T6 2 0 10
B3 Daphnia T6 3 0 10
B1 Dubia T1 1 10 0
B1 Dubia T1 2 10 0
B1 Dubia T1 3 10 0
B1 Dubia T2 1 5 5
B1 Dubia T2 2 6 4
B1 Dubia T2 3 6 4
B1 Dubia T3 1 5 5
B1 Dubia T3 2 5 5
B1 Dubia T3 3 5 5
B1 Dubia T4 1 2 8
B1 Dubia T4 2 3 7
B1 Dubia T4 3 3 7
B1 Dubia T5 1 2 8
B1 Dubia T5 2 2 8
B1 Dubia T5 3 2 8
B1 Dubia T6 1 0 10
B1 Dubia T6 2 0 10
B1 Dubia T6 3 0 10
B2 Dubia T1 1 10 0
B2 Dubia T1 2 10 0
B2 Dubia T1 3 10 0
B2 Dubia T2 1 7 3
B2 Dubia T2 2 5 5
B2 Dubia T2 3 6 4
B2 Dubia T3 1 5 5
B2 Dubia T3 2 5 5
B2 Dubia T3 3 5 5
B2 Dubia T4 1 4 6
B2 Dubia T4 2 4 6
B2 Dubia T4 3 4 6
B2 Dubia T5 1 2 8
B2 Dubia T5 2 2 8
B2 Dubia T5 3 2 8
B2 Dubia T6 1 0 10
B2 Dubia T6 2 0 10
B2 Dubia T6 3 0 10
B3 Dubia T1 1 10 0
B3 Dubia T1 2 10 0
B3 Dubia T1 3 10 0
B3 Dubia T2 1 8 2
B3 Dubia T2 2 8 2
B3 Dubia T2 3 7 3
B3 Dubia T3 1 5 5
B3 Dubia T3 2 5 5
B3 Dubia T3 3 6 4
B3 Dubia T4 1 2 8
B3 Dubia T4 2 3 7
B3 Dubia T4 3 2 8
B3 Dubia T5 1 3 7
B3 Dubia T5 2 2 8
B3 Dubia T5 3 2 8
B3 Dubia T6 1 0 10
B3 Dubia T6 2 0 10
B3 Dubia T6 3 0 10
Chapter 7
Time of Occurrence of an Event of Interest
7.1 Introduction
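$$f(y; \alpha, \beta) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, y^{\alpha - 1} e^{-y/\beta}, \quad y \ge 0,$$
where $\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} e^{-t}\,dt$ is the gamma function (Casella and Berger 2002). The
mean and variance of a gamma random variable are $E[Y] = \alpha\beta = \mu$ and
$\mathrm{Var}[Y] = \alpha\beta^2 = \mu^2/\alpha$, respectively. This density function can be rewritten in terms
of the mean μ and the scale parameter ϕ = 1/α, so that α = 1/ϕ and β = μϕ:
$$f(y; \mu, \phi) = \frac{1}{\Gamma(1/\phi)\,(\mu\phi)^{1/\phi}}\, y^{1/\phi - 1} e^{-y/(\mu\phi)}, \quad y \ge 0.$$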
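The equivalence of the two parameterizations can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the values of μ and ϕ are illustrative assumptions and do not come from the text. It evaluates the density in both forms and approximates the mean and variance by numerical integration, which should come out close to μ and ϕμ².

```python
import math

def gamma_pdf_shape_scale(y, alpha, beta):
    """Gamma density with shape alpha and scale beta, for y > 0."""
    log_f = ((alpha - 1.0) * math.log(y) - y / beta
             - math.lgamma(alpha) - alpha * math.log(beta))
    return math.exp(log_f)

def gamma_pdf_mean_dispersion(y, mu, phi):
    """Same density written with mean mu and scale parameter phi = 1/alpha."""
    return gamma_pdf_shape_scale(y, alpha=1.0 / phi, beta=mu * phi)

def numeric_moments(mu, phi, upper=300.0, steps=100_000):
    """Midpoint-rule approximations of E[Y] and Var[Y] on (0, upper)."""
    h = upper / steps
    m1 = m2 = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        w = gamma_pdf_mean_dispersion(y, mu, phi) * h
        m1 += y * w
        m2 += y * y * w
    return m1, m2 - m1 * m1

mu, phi = 20.0, 0.25  # illustrative values only (alpha = 4, beta = 5)
mean, var = numeric_moments(mu, phi)
print(round(mean, 3), round(var, 3))  # close to mu and phi * mu**2
```

Working on the log scale with `math.lgamma` avoids overflow in Γ(α) when ϕ is small.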
Estrus induction in ewes is a common practice on livestock farms and at research
centers. In this experiment, a researcher applied gonadotropin-releasing
hormone (GnRH), equine chorionic gonadotropin (eCG), and P4 in a controlled
internal drug-releasing (CIDR) intravaginal device to female Pelibuey ewes (n = 78),
with the type of lambing (single, double, or triple) as treatments. To ensure that all
animals were in good condition during the experiment, the ewes received the same
zootechnical management and feeding, and all were synchronized on the same day
under the same synchronization protocol. Table 7.1 presents the
analysis of variance (ANOVA).
The variables evaluated in this experiment were the time of onset and the duration of
estrus (yij), in hours, according to the type of lambing. The variability among
ewes in weight, age, and body condition must be taken into account in the
analysis. The data from this experiment can be found in Appendix 1 of this book
(Data: Pelibuey Sheep). Thus, the components of a gamma GLMM are as follows:
where ηij is the linear predictor for treatment i (type of birth: single, double, or triple)
in ewe j, μ is the overall mean, τi is the fixed effect due to type of birth (treatment), and
r(τ)ij is the random effect of ewe j within type of birth (treatment) i, with
$r(\tau)_{ij} \sim N(0, \sigma^2_{\tau(\text{animal})})$.
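Using the symbols defined in the "where" clause, and assuming the log link that the text reports for this analysis, the three components of the model can be sketched as follows. The layout (distribution, linear predictor, link) is an assumption about how the components are organized, not a quotation.

```latex
\begin{align*}
\text{Distribution:}     \quad & y_{ij} \mid r(\tau)_{ij} \sim \text{Gamma}(\mu_{ij}, \phi) \\
\text{Linear predictor:} \quad & \eta_{ij} = \mu + \tau_i + r(\tau)_{ij},
                                 \qquad r(\tau)_{ij} \sim N\!\left(0, \sigma^2_{\tau(\text{animal})}\right) \\
\text{Link function:}    \quad & \log(\mu_{ij}) = \eta_{ij}
\end{align*}
```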
The following GLIMMIX program fits the model
Table 7.3 Means and standard errors on the model scale (“Estimate” column) and the data scale
(“Mean” column) for the onset and duration of estrus in Pelibuey ewe lambs
Birth type least squares means
Birth_type  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
Start of estrus
1 3.2913 0.06631 75 49.63 <0.0001 26.8787 1.7824
2 3.0622 0.04606 75 66.48 <0.0001 21.3735 0.9845
3 3.0496 0.04542 75 67.14 <0.0001 21.1059 0.9586
Duration of estrus
1 1.6826 0.1518 75 11.09 <0.0001 5.3795 0.8164
2 2.6716 0.1171 75 22.81 <0.0001 14.4637 1.6938
3 2.8075 0.09846 75 28.51 <0.0001 16.5684 1.6313
The last two columns of Table 7.3, labeled “Mean” and “Standard error mean,”
correspond to the means (μij) on the data scale for the ewes’ mean onset and duration
of estrus with their respective standard errors. For example, the mean time to onset of
estrus in single-birth ewes was 26.88 ± 1.78 hours, whereas for double- and triple-birth
ewes, it was 21.37 ± 0.98 and 21.11 ± 0.96 hours, respectively. On the other hand, the
average duration of estrus (in hours) was longer in double- and triple-birth ewes
(14.46 ± 1.69 and 16.57 ± 1.63, respectively) than in single-birth ewes
(5.38 ± 0.82).
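The relationship between the “Estimate” and “Mean” columns of Table 7.3 can be reproduced by hand. The sketch below assumes the log link, so that the mean is exp(Estimate), and assumes that the data-scale standard error follows the delta method (mean × model-scale standard error); the function and variable names are illustrative. Small differences from the table are expected because the printed estimates are rounded.

```python
import math

# (response, birth type) -> (Estimate, Standard error) on the log scale, from Table 7.3
model_scale = {
    ("onset", 1): (3.2913, 0.06631),
    ("onset", 2): (3.0622, 0.04606),
    ("onset", 3): (3.0496, 0.04542),
    ("duration", 1): (1.6826, 0.1518),
    ("duration", 2): (2.6716, 0.1171),
    ("duration", 3): (2.8075, 0.09846),
}

def to_data_scale(estimate, std_error):
    """Inverse log link for the mean; delta-method standard error."""
    mean = math.exp(estimate)
    return mean, mean * std_error

data_scale = {key: to_data_scale(*value) for key, value in model_scale.items()}

for (response, birth_type), (mean, se) in data_scale.items():
    print(f"{response:8s} birth type {birth_type}: {mean:8.4f} +/- {se:.4f}")
```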
where ηij is the linear predictor for treatment i and block (patient) j, μ is the overall
mean, τi is the fixed effect due to treatment, and rj is the random effect of the patient,
with $r_j \sim N(0, \sigma^2_{\text{patient}})$.
Note that although the canonical link for the exponential and gamma distributions is
the inverse of the mean, gamma and exponential GLMMs most often use the log link
(link = log), which is computationally more stable; it was used in this and in the
previous analysis.
The following GLIMMIX syntax fits a GLMM for a complete block design.
Table 7.6 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (y | r. effects) 728.62
Pearson’s chi-square 5.69
Pearson’s chi-square/DF 0.08
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
Patient 0.03964 0.02375
Residual ϕ 0.09132 0.01640
The fit statistic of the conditional model (Pearson's chi-square/DF = 0.08), together
with the variance component (Patient) and the scale parameter ϕ of the model,
indicates that the gamma model adequately describes the dataset (Table 7.6, parts
(a) and (b)). The analysis of variance (Table 7.6, part (c)) indicates that there is
a highly significant difference among treatments in the mean duration of itching
(P = 0.0030).
The dispersion observed in the plot of the residuals versus the linear predictor
(top left) suggests that the variance is homogeneous
(Fig. 7.1). The histogram (top right) shows a nearly symmetrical pattern with
little skew. Furthermore, the residuals versus quantile plot (bottom left) shows no
marked deviations, indicating that the fit is adequate. Finally, the bottom right plot
shows that the residuals average zero and vary between -0.5 and 0.75.
The “lsmeans” on the data scale for the five drugs, the placebo, and the
control treatment are shown under the “Mean” column with their respective
“Standard error of mean” in Table 7.7. Each of the five drugs appears to have a
significant effect compared to the placebo and control. Papaverine (Papv) is the most
effective drug. The placebo and control treatments have statistically similar means.
The relatively large difference in the placebo group suggests that some patients
responded negatively to the placebo compared to the control, whereas others responded
positively.
Figure 7.2 shows that the drug papaverine significantly reduced the itching time,
followed by the drugs aminophylline and morphine, whereas the efficacies of the
drugs pentobarbital and tripelennamine were highly similar to each other in elimi-
nating itching.
Table 7.7 Means and standard errors on the model scale (“Estimate” column) and the data scale
(“Mean” column) for the average duration time of the itch
Trt least squares means
Trt  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error of mean
Amino 4.9795 0.1149 43.32 <0.0001 145.41 16.7129
Morp 4.9797 0.1146 43.44 <0.0001 145.43 16.6733
Papv 4.7356 0.1149 41.20 <0.0001 113.93 13.0956
Pento 5.1703 0.1149 44.99 <0.0001 175.97 20.2211
Placebo 5.2704 0.1151 45.79 <0.0001 194.49 22.3867
No drug 5.2542 0.1148 45.76 <0.0001 191.36 21.9723
Tripel 5.0802 0.1147 44.28 <0.0001 160.80 18.4487
[Fig. 7.2: bar chart of the average time (seconds) by treatment: Amino, Morp, Papv, Pento, Placebo, No drug, Tripel]
of beetles (Appendix 1: Data: Beetles). The combination of both factors (insecticide
* dose) yielded a total of 12 combinations (treatments). The objective of this
study was to compare the effects of insecticide, dose, and their interaction on beetle
survival time. Due to the intrinsic characteristics of each insect, the insects must be
considered as a source of variation in the experiment, since they respond differently
to certain stimuli. Assuming that 48 beetles are available, they were randomly
assigned in equal numbers to 4 groups (blocks), each with the 12 treatment
combinations. That is, four beetles were randomly assigned to each treatment.
The sources of variation and degrees of freedom for this experiment are shown in
the following analysis of variance table (Table 7.8).
The components of the gamma-response GLMM are as follows:
Table 7.9 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (tiempo | r. effects) 121.05
Pearson’s chi-square 1.91
Pearson’s chi-square/DF 0.04
(b) Covariance parameter estimates
Cov Parm Estimate Standard error
block -0.00173 .
Residual 0.04155 0.008818
(c) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Dose 2 33 69.61 <0.0001
Insecticide 3 33 31.36 <0.0001
Dose*insecticide 6 33 2.05 0.0868
Part of the Statistical Analysis Software (SAS) output is shown in Table 7.9. The
value of the conditional model’s Pearson's chi-square/DF = 0.04 indicates that the
gamma distribution adequately models the data. The estimated variance component
for blocks and the scale parameter given by the “Residual” value are shown in
part (b) ($\hat{\sigma}^2_{\text{block}} = -0.00173$ and $\hat{\sigma}^2 = 0.04155$, respectively).
The analysis of variance in part (c) of Table 7.9 indicates that both insecticide and
dose have highly significant effects (P < 0.0001) on beetle survival
time. However, the interaction between the two factors is not significant at the 5%
level, although it is close (P = 0.0868). The “lsmeans” values on the data scale for
dose $\mu_{i\cdot\cdot}$ (part (a)) and insecticide $\mu_{\cdot j\cdot}$ (part (b)), with their respective
standard errors, are listed under the columns titled “Mean” and “Standard error mean”
of Table 7.10.
The combination of levels of both factors affected the average survival time of the
beetles (Table 7.11). For insecticides 1 and 3 at a high dose, the survival time was
lowest, with average times of 2.10 ± 0.209 and 2.35 ± 0.234 hours, respectively. In
general, lower survival times were observed for insecticides 1 and 3 than
for insecticides 2 and 4.
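The main-effect means in Table 7.10 can be related to the cell means in Table 7.11. The sketch below assumes that the least squares mean of a dose on the model (log) scale is the equally weighted average of the four dose-by-insecticide estimates, and that the data-scale mean is its exponential; the variable and function names are illustrative. Because the averaging takes place on the log scale, the result is not the arithmetic average of the four data-scale cell means.

```python
import math

# Model-scale (log) estimates for each dose, in the order Insec1..Insec4 (Table 7.11)
cell_estimates = {
    "High":   [0.7419, 1.2089, 0.8545, 1.1788],
    "Low":    [1.4171, 2.1747, 1.7361, 1.8082],
    "Medium": [1.1632, 2.0980, 1.3218, 1.8984],
}

def dose_lsmean(dose):
    """Average the cell estimates on the log scale, then back-transform."""
    values = cell_estimates[dose]
    estimate = sum(values) / len(values)
    return estimate, math.exp(estimate)

dose_lsmeans = {dose: dose_lsmean(dose) for dose in cell_estimates}

for dose, (estimate, mean) in dose_lsmeans.items():
    print(f"{dose:6s} estimate={estimate:.4f} mean={mean:.4f}")
```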
Table 7.10 Means and standard errors on the model scale (“Estimate”) and the data scale (“Mean”)
for the factor dose and type of insecticide
(a) Dose least squares means
Dose  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
High 0.9960 0.04984 33 19.98 <0.0001 2.7075 0.1349
Low 1.7840 0.04984 33 35.79 <0.0001 5.9538 0.2967
Medium 1.6203 0.04984 33 32.51 <0.0001 5.0548 0.2519
(b) Insecticide least squares means
Insecticide  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
Insec1 1.1074 0.05755 33 19.24 <0.0001 3.0265 0.1742
Insec2 1.8272 0.05755 33 31.75 <0.0001 6.2166 0.3578
Insec3 1.3041 0.05755 33 22.66 <0.0001 3.6845 0.2121
Insec4 1.6284 0.05755 33 28.29 <0.0001 5.0960 0.2933
Table 7.11 Means and standard errors on the model scale and the data scale for the interaction
between dose and type of insecticide
Dose*insecticide least squares means
Dose  Insecticide  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
High Insec1 0.7419 0.09968 33 7.44 <0.0001 2.1000 0.2093
High Insec2 1.2089 0.09968 33 12.13 <0.0001 3.3499 0.3339
High Insec3 0.8545 0.09969 33 8.57 <0.0001 2.3501 0.2343
High Insec4 1.1788 0.09969 33 11.82 <0.0001 3.2503 0.3240
Low Insec1 1.4171 0.09968 33 14.22 <0.0001 4.1250 0.4112
Low Insec2 2.1747 0.09968 33 21.82 <0.0001 8.7998 0.8772
Low Insec3 1.7361 0.09969 33 17.42 <0.0001 5.6754 0.5658
Low Insec4 1.8082 0.09968 33 18.14 <0.0001 6.0994 0.6080
Medium Insec1 1.1632 0.09969 33 11.67 <0.0001 3.2000 0.3190
Medium Insec2 2.0980 0.09968 33 21.05 <0.0001 8.1499 0.8124
Medium Insec3 1.3218 0.09969 33 13.26 <0.0001 3.7501 0.3738
Medium Insec4 1.8984 0.09969 33 19.04 <0.0001 6.6753 0.6654
Four samples were obtained from each of two batches (Reps) of unprocessed gum
from Acacia sp. trees, giving eight samples in total. Within each batch, the four
samples were randomly assigned to the combinations of two factors with two levels
each. The first factor is whether the gum was demineralized or not, and the
second factor is whether the gum was pasteurized or not. An emulsion made
from each gum sample was divided into three smaller parts, which were randomly
assigned to the levels of a third factor, pH, which was adjusted to 2.5, 4.5, or 5.5
using citric acid (Appendix 1: Data: Gum Breakdown Times).
This is a split-plot design in which the gum samples are the whole plots, arranged
in blocks (batches). The combined levels of demineralization and pasteurization of the
gum are the whole-plot factors. The split plots are the smaller parts, each with a
specific pH, which is the only split-plot factor. The response measured (y) was the
time to break, i.e., the time (in hours) until the emulsion failed. The sources of
variation and degrees of freedom for this experiment are shown in Table 7.12.
The components of the GLMM with a gamma response are as follows:
where αi, βj, and γk are the fixed effects due to the factors demineralization,
pasteurization, and pH, respectively; the effects (αβ)ij, (αγ)ik, (βγ)jk, and (αβγ)ijk
are the two- and three-way interactions of the factors under study; and αβ(r)ijl are
random effects due to the demineralization × pasteurization × rep interaction,
assuming that $\alpha\beta(r)_{ijl} \sim N(0, \sigma^2_{r(\alpha\beta)})$.
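With the effects just defined, the model components for this split-plot experiment can be sketched as below. The intercept symbol η and the index l for the batch (rep) are assumptions introduced for illustration; the log link is the one implied by the “Estimate” and “Mean” columns of Table 7.14, where each Mean equals exp(Estimate).

```latex
\begin{align*}
\text{Distribution:}     \quad & y_{ijkl} \mid \alpha\beta(r)_{ijl} \sim \text{Gamma}(\mu_{ijkl}, \phi) \\
\text{Linear predictor:} \quad & \eta_{ijkl} = \eta + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \alpha\beta(r)_{ijl}
                                 + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} \\
\text{Link function:}    \quad & \log(\mu_{ijkl}) = \eta_{ijkl}
\end{align*}
```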
The relevant results from the SAS output are shown in Table 7.13. The value of
the conditional model statistic $\chi^2/DF = 0.01$ indicates that there is no overdispersion
under the gamma distribution. The variance component due to blocks × demineralization ×
pasteurization, $\sigma^2_{r(\alpha\beta)}$, and the scale parameter ϕ are shown in part (b).
The type III tests of fixed effects are presented in part (c) of
Table 7.13, where significant effects of the factors demineralization, pasteurization,
and pH, as well as of the interaction between demineralization and pasteurization, are
observed on the gum. However, the interactions demineralization*pH (P = 0.0676)
and demineralization*pasteurization*pH (P = 0.0535) are only close to significance. The
emulsion breaking time is strongly affected by no demineralization
(demineralization = 1) and no pasteurization (pasteurization = 1) of the gum and, to a
lesser extent, by the pH to which the emulsion was adjusted (Table 7.14).
Analyzing the simple effects of the factors, we can observe that when the gum has
not been pasteurized (B = 1), the average emulsion break time is very similar in the
demineralized and the non-demineralized gum at the three pH levels.
However, when the gum has been pasteurized, demineralization has a significant
impact on the emulsion break time; for example, for a gum that is pasteurized but not
demineralized (A1B2), the emulsion break time is much lower
than when the gum has been demineralized and pasteurized (A2B2) at all three pH
levels. Finally, a demineralized, pasteurized gum at pH = 4.5 gives the
highest breaking stability (Table 7.15).
Table 7.14 Means and standard errors of the main effects on the model scale (Estimate) and the
data scale (Mean)
(a) Demineralization least squares means
Demineralization  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
1 5.0911 0.02930 4 173.77 <0.0001 162.57 4.7628
2 5.3379 0.02930 4 182.18 <0.0001 208.07 6.0964
(b) Pasteurization least squares means
Pasteurization  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
1 5.1230 0.02930 4 174.87 <0.0001 167.84 4.9171
2 5.3059 0.02930 4 181.08 <0.0001 201.53 5.9051
(c) pH least squares means
pH  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
1 5.1610 0.03050 8 169.22 <0.0001 174.33 5.3171
2 5.2839 0.03050 8 173.24 <0.0001 197.13 6.0124
3 5.1986 0.03052 8 170.32 <0.0001 181.02 5.5255
Table 7.15 Means and standard errors of the simple effects on the model scale (Estimate) and the
data scale (Mean)
Demineralization*pasteurization*pH least squares means
A  B  C  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
1 1 1 5.0696 0.06099 8 83.13 <0.0001 159.11 9.7035
1 1 2 5.1695 0.06105 8 84.68 <0.0001 175.83 10.7339
1 1 3 5.1311 0.06100 8 84.12 <0.0001 169.20 10.3204
1 2 1 5.1137 0.06099 8 83.84 <0.0001 166.28 10.1419
1 2 2 5.0445 0.06098 8 82.72 <0.0001 155.17 9.4623
1 2 3 5.0183 0.06098 8 82.29 <0.0001 151.15 9.2170
2 1 1 5.0811 0.06103 8 83.26 <0.0001 160.95 9.8225
2 1 2 5.1694 0.06099 8 84.76 <0.0001 175.81 10.7225
2 1 3 5.1175 0.06110 8 83.76 <0.0001 166.91 10.1978
2 2 1 5.3796 0.06100 8 88.19 <0.0001 216.93 13.2320
2 2 2 5.7520 0.06100 8 94.30 <0.0001 314.81 19.2031
2 2 3 5.5277 0.06106 8 90.53 <0.0001 251.57 15.3607
A = demineralization (1 = no, 2 = yes), B = pasteurization (1 = no, 2 = yes), and C = pH (1 = 2.5,
2 = 4.5, and 3 = 5.5)
7.3 Survival Analysis
times is the presence of censored times, that is, when there are individuals whose
actual survival time is not known.
For a set of survival times (including censored ones) of a sample of individuals, it
is possible to estimate the proportion of the population that will survive a time
interval under the same circumstances. The methods used to make this estimate are
based on the proposal of Kaplan and Meier (1958). This method allows – through
different statistical tests (log rank, Breslow, Tarone–Ware, etc.) – the comparison of
the survival of two or more groups of individuals who differ with respect to certain
factors.
Survival analysis focuses its interest on a group or several groups of individuals
for whom an event is defined, which occurs after a time interval. To determine the
time of interest, there are three requirements: an initial time, a scale to measure the
passage of time (minutes, hours, days, etc.), and clarity about what is meant by the
event of interest.
Conceptually, the survival of an individual is the probability of being alive at a given
time t after a starting point such as diagnosis, initiation of treatment, or complete
remission. In clinical studies, survival times often refer to the time until death,
development of a particular symptom, or relapse after complete remission of a
disease. Failure is defined as death, relapse, or the occurrence of a new disease. In
many survival analyses, when the end of the observation period previously set by the
investigator is reached, there are individuals for whom the event has not occurred and
we do not know when it will occur. Therefore, their actual survival time is
unknown, and only the survival time up to the end of the study is known. Such survival
times are called censored times. In some cases, individuals also leave
the study before the end of the analysis period for reasons unrelated to
the research, e.g., death from other causes; these times are also censored.
Censored data contribute valuable information and, therefore, should not be omitted
from the analysis.
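To show how censored times enter an estimate rather than being discarded, the following minimal sketch implements the Kaplan and Meier (1958) product-limit estimator mentioned earlier. The data are hypothetical and the function name is illustrative; a censored individual counts in the risk set up to its censoring time but contributes no event.

```python
def kaplan_meier(times, observed):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time.

    times    -- follow-up time of each individual
    observed -- True if the event was observed, False if the time is censored
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Collect everyone whose recorded time equals t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored individuals both leave the risk set after t
        at_risk -= deaths + censored
    return curve

# Hypothetical data: five individuals, two of them censored
times = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
km_curve = kaplan_meier(times, observed)
for t, s in km_curve:
    print(t, round(s, 3))
```

In this example, the individual censored at time 3 is still at risk when the event at time 3 occurs, so the denominator there is 4, not 3.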
The pharmaceutical and food industries are legally required to label the shelf life
of their product on the packaging. For pharmaceuticals, the requirements for how to
determine shelf life are highly regulated. However, the regulatory standards do not
specifically define shelf life. Instead, the definition is implicit through the estimation
procedure. The interest is in the situation where multiple batches are used to
determine a shelf life of a product that applies to all future batches. Consequently,
both shelf life and label life are of great importance because of the variability within
and between batches. Product development must be very well thought out before a
company can have confidence in shelf life estimates. The company must be able to
reliably produce a homogeneous product from batch to batch of ingredients, as
physical and chemical factors impact the ability of bacteria to grow, such as pH,
water activity, and uniformity of the mix (moisture distribution, salt, preservative or
food acid) and, consequently, the shelf life of the product. Therefore, products
should be inspected at appropriate times and samples should be tested for critical
stability of physical and chemical characteristics. These tests also provide an oppor-
tunity to begin microbiological testing for spoilage organisms. Testing should
continue beyond the intended shelf life unless the product fails earlier. Testing
To clearly understand and interpret a rate of change calculated from the event data of
interest, a more extensive approach is needed. The definition of a rate of change
begins with the mathematical description of a changing pattern over time,
represented by the symbol S(t). A version of a ratio is created by dividing the change
in the function S(t) [from S(t) to S(t + Δt)] by the corresponding change in time t
[from t to t + Δt], producing the rate of change
$$\frac{S(t + \Delta t) - S(t)}{(t + \Delta t) - t} = \frac{S(t + \Delta t) - S(t)}{\Delta t}.$$
Rates of change with respect to time apply to a variety of situations, but one
specific function, traditionally denoted by S(t), is fundamental to the analysis of
survival data. This is called the survival function and is defined as the probability of
surviving beyond a specific point in time (denoted by t).
That is,
$$S(t) = P(T > t),$$
which is equivalent to
$$S(t) = 1 - F(t),$$
where F(t) is the cumulative distribution function with F(t) = P(T ≤ t). Another
important concept in survival analysis is the hazard function h(t). The hazard
function of T is defined as
$$h(t) = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} \times \frac{1}{P(T \ge t)} = \frac{f(t)}{S(t)},$$
where f(t) is the probability density function. Any distribution defined on $t \in [0, \infty)$
can serve as a survival distribution. Consequently,
$$h(t) = -\frac{\partial}{\partial t}\{\log S(t)\},$$
so that
$$S(t) = \exp\{-H(t)\}, \qquad H(t) = \int_0^t h(u)\,du, \qquad H(t) = -\log S(t),$$
where H(t) is the cumulative hazard (cumulative risk).
For the simplest model, the exponential model with h(t) = λ (λ is a constant), the
survival function is given by
$$S(t) = \exp\left\{-\int_0^t h(u)\,du\right\} = \exp\left\{-\int_0^t \lambda\,du\right\} = e^{-\lambda t},$$
and the density function by
$$f(t) = -\frac{\partial}{\partial t} S(t) = \lambda e^{-\lambda t}.$$
Thus, the survival function, hazard function, and cumulative risk for the
exponential model are given by:
$$S(t) = e^{-\lambda t}, \qquad h(t) = \lambda, \qquad H(t) = \lambda t.$$
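The relation S(t) = exp{−H(t)} can be checked numerically for the exponential model. The sketch below integrates an arbitrary hazard function with the trapezoidal rule and compares the result with the closed form e^{−λt}; the value of λ, the time point, and the function name are hypothetical.

```python
import math

def survival_from_hazard(hazard, t, steps=10_000):
    """S(t) = exp(-H(t)), with H(t) = integral of hazard(u) over [0, t]
    approximated by the trapezoidal rule."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for k in range(1, steps):
        total += hazard(k * h)
    return math.exp(-total * h)

lam = 0.3  # hypothetical constant hazard
t = 4.0    # hypothetical time point

s_numeric = survival_from_hazard(lambda u: lam, t)
s_closed = math.exp(-lam * t)       # S(t) = exp(-lambda * t)
cumulative_hazard = lam * t         # H(t) = lambda * t
density = lam * s_closed            # f(t) = h(t) * S(t)

print(s_numeric, s_closed)
```

The same function accepts any non-negative hazard, for example one that increases with time, for which no constant-hazard closed form applies.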
The objective of this experiment was to test the vulnerability of Aedes aegypti
mosquitoes to four different fungal treatments. A bioassay was
conducted to determine the survival time of each of the mosquitoes. Three-day-old
mosquitoes were maintained after hatching in 45-cm rearing cages with access to
water but not food. The mosquitoes were kept in rearing cages with water and fed
warm pig blood (37 °C) through a natural membrane (sausage casing) approximately
every 3 days and allowed to oviposit freely during the waiting period. A total of
10 mosquitoes were placed in each chamber, to which one of the four treatments or
the control was applied. Here, we present part of the data from a bioassay with four
replicates. The complete data from this trial can be found in Appendix 1 (Data:
Aedes aegypti).
Treatment Rep Y
C 1 8
C 1 11
⋮ ⋮ ⋮
C 4 20
Mam 1 2
Mam 1 2
⋮ ⋮ ⋮
MaS 1 3
MaS 1 3
⋮ ⋮ ⋮
MaC 1 2
MaC 1 2
MaC 1 2
⋮ ⋮ ⋮
Ma1 1 2
Ma1 1 2
⋮ ⋮ ⋮
Ma1 4 11
Table 7.16 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 Log L (T | r. effects) 716.70
Pearson’s chi-square 35.33
Pearson’s chi-square/DF 0.18
(b) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Trt 4 192 186.42 <0.0001
where η is the intercept, τi is the treatment effect, and repj is the random effect due to
the mosquito chamber, assuming $rep_j \sim N(0, \sigma^2_{rep})$.
The following GLIMMIX commands fit a GLMM with a gamma response:
Part of the output is shown in Table 7.16. The statistic in part (a) indicates that
there is no overdispersion in the fit of the data (Pearson's
chi-square/DF = 0.18). The analysis of variance (type III tests of fixed effects)
indicates that there is a highly significant effect (P < 0.0001) of the fungal
treatments on the mean mosquito survival time.
The relevant information in Table 7.17 (“lsmeans”) comes from the columns
labeled “Estimate” and “Mean”: these are the estimates on the model scale and the
data scale, respectively, and the average survival time in each of the treatments is
represented by $\hat{\mu}_i$ (± standard error).
Table 7.17 Means and standard errors of the main effects on the model scale (Estimate) and the
data scale (Mean)
Trt least squares means
Trt  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
Ma1 1.2303 0.06354 192 19.36 <0.0001 3.4223 0.2174
MaC 0.9562 0.06350 192 15.06 <0.0001 2.6017 0.1652
MaS 1.5798 0.06357 192 24.85 <0.0001 4.8542 0.3086
Mam 0.6946 0.06350 192 10.94 <0.0001 2.0029 0.1272
Control 2.7155 0.06362 192 42.68 <0.0001 15.1126 0.9615
The estimated hazard function for each treatment is $\hat{\lambda}_i = 1/\hat{\mu}_i$. For example,
for treatment Ma1, the estimated hazard is $\hat{\lambda}_{Ma1} = 1/3.4223 = 0.2922$. We
can calculate these values manually from the Mean column, or we can automate the
process by adding the command “ods output lsmeans = mu” to the GLIMMIX
program above. Once we have saved the treatment means, we can ask SAS to compute
the estimated hazard function for the treatments. The commands are as follows:
data hazard;
  set mu;          /* LS-means table saved by "ods output lsmeans = mu" */
  hazard = 1/mu;   /* hazard = 1 / mean on the data scale */
run;
proc print data=hazard;
run;
The results are listed in Table 7.18. The hazard column contains the
estimated hazard functions for each treatment, $\hat{h}_i(t) = \hat{\lambda}_i$.
From the values $\hat{\lambda}_i$, we can calculate the estimated survival function
$\hat{S}_i(t) = e^{-\hat{\lambda}_i t}$ for each of the treatments. Figure 7.3 shows the probability of
survival over time obtained with $\hat{S}_i(t) = e^{-\hat{\lambda}_i t}$ for each of the proposed treatments
and the control. Clearly, the treatments MaS, Ma1, MaC, and Mam showed a greater
efficacy than the control in the biological control of these mosquitoes.
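The hazard and survival calculations described above can also be reproduced outside SAS. The following is a minimal sketch that starts from the “Mean” column of Table 7.17 and assumes, as in the text, a constant hazard equal to the reciprocal of the mean (the exponential model) with time measured in days; the variable and function names are illustrative.

```python
import math

# Data-scale mean survival times from the "Mean" column of Table 7.17
mean_survival = {
    "Ma1": 3.4223,
    "MaC": 2.6017,
    "MaS": 4.8542,
    "Mam": 2.0029,
    "Control": 15.1126,
}

# Constant hazard under the exponential model: lambda_i = 1 / mu_i
hazard = {trt: 1.0 / mu for trt, mu in mean_survival.items()}

def survival(trt, t):
    """Estimated survival probability S_i(t) = exp(-lambda_i * t)."""
    return math.exp(-hazard[trt] * t)

# Survival curves evaluated on days 1 to 20
days = range(1, 21)
curves = {trt: [survival(trt, t) for t in days] for trt in hazard}

for trt in hazard:
    print(f"{trt:8s} hazard={hazard[trt]:.5f}  S(5)={survival(trt, 5):.3f}")
```

A larger hazard gives a curve that drops faster, so the control curve lies above all four fungal treatments at every time point.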
Similar to the previous example, this experiment consisted of testing the vulnerabil-
ity of Aedes aegypti mosquitoes to different fungal treatments (four treatments). For
this, two bioassays were conducted to determine the survival time of each of the
mosquitoes. Three-day-old mosquitoes were maintained after hatching in 45-cm
rearing cages with access to water but not food. Mosquitoes were maintained in
rearing cages with water and were fed warm pig blood (37 °C) through a natural
membrane (sausage casing) approximately every 3 days. They were allowed to
oviposit freely during the waiting period. A total of 10 mosquitoes were placed in
each chamber, to which one of the four treatments or the control was applied. The data
can be found in Appendix 1 (Data: Aedes aegypti).
Table 7.18 Means and standard errors of the main effects on the model scale (Estimate) and the data scale (Mean) and the hazard function λi
Effect  TRT  Estimate  Standard error  DF  t-value  Probt  Mean  Standard error mean  Hazard λi
TRT Ma1 1.2303 0.06354 192 19.36 <0.0001 3.4223 0.2174 0.29220
TRT MaC 0.9562 0.06350 192 15.06 <0.0001 2.6017 0.1652 0.38437
TRT MaS 1.5798 0.06357 192 24.85 <0.0001 4.8542 0.3086 0.20601
TRT Mam 0.6946 0.06350 192 10.94 <0.0001 2.0029 0.1272 0.49927
TRT Control 2.7155 0.06362 192 42.68 <0.0001 15.1126 0.9615 0.06617
[Fig. 7.3: estimated probability of survival versus time (days, 1 to 20) for each treatment and the control]
where η is the intercept, τi is the treatment effect, and bioj and rep(bio)k(j) are the random
effects of the bioassay and the mosquito chamber within the bioassay, respectively,
assuming $bio_j \sim N(0, \sigma^2_{bio})$ and $rep(bio)_{k(j)} \sim N(0, \sigma^2_{rep(bio)})$.
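With these definitions, the nested structure of the model can be sketched as follows. The subscript ijk on the response and the log link are assumptions, the latter consistent with the “Estimate” and “Mean” columns reported for this example, where each Mean equals exp(Estimate).

```latex
\begin{align*}
\text{Distribution:}     \quad & y_{ijk} \mid bio_j,\, rep(bio)_{k(j)} \sim \text{Gamma}(\mu_{ijk}, \phi) \\
\text{Linear predictor:} \quad & \eta_{ijk} = \eta + \tau_i + bio_j + rep(bio)_{k(j)} \\
\text{Link function:}    \quad & \log(\mu_{ijk}) = \eta_{ijk}
\end{align*}
```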
The following GLIMMIX program fits a block GLMM with a gamma response.
The results obtained are shown below. Part of the fit statistics and the variance
components are listed in Table 7.19. In part (a), the value of the statistic
Pearson's chi-square/DF is 0.34, and in part (b), the estimated variance
components due to blocks (bioassays), within-block replicates, and experimental error are
$\hat{\sigma}^2_{bio} = 0.1859$, $\hat{\sigma}^2_{rep(bio)} = 0.02562$, and $\hat{\sigma}^2 = 0.2822$, respectively. The type III
tests of fixed effects (part (c)) indicate that there is a highly significant difference
between treatments in the mean survival time (P < 0.0001).
Tables 7.20 and 7.21 show the estimates on the model scale and the data scale, that is, the
linear predictors $(\eta_i)$ and means $(\mu_i)$ with their respective standard errors, and the
estimated hazard function. The results indicate that, among the treatments, MaC has the
greatest lethal effect on A. aegypti mosquitoes, with the shortest mean survival time.
Table 7.19 Results of the analysis of variance
(a) Fit statistics for conditional distribution
-2 log L (Y | r. effects) 3303.50
Pearson’s chi-square 202.30
Pearson’s chi-square/DF 0.34
(b) Covariance parameter estimates
Cov Parm  Estimate  Standard error
BIO 0.1859 0.1936
REP(BIO) 0.02562 0.01673
Residual 0.2822 0.01568
(c) Type III tests of fixed effects
Effect  Num DF  Den DF  F-value  Pr > F
TRAT 4 588 115.36 <0.0001
Table 7.20 Means and standard errors of the main effects on the model scale (Estimate) and the
data scale (Mean)
TRT least squares means
TRT  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error mean
Ma1 1.6344 0.3140 588 5.21 <0.0001 5.1266 1.6097
MaC 1.4903 0.3140 588 4.75 <0.0001 4.4386 1.3939
MaS 1.8788 0.3140 588 5.98 <0.0001 6.5455 2.0550
Mam 1.8053 0.3143 588 5.74 <0.0001 6.0820 1.9115
Control 2.8293 0.3139 588 9.01 <0.0001 16.9329 5.3153
Table 7.21 Means and standard errors of the main effects on the model scale (Estimate), the data
scale (Mean), and the hazard function λi
[Fig. 7.4: survival probability S(t) versus time (days, 1 to 20) for treatments Ma1, MaC, MaS, Mam, and Control]
Figure 7.4 shows the estimated survival curves for the different treatments tested. These
curves were obtained with $\hat{S}_i(t) = e^{-\hat{\lambda}_i t}$.
7.4 Exercises
Exercise 7.4.1 This experiment studied the times to incapacitation of animals
exposed to the burning of eight
types of aircraft interior materials (M1–M9) and the yields, in milligrams per gram,
of seven combustion gases (CO, HCN, H2S, HCl, HBr, NO2, SO2) (Spurgeon 1978).
The recorded incapacitation time of the animal when exposed to each combustion
material (column “Material”) is given in the column “Time” (in
minutes), and the third column gives the value of 1000/Time; these data are shown
below (Table 7.22):
Table 7.22 Time of incapacity of the animal when exposed to different combustion gases
Material Time 1000/Time CO HCN H2S HCl HBr NO2 SO2
M1 2.36 423.7 164 6.4 0 0 0 0.26 0
M1 2.38 420.2 174 7.5 0 0 5 1.07 0
M1 2.61 383.1 96 4.7 0 33 5 0.08 0
M1 3.07 325.7 101 7.5 0 0 7.1 0.43 0
M1 3.07 325.7 142 6.8 0 27.6 0 0.25 0
M1 3.19 313.5 143 8.2 0 0 5.5 0.33 0
M1 3.7 270.3 147 5.2 0 11.3 0 0.37 0
M1 3.9 256.4 156 4.7 0 12 2.6 0.39 0
M1 4.18 239.2 124 3.2 0 23.3 0 0.2 0
M1 4.7 212.8 101 8.9 0.9 5.4 8 0.63 0
M1 4.86 205.8 142 4.6 0 19.4 4.1 0.19 0
M1 5.58 179.2 104 3.4 0 80 0 0.15 0.4
M1 5.85 170.9 90 2.3 0 34.4 0 0.09 1.2
M2 3.22 310.6 159 16.4 0 0 5.3 2 0
M2 3.89 257.1 153 2.9 0 0 6.6 0.15 0
M2 4.79 208.8 161 0.6 0 0 0 0.62 0
M2 5.07 197.2 159 0 0 4.6 1.7 0.04 0
M2 5.22 191.6 162 0 0 22 0 0.04 0
M2 5.82 171.8 106 3.2 0 45.2 15.6 0.08 0
M2 6.09 164.2 124 1.5 0 0 0 0.85 0
M2 8.36 119.6 89 0.7 0 0 5.3 0.29 0
M2 13.02 76.8 88 0 0 0 0 0.02 0
M3 4.29 233.1 129 6 0 4.2 0 0.02 0.7
M3 4.8 208.3 105 5.8 0 0 0 0.03 0
M3 5.04 198.4 108 7.8 0 7.3 0 0.04 0
M3 5.06 197.6 120 11.6 0 23 0 0.02 0
M3 5.25 190.5 149 0 0 8.6 0 0 0
M3 5.5 181.8 28 9.1 0.4 56.2 0 0 2.2
M3 5.55 180.2 83 5 0 0 0 0.02 0
M3 7.55 132.5 68 5.5 0 27.3 0 0.01 0.9
M3 9.58 104.4 28 2.4 2 137 0 0 16.6
M4 1.15 869.6 88 62.4 0 182 0 0.52 2.1
M4 2 500 89 41.7 13.4 0 0 0 0.3
M4 2.15 465.1 63 14.9 0 0 9.6 1.6 8.5
M4 2.22 450.5 112 37.2 14.2 0 20.5 0 1.5
M4 2.23 448.4 96 7 0 43.1 0 0.53 11.2
M4 2.72 367.6 78 33.8 13.9 0 0 0 0
M4 2.93 341.3 348 1.9 0 28 7.1 1 1.8
M4 3.07 325.7 255 1.9 0 0 0 0.57 0
M4 3.47 288.2 112 19.5 10.7 88 0 0.03 4.8
M4 4.18 239.2 144 3.8 0 14.5 5.1 0.39 0.9
M4 4.64 215.5 70 11.2 6.2 205 0 0.04 4.9
(continued)
Exercise 7.4.2 Cockroaches are responsible for 80% of infestations in spaces used
by humans. They associate with humans and have the ability to contaminate food
with their feces and secretions, having both medical and economic implications.
Different insecticides, mainly synthetic, have been formulated and, in some cases,
have led to the development of resistance in cockroaches. This exercise deals with the
survival time in days (y) of this insect when exposed to two promising fungi for
biological control plus an already known control. The data for this
exercise are shown below (Table 7.23):
Exercise 7.4.3 Consider a study on the effect of analgesic treatments (Trt) in elderly
patients with neuralgia. Two test treatments (A and B) and a placebo (P) are
compared. The response variable is whether the patient reported pain or not
(yes = 1, no = 0). The investigators recorded the age (E) and sex (S) of 60 patients
and the time (T) until the pain disappeared after starting the
treatment. The data are presented in Table 7.24 below.
Table 7.24 Results with neuralgia patients (Trt = Treatment, S = Sex, E = Age, T = Time,
D = Pain with yes = 1 and no = 0)
Trt S E T D Trt S E T D Trt S E T D
P F 68 1 0 B M 74 16 0 P F 67 30 0
P M 66 26 1 B F 67 28 0 B F 77 16 0
A F 71 12 0 B F 72 50 0 B F 76 9 1
A M 71 17 1 A F 63 27 0 A F 69 18 1
B F 66 12 0 A M 62 42 0 P F 64 1 1
A F 64 17 0 P M 74 4 0 A F 72 25 0
P M 70 1 1 B M 66 19 0 B M 59 29 0
A F 64 30 0 A M 70 28 0 A M 69 1 0
B F 78 1 0 P M 83 1 1 B F 69 42 0
B M 75 30 1 P M 77 29 1 P F 79 20 1
A M 70 12 0 A F 69 12 0 B F 65 14 0
B M 70 1 0 B M 67 23 0 A M 76 25 1
P M 78 12 1 B M 77 1 1 B F 69 24 0
P M 66 4 1 P F 65 29 0 P M 60 26 1
A M 78 15 1 B M 75 21 1 A F 67 11 0
P F 72 27 0 P F 70 13 1 A M 75 6 1
B F 65 7 0 P F 68 27 1 P M 68 11 1
P M 67 17 1 B M 70 22 0 A M 65 15 0
P F 67 1 1 A M 67 10 0 P F 72 11 1
A F 74 1 0 B M 80 21 1 A F 69 3 0
Appendix 1
Data: Onset and duration of estrus in Pelibuey ewes (age in weeks, weight in kilograms,
Inestro = number of days from the onset of estrus, Durestro = number of days in the duration of
estrus)
Animal Birth type Age Weight CC Inestro Durestro
1 1 18.5096 52.5 4 28 4
2 1 18.4438 47.4 4 28 4
3 1 19.3973 50.2 4 16 20
4 1 19.3973 53.6 4 28 16
Data: Beetles
Dose Insecticide Rep Frac Time
Low Insec1 1 0.31 3.1
Low Insec2 1 0.82 8.2
Low Insec3 1 0.43 4.3
Low Insec4 1 0.45 4.5
Medium Insec1 1 0.36 3.6
Medium Insec2 1 0.92 9.2
Medium Insec3 1 0.44 4.4
Medium Insec4 1 0.56 5.6
High Insec1 1 0.22 2.2
Trt Rep Y Trt Rep Y Trt Rep Y Trt Rep Y Trt Rep Y
Control 2 11 Mam 2 2 MaS 2 3 MaC 2 2 Ma1 2 3
Control 2 11 Mam 2 2 MaS 2 3 MaC 2 2 Ma1 2 3
Control 2 15 Mam 2 2 MaS 2 3 MaC 2 3 Ma1 2 3
Control 2 15 Mam 2 2 MaS 2 4 MaC 2 3 Ma1 2 4
Control 2 15 Mam 2 2 MaS 2 5 MaC 2 3 Ma1 2 4
Control 2 16 Mam 2 2 MaS 2 6 MaC 2 4 Ma1 2 4
Control 3 11 Mam 3 2 MaS 3 3 MaC 3 2 Ma1 3 2
Control 3 11 Mam 3 2 MaS 3 3 MaC 3 2 Ma1 3 2
Control 3 11 Mam 3 2 MaS 3 3 MaC 3 2 Ma1 3 2
Control 3 11 Mam 3 2 MaS 3 2 MaC 3 2 Ma1 3 2
Control 3 23 Mam 3 2 MaS 3 2 MaC 3 3 Ma1 3 3
Control 3 25 Mam 3 2 MaS 3 5 MaC 3 3 Ma1 3 3
Control 3 26 Mam 3 2 MaS 3 5 MaC 3 3 Ma1 3 3
Control 3 27 Mam 3 2 MaS 3 6 MaC 3 3 Ma1 3 3
Control 3 30 Mam 3 2 MaS 3 10 MaC 3 4 Ma1 3 4
Control 3 30 Mam 3 2 MaS 3 12 MaC 3 4 Ma1 3 4
Control 4 8 Mam 4 2 MaS 4 3 MaC 4 2 Ma1 4 2
Control 4 8 Mam 4 2 MaS 4 3 MaC 4 2 Ma1 4 2
Control 4 11 Mam 4 2 MaS 4 3 MaC 4 2 Ma1 4 2
Control 4 13 Mam 4 2 MaS 4 4 MaC 4 2 Ma1 4 3
Control 4 14 Mam 4 2 MaS 4 4 MaC 4 2 Ma1 4 3
Control 4 19 Mam 4 2 MaS 4 5 MaC 4 2 Ma1 4 3
Control 4 20 Mam 4 2 MaS 4 5 MaC 4 3 Ma1 4 4
Control 4 20 Mam 4 2 MaS 4 6 MaC 4 3 Ma1 4 5
Control 4 20 Mam 4 2 MaS 4 9 MaC 4 3 Ma1 4 6
Control 4 22 Mam 4 2 MaS 4 12 MaC 4 3 Ma1 4 11
Data: Aedes aegypti (Bio = bioassay, Trt = treatment, Rep = repetition, Y = survival time)
Bio Trt Rep Y Bio Trt Rep Y Bio Trt Rep Y
B1 C 1 8 B1 MaS 3 3 B2 C 1 5
B1 C 1 11 B1 MaS 3 3 B2 C 1 7
B1 C 1 11 B1 MaS 3 3 B2 C 1 8
B1 C 1 11 B1 MaS 3 2 B2 C 1 8
B1 C 1 11 B1 MaS 3 2 B2 C 1 10
B1 C 1 11 B1 MaS 3 5 B2 C 1 13
B1 C 1 13 B1 MaS 3 5 B2 C 1 14
B1 C 1 13 B1 MaS 3 6 B2 C 1 16
B1 C 1 14 B1 MaS 3 10 B2 C 1 20
B1 C 1 20 B1 MaS 3 12 B2 C 1 22
B1 C 2 8 B1 MaS 4 3 B2 C 1 22
B1 C 2 11 B1 MaS 4 3 B2 C 1 23
B1 C 2 11 B1 MaS 4 3 B2 C 1 23
B1 C 2 11 B1 MaS 4 4 B2 C 1 23
Chapter 8
Generalized Linear Mixed Models
for Categorical and Ordinal Responses
8.1 Introduction
\[
f(y_1, y_2, \ldots, y_C) = \frac{N!}{y_1!\, y_2! \cdots y_C!}\; \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_C^{y_C}
\]
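As a quick numerical illustration of the multinomial probability function above, the following minimal Python sketch evaluates it for a given vector of counts. The function name `multinomial_pmf` and the example counts and probabilities are hypothetical and chosen only for illustration; they do not come from any dataset in this book.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` given category probabilities `probs`."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient N! / (y_1! y_2! ... y_C!), kept as an exact integer.
    coef = factorial(n)
    for y in counts:
        coef //= factorial(y)
    # Product of pi_c ** y_c over the C categories.
    return coef * prod(p ** y for p, y in zip(probs, counts))


# Hypothetical example: 10 units classified into C = 3 categories.
example = multinomial_pmf([2, 3, 5], [0.2, 0.3, 0.5])
print(f"{example:.5f}")
```

With C = 2 categories, the same function reduces to the binomial probability function.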
Multinomial models are applied in data analysis where the categorical response
variable has more than two possible outcomes while the independent variables can
be continuous, categorical, or both (Hosmer and Lemeshow 2000). The categorical
response variable can be either ordinal (ordered) or nominal (unordered). An ordinal
response variable takes values that represent a rank order on some dimension
but are too few to be treated as a continuous variable. A nominal
(unordered) response variable takes values that identify classes but give no
indication of order. Models for multinomial data are constructed in a
similar way to those for binomial data, and the link functions used in these models
extend the logit and probit functions used for binomial data. Cumulative logit and
cumulative probit models define the link functions on cumulative probabilities,
which allows for parsimonious modeling of ordinal multinomial data.
Generalized logit and probit models do not require ordered categories and are
therefore suitable for nominal multinomial data.
In terms of generalized linear models (GLMs) and generalized linear mixed
models (GLMMs), a multinomial distribution with C categories requires C - 1
link functions to fully specify a model that relates the response probabilities
(π 1, π 2, . . ., π C) to the linear predictor. The commonly used models are the cumula-
tive logit model, also known as the proportional odds model proposed by McCullagh
(1980), and the cumulative probit model, also known as the threshold model.
Throughout this chapter, we will use either of these two link functions
interchangeably.
The link functions for a cumulative logit model with C categories are
8.3 Cumulative Logit Models (Proportional Odds Models) 323
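\[
\begin{aligned}
\eta_1 &= \log\left(\frac{\pi_1}{1-\pi_1}\right) = \eta_1 + X\beta + Zb \\
\eta_2 &= \log\left(\frac{\pi_1+\pi_2}{1-(\pi_1+\pi_2)}\right) = \eta_2 + X\beta + Zb \\
&\;\;\vdots \\
\eta_{C-1} &= \log\left(\frac{\pi_1+\pi_2+\cdots+\pi_{C-1}}{1-(\pi_1+\pi_2+\cdots+\pi_{C-1})}\right) = \eta_{C-1} + X\beta + Zb
\end{aligned}
\]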
where X and Z are the design matrices, whereas β and b are the vectors of fixed and
random effects parameters, respectively. The inverse links of each of the functions
are as follows:
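\[
\begin{aligned}
\pi_1 &= \frac{1}{1+e^{-\eta_1}} = h(\eta_1) \\
\pi_1+\pi_2 &= \frac{1}{1+e^{-\eta_2}} = h(\eta_2) \\
&\;\;\vdots \\
\pi_1+\pi_2+\cdots+\pi_{C-1} &= \frac{1}{1+e^{-\eta_{C-1}}} = h(\eta_{C-1}).
\end{aligned}
\]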
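Once $h(\eta_1), h(\eta_2), \ldots, h(\eta_{C-1})$ have been estimated, we can then estimate the
probabilities $\hat\pi_1, \hat\pi_2, \ldots, \hat\pi_C$ by differencing successive cumulative probabilities:
$\hat\pi_1 = h(\eta_1)$, $\hat\pi_c = h(\eta_c) - h(\eta_{c-1})$ for $c = 2, \ldots, C-1$, and $\hat\pi_C = 1 - h(\eta_{C-1})$.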
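The step from the estimated cumulative logits to the individual category probabilities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `category_probabilities` and the three cumulative logits in the example are hypothetical and are not estimates from any model in this chapter.

```python
from math import exp


def category_probabilities(cumulative_etas):
    """Convert C - 1 increasing cumulative logits into C category probabilities."""
    # Inverse logit link: h(eta) = 1 / (1 + exp(-eta)) gives cumulative probabilities.
    cumulative = [1.0 / (1.0 + exp(-eta)) for eta in cumulative_etas]
    if any(b < a for a, b in zip(cumulative, cumulative[1:])):
        raise ValueError("cumulative logits must be non-decreasing")
    # Pad with 0 and 1, then difference successive cumulative probabilities.
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


# Hypothetical cumulative logits for a response with C = 4 ordered categories.
probs = category_probabilities([-1.0, 0.5, 2.0])
print([round(p, 4) for p in probs], "sum =", round(sum(probs), 4))
```

The differencing step is why the cumulative logits must be ordered: otherwise some category would receive a negative probability.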
Multinomial logit models are used to model the relationships between a polytomous
response variable and a set of predictor variables. These polytomous response
models can be classified – as mentioned above – into two different types, depending
on whether the response variable has an ordered or an unordered structure.
In a proportional odds model, the covariates (linear predictor η) have the same
effect on every cumulative logit. Changing the values of the covariates therefore
shifts the response distribution
to the right (or left) without changing the shape of the distribution. In a proportional
odds model, the cumulative logits model the effect of the covariates on the
probability that the response falls at or below each category cutoff.
A multinomial logit model assumes independence of irrelevant alternatives, which implies
that the odds of choosing a category c relative to a category c′ (c ≠ c′) do not
depend on the characteristics of the other categories. The assumption requires that,
if a new category becomes available, the prior probabilities adjust precisely so as to
preserve the original odds between all pairs of outcomes. The proportional
odds model makes the strict assumption that the odds ratio does not depend on the
category, and, therefore, we need to test the proportional odds assumption, which is
also called the “parallel regression assumption.”
Data are obtained from an experiment related to red core disease in strawberries,
which is caused by the fungus Phytophthora fragariae. In this example, 12 straw-
berry populations were evaluated in a completely randomized experiment with
4 replications (Table 8.1). Plots generally consisted of 10 plants; in some cases,
only 9 plants were observed. At the end of the experiment, each plant was assigned
to one of three ordered categories representing fungal damage (1 = no damage,
2 = moderate damage, and 3 = severe damage).
A total of 12 populations were obtained by crossing 3 genotypes of male parents
with 4 genotypes of female parents. The variation between and within plots is
considered minimal, whereas the genetic and nongenetic effects are more significant,
as plants from the same cross are not genetically identical.
The model fitted to the cumulative probabilities of these data is a GLMM that
includes a classification effect for the treatment variable (population resulting from
crossing genotypes). The GLMM for multinomial ordered outcomes with
C categories requires C - 1 link function equations to fully specify the model that
relates the response probabilities (π1, π2, ..., πC) to the linear predictor ηij (Stroup
2013). There is one cumulative logit equation for each of the
categories 1, 2, ..., C - 1. For the three damage categories (C = 3), these are
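\[
\begin{aligned}
\log\left(\frac{\pi_{1ij}}{1-\pi_{1ij}}\right) &= \eta_{1ij} \\
\log\left(\frac{\pi_{1ij}+\pi_{2ij}}{1-(\pi_{1ij}+\pi_{2ij})}\right) &= \eta_{2ij}
\end{aligned}
\]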
The following GLIMMIX program fits a cumulative logit model with an ordinal
multinomial response in a CRD.
Table 8.3 Results of the multinomial analysis of variance for injury level in strawberry plants
(a) Covariance parameter estimates
Cov Parm Subject Estimate Standard error
Intercept Rep 0.1453 0.1437
(b) Type III tests of fixed effects
Effect Num degree of freedom (DF) Den DF F-value Pr > F
Trt 11 457 2.60 0.0032
The odds ratio (Table 8.5) is obtained as $e^{\tau_i}$ for crosses 1–12. Since odds
ratios are not specific to a particular category, this value is the same at every
category boundary, hence the name proportional odds.
In Table 8.6, we show the maximum likelihood estimates of the linear predictors
$\hat\eta_{ci} = \hat\eta_c + \hat\tau_i$ in the “Estimate” column, on the model scale, as well as the
means on the data scale for each of the categories of the treatments tested (“Mean”).
Thus, for c = 1, t = 1 (response category “Without” damage and treatment 1), the
estimate is $\eta_{11} = -1.6027$, and, for c = 2, t = 1 (“Moderate” damage and treatment
1), the linear predictor is $\eta_{21} = -0.0825$. Taking the inverse of the link function
yields the probability $\pi_{11} = 1/(1+e^{1.6027}) = 0.1676$. This is the estimated probability
that the cross (treatment) M1H1 has a response score of “Without damage.”
This inverse value is presented under the “Mean” column (Table 8.6).
Now, for c = 2, t = 1, the inverse of the link yields the following probability:
$\pi_{11} + \pi_{21} = 1/(1+e^{0.0825}) = 0.4794$ (cumulative probability). From this value, we deduce
the probabilities of observing “Moderate” damage and “Severe” damage in
a plant of the cross M1H1. For “Moderate” damage,
the probability is $\pi_{21} = 0.4794 - \pi_{11} = 0.4794 - 0.1676 = 0.3118$, and, for
“Severe” damage, it is $\pi_{31} = 1 - (\pi_{11} + \pi_{21}) = 1 - 0.4794 = 0.5206$. Similarly, the
rest of the probabilities in the different crosses are estimated.
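The arithmetic above can be checked with a short Python sketch. The two linear predictors are the “Estimate” values for “c = 1, t = 1” and “c = 2, t = 1” in Table 8.6; the variable and function names are illustrative assumptions and are not part of the GLIMMIX output.

```python
from math import exp


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-eta))


# Estimates on the model scale for cross M1H1 (treatment 1), from Table 8.6.
eta_11 = -1.6027   # boundary "Without" | "Moderate"
eta_21 = -0.08254  # boundary "Moderate" | "Severe"

cum_without = inv_logit(eta_11)    # pi_11
cum_moderate = inv_logit(eta_21)   # pi_11 + pi_21

p_without = cum_without
p_moderate = cum_moderate - cum_without
p_severe = 1.0 - cum_moderate

print(f"Without:  {p_without:.4f}")
print(f"Moderate: {p_moderate:.4f}")
print(f"Severe:   {p_severe:.4f}")
```

Replacing the two estimates with those of another cross in Table 8.6 gives the probabilities for that cross.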
In recent years, poultry production has become conscious of animal welfare, which
is associated with bird mortality, behavior, and health, among others (Stanley 1981;
Martrenchar et al. 2002). One of the diseases related to animal welfare is footpad
dermatitis, and, among many repercussions, it affects a bird’s ability to walk (Bilgili
et al. 2009). Pododermatitis is known as contact dermatitis or footpad dermatitis and
Table 8.6 Estimates on the model scale (Estimate) and on the data scale (Mean) for the damage
categories in strawberry plants
Estimates
Label Estimate Standard error DF t-value Pr > |t| Mean Standard error (mean)
c = 1, t = 1 -1.6027 0.3706 457 -4.33 <0.0001 0.1676 0.05170
c = 2, t = 1 -0.08254 0.3625 457 -0.23 0.8200 0.4794 0.09047
c = 1, t = 2 -1.2926 0.3597 457 -3.59 0.0004 0.2154 0.06080
c = 2, t = 2 0.2276 0.3542 457 0.64 0.5208 0.5567 0.08741
c = 1, t = 3 -0.9191 0.3572 457 -2.57 0.0104 0.2851 0.07281
c = 2, t = 3 0.6010 0.3555 457 1.69 0.0916 0.6459 0.08131
c = 1, t = 4 -0.9286 0.3542 457 -2.62 0.0090 0.2832 0.07190
c = 2, t = 4 0.5915 0.3524 457 1.68 0.0939 0.6437 0.08081
c = 1, t = 5 -1.7214 0.3744 457 -4.60 <0.0001 0.1517 0.04818
c = 2, t = 5 -0.2013 0.3656 457 -0.55 0.5822 0.4499 0.09047
c = 1, t = 6 -1.0631 0.3590 457 -2.96 0.0032 0.2567 0.06850
c = 2, t = 6 0.4571 0.3557 457 1.28 0.1995 0.6123 0.08444
c = 1, t = 7 -0.6903 0.3526 457 -1.96 0.0509 0.3340 0.07842
c = 2, t = 7 0.8299 0.3533 457 2.35 0.0193 0.6963 0.07471
c = 1, t = 8 -0.8483 0.3566 457 -2.38 0.0178 0.2998 0.07485
c = 2, t = 8 0.6719 0.3556 457 1.89 0.0595 0.6619 0.07958
c = 1, t = 9 -2.0133 0.3864 457 -5.21 <0.0001 0.1178 0.04016
c = 2, t = 9 -0.4932 0.3759 457 -1.31 0.1902 0.3791 0.08849
c = 1, t = 10 -0.9079 0.3540 457 -2.56 0.0106 0.2874 0.07250
c = 2, t = 10 0.6123 0.3524 457 1.74 0.0830 0.6485 0.08033
c = 1, t = 11 -1.8997 0.3813 457 -4.98 <0.0001 0.1301 0.04317
c = 2, t = 11 -0.3795 0.3714 457 -1.02 0.3074 0.4062 0.08958
c = 1, t = 12 -0.4571 0.3526 457 -1.30 0.1955 0.3877 0.08369
c = 2, t = 12 1.0631 0.3558 457 2.99 0.0030 0.7433 0.06789
on finding strategies to reduce leg and carcass lesions in poultry. Important factors in
broiler fattening are the type of litter, litter height, nutrition and feeding programs,
and bird health, among others.
The objective of this study was to evaluate the effect of litter density and organic
minerals (Availa Zn and Availa Mn), with an extract of Yucca schidigera (Micro-
Aid) as a supplement to a traditional fattening program, on the development of
footpad dermatitis in broilers. The genetic material used in this experiment was
mainly male Ross line chickens. The poultry farm's traditional broiler fattening
program runs for 50 days and consists of three phases: a starter diet (1–18 days),
a grower diet (19–35 days), and a finisher diet (36–50 days). Rice husk is used as
bedding material at a density of 1 kg m⁻². In this research,
a foot health program was implemented in addition to the traditional fattening
program, which included the addition of 125 ppm of Micro-Aid (Yucca schidigera
extract), 40 ppm of Availa Zn, and 40 ppm of Availa Mn to the fattening diet.
Based on the above information, four treatments were evaluated at two poultry
farms, as described below:
• Treatment 1 involved the application of the company’s traditional fattening
program (Trt1).
• Treatment 2 was the company’s traditional fattening program plus an increase in
litter density from 1 to 2 kg m⁻² (Trt2).
• Treatment 3 was the traditional fattening program plus the implementation of the
foot health program during the fattening period until completion (Trt3).
• Treatment 4 consisted of the traditional fattening program plus the implementation
of the foot health program and an increase in litter density from 1 to 2 kg m⁻²
(Trt4).
The following table lists the treatments studied (Table 8.7):
The response variable evaluated was the degree of foot lesion (pododermatitis) at
the end of the fattening period (50 days). The response variable was evaluated on
1250 chickens per treatment. The degree of a footpad lesion was determined
according to a visual guide for lesions in chickens based on the method of De
Jong and Guémené (2012). This method defines three grades: grade 0 for
footpads with no lesions, grade 1 for lesions covering some areas of the
footpad (<50%), and grade 2 for extensive lesions covering most of the
footpad (50–100%). Table 8.8 shows the dataset indicating the block, treatment,
level of lesion, and the number of birds observed with a given lesion (frequency).
The GLMM for multinomial ordered outcomes with C categories requires C - 1 link
function equations instead of one to fully specify a model that relates the response
probabilities (π1, π2, ..., πC) to the linear predictor ηij (Stroup 2013). There is one
cumulative logit equation for each of the categories 1, 2, ..., C - 1.
The link functions for the cumulative logit model to describe the response
variable with C categories are as follows:
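\[
\begin{aligned}
\eta_{(1)ij} &= \log\left(\frac{\pi_{1ij}}{1-\pi_{1ij}}\right) = \eta_1 + \tau_i + b_j \\
\eta_{(2)ij} &= \log\left(\frac{\pi_{1ij}+\pi_{2ij}}{1-(\pi_{1ij}+\pi_{2ij})}\right) = \eta_2 + \tau_i + b_j \\
&\;\;\vdots \\
\eta_{(C-1)ij} &= \log\left(\frac{\pi_{1ij}+\pi_{2ij}+\cdots+\pi_{(C-1)ij}}{1-(\pi_{1ij}+\pi_{2ij}+\cdots+\pi_{(C-1)ij})}\right) = \eta_{C-1} + \tau_i + b_j
\end{aligned}
\]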
The components of the GLMM with an ordinal multinomial response variable are
as follows:
Distributions: $y_{0ij}, y_{1ij}, y_{2ij} \mid b_j \sim \text{Multinomial}(N_{ij}, \pi_{0ij}, \pi_{1ij}, \pi_{2ij})$, where $y_{0ij}$, $y_{1ij}$, and $y_{2ij}$
are the observed frequencies of the responses (footpad lesion) in each category (none,
moderate, and severe) and $b_j$ is the random effect due to block, assuming
$b_j \sim N(0, \sigma_b^2)$.
Linear predictor: $\eta_{(c)ij} = \eta_c + \tau_i + b_j$, where $\eta_{(c)ij}$ is the cth link (c = 0, 1) for treatment
i and block j, $\eta_c$ is the intercept for the cth link, $\tau_i$ is the fixed effect due to the ith
treatment, and $b_j$ is the random effect due to the jth block, $b_j \sim N(0, \sigma_b^2)$. The
link functions for each category are as follows:
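\[
\begin{aligned}
\log\left(\frac{\pi_{0ij}}{1-\pi_{0ij}}\right) &= \eta_{(0)ij} \\
\log\left(\frac{\pi_{0ij}+\pi_{1ij}}{1-(\pi_{0ij}+\pi_{1ij})}\right) &= \eta_{(1)ij}
\end{aligned}
\]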
The data should have one column each for block, treatment, lesion category, and
frequency or number of observations (Y), which, in this case, are referenced by the
variables block, trt, category, and frequency, respectively.
Most of the options in the above syntax have already been explained;
the “order = data” option specifies that the categories are treated as ordered,
from the lowest to the highest, according to the order in which they appear in
the dataset. If this option is not used with the response variable in the model
specification, “proc GLIMMIX” sorts the categories in alphabetical or numerical
order, depending on whether the categories are entered as names or numbers.
The “freq y” statement instructs GLIMMIX to use y as the number of observations
in the corresponding category. The “estimate” statements specify the estimable
functions that form the boundaries between categories for each of the four treatments.
For example, the first estimate “c = 0, t = 1” defines η0 + τ1, that is, the boundary
between the categories “Without” (no lesion) and “Moderate” (slight lesion) for
treatment 1. This first estimate corresponds to the logit $\log[\pi_{01}/(1-\pi_{01})]$, where $\pi_{01}$ is the
probability that a chicken that received treatment 1 has a degree of lesion
classified under category 0 (no lesion). The second estimate “c = 1, t = 1” defines
η1 + τ1, that is, the boundary between the categories “Moderate” (slight lesion) and
Table 8.9 Results of the analysis of variance in the multinomial cumulative logit model
(a) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Trt 3 794 22.45 <0.0001
(b) Solutions for fixed effects
Effect Category Trt Estimate Standard error
Intercept (η1) Without 0.6144 0.1799
Intercept (η2) Moderate 3.8787 0.2465
Trt (τ1) 1 -1.5034 0.2086
Trt (τ2) 2 -0.2509 0.2055
Trt (τ3) 3 -1.0365 0.2036
Trt (τ4) 4 0 .
“Severe” (severe lesion) for treatment 1, and corresponds to the cumulative logit
$\log[(\pi_{01}+\pi_{11})/(1-(\pi_{01}+\pi_{11}))]$, and so
on. By taking the inverse link of these values, we can obtain the estimated
probabilities $\pi_{01}$ and $\pi_{11}$. Part of the Statistical Analysis Software (SAS) GLIMMIX
output is presented below:
The results of the analysis of variance in part (a) of Table 8.9 indicate that the
degree of lesion in the chicken footpad (pododermatitis) differed significantly
among the treatments tested (P < 0.0001). Therefore, the null hypothesis of no
treatment effect (H0: τi = 0 for all i, that is, odds ratio = 1) is rejected.
In part (b) of Table 8.9, we can see that the estimated intercepts $\hat\eta_1 = 0.6144$ and
$\hat\eta_2 = 3.8787$ define the boundary between the categories “Without” lesion
and “Moderate” lesion and the boundary between the categories “Moderate” lesion
and “Severe” lesion, respectively. The estimated treatment effects ($\hat\tau_i$) show
how these boundaries move upward or downward when a given treatment is
applied. All estimated treatment coefficients are negative with
respect to treatment 4. This means that chickens under treatments 1–3 have a lower
probability of remaining free of lesions and a higher probability of developing a
severe lesion than chickens under treatment 4.
To calculate the probability that a chicken will not develop footpad dermatitis
(c = 0) when receiving treatment 1, that is, “c = 0, Trt = 1,” we first estimate the
linear predictor $\eta_{01} = \eta_0 + \tau_1 = 0.6144 + (-1.5034) = -0.889$, and, taking the
inverse link, we obtain $\pi_{01} = 1/(1+e^{-(-0.889)}) = 0.291$. This value is the estimated probability
that a chicken will not develop footpad dermatitis when receiving treatment 1.
For “c = 1, Trt = 1,” $\eta_{11} = \eta_1 + \tau_1 = 3.8787 + (-1.5034) = 2.3753$,
whose inverse value is 0.915. This value is an estimate of the cumulative probability $\pi_{01} + \pi_{11}$.
From this value, we obtain the probabilities that a chicken will develop a moderate
lesion or a severe lesion. For a moderate lesion, the probability is
$\pi_{11} = 0.915 - \pi_{01} = 0.915 - 0.291 = 0.624$, and, for a severe lesion, the probability is
$\pi_{21} = 1 - 0.915 = 0.085$. The probabilities for the categories
(c = 0, 1, 2) of the rest of the treatments are computed in a similar way.
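The same calculation can be repeated for all four treatments with a minimal Python sketch. The intercepts and treatment effects are the “Solutions for fixed effects” in Table 8.9; the names `intercepts`, `tau`, and `category_probs` are illustrative assumptions, and the block random effect is set to zero, as in the hand calculation above.

```python
from math import exp

# Fixed-effect solutions from Table 8.9 (cumulative logit model).
intercepts = [0.6144, 3.8787]   # boundaries none|moderate and moderate|severe
tau = {1: -1.5034, 2: -0.2509, 3: -1.0365, 4: 0.0}   # treatment 4 is the reference


def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + exp(-eta))


def category_probs(trt):
    """Probabilities of (no lesion, moderate, severe) for a given treatment."""
    cumulative = [inv_logit(a + tau[trt]) for a in intercepts]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


for trt in sorted(tau):
    none, moderate, severe = category_probs(trt)
    print(f"Trt{trt}: none={none:.3f}  moderate={moderate:.3f}  severe={severe:.3f}")
```

The printed values can be compared with the probabilities plotted in Fig. 8.1.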
Table 8.11 Estimates on the model scale (Estimate) and on the data scale (Mean) for footpad
dermatitis categories in the multinomial cumulative logit model
Estimates
Label Estimate Standard error DF t-value Pr > |t| Mean Standard error (mean)
c = 0, t = 1 -0.8893 0.2428 794 -4.93 <0.0001 0.2914 0.001174
c = 1, t = 1 2.3753 0.2214 794 10.73 <0.0001 0.9149 0.01724
c = 0, t = 2 0.3634 0.1757 794 2.07 0.0390 0.5899 0.04252
c = 1, t = 2 3.6277 0.2420 794 14.99 <0.0001 0.9741 0.006103
c = 0, t = 3 -0.4222 0.1740 794 -2.43 0.0155 0.3960 0.04162
c = 1, t = 3 2.8422 0.2304 794 12.34 <0.0001 0.9449 0.01199
c = 0, t = 4 0.6144 0.1799 794 3.41 0.0007 0.6489 0.04098
c = 1, t = 4 3.8787 0.2465 794 15.73 <0.0001 0.9797 0.004893
The odds ratios tabulated in Table 8.10 are obtained as $e^{\tau_i}$ for treatments 1–4.
These are the estimated cumulative odds ratios
of treatments i (i = 1, 2, 3) relative to treatment 4. Because the values of τi are not
category-specific, the odds ratio for “Without” lesion versus “Moderate” lesion and
that for “Moderate” lesion versus “Severe” lesion are the same (hence the name
“proportional odds”).
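The proportional odds property can be illustrated numerically. The sketch below, which assumes the fixed-effect solutions of Table 8.9 and uses illustrative names (`odds`, `cumulative_odds_ratios`), computes the cumulative odds ratio of each treatment relative to treatment 4 at both category boundaries and compares it with $e^{\tau_i}$. It does not reproduce Table 8.10, which is not shown here.

```python
from math import exp

# Fixed-effect solutions from Table 8.9 (cumulative logit model).
intercepts = [0.6144, 3.8787]
tau = {1: -1.5034, 2: -0.2509, 3: -1.0365, 4: 0.0}


def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + exp(-eta))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def cumulative_odds_ratios(trt, reference=4):
    """Odds ratio of `trt` versus `reference` at each category boundary."""
    return [
        odds(inv_logit(a + tau[trt])) / odds(inv_logit(a + tau[reference]))
        for a in intercepts
    ]


for trt in (1, 2, 3):
    ratios = [round(r, 4) for r in cumulative_odds_ratios(trt)]
    print(f"Trt{trt} vs Trt4: exp(tau) = {exp(tau[trt]):.4f}, per boundary = {ratios}")
```

Both boundaries give the same ratio for a given treatment, which is what the name “proportional odds” refers to.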
From the above odds ratio results, it should be clear why the F- and P-values
in the fixed effects tests are what they are. Adding the “ilink” option to the end of the
“estimate” statement prompts GLIMMIX to apply the inverse link to the linear
predictors ($\eta_{ci}$), i.e., to compute the cumulative probabilities $1/(1+e^{-\eta_{ci}})$ (Table 8.11).
Table 8.11 shows several estimates of $\eta_c + \tau_i$. For example, the
linear predictor for a chicken not developing a lesion under treatment 1 is
labeled “c = 0, t = 1,” that is, $\eta_0 + \tau_1 = -0.8893$. This result matches
the one obtained from the fixed effects table “Solutions for fixed effects”
previously shown. Taking the inverse of the link yields the probability
Fig. 8.1 Estimated probabilities for the footpad lesion categories in the treatments tested, using the
cumulative logit model (vertical axis: probability of lesion; horizontal axis: treatment). Plotted values
(no lesion, moderate, severe): Trt1 0.291, 0.624, 0.085; Trt2 0.590, 0.384, 0.026; Trt3 0.396, 0.549, 0.055;
Trt4 0.649, 0.331, 0.020
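\[
\begin{aligned}
\eta_1 &= \Phi^{-1}(\pi_1) = \eta_1 + X\beta + Zb \\
\eta_2 &= \Phi^{-1}(\pi_1+\pi_2) = \eta_2 + X\beta + Zb \\
&\;\;\vdots \\
\eta_{C-1} &= \Phi^{-1}(\pi_1+\pi_2+\cdots+\pi_{C-1}) = \eta_{C-1} + X\beta + Zb
\end{aligned}
\]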
where X and Z are the design matrices, β and b are the vectors of fixed and random
effects parameters, respectively, and Φ-1() is the inverse function of the standard
normal cumulative distribution. The inverse link of each of the link functions is as
follows:
\[
\begin{aligned}
\pi_1 &= \Phi(\eta_1) = h(\eta_1) \\
\pi_1+\pi_2 &= \Phi(\eta_2) = h(\eta_2) \\
&\;\;\vdots \\
\pi_1+\pi_2+\cdots+\pi_{C-1} &= \Phi(\eta_{C-1}) = h(\eta_{C-1}).
\end{aligned}
\]
Once $h(\eta_1), h(\eta_2), \ldots, h(\eta_{C-1})$ are estimated, we can estimate $\pi_1, \ldots, \pi_C$. The
estimates from an ordinal cumulative probit model are usually very
similar to those from an ordinal cumulative logit model, although not for every dataset.
Both models involve stochastic ordering at different levels of the response variable and are
designed to detect shifts in the location of the response distribution.
Returning to Example 8.3.1, for the cumulative probit model, we set the link option
to “LINK = CPROBIT” in the model definition of the above program syntax.
The output contains all the same elements, except the odds ratios. The analysis
for the cumulative probit model is exactly the same as the one we performed for the
cumulative logit model. Part of the output is shown in parts (a)–(c) of Table 8.12.
The estimated variance component due to blocks is $\hat\sigma^2_{block} = 0.0093$. The results of
the analysis of variance show that the degree of lesion in the chickens’ footpads
(pododermatitis) differs significantly among the tested treatments (P < 0.0001).
In part (c) of Table 8.12, it is possible to observe that the estimated intercepts
$\hat\eta_1 = 0.3880$ and $\hat\eta_2 = 2.2407$ define the boundary between the “Without” lesion and
“Moderate” lesion categories and the boundary between the “Moderate” lesion and
“Severe” lesion categories, respectively. The estimated treatment effects ($\hat\tau_i$)
move the boundaries either upward or downward when a given treatment is
applied. In this sense, all estimated treatment coefficients have a negative effect
8.4 Cumulative Probit Models 337
Table 8.12 Results of the analysis of variance in the multinomial cumulative probit model
(a) Covariance parameter estimates
Cov Parm Subject Estimate Standard error
Intercept Blk 0.009262 0.01817
(b) Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Trt 3 794 24.57 <0.0001
(c) Solutions for fixed effects
Effect Category Trt Estimate Standard error
Intercept (η1) Without 0.3880 0.1124
Intercept (η2) Moderate 2.2407 0.1375
Trt (τ1) 1 -0.9278 0.1227
Trt (τ2) 2 -0.1595 0.1242
Trt (τ3) 3 -0.6459 0.1219
Trt (τ4) 4 0 .
with respect to treatment 4. This means that chickens under treatments 1–3 have a
lower probability of remaining free of footpad lesions and a higher probability of
developing a severe lesion than chickens under treatment 4.
From “Solutions for fixed effects” (Table 8.12, part (c)), the probabilities for
each of the categories can be obtained. For the probability that a chicken will not
develop a footpad lesion (c = 0) under treatment 1, i.e., “c = 0, Trt = 1,” the estimated
linear predictor is obtained as $\eta_{01} = \eta_0 + \tau_1 = 0.3880 + (-0.9278) = -0.5398$, and
taking the inverse link gives $\pi_{01} = \Phi(-0.5398) = 0.2947$, that is, the estimated
probability that a chicken will not develop a footpad lesion when receiving treatment 1.
For “c = 1, Trt = 1,” $\eta_{11} = \eta_1 + \tau_1 = 2.2407 + (-0.9278) = 1.3129$, whose inverse
value is $\Phi(1.3129) = 0.9054$. This value is an estimate of the cumulative probability $\pi_{01} + \pi_{11}$.
From this value, we can
obtain the probabilities that a chicken will develop a moderate lesion or a severe
lesion. For a moderate lesion, $\pi_{11} = 0.9054 - \pi_{01} = 0.9054 - 0.2947 = 0.6107$, and,
for a severe lesion, $\pi_{21} = 1 - 0.9054 = 0.0946$. Similarly, we can obtain the
probabilities of the categories (c = 0, 1, 2) for the rest of the
treatments.
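The probit calculation can be sketched in Python with the standard normal distribution function from the standard library. The intercepts and treatment effects are the “Solutions for fixed effects” in Table 8.12; the names `intercepts`, `tau`, and `category_probs` are illustrative assumptions, and the block random effect is set to zero, as in the hand calculation above.

```python
from statistics import NormalDist

# Fixed-effect solutions from Table 8.12 (cumulative probit model).
intercepts = [0.3880, 2.2407]   # boundaries none|moderate and moderate|severe
tau = {1: -0.9278, 2: -0.1595, 3: -0.6459, 4: 0.0}   # treatment 4 is the reference

# Inverse of the probit link: the standard normal cumulative distribution function.
phi = NormalDist().cdf


def category_probs(trt):
    """Probabilities of (no lesion, moderate, severe) for a given treatment."""
    cumulative = [phi(a + tau[trt]) for a in intercepts]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


for trt in sorted(tau):
    none, moderate, severe = category_probs(trt)
    print(f"Trt{trt}: none={none:.4f}  moderate={moderate:.4f}  severe={severe:.4f}")
```

Only the inverse link differs from the cumulative logit calculation; the differencing of cumulative probabilities is the same.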
Similar to the previous example, adding the “ILINK” option to the end of the
“ESTIMATE” statement prompts GLIMMIX to estimate the values of the linear
predictors ($\eta_{ci}$) and their inverse-link values, which are the cumulative
probabilities $\Phi(\eta_{ci})$. Table 8.13 shows the estimates of the linear predictors as
well as their inverse values (probabilities in this case).
Table 8.13 shows the estimates of $\eta_c + \tau_i$. For example, the estimated
linear predictor for a chicken not developing a footpad lesion under treatment 1, i.e.,
“c = 0, t = 1,” is $\eta_0 + \tau_1 = -0.5398$. This result matches the values
obtained from the fixed effects table (“Solutions for fixed effects”) previously shown.
Taking the inverse of the link function, $\pi_{01} = \Phi(-0.5398) = 0.2947$. This is the
Table 8.13 Estimates on the model scale (Estimate) and on the data scale (Mean) for footpad
lesion categories in the multinomial cumulative probit model
Estimates
Label Estimate Standard error DF t-value Pr > |t| Mean Standard error (mean)
c = 0, t = 1 -0.5398 0.1100 794 -4.91 <0.0001 0.2947 0.03793
c = 1, t = 1 1.3129 0.1208 794 10.87 <0.0001 0.9054 0.02035
c = 0, t = 2 0.2285 0.1105 794 2.07 0.0389 0.5904 0.04293
c = 1, t = 2 2.0812 0.1345 794 15.47 <0.0001 0.9813 0.006153
c = 0, t = 3 -0.2578 0.1085 794 -2.38 0.0178 0.3983 0.04189
c = 1, t = 3 1.5949 0.1258 794 12.68 <0.0001 0.9446 0.01407
c = 0, t = 4 0.3880 0.1124 794 3.45 0.0006 0.6510 0.04158
c = 1, t = 4 2.2407 0.1375 794 16.29 <0.0001 0.9875 0.004457
probability that a chicken will not develop a footpad lesion when receiving treatment
1. This probability is under the “Mean” column.
Now, for “c = 1, t = 1,” the inverse of the link function gives a probability
of 0.9054, which results from the inverse value of the linear predictor $\eta_1 + \tau_1 = 1.3129$.
This value is the estimate of the cumulative probability $\pi_{01} + \pi_{11}$. From this value
and the value of $\pi_{01}$, we obtain the probabilities that a chicken presents a “Moderate”
or a “Severe” lesion when receiving treatment 1:
$\pi_{11} = 0.9054 - 0.2947 = 0.6107$ and $\pi_{21} = 1 - 0.9054 = 0.0946$. Following
the same procedure, we can obtain the rest of the probabilities for each one of the
categories (c = 0, 1, 2) and for the rest of the treatments (2–4).
8.5 Effect of Judges’ Experience on Canned Bean Quality Ratings
Canning quality is one of the most essential traits required in all new dry bean
(Phaseolus vulgaris L.) varieties, and selection for this trait is a critical part of
bean breeding programs. Advanced lines that are candidates for release as varieties
must be evaluated for canning quality for at least 3 years from samples grown at
different locations. Quality is evaluated by a panel of judges with varying levels of
experience in evaluating breeding lines for visual quality traits. A total of 264 bean
breeding lines from 4 commercial classes were retained according to the procedures
described by Walters et al. (1997). These included 62 white (navy), 65 black,
55 kidney, and 82 pinto bean lines plus control or “check” lines. The visual
appearance of the processed beans was determined subjectively by a panel of
13 judges on a 7-point hedonic scale (1 = very undesirable, ..., 4 = neither desirable
nor undesirable, ..., 7 = very desirable). Beans were presented to the panel of judges
in random order at the same time. Before evaluating the samples, all judges were
shown examples of samples rated as satisfactory.
There is concern that certain judges, due to lack of experience, may not be able to
score the canned samples correctly. Inferences about the effects of experience on
attribute-based product evaluations can be drawn from the psychology literature
(Wallsten and Budescu 1981). Prior to the bean canning quality rating experiment, it
was postulated not only that less experienced judges rate more severely than more
experienced judges but also that experience should have little or no effect on white
beans, for which the canning procedure was developed. For the purpose of analysis,
judges are stratified by experience (less than 5 years, greater than 5 years). Counts
by canning quality, judge experience, and bean class are listed in the following table
(Table 8.14).
Table 8.14 Frequency of ratings of different types of beans as a function of the bean-rating
experience
              Black            Kidney            Navy             Pinto
Rating   <5 years >5 years  <5 years >5 years  <5 years >5 years  <5 years >5 years
1            13      32         7      10        10      22        13       2
2            91      78        32      31        56      51        29      17
3           123     124       136      96        84     107        91      68
4            72     122       101     104        84      98       109     124
5            24      31        47      71        51      52        60     109
6             2       3         6      18        24      37        25      78
7             0       0         1       0         1       5         1      12
The link functions for the cumulative logit model for describing a variable with
C categories are as follows:
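\[
\begin{aligned}
\eta_{(1)ij} &= \log\left(\frac{\pi_{1ij}}{1-\pi_{1ij}}\right) = \eta_1 + \alpha_i + \beta_j + (\alpha\beta)_{ij}\\
\eta_{(2)ij} &= \log\left(\frac{\pi_{1ij}+\pi_{2ij}}{1-(\pi_{1ij}+\pi_{2ij})}\right) = \eta_2 + \alpha_i + \beta_j + (\alpha\beta)_{ij}\\
&\ \ \vdots\\
\eta_{(C-1)ij} &= \log\left(\frac{\pi_{1ij}+\pi_{2ij}+\cdots+\pi_{(C-1)ij}}{1-(\pi_{1ij}+\pi_{2ij}+\cdots+\pi_{(C-1)ij})}\right) = \eta_{C-1} + \alpha_i + \beta_j + (\alpha\beta)_{ij}
\end{aligned}
\]
\[
\begin{aligned}
\log\left(\frac{\pi_{1ij}}{1-\pi_{1ij}}\right) &= \eta_{1ij}\\
\log\left(\frac{\pi_{1ij}+\pi_{2ij}}{1-(\pi_{1ij}+\pi_{2ij})}\right) &= \eta_{2ij}
\end{aligned}
\]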
solution oddsratio;
/* Coefficient order follows Table 8.17:
   intercept   = ratings 1 to 6 (one coefficient per cumulative logit)
   class       = Black, Kidney, Navy, Pinto
   exper       = 1 (< 5 years), 2 (> 5 years)
   class*exper = Black 1, Black 2, Kidney 1, Kidney 2, Navy 1, Navy 2, Pinto 1, Pinto 2 */
contrast 'Effect of Experience on Black beans'  exper 1 -1 class*exper 1 -1 0 0 0 0 0 0;
contrast 'Effect of Experience on Kidney beans' exper 1 -1 class*exper 0 0 1 -1 0 0 0 0;
contrast 'Effect of Experience on Navy beans'   exper 1 -1 class*exper 0 0 0 0 1 -1 0 0;
contrast 'Effect of Experience on Pinto beans'  exper 1 -1 class*exper 0 0 0 0 0 0 1 -1;
estimate 'Black, < 5 year, Rating = 1'  intercept 1 0 0 0 0 0 class 1 0 0 0
         exper 1 0 class*exper 1 0 0 0 0 0 0 0 / ilink;
estimate 'Black, < 5 year, Rating <= 2' intercept 0 1 0 0 0 0 class 1 0 0 0
         exper 1 0 class*exper 1 0 0 0 0 0 0 0 / ilink;
estimate 'Black, < 5 year, Rating <= 3' intercept 0 0 1 0 0 0 class 1 0 0 0
         exper 1 0 class*exper 1 0 0 0 0 0 0 0 / ilink;
estimate 'Black, < 5 year, Rating <= 4' intercept 0 0 0 1 0 0 class 1 0 0 0
         exper 1 0 class*exper 1 0 0 0 0 0 0 0 / ilink;
estimate 'Black, < 5 year, Rating <= 5' intercept 0 0 0 0 1 0 class 1 0 0 0
         exper 1 0 class*exper 1 0 0 0 0 0 0 0 / ilink;
estimate 'Black, < 5 year, Rating <= 6' intercept 0 0 0 0 0 1 class 1 0 0 0
         exper 1 0 class*exper 1 0 0 0 0 0 0 0 / ilink;
estimate 'Black, > 5 year, Rating = 1'  intercept 1 0 0 0 0 0 class 1 0 0 0
         exper 0 1 class*exper 0 1 0 0 0 0 0 0 / ilink;
estimate 'Black, > 5 year, Rating <= 2' intercept 0 1 0 0 0 0 class 1 0 0 0
         exper 0 1 class*exper 0 1 0 0 0 0 0 0 / ilink;
estimate 'Black, > 5 year, Rating <= 3' intercept 0 0 1 0 0 0 class 1 0 0 0
         exper 0 1 class*exper 0 1 0 0 0 0 0 0 / ilink;
estimate 'Black, > 5 year, Rating <= 4' intercept 0 0 0 1 0 0 class 1 0 0 0
         exper 0 1 class*exper 0 1 0 0 0 0 0 0 / ilink;
estimate 'Black, > 5 year, Rating <= 5' intercept 0 0 0 0 1 0 class 1 0 0 0
         exper 0 1 class*exper 0 1 0 0 0 0 0 0 / ilink;
estimate 'Black, > 5 year, Rating <= 6' intercept 0 0 0 0 0 1 class 1 0 0 0
         exper 0 1 class*exper 0 1 0 0 0 0 0 0 / ilink;
estimate 'Kidney, < 5 year, Rating = 1'  intercept 1 0 0 0 0 0 class 0 1 0 0
         exper 1 0 class*exper 0 0 1 0 0 0 0 0 / ilink;
estimate 'Kidney, < 5 year, Rating <= 2' intercept 0 1 0 0 0 0 class 0 1 0 0
         exper 1 0 class*exper 0 0 1 0 0 0 0 0 / ilink;
estimate 'Kidney, < 5 year, Rating <= 3' intercept 0 0 1 0 0 0 class 0 1 0 0
         exper 1 0 class*exper 0 0 1 0 0 0 0 0 / ilink;
estimate 'Kidney, < 5 year, Rating <= 4' intercept 0 0 0 1 0 0 class 0 1 0 0
         exper 1 0 class*exper 0 0 1 0 0 0 0 0 / ilink;
estimate 'Kidney, < 5 year, Rating <= 5' intercept 0 0 0 0 1 0 class 0 1 0 0
         exper 1 0 class*exper 0 0 1 0 0 0 0 0 / ilink;
estimate 'Kidney, < 5 year, Rating <= 6' intercept 0 0 0 0 0 1 class 0 1 0 0
         exper 1 0 class*exper 0 0 1 0 0 0 0 0 / ilink;
estimate 'Kidney, > 5 year, Rating = 1'  intercept 1 0 0 0 0 0 class 0 1 0 0
         exper 0 1 class*exper 0 0 0 1 0 0 0 0 / ilink;
estimate 'Kidney, > 5 year, Rating <= 2' intercept 0 1 0 0 0 0 class 0 1 0 0
         exper 0 1 class*exper 0 0 0 1 0 0 0 0 / ilink;
estimate 'Kidney, > 5 year, Rating <= 3' intercept 0 0 1 0 0 0 class 0 1 0 0
         exper 0 1 class*exper 0 0 0 1 0 0 0 0 / ilink;
estimate 'Kidney, > 5 year, Rating <= 4' intercept 0 0 0 1 0 0 class 0 1 0 0
         exper 0 1 class*exper 0 0 0 1 0 0 0 0 / ilink;
342 8 Generalized Linear Mixed Models for Categorical and Ordinal Responses
Part of the results is shown below. The analysis of variance shows that the class of
bean (Class), the experience of the evaluator (Exper), and their interaction
(Class×Exper) all have a significant effect on the bean canning scores (P = 0.0001).
That is, the difference between judges with more and less years of experience
depends on the class of bean (Table 8.15).
The contrasts address this interaction (Table 8.16). For each class of bean, the
hypothesis tested is
\[
H_0:\ \pi_{\text{class of bean},\ <5\ \text{years of experience}} = \pi_{\text{class of bean},\ >5\ \text{years of experience}}
\]
The results show that judges with more than 5 years of experience differ from
those with less than 5 years of experience in evaluating the quality of canned kidney
and pinto beans (Table 8.16). With the “solution” option in the model specification,
the fixed parameter estimates table (Table 8.17) shows the maximum likelihood
estimates of the fixed effects parameters. In this table, we can observe the values of
the estimated intercepts: η1 = -4.6421 defines the boundary between the categories
“1 = very undesirable” and “2 = moderately undesirable,” whereas η2 = -2.9316 defines
the boundary between the categories “2 = moderately undesirable” and “3 = slightly
undesirable.” The third intercept, η3 = -1.3995, defines the boundary between the
categories “3 = slightly undesirable” and “4 = neither undesirable nor desirable,”
and so on.
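The quantity tested by each contrast can be written out explicitly. With the reference coding of Table 8.17 (Exper 2 and its interaction terms set to 0), the contrast “exper 1 -1 class*exper 1 -1” for a given class reduces to β1 + (αβ)i1 on the cumulative logit scale. The sketch below is illustrative Python, not part of the original SAS program; the variable names are hypothetical, and it reports point estimates only, not the F-tests of Table 8.16.

```python
import math

# Estimates copied from Table 8.17; Exper 2 (> 5 years) is the reference level.
BETA_1 = 1.0284  # Exper 1 (< 5 years)
INTERACTION_1 = {"Black": -0.8066, "Kidney": -0.6457, "Navy": -1.0072, "Pinto": 0.0}

# Experience effect within each class on the cumulative logit scale.
experience_effect = {cls: BETA_1 + ab for cls, ab in INTERACTION_1.items()}

# Corresponding cumulative odds ratios (< 5 years versus > 5 years).
experience_odds_ratio = {cls: math.exp(v) for cls, v in experience_effect.items()}

for cls in experience_effect:
    print(f"{cls:7s} effect = {experience_effect[cls]:.4f}  "
          f"odds ratio = {experience_odds_ratio[cls]:.3f}")
```

A positive effect means that less experienced judges have higher odds of giving a rating at or below any cutpoint, that is, of rating more severely.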
The estimated effects of bean type (αi), judge experience (βj), and their interaction
(αβ)ij are shown in Table 8.17. From these values, we can estimate the linear predictors
for each of the categories. For example, the linear predictor for canned black beans
evaluated by an inexperienced judge who assigns the category “1 = very undesirable” is
η111 = η1 + α1 + β1 + (αβ)11 = -4.6421 + 1.9670 + 1.0284 - 0.8066 = -2.4533; for
category “2 = moderately undesirable,” it is
η211 = η2 + α1 + β1 + (αβ)11 = -2.9316 + 1.9670 + 1.0284 - 0.8066 = -0.7428; and for
category “3 = slightly undesirable,” it is
η311 = η3 + α1 + β1 + (αβ)11 = -1.3995 + 1.9670 + 1.0284 - 0.8066 = 0.7893.
Table 8.17 Maximum likelihood estimation of the estimated parameters in the fixed effects
solution of canned bean quality ratings in the multinomial cumulative logit model
Fixed parameter estimates
Effect       Rating  Class   Exper  Parameter  Estimate   Standard error  DF    t-value  Pr > |t|
Intercept    1                      η1         -4.6421    0.1363          2779  -34.05   <0.0001
Intercept    2                      η2         -2.9316    0.1057          2779  -27.74   <0.0001
Intercept    3                      η3         -1.3995    0.09643         2779  -14.51   <0.0001
Intercept    4                      η4          0.004287  0.09230         2779    0.05   0.9630
Intercept    5                      η5          1.4191    0.1026          2779   13.84   <0.0001
Intercept    6                      η6          3.8925    0.2346          2779   16.59   <0.0001
Class                Black          α1          1.9670    0.1318          2779   14.93   <0.0001
Class                Kidney         α2          1.0472    0.1342          2779    7.80   <0.0001
Class                Navy           α3          1.3076    0.1345          2779    9.72   <0.0001
Class                Pinto          α4          0         .               .       .      .
Exper                        1      β1          1.0284    0.1350          2779    7.62   <0.0001
Exper                        2                  0         .               .       .      .
Class*Exper          Black   1      αβ11       -0.8066    0.1894          2779   -4.26   <0.0001
Class*Exper          Black   2                  0         .               .       .      .
Class*Exper          Kidney  1      αβ21       -0.6457    0.1912          2779   -3.38   0.0007
Class*Exper          Kidney  2                  0         .               .       .      .
Class*Exper          Navy    1      αβ31       -1.0072    0.1969          2779   -5.12   <0.0001
Class*Exper          Navy    2                  0         .               .       .      .
Class*Exper          Pinto   1      αβ41        0         .               .       .      .
Class*Exper          Pinto   2                  0         .               .       .      .
Table 8.18 Estimates on the model scale (Estimate) and on the data scale (Mean) based on judges’
experience in canned bean quality ratings in the multinomial cumulative logit model
Estimates
Label                          Estimate  Standard error  DF    t-value  Pr > |t|  Mean     Standard error mean
Black, <5 years, score = 1     -2.4533   0.1292          2779  -18.99   <0.0001   0.07920  0.009419
Black, <5 years, score ≤ 2     -0.7428   0.1004          2779   -7.40   <0.0001   0.3224   0.02194
Black, <5 years, score ≤ 3      0.7893   0.1008          2779    7.83   <0.0001   0.6877   0.02164
Black, <5 years, score ≤ 4      2.1931   0.1076          2779   20.38   <0.0001   0.8996   0.009716
Black, <5 years, score ≤ 5      3.6079   0.1238          2779   29.15   <0.0001   0.9736   0.003180
Black, <5 years, score ≤ 6      6.0814   0.2467          2779   24.65   <0.0001   0.9977   0.000561
Black, >5 years, score = 1     -2.6751   0.1264          2779  -21.17   <0.0001   0.06446  0.007621
Black, >5 years, score ≤ 2     -0.9646   0.09577         2779  -10.07   <0.0001   0.2760   0.01913
Black, >5 years, score ≤ 3      0.5675   0.09314         2779    6.09   <0.0001   0.6382   0.02151
Black, >5 years, score ≤ 4      1.9713   0.09967         2779   19.78   <0.0001   0.8778   0.01069
Black, >5 years, score ≤ 5      3.3861   0.1170          2779   28.95   <0.0001   0.9673   0.003704
Black, >5 years, score ≤ 6      5.8595   0.2434          2779   24.07   <0.0001   0.9972   0.000690
Kidney, <5 years, score = 1    -3.2122   0.1333          2779  -24.11   <0.0001   0.03871  0.004958
Kidney, <5 years, score ≤ 2    -1.5017   0.1018          2779  -14.74   <0.0001   0.1822   0.01517
Similarly, for a judge with more than 5 years of experience, the “Mean” column of
Table 8.18 gives a probability of π112 = 0.06446 of assigning a rating of 1 to canned
black beans. To calculate the probability that a judge with less than 5 years of
experience would assign a rating of 2 (2 = moderately undesirable) to canned black
beans, we start from the cumulative probability of 0.3224, which corresponds to
π111 + π211, from which we get π211 = 0.3224 - π111 = 0.3224 - 0.0792 = 0.2432 ≈ 0.24.
On the other hand, for a judge with experience (>5 years), the probability of assigning
a score of 2 to canned black beans is π212 = 0.2760 - π112 = 0.2760 - 0.06446 = 0.2115.
Following the same procedure, the other probabilities for the rest of the categories
are obtained. The probabilities calculated for each of the categories are shown in
Table 8.19 and can be seen in Fig. 8.2.
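The whole procedure (linear predictor, inverse link, differencing) can be reproduced from the estimates in Table 8.17. The sketch below is illustrative Python, not part of the original SAS program; the function and variable names are hypothetical. It assumes the reference coding shown in Table 8.17, with Pinto and Exper 2 set to 0.

```python
import math

# Estimates copied from Table 8.17 (Pinto and Exper 2 are the reference levels).
INTERCEPTS = [-4.6421, -2.9316, -1.3995, 0.004287, 1.4191, 3.8925]
CLASS_EFFECT = {"Black": 1.9670, "Kidney": 1.0472, "Navy": 1.3076, "Pinto": 0.0}
EXPER_EFFECT = {1: 1.0284, 2: 0.0}  # 1 = < 5 years, 2 = > 5 years
INTERACTION = {("Black", 1): -0.8066, ("Kidney", 1): -0.6457, ("Navy", 1): -1.0072}

def inverse_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def rating_probabilities(bean_class, exper):
    """Probabilities of ratings 1..7 for one bean class and experience level."""
    shift = (CLASS_EFFECT[bean_class] + EXPER_EFFECT[exper]
             + INTERACTION.get((bean_class, exper), 0.0))
    cumulative = [inverse_logit(eta_c + shift) for eta_c in INTERCEPTS]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]

for bean_class in CLASS_EFFECT:
    for exper in (1, 2):
        probs = rating_probabilities(bean_class, exper)
        print(bean_class, exper, [round(p, 2) for p in probs])
```

Rounded to two decimals, the printed rows can be compared with Table 8.19, where J1 corresponds to `exper = 1` and J2 to `exper = 2`.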
8.6 Generalized Logit Models: Nominal Response Variables
In a model with unordered data, the polytomous response variable does not have an
ordered structure. Two classes of models, generalized logit models and conditional
logit models, can be used with nominal response data. A generalized logit model
consists of a combination of several binary logits estimated simultaneously. The logit
model is the simplest and best-known probabilistic choice model. However, the
multinomial cumulative logit model can be problematic to use because of its
inflexibility; a generalized logit model is more flexible than the traditional
multinomial cumulative logit model.
A generalized logit model shows the same flexibility as a probit model but is
much more tractable. Like cumulative logit and probit models, a generalized logit
model has C - 1 link functions, where C denotes the number of response categories.
Table 8.19 Probabilities calculated for each of the canned bean ratings
Class    Judge  Rating 1  Rating 2  Rating 3  Rating 4  Rating 5  Rating 6  Rating 7
Black    J1     0.08      0.24      0.37      0.21      0.07      0.02      0.00
Black    J2     0.06      0.21      0.36      0.24      0.09      0.03      0.00
Kidney   J1     0.04      0.14      0.33      0.30      0.14      0.05      0.00
Kidney   J2     0.03      0.11      0.28      0.33      0.18      0.07      0.01
Navy     J1     0.04      0.13      0.31      0.31      0.15      0.05      0.01
Navy     J2     0.03      0.13      0.31      0.31      0.15      0.06      0.01
Pinto    J1     0.03      0.10      0.28      0.33      0.18      0.07      0.01
Pinto    J2     0.01      0.04      0.15      0.30      0.30      0.17      0.02
Rating 1, ..., Rating 7 = ratings 1 to 7 on the hedonic scale; J1 = panelist with less
than 5 years’ experience, and J2 = panelist with more than 5 years’ experience
Fig. 8.2 Estimated probabilities for each category of the acceptability of canned beans, according
to the experience of the panelist (judge). Vertical axis: probability for each category; horizontal
axis: panelist experience (J1, J2) within bean class (Black, Kidney, Navy, Pinto)
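\[
\begin{aligned}
\eta_1 &= \log\left(\frac{\pi_{1ij}}{\pi_{Cij}}\right) = \alpha_1 + X\beta_1 + Zb_1\\
\eta_2 &= \log\left(\frac{\pi_{2ij}}{\pi_{Cij}}\right) = \alpha_2 + X\beta_2 + Zb_2\\
&\ \ \vdots\\
\eta_{C-1} &= \log\left(\frac{\pi_{(C-1)ij}}{\pi_{Cij}}\right) = \alpha_{C-1} + X\beta_{C-1} + Zb_{C-1}
\end{aligned}
\]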
Given the different effects in the models, the intercepts (α´s), β´s, and b´s vary
across the pairs of response variable categories for each link function. Using algebra,
it can be shown that the general form of the inverse of the link functions is given by
\[
\pi_c = \frac{e^{\eta_c}}{1+\sum_{c'=1}^{C-1} e^{\eta_{c'}}}, \qquad c = 1, 2, \ldots, C-1
\]
In practice, cumulative models are used for analyzing ordinal data and generalized
logit models for nominal data. Returning to Example 8.3.1, we will now implement
the analysis with a generalized logit model. This model relaxes the proportional odds
assumption, but it is less parsimonious than the proportional odds model because it
fits C - 1 binary logit models, where C is the number of categories of the response
variable. The linear predictor and distribution are the same as in the previous
example.
The following GLIMMIX syntax implements the analysis of the generalized logit
model:
Table 8.21 Maximum likelihood estimates on the model scale (Estimate) for footpad lesion level
in the multinomial generalized logit model
Solutions for fixed effects
Effect Category Trt Estimate Standard error DF t-value Pr > |t|
Intercept Without lesion 4.8525 1.0059 2 4.82 0.0404
Intercept Moderate lesion 4.2485 1.0071 2 4.22 0.0519
trt Without lesion 1 -3.8447 1.0330 790 -3.72 0.0002
trt Moderate lesion 1 -2.6478 1.0327 790 -2.56 0.0105
trt Without lesion 2 -1.1888 1.1618 790 -1.02 0.3065
trt Moderate lesion 2 -0.9651 1.1662 790 -0.83 0.4082
trt Without lesion 3 -2.7860 1.0585 790 -2.63 0.0087
trt Moderate lesion 3 -1.8326 1.0598 790 -1.73 0.0842
trt Without lesion 4 0 . . . .
trt Moderate lesion 4 0 . . . .
Most of the syntax of the program has already been explained. The “reference=”
option, specified in the command where the model is defined, is new to this program
and designates the reference category. If the “reference=” option is not specified,
GLIMMIX uses the last category by default. Moreover, the
“link = glogit” option prompts GLIMMIX to fit a generalized logit model. The
“bycat” option in the “estimate” command is unique to the generalized logit model.
Finally, the “ilink” option asks GLIMMIX to estimate all category probabilities for
each treatment, except those for the reference category. Part of the output is shown in
Table 8.20. The fixed effects test shows highly significant differences
(P = 0.0001) between treatments in the level of footpad lesions.
Unlike the cumulative logit model, in the generalized logit model, the estimates
of the fixed effects (treatments), as well as the intercepts, are separate for each
link function. For the estimation of linear predictors, we use the estimated values
of Table 8.21 (“Solutions for fixed effects”). The estimated intercepts α1 = 4.8525 and
α2 = 4.2485 are the log odds of the “Without” lesion and “Moderate” lesion categories,
respectively, relative to the reference category (“Severe” lesion) under the reference
treatment (treatment 4); unlike the intercepts of a cumulative model, they are not
boundaries between adjacent categories. For treatment 1, the estimated treatment effect
for the “Without” lesion category is τ1 = -3.8447 and for the “Moderate” lesion category,
it is τ1 = -2.6478. With these values, the linear predictors for the “Without” lesion
and “Moderate” lesion categories under treatment 1 are η01 = 4.8525 - 3.8447 = 1.0078
(reported as 1.0077 in Table 8.22, which is computed from unrounded estimates) and
η11 = 4.2485 - 2.6478 = 1.6007, respectively.
The estimated probabilities for each of the categories (“Without” lesion
and “Moderate” lesion) in each treatment, except for the reference category, are found
Table 8.22 Estimates on the model scale (“Estimate”) and on the data scale (“Mean”) for footpad
lesion level observed in treatments in the multinomial generalized logit model
Estimates
Label  Category         Estimate  Standard error  DF   t-value  Pr > |t|  Mean    Standard error mean
t=1    Without lesion   1.0077    0.2515          790  4.01     <0.0001   0.3150  0.03552
t=1    Moderate lesion  1.6007    0.2286          790  7.00     <0.0001   0.5700  0.03677
t=2    Without lesion   3.6637    0.5881          790  6.23     <0.0001   0.5850  0.03801
t=2    Moderate lesion  3.2834    0.5881          790  5.58     <0.0001   0.4000  0.03761
t=3    Without lesion   2.0665    0.3414          790  6.05     <0.0001   0.3929  0.03755
t=3    Moderate lesion  2.4159    0.3300          790  7.32     <0.0001   0.5573  0.03762
t=4    Without lesion   4.8525    1.0059          790  4.82     <0.0001   0.6433  0.03687
t=4    Moderate lesion  4.2485    1.0071          790  4.22     <0.0001   0.3517  0.03669
under the “Mean” column of Table 8.22. Because each generalized logit compares one
category with the reference category, these values are category probabilities, not
cumulative probabilities. The probability that a chick has no footpad lesion when
receiving treatment 1 is π01 = 0.315, and the probability of observing a moderate
lesion is π11 = 0.57. From these probabilities, we can estimate the probability of
observing a severe footpad lesion (the reference category) under treatment 1 as
π21 = 1 - (0.315 + 0.57) = 0.115. Following the same logic, we can
estimate the reference probabilities for the rest of the other treatments.
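The probabilities in Table 8.22 can be reproduced by applying the inverse generalized logit link to the estimates in Table 8.21. The sketch below is illustrative Python, not part of the original SAS program; the names are hypothetical. It assumes that “Severe” lesion is the reference category and treatment 4 the reference treatment, as the zero rows of Table 8.21 indicate.

```python
import math

# Estimates copied from Table 8.21; "Severe" and treatment 4 are reference levels.
INTERCEPT = {"Without": 4.8525, "Moderate": 4.2485}
TRT_EFFECT = {
    1: {"Without": -3.8447, "Moderate": -2.6478},
    2: {"Without": -1.1888, "Moderate": -0.9651},
    3: {"Without": -2.7860, "Moderate": -1.8326},
    4: {"Without": 0.0, "Moderate": 0.0},
}

def category_probabilities(trt):
    """Inverse generalized logit link for one treatment."""
    eta = {cat: INTERCEPT[cat] + TRT_EFFECT[trt][cat] for cat in INTERCEPT}
    denom = 1.0 + sum(math.exp(value) for value in eta.values())
    probs = {cat: math.exp(value) / denom for cat, value in eta.items()}
    probs["Severe"] = 1.0 / denom  # reference category
    return probs

for trt in TRT_EFFECT:
    probs = category_probabilities(trt)
    print(trt, {cat: round(p, 4) for cat, p in probs.items()})
```

Each treatment's three probabilities sum to 1 by construction, so the “Severe” value is the complement of the two tabulated means.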
Another important result is the odds ratio estimates. These estimates are shown in
Table 8.23.
These odds ratios compare, for treatments 1–3 relative to treatment 4, the odds of the
labeled category versus the reference category. These odds ratio values
can be derived from the estimated probabilities in each of the categories. For example,
under treatment 4 the probabilities that a chicken presents no lesion and a moderate
lesion are π04 = 0.6433 and π14 = 0.3517, respectively. From these probabilities, we
can estimate the probability of observing a severe lesion as follows:
π24 = 1 - (0.6433 + 0.3517) = 0.005. The estimated odds ratio of not observing a
lesion (“Without” lesion) between treatments 1 and 4 is
\[
\text{Odds ratio}_{\text{Trt1},\text{Trt4}} = \frac{\pi_{01}/\pi_{21}}{\pi_{04}/\pi_{24}} = \frac{0.315/0.115}{0.6433/0.005} = 0.0213
\]
the value provided in the odds ratio estimates table. If we compare the analysis using
the cumulative logit link and the generalized logit link, we observe insignificant
Consider a study of the effects of various additives on the flavor of cheese.
Researchers tested 4 cheese additives and obtained 52 response
ratings for each additive. Each response was measured on a scale of 9 categories
ranging from “I dislike it very much” (1) to “I like it very much” or “excellent flavor” (9).
The data come from McCullagh and Nelder (1989) (Table 8.24).
The components of the GLMM with an ordinal multinomial response are as
follows:
Distribution: (y1i, y2i, ..., y9i) ~ Multinomial(Ni, π1i, π2i, ..., π9i), where
y1i, y2i, ..., y9i are the observed frequencies of the responses in each category c of
the hedonic scale (1 = very undesirable, ..., 5 = neither desirable nor undesirable, ...,
9 = very desirable) for additive i.
Linear predictor: η(c)i = ηc + αi, where η(c)i is the cth link (c = 1, 2, ..., 8) for
additive type i, ηc is the intercept for the cth link, and αi is the fixed effect due to
the ith additive. The link functions for each category are as follows:
\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{1-\pi_{1i}}\right) &= \eta_{1i}\\
\log\left(\frac{\pi_{1i}+\pi_{2i}}{1-(\pi_{1i}+\pi_{2i})}\right) &= \eta_{2i}\\
\log\left(\frac{\pi_{1i}+\pi_{2i}+\pi_{3i}}{1-(\pi_{1i}+\pi_{2i}+\pi_{3i})}\right) &= \eta_{3i}\\
\log\left(\frac{\pi_{1i}+\cdots+\pi_{4i}}{1-(\pi_{1i}+\cdots+\pi_{4i})}\right) &= \eta_{4i}\\
\log\left(\frac{\pi_{1i}+\cdots+\pi_{5i}}{1-(\pi_{1i}+\cdots+\pi_{5i})}\right) &= \eta_{5i}\\
\log\left(\frac{\pi_{1i}+\cdots+\pi_{6i}}{1-(\pi_{1i}+\cdots+\pi_{6i})}\right) &= \eta_{6i}\\
\log\left(\frac{\pi_{1i}+\cdots+\pi_{7i}}{1-(\pi_{1i}+\cdots+\pi_{7i})}\right) &= \eta_{7i}\\
\log\left(\frac{\pi_{1i}+\cdots+\pi_{8i}}{1-(\pi_{1i}+\cdots+\pi_{8i})}\right) &= \eta_{8i}
\end{aligned}
\]
proc glimmix ;
class id additive scale;
model scale(order=data)= additive/dist=Multinomial link=clogit
solution oddsratio;
estimate 'c=1, a=1' intercept 1 0 0 0 0 0 0 0 additive 1 0 0 0,
'c=2, a=1' intercept 0 1 0 0 0 0 0 0 additive 1 0 0 0,
'c=3, a=1' intercept 0 0 1 0 0 0 0 0 additive 1 0 0 0,
'c=4, a=1' intercept 0 0 0 1 0 0 0 0 additive 1 0 0 0,
Part of the results is shown in Table 8.25. The analysis of variance
shows that the type of additive used in the manufacture of cheese significantly affects
the degree of consumer acceptance (P = 0.0001). That is, the type of additive affects
the sensory characteristics of the cheese.
The hypothesis contrasts are presented in Table 8.26. The hypotheses tested are as
follows:
\[
H_0:\ \pi_{\text{additive } i} = \pi_{\text{additive } j}, \qquad \forall\, i \neq j
\]
The results show that the additives provide different sensory characteristics that
are reflected in the evaluation of preference.
With the “solution” option in the model specification, Table 8.27 (fixed parameter
estimates) shows the solution of the maximum likelihood estimates for the fixed
Table 8.26 Contrast of hypothesis in the acceptance of cheese made with four additives
Contrasts
Label Num DF Den DF F-value Pr > F
Additive effect 1 vs. 2 1 197 61.13 <0.0001
Additive effect 1 vs. 3 1 197 21.19 <0.0001
Additive effect 2 vs. 3 1 197 19.14 <0.0001
Additive effect 2 vs. 4 1 197 108.45 <0.0001
Additive effect 3 vs. 4 1 197 62.04 <0.0001
Table 8.27 Maximum likelihood estimates of the fixed effects in the preference ratings of cheese
made with different types of additives in the multinomial cumulative logit model
Fixed parameter estimates
Effect     Parameter  Scale  Additive  Estimate  Standard error  DF   t-value  Pr > |t|
Intercept  η1         1                -7.0802   0.5640          197  -12.55   <0.0001
Intercept  η2         2                -6.0250   0.4764          197  -12.65   <0.0001
Intercept  η3         3                -4.9254   0.4257          197  -11.57   <0.0001
Intercept  η4         4                -3.8568   0.3880          197   -9.94   <0.0001
Intercept  η5         5                -2.5206   0.3453          197   -7.30   <0.0001
Intercept  η6         6                -1.5685   0.3122          197   -5.02   <0.0001
Intercept  η7         7                -0.06688  0.2738          197   -0.24   0.8073
Intercept  η8         8                 1.4930   0.3357          197    4.45   <0.0001
Additive   α1                1          1.6128   0.3805          197    4.24   <0.0001
Additive   α2                2          4.9646   0.4767          197   10.41   <0.0001
Additive   α3                3          3.3227   0.4218          197    7.88   <0.0001
Additive   α4                4          0        .               .      .      .
effects parameters. In this table, we observe the values of the estimated intercepts:
η1 = -7.0802 defines the boundary between categories “1” and “2,” whereas
η2 = -6.0250 defines the boundary between categories “2” and “3.” The third
intercept, η3 = -4.9254, defines the boundary between categories “3” and “4,”
and so forth. The estimated effects of the additive type (αi, i = 1, 2, 3, and 4) are
1.6128, 4.9646, 3.3227, and 0, respectively. From these values, linear predictors are
estimated for each of the categories.
For example, the estimated linear predictor for a cheese made with additive 1, where
the evaluator (consumer) assigns it category “1,” is
η11 = η1 + α1 = -7.0802 + 1.6128 = -5.4674; for category “2,” it is
η21 = η2 + α1 = -6.0250 + 1.6128 = -4.4122; for category “3,” it is
η31 = η3 + α1 = -4.9254 + 1.6128 = -3.3126; and for category “4,” it is
η41 = η4 + α1 = -3.8568 + 1.6128 = -2.2440. These values are shown in the “Estimate”
column of Table 8.28; the linear predictors for the other categories are calculated
similarly for each type of additive.
The “estimate” command, in conjunction with the “ilink” option, prompts GLIMMIX to
calculate from the estimates in Table 8.27 the values of the linear predictors η(c)i,
tabulated in the “Estimate” column of Table 8.28, and the corresponding estimated
cumulative probabilities for each treatment, tabulated in the “Mean” column, for all
categories except the last one.
From Table 8.28 (Estimates), we obtain the probabilities for each category from the
values reported under the “Mean” column. In this case, π11 = 0.004205.
This value is obtained by applying the inverse link to the linear predictor
η11 = -5.4674: π11 = 1/(1 + exp(5.4674)) = 0.004205. To calculate the probability that
a panelist would assign a rating of 2 to cheese made with additive 1, we use the
cumulative probability of 0.01198, which corresponds to π11 + π21. From
this value, we obtain π21 = 0.01198 - π11 = 0.01198 - 0.004205 = 0.007775, and for
the probability of assigning a rating of 3 to cheese made with additive 1,
π31 = 0.03514 - (π11 + π21) = 0.03514 - 0.01198 = 0.02316. Following the
same procedure, we obtain the other probabilities for the rest of the categories of each
of the additives used in the manufacturing of cheese, which are tabulated in Table 8.29
and can be seen in Fig. 8.3.
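As a numerical check, the rows of Table 8.29 can be regenerated from the estimates in Table 8.27, and the most probable rating for each additive can be read off. The sketch below is illustrative Python, not part of the original SAS program; the names are hypothetical, and additive 4 is assumed to be the reference level as shown in Table 8.27.

```python
import math

# Estimates copied from Table 8.27 (additive 4 is the reference level).
INTERCEPTS = [-7.0802, -6.0250, -4.9254, -3.8568, -2.5206, -1.5685, -0.06688, 1.4930]
ADDITIVE_EFFECT = {1: 1.6128, 2: 4.9646, 3: 3.3227, 4: 0.0}

def inverse_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def rating_probabilities(additive):
    """Probabilities of ratings 1..9 for one additive."""
    cumulative = [inverse_logit(eta_c + ADDITIVE_EFFECT[additive]) for eta_c in INTERCEPTS]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]

table = {a: rating_probabilities(a) for a in ADDITIVE_EFFECT}
# Rating with the highest estimated probability for each additive.
modal_rating = {a: probs.index(max(probs)) + 1 for a, probs in table.items()}

for additive, probs in table.items():
    print(additive, [round(p, 4) for p in probs], "mode:", modal_rating[additive])
```

The modal ratings give a compact summary of where each additive's probability mass sits on the 9-point scale.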
Figure 8.3 shows the estimated probability of each flavor rating for each of the
additives (some probability values are omitted from the figure to avoid
overlapping labels). It can be seen that additive 1 primarily receives ratings of 5–7; additive
2 primarily receives ratings of 2–5; additive 3 primarily receives ratings of 4–6; and
additive 4 primarily receives ratings of 7–9.
The odds ratio results (Table 8.30) show the preferences more clearly. For
example, the odds ratio for additive 1 vs. 4 is 5.017, which means that the odds of
receiving a lower score are 5.017 times as high for the first additive as for the fourth.
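The value 5.017 is the exponential of the additive 1 effect in Table 8.27, and under the cumulative logit model the same ratio holds at every cutpoint of the scale. The sketch below is illustrative Python, not part of the original SAS program; the names are hypothetical. It checks both points.

```python
import math

# Estimates copied from Table 8.27 (additive 4 is the reference level).
INTERCEPTS = [-7.0802, -6.0250, -4.9254, -3.8568, -2.5206, -1.5685, -0.06688, 1.4930]
ALPHA_1 = 1.6128  # effect of additive 1

def cumulative_odds(eta):
    """Odds of a rating at or below the cutpoint whose linear predictor is eta."""
    p = 1.0 / (1.0 + math.exp(-eta))
    return p / (1.0 - p)

# Odds ratio, additive 1 versus additive 4, evaluated at each of the 8 cutpoints.
ratios = [cumulative_odds(eta_c + ALPHA_1) / cumulative_odds(eta_c) for eta_c in INTERCEPTS]

print([round(r, 3) for r in ratios])
print(round(math.exp(ALPHA_1), 3))
```

Because the ratio does not depend on the cutpoint, a single odds ratio per additive summarizes the comparison with additive 4.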
8.7 Exercises
Exercise 8.7.1 The dataset for this exercise corresponds to the results of 9 judges
who rated 2 classes of wine, namely, white wine (WW = 1) and red wine (RW = 2),
and, within each wine class, they rated 10 wines on a scale of 1–20 points. The
minimum rating for a particular wine was 7, and the maximum rating was 19.5. For
didactic purposes, ratings between 7 and 11 were classified as low quality, ratings
between 12 and 15 as medium quality, and ratings above 15 as excellent quality.
The data from the wine evaluation experiment are shown in Table 8.31
under the columns “Judge” (wine evaluator panelist), “Wine_type” (white wine:
1, red wine: 2), “Quality” (low, medium, and excellent), and the frequency of the
observed qualities (“y”).
(a) Fit the cumulative logit proportional odds model to these data. Perform a
complete and appropriate analysis of the data, focusing on:
(i) An evaluation of the effects of the combination of treatments
(ii) Interpretation of the odds ratios
(iii) The expected probability per category for each treatment
Table 8.28 Estimates on the model scale (Estimate) and on the data scale (Mean) based on judges’
preference ratings of cheese made with different types of additives in the multinomial cumulative
logit model
Estimates
Label         Estimate  Standard error  DF   t-value  Pr > |t|  Mean      Standard error mean
c = 1, a = 1  -5.4674   0.5236          197  -10.44   <0.0001   0.004205  0.002192
c = 2, a = 1  -4.4122   0.4278          197  -10.31   <0.0001   0.01198   0.005064
c = 3, a = 1  -3.3126   0.3700          197   -8.95   <0.0001   0.03514   0.01255
c = 4, a = 1  -2.2440   0.3267          197   -6.87   <0.0001   0.09587   0.02832
c = 5, a = 1  -0.9078   0.2833          197   -3.20   0.0016    0.2875    0.05804
c = 6, a = 1   0.04425  0.2646          197    0.17   0.8673    0.5111    0.06611
c = 7, a = 1   1.5459   0.3017          197    5.12   <0.0001   0.8243    0.04369
c = 8, a = 1   3.1058   0.4057          197    7.65   <0.0001   0.9571    0.01665
c = 1, a = 2  -2.1155   0.4106          197   -5.15   <0.0001   0.1076    0.03942
c = 2, a = 2  -1.0603   0.3009          197   -3.52   0.0005    0.2572    0.05749
c = 3, a = 2   0.03922  0.2735          197    0.14   0.8861    0.5098    0.06836
c = 4, a = 2   1.1078   0.2969          197    3.73   0.0002    0.7517    0.05542
c = 5, a = 2   2.4441   0.3397          197    7.19   <0.0001   0.9201    0.02497
c = 6, a = 2   3.3961   0.3724          197    9.12   <0.0001   0.9676    0.01168
c = 7, a = 2   4.8978   0.4249          197   11.53   <0.0001   0.9926    0.003124
c = 8, a = 2   6.4576   0.5045          197   12.80   <0.0001   0.9984    0.000789
c = 1, a = 3  -3.7575   0.4761          197   -7.89   <0.0001   0.02281   0.01061
c = 2, a = 3  -2.7023   0.3677          197   -7.35   <0.0001   0.06284   0.02165
c = 3, a = 3  -1.6027   0.3001          197   -5.34   <0.0001   0.1676    0.04186
c = 4, a = 3  -0.5341   0.2556          197   -2.09   0.0379    0.3696    0.05955
c = 5, a = 3   0.8021   0.2610          197    3.07   0.0024    0.6904    0.05579
Table 8.29 Probabilities calculated for each of the ratings by additives used in the manufacture of cheese
            Rating 1  Rating 2  Rating 3  Rating 4  Rating 5  Rating 6  Rating 7  Rating 8  Rating 9
Additive 1  0.00421   0.00778   0.02316   0.06073   0.19163   0.2236    0.3132    0.1328    0.0429
Additive 2  0.1076    0.1496    0.2526    0.2419    0.1684    0.0475    0.025     0.0058    0.0016
Additive 3  0.02281   0.04003   0.10476   0.202     0.3208    0.1621    0.1104    0.0291    0.008
Additive 4  0.00084   0.00157   0.0048    0.01349   0.05373   0.09797   0.3109    0.3332    0.1835
Note: Rating 1, ..., Rating 9 = ratings 1 to 9 on the hedonic scale
Fig. 8.3 Estimated probabilities for the categories of acceptability for the cheese according to the
type of additive. Vertical axis: probability of acceptability; horizontal axis: Additive 1, Additive 2,
Additive 3, Additive 4
Table 8.32 Results of the tuber experiment. V = variety, C = string, B = block, D = damage
(sd = no damage, dl = slight damage, dm = moderate damage, ds = severe damage), and
Y = observed frequency
V C B D Y V C B D Y V C B D Y
1 1 1 sd 5 2 1 1 sd 4 3 1 1 sd 3
1 1 1 dl 14 2 1 1 dl 5 3 1 1 dl 2
1 1 1 dm 1 2 1 1 dm 4 3 1 1 dm 8
1 1 1 ds 0 2 1 1 ds 7 3 1 1 ds 7
1 2 1 sd 6 2 2 1 sd 8 3 2 1 sd 18
1 2 1 dl 11 2 2 1 dl 3 3 2 1 dl 1
1 2 1 dm 1 2 2 1 dm 0 3 2 1 dm 0
1 2 1 ds 0 2 2 1 ds 0 3 2 1 ds 0
1 3 1 sd 6 2 3 1 sd 3 3 3 1 sd 5
1 3 1 dl 13 2 3 1 dl 10 3 3 1 dl 7
1 3 1 dm 0 2 3 1 dm 6 3 3 1 dm 4
1 3 1 ds 0 2 3 1 ds 1 3 3 1 ds 4
1 4 1 sd 2 2 4 1 sd 1 3 4 1 sd 1
1 4 1 dl 9 2 4 1 dl 3 3 4 1 dl 4
1 4 1 dm 6 2 4 1 dm 11 3 4 1 dm 6
1 4 1 ds 3 2 4 1 ds 5 3 4 1 ds 9
1 5 1 sd 11 2 5 1 sd 16 3 5 1 sd 12
1 5 1 dl 8 2 5 1 dl 3 3 5 1 dl 7
1 5 1 dm 0 2 5 1 dm 1 3 5 1 dm 1
1 5 1 ds 0 2 5 1 ds 0 3 5 1 ds 0
1 6 1 sd 12 2 6 1 sd 16 3 6 1 sd 16
1 6 1 dl 5 2 6 1 dl 3 3 6 1 dl 3
1 6 1 dm 2 2 6 1 dm 0 3 6 1 dm 0
1 6 1 ds 0 2 6 1 ds 0 3 6 1 ds 1
1 7 1 sd 8 2 7 1 sd 11 3 7 1 sd 20
1 7 1 dl 12 2 7 1 dl 9 3 7 1 dl 0
1 7 1 dm 0 2 7 1 dm 0 3 7 1 dm 0
1 7 1 ds 0 2 7 1 ds 0 3 7 1 ds 0
1 8 1 sd 12 2 8 1 sd 10 3 8 1 sd 18
1 8 1 dl 4 2 8 1 dl 10 3 8 1 dl 2
1 8 1 dm 0 2 8 1 dm 0 3 8 1 dm 0
1 8 1 ds 0 2 8 1 ds 0 3 8 1 ds 0
1 1 2 sd 5 2 1 2 sd 5 3 1 2 sd 6
1 1 2 dl 31 2 1 2 dl 7 3 1 2 dl 8
1 1 2 dm 2 2 1 2 dm 5 3 1 2 dm 5
1 1 2 ds 1 2 1 2 ds 1 3 1 2 ds 1
1 2 2 sd 6 2 2 2 sd 13 3 2 2 sd 12
1 2 2 dl 11 2 2 2 dl 6 3 2 2 dl 6
1 2 2 dm 1 2 2 2 dm 1 3 2 2 dm 1
1 2 2 ds 0 2 2 2 ds 0 3 2 2 ds 1
1 3 2 sd 5 2 3 2 sd 5 3 3 2 sd 10
less experienced judges have a more severe rating than do more experienced judges
but also that experience should have little or no effect on the white beans for which
the canning procedure was developed. Judges are stratified for the purpose of
analysis by experience (less than 5 years, greater than 5 years).
Counts by canning quality, judge experience, and bean breeding lines are listed in
the following table (Table 8.33).
(a) Fit the generalized logit model to these data. Perform a complete and appropriate
analysis of the data, focusing on:
(i) An evaluation of the effects of the combination of treatments
(ii) Interpretation of the odds ratios
(iii) The expected probability per category for each treatment
(b) Test whether the proportional odds assumption is viable. Cite relevant evidence
to support your conclusion regarding the adequacy of the assumption.
Exercise 8.7.5 An experiment was conducted to look at the damage levels (ordinal
categories 0–4) of Picea sitchensis shoots in two time periods (10 November and
8 December), at four temperatures (different on each date), and at four ozone levels
(Table 8.34).
(a) Fit the cumulative logit proportional odds model to these data. Perform a
complete and appropriate analysis of the data, focusing on:
(i) An evaluation of the effects of the combination of treatments
(ii) Interpretation of the odds ratios
(iii) The expected probability per category for each treatment
(b) Test whether the proportional odds assumption is viable. Cite relevant evidence
to support your conclusion regarding the adequacy of the assumption.
Appendix
Chapter 9
Generalized Linear Mixed Models
for Repeated Measurements
9.1 Introduction
Repeated measures data, also known as longitudinal data, are those derived from
experiments in which observations are made on the same experimental units at
various planned times. These experiments can be of the regression or analysis of
variance (ANOVA) type, can contain two or more treatments, and are set up using
familiar designs, such as completely randomized design (CRD), randomized com-
plete block design (RCBD), or randomized incomplete blocks, if blocking is appro-
priate, or using row and column designs such as Latin squares when appropriate.
Repeated measures designs are widely used in the biological sciences and are fairly
well understood for normally distributed data but less so with binary, ordinal, count
data, and so on. Nevertheless, recent developments in statistical computing meth-
odology and software have greatly increased the number of tools available for
analyzing categorical data.
A generalized linear mixed model (GLMM) is one of the most useful and
sophisticated structures in modern statistics, as it allows complex structures to be
incorporated into the framework of a general linear model. Fitting such models has
been the subject of much research over the last three decades. GLMMs for repeated
measures combine generalized linear model (GLM) theory (e.g., a binomial,
multinomial, or Poisson response variable) with linear mixed-effects models.
Experimentation is sometimes misunderstood as involving only the manipulation of
the levels of independent variables and the observation of subsequent responses in
dependent variables. Independent variables whose levels are determined or set by the
experimenter are said to have fixed effects. Random effects, where the levels are
assumed to be randomly selected from an infinite population of possible levels, are
also very common.
Many variables of interest in research are not fully amenable to experimental
manipulation but can nevertheless be studied by considering them to have random
effects. For example, the genetic composition of individuals of a species cannot be
Table 9.1 Turf quality of five grass varieties (Low = low, Med = medium, Excel = excellent,
Sept = September)
May July Sept
Variety No. of plots Low Med Excel Low Med Excel Low Med Excel
1 18 4 10 4 1 9 8 0 12 6
2 17 2 11 4 0 7 10 0 9 8
3 17 2 11 4 2 8 7 2 11 4
4 18 8 7 3 4 8 6 4 13 1
5 18 1 11 6 3 4 11 3 6 9
The data were obtained from an experiment studying the turf quality of five grass
varieties. Each variety was sown independently in 17 or 18 plots. The plots
(experimental units) were evaluated in May, July, and September of the growing
season, and turf quality was classified on an ordinal scale into three categories:
low quality, medium quality, and excellent quality, as shown in Table 9.1.
The components of the GLMM with repeated measures and an ordinal multinomial
response are as follows:
Distributions: $y_{1ij}, y_{2ij}, y_{3ij} \mid \rho_{ij} \sim \text{Multinomial}(N_{ij}, \pi_{1ij}, \pi_{2ij}, \pi_{3ij})$, where $y_{1ij}$, $y_{2ij}$, and $y_{3ij}$
are the observed frequencies of the responses (turf quality) in each category $c$
(low, medium, and excellent), and $\rho_{ij}$ is the random effect due to the combination
variety × month (measurement time), assuming $\rho_{ij} \sim N(0, \sigma^2_{\rho})$.
Linear predictor: $\eta_{(c)ij} = \eta_c + \tau_i + \rho_{ij}$, where $\eta_{(c)ij}$ is the $c$th link ($c = 1, 2$) in the $ij$th
combination variety × month, $\eta_c$ is the intercept for the $c$th link, $\tau_i$ is the fixed
effect due to the $i$th treatment, and $\rho_{ij}$ is the random effect due to the $ij$th
measurement of variety × month, with $\rho_{ij} \sim N(0, \sigma^2_{\text{variety} \times \text{month}})$. The link functions
for the two cumulative categories are as follows:
$$\log\left(\frac{\pi_{1ij}}{1 - \pi_{1ij}}\right) = \eta_{(1)ij}$$
$$\log\left(\frac{\pi_{1ij} + \pi_{2ij}}{1 - (\pi_{1ij} + \pi_{2ij})}\right) = \eta_{(2)ij}$$
The following Statistical Analysis Software (SAS) program fits a repeated mea-
sures GLMM with an ordinal response.
Mixed models have advantages over fixed-effects linear models (Littell et al. 1996)
because they incorporate both fixed effects (Xβ) and random effects (Zb), which
allows different variance–covariance structures to be fitted to repeated measures
experiments (with or without missing data) and compared to see which one best fits
the model (Henderson 1984; Smith et al. 2005). Building an adequate model therefore
involves selecting the covariance structure that best fits the dataset. The
information criteria provided by proc GLIMMIX (minus two times the restricted log
likelihood (-2RLL), the Akaike information criterion (AIC), the corrected Akaike
information criterion (AICC), the Bayesian information criterion (BIC), etc.) are
used as fit measures to select the variance structure (compound symmetry (“CS”),
first-order autoregressive (“AR(1)”), Toeplitz (“Toep(1)”), or unstructured (“UN”))
that best models the dataset.
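As a rough illustration of how such criteria rank candidate structures, the sketch below applies the usual textbook formulas to a -2 log-likelihood value. The candidate labels, likelihood values, parameter counts, and sample size are hypothetical, and statistical software may count parameters and observations differently (for example, under restricted likelihood), so this is not a reproduction of the GLIMMIX output.

```python
import math

def information_criteria(neg2_loglik, n_params, n_obs):
    """Textbook AIC, AICC, and BIC computed from a -2 log-likelihood value."""
    aic = neg2_loglik + 2 * n_params
    aicc = neg2_loglik + 2 * n_params * n_obs / (n_obs - n_params - 1)
    bic = neg2_loglik + n_params * math.log(n_obs)
    return {"AIC": aic, "AICC": aicc, "BIC": bic}

# Hypothetical candidates: label -> (-2 log-likelihood, number of covariance parameters)
candidates = {"A": (250.4, 2), "B": (247.9, 2), "C": (244.2, 6)}
n_obs = 15  # hypothetical number of observations

table = {name: information_criteria(m2ll, d, n_obs)
         for name, (m2ll, d) in candidates.items()}

# Smaller is better for every criterion; here the ranking uses AIC.
best_by_aic = min(table, key=lambda name: table[name]["AIC"])
print(best_by_aic, table[best_by_aic])
```

Candidate C has the smallest -2 log-likelihood but pays a penalty for its six parameters, which is the trade-off these criteria are meant to capture.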
Most of the commands have already been explained. To specify the covariance
structure to be modeled, set the “TYPE” option in the above program to each of
CS, AR(1), Toep(1), and UN in turn, fitting the model once per structure.
Part of the results is shown below.
According to the fit statistics (Table 9.2), the covariance structure that best fits the
dataset is Toeplitz of order 1 (Toep(1)). The type III tests of fixed effects, shown in
Table 9.3 part (a), indicate that grass variety provides different turfgrass qualities
9.2 Example of Turf Quality 381
Table 9.4 Estimated linear predictors and means on the model scale (Estimate) and on the data
scale (Mean) for observed turfgrass quality in grass varieties in the multinomial generalized logit
model
Estimates
Label Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
c = 1, var = 1 -2.0248 0.3018 10 -6.71 <0.0001 0.1166 0.03110
c = 2, var = 1 0.6222 0.2646 10 2.35 0.0405 0.6507 0.06013
c = 1, var = 2 -2.4659 0.3177 10 -7.76 <0.0001 0.07828 0.02292
c = 2, var = 2 0.1811 0.2667 10 0.68 0.5125 0.5452 0.06613
c = 1, var = 3 -1.8384 0.3040 10 -6.05 0.0001 0.1372 0.03599
c = 2, var = 3 0.8086 0.2760 10 2.93 0.0150 0.6918 0.05884
c = 1, var = 4 -0.9605 0.2791 10 -3.44 0.0063 0.2768 0.05588
c = 2, var = 4 1.6865 0.2992 10 5.64 0.0002 0.8438 0.03944
c = 1, var = 5 -2.4509 0.3219 10 -7.61 <0.0001 0.07937 0.02352
c = 2, var = 5 0.1961 0.2721 10 0.72 0.4875 0.5489 0.06737
(P = 0.0202). The “solution” option in the “Model” statement provides the
solution for the fixed effects of the model (intercepts and treatments), which we use to
estimate the linear predictors $\hat{\eta}_{ci} = \hat{\eta}_c + \widehat{\text{variety}}_i$ (part (b)).
The cumulative probabilities obtained from the “Estimate” values are tabulated
under the “Mean” column of Table 9.4.
From these values, we can observe that for the category “c = 1, var = 1,” the
value of the linear predictor is $\hat{\eta}_{11} = \hat{\eta}_1 + \widehat{\text{variety}}_1 = -2.0248$. Applying the inverse
link (inverse logit) to $\hat{\eta}_{11}$ gives the probability $\pi_{11} = 0.1166$ of observing “Low”-quality grass
of variety 1. For the category “c = 2, var = 1,” the inverse link of the linear
predictor is 0.6507, which is the estimate of the cumulative probability $\pi_{11} + \pi_{21}$. From this
value, we can obtain the probability that variety 1 provides grass of “Medium”
quality: since $\pi_{11} + \pi_{21} = 0.6507$, substituting the value of $\pi_{11}$ gives
$\pi_{21} = 0.6507 - 0.1166 = 0.5341$. With these two probability estimates,
$\pi_{11}$ and $\pi_{21}$, it is possible to estimate the probability that variety 1 will yield
“Excellent”-quality turf, which is $\pi_{31} = 1 - 0.6507 = 0.3493$. Likewise, we
obtain the values of the remaining probabilities $\pi_{ci}$ for the rest of the grass varieties.
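The same arithmetic can be repeated for every variety. The sketch below is an illustration in Python rather than part of the SAS analysis: it takes the two cumulative logits per variety from the “Estimate” column of Table 9.4, applies the inverse logit, and differences the cumulative probabilities to obtain the three category probabilities. The function and variable names are chosen here for illustration.

```python
import math

def inverse_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# (estimate for c = 1, estimate for c = 2) per variety, copied from Table 9.4
cumulative_logits = {
    1: (-2.0248, 0.6222),
    2: (-2.4659, 0.1811),
    3: (-1.8384, 0.8086),
    4: (-0.9605, 1.6865),
    5: (-2.4509, 0.1961),
}

def category_probabilities(eta1, eta2):
    """Turn two cumulative logits into Low, Medium, and Excellent probabilities."""
    p_low = inverse_logit(eta1)            # P(Low)
    p_low_or_medium = inverse_logit(eta2)  # P(Low) + P(Medium)
    return {
        "low": p_low,
        "medium": p_low_or_medium - p_low,
        "excellent": 1.0 - p_low_or_medium,
    }

probabilities = {variety: category_probabilities(*etas)
                 for variety, etas in cumulative_logits.items()}

for variety, probs in probabilities.items():
    print(variety, {name: round(p, 4) for name, p in probs.items()})
```

Because the second logit is always larger than the first for each variety, the differenced “Medium” probability is positive in every row.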
A cage experiment was used to investigate the effect of three insecticides on aphid
colonies with partial resistance to a common active compound. There were eight
treatments: all combinations of the three insecticides and a control (no insecticide)
with two types of colonies (susceptible or partially resistant). The experiment was
organized as an RCBD with six blocks of eight cages, and each cage was assigned a
treatment combination in each block. A colony of aphids was reared in each cage,
and the number of live aphids was recorded before insecticide treatment was applied
and then 2 and 6 days after application. Both hatches and deaths could occur within
each cage between evaluations. The dataset from this experiment is shown below
(Table 9.5).
Following the same reasoning as in previous examples, the components of the
GLMM with a Poisson response and repeated measures, which models the number
of aphids ($y_{ijkl}$), are described in the following lines.
Link function: $\log(\lambda_{ijkl}) = \eta_{ijkl}$ is the link function that relates the linear predictor to
the mean ($\lambda_{ijkl}$).
The following SAS program fits the GLMM with a Poisson distribution to the
repeated measures data.
Before fitting the GLMM, we compare the covariance structures, assuming a Poisson
distribution for the response variable. According to the fit statistics, the
covariance structure that best models the data is the first-order autoregressive
type (AR(1)). The value of the fit statistic for the conditional distribution,
Pearson's chi-square/DF = 5.77, indicates that there is extra variation
(overdispersion) and that the Poisson distribution does not adequately fit the data
(Table 9.6).
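To make the diagnostic concrete, the following sketch computes a Pearson chi-square/DF ratio for a Poisson model, where the variance is assumed equal to the mean. The observed counts and fitted means are hypothetical and are not the aphid data. The degrees of freedom are passed in by the caller because software packages differ in how they define them for mixed models.

```python
def pearson_chi_square(observed, fitted_means):
    """Pearson chi-square for a Poisson model, where Var(y) equals the mean."""
    return sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted_means))

def dispersion_ratio(observed, fitted_means, df):
    """Pearson chi-square divided by the supplied degrees of freedom."""
    return pearson_chi_square(observed, fitted_means) / df

# Hypothetical counts and fitted means, for illustration only
observed = [12, 40, 7, 95, 30, 3]
fitted = [20.0, 30.0, 15.0, 60.0, 35.0, 10.0]

ratio = dispersion_ratio(observed, fitted, df=len(observed))
print(round(ratio, 2))
```

A ratio well above 1, as in this made-up example, is the pattern that points to overdispersion, while values near 1 are consistent with the assumed distribution.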
Since there is overdispersion in the data, a highly recommended alternative is to
find a more appropriate distribution for this dataset. In this case, the linear
predictor remains the same, but a negative binomial distribution is now assumed
for the response variable.
This negative binomial model arises by assuming that the conditional distribution
of the observations, given the random effects of blocks and insecticide*clone(block)$_{ij(l)}$, is
$y_{ijkl} \mid b_l, \text{insecticide*clone(block)}_{ij(l)} \sim \text{Poisson}(\lambda_{ijkl})$, where $\lambda_{ijkl} \sim \text{Gamma}(1/\phi, \phi)$.
The resulting distribution of $y_{ijkl} \mid b_l, \text{insecticide*clone(block)}_{ij(l)}$ is
negative binomial, $\text{Negative binomial}(\lambda_{ijkl}, \phi)$. The link function is $\log(\lambda_{ijkl}) = \eta_{ijkl}$.
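The extra variance introduced by the gamma mixing can be seen in a short derivation. The version below is a sketch under a common multiplicative formulation, which is an assumption here because the text states the mixture only briefly: a gamma variable $u$ with shape $1/\phi$ and scale $\phi$ (so that its mean is 1) multiplies the Poisson mean $\lambda$.

```latex
\begin{aligned}
y \mid u &\sim \text{Poisson}(u\lambda), \qquad u \sim \text{Gamma}(1/\phi,\ \phi), \qquad
E[u] = 1, \quad \operatorname{Var}[u] = \phi, \\
E[y] &= E\big[E[y \mid u]\big] = \lambda\, E[u] = \lambda, \\
\operatorname{Var}[y] &= E\big[\operatorname{Var}[y \mid u]\big] + \operatorname{Var}\big[E[y \mid u]\big]
 = \lambda + \phi \lambda^{2}.
\end{aligned}
```

Under this formulation the variance exceeds the Poisson variance $\lambda$ by $\phi\lambda^2$, so the scale parameter $\phi$ absorbs the overdispersion, and the Poisson model is recovered as $\phi \to 0$.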
The following SAS code fits the GLMM with a negative binomial distribution.
Part of the results is shown in Table 9.7. The values of the fit statistics, assuming a
negative binomial distribution of the data, are shown in part (a), and the value of the
conditional statistic is shown in part (b) (Pearson's chi-square/DF = 0.81). This
indicates that overdispersion is no longer evident and that the negative
binomial distribution adequately models the response variable.
The estimated variance components are shown in part (a) of Table 9.8, under an
AR(1) covariance structure. The estimates of the variance components for blocks and for the
interaction between insecticide and clone within blocks, and of the scale parameter,
are $\sigma^2_{\text{block}} = 0.06613$, $\sigma^2_{\text{insecticide} \times \text{clone(block)}} = -0.7575$, and $\phi = 0.1584$, respectively.
The type III tests of fixed effects (part (b)) indicate that there is a significant effect of
insecticide type (P < 0.0001), clone (P = 0.0387), measurement time (P = 0.0137),
and the interactions insecticide × measurement time (P < 0.0001) and clone ×
measurement time (P = 0.0259) on the average number of aphids. The interaction
insecticide × clone × measurement time is close to significance (P = 0.0663).
Table 9.9 Estimates of insecticide least squares (LS) means on the model scale (Estimate) and the
data scale (Mean)
Insecticide Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
C 4.7344 0.1478 19 32.03 <0.0001 113.79 16.8211
D 3.9647 0.1547 19 25.62 <0.0001 52.7043 8.1553
H 4.3733 0.1561 19 28.02 <0.0001 79.3010 12.3753
P 3.4892 0.1753 19 19.90 <0.0001 32.7588 5.7432
Table 9.10 Clone least squares means on the model scale (Estimate) and the data scale (Mean)
Clone Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
R 4.4332 0.1320 19 33.58 <0.0001 84.1990 11.1158
S 3.8476 0.1890 19 20.36 <0.0001 46.8785 8.8586
Table 9.11 Insecticide*clone least squares means on the model scale (Estimate) and the data scale
(Mean)
Insecticide Clone Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
C R 4.8836 0.1529 19 31.93 <0.0001 132.10 20.2032
C S 4.5852 0.2479 19 18.49 <0.0001 98.0186 24.3008
D R 4.0521 0.1886 19 21.49 <0.0001 57.5153 10.8459
D S 3.8773 0.2337 19 16.59 <0.0001 48.2958 11.2858
H R 4.7106 0.1997 19 23.59 <0.0001 111.11 22.1870
H S 4.0359 0.2244 19 17.98 <0.0001 56.5964 12.7026
P R 4.0866 0.2322 19 17.60 <0.0001 59.5346 13.8263
P S 2.8918 0.2534 19 11.41 <0.0001 18.0255 4.5675
Table 9.12 Time least squares means on the model scale (Estimate) and the data scale (Mean)
Time Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 4.2730 0.1434 44 29.79 <0.0001 71.7375 10.2905
2 3.9108 0.1454 44 26.90 <0.0001 49.9372 7.2603
3 4.2373 0.1457 44 29.09 <0.0001 69.2231 10.0830
The linear predictors and estimated means for the factors and their interactions
appear under the “Estimate” and “Mean” columns, respectively. The average numbers
of aphids are given in Table 9.9 for insecticide type, Table 9.10 for clone,
Table 9.11 for the insecticide*clone interaction, Table 9.12 for measurement time,
Table 9.13 for the insecticide*time interaction, Table 9.14 for the clone*time
interaction, and Table 9.15 for the insecticide*clone*time interaction.
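The relationship between the “Estimate” and “Mean” columns in these tables can be checked directly. The sketch below is an illustration in Python, not part of the SAS output: it back-transforms the insecticide estimates of Table 9.9 through the inverse of the log link and approximates the standard error on the data scale with the delta method. The use of the delta method is an assumption here, although it agrees closely with the tabulated values.

```python
import math

# (estimate on the log scale, standard error) per insecticide, copied from Table 9.9
ls_means = {
    "C": (4.7344, 0.1478),
    "D": (3.9647, 0.1547),
    "H": (4.3733, 0.1561),
    "P": (3.4892, 0.1753),
}

def back_transform_log(estimate, std_error):
    """Inverse log link plus a delta-method standard error on the data scale."""
    mean = math.exp(estimate)
    # The derivative of exp(eta) with respect to eta is exp(eta) itself.
    return mean, mean * std_error

data_scale = {name: back_transform_log(est, se) for name, (est, se) in ls_means.items()}

for name, (mean, se_mean) in data_scale.items():
    print(name, round(mean, 2), round(se_mean, 2))
```

The same two lines of arithmetic apply to every least squares means table in this example, since all of them share the log link.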
9.4 Manufacture of Livestock Feed 387
Table 9.13 Insecticide*time least squares means on the model scale (Estimate) and the data scale
(Mean)
Insecticide Time Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
C 1 4.2381 0.1930 44 21.95 <0.0001 69.2781 13.3733
C 2 4.6631 0.1913 44 24.38 <0.0001 105.96 20.2696
C 3 5.3019 0.1898 44 27.94 <0.0001 200.71 38.0898
D 1 4.4854 0.1965 44 22.83 <0.0001 88.7111 17.4275
D 2 3.7061 0.2014 44 18.40 <0.0001 40.6940 8.1959
D 3 3.7026 0.2035 44 18.20 <0.0001 40.5537 8.2517
H 1 4.4718 0.1978 44 22.60 <0.0001 87.5164 17.3133
H 2 3.9790 0.2016 44 19.73 <0.0001 53.4625 10.7804
H 3 4.6689 0.1977 44 23.62 <0.0001 106.59 21.0694
P 1 3.8967 0.2241 44 17.39 <0.0001 49.2403 11.0358
P 2 3.2949 0.2357 44 13.98 <0.0001 26.9755 6.3583
P 3 3.2759 0.2399 44 13.65 <0.0001 26.4664 6.3502
Table 9.14 Clone*time least squares means on the model scale (Estimate) and the data scale
(Mean)
Clone Time Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
R 1 4.3839 0.1595 44 27.49 <0.0001 80.1482 12.7828
R 2 4.2826 0.1605 44 26.68 <0.0001 72.4270 11.6256
R 3 4.6331 0.1601 44 28.94 <0.0001 102.83 16.4644
S 1 4.1621 0.2092 44 19.90 <0.0001 64.2093 13.4323
S 2 3.5390 0.2131 44 16.60 <0.0001 34.4308 7.3387
S 3 3.8416 0.2144 44 17.91 <0.0001 46.5989 9.9931
In this experiment, two types of pelleted feed were manufactured using different
amounts of whole sorghum. Using the whole grain resulted in one feed with a high
pellet durability index (PDI) and one with a low PDI. The researcher was interested
in how much impact this difference in PDI would have on the amount of intact and
pelleted feed distributed to the different positions along the feeding line. The line
was fed four times with the high PDI feed and four times with the low PDI feed.
After each run, the total weight of the feed in each of the 12 identified trays was
measured. The feed was then sieved into each tray, and the crushed fine granules
were weighed in the feed line. The response of interest was the ratio (proportion)
between the weight of fine granules and the total weight of the feed for each tray. The
data for this experiment are in the Appendix (Data: Feeding line experiment).
The experimental design used in this study was a split plot in a completely
randomized design. There were 2 fixed factors: feed, with 2 levels (high PDI feed
(H) and low PDI feed (L)), and tray, with 12 levels (1, 2, 3, ..., 12 locations along
Table 9.15 Insecticide*clone*time least squares means on the model scale (Estimate) and the data scale (Mean)
Insecticide Clone Time Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
C R 1 4.2592 0.2321 44 18.35 <0.0001 70.7546 16.4256
C R 2 4.9631 0.2280 44 21.76 <0.0001 143.04 32.6184
C R 3 5.4284 0.2269 44 23.92 <0.0001 227.78 51.6892
C S 1 4.2170 0.3044 44 13.86 <0.0001 67.8323 20.6460
C S 2 4.3630 0.3031 44 14.40 <0.0001 78.4950 23.7891
C S 3 5.1754 0.2999 44 17.26 <0.0001 176.87 53.0366
the feed line). Different run levels (1, 2, 3, 4 runs in the feed line) may influence the
inference of this experiment, so it is advisable to analyze which variance structure is
suitable for this analysis.
The ANOVA table (Table 9.16) with degrees of freedom for this experiment is
shown below.
The researcher aims to draw conclusions about the destructiveness in the feed line
with two types of feed, high PDI and low PDI. The following GLMM is used to
describe the experiment:
Part of the output is shown below. Four covariance structures (“CS,” “AR(1),”
“Toep(1),” and “UN”) were tested to see which one best fits the response variable.
Of these covariance structures, “Toep(1)” produced the best fit statistics (part (a),
Table 9.17).
Another important result that guides the rest of the analysis is
the conditional distribution statistic (Pearson's chi-square/DF = 0.96), whose
Table 9.18 Feed least squares means on the model scale (Estimate) and the data scale (Mean)
Feed Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
H -2.0009 0.07409 6 -27.01 <0.0001 0.1191 0.007773
L 1.3832 0.07208 6 19.19 <0.0001 0.7995 0.01155
value indicates that the beta model adequately fits the data, whereas the fixed effects
tests (part (c)) indicate that there is a statistically significant effect of feed type
(P = 0.0001) and tray (P = 0.0001).
The linear predictors and estimated mean proportions for the factors and their
interaction are listed under the “Estimate” and “Mean” columns, respectively, of
Table 9.18 (feed), Table 9.19 (tray), and Table 9.20 (feed*tray interaction).
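For the beta model the link is the logit, so the back-transformation differs from the count examples. The sketch below is an illustration in Python, not part of the SAS output: it converts the feed estimates of Table 9.18 to the data scale with the inverse logit and approximates the standard error of the mean proportion with the delta method, an assumption that agrees closely with the tabulated values.

```python
import math

# (estimate on the logit scale, standard error) per feed, copied from Table 9.18
feed_ls_means = {"H": (-2.0009, 0.07409), "L": (1.3832, 0.07208)}

def back_transform_logit(estimate, std_error):
    """Inverse logit plus a delta-method standard error on the proportion scale."""
    mean = 1.0 / (1.0 + math.exp(-estimate))
    # The derivative of the inverse logit with respect to eta is mean * (1 - mean).
    return mean, mean * (1.0 - mean) * std_error

feed_data_scale = {feed: back_transform_logit(est, se)
                   for feed, (est, se) in feed_ls_means.items()}

for feed, (mean, se_mean) in feed_data_scale.items():
    print(feed, round(mean, 4), round(se_mean, 5))
```

The factor mean × (1 − mean) shrinks the standard error for proportions near 0 or 1, which is why the high PDI feed has the smaller standard error on the data scale despite a similar standard error on the logit scale.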
During a 1-month period (June 1981), 30 river water samples were collected from
the channel at 3 stations, A, B, and C (downstream to upstream) on 5 randomly
selected days at 9:00 a.m. and 3:00 p.m. (1 sample per station per hour per day). Each
sample was analyzed for fecal coliform by method FC-96. The data from this
experiment are shown in Table 9.21.
9.5 Characterization of Spatial and Temporal Variations in Fecal Coliform Density 391
Table 9.19 Tray least squares means on the model scale (Estimate) and the data scale (Mean)
Tray Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
01 -0.5652 0.08182 65 -6.91 <0.0001 0.3623 0.01891
02 -0.6607 0.08531 65 -7.74 <0.0001 0.3406 0.01916
03 -0.6950 0.08822 65 -7.88 <0.0001 0.3329 0.01959
04 -0.2958 0.08100 65 -3.65 0.0005 0.4266 0.01981
05 -0.3773 0.08212 65 -4.59 <0.0001 0.4068 0.01982
06 -0.2947 0.08057 65 -3.66 0.0005 0.4268 0.01971
07 -0.3520 0.08165 65 -4.31 <0.0001 0.4129 0.01979
08 -0.2992 0.07939 65 -3.77 0.0004 0.4258 0.01941
09 -0.1314 0.07670 65 -1.71 0.0916 0.4672 0.01909
10 -0.3935 0.08096 65 -4.86 <0.0001 0.4029 0.01948
11 0.1571 0.07860 65 2.00 0.0499 0.5392 0.01953
12 0.2014 0.07949 65 2.53 0.0137 0.5502 0.01967
Table 9.20 Tray*feed least squares means on the model scale (Estimate) and the data scale (Mean)
Tray Feed Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
01 H -2.2408 0.1284 65 -17.46 <0.0001 0.09614 0.01116
01 L 1.1104 0.1015 65 10.94 <0.0001 0.7522 0.01892
02 H -2.4581 0.1369 65 -17.95 <0.0001 0.07885 0.009946
02 L 1.1367 0.1018 65 11.17 <0.0001 0.7571 0.01872
03 H -2.4724 0.1375 65 -17.98 <0.0001 0.07782 0.009869
03 L 1.0823 0.1105 65 9.79 <0.0001 0.7469 0.02089
04 H -2.0307 0.1217 65 -16.69 <0.0001 0.1160 0.01248
04 L 1.4391 0.1070 65 13.45 <0.0001 0.8083 0.01658
05 H -2.1481 0.1254 65 -17.13 <0.0001 0.1045 0.01174
05 L 1.3935 0.1061 65 13.13 <0.0001 0.8011 0.01690
06 H -2.0087 0.1208 65 -16.62 <0.0001 0.1183 0.01260
06 L 1.4192 0.1066 65 13.31 <0.0001 0.8052 0.01673
07 H -2.1026 0.1242 65 -16.93 <0.0001 0.1088 0.01204
07 L 1.3987 0.1061 65 13.18 <0.0001 0.8020 0.01685
08 H -1.9310 0.1192 65 -16.20 <0.0001 0.1266 0.01318
08 L 1.3325 0.1050 65 12.69 <0.0001 0.7913 0.01734
09 H -1.6240 0.1113 65 -14.59 <0.0001 0.1647 0.01531
09 L 1.3613 0.1056 65 12.89 <0.0001 0.7960 0.01715
10 H -2.0863 0.1238 65 -16.85 <0.0001 0.1104 0.01216
10 L 1.2994 0.1044 65 12.45 <0.0001 0.7857 0.01757
11 H -1.4559 0.1075 65 -13.55 <0.0001 0.1891 0.01648
11 L 1.7701 0.1148 65 15.42 <0.0001 0.8545 0.01427
12 H -1.4519 0.1076 65 -13.50 <0.0001 0.1897 0.01653
12 L 1.8548 0.1171 65 15.83 <0.0001 0.8647 0.01371
Table 9.21 Variation in fecal coliform densities of the river water samples from three sampling
stations on five sampling days at 9:00 a.m. (TM = 1) and 3:00 p.m. (TM = 2)
Sampling date TM Site No. of coliforms per milliliter
18 May 9:00 a.m. A 648
18 May 3:00 p.m. A 798
18 May 9:00 a.m. B 517
18 May 3:00 p.m. B 702
18 May 9:00 a.m. C 532
18 May 3:00 p.m. C 55
26 May 9:00 a.m. A 1421
26 May 3:00 p.m. A 1388
26 May 9:00 a.m. B 1883
26 May 3:00 p.m. B 1855
26 May 9:00 a.m. C 1724
26 May 3:00 p.m. C 1769
29 May 9:00 a.m. A 1523
29 May 3:00 p.m. A 759
29 May 9:00 a.m. B 1361
29 May 3:00 p.m. B 603
29 May 9:00 a.m. C 2004
29 May 3:00 p.m. C 541
1 June 9:00 a.m. A 1987
1 June 3:00 p.m. A 1056
1 June 9:00 a.m. B 1796
1 June 3:00 p.m. B 1579
1 June 9:00 a.m. C 1221
1 June 3:00 p.m. C 1223
5 June 9:00 a.m. A 870
5 June 3:00 p.m. A 1099
5 June 9:00 a.m. B 920
5 June 3:00 p.m. B 951
5 June 9:00 a.m. C 926
5 June 3:00 p.m. C 887
To assess the relative magnitudes of the sources of variation due to time, site, and
subsampling in the number of coliforms per milliliter ($y_{ijk}$), an analysis of variance
using a GLMM with a Poisson response was performed, as described below.
We denote by $y_{ijk}$ the number of colonies per milliliter, whose conditional
distribution is $y_{ijk} \mid \text{sampling(site)}_{ik} \sim \text{Poisson}(\lambda_{ijk})$, with the linear predictor
$\eta_{ijk}$ defined by
$$\eta_{ijk} = \theta + \text{site}_i + \text{sampling(site)}_{ik} + \text{time}_j + (\text{site} \times \text{time})_{ij} \quad (i = 1, 2, 3;\ j = 1, 2, 3, 4, 5;\ k = 1, 2)$$
where $\eta_{ijk}$ is the linear predictor, which is related to the mean through the link
function, $\theta$ is the intercept, $\text{site}_i$ is the fixed effect due to the sampling site $i$,
$\text{sampling(site)}_{ik}$ is the random effect due to the sampling time nested within the site, assuming
$\text{sampling(site)}_{ik} \sim N(0, \sigma^2_{\text{sampling(site)}})$, $\text{time}_j$ is the fixed effect due to sampling
date, and $(\text{site} \times \text{time})_{ij}$ is the effect of the interaction between the site and sampling
date. The link function for this model is $\log(\lambda_{ijk}) = \eta_{ijk}$.
The following GLIMMIX syntax fits a GLMM with a Poisson response.
of sampling (P = 0.0001). That is, the concentration of fecal coliform units per
milliliter is affected by the date of data collection. However, the data show
excess dispersion. One way to deal with overdispersion is to fit a quasi-Poisson
model, which adds a dispersion parameter during the fitting process to account for
the extra variance. Another option is to look for a distribution that adequately
fits the data; in this case, the negative binomial distribution is a good alternative.
Next, we implement the analysis assuming that the response variable follows a
negative binomial distribution. This means that the distribution
of $y_{ijk}$ (number of colonies per milliliter) is given by $y_{ijk} \mid \text{sampling(site)}_{ik} \sim \text{Negative
Binomial}(\lambda_{ijk}, \phi)$, where $\phi$ is the scale parameter. The linear predictor $\eta_{ijk}$
and the link function remain unchanged.
The following GLIMMIX commands fit a GLMM with a negative binomial
distribution.
Part of the output of the above program is shown below. The values of the fit
statistics under the negative binomial distribution (part (a) of Table 9.23) are much
smaller than those obtained under the Poisson model, indicating that the
negative binomial distribution fits the response variable better. Furthermore,
the value of the conditional distribution statistic (Pearson's chi-square/DF = 0.76)
indicates that the negative binomial distribution is a good choice for these data.
This statistic measures how large the observed variation is relative to the
variation implied by the fitted distribution. Since the value is less than 1
(part (b)), there is no evidence of remaining overdispersion in the fit.
Another direct effect
9.6 Log-Normal Distribution 395
Table 9.24 Type III fixed effects tests
Effect Num DF Den DF F-value Pr > F
Site 2 3 0.78 0.5346
T 4 12 11.57 0.0004
T*site 8 12 1.13 0.4096
Table 9.25 Means and standard errors on the model scale (Estimate) and on the data scale (Mean)
of the sampling site data
Site Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
A 7.0195 0.1270 3 55.25 <0.0001 1118.25 142.06
B 7.0243 0.1271 3 55.28 <0.0001 1123.57 142.78
C 6.8237 0.1305 3 52.30 <0.0001 919.40 119.95
Table 9.26 Means and standard errors of measurement time on the model scale (Estimate) and the
data scale (Mean)
T Estimate Standard error DF t-value Pr > |t| Mean Standard error mean
1 6.2084 0.1467 12 42.32 <0.0001 496.91 72.8990
2 7.4212 0.1420 12 52.27 <0.0001 1670.97 237.23
3 7.0074 0.1455 12 48.16 <0.0001 1104.74 160.75
4 7.2910 0.1418 12 51.42 <0.0001 1466.97 208.00
5 6.8513 0.1422 12 48.19 <0.0001 945.09 134.35
observed when there is no overdispersion is in the F-values of the fixed effects tests
(Table 9.24). In this case, the date on which the samples were collected was
significant but not the interaction between the two factors, as was the case when the
data were fitted using the Poisson GLMM.
The linear predictors and estimated means of the main effects and of the
interaction between both factors are under the columns “Estimate” and “Mean,”
respectively. The sampling site averages are presented in Table 9.25, the averages
by sampling date in Table 9.26, and the means of the site × sampling date
interaction in Table 9.27.
Table 9.27 Means and standard errors for the interaction T*site on the model scale (Estimate) and
the data scale (Mean)
Site T Estimate Standard error t-value Pr > |t| Mean Standard error mean
A 1 6.5905 0.2463 26.76 <0.0001 728.17 179.35
B 1 6.4197 0.2466 26.03 <0.0001 613.79 151.38
C 1 5.6151 0.2772 20.26 <0.0001 274.53 76.1038
A 2 7.2508 0.2452 29.57 <0.0001 1409.17 345.59
B 2 7.5367 0.2451 30.75 <0.0001 1875.54 459.64
C 2 7.4761 0.2463 30.36 <0.0001 1765.30 434.71
A 3 7.0336 0.2461 28.58 <0.0001 1134.09 279.14
B 3 6.8855 0.2465 27.94 <0.0001 978.01 241.05
C 3 7.1030 0.2586 27.47 <0.0001 1215.59 314.37
A 4 7.3224 0.2458 29.79 <0.0001 1513.87 372.07
B 4 7.4329 0.2450 30.33 <0.0001 1690.62 414.28
C 4 7.1176 0.2463 28.90 <0.0001 1233.47 303.81
A 5 6.9003 0.2460 28.04 <0.0001 992.56 244.21
B 5 6.8467 0.2458 27.85 <0.0001 940.73 231.25
C 5 6.8069 0.2453 27.75 <0.0001 904.07 221.80
Fig. 9.1 Density function of the log-normal distribution with parameters 1 and 0.6
The experiment was conducted between January and February 2017 at the Colegio
de Postgraduados Campus Córdoba, located in Amatlán de los Reyes, Veracruz,
México. The animals used were four 5–6-month-old males of the Criollo
lechero tropical (CLT) breed, randomly distributed in individual pens of
4.8 × 2.1 m, each with 75% shade, a cup drinker, and a drawer-type feeder.
To ensure the required crude protein percentages for each treatment, the following
diets (treatments 1–4) were developed: Trt1 (12% crude protein), Trt2 (14% crude
protein), Trt3 (16% crude protein), and Trt4 (commercial feed with 16% crude
protein). Each animal received the four treatments in random order in different periods.
Each treatment was applied for 11 days, of which the first 7 were considered
adaptation days and the following 4 were used to measure gases
in the daily accumulated excreta. The experiment lasted a total of 44 days. The
data from this experiment are tabulated in the Appendix (Data: Nitrous oxide
emission). The N2O gas fluxes in ppm were calculated from a linear or nonlinear
increase in the concentrations inside the static chambers over time, and these fluxes
were converted to micrograms of N2O–N per m2 per hour (y); for more details, see
Hernández-Tapia et al. (2019). The statistical model used in this
study was an analysis of covariance model in a randomized complete block design
with repeated measures, as described below.
where yijk is the flux of N2O–N (μg m-2 h-1); μ is the overall mean; τi is the fixed
effect due to treatment i (i = 1, 2, 3, 4); animalj is the random effect due to animal
j ( j = 1, 2, 3, 4), assuming animalj~N(0, σ 2animal); timek is the fixed effect of time
k (k = 1, 2, 3, 4, 5) at the time of measurement; (τ × time)ik is the effect of the
interaction between τi and timek, βi is the coefficient of linear regression of the
covariate xij in treatment i and time j, where xij can be the pH, humidity (HE),
temperature (TE) in the manure, maximum temperature (TMaxA), minimum tem-
perature (TMinA), maximum humidity (HMaxA), minimum humidity (HMinA), or
initial weight (kilograms) at the start of a treatment; x is the mean of the covariate in
question; and εijk is the non-normal experimental error.
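The linear predictor $\eta_{ijk}$ for N2O–N is
$$\eta_{ijk} = \mu + \tau_i + animal_j + time_k + (\tau \times time)_{ik} + \beta_i (x_{ij} - \bar{x}).$$
The response variable $y_{ijk}$ has a conditional log-normal distribution with a mean $\mu_{ijk}$
on the log scale and variance $(e^{\sigma^2} - 1)\, e^{2\mu_{ijk} + \sigma^2}$, that is,
$y_{ijk} \mid animal_j \sim \text{Lognormal}(\mu_{ijk}, \sigma^2)$.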
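As a quick numerical check of the log-normal moments quoted above, the following Python sketch evaluates the data-scale mean $e^{\mu + \sigma^2/2}$ and the variance $(e^{\sigma^2} - 1)e^{2\mu + \sigma^2}$. The function name and the values of μ and σ² are illustrative assumptions, not estimates from the N2O experiment.

```python
import math


def lognormal_moments(mu, sigma2):
    """Return (mean, variance) of Y when log(Y) ~ N(mu, sigma2)."""
    mean = math.exp(mu + sigma2 / 2.0)
    variance = (math.exp(sigma2) - 1.0) * math.exp(2.0 * mu + sigma2)
    return mean, variance


# Hypothetical log-scale parameters, chosen only for illustration.
mean, variance = lognormal_moments(mu=3.0, sigma2=0.25)
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

The variance can equivalently be written as mean² × (e^{σ²} − 1), which shows why the spread of a log-normal response grows with its mean.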
In order to observe the tolerance of the fungus Fusarium sp. to different concentra-
tions of a chemical salt, a bioassay was implemented to evaluate the percentage of
inhibition of the fungus. This bioassay consisted of placing a nutritive culture
medium in Petri dishes for the fungal development in which different concentrations
of the salt in ppm were added (0, 500, 1000, and 2000). Mycelium growth was
measured for 6 days, and the percentage of inhibition of Fusarium sp. growth
was calculated. Part of the data is shown below, and the complete dataset is in the
Appendix (Data: Percentage inhibition).
where $\eta_{ijkl}$ is the linear predictor, $\theta$ is the intercept, $conc_i$ is the fixed effect of salt
concentration, $\omega_{kl}$ is the random effect of the Petri dish within the bioassay,
assuming $\omega_{kl} \sim N(0, \sigma^2_{\omega})$, $conc(\omega)_{i(kl)}$ is the random effect of salt concentration–
Petri dish–bioassay, assuming $conc(\omega)_{i(kl)} \sim N(0, \sigma^2_{conc(\omega)})$, $time_j$ is the fixed effect
due to the day of measurement, and $(conc \times time)_{ij}$ is the interaction effect of
chemical salt concentration with the day of measurement.
Link function: $\text{logit}(\pi_{ijkl}) = \eta_{ijkl}$ relates the linear predictor to
the mean $\pi_{ijkl}$.
The following SAS program fits the beta GLMM with repeated measures.
Before fitting the final generalized linear mixed model, we compare the fit statistics
obtained under different covariance structures, with a beta distribution for the response
variable (Table 9.31 part (a)). According to the fit statistics, the covariance structures that
best fit the data are Toeplitz of order 1 (Toep(1)) and unstructured (UN).
Having chosen the covariance structure, in this case Toeplitz of order 1, we
present part of the results of the fit (Table 9.31 parts (b) and (c)). The fit statistic
Pearson's chi-square/DF = 1.07 indicates that there is no overdispersion and
that the beta distribution fits the data adequately. The estimated variance component,
under Toep(1), of the concentration–repetition bioassay is $\hat{\sigma}^2_{conc(\omega)} = 0.00285$, and
the estimated scale parameter is $\hat{\phi} = 52.281$ (part (c)).
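To give a sense of what a scale parameter of about 52 implies, the Python sketch below converts a mean and a scale into beta shape parameters and a variance. It assumes the mean and scale parameterization in which the shapes are $\pi\phi$ and $(1-\pi)\phi$, so that $Var(y) = \pi(1-\pi)/(1+\phi)$; this convention, the function names, and the mean of 0.25 are assumptions made for illustration.

```python
import math


def beta_shapes(mean, phi):
    """Shape parameters (a, b) of a beta distribution with the given mean and scale phi."""
    return mean * phi, (1.0 - mean) * phi


def beta_variance(mean, phi):
    """Variance under the assumed parameterization: mean * (1 - mean) / (1 + phi)."""
    return mean * (1.0 - mean) / (1.0 + phi)


phi_hat = 52.2809      # scale estimate reported in Table 9.31 part (c)
example_mean = 0.25    # hypothetical inhibition proportion, for illustration only

a, b = beta_shapes(example_mean, phi_hat)
sd = math.sqrt(beta_variance(example_mean, phi_hat))
print(f"a = {a:.2f}, b = {b:.2f}, sd = {sd:.4f}")
```

Under this assumption a larger scale means a tighter distribution around the mean, which is why the scale is sometimes described as a precision parameter.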
402 9 Generalized Linear Mixed Models for Repeated Measurements
Table 9.31 Fit statistics for the conditional distribution and variance components
(a) Fit statistics CS AR(1) Toep(1) UN
-2 Log likelihood -523.69 -523.69 -523.69 -523.69
AIC (smaller is better) -469.69 -469.69 -471.69 -471.69
AICC (smaller is better) -458.73 -458.73 -461.59 -461.59
BIC (smaller is better) -467.54 -467.54 -469.62 -469.62
CAIC (smaller is better) -440.54 -440.54 -443.62 -443.62
HQIC (smaller is better) -484.16 -484.16 -485.62 -485.62
(b) Fit statistics for conditional distribution
-2 Log L (pct | r. effects) -529.79
Pearson’s chi-square 177.68
Pearson’s chi-square/DF 1.07
(c) Covariance parameter estimates
Cov Parm Subject Estimate Standard error
Variance Con(Bio) 0.002849 0.004147
Scale 52.2809 5.8849
Table 9.32 Type III tests of fixed effects
Effect Num DF Den DF F-value Pr > F
Con 3 4 125.40 0.0002
Day 5 138 10.99 <0.0001
Day*Con 15 138 2.25 0.0074
The tests of fixed effects indicate that there is a highly significant effect of salt
concentration (P = 0.0002), time (P < 0.0001), and the concentration × time interaction
(P = 0.0074) on the growth inhibition of Fusarium sp. (Table 9.32).
The linear predictors and estimated probabilities of the factors (Table 9.33 parts
(a) and (b)) and interaction (Table 9.34) are found under the columns “Estimate” and
“Mean,” respectively.
Table 9.33 Concentration and measurement time least square means on the model scale (Estimate)
and the data scale (Mean)
(a) Conc least squares means
Con  Estimate (ηi.)  Standard error  DF  t-value  Pr > |t|  Mean (πi.)  Standard error of mean
0 -3.5438 0.1499 4 -23.64 <0.0001 0.02809 0.004093
500 -1.0650 0.05941 4 -17.93 <0.0001 0.2563 0.01133
1000 -0.9847 0.05895 4 -16.70 <0.0001 0.2720 0.01167
2000 -0.4487 0.05891 4 -7.62 0.0016 0.3897 0.01401
(b) Day least squares means
Day  Estimate (η.j)  Standard error  DF  t-value  Pr > |t|  Mean (π.j)  Standard error of mean
1 -1.6017 0.1161 138 -13.79 <0.0001 0.1677 0.01621
2 -1.0446 0.08689 138 -12.02 <0.0001 0.2603 0.01673
3 -1.2475 0.08794 138 -14.19 <0.0001 0.2231 0.01524
4 -1.5668 0.1020 138 -15.36 <0.0001 0.1727 0.01457
5 -1.7606 0.1039 138 -16.94 <0.0001 0.1467 0.01301
6 -1.8422 0.1067 138 -17.26 <0.0001 0.1368 0.01260
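The data-scale means in Table 9.33 follow from the model-scale estimates through the inverse logit link, and the reported standard errors of the means are consistent with a first-order (delta method) approximation. The Python sketch below reproduces the row for 0 ppm; it is an illustration with hypothetical function names, not a description of the software's internal computation.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a proportion."""
    return 1.0 / (1.0 + math.exp(-eta))


def delta_method_se(eta, se_eta):
    """Approximate SE on the data scale: d(pi)/d(eta) = pi * (1 - pi)."""
    p = inv_logit(eta)
    return p * (1.0 - p) * se_eta


# Model-scale estimate and standard error for Con = 0 (Table 9.33 part (a)).
eta_hat, se_eta = -3.5438, 0.1499

pi_hat = inv_logit(eta_hat)
se_pi = delta_method_se(eta_hat, se_eta)
print(f"mean = {pi_hat:.5f}, SE of mean = {se_pi:.6f}")
```

The same two functions can be applied to any other row of Tables 9.33 and 9.34 to move between the "Estimate" and "Mean" columns.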
microbial activity in the soil. The study included a control treatment (no sludge) and
three treatments using sludge as a fertilizer with different moisture contents, whose
moisture levels for the fertilized soil were 0.24, 0.26, and 0.28 kg water/kg soil.
Soil samples were randomly assigned to the four treatments in a completely
randomized design. Soil samples were placed in sealed containers and incubated
under favorable conditions for microbial activity. The soil was compacted in the
containers simulating a degree of compaction experienced in the field. Microbial
activity, measured as an increase in CO2, was used as a measure of the level of soil
oxygenation. The CO2 evolution/kilogram soil/day in each container was measured
at 2, 4, 6, and 8 days after the start of the incubation period. Microbial activity in
each soil sample was recorded as the percentage increase in CO2 produced above the
atmospheric level. The data are shown in Table 9.35.
The analysis of variance table for this experiment is shown below (Table 9.36).
Let $pct_{ijk}$ be the percentage of CO2 emission, assuming that $pct_{ijk}$ has a beta
distribution with a mean $\pi_{ijk}$ and scale parameter $\phi$, i.e., $pct_{ijk} \sim Beta(\pi_{ijk}, \phi)$. The
linear predictor $\eta_{ijk}$ that relates the mean to the link function is given by
$$\eta_{ijk} = \theta + \alpha_i + \alpha(r)_{i(k)} + \tau_j + (\alpha\tau)_{ij}$$
where $\theta$ is the intercept, $\alpha_i$ is the fixed effect of the treatment i, $\alpha(r)_{i(k)}$ is the random
effect of treatment nested in the repetition k, assuming that $\alpha(r)_{i(k)} \sim N(0, \sigma^2_{\alpha(r)})$, $\tau_j$
is the fixed effect of measurement time j, and $(\alpha\tau)_{ij}$ is the interaction effect of
treatment with measurement time. The link function is defined by $\text{logit}(\pi_{ijk}) = \eta_{ijk}$.
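Because the beta distribution is defined on the open interval (0, 1), the recorded percentages have to be expressed as proportions before the logit link can be applied. The Python sketch below does this for the 0.24 kg water/kg soil rows of Table 9.35, assuming the percentages were divided by 100 before fitting; that rescaling is an assumption, although it is consistent with the magnitude of the data-scale means reported in Table 9.39.

```python
import math

# Table 9.35, moisture 0.24 kg water/kg soil: containers 1-3 at days 2, 4, 6, 8 (% CO2).
pct_024 = [
    [2.53, 2.70, 2.10, 1.50],
    [2.59, 1.43, 1.35, 0.74],
    [0.56, 1.37, 1.87, 1.21],
]


def to_proportion(pct):
    """Rescale a percentage to a proportion (assumed preprocessing step)."""
    return pct / 100.0


def logit(p):
    """Logit link: log odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


props = [to_proportion(v) for row in pct_024 for v in row]
mean_prop = sum(props) / len(props)
print(f"raw mean proportion = {mean_prop:.5f}, logit = {logit(mean_prop):.3f}")
```

The raw mean proportion is close to, though not identical to, the model-based mean for this treatment, since the least squares mean is computed on the logit scale and then back-transformed.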
The following SAS syntax fits a GLMM on repeated measures with a beta
distribution.
Table 9.34 Measuring time*salt concentration interaction on the model scale (Estimate) and the
data scale (Mean)
Day*con least squares means
Day  Con  Estimate (ηij)  Standard error  DF  t-value  Pr > |t|  Mean (πij)  Standard error of mean
1 0 -4.0127 0.4083 138 -9.83 <0.0001 0.01776 0.007124
1 500 -0.8709 0.1123 138 -7.76 <0.0001 0.2951 0.02335
1 1000 -0.6848 0.1092 138 -6.27 <0.0001 0.3352 0.02434
1 2000 -0.8382 0.1579 138 -5.31 <0.0001 0.3019 0.03328
2 0 -3.5743 0.2957 138 -12.09 <0.0001 0.02727 0.007844
2 500 -0.4140 0.1061 138 -3.90 0.0001 0.3980 0.02543
2 1000 -0.3519 0.1053 138 -3.34 0.0011 0.4129 0.02554
2 2000 0.1616 0.1043 138 1.55 0.1235 0.5403 0.02590
3 0 -2.9511 0.2944 138 -10.02 <0.0001 0.04969 0.01390
3 500 -0.9923 0.1149 138 -8.64 <0.0001 0.2705 0.02266
3 1000 -0.9044 0.1131 138 -8.00 <0.0001 0.2881 0.02319
3 2000 -0.1423 0.1041 138 -1.37 0.1739 0.4645 0.02590
4 0 -3.5167 0.3558 138 -9.88 <0.0001 0.02884 0.009967
4 500 -1.2429 0.1213 138 -10.25 <0.0001 0.2239 0.02108
4 1000 -1.0361 0.1159 138 -8.94 <0.0001 0.2619 0.02241
4 2000 -0.4716 0.1065 138 -4.43 <0.0001 0.3842 0.02520
5 0 -3.5503 0.3579 138 -9.92 <0.0001 0.02791 0.009710
5 500 -1.4180 0.1269 138 -11.17 <0.0001 0.1950 0.01992
5 1000 -1.4489 0.1277 138 -11.34 <0.0001 0.1902 0.01967
5 2000 -0.6251 0.1083 138 -5.77 <0.0001 0.3486 0.02458
6 0 -3.6579 0.3691 138 -9.91 <0.0001 0.02514 0.009046
6 500 -1.4522 0.1277 138 -11.37 <0.0001 0.1897 0.01963
6 1000 -1.4823 0.1289 138 -11.50 <0.0001 0.1851 0.01944
6 2000 -0.7765 0.1106 138 -7.02 <0.0001 0.3151 0.02388
Part of the results is shown below. The fit statistics under different covariance
structures (Table 9.37 part (a)), such as AIC and AICC, indicate that a Toeplitz
covariance structure of order 1 provides the best fit to the data from this experiment.
Table 9.38 part (a) shows the estimated variance component due to treatment ×
repetition, $\hat{\sigma}^2_{\alpha(r)} = 0.03363$, and the estimated scale parameter, $\hat{\phi} = 790.82$;
the hypothesis test (part (b)) indicates that the treatment means differ
significantly (P = 0.0011).
9.8 Carbon Dioxide (CO2) Emission as a Function of Soil Moisture. . . 405
Table 9.35 Repeated measurements of emissions of CO2 by bacterial activity in soil under
different moisture conditions
%CO2 evolution/kilogram soil/day
Moisture (kg water/kg soil) Container Day 2 Day 4 Day 6 Day 8
Control 1 0.22 0.56 0.66 0.89
2 0.68 0.91 1.06 0.8
3 0.68 0.45 0.72 0.89
0.24 1 2.53 2.7 2.1 1.5
2 2.59 1.43 1.35 0.74
3 0.56 1.37 1.87 1.21
0.26 1 0.22 0.22 0.2 0.11
2 0.45 0.28 1.24 0.86
3 0.22 0.33 0.34 0.2
0.28 1 0.22 0.8 0.8 0.37
2 0.22 0.62 0.89 0.95
3 0.22 0.56 0.69 0.63
Table 9.37 Fit statistics of the beta GLMM under different covariance structures
(a) Fit statistics CS AR(1) Toep(1) UN
-2 Log likelihood -433.28 -433.94 -433.28 No converge
AIC (smaller is better) -395.28 -395.94 -397.28
AICC (smaller is better) -368.14 -368.80 -373.69
BIC (smaller is better) -412.41 -413.07 -413.50
CAIC (smaller is better) -393.41 -394.07 -395.50
HQIC (smaller is better) -429.71 -430.37 -429.89
(b) Fit statistics for conditional distribution CS AR(1) Toep(1) UN
-2 Log L (y | r. effects) -446.54 -444.41 -446.58 No converge
Pearson’s chi-square 30.46 33.98 30.38
Pearson’s chi-square/DF 0.63 0.71 0.63
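The comparison in Table 9.37 part (a) can be retraced numerically. Assuming the standard definition AIC = −2 log L + 2p, the Python sketch below recovers the implied number of parameters p for each structure that converged and picks the structure with the smallest AIC and AICC. The dictionary layout and function name are illustrative choices.

```python
# Fit statistics copied from Table 9.37 part (a); UN is omitted because it did not converge.
fit = {
    "CS":      {"neg2loglik": -433.28, "AIC": -395.28, "AICC": -368.14},
    "AR(1)":   {"neg2loglik": -433.94, "AIC": -395.94, "AICC": -368.80},
    "Toep(1)": {"neg2loglik": -433.28, "AIC": -397.28, "AICC": -373.69},
}


def implied_parameters(neg2loglik, aic):
    """Number of parameters p implied by AIC = -2 log L + 2p."""
    return round((aic - neg2loglik) / 2.0)


params = {name: implied_parameters(s["neg2loglik"], s["AIC"]) for name, s in fit.items()}
best_aic = min(fit, key=lambda name: fit[name]["AIC"])
best_aicc = min(fit, key=lambda name: fit[name]["AICC"])

print(params)
print("smallest AIC:", best_aic, "| smallest AICC:", best_aicc)
```

Toep(1) attains the same −2 log likelihood as CS with one parameter fewer, which is what gives it the smaller AIC and AICC.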
Table 9.39 shows the estimated average emissions of CO2 in tested treatments,
which showed that the treatment with moisture 0.24 kg water/kg soil favored a
higher microbial activity, whereas treatments with moisture levels 0.26 and 0.28 kg
water/kg soil showed similar microbial activity between them.
Table 9.39 Means and standard errors on the model scale (Estimate) and the data scale (Mean)
(a) Trt least squares means
Trt  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error of mean
C -4.9242 0.1595 8 -30.87 <0.0001 0.007216 0.001143
T0.24 -4.1331 0.1343 8 -30.79 <0.0001 0.01578 0.002085
T0.26 -5.5728 0.1898 8 -29.36 <0.0001 0.003786 0.000716
T0.28 -5.1588 0.1728 8 -29.86 <0.0001 0.005716 0.000982
[Fig. 9.2 Plot of % CO2 (vertical axis) against Time (Days) (horizontal axis)]
Figure 9.2 clearly shows that the treatment with moisture 0.24 kg water/kg soil
provides the best conditions for soil microbial activity, whereas the rest of the
treatments significantly affect the activity of microorganisms.
9.9 Effect of Soil Compaction and Soil Moisture on Microbial Activity 407
Table 9.41 Analysis of variance of a CRD with a factorial structure of treatments in repeated
measures
Sources of variation Degrees of freedom
Treatment (a - 1) = 3 - 1 = 2
Humidity (b - 1) = 3 - 1 = 2
Treatment*humidity (a - 1)(b - 1) = 4
Error1 ab(r - 1) = 3 × 3 × 1 = 9
Time (c - 1) = 3 - 1 = 2
Treatment*time (a - 1)(c - 1) = 4
Humidity*time (b - 1)(c - 1) = 4
Treatment*humidity*time (a - 1)(b - 1)(c - 1) = 8
Error2 By difference = 17
Total (a × b × c × r - 1) - 1 = 3 × 3 × 3 × 2 - 1 - 1 = 52
Note: Here, 1 degree of freedom was subtracted from the total observations of the experiment since
there is a missing observation
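The degrees of freedom in Table 9.41 can be verified with a few lines of arithmetic. The Python sketch below, with a hypothetical function name, rebuilds each line for a = b = c = 3 levels, r = 2 repetitions, and one missing observation, obtaining Error2 by difference.

```python
def repeated_measures_df(a, b, c, r, missing=0):
    """Degrees of freedom for an a x b factorial CRD with c repeated measurements."""
    total = a * b * c * r - 1 - missing
    df = {
        "Treatment": a - 1,
        "Humidity": b - 1,
        "Treatment*humidity": (a - 1) * (b - 1),
        "Error1": a * b * (r - 1),
        "Time": c - 1,
        "Treatment*time": (a - 1) * (c - 1),
        "Humidity*time": (b - 1) * (c - 1),
        "Treatment*humidity*time": (a - 1) * (b - 1) * (c - 1),
    }
    # Error2 is whatever remains after all other sources are accounted for.
    df["Error2"] = total - sum(df.values())
    df["Total"] = total
    return df


df_table = repeated_measures_df(a=3, b=3, c=3, r=2, missing=1)
for source, value in df_table.items():
    print(f"{source:<26}{value}")
```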
$$\eta_{ijkl} = \theta + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \alpha\beta(r)_{ij(l)} + \tau_k + (\alpha\tau)_{ik} + (\beta\tau)_{jk} + (\alpha\beta\tau)_{ijk}$$
i = 1, 2, 3, j = 1, 2, 3, k = 1, 2, 3, l = 1, 2
where $\theta$ is the intercept, $\alpha_i$ is the fixed effect of the density factor, $\beta_j$ is the fixed effect
of the humidity factor, $(\alpha\beta)_{ij}$ is the effect of the interaction between density and
humidity, $\alpha\beta(r)_{ij(l)}$ is the random effect of the interaction density × humidity ×
repetition, assuming $\alpha\beta(r)_{ij(l)} \sim N(0, \sigma^2_{\alpha\beta(r)})$, $\tau_k$ is the fixed effect of measurement time,
$(\alpha\tau)_{ik}$ is the fixed effect of the interaction between density and measurement time,
$(\beta\tau)_{jk}$ is the fixed effect of the interaction between humidity and measurement time,
and $(\alpha\beta\tau)_{ijk}$ is the fixed effect of the interaction of density × humidity × time. The
link function is defined by $\text{logit}(\pi_{ijkl}) = \eta_{ijkl}$.
The following SAS GLIMMIX syntax fits a repeated measures GLMM with a
beta distribution.
Part of the results is listed below. The fit statistics (AIC and AICC) in Table 9.42
part (a) indicate that a Toeplitz covariance structure of order 1 provides the best fit to
the data.
The type III tests of fixed effects in Table 9.43 indicate that soil density
(P = 0.0021), humidity (P = 0.0001), the evolution of emission over time
(P = 0.0001), and the interaction between moisture and time of measurement
(P = 0.0001) are statistically significant.
9.10 Joint Model for Binary and Poisson Data 409
Table 9.42 Fit statistics of a beta GLMM with a factorial structure of treatments under different
covariance structures
(a) Fit statistics CS AR(1) Toep(1) UN
-2 Log likelihood -413.74 -413.72 -413.72 No converge
AIC (smaller is better) -353.74 -353.72 -355.72
AICC (smaller is better) -269.19 -269.18 -280.07
BIC (smaller is better) -392.94 -392.93 -393.62
CAIC (smaller is better) -362.94 -362.93 -364.62
HQIC (smaller is better) -435.73 -435.71 -434.98
(b) Fit statistics for conditional distribution CS AR(1) Toep(1) UN
-2 Log L (y | r. effects) -413.74 -413.72 -413.72 No converge
Pearson’s chi-square 64.60 64.65 64.65
Pearson’s chi-square/DF 1.22 1.22 1.22
The least squares means obtained with the “lsmeans” statement are shown on the model
scale under the “Estimate” column and on the data scale under the “Mean” column
of Table 9.44.
Table 9.44 Means and standard errors and comparison of means (least significant difference
(LSD)) on the model scale (Estimate) and data scale (Mean)
(a) Density*humidity least squares means
Density  Humidity  Estimate  Standard error  DF  t-value  Pr > |t|  Mean  Standard error of mean
1.1 0.1 -4.5961 0.1614 9 -28.48 <0.0001 0.009990 0.001596
1.1 0.2 -3.1829 0.07606 9 -41.85 <0.0001 0.03981 0.002908
1.1 0.24 -3.3152 0.08060 9 -41.13 <0.0001 0.03505 0.002726
1.4 0.1 -4.4567 0.1450 9 -30.74 <0.0001 0.01147 0.001643
1.4 0.2 -3.3798 0.08333 9 -40.56 <0.0001 0.03293 0.002654
1.4 0.24 -3.5363 0.08932 9 -39.59 <0.0001 0.02830 0.002456
1.6 0.1 -4.7890 0.1809 9 -26.47 <0.0001 0.008252 0.001481
1.6 0.2 -3.5453 0.08972 9 -39.52 <0.0001 0.02805 0.002446
1.6 0.24 -4.3213 0.1440 9 -30.00 <0.0001 0.01311 0.001863
(b) T grouping of density*humidity least squares means (α = 0.05)
LS means with the same letter are not significantly different
Density Humidity Estimate
1.1 0.20 -3.1829 A
1.1 0.24 -3.3152 B A
1.4 0.20 -3.3798 B A
1.4 0.24 -3.5363 B
1.6 0.20 -3.5453 B
1.6 0.24 -4.3213 C
1.4 0.10 -4.4567 C
1.1 0.10 -4.5961 C
1.6 0.10 -4.7890 C
measures the length of hospital stay after the surgery (in days). The binary variable
“OKstatus” is a regressor variable that distinguishes patients according to their
postoperative physical status (“1” implies better status), and the variable age is the
age of the patient.
These data can be modeled with a separate logistic model for the binary outcome
and with a Poisson model for the count outcome. Such separate analyses would not
take into account the correlation between the two response variables. It is reasonable
to assume that the duration of post-surgery hospitalization is correlated with, and
depends on, whether the patient requires intensive care.
In the following analysis, the correlation between the two types of response
variables for a patient is modeled with shared random effects (G-side). The dataset
variable “dist” identifies the distribution for each observation. For those observations
that follow a binary distribution, the response variable option (event='1')
determines which value of the binary variable is modeled as the event of interest.
Since no “link” option is specified, the link is also chosen on an
observation-by-observation basis as the default link for the respective distribution. The
following GLIMMIX statements fit this dataset with two distributions:
data Poi_Bin;
   length dist $7;
   /* d = 'B' marks a binary record, 'P' a Poisson record; @@ reads several records per line */
   input d $ patient age OKstatus response @@;
   if d = 'B' then dist = 'Binary';
   else dist = 'Poisson';
   datalines;
B 1 78 1 0 P 1 78 1 9 B 2 60 1 0 P 2 60 1 4
B 3 68 1 1 P 3 68 1 7 B 4 62 0 1 P 4 62 0 35
.................................
......................................
...................................
....................................
.....................................
..................................
B 29 54 1 0 P 29 54 1 2 B 30 43 1 1 P 30 43 1 3
B 31 4 1 1 P 31 4 1 3 B 32 52 1 1 P 32 52 1 8
;
/* Joint model: distribution and default link are chosen per observation from dist */
proc glimmix data=Poi_Bin;
   class patient dist;
   model response(event='1') = dist dist*age dist*OKstatus /
         noint s dist=byobs(dist);
   /* Shared patient-level random intercept induces correlation between the two responses */
   random int / subject=patient;
   lsmeans dist / lines ilink;
run;
Some of the output is shown below. Table 9.46 (“Model information”) shows that
the distribution of the data is multivariate and that possibly multiple link functions
are involved; by default, PROC GLIMMIX uses a logit link for the binary observations
and a log link for the Poisson data.
Table 9.47 shows the value of the fit statistic generalized chi-square/DF = 0.90,
which indicates that there is no overdispersion, and also shows the
estimated variance component due to patient, $\hat{\sigma}^2_{patient} = 0.299$. The fixed
effects tests for the effects of age and status are shown in part (c).
In addition to the above results, the maximum likelihood estimators of the
intercepts, as well as the values of the slopes of each of the variables of both
probability distributions, are tabulated in Table 9.48.
Thus, to calculate the probability that a patient will experience a routine recovery,
the following expression is used:
$$\hat{\pi} = \frac{1}{1 + \exp\{-\beta_0 - \beta_1 \times age - \beta_2 \times okstatus\}} = \frac{1}{1 + \exp\{-5.7783 + 0.07572 \times age + 0.4697 \times okstatus\}}$$
whereas the following expression is used to calculate the average value of the length
of hospital stay after the surgery (in days):
$$\hat{\lambda} = \exp\{\alpha_0 + \alpha_1 \times age + \alpha_2 \times okstatus\} = \exp\{0.8410 + 0.01875 \times age - 0.1856 \times okstatus\}$$
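The two expressions above can be evaluated directly for any patient profile. The Python sketch below uses the reported coefficients and sets the patient random effect to zero, so the results describe a typical patient; the function names and the example profile (age 60, OKstatus = 1) are assumptions made for illustration.

```python
import math

# Coefficients taken from the two fitted expressions (intercept, age, okstatus).
BINARY = {"intercept": 5.7783, "age": -0.07572, "okstatus": -0.4697}
POISSON = {"intercept": 0.8410, "age": 0.01875, "okstatus": -0.1856}


def linear_predictor(coef, age, okstatus):
    """Fixed-effects part of the linear predictor (random patient effect set to 0)."""
    return coef["intercept"] + coef["age"] * age + coef["okstatus"] * okstatus


def predicted_probability(age, okstatus):
    """Inverse logit link applied to the binary linear predictor."""
    eta = linear_predictor(BINARY, age, okstatus)
    return 1.0 / (1.0 + math.exp(-eta))


def expected_stay(age, okstatus):
    """Inverse log link applied to the Poisson linear predictor (days)."""
    return math.exp(linear_predictor(POISSON, age, okstatus))


# Hypothetical patient profile used only as an example.
p = predicted_probability(age=60, okstatus=1)
stay = expected_stay(age=60, okstatus=1)
print(f"probability = {p:.4f}, expected stay = {stay:.2f} days")
```

Because both parts share the same patient-level random intercept in the fitted model, patient-specific predictions would add the same predicted random effect to both linear predictors before applying the inverse links.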
9.11 Exercises
Table 9.49 Results of a repeated measures experiment with an ordinal response variable
                 Week 0                Week 4                Week 12
Response         Placebo  Trt1  Trt2   Placebo  Trt1  Trt2   Placebo  Trt1  Trt2
Bad              60       59    54     14       5     3      13       10    7
Without change   7        6     13     34       33    38     25       17    21
Better           0        0     0      15       22    17     17       28    21
The following table shows the data from an experiment in which each cell
contains the number of animals in a given treatment × week × response category
combination (Table 9.49).
(a) List all the components of the repeated measures under a multinomial GLMM.
(b) Study and choose the best covariance structure that models this dataset. Cite the
most relevant results.
(c) Fit the multinomial cumulative logit model to these data. Perform a complete and
appropriate analysis of the data, focusing on:
(i) An evaluation of the effects of the combination of treatments
(ii) Odds ratio interpretation
(iii) The expected probability per category for each treatment
(d) Test whether the proportional odds assumption is viable. Cite relevant evidence
to support your conclusion regarding the adequacy of the assumption.
Repeat (b) through (d), assuming a generalized multinomial logit in Exercise
9.11.1. Discuss your results.
Repeat (b) through (d) assuming a multinomial cumulative probit in Exercise
9.11.1. Discuss your results and compare with those found in (1) and (2).
Alternatively, the contingency table approach can be implemented using a
log-linear model. For the previous example, 9.11.1, fit the log-linear model
where λijk is the expected count of the treatment combination ijk by week by response
category and τ, ϖ, and c refer to treatment, week, and response category effects,
respectively.
(c) Test whether the proportional odds assumption is viable. Cite relevant evidence
to support your conclusion regarding the adequacy of the assumption.
Appendix
A T T F pH HE TE tx tm Hx Hm A T t F pH HE TE tx tm Hx Hm
3 1 1 108 8 85 20 17 16 69 68 2 1 1 4.13 6.9 87 20 19 19 67 66
2 2 1 15 4.7 86 20 17 16 69 68 3 2 1 82.4 7 87 22 19 19 67 66
1 3 1 33 5.5 85 20 17 16 69 68 4 3 1 51.4 6.4 87 20 19 19 67 66
4 4 1 23 6.4 84 21 17 16 69 68 1 4 1 704 95 20 19 19 67 66 170
3 1 2 -58 8 85 23 34 34 54 51 3 2 2 14 7 87 22 27 27 68 67
2 2 2 -45 4.7 86 24 34 34 54 51 4 3 2 130 6.4 87 24 27 27 68 67
1 3 2 -97 5.5 85 29 34 34 54 51 1 4 2 537 95 24 27 27 68 67 170
A T T F pH HE TE tx tm Hx Hm A T t F pH HE TE tx tm Hx Hm
2 4 3 1.2 7.2 84 31 29 25 51 44 3 4 5 40.8 6.7 85 24 26 25 43 31
1 1 4 28 35 160 161 4 2 2 4 4 1 1 37.5 7.2 89 22 18 17 61 48
3 3 4 81 5.8 85 27 28 27 35 35 1 2 1 88.4 8.1 88 21 18 17 61 48
2 4 4 17 7.2 84 29 28 27 35 35 2 3 1 63.4 7.6 90 22 18 17 61 48
1 1 5 26 39 160 161 4 2 2 5 3 4 1 -72 7.4 85 22 18 17 61 48
3 3 5 99 5.8 85 20 26 26 41 39 4 1 5 -83 7.2 89 26 29 27 48 45
2 4 5 31 7.2 84 20 26 26 41 39 1 2 5 95.1 8.1 88 26 30 28 49 46
Data: Percentage inhibition (Bio bioassay, Con concentration, Rep repetition, Por percentage
inhibition)
Bio Day Con Rep Por Bio Day Con Rep Por
1 1 0 3 5.2632 1 6 2000 4 35.1724
1 1 0 4 5.2632 2 1 0 2 0.0016
1 1 500 1 15.7895 2 1 0 3 14.2857
1 1 500 2 26.3158 2 1 500 1 42.8571
1 1 500 3 15.7895 2 1 500 2 42.8571
1 1 500 4 15.7895 2 1 500 3 42.8571
1 1 1000 1 36.8421 2 1 500 4 42.8571
1 1 1000 2 36.8421 2 1 1000 1 7.1429
1 1 1000 3 36.8421 2 1 1000 2 42.8571
1 1 1000 4 36.8421 2 1 1000 3 42.8571
1 1 2000 1 15.7895 2 1 1000 4 42.8571
1 1 2000 2 36.8421 2 2 0 1 1.3699
1 1 2000 3 36.8421 2 2 0 2 1.3699
1 1 2000 4 36.8421 2 2 0 4 1.3699
1 2 0 2 1.9355 2 2 500 1 34.2466
1 2 0 3 4.5161 2 2 500 2 31.5068
1 2 0 4 1.9355 2 2 500 3 42.4658
1 2 500 1 43.2258 2 2 500 4 36.9863
1 2 500 2 48.3871 2 2 1000 1 34.2466
1 2 500 3 40.6452 2 2 1000 2 47.9452
1 2 500 4 40.6452 2 2 1000 3 45.2055
1 2 1000 1 35.4839 2 2 1000 4 45.2055
1 2 1000 2 45.8065 2 2 2000 1 47.9452
1 2 1000 3 43.2258 2 2 2000 2 53.4247
1 2 1000 4 32.9032 2 2 2000 3 50.6849
1 2 2000 1 58.7097 2 2 2000 4 56.1644
1 2 2000 2 53.5484 2 3 0 1 4.2735
1 2 2000 3 53.5484 2 3 0 4 14.5299
1 2 2000 4 58.7097 2 3 500 1 28.2051
1 3 0 2 1.2346 2 3 500 2 28.2051
1 3 0 3 3.7037 2 3 500 3 35.0427
1 3 500 1 25.9259 2 3 500 4 24.7863
1 3 500 2 23.4568 2 3 1000 1 24.7863
1 3 500 3 23.4568 2 3 1000 2 35.0427
1 3 500 4 24.6914 2 3 1000 3 24.7863
1 3 1000 1 30.8642 2 3 1000 4 26.4957
1 3 1000 2 32.0988 2 3 2000 1 40.1709
1 3 1000 3 28.3951 2 3 2000 2 38.4615
1 3 1000 4 25.9259 2 3 2000 3 47.0085
1 3 2000 1 53.0864 2 3 2000 4 41.8803
1 3 2000 2 49.3827 2 4 0 2 1.5015
1 3 2000 3 49.3827 2 4 0 3 1.5015
1 3 2000 4 51.8519 2 4 0 4 1.5015
1 4 0 3 4.6729 2 4 500 1 20.7207
1 4 500 1 19.6262 2 4 500 2 23.1231
1 4 500 2 20.5607 2 4 500 3 27.9279
1 4 500 3 22.4299 2 4 500 4 20.7207
1 4 500 4 20.5607 2 4 1000 1 35.1351
1 4 1000 1 21.4953 2 4 1000 2 26.7267
1 4 1000 2 21.4953 2 4 1000 3 26.7267
1 4 1000 3 23.3645 2 4 1000 4 32.7327
1 4 1000 4 20.5607 2 4 2000 1 33.9339
1 4 2000 1 42.0561 2 4 2000 2 37.5375
1 4 2000 2 36.4486 2 4 2000 3 44.7447
1 4 2000 3 32.7103 2 4 2000 4 38.7387
1 4 2000 4 40.1869 2 5 0 2 2.008
1 5 0 3 4.065 2 5 0 4 0.4016
1 5 0 4 4.065 2 5 500 1 13.253
1 5 500 1 21.1382 2 5 500 2 21.2851
1 5 500 2 24.3902 2 5 500 3 21.2851
1 5 500 3 17.0732 2 5 500 4 18.0723
1 5 500 4 17.0732 2 5 1000 1 21.2851
1 5 1000 1 18.6992 2 5 1000 2 18.0723
1 5 1000 2 18.6992 2 5 1000 3 16.4659
1 5 1000 3 20.3252 2 5 1000 4 16.4659
1 5 1000 4 17.8862 2 5 2000 1 35.743
1 5 2000 1 41.4634 2 5 2000 2 34.1365
1 5 2000 2 38.2114 2 5 2000 3 29.3173
1 5 2000 3 34.1463 2 5 2000 4 30.9237
1 5 2000 4 33.3333 2 6 0 2 4.2159
1 6 0 3 4.8276 2 6 0 4 0.1686
1 6 0 4 2.069 2 6 500 1 18.3811
1 6 500 1 17.2414 2 6 500 2 20.4047
1 6 500 2 18.6207 2 6 500 3 22.4283
1 6 500 3 16.5517 2 6 500 4 20.4047
1 6 500 4 13.7931 2 6 1000 1 21.0793
1 6 1000 1 15.8621 2 6 1000 2 17.7066
1 6 1000 2 16.5517 2 6 1000 3 17.7066
1 6 1000 3 15.8621 2 6 1000 4 20.4047
1 6 1000 4 18.6207 2 6 2000 1 31.1973
1 6 2000 1 32.4138 2 6 2000 2 29.1737
1 6 2000 2 29.6552 2 6 2000 3 29.8482
1 6 2000 3 31.7241 2 6 2000 4 30.5228
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate
credit to the original author(s) and the source, provide a link to the Creative Commons license and
indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative
Commons license, unless indicated otherwise in a credit line to the material. If material is not
included in the chapter's Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder.
References
Agresti A (2013) Introduction to categorical data analysis, 3rd edn. Wiley, Hoboken
Aitchison J, Silvey S (1957) The generalization of probit analysis to the case of multiple responses.
Biometrika 44:131–140
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In:
Petrov BN, Csáki F (eds) Second international symposium on information theory. Akademiai
kiado, Budapest, pp 267–281
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automat Control 19:
716–723
Amoah S, Wilkinson M, Dunwell J, King GJ (2008) Understanding the relationship between DNA
methylation and phenotypic plasticity in crop plants. Comp Biochem Physiol 150(Suppl 1):
S145
Bekele A, Bultosa G, Belete K (2012) The effect of germination time on malt quality of six sorghum
(sorghum bicolor) varieties grown at Melkassa, Ethiopia. J Brew 118(1):76–81
Bilgili S, Hess J, Blake J, Macklin K, Saenmahayak B, Sibley J (2009) Influence of bedding
material on footpad dermatitis in broiler chickens. J Appl Poult Res 18(3):583–589
Bliss CI (1934) The method of probits. Science 79:38–39
Bliss CI (1935) The calculation of the dosage-mortality curve. Ann Appl Biol 22(1):134–167
Breslow NE (2004) Whither PQL? In: Lin DY, Heagerty PJ (eds) Proceedings of the second Seattle
symposium in biostatistics: analysis of correlated data. Springer, pp 1–22
Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am
Stat Assoc 88:9–25
Casella G, Berger RL (2002) Statistical inference, 2nd edn. Duxbury, Pacific Grove
Collett D (2002) Modelling binary data, 2nd edn. Chapman & Hall/CRC Press, Boca Raton. 387 pp.
De Jong IC, Guémené D (2012) Major welfare issues in broiler breeders. Worlds Poult Sci J 67:73–
82
Engel B, te Brake J (1993) Analysis of embryonic development with a model for under-or
overdispersion relative binomial variation. Biometrics 49:269–279
Fisher RA (1925) Statistical methods for research workers. Oliver and Boyd, Edinburgh
Garcia RG, Almeida PICL, Caldara FR, Naas IA, Pereira DF et al (2010) Effect of litter material on
water quality in broiler production. Braz J Poult Sci 12:165–169
Gbur EE, Stroup WW, McCarter KS, Durham S, Young LJ, Christman M, West M, Kramer M
(2012) Analysis of generalized linear mixed models in the agricultural and natural resources
sciences. ASA, CSSA, SSSA, Madison
Gilks WR et al (1996) Introducing Markov chain Monte Carlo. In: Gilks WR (ed) Markov chain
Monte Carlo in practice. Chapman and Hall, pp 1–19
Goldstein H, Rasbash J (1996) Improved approximations for multilevel models with binary
responses. J R Stat Soc Ser A Stat Soc 159:505–513
Hand DJ, Daly F, Lunn AD, McConway KJ, Ostrowski E (1994) A handbook of small data sets.
Chapman and Hall, London
Heindel J, Price C, Field E, Marr M, Myers C, Morrissey R, Schwetz B (1992) Developmental
toxicity of boric acid in mice and rats. Toxicol Sci 18(2):266–277
Henderson CR (1950) Estimation of genetic parameters. Ann Math Stat 21:309–310
Henderson CR (1984) Applications of linear models in animal breeding. University of Guelph,
Guelph
Hosmer DW, Lemeshow S (2000) Applied logistic regression, 2nd edn. Wiley, New York
Immer RF, Hayes HK, Powers LR (1934) Statistical determination of barley varietal adaptation. J
Am Soc Agron 26:403–419
Jermann R, Toumiat M, Imfeld D (2001) Development of an in vitro efficacy test for self-tanning
formulations. Int J Cosmet Sci 24(1):35–42
Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2. Wiley,
New York
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat
Assoc 53(282):457–481
Lee Y, Nelder JA (2001) Hierarchical generalised linear models: a synthesis of generalised linear
models, random-effect models and structured dispersions. Biometrika 88:987–1006
Lee Y, Nelder JA (2004) Conditional and marginal models: another view. Stat Sci 19(2):
219–238
Lew M (2007) Good statistical practice in pharmacology. Br J Pharmacol 152(3):299–303
Littell RC, Milliken GA, Stroup WW, Wolfinger RD (1996) SAS for mixed models. SAS Institute,
Inc., Cary
Littell RC et al (2006) SAS for Mixed Models, 2nd edn. SAS Publishing
Limpert E, Stahel WA, Abbt M (2001) Log-normal Distributions across the Sciences: Keys and
Clues. BioScience 51(5)
Logan M (2010) Biostatistical design and analysis using R: a practical guide. Wiley
Madden L, Hughes G (1995) Plant disease incidence: distributions, heterogeneity, and temporal
analysis. Annu Rev Phytopathol 33:529–564
Margolin BH, Kaplan N, Zeiger E (1981) Statistical analysis of the Ames Salmonella/microsome
test. Proc Natl Acad Sci USA 76:3779–3783
Martrenchar A, Boilletot E, Huonnic D, Pol F (2002) Risk factors for foot-pad dermatitis in chicken
and Turkey broilers in France. Prev Vet Med 52(3–4):213–226
McCullagh P (1980) Regression models for ordinal data. J R Stat Soc Series B Methodol 42:109–
142
McCullagh P (1983) Quasi-likelihood functions. Ann Stat 11(1):59–67
McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman and Hall, London
Mead J, Curnow R, Hasted A (1993) Statistical methods in agriculture and experimental biology,
2nd edn. Chapman and Hall, London, p 325
Mosteller F, Tukey JW (1977) Data analysis and regression. Addison-Wesley, Reading
Myers R, Montgomery D, Vining G (2002) Generalized linear models with applications in
engineering and the sciences. Wiley, New York
Hernández-Tapia N, Salinas-Ruiz J, Saynes-Santillán V, Ayala-Rodríguez JM, Hernández-Rosas F,
Velasco-Velasco J (2019) N2O, CO2 and NH3 emission from dung of bovine with different
percentage of crude protein in diet. Rev Int Contam Ambie 35(3):597–608.
DOI: 10.20937/RICA.2019.35.03.07
Nelder JA, Wedderburn RWM (1972) Generalized linear models. J R Stat Soc Ser A 135:370–384
Pinheiro JC, Bates DM (2000) Mixed-effects models in S and SPLUS. Springer
Pinheiro JC, Chao EC (2006) Efficient Laplacian and adaptive Gaussian quadrature algorithms for
multilevel generalized linear mixed models. J Comput Graph Stat 15:58–81
Quinn GP, Keough MJ (2002) Experimental design and data analysis for biologists. Cambridge
University Press, New York
Raudenbush SW et al (2000) Maximum likelihood for generalized linear models with nested
random effects via high-order, multivariate Laplace approximation. J Comput Graph Stat 9:
141–157
Robinson GK (1991) That BLUP is a good thing: The estimation of random effects (with
discussion), Statist Sci 6:15–51
Rodriguez G, Goldman N (2001) Improved estimation procedures for multilevel models with
binary response: a case-study. J R Stat Soc Ser A Stat Soc 164:339–355
Schall R (1991) Estimation in generalized linear models with random effects. Biometrika 78:719–
727
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Shinozaki K, Kira T (1956) Intraspecific competition among higher plants. VII. Logistic theory of
the C-D effect. J Inst Polytech 12:69–82
Smith T, Mlambo V, Sikosana JLN, Maphosa V, Mueller-Harvey I, Owen E (2005) Dichrostachys
cinerea and Acacia nilotica fruits as dry season feed supplements for goats in a semi-arid
environment. Anim Feed Sci Technol 122(1–2):149–157
Spurgeon J (1978) The correlation of animal response data with the yields of selected thermal
decomposition products for typical aircraft interior materials. U.S. D.O.T. Report No. FAA-RD-
78-131
Stanley VG (1981) The effect of stocking density on commercial broiler performance. Poult Sci 60:
1737–1738
Stephens PA et al (2005) Information theory and hypothesis testing: a call for pluralism. J Appl Ecol
42:4–12
Stroup W (2012) Generalized linear mixed models, 1st edn. Chapman & Hall/CRC
Stroup W (2013) Generalized linear mixed models. CRC Press, Boca Raton
Taira K, Nagai T, Obi T, Takase K (2014) Effect of litter moisture on the development of footpad
dermatitis in broiler chickens. J Vet Med Sci 76:583–586
Wagner JG, Aghajanian GK, Bing OHL (1968) Correlation of performance test scores with "tissue
concentration" of lysergic acid diethylamide in human subjects. Clin Pharmacol Ther 9(5):
635–638
Wallsten TS, Budescu DV (1981) Adaptivity and nonadditivity in judging MMPI profiles. J Exp
Psychol Hum Percept Perform 7:1096–1109
Walters KJ, Hosfield GL, Uebersax MA, Kelly JD (1997) Navy bean canning quality: correlations,
heritability estimates and randomly amplified polymorphic DNA markers associated with
component traits. J Am Soc Hortic Sci 122(3):338–343
Wolfinger R, O'Connell M (1993) Generalized linear mixed models: a pseudo-likelihood approach.
J Stat Comput Simul 48:233–243
Yates F (1935) Complex experiments, with discussion. J R Stat Soc Ser B 2:181–223
Zeger SL, Liang K-Y, Albert PS (1988) Models for longitudinal data: a generalized estimating
equation approach. Biometrics 44(4):1049
Zuur AF, Ieno EN, Walker N, Saveliev AA, Smith GM (2009) Mixed effects models and
extensions in ecology with R. Springer, New York
Zuur AF, Hilbe JM, Ieno EN (2013) A beginner's guide to GLM and GLMM with R: a frequentist
and Bayesian perspective for ecologists. Highland Statistics Ltd., Newburgh