Essence of Multivariate Thinking - Basic Themes and Methods
MULTIVARIATE THINKING
Basic Themes and Methods
Multivariate Applications Series
Sponsored by the Society of Multivariate Experimental Psychology, the goal of this series is
to apply complex statistical methods to significant social or behavioral issues, in such a way
as to be accessible to a nontechnically oriented readership (e.g., nonmethodological
researchers, teachers, students, government personnel, practitioners, and other professionals).
Applications from a variety of disciplines, such as psychology, public health, sociology,
education, and business, are welcome. Books can be single- or multiple-authored, or edited
volumes that: (1) demonstrate the application of a variety of multivariate methods to a
single, major area of research; (2) describe a multivariate procedure or framework that
could be applied to a number of research areas; or (3) present a variety of perspectives on
a controversial topic of interest to applied multivariate researchers.
There are currently nine books in the series.
Anyone wishing to submit a book proposal should send the following: (1) author/title,
(2) timeline including completion date, (3) brief overview of the book's focus, including
table of contents, and ideally a sample chapter (or more), (4) a brief description of competing
publications, and (5) targeted audiences.
For more information please contact the series editor, Lisa Harlow, at: Department of
Psychology, University of Rhode Island, 10 Chafee Road, Suite 8, Kingston, RI 02881-0808;
Phone: (401) 874-4242; Fax: (401) 874-5562; or e-mail: LHarlow@uri.edu. Information
may also be obtained from members of the advisory board: Leona Aiken (Arizona State
University), Gwyneth Boodoo (Educational Testing Service), Barbara M. Byrne (University
of Ottawa), Patrick Curran (University of North Carolina), Scott E. Maxwell (University of
Notre Dame), David Rindskopf (City University of New York), Liora Schmelkin (Hofstra
University) and Stephen West (Arizona State University).
THE ESSENCE OF
MULTIVARIATE THINKING
Basic Themes and Methods
Lisa L. Harlow
University of Rhode Island
This book was typeset in 10/12 pt. Times, Italic, Bold, and Bold Italic. The heads were typeset in
Americana, Americana Italic, and Americana Bold.
QA278.H349 2005
519.5'35—dc22 2004028095
Disclaimer:
This eBook does not include the ancillary media that was
packaged with the original printed version of the book.
In memory of
Jacob Cohen
Contents
I: OVERVIEW
1 Introduction 3
What is Multivariate Thinking? 3
Benefits 4
Drawbacks 6
Context for Multivariate Thinking 7
2 Multivariate Themes 10
Overriding Theme of Multiplicity 10
Theory 11
Hypotheses 11
Empirical Studies 12
Measurement 12
Multiple Time Points 13
Multiple Controls 13
Multiple Samples 14
Practical Implications 15
Multiple Statistical Methods 15
Summary of Multiplicity Theme 17
Central Themes 17
Variance 18
Covariance 18
Ratio of (Co-)Variances 18
Linear Combinations 19
Components 19
Factors 20
Summary of Central Themes 20
Interpretation Themes 21
Macro-Assessment 21
Significance Test 21
Effect Sizes 22
Micro-Assessment 23
Means 23
Weights 24
Summary of Interpretation Themes 25
Summary of Multivariate Themes 25
3 Background Themes 28
Preliminary Considerations before Multivariate Analyses 28
Data 28
Measurement Scales 29
Roles of Variables 30
Incomplete Information 31
Missing Data 32
Descriptive Statistics 33
Inferential Statistics 34
Roles of Variables and Choice of Methods 35
Summary of Background Themes 36
Questions to Help Apply Themes to Multivariate Methods 37
III: MATRICES
6 Matrices and Multivariate Methods 85
Themes Applied to Matrices
What Are Matrices and How Are They Similar to and Different
from Other Tools? 85
What Kinds of Matrices Are Commonly Used with
Multivariate Methods? 86
VI: SUMMARY
12 Integration of Multivariate Methods 221
Themes Applied to Multivariate Methods
What Are the Multivariate Methods and How Are They Similar
and Different? 221
When are Multivariate Methods used and What Research
Questions Can They Address? 222
What Are the Main Multiplicity Themes for Methods? 223
What Are the Main Background Themes Applied to Methods? 225
What Are the Statistical Models That Are Tested
with Multivariate Methods? 227
How Do Central Themes of Variance, Covariance, and Linear
Combinations Apply to Multivariate Methods? 227
What Are the Main Themes Needed to Interpret Multivariate
Results at a Macro-Level? 229
Figures
4.1. Depiction of standard MR with three predictors, where the lines
connecting the three IVs depict correlations among predictors
and the arrow headed toward the outcome variable represents
prediction error. 44
4.2. MR with 4 Xs and 1 Y showing significant R2 shared variance,
F (4, 522) = 52.28, p < 0.001, and significant standardized
regression coefficients. Lines connecting the four IVs depict
correlations among predictors and the arrow headed toward the
outcome variable represents prediction error. 60
5.1. ANCOVA with IV = STAGEA, covariate = CONSA, and
DV = CONSB. 78
7.1. Depiction of Follow-up ANOVA Results in the MANOVA
Example with IV = STAGEA and DVs = PSYSXB, PROSB,
CONSB, and CONSEFFB. NS = No Significant Differences;
*** p < 0.001. 122
8.1. DFA with 4 IVs and 1 DV showing significant R2 (= 0.30) shared
variance, F(16, 1586) = 13.52, p < 0.0001, with discriminant
loadings for 1st function (V1). 146
8.2. Plot of group centroids for first two discriminant functions. 148
9.1. LR predicting five-stage DV with odds ratios provided. 163
9.2. LR predicting contemplation versus precontemplation with odds
ratios provided. 166
9.3. LR predicting preparation versus precontemplation with odds
ratios provided. 169
9.4. LR predicting action versus precontemplation with odds ratios
provided. 170
9.5. LR predicting maintenance versus precontemplation with odds
ratios provided. 171
10.1. CC with 3 Xs and 2 Ys, with each X linked to the 2
canonical variates, V1 and V2; and each Y linked to the 2 Ws.
Tables
1.1. Summary of the Definition, Benefits, Drawbacks, and Context
for Multivariate Methods 8
2.1. Summary of Multivariate Themes 17
3.1. Summary of Background Themes to Consider for Multivariate
Methods 36
3.2. Questions to Ask for Each Multivariate Method 39
4.1. Descriptive Statistics for 4 IVs and the DV, Stage
of Condom Use 51
4.2. Coefficient Alpha and Test-Retest Reliability Coefficients 52
4.3. Correlation Coefficients within Time B, N = 527 52
4.4. Summary of Macro-Level Standard MR Output 53
4.5. Summary of Micro-Level Standard MR Output 53
4.6. Step 1 of Macro-Level Hierarchical MR Output 54
4.7. Step 1 of Micro-Level Hierarchical MR Output 55
4.8. Step 2 of Macro-Level Hierarchical MR Output 55
4.9. Step 2 of Micro-Level Hierarchical MR Output 55
4.10. Step 3 of Macro-Level Hierarchical MR Output 56
4.11. Step 3 of Micro-Level Hierarchical MR Output 56
7.18. Least-Squares Means for the Four DVs over the Five Stages
of the IV 126
7.19. Multiplicity, Background, Central, and Interpretation Themes
Applied to MANOVA 126
8.1. Macro-Level Results for the Follow-up DFA 138
8.2. Mid-Level Results for the Follow-up DFA 139
8.3. Micro-Level Discriminant Loadings for the Follow-up DFA 139
8.4. Micro-Level Unstandardized Discriminant Weights for the
Follow-up DFA 140
8.5. Group Centroids for the Follow-up DFA Discriminant
Functions 141
8.6. Individual Classification Results for the Follow-up DFA 141
8.7. Classification Table for Actual and Predicted Stages in the
Follow-up DFA 142
8.8. Descriptive Frequencies for Stand-Alone DFA Example 143
8.9. Descriptive Means, SDs, Range, Skewness, and Kurtosis for
Stand-Alone DFA 143
8.10. Pearson Correlation Coefficients (N = 527), Prob > |r| under
H0: Rho = 0 144
8.11. Macro-Level Results for Stand-Alone DFA 144
8.12. Mid-Level Results for Stand-Alone DFA 145
8.13. Micro-Level Discriminant Loadings for the Stand-Alone DFA 146
8.14. Micro-Level Unstandardized Results 147
8.15. Group Centroids for Stand-Alone DFA Discriminant Functions 147
8.16. Individual Classification Results for Stand-Alone DFA 148
8.17. Classification Table for Actual and Predicted Stages in
Stand-Alone DFA 149
8.18. Multiplicity, Background, Central, and Interpretation Themes
Applied to DFA 150
9.1. Frequencies for STAGEB for LR Example 160
9.2. Initial Test of Odds Assumption for Five-Stage DV 161
9.3. Macro-Level LR Results for Five-Stage DV 161
9.4. Macro-Level Indices for LR with Five-Stage DV 162
9.5. Micro-Level LR Results for Five-Stage DV 162
9.6. Micro-Level Odds Ratio Estimates for LR with Five-Stage DV 163
9.7. Frequencies for STAGE2B for LR Example
(DV: 1 = Contemplation vs. 0 = Precontemplation) 164
9.8. Macro-Level LR Results for STAGE2B Example
(DV: 1 = Contemplation vs. 0 = Precontemplation) 165
9.9. Macro-Level LR Indices for STAGE2B Example (DV: 1 = Contemplation vs.
0 = Precontemplation) 165
9.10. Micro-Level LR Results for STAGE2B Example (DV: 1 =
Contemplation vs. 0 = Precontemplation) 166
Preface
The current volume was written with a simple goal: to make the topic of mul-
tivariate statistics more accessible and comprehensible to a wide audience. To
encourage a more encompassing cognizance of the nature of multivariate meth-
ods, I suggest basic themes that run through most statistical methodology. I then
show how these themes are applied to several multivariate methods that could be
covered in a statistics course for first-year graduate students or advanced under-
graduates. I hope awareness of these common themes will engender more ease
in understanding the basic concepts integral to multivariate thinking. In keeping
with a conceptual focus, I kept formulas at a minimum so that the book does not
require knowledge of advanced mathematical methods beyond basic algebra and
finite mathematics. There are a number of excellent statistical works that present
greater mathematical and statistical details than the current volume or present other
approaches to multivariate methods. When possible I suggest references to some
of these sources for those individuals who are interested.
Before delineating the content of the chapters, it is important to consider what
prerequisite information would be helpful to have before studying multivariate
methods. I recommend having a preliminary knowledge of basic statistics and
research methods as taught at the undergraduate level in most social science fields.
This foundation would include familiarity with descriptive and inferential statistics,
the concepts and logic of hypothesis testing procedures, and effect sizes. Some
discussion of these topics is provided later in this book, particularly as they relate
to multivariate methods. I invite the reader to review the suggested or similar
material to ensure good preparation at the introductory level, hopefully making an
excursion into multivariate thinking more enjoyable.
CONTENTS
The first three chapters provide an overview of the concepts and approach addressed
in this book. In Chapter 1, I provide an introductory framework for multivariate
thinking and discuss benefits and drawbacks to using multivariate methods before
providing a context for engaging in multivariate research.
and logistic regression that each incorporate a major categorical, grouping variable
(e.g., gender, treatment, qualitative or ordinal outcome).
In Chapters 10 and 11, respectively, I apply the themes to multivariate corre-
lational methods that are used in an exploratory approach: canonical correlation
and a combined focus on principal components analysis and factor analysis.
In Chapter 12, I present an integration of the themes across each of the selected
multivariate methods. This summary includes several charts that list common
themes and how they pertain to each of the methods discussed in this book. I
hope readers will leave with greater awareness and understanding of the essence
of multivariate methods and how they can illuminate our research and ultimately
our thinking.
LEARNING TOOLS
A detailed example is provided for each method to delineate how the multivariate
themes apply and to provide a clear understanding and interpretation of the findings.
Results from statistical analysis software programs are presented in tables that for
the most part mirror sections of the output files.
Supplemental information is provided in the accompanying CD, allowing sev-
eral opportunities for understanding the material presented in each chapter. Data
from 527 women at risk for HIV provide a set of variables, collected over three
time points, to highlight the multivariate methods discussed in this book. The data
were collected as part of a National Institute of Mental Health grant (Principal
investigators L. L. Harlow, K. Quina, and P. J. Morokoff) to predict and prevent
HIV risk in women. The same data set is used throughout the book to provide a
uniform focus for examples. SAS computer program and output files are given
corresponding to the applications in the chapters. This allows readers to verify
how to set up and interpret the analyses delineated in the book. A separate set
of homework exercises and lab guidelines provide additional examples of how to
apply the methods. Instructors and students can work through these when they
want to gain practice applying multivariate methods. Finally, lecture summaries
are presented to illuminate the main points from the chapters.
ACKNOWLEDGMENTS
This book was partially supported by a Fulbright Award while I was at York
University, Toronto, Ontario, Canada; by a National Science Foundation grant on
multidisciplinary learning communities in science and engineering (Co-principal
investigators: Donna Hughes, Lisa Harlow, Faye Boudreaux-Bartels, Bette
Erickson, Joan Peckham, Mercedes Rivero-Herdec, Barbara Silver, Karen Stein,
and Betty Young), and by a National Science Foundation grant on advancing
women in the sciences, technology, engineering and mathematics (principal inves-
tigator: Janett Trubatch).
Thanks are offered to all the students, faculty, and staff at the University of
Rhode Island, York University, and the Cancer Prevention Research Center who
generously offered resources, support, and comments. I am deeply indebted to the
many students I have taught over the years, who have raised meaningful questions
and provided insightful comments to help clarify my thinking.
I owe much to the National Institute of Mental Health for a grant on prediction of
HIV risk in women and to Patricia Morokoff and Kathryn Quina, my collaborators
on the grant. Without the grant and the support of these incredible colleagues, the
data, examples, and analyses in this book would not be possible.
Much recognition is extended to Tara Smith, Kate Cady-Webster, and Ana
Bridges, all of whom served as teaching assistants and/or (co-)instructors of mul-
tivariate courses during the writing of this book. Each of these intelligent and
dedicated women continually inspires me to greater clarity in my thinking. In par-
ticular, Tara helped me immeasurably in developing lab exercises, and Kate helped
with some of the lecture summaries for the chapters. Their help made it possible
for me to include a CD supplement for this text.
I am very grateful to Dale Pijanowski who generously shared her joyous and
positive spirit about my writing at a time when I was not as convinced as she was
that this book would be finished.
I owe many thanks to Barbara Byrne and Keith Markus, who provided detailed
and constructive reviews of several preliminary chapters. Their thoughtful com-
ments went a long way toward improving the book, but any remaining errors are
most certainly my own.
Lawrence Erlbaum Associates—in particular, Debra Riegert and Larry
Erlbaum—deserve my highest praise for unfailing support, encouragement, and a
wealth of expertise. Nicole McClenic also gets a gold star as project manager.
Appreciation is offered to the Society of Multivariate Experimental Psychology
(SMEP) that offers an ongoing forum in which to stay informed and enlightened in
state-of-the-art methodology. I especially want to express my enduring gratitude
for the wisdom that freely flowed and was generously bestowed on all SMEP
members by Jacob (Jack) Cohen, whose memory permeates the hearts and minds
of all of us fortunate enough to have been in his presence, if only much too briefly.
Jack had a no-nonsense style that cut through all fuzziness and vagaries of thinking,
all the while pleasantly illuminating key concepts with such erudite acumen that
no one could leave him feeling uninformed. If ever there were a guru of pivotal
statistical insight, it assuredly would be Jack.
Finally, my heartfelt thanks are extended to my husband, Gary, and daughter,
Rebecca, who are a constant source of support and inspiration to me. Gary was
also instrumental in providing extensive production assistance with formatting the
text, tables, and the accompanying supplements in the CD. I consider myself very
fortunate to have been gifted with my family's functional support as well as their
unyielding tolerance of and encouragement to having me spread the word about
the wonders and marvels of multivariate thinking.
I
Overview
1
Introduction
In much of science and life, we often are trying to understand the underlying
truth in a morass of observable reality. Herbert Simon (1969) states that we are
attempting to find the basic simplicity in the overt complexity of life. Margaret
Wheatley (1994), a social scientist working with organizations, suggests that we
are seeking to uncover the latent order in a system while also recognizing that
"It is hard to open ourselves to a world of inherent orderliness... trusting in the
unfolding dance of order" (1994, p. 23). I would like to argue that the search for
simplicity and latent order could be made much more attainable when approached
with a mindset of multivariate thinking.
Multivariate thinking is defined as a body of thought processes that illuminate
interrelatedness between and within sets of variables. The essence of multivariate
thinking as portrayed in this book proposes to expose the inherent structure and
to uncover the meaning revealed within these sets of variables through application
and interpretation of various multivariate statistical methods with real-world data.
The multivariate methods we examine are a set of tools for analyzing multi-
ple variables in an integrated and powerful way. The methods make it possible to
examine richer and more realistic designs than can be assessed with traditional uni-
variate methods that analyze only one outcome variable and usually just one or two
independent variables. Compared with univariate methods, multivariate methods
allow us to analyze a complex array of variables, providing greater assurance that
we can come to some synthesizing conclusions with less error and more validity
than if we were to analyze variables in isolation.
Multivariate knowledge offers greater flexibility and options for analyses that
extend and enrich other statistical methods of which we have some familiarity.
Ultimately, a study of multivariate thinking and methods encourages coherence
and integration in research that hopefully can motivate policy and practice. A
number of excellent resources exist for those interested in other approaches to
multivariate methods (Cohen, Cohen, West, & Aiken, 2003; Gorsuch, 1999; Harris,
2001; Marcoulides & Hershberger, 1997; Tabachnick & Fidell, 2001).
Having a preliminary understanding of what is meant by multivariate thinking,
it is useful to itemize several benefits and drawbacks to studying multivariate
methods.
BENEFITS
Several benefits can be derived from understanding and using multivariate methods.
b. Second, knowledge of multivariate methods helps us understand others'
research. We can take a more realistic and critical view of others' research
and gain more clarity on the merits of a body of research. Even if we never
choose to conduct our own analyses, knowledge of multivariate methods opens
our eyes to a wider body of research than would be possible with only
univariate methods of study.
c. Third, multivariate thinking helps expand our capabilities by informing ap-
plication to our own research. We are encouraged to consider multiple meth-
ods for our research, and the methods needed to perform research are more
fully understood. An understanding of multivariate methods increases our
ability to evaluate complex, real-world phenomena and encourages ideas on
how to apply rigorous methods to our own research. Widening our lens to
see more and own more information regarding research, we are encouraged
to think in terms that lead to asking deeper, clearer, and richer questions.
With this broadened perspective, we are able to see the connection between
theory and statistical methods and potentially to inform theory development.
Empirically, a background in multivariate methods allows us to crystallize
theory into testable hypotheses and to provide empirical support for our ob-
servations. Thus, it can increase the credibility of our research and help us
add to existing literature by informing an area with our unique input. We
also are offered greater responsibility and are challenged to contribute to
research and scholarly discourse in general, not exclusively in our own area
of interest.
d. Fourth, multivariate thinking enables researchers to examine large sets of
variables in encompassing and integrated analysis, thereby controlling for
overall error rate and also taking correlations among variables into ac-
count. This is preferred to conducting a large number of univariate analyses
that would increase the probability of making an incorrect decision while
falsely assuming that each analysis is orthogonal. More variables also can
be analyzed within a single multivariate test, thereby reducing the risk of
Type I errors (rejecting the null hypothesis too easily), which can be thought
of as liberal, assertive, and exploratory (Mulaik, Raju, & Harshman, 1997).
We also can reduce Type II errors (retaining the null hypothesis too easily),
which may be described as conservative, cautious, and confirmatory (Abel-
son, 1995). Analyzing more variables in a single analysis also minimizes
the amount of unexplained or random error while maximizing the amount
of explained systematic variance, which provides a much more realistic and
rigorous framework for analyzing our data than with univariate methods.
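The error-rate point above can be illustrated with a quick calculation. The sketch below (in Python, purely for illustration; the book's own analyses use SAS) shows the familywise Type I error rate that accrues when k independent univariate tests are each run at alpha = .05, namely 1 - (1 - alpha)^k:

```python
# Familywise Type I error rate for k independent tests,
# each conducted at the nominal alpha = .05 level.
# P(at least one false rejection) = 1 - (1 - alpha)**k

alpha = 0.05

for k in (1, 5, 10, 20):
    familywise = 1 - (1 - alpha) ** k
    print(f"{k:2d} tests -> familywise error rate = {familywise:.3f}")
```

With 10 separate univariate tests, the chance of at least one spurious rejection is roughly .40, which is why a single multivariate test of the full set of variables is preferable.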
e. Fifth, multivariate thinking reveals several assessment indices to determine
whether the overall or macro-analysis, as well as specific part or micro-
analysis, are behaving as expected. These overall and specific aspects en-
compass both omnibus (e.g., F-test) and specific (e.g., Tukey) tests of sig-
nificance, along with associated effect sizes (e.g., eta-squared and Cohen's
d). Acknowledging the wide debate of significance testing (Berkson, 1942;
Cohen, 1994; Harlow, Mulaik & Steiger, 1997; Kline, 2004; Meehl, 1978;
Morrison & Henkel, 1970; Schmidt, 1996), I concur with recommenda-
tions for their tempered use along with supplemental information such as
effect sizes (Abelson, 1997; Cohen, 1988, 1992; Kirk, 1996; Mulaik, Raju
& Harshman, 1997; Thompson, 1996; Wilkinson & the APA Task Force on
Statistical Inference, 1999). In Chapter 2 we discuss the topic of macro- and
micro-assessment in greater detail to help interpret findings from multivariate
analyses.
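As a concrete illustration of macro-assessment, the sketch below (again Python rather than the book's SAS, with invented scores for three hypothetical groups) computes an omnibus one-way ANOVA F test along with eta-squared, the associated effect size defined as the between-groups sum of squares divided by the total sum of squares:

```python
# Macro-level assessment for a one-way ANOVA:
# omnibus F statistic plus eta-squared effect size.
# Group scores are invented for illustration only.

groups = [
    [2.0, 3.0, 2.5, 3.5],   # hypothetical group A scores
    [4.0, 4.5, 5.0, 4.2],   # hypothetical group B scores
    [6.1, 5.8, 6.4, 6.0],   # hypothetical group C scores
]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Between-groups and within-groups sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
ss_total = ss_between + ss_within

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

# Omnibus F = mean square between / mean square within
f_stat = (ss_between / df_between) / (ss_within / df_within)
# Eta-squared = proportion of total variance explained by group membership
eta_squared = ss_between / ss_total

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, eta-squared = {eta_squared:.2f}")
```

A significant omnibus F would then be followed by micro-level tests (e.g., Tukey comparisons) to locate which specific groups differ.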
f. Finally, multivariate participation in the research process engenders more
positive attitudes toward statistics in general. Active involvement increases
our confidence in critiquing others' research and gives us more enthusi-
asm for applying methods to our own research. Greater feeling of empow-
erment occurs with less anticipatory anxiety when approaching statistics
and research. We may well find ourselves asking more complex research
questions with greater assurance, thereby increasing our own understand-
ing. All this should help us to feel more comfortable articulating multi-
ple ideas in an intelligent manner and to engage less in doubting our own
capabilities with statistics and research. This is consequential because the
bounty of multivariate information available could instill trepidation in many
who would rather not delve into it without some coaxing. However, my
experience has been that more exposure to the capabilities and applica-
tions of multivariate methods empowers us to pursue greater understand-
ing and hopefully to provide greater contributions to the body of scientific
knowledge.
DRAWBACKS
Because of the size and complexity of most multivariate designs, several drawbacks
may be evident. I present three drawbacks that could emerge when thinking about
multivariate methods and end with two additional drawbacks that are more tongue-
in-cheek perceptions that could result:
b. Second, larger sample sizes are usually needed for multivariate analyses. Some
researchers (e.g., Comrey & Lee, 1992) recommend having a sample size of 200-500, with smaller
sample sizes allowed when there are large effect sizes (Green, 1991; Guadagnoli
& Velicer, 1988).
c. Third, interpretation of results from a multivariate analysis may be difficult
because of having several layers to examine. With multivariate methods, we can
often examine:
i. The overall significance to assess the probability that results were due to
chance;
ii. The main independent variables that are contributing to the analysis;
iii. The nature of the dependent variable(s) showing significance; and
iv. The specific pattern of the relationship between relevant independent and
dependent variables.
d. Fourth, some researchers speculate that multivariate methods are too com-
plex to take the time to learn. That is an inaccurate perception because the basic
themes are clear and recurring, as we will shortly see.
e. Fifth, after immersing ourselves in multivariate thinking, it could become
increasingly difficult to justify constructing or analyzing a narrow and unrealistic
research study. We might even find ourselves thinking from a much wider and
global perspective.
CONTEXT FOR MULTIVARIATE THINKING

The main focus of learning and education is knowledge consumption and development
in which we are taught about the order that others have uncovered and learn
methods to seek our own vision of order. During our early years, we are largely
consumers of others' knowledge, learning from experts about what is important
and how it can be understood. As we develop in our education, we move more into
knowledge development and generation, which is explored and fine-tuned through
the practice of scientific research. The learning curve for research can be very slow,
although both interest and expertise increase with exposure and involvement. After
a certain point, which widely varies depending on individual interests and instruc-
tion, the entire process of research clicks and becomes unbelievably compelling.
We become hooked, getting a natural high from the process of discovery, creation,
and verification of scientific knowledge. I personally believe all of us are latent
scientists of sorts, if only at an informal level. We each go about making hypothe-
ses about everyday events and situations, based on more or less formal theories.
We then collect evidence for or against these hypotheses and make conclusions
and future predictions based on our findings. When this process is formalized and
validated in well-supported and well-structured environments, the opportunity for
a major contribution by a well-informed individual becomes much more likely.
Further, this is accompanied by a deeply felt sense of satisfaction and reward. That
has certainly been my experience.
TABLE 1.1
Summary of the Definition, Benefits, Drawbacks, and Context for
Multivariate Methods
1. Definition Set of tools for identifying relationships among multiple variables
2. Benefits a. Stretch thinking to embrace a larger context
b. Help in understanding others' research
c. Expand capabilities with our own research
d. Examine large sets of variables in a single analysis
e. Provide several macro- and micro-assessment indices
f. Engender more positive attitudes toward statistics in general
3. Drawbacks a. Less is known about robustness of multivariate assumptions
b. Larger sample sizes are needed
c. Results are sometimes more complex to interpret
d. Methods may be challenging to learn
e. Broader focus requires more expansive thinking
4. Context a. Knowledge consumption of others' research
b. Knowledge generation from one's own research
REFERENCES
Abelson, R. P. (1995). Statistics as principled argument. Mahwah, NJ: Lawrence Erlbaum Associates.
Abelson, R. P. (1997). The surprising longevity of flogged horses: Why there is a case for the significance
test. Psychological Science, 8, 12-15.
Bentler, P. M. (1995). EQS: Structural equations program manual. Encino, CA: Multivariate Software,
Inc.
Berkson, J. (1942). Tests of significance considered as evidence. Journal of the American Statistical
Association, 37, 325-335.
Boomsma, A. (1983). On the robustness of LISREL (maximum likelihood estimation) against small
sample size and nonnormality. Ph.D. thesis, University of Groningen, The Netherlands.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. San Diego, CA: Academic
Press.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation
analysis for behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Gorsuch, R. L. (1999). UniMult: For univariate and multivariate data analysis (Computer program
and guide). Pasadena, CA: UniMult.
Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral
Research, 26, 449-510.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns.
Psychological Bulletin, 103, 265-275.
Harlow, L. L., Mulaik S. A., & Steiger, J. H. (1997). What if there were no significance tests? Mahwah,
NJ: Lawrence Erlbaum Associates.
Harris, R. J. (2001). A primer of multivariate statistics. Mahwah, NJ: Lawrence Erlbaum Associates.
Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psycho-
logical Measurement, 56, 746-759.
Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral
research. Washington DC: APA.
Marcoulides, G. A., & Hershberger, S. L. (1997). Multivariate statistical methods: A first course.
Mahwah, NJ: Lawrence Erlbaum Associates.
McCullagh, P., & Nelder, J. (1989). Generalized linear models. London: Chapman and Hall.
Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress
of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806-834.
Morrison, D. E., & Henkel, R. E. (Eds.) (1970). The significance test controversy. Chicago: Aldine.
Mulaik, S. A., Raju, N. S., & Harshman, R. A. (1997). A time and place for significance testing.
In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests?
(pp. 65-115). Mahwah, NJ: Lawrence Erlbaum Associates.
Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: Impli-
cations for the training of researchers. Psychological Methods, 1, 115-129.
Simon, H. A. (1969). The sciences of the artificial. Cambridge, MA: The M.I.T. Press.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn and
Bacon.
Thompson, B. (1996). AERA editorial policies regarding statistical significance testing: Three sug-
gested reforms. Educational Researcher, 25, 26-30.
Wheatley, M. J. (1994). Leadership and the new science: Learning about organization from an orderly
universe. San Francisco, CA: Berrett-Koehler Publishers, Inc.
Wilkinson, L., & the APA Task Force on Statistical Inference (1999). Statistical methods in psychology
journals: Guidelines and explanations. American Psychologist, 54, 594-604.
2
Multivariate Themes
Quantitative methods have long been heralded for their ability to synthesize the
basic meaning in a body of knowledge. Aristotle emphasized meaning through the
notion of "definition" as the set of necessary and sufficient properties that allowed
an unfolding of understanding about concrete or abstract phenomena; Plato thought
of essence or meaning as the basic form (Lakoff & Nunez, 2000). Providing insight
into central meaning is at the heart of most mathematics, which uses axioms and
categorical forms to define the nature of specific mathematical systems.
This chapter focuses on the delineation of basic themes that reoccur within
statistics, particularly with multivariate procedures, in the hope of making con-
scious and apprehensible the core tenets, if not axioms, of multivariate thinking.
We might argue that the more a research study incorporates the concept of
multiplicity, the more rigorous, generalizable, reliable, and valid its results
will be.
In our multivariate venture into knowledge generation within the social sciences,
perhaps the most fundamental step is to consider several relevant theories that
could direct our efforts to understand a phenomenon.
Theory
Before embarking on a research study, it is essential to inquire about meta-
frameworks that can provide a structure with which to conduct our research. Are
there multiple divergent perspectives to consider? Are any of them more central or
salient than the others? Which seem to offer a more encompassing way to view an
area of study while also providing a basis for strong investigations? Meehl (1997)
talks of the need to draw on theory that makes risky predictions, ones that are
capable of being clearly refuted. These strong theories are much preferred to weak ones that
make vague and vacuous propositions. Others concur with Meehl's emphasis on
theory. Wilson (1998) speaks of theory in reverent words, stating that "Nothing
in science—nothing in life, for that matter—makes sense without theory. It is our
nature to put all knowledge into context in order to tell a story, and to re-create
the world by this means" (p. 56). Theory provides a coherent theme to help us
find meaning and purpose in our research. Wheatley (1994) speaks of the power
and coherence of theory in terms of providing an overall meaning and focus in
our research. She writes, "As long as we keep purpose in focus... we are able to
wander through the realms of chaos... and emerge with a discernible pattern or
shape." (p. 136). Abelson (1995) discusses theory as being able to cull together a
wide range of findings into "coherent bundles of results" (p. 14). Thus, a thorough
understanding of the theories that are germane to our research will provide purpose
and direction in our quest to perceive the pattern of meaning that is present in a
set of relevant variables. This level of theoretical understanding makes it more
likely that meaningful hypotheses can be posited that are grounded in a coherent
structure and framework.
Hypotheses
Upon pondering a number of theories of a specific phenomenon, several hypothe-
ses or predictions undoubtedly will emerge. In our everyday life, we all formulate
predictions and hypotheses, however informal. This can be as mundane as a pre-
diction about what will happen during our day or how the weather will unfold.
In scientific research, we strive to formalize our hypotheses so that they directly
follow from well-thought-out theory. The more specific and precise our hypotheses,
the greater the likelihood of either refuting them or finding useful evidence
to corroborate them (Meehl, 1997). Edward O. Wilson (1998) makes this clear
by stating that theoretical tests of hypotheses "are constructed specifically to be
blown apart if proved wrong, and if so destined, the sooner the better" (p. 57). Mul-
tivariate statistics allows us to formulate multiple hypotheses that can be tested
in conjunction. Thus, we should try to formulate several pivotal hypotheses or
research questions that allow for rigorous tests of our theories, allowing us to hone
and fine-tune our theories or banish them as useless (Wilson, 1998). The testing
of these hypotheses is the work of empirical research.
Empirical Studies
Having searched out pertinent theories that lead to strong predictions, it is im-
portant to investigate what other researchers have found in our area of research.
Are there multiple empirical studies that have previously touched on aspects of
these theories and predictions? Are there multiple contributions that could be
made with new research that would add to the empirical base in this area? Schmidt
and Hunter (1997) emphasize the need to accrue results from multiple studies
and assess them within a meta-analysis framework. This allows the regulari-
ties and consistent ideas to emerge as a larger truth than could be found from
single studies. Abelson (1995) describes this as the development of "the lore"
whereby "well-articulated research... is likely to be absorbed and repeated by
other investigators" as a collective understanding of a phenomenon (pp. 105-106).
No matter what the empirical area of interest, a thorough search of previous re-
search on a topic should illuminate the core constructs that could be viewed as
pure or meta-versions of our specific variables of interest. After taking into account
meaningful theories, hypotheses, and empirical studies we are ready to consider
how to measure the major constructs we plan to include in our research.
Measurement
When conducting empirical research, it is useful to ask about the nature of mea-
surement for constructs of interest (McDonald, 1999; Pedhazur & Schmelkin,
1991). Are there several pivotal constructs that need to be delineated and mea-
sured? Are there multiple ways to measure each of these constructs? Are there
multiple, different items or variables for each of these measures? Classical test
theory (Lord & Novick, 1968) and item response theory (Embretson & Reise,
2000; McDonald, 2000) emphasize the importance of modeling the nature of an
individual's response to a measure and the properties of the measures. Reliability
theory (Anastasi & Urbina, 1997; Lord & Novick, 1968; McDonald, 1999) em-
phasizes the need to have multiple items for each scale or subscale we wish to
measure. Similarly, statistical analysts conducting principal components or factor
analyses emphasize the need for a minimum of three or four variables to anchor
each underlying dimension or construct (Gorsuch, 1983; Velicer & Jackson, 1990).
The more variables we use, the greater the likelihood that we are tapping the true
dimension of interest. In everyday terms, this is comparable to realizing that we
cannot expect someone else to know who we are if we use only one or two terms
to describe ourselves. Certainly, students would agree that if a teacher were to ask
just a single exam question to tap all their knowledge in a topic area, this would
hardly begin to do the trick. Multivariate thinking aids us in this regard by not only
encouraging but also requiring multiple variables to be examined in conjunction.
This makes it much more likely that we will come to a deeper understanding of
the phenomenon under study. Having identified several pertinent variables, it also
is important to consider whether there are multiple time points across which a set
of variables can be analyzed.
Multiple Controls
Perhaps the best way to ensure causal inferences is to implement controls within a
research design (Pearl, 2000). The three most salient controls involve a test of clear
association between variables, evidence of temporal ordering of the variables, and
the ability to rule out potential confounds or extraneous variables (Bullock, Harlow,
& Mulaik, 1994). This can be achieved most elegantly with an experimental design
that randomly selects participants from a relevant population, randomly assigns
them to conditions, and actively manipulates one or more independent variables.
With this kind of design, there is a greater likelihood that nonspurious relation-
ships will emerge in which the independent variable can definitively be identified
as the causal factor, with potential confounding variables safely ruled out with the
random selection and assignment (Fisher, 1925, 1926).
Despite the virtues of an experimental design in ensuring control over one's
research, it is often difficult to enact such a design. Variables, particularly those
used in social sciences, cannot always be easily manipulated. For example, I would
be loath to experimentally manipulate the amount of substance abuse that is needed
to bring about a sense of meaninglessness in life. These kinds of variables would be
examined more ethically in a quasi-experimental design that tried to systematically
rule out relevant confounds (Shadish, Cook, & Campbell, 2002). These types
of designs could include background variables (e.g., income, education, age at
first substance abuse, history of substance abuse, history of meaninglessness),
or covariates (e.g., network of substance users in one's environment, stressful
life events) that could be statistically controlled while examining the relationship
perceived between independent variables (IVs) and dependent variables (DVs).
Needless to say, it is very difficult to ensure that adequate controls are in place
without an experimental design, although the realities of real-world research make
it necessary to consider alternative designs. In addition to multiple controls, it is
useful to consider collecting data from multiple samples.
Multiple Samples
Are there several pertinent populations or samples from which data could be gath-
ered to empirically study the main constructs and hypotheses? Samples are a subset
of entities (e.g., persons) from which we obtain data to statistically analyze. Ideally,
samples are drawn randomly from a relevant population, although much research
is conducted with convenience samples, such as classrooms of students. Another
type of sampling is called "purposive," which refers to forming a sample that is
purposely heterogeneous or typical of the kind of population from which gener-
alization is possible (Shadish, Cook, & Campbell, 2002). When samples are not
drawn at random or purposively, it is difficult to generalize past the sample to a
larger population (Shadish, 1995). Still, results from a nonrandom sample can offer
descriptive or preliminary information that can be followed up in other research.
Procedures such as propensity score analysis (Rosenbaum, 2002) can be used to
identify covariates that can address selection bias in a nonrandom sample, thus
allowing the possibility of generalizing to a larger population. The importance of
identifying relevant and meaningful samples is pivotal to all research. Further, in
multivariate research, samples are usually larger than when fewer variables are
examined.
Whether analyzing univariate or multivariate data from a relevant sample, it is
preferable to verify whether the findings are consistent. Fisher (1935) highlighted
the need for replicating findings in independent samples. Further, researchers
(Collyer, 1986; Cudeck & Browne, 1983) reiterate the importance of demonstrat-
ing that findings can be cross-validated. Statistical procedures have been developed
in several areas of statistics that incorporate findings from multiple samples. For
example, Joreskog (1971) and Sorbom (1974) developed multiple sample pro-
cedures for assessing whether a hypothesized mathematical model holds equally
well in more than one sample. These multiple sample procedures allow for tests of
increasing rigor of replication or equality across the samples, starting with a test
of an equal pattern of relationships among hypothesized constructs, up through
equality of sets of parameters (e.g., factor loadings, regressions, and means) among
constructs. If a hypothesized model can be shown to hold equally well across mul-
tiple samples, particularly when constraining the parameters to be the same, this
provides a strong test of the generalizability of a model (Alwin & Jackson, 1981;
Bentler, Lee & Weng, 1987; Joreskog, 1971). Even if many multivariate methods
do not have specific procedures for cross-validating findings, efforts should be
taken to ensure that results would generalize to multiple samples, thus allowing
greater confidence in their applicability.
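The logic of cross-validating a finding across independent samples can be sketched with simulated data: estimate a regression weight in one half of a sample and check whether a similar weight emerges in the other, independent half. All numbers below are hypothetical, generated purely for illustration.

```python
import random

def simple_slope(x, y):
    """Least-squares slope of y on x (covariance over variance of x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

random.seed(1)
# Simulate one large sample in which y depends on x plus random error
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]

# Split into two independent halves: estimate in one, check in the other
slope_a = simple_slope(x[:100], y[:100])
slope_b = simple_slope(x[100:], y[100:])
print(round(slope_a, 2), round(slope_b, 2))  # similar slopes suggest the finding replicates
```

When the two estimates agree closely, we gain confidence that the relationship is not an artifact of one particular sample.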
Practical Implications
Although research does not have to fill an immediately apparent practical need, it
is helpful to consider what implications can be derived from a body of research.
When multiple variables are examined, there is a greater likelihood that connec-
tions among them will manifest in ways that suggest practical applications. For
example, research in health sciences often investigates multiple plausible predic-
tors of disease, or conversely well-being (Diener & Suh, 2000), that can be used in
developing interventions to prevent illness and sustain positive health (Prochaska
& Velicer, 1997; Velicer et al., 2000). Practical applications do not have to originate
with initial research in an area. For example, John Nash researched mathemati-
cal group theory, which only later was used to understand economics, bringing
Nash a Nobel Prize (Nash, 2002). Lastly, it is important to consider a number of
multivariate methods from which we can select for specific research goals.
TABLE 2.1
Summary of Multivariate Themes
CENTRAL THEMES
Just as with basic descriptive and inferential statistics, multivariate methods help us
understand and quantify how variables (co-)vary. Multivariate methods provide a
set of tools to analyze how scores from several variables covary—whether through
group differences, correlations, or underlying dimensions—to explain systematic
variance over and above random error variance. Thus, we are trying to explain
or make sense of the variance in a set of variables with as little random error
variance as possible. Multivariate methods draw on the multiplicity theme just
discussed, with several additional themes. Probably the most central themes, for
either multivariate or univariate methods, involve the concepts of variance, covari-
ance, and ratios of these (co-)variances. We also will examine the theme of creating
linear combinations of variables.
Variance
Variance is the average of the squared differences between a set of scores and their
mean. Variance is what we usually want to analyze with any statistic. When a
variable has a large variance, sample scores tend to be very different, having a
wide range. It is useful to try to predict how scores vary, to find other variables that
help explain the variation. Statistical methods help identify systematic, explained
variance, acknowledging that there most likely will be a portion of unknowable
and random (e.g., error) variance. The goal of most statistics is to try to explain
how scores vary so that we can predict or understand them better. Variance is an
important theme, particularly in multivariate thinking, and it can be analyzed in
several ways, as we shall see later in this chapter.
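As a small illustration, the definition of variance as the average squared deviation from the mean can be computed by hand and checked against Python's standard library (the scores below are hypothetical):

```python
from statistics import pvariance, mean

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# Variance: the average squared difference between each score and the mean
m = mean(scores)
var_by_hand = sum((s - m) ** 2 for s in scores) / len(scores)

print(var_by_hand)        # 4.0
print(pvariance(scores))  # matches the library's population variance
```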
Covariance
Covariance is the average of the products of the differences between one variable
and its mean and a second variable and its mean. Covariance, or its standardized form,
correlation, depicts the existence of a linear relationship between two or more
variables. When variables rise and fall together (e.g., study time and grade point
average), they positively covary or co-relate. If scores vary in opposite directions
(e.g., greater practice is associated with a lower golf score), negative covariance
occurs. The theme of covariation is fundamental to multivariate methods, because
we are interested in whether a set of variables tends to vary together, indicating
a strong relationship. Multivariate methods most often assess covariance by assess-
ing the relationship among variables while also taking into account the covariation
among other variables included in the analysis. Thus, multivariate methods allow
a more informed test of the relationships among variables than univariate methods,
which treat each relationship as separate from, or orthogonal to, the others.
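A minimal sketch of these definitions, using hypothetical study-time and grade data, shows how standardizing the covariance by both standard deviations yields the correlation:

```python
from math import sqrt
from statistics import mean

# Hypothetical paired scores: study hours and grade points rise together
hours = [1, 2, 3, 4, 5]
gpa = [2.0, 2.5, 2.5, 3.5, 4.0]

mx, my = mean(hours), mean(gpa)
# Covariance: average product of each variable's deviation from its mean
cov = sum((x - mx) * (y - my) for x, y in zip(hours, gpa)) / len(hours)

# Dividing by both standard deviations standardizes covariance into correlation
sx = sqrt(sum((x - mx) ** 2 for x in hours) / len(hours))
sy = sqrt(sum((y - my) ** 2 for y in gpa) / len(gpa))
r = cov / (sx * sy)
print(round(cov, 2), round(r, 2))  # positive covariance, strong positive correlation
```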
Ratio of (Co-)Variances
Many methods examine a ratio of how much (co-)variance there is between vari-
ables or groups relative to how much variance there is within variables or within
groups. When the between information is large relative to the within information,
we usually conclude that the results are significantly different from those that could
be found based on pure chance. The reason for this is that when there are greater
differences across domains than there are within domains, whether from differ-
ent variables or different groups, there is some indication of systematic shared or
associated variance that is not just attributable to random error.
It is useful to see how correlation and ANOVA, two central, univariate statis-
tical methods, embody a ratio of variances. We can then extend this thinking to
multivariate methods. Correlation shows a ratio of covariance between variables
over variance within variables. When the covariance between variables is almost
as large as the variance within either variable, this indicates a stronger relationship
between variables. Thus, a large correlation indicates that much of the variance
within each variable is shared or covaries between the variables.
With group difference statistics (e.g., ANOVA), we often form an F-ratio of how
much the group means vary relative to how much variance there is within each
group. When the means are much more different between groups (i.e., large vari-
ance between groups) than the scores are within each group (i.e., smaller variance
within groups), we have evidence of a relationship between the grouping (e.g., cat-
egorical, independent) and outcome (e.g., continuous, dependent) variables. Here,
a large F-ratio indicates significant group difference variance.
These ratios, whether correlational or ANOVA-based, are also found in multi-
variate methods. In fact, just about every statistical significance test is based on
some kind of ratio of variances or covariances. Knowing this fact and understand-
ing the nature of the ratio for each analysis helps us make much more sense out of
our statistical results, whether from univariate or multivariate methods.
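The between-to-within logic of the F-ratio can be made concrete with a small hypothetical two-group example:

```python
from statistics import mean

# Hypothetical scores for two groups (e.g., treatment vs. control)
groups = [[4, 5, 6, 5], [8, 9, 7, 8]]

grand = mean([s for g in groups for s in g])
n = sum(len(g) for g in groups)
k = len(groups)

# Between-groups variation: how much the group means differ from the grand mean
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
# Within-groups variation: how much scores differ inside each group
ss_within = sum(sum((s - mean(g)) ** 2 for s in g) for g in groups)

# F-ratio: between-groups variance over within-groups variance
f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_ratio, 2))  # → 27.0
```

A large ratio like this signals that the group means differ far more than the scores within each group do, which is the evidence for a systematic group difference.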
Linear Combinations
A basic theme throughout most multivariate methods is that of finding the rela-
tionship between two or more sets of variables. This usually is accomplished by
forming linear combinations of the variables in each set that are additive com-
posites that maximize the amount of variance drawn from the variables. A simple
example of a linear combination is the course grade received in many classes. The
grade, let's call it X', is formed from the weighted combination of various scores.
Thus, a course grading scheme of X' = 0.25 (homework) + 0.25 (midterm exam)
+ 0.30 (final exam) + 0.20 (project) would reveal a linear combination showing
the weights attached to the four course requirements.
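The grading scheme above can be computed directly as a weighted sum (the component scores below are hypothetical):

```python
# Weighted course grade X' as a simple linear combination, with the weights
# taken from the grading scheme in the text
weights = {"homework": 0.25, "midterm": 0.25, "final": 0.30, "project": 0.20}
scores = {"homework": 90, "midterm": 80, "final": 85, "project": 95}  # hypothetical

x_prime = sum(weights[part] * scores[part] for part in weights)
print(round(x_prime, 2))  # → 87.0
```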
These linear combination scores are then analyzed, summarizing the many
variables in a simple, concise form. With multivariate methods, we often are trying
to assess the relationship between variables, which is often the shared variance
between linear combinations of variables. Several multivariate methods analyze
different kinds of linear combinations.
Components
A component is a linear combination of variables that maximizes the variance
extracted from the original set of variables. The concept of forming linear combi-
nations is most salient with PCA (see Chapter 11 for more on PCA). With PCA,
we aim to find several linear combinations (i.e., components) that help explain
most of the variance in a set of the original variables. MANOVA and DFA (see
Chapters 7 and 8) also analyze linear combinations of variables.
Factors
We have just seen how linear combinations can be thought of as dimensions that
seem to summarize the essence of a set of variables. If we are conducting a FA, we
refer to these dimensions as factors. Factors differ from the linear combinations
analyzed in PCA, MANOVA, DFA, and CC in that they are more latent dimensions
that have separated common, shared variance among the variables from any unique
or measurement error variance within the variables. Thus, a factor sometimes is
believed to represent the underlying true dimension in a set of variables, after
removing the portion of variance in the variables that is not common to the others
(i.e., the unique or error portion). Exploratory FA is discussed in the current text
(see Chapter 10), although there are confirmatory FA procedures that are also
relevant; these are left for another volume.
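As a rough illustration of how a component summarizes a set of variables, the sketch below extracts the first principal component of a small hypothetical correlation matrix by power iteration (one generic way to find the leading eigenvector, not necessarily what a given statistics package uses). The associated eigenvalue, divided by the number of variables, gives the proportion of total variance the component extracts:

```python
from math import sqrt

# Hypothetical 3x3 correlation matrix for three positively related variables
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

# Power iteration: repeated multiplication by R converges on the first
# eigenvector, i.e., the weights that define the first principal component
v = [1.0, 1.0, 1.0]
for _ in range(100):
    w = matvec(R, v)
    norm = sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

# Rayleigh quotient gives the eigenvalue: the variance the component extracts
eigenvalue = sum(a * b for a, b in zip(matvec(R, v), v))
print(round(eigenvalue / 3, 2))  # proportion of total variance extracted
```

Here a single component accounts for roughly two thirds of the total variance in the three variables, which is the sense in which a component concisely summarizes the set.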
INTERPRETATION THEMES
When interpreting results and assessing whether an analysis is successful, we
should evaluate from several perspectives. Most statistical procedures, whether
we are using univariate or multivariate methods, allow a macro-assessment of how
well the overall analysis explains the variance among pertinent variables. It is also
important to focus on a micro-assessment of the specific aspects of the multivariate
results.
In keeping with ongoing debates about problems with the exclusive use of
significance testing (Harlow, Mulaik, & Steiger, 1997; Kline, 2004; Wilkinson &
the APA Task Force on Statistical Inference, 1999), I advocate that each result be
evaluated with both a significance test and an effect size whenever possible; that
is, it is helpful to know whether a finding is significantly different from chance
and to know the magnitude of the significant effect.
Macro-Assessment
The first way to evaluate an analysis is at a global or macro-level that usually
involves a significance test and most likely some synthesis of the variance in a
multivariate dataset. A macro summary usually depicts whether there is significant
covariation or mean differences within data, relative to how much variation there
is among scores within specific groups or variables.
Significance Test
A significance test, along with an accompanying probability, or p value, is usually
the first step of macro-assessment in a multivariate design (Wilkinson & the APA
Task Force on Statistical Inference, 1999). Significance tests tell us whether our
empirical results are likely to be due to random chance or not. It is useful to be
able to rule out, with some degree of certainty, an accidental or anomalous finding.
Of course, we always risk making an error no matter what our decision. When
we accept our finding too easily, we could be guilty of a Type I error, saying that
our research had veracity when in fact it was a random finding. When we are too
cautious about accepting our findings, we may be committing a Type II error, saying
that we have no significant findings when in fact we do. Significance tests help
us to make probabilistic decisions about our results within an acceptable margin
of error, usually set at 1% to 5%. We would like to say that we have more reason
to believe our results are true than that they are not true. Significance tests give
us some assurance in this regard, and they are essential when we have imperfect
knowledge or a lack of certainty about an area. We can help rule out false starts
and begin to accrue a growing knowledge base with these tests (Mulaik, Raju &
Harshman, 1997).
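The logic of such a test can be sketched with a simple z test on hypothetical numbers (a z test assumes the population standard deviation is known); the p value is the probability of a result at least this extreme arising by chance alone:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical: a sample of n = 25 scores with mean 106 on a test known to
# have mean 100 and SD 15 in the population
n, sample_mean, mu, sigma = 25, 106, 100, 15

# How far is the sample mean from chance, in standard-error units?
z = (sample_mean - mu) / (sigma / sqrt(n))

# Two-tailed p value: probability of a result this extreme under chance alone
p = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p, 4))  # → 2.0 0.0455
```

Because p falls below the conventional 5% margin of error, we would have some assurance that the finding is not merely accidental, while remaining open to the possibility of a Type I error.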
Effect Sizes
Effect sizes provide an indication of the magnitude of our findings at an overall
level. They are a useful supplement to the results of a significance test (Wilkinson &
the APA Task Force on Statistical Inference, 1999). Effect size (ES) calculations
can be standardized differences between means for group difference questions
(e.g., Cohen's d; Cohen, 1992; Kirk, 1996). Guidelines for interpreting small,
medium, and large ESs are Cohen's d values of 0.2, 0.5, and 0.8, respectively
(Cohen, 1988). Quite often, an ES takes the form of a proportion of shared variance
between the independent and the dependent variables, particularly for multivariate
analyses. Guidelines for multivariate shared variance are 0.02, 0.13, and 0.26 for
small, medium, and large ESs, respectively (Cohen, 1992). Although much more
can be said about effect sizes (Cohen, 1988), we focus largely on those that involve
a measure of shared variance.
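A minimal sketch of Cohen's d for two hypothetical groups follows; for simplicity it pools population variances and assumes equal group sizes:

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical treatment and control scores
treatment = [7, 8, 6, 9, 7, 8]
control = [5, 6, 5, 7, 6, 5]

# Cohen's d: the difference between means in pooled standard-deviation units
pooled_sd = sqrt((pvariance(treatment) + pvariance(control)) / 2)
d = (mean(treatment) - mean(control)) / pooled_sd

print(round(d, 2))  # well above 0.8, a "large" effect by Cohen's guidelines
```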
Shared variance is a common theme throughout most statistical methods and
often can form the basis of an ES. We are always trying to understand how to
explain the extent to which scores vary or covary. Almost always, this involves
two sets of variables so that the focus is on how much variance is shared between
the two sets (e.g., a set of IVs and a set of DVs, or a set of components-factors and
a set of measured variables). With multivariate methods, one of the main ways of
summarizing the essence of shared variance is with squared multiple correlations.
Indices of shared variance or squared multiple correlations can inform us of the
strength of relationship or effect size (Cohen, 1988).
The squared multiple correlation, R2, indicates the amount of shared variance
among the variables. It is useful in providing a single number that conveys how
much the scores from a set of variables (co-)vary in the same way (i.e., rise and
fall together) relative to how much the scores within each variable differ among
themselves. A large R2 value (e.g., 0.26 or greater) indicates that the participants'
responses on a multivariate set of variables tend to behave similarly, so that a
common or shared phenomenon may be occurring among the variables. For ex-
ample, research by Richard and Shirley Jessor (1973) and their colleagues has
demonstrated large proportions of shared variance between alcohol abuse, drug
abuse, risky sexual behavior, and psychosocial variables, providing compelling
evidence that an underlying phenomenon of "problem behavior" is apparent.
Many statistical methods make use of the concept of R2 or shared variance.
Pearson's correlation coefficient, r, is an index of the strength of relationship
between two variables. Squaring this correlation yields R2, sometimes referred to
as a coefficient of determination that indicates how much overlapping variance
is shared between two variables. In MR (Cohen, Cohen, West, & Aiken, 2003),
a powerful multivariate method useful for prediction, we can use R2 to examine
how much variance is shared between a linear combination of several independent
variables and a single outcome variable. The linear combination for MR (see
Chapter 4) is formed by a least-squares approach that minimizes the squared
difference between actual and predicted outcome scores. In Chapter 6, we will see
how shared variance across linear combinations can be summarized with functions
of eigenvalues, traces, or determinants, each of which is described in the discussion
of matrix notation and calculations.
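The least-squares logic and the resulting R2 can be sketched for the simplest one-predictor case with hypothetical data:

```python
from statistics import mean

# Hypothetical predictor and outcome scores
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 4, 6, 7]

mx, my = mean(x), mean(y)
# Least-squares slope and intercept minimize squared prediction error
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx
pred = [intercept + slope * a for a in x]

# R^2: the proportion of outcome variance shared with the predictor
ss_total = sum((b - my) ** 2 for b in y)
ss_error = sum((b - p) ** 2 for b, p in zip(y, pred))
r_squared = 1 - ss_error / ss_total
print(round(r_squared, 2))  # → 0.84
```

With several predictors the same idea applies, except that the predicted scores come from a linear combination of all the predictors.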
Residual or error variance is another possible consideration when discussing
central themes. Many statistical procedures benefit from assessing the amount of
residual or error variance in an analysis. In prediction methods, such as MR (see
Chapter 4), we often want to examine prediction error variance (i.e., 1 - R2), which
is how much variation in the outcome variable was not explained by the predictors.
In MANOVA, DFA, and CC (see Chapters 7, 8 and 10), we can approximate the
residuals by subtracting eta squared (i.e., η², the ratio of between-groups variance to
total variance) from one; this remainder is also known as Wilks's (1932) lambda.
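For a single grouping variable, this relation between eta squared and Wilks's lambda can be sketched directly (the scores below are hypothetical):

```python
from statistics import mean

# Hypothetical scores in three groups
groups = [[3, 4, 5], [6, 7, 8], [9, 10, 11]]
grand = mean([s for g in groups for s in g])

# Between-groups and total sums of squares
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_total = sum((s - grand) ** 2 for g in groups for s in g)

eta_squared = ss_between / ss_total  # shared (explained) variance
wilks_lambda = 1 - eta_squared       # residual, unexplained variance
print(round(eta_squared, 2), round(wilks_lambda, 2))  # → 0.9 0.1
```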
Micro-Assessment
After finding significant results and a meaningful ES at a macro-level, it is then
useful to examine results at a more specific, micro-level. Micro-assessment in-
volves examining specific facets of an analysis (e.g., means, weights) to determine
specifically what is contributing to the overall relationship. In micro-assessment,
we ask whether there are specific coefficients or values that can shed light on which
aspects of the system are working and which are not.
Means
With group difference methods, micro-assessment entails an examination of the
differences between pairs of means. We can accomplish this by simply presenting a
descriptive summary of the means and standard deviations for the variables across
groups and possibly graphing them to allow visual examination of any trends.
Several other methods are suggested for more formally assessing means.
Weights
With correlational or dimensional methods, we often want to examine weights
that indicate how much of a specific variable is contributing to an analysis. In MR,
we examine least-squares regression weights that tell us how much a predictor
variable covaries with an outcome variable after taking into account the relation-
ships with other predictor variables. In an unstandardized metric, it represents the
change in an outcome variable that can be expected when a predictor variable is
changed by one unit. In a standardized metric, the regression weight, often referred
to as a beta weight, gives the standardized partial relationship between the predictor
and the outcome, after controlling for the other predictor variables in the equation.
Thus, the weight provides an indication of the unique importance of a predictor
with an outcome variable. In other multivariate methods (e.g., see DFA, CC, PCA
in Chapters 8, 10, and 11), the unstandardized weights are actually eigenvector weights.
REFERENCES
Abelson, R. P. (1995). Statistics as principled argument. Mahwah, NJ: Lawrence Erlbaum Associates.
Alwin, D. F., & Jackson, D. J. (1981). Applications of simultaneous factor analysis to issues of factorial
invariance. In D. Jackson & E. Borgatta (Eds.), Factor analysis and measurement in sociological
research: A multi-dimensional perspective. Beverly Hills: Sage Publications.
Anastasi, A., & Urbina, S. (1997). Psychological testing. Upper Saddle River, NJ: Prentice Hall.
Bentler, P. M., Lee, S.-Y, & Weng, L.-J. (1987). Multiple population covariance structure analysis
under arbitrary distribution theory. Communications in Statistics—Theory and Methods, 16, 1951-1964.
Bullock, H. E., Harlow, L. L., & Mulaik, S. (1994). Causation issues in structural modeling research.
Structural Equation Modeling Journal, 1, 253-267.
Carmer, S. G., & Swanson, M. R. (1973). An evaluation of ten pairwise multiple comparison procedures
by Monte Carlo methods. Journal of the American Statistical Association, 68, 66-74.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. San Diego, CA: Academic
Press.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation
analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Collins, L., & Horn, J. (Eds.) (1991). Best methods for the analysis of change. Washington, DC: APA
Publications.
Collins, L. M., & Sayer, A. G. (Eds.). (2001). New methods for the analysis of change. Washington,
DC: American Psychological Association.
Collyer, C. E. (1986). Statistical techniques: Goodness-of-fit patterns in a computer cross-validation
procedure comparing a linear and threshold model. Behavior Research Methods, Instruments, &
Computers, 18, 618-622.
Cudeck, R., & Browne, M. W. (1983). Cross-validation of covariance structures. Multivariate Behav-
ioral Research, 18, 147-157.
Diener, E., & Suh, E. M. (Eds.), (2000). Culture and subjective well-being: Well being and quality of
life. Cambridge, MA: The MIT Press.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence
Erlbaum Associates.
Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh: Oliver & Boyd.
Fisher, R. A. (1926). The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain, 33, 505-513.
Fisher, R. A. (1935). The design of experiments. Edinburgh: Oliver & Boyd.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Grimm, L. G., & Yarnold, P. R. (1995). Reading and understanding multivariate statistics. Washington,
DC: APA.
Grimm, L. G., & Yarnold, P. R. (2000). Reading and understanding more multivariate statistics.
Washington, DC: APA.
Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (1997). What if there were no significance tests? Mahwah,
NJ: Lawrence Erlbaum Associates.
Hosmer, D. W., Jr., & Lemeshow, S. (2000). Applied logistic regression. New York, NY: John Wiley
& Sons.
Jessor, R., & Jessor, S. L. (1973). A social psychology of marijuana use: Longitudinal studies of high
school and college youth. Journal of Personality and Social Psychology, 26, 1-15.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409-426.
Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psycho-
logical Measurement, 56, 746-759.
Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral
research. Washington, DC: American Psychological Association.
Lakoff, G., & Nunez, R. E. (2000). Where mathematics comes from: How the embodied mind brings
mathematics into being. New York: Basic Books, A Member of the Perseus Books Groups.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-
Wesley.
Marcoulides, G. A., & Hershberger, S. L. (1997). Multivariate statistical methods: A first course.
Mahwah, NJ: Lawrence Erlbaum Associates.
Maxwell, S. E., & Delaney, H. D. (2004). Designing experiments and analyzing data: A model com-
parison perspective. Mahwah, NJ: Lawrence Erlbaum Associates.
McDonald, R. P. (1985). Factor analysis and related methods. Hillsdale, NJ: Lawrence Erlbaum As-
sociates.
McDonald, R. P. (1997). Goodness of approximation in the linear model. In L. L. Harlow, S. A.
Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests? (pp. 199-219). Mahwah,
NJ: Lawrence Erlbaum Associates.
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Lawrence Erlbaum Associates.
McDonald, R. P. (2000). A basis for multidimensional item response theory. Applied Psychological
Measurement, 24, 99-114.
Meehl, P. E. (1997). The problem is epistemology, not statistics: Replace significance tests by confi-
dence intervals and quantify accuracy of risky numerical predictions. In L. L. Harlow, S. A. Mulaik,
& J. H. Steiger (Eds.), What if there were no significance tests? (pp. 393-425). Mahwah, NJ:
Lawrence Erlbaum Associates.
Moskowitz, D. S., & Hershberger, S. L. (Eds.) (2002). Modeling intraindividual variability with re-
peated measures data: Methods and applications. Mahwah, NJ: Lawrence Erlbaum Associates.
Mulaik, S. A., Raju, N. S., & Harshman, R. A. (1997). A time and place for significance testing.
In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests?
(pp. 65-115). Mahwah, NJ: Lawrence Erlbaum Associates.
Nash, J., with S. Nasar and H. W. Kuhn (Eds.) (2002). The essential John Nash. Princeton: Princeton
University Press.
Pearl, J. (2000). Causality: Models, reasoning and inference. Cambridge, England: Cambridge Uni-
versity Press.
Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design and analysis: An integrated approach.
Mahwah, NJ: Lawrence Erlbaum Associates.
Prochaska, J. O., & Velicer, W. F. (1997). The transtheoretical model of health behavior change. (Invited paper). American Journal of Health Promotion, 12, 38-48.
Rosenbaum, P. R. (2002). Observational studies (2nd ed.). New York: Springer Verlag.
Schmidt, F. L., & Hunter, J. E. (1997). Eight common but false objections to the discontinuation of
significance testing in the analysis of research data. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger
(Eds.), What if there were no significance tests (pp. 37-64)? Mahwah, NJ: Lawrence Erlbaum
Associates.
Shadish, W. R. (1995). The logic of generalization: Five principles common to experiments and ethno-
graphies. American Journal of Community Psychology, 23, 419-428.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs
for generalized causal inference. Boston: Houghton Mifflin Company.
Sörbom, D. (1974). A general method for studying differences in factor means and factor structure between groups. British Journal of Mathematical and Statistical Psychology, 27, 229-239.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn and
Bacon.
Tatsuoka, M. M. (1970). Discriminant analysis. Champaign, IL: Institute for Personality and Ability
Testing.
Tukey, J. W. (1953). The problem of multiple comparisons. Unpublished manuscript, Princeton Uni-
versity (mimeo).
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
Velicer, W. F., & Jackson, D. N. (1990). Component analysis versus common factor analysis: Some
issues in selecting an appropriate procedure. Multivariate Behavioral Research, 25, 1-28.
Velicer, W. F., Prochaska, J. O., Fava, J. L., Rossi, J. S., Redding, C. A., Laforge, R. G., & Robbins, M. L. (2000). Using the transtheoretical model for population-based approaches to health promotion
and disease prevention. Homeostasis in Health and Disease, 40, 174-195.
Wheatley, M. J. (1994). Leadership and the new science: Learning about organization from an orderly
universe. San Francisco, CA: Berrett-Koehler Publishers, Inc.
Wilkinson, L., & the APA Task Force on Statistical Inference (1999). Statistical methods in psychology
journals: Guidelines and explanations. American Psychologist, 54, 594-604.
Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471-494.
Wilson, E. O. (1998). Consilience. New York: Vintage Books, A division of Random House.
3
Background Themes
Data
Data constitute the pieces of information (i.e., variables) on a phenomenon of
interest. Data that can be assigned meaningful numerical values can be analyzed
by a number of statistical methods. We usually assign a (numerical) score to each
variable for each entity and store this in an "N by p" data matrix, where N stands
for the number of participants or entities and p stands for the number of variables
(predictors or outcomes). A data matrix is the starting point for statistical analysis.
It is the large, numerical knowledge base from which we can combine, condense,
and synthesize to derive meaningful and relevant statistical nuggets that capture
the essence of the original information. Obviously, a data matrix will tend to have
more columns (of variables) and most likely more rows (of participants) with
multivariate research than with univariate methods. To the extent that the data are
collected from a large and representative random sample, it can offer a strong
foundation and workplace for subsequent analyses.
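As a concrete sketch (in Python with NumPy, which the book itself does not use), an N by p data matrix is simply a two-dimensional array with one row per participant and one column per variable; the sizes and scores below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

N, p = 6, 3  # 6 participants, 3 variables (toy sizes for illustration)

# Each row is one participant; each column is one variable
# (a predictor or an outcome).
data = rng.normal(loc=50, scale=10, size=(N, p))

print(data.shape)  # (6, 3), i.e., N rows by p columns
```

Every method discussed later starts from an array of exactly this shape.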
In later chapters, we examine a data matrix from 527 women at risk for HIV,
examining a set of about 30 variables across three time points, measured 6 months
apart. Thus, the full data set that is analyzed in this book is (N by p) = 527
by 90 and is a portion of a larger data collection that was funded by a grant
from the National Institute of Mental Health (Harlow, Quina & Morokoff, 1991).
At the end of each chapter that discusses a different multivariate method, we
explore an example from this data set that addresses background themes and other
considerations necessary when addressing a multivariate research question. The
data, computer program setups, and analyses from these examples are presented
in the accompanying CD.
Measurement Scales
Variables can be measured on a continuous or on a categorical scale. Variables
measured on a continuous scale have numerical values that can be characterized
by a smooth flow of arithmetically meaningful quantitative measurement, whereas
categorical variables take on finite values that are discrete and more qualitatively
meaningful. Age and height are examples of continuous variables that can take on
many values that have quantitative meaning. In contrast, variables like gender and
ethnicity have categorical distinctions that are not meaningfully aligned with any
numerical values. It is also true that some variables can have measurement scales
that have both numerical and categorical properties. Likert scale variables have
several distinct categories that have at least ordinal, if not precisely quantitative,
values. For example, variables that ask participants to rate a statement anywhere
from "1 = strongly disagree" to "5 = strongly agree" are using an ordinal Likert
scale of measurement.
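A small, hypothetical sketch may help fix the distinctions; the variables and values below are invented, not drawn from the book's data:

```python
import numpy as np

# Continuous variable: many arithmetically meaningful values.
age = np.array([23.5, 31.0, 45.2, 19.8])

# Categorical variable: the codes are labels only (0 = control,
# 1 = treatment); their numerical values carry no quantitative meaning.
group = np.array([0, 1, 1, 0])

# Likert (ordinal) variable: ordered categories from
# 1 = strongly disagree to 5 = strongly agree.
rating = np.array([1, 3, 5, 4])

# A mean is meaningful for a continuous scale; for a 0/1 categorical
# code it reduces to a group proportion, and for ordinal data the
# median is often the safer summary.
print(age.mean(), group.mean(), np.median(rating))
```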
Continuous variables can be used as either predictors or as outcomes (e.g.,
in multiple regression). Categorical variables are often used to separate people
into groups for analysis with group-difference methods. For example, we may
assign participants to a treatment or a control group with the categorical variable
of treatment (with scores of 1 = yes, or 0 = no). Because of the common
use of Likert scales in social science research, the scale of such ordinal variables
has been characterized as either categorical or quasi-continuous depending on
whether the analysis calls for a grouping or a quantitative variable. As with any
research decision, the conceptualization of a variable should be grounded in strong
theoretical and empirical support.
As we will see later in this chapter, the choice of statistical analysis often de-
pends, at least in part, on the measurement scales of the variables being studied. This
is true for either multivariate or univariate methods. If variables are reliably and
validly measured, whether categorical or continuous, then the results of analyses
will be less biased and more trustworthy. We will also see that, before conducting
multivariate methods, we often begin by analyzing frequencies on categorical data
and examining descriptive statistics (e.g., means, standard deviations, skewness,
and kurtosis) on continuous data. We will have a chance to do this in later chap-
ters when providing the details from a fully worked-through example for each
method.
Roles of Variables
Variables can be independent (i.e., perceived precipitating cause), dependent (i.e.,
perceived outcome), or mediating (i.e., forming a sequential link between inde-
pendent and dependent variables). In research, it is useful to consider the role each
variable plays in understanding phenomena. A variable that is considered a causal
agent is sometimes labeled as independent or exogenous. It is not explained by a
system of variables but is rather believed to have an effect on other variables. The
affected variables are often referred to as dependent or endogenous, implying that
they were directly impinged upon by other, more inceptive variables (e.g., Byrne,
2001).
Another kind of endogenous variable can be conceptualized as intermediate and
thus intervenes between or changes the nature of the relationship between indepen-
dent variables (IVs) and dependent variables (DVs). When a variable is conceived
as a middle pathway between IVs and DVs it is often labeled as an intervening or
mediating variable (e.g., Collins, Graham, & Flaherty, 1998; MacKinnon & Dwyer,
1993). For example, Schnoll, Harlow, Stolbach and Brandt (1998) found that the
relationship between age, cancer stage, and psychological adjustment appeared to
be mediated by coping style. In this model, age and cancer stage, the independent or
exogenous variables, were not directly associated with psychological adjustment,
the dependent or endogenous variable, after taking into account a cancer patient's
coping style, the mediating (also endogenous) variable. Instead, cancer patients
who were younger and had a less serious stage of cancer had more positive coping
styles. Furthermore, those who coped more with a fighting spirit—rather than with
hopelessness, fatalism, and anxious preoccupation—adjusted better. Thus, coping
style served as a mediator between demographic/disease variables and psycholog-
ical adjustment variables.
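The mediation logic can be sketched with simulated data (Python with NumPy; the chain below is hypothetical, only loosely analogous to the Schnoll et al. pattern): the predictor's direct weight shrinks once the mediator enters the equation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated mediation chain: X -> M -> Y.
x = rng.normal(size=n)
m = 0.7 * x + rng.normal(size=n)   # mediator driven by X
y = 0.6 * m + rng.normal(size=n)   # outcome driven only by M

def weights(preds, out):
    """Least-squares regression weights (intercept dropped)."""
    X = np.column_stack([np.ones(len(out))] + list(preds))
    b, *_ = np.linalg.lstsq(X, out, rcond=None)
    return b[1:]

b_total = weights([x], y)[0]   # X alone predicts Y (total effect)
b_x, b_m = weights([x, m], y)  # X and the mediator M together

# With M in the model, the direct X weight shrinks toward zero.
print(round(b_total, 2), round(b_x, 2), round(b_m, 2))
```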
Variables are referred to as moderator variables when they change the nature of the relationship between the IVs and DVs (e.g., Baron & Kenny, 1986;
Gogineni, Alsup, & Gillespie, 1995). For example, teaching style may be a pre-
dictor of an outcome, school performance. If another variable is identified, such as
gender, that when multiplied by teaching style changes the nature of the predictive
relationship, then gender is seen as a moderator variable. Thus, a moderator is essentially an interaction, usually formed as the product of an IV and the moderating variable.
Incomplete Information
Inherent in all statistical methods is the idea of analyzing incomplete information,
where only a portion of knowledge is available. For example, we analyze a subset
of the data by selecting a sample from the full population because this is all we
have available to provide data. We examine only a subset of the potential causal
agents or explaining variables, because it is nearly impossible to conceive of all
possible predictors. We collect data from only a few measures for each variable
of interest, because we do not want to burden our participants. We describe the
main themes in the data (e.g., factors, dimensions) and try to infer past our original
sample and measures to a larger population and set of constructs. In each case,
there is a need to infer a generalizable outcome from a subset to a larger universe
to explain how scores vary and covary. Ultimately, we would like to be able to
demonstrate that associations among variables can be systematically explained
with as little error as possible. For example, a researcher might find that alcohol
use scores vary depending on the level of distress and the past history of alcohol
abuse (Harlow, Mitchell, Fitts, & Saxon, 1999). It may be that the higher the level
of distress and the greater the past history of substance abuse, the more likely
someone is to engage in greater substance use. Strong designs and longitudinal
data can help in drawing valid conclusions.
It is most likely true that other variables are important in explaining an outcome.
Even when conducting a large, multivariate study, it is important to recognize that
we cannot possibly examine the full set of information, largely because it is not
usually known or accessible. Instead, we try to assess whether the pattern of vari-
ation and covariation in the data appears to demonstrate enough evidence for
statistically significant relationships over and above what could be found from
sheer random chance. Multivariate methods, far more than univariate methods, offer comprehensive procedures for analyzing real-world data that most
likely have incomplete information.
Missing Data
Another background consideration is how to address missing data. Whether data
are collected from an experiment or a survey, at least a portion of the participants
invariably refrain from responding on some variables. If a sample is reasonably
large (e.g., 200-400), the percent of missing data is small (e.g., 5% to 10%),
and the pattern of missing data appears random, there are a number of options
available. Although a thorough discussion of this topic is beyond the scope of
this book, it is worth briefly describing several methods for approaching missing
data.
Deleting data from all participants that have missing data is the most severe and
probably one of the least preferable approaches to adopt. This method is called
listwise deletion and can result in a substantially smaller data set, thus calling into
question how representative the sample is of the larger population. I would not
advise using this method unless there is a very large sample (e.g., 500 or more)
and the percent of missing data is very small (i.e., <5%).
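The cost of listwise deletion is easy to see in a tiny sketch (the data matrix below is invented): every participant with any missing score is discarded, shrinking the sample.

```python
import numpy as np

# Toy N by p data matrix with missing values (np.nan) in two rows.
data = np.array([
    [1.0,    2.0,    3.0],
    [4.0,    np.nan, 6.0],
    [7.0,    8.0,    9.0],
    [np.nan, 11.0,   12.0],
    [13.0,   14.0,   15.0],
])

# Listwise deletion: drop every row that contains any missing score.
complete = data[~np.isnan(data).any(axis=1)]

print(data.shape[0], "->", complete.shape[0])  # 5 -> 3 participants remain
```

Two of five rows vanish here; with scattered missingness across many variables, the loss can be far worse.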
Descriptive Statistics
Descriptive statistics provide a first, summary view of the data. This often involves summarizing the central nature of variables (e.g., average or mean score; midpoint
or median score; and most frequently occurring or modal score), ideally from a
representative sample. This also can comprise the spread or range of scores, as
well as the average difference each score is from the mean (i.e., standard devia-
tion). Descriptive statistics also can include measures of skewness, and kurtosis
to indicate how asymmetric or lopsided, and how peaked or heavy-tailed, re-
spectively, is a distribution of scores. Thus, descriptive statistics summarize basic
characteristics of a distribution such as central tendency, variability, skewness,
and kurtosis. Descriptive statistics can be calculated for large multivariate studies
that investigate the relationships among a number of variables, hopefully based
on a well-selected and large sample. With multivariate data, we organize the
means from a set of variables in a column labeled a vector. We organize the vari-
ances and covariances among the variables into a variance-covariance matrix (see
chapter 6).
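These summaries can be sketched on simulated scores (Python with NumPy; the data are hypothetical): skewness and kurtosis are computed from standardized scores, and the mean vector and variance-covariance matrix follow directly from the data matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
scores = rng.normal(loc=100, scale=15, size=(200, 3))  # toy N=200, p=3

# Univariate descriptives for the first variable.
v = scores[:, 0]
mean, median = v.mean(), np.median(v)
sd = v.std(ddof=1)
z = (v - mean) / sd
skewness = (z ** 3).mean()        # asymmetry (near 0 for symmetric data)
kurtosis = (z ** 4).mean() - 3.0  # peakedness relative to the normal

# Multivariate summaries: a mean vector and a variance-covariance matrix.
mean_vector = scores.mean(axis=0)          # length-p vector of means
cov_matrix = np.cov(scores, rowvar=False)  # p x p variance-covariance matrix

print(mean_vector.shape, cov_matrix.shape)
```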
Another form of descriptive statistics occurs when we synthesize information
from multiple variables in a multivariate analysis using inferential statistics on
a specific, nonrandom sample. For example, an instructor may want to describe
the nature of class performance from a specific set of variables (e.g., quizzes,
tests, projects, homework) and sample (e.g., one classroom). If she wanted to
describe group differences between male and female students, she could conduct
a multivariate analysis of variance with a categorical IV, gender, and the several
continuous outcomes she measured from students' performance. Results would not
necessarily be generalized beyond her immediate classroom, although they could
provide a descriptive summary of the nature of performance between gender groups
in her class of students.
Inferential Statistics
Inferential statistics allow us to generalize beyond our sample data to a larger
population. With most statistical methods, inferences beyond one's specific data
are more reliable when statistical assumptions are met.
Statistical assumptions for multivariate analyses include the use of representa-
tive samples, a normally or bell-shaped distribution of scores, linear relationships
between the variables (i.e., variables that follow an additive pattern of the form Y = B1X1 + B2X2 + ... + BpXp + error), and homoscedasticity (i.e., similar variance on one variable
along all levels of another variable). We also would want to make sure that
variables are not too highly overlapping or collinear. This would cause insta-
bility in statistical analyses so that it would be difficult to decide to which vari-
able a weight should be attached. Generally, if variables are correlated greater
than 0.90, collinearity is most likely present and decisions will have to be made
as to whether to drop one of the variables or to combine them in a composite
variable.
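The 0.90 rule of thumb can be checked mechanically; in this sketch (invented data) one variable is deliberately constructed to be nearly a copy of another, so the flag trips.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)  # nearly a copy of x1 -> collinear
x3 = rng.normal(size=n)             # unrelated variable

data = np.column_stack([x1, x2, x3])
corr = np.corrcoef(data, rowvar=False)

# Flag any off-diagonal correlation above the 0.90 rule of thumb.
mask = ~np.eye(3, dtype=bool)
collinear = np.abs(corr[mask]).max() > 0.90

print(collinear)
```

When the flag trips, the usual remedies are the ones named above: drop one of the offending variables or combine them into a composite.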
Examining skewness and kurtosis, bivariate plots of the variables, and corre-
lations among the variables can check assumptions and collinearity. We also may
want to consider whether any variables need to be transformed to meet assump-
tions. If any variables have skewness greater than an absolute value of about 1.0, or
kurtosis greater than about 2.0, it may be helpful to consider taking a logarithmic
transformation to reduce nonnormality. Although transformed scores are not in
the original metric and interpretation may be difficult, many scores used in the
social sciences have arbitrary scales. Thus, transformations, although somewhat
controversial, still may be preferred to increase the power of analyses and decrease
bias by meeting assumptions (Cohen, Cohen, West, & Aiken, 2003; Johnson &
Wichern, 2002; Tabachnick & Fidell, 2001). Consider making transformations
especially when analyzing data concerned with extreme variables such as drug use
and sexual risk. If the data can be transformed to meet assumptions, then inferences
made on these data may be more accurate.
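A brief sketch (simulated scores; the lognormal variable stands in for a skewed measure such as drug use) shows how a logarithmic transformation pulls a positively skewed distribution toward symmetry:

```python
import numpy as np

def skew(v):
    """Sample skewness from standardized scores."""
    z = (v - v.mean()) / v.std()
    return (z ** 3).mean()

rng = np.random.default_rng(4)
# Lognormal scores: piled up at the low end with a long right tail.
raw = rng.lognormal(mean=0.0, sigma=1.0, size=2000)

transformed = np.log(raw)  # logarithmic transformation

# Raw skewness well above the |1.0| guideline; transformed near zero.
print(round(skew(raw), 2), round(skew(transformed), 2))
```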
Inferential statistics allow estimates of population characteristics from sam-
ples representative of the population. Inferences are strengthened when potential
extraneous variables are identified and taken into account. Likewise, if we can
show that the data follow expected assumptions so that we can rule out random
chance with our findings, then results are more conclusive. Multivariate research
that shares these same features (i.e., representative samples, controlling for con-
founds, and normally distributed data) can provide a basis for inferences beyond
the immediate sample to a large population of interest.
the categorical outcome. In LR, weights are interpreted as the odds that an indi-
vidual with the IV characteristic will end up in a specific outcome group (Hosmer
& Lemeshow, 2000). If there is a 50/50 chance of ending up in a specific group,
the odds will be equal to 1.0. If there is less chance, then the odds will be less
than 1.0; conversely, if there is greater than a 50/50 chance of being in a specific
group based on the IV characteristic, the odds will be greater than 1.0. Predictor
variables that are meaningful in an analysis generally have an odds value different
from 1.0 associated with them.
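The link between a logistic regression weight and its odds value is simply exponentiation; the predictor names and weights below are hypothetical, not estimates from any data in this book.

```python
import math

# Logistic regression models the log-odds of group membership:
#   log(p / (1 - p)) = B0 + B1*X1 + ...
# Exponentiating a weight gives the odds associated with a one-unit
# increase in that predictor (hypothetical weights):
weights = {"risk_history": 0.8, "age": 0.0, "support": -0.5}

odds = {name: math.exp(b) for name, b in weights.items()}

# 1.0 means no change in the odds; above 1.0 the predictor raises
# the odds of being in the target group, below 1.0 it lowers them.
for name, o in odds.items():
    print(name, round(o, 2))
```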
Multivariate correlational methods, such as canonical correlation (CC), princi-
pal components analysis (PCA), and factor analysis (FA) incorporate a large set
of continuous variables. In CC, there are two sets of variables, where the goal
is to explore the patterns of relationships between the two sets (see Chapter 10).
In PCA or FA, there is one set of continuous variables with the goal of trying to
identify a smaller set of latent dimensions that underlie the variables (see Chapter 11).
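A minimal PCA sketch (simulated data, not from the book): six observed variables are generated from two underlying dimensions, and the eigenvalues of their correlation matrix reveal that two components summarize the set.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
# Six observed variables driven by two underlying dimensions.
f1, f2 = rng.normal(size=(2, n))
noise = lambda: 0.4 * rng.normal(size=n)
obs = np.column_stack([
    f1 + noise(), f1 + noise(), f1 + noise(),
    f2 + noise(), f2 + noise(), f2 + noise(),
])

# PCA via eigendecomposition of the correlation matrix: a few large
# eigenvalues mark the components that condense the variable set.
corr = np.corrcoef(obs, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]  # sorted, largest first

big = (eigvals > 1.0).sum()  # Kaiser rule of thumb: eigenvalues > 1
print(big)
```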
TABLE 3.1
Summary of Background Themes to Consider for Multivariate Methods
TABLE 3.2
Questions to Ask for Each Multivariate Method
1. What is this method and how is it similar to and different from other methods?
2. When is this method used and what research questions can it address?
3. What are the main multiplicity themes for this method?
4. What are the main background themes applied to this method?
5. What is the statistical model that is tested with this method?
6. How do central themes of variance, covariance, and linear combinations apply?
7. What are the main themes needed to interpret results at a macro-level?
8. What are the main themes needed to interpret results at a micro-level?
9. What are some other considerations or next steps after applying this method?
10. What is an example of applying this method to a research question?
REFERENCES
Allison, P. D. (2001). Missing data (Quantitative Applications in the Social Sciences). Newbury Park,
CA: Sage.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psycho-
logical research: Conceptual, strategic, and statistical considerations. Journal of Personality and
Social Psychology, 51, 1173-1182.
Britt, D. W. (1997). A conceptual introduction to modeling: Qualitative and quantitative perspectives.
Mahwah, NJ: Lawrence Erlbaum Associates.
Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation
analysis for behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Collins, L. M., Graham, J. W., & Flaherty, B. P. (1998). An alternative framework for defining mediation.
Multivariate Behavioral Research, 33, 295-312.
Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6(4), 330-351.
Devlin, K. (1994). Mathematics: The science of patterns. The search for order in life, mind, and the
universe. New York: Scientific American Library.
Enders, C. K. (2001). The performance of the full information maximum likelihood estimator in
multiple regression models with missing data. Educational & Psychological Measurement, 61,
713-740.
Gogineni, A., Alsup, R., & Gillespie, D. F. (1995). Mediation and moderation in social work research.
Social Work Research, 19, 57-63.
Harlow, L. L., Mitchell, K. J., Fitts, S. N., & Saxon, S. E. (1999). Psycho-existential distress and problem behaviors: Gender, subsample and longitudinal tests. Journal of Applied Biobehavioral Research, 4, 111-138.
Harlow, L. L., Quina, K., & Morokoff, P. (1991). Predicting heterosexual HIV-risk in women. NIMH
Grant MH47233 (awarded $500,000).
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression (2nd ed.). New York: Wiley.
4O CHAPTER 3
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (2nd ed.).
Englewood Cliffs, NJ: Prentice-Hall, Inc.
Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). New York:
Wiley.
MacKinnon, D. P., & Dwyer, J. H. (1993). Estimating mediated effects in prevention studies. Evaluation
Review, 17(2), 144-158.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman & Hall.
Rubin, D. B. (2004). Multiple imputation for nonresponse in surveys. New York: Wiley.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman & Hall/CRC.
Schnoll, R. A., Harlow, L. L., Stolbach, L. L., & Brandt, U. (1998). A structural model of the rela-
tionships among disease, age, coping, and psychological adjustment in women with breast cancer.
Psycho-Oncology, 7, 69-77.
Sijtsma, K., & van der Ark, L. A. (2003). Investigation and treatment of missing item scores in test
and questionnaire data. Multivariate Behavioral Research, 38, 505-528.
Sinharay, S., Stern, H. S., & Russell, D. (2001). The use of multiple imputation for the analysis of
missing data. Psychological Methods, 6, 317-329.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn and
Bacon.
II
Intermediate Multivariate
Methods With 1
Continuous Outcome
4
Multiple Regression
are used with a categorical outcome. We will discuss more about DFA and LR
methods in Chapters 8 and 9, respectively.
For those who are interested in a very thorough and readable presentation of MR,
consider the acclaimed volume Applied Multiple Regression/Correlation Analysis
for the Behavioral Sciences (3rd ed.), by Cohen, Cohen, West and Aiken (2003).
a. Standard MR. This approach loads all predictors in one step. It is useful
when predicting or explaining a single phenomenon with a set of predictors (e.g.,
predict amount of condom use with a set of attitudinal, interpersonal and behavioral
variables). Figure 4.1 depicts this example with three predictors.
b. Hierarchical MR. This form, also called sequential MR, allows researchers
to theoretically order variables in specific steps. It allows assessment of whether a
set of variables substantially adds to prediction, over and above one or more other
variables already in the analysis (e.g., assess whether attitudinal variables increase
prediction, over and above behavioral predictors of condom use).
c. Stepwise MR. This kind, sometimes called statistical MR, has the computer
select variables for entry. Selection is based on the IV that has the strongest partial
correlation with the DV, after controlling for the effect of variables already in
the equation. This form capitalizes on chance variation in the data much more
than standard or hierarchical MR, and for this reason it is not often recommended
(Cohen, Cohen, West, & Aiken, 2003). Still, it may be useful when it is important
to identify the most salient predictors of an outcome, particularly in a new field
that does not allow much theory to guide the researcher. For example, a researcher
could assess which are the most important predictors (i.e., behavioral, attitudinal, or
environmental) of an outcome variable designating a new form of disease. Because
of the atheoretical nature of stepwise MR, a large sample size and replication should
be used.
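The hierarchical (sequential) logic can be sketched with simulated data: fit the behavioral predictor first, then add the attitudinal predictor and examine the increment in R-squared. The variable names and effect sizes below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400
behav = rng.normal(size=n)  # behavioral predictor, entered at step 1
attit = rng.normal(size=n)  # attitudinal predictor, entered at step 2
y = 0.5 * behav + 0.3 * attit + rng.normal(size=n)

def r_squared(preds, out):
    """R-squared from a least-squares fit with intercept."""
    X = np.column_stack([np.ones(len(out))] + list(preds))
    b, *_ = np.linalg.lstsq(X, out, rcond=None)
    resid = out - X @ b
    return 1 - resid.var() / out.var()

r2_step1 = r_squared([behav], y)
r2_step2 = r_squared([behav, attit], y)

# The R-squared increment is the added prediction from the attitudinal
# variable over and above the behavioral one.
print(round(r2_step2 - r2_step1, 3))
```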
at least 100 (Green, 1991), and much more (e.g., 200-400+) when assumptions
are violated or when capitalizing on chance variation (e.g., with stepwise MR).
We also should compute descriptive statistics (e.g., means, standard deviations)
and test assumptions (i.e., examine bivariate scatter plots for pairs of variables,
and skewness and kurtosis for all variables).
If data do not appear to conform to assumptions (e.g., normality, linearity), con-
sider making transformations on pertinent variables. For example, a variable such
as substance abuse could be highly lopsided, with scores bunched at the low end (i.e., positively skewed).
This would indicate that most individuals reported low or no substance abuse,
whereas a few individuals reported moderate to high substance use. This pattern
of responses would be depicted by a peak at the low end of the score range with a
tail trailing off to the right or high end of the range of scores. In this case, it would
be helpful to consider transforming the scores to readjust the variable into one
that more closely followed assumptions (e.g., normality). Several transformations
could be considered (e.g., logarithm or square root) to assess which is best for
evening out the distribution of a variable such as substance abuse. Several excel-
lent discussions of transformations are offered elsewhere (Cohen, Cohen, West &
Aiken, 2003; Tabachnick & Fidell, 2001).
Correlations among the variables and reliability coefficients also should be
calculated at this point. Because MR assumes that predictors are perfectly reliable,
regression coefficients are biased with unreliable variables. To the extent that data
were randomly selected and relevant, the data met assumptions, and predictors are
reasonably reliable (e.g., reliability coefficients >0.70), we could infer past the
initial sample to a larger population with some degree of confidence. Correlations
also should be scanned to ensure that variables are not overly correlated (i.e.,
multicollinear) with other variables. Correlations of 0.90 or higher indicate clear
multicollinearity, although correlations as high as 0.70 may be suggestive of a
potential problem (Tabachnick & Fidell, 2001).
Y = X' + E (4.2)
The goal is to find the multiple correlation, R, between X' and Y and show that
it is significantly different from zero. Significance tests for individual regression
weights, B, are then conducted to determine which of the independent (X) variables
significantly predict the outcome variable, Y. Seen in this way, a MR model is
very similar to a model between just a single X and a single Y variable. The main
difference is that with MR, the single predictor, X', is actually a linear combination
of multiple X variables.
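This view of MR can be sketched directly on simulated data (the weights below are invented): form the least squares combination X′ and correlate it with Y to obtain the multiple correlation R.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300
X = rng.normal(size=(n, 3))  # three predictors (toy data)
y = X @ np.array([0.6, 0.3, 0.0]) + rng.normal(size=n)

# Least squares linear combination X' of the predictors.
Xd = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(Xd, y, rcond=None)
x_prime = Xd @ b  # predicted scores, i.e., the single combined predictor

# The multiple correlation R is the simple correlation between X' and Y.
R = np.corrcoef(x_prime, y)[0, 1]
print(round(R, 2))
```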
There is less adjustment with a large sample size, N, and a small number of
predictors, p. The adjusted R2 value could be substantially smaller than the original
R2 with a small sample size and a large number of predictors. The example provided
at the end of the chapter provides a calculation for an adjusted R2 value with a
sample size of 527 and four predictors. As can be expected, there is very little
adjustment with such a large sample size and only a few predictors.
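Using the common adjustment formula, adjusted R² = 1 − (1 − R²)(N − 1)/(N − p − 1), the calculation for N = 527 and p = 4 can be sketched as follows; the R² value of .30 is hypothetical, not the one from the chapter's example.

```python
# Adjusted R-squared: 1 - (1 - R^2) * (N - 1) / (N - p - 1).
# N and p match the chapter's example (527 participants, four
# predictors); the R-squared value is hypothetical.
N, p = 527, 4
r2 = 0.30

adj_r2 = 1 - (1 - r2) * (N - 1) / (N - p - 1)
print(round(adj_r2, 4))  # 0.2946 -- barely below the original 0.30
```

With such a large N and few predictors, the shrinkage is trivial; with N = 50 and the same p, the same formula would pull the value noticeably lower.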
1 Also see the webpage: http://members.aol.com/johnp71/pdfs.html to calculate F, t, z, and χ² values or corresponding p values.
After determining that the least squares linear combination of X variables, X', is
significantly related to the outcome variable, Y, we can then investigate which
variables are contributing to the overall relationship. As with macro-level assess-
ment, we first examine statistical tests and then proceed to evaluate the magnitude
of the effects.
Weights
After determining that a variable is a significant predictor of Y, we usually want
to examine the standardized weights that indicate how much a specific variable is
contributing to an analysis. In MR, we examine least squares regression weights
that tell us how much a predictor variable covaries with an outcome variable after
taking into account the relationships with other predictor variables. Thus, a stan-
dardized regression coefficient is a partial correlation between a specific predictor,
Xi, and the outcome, Y, that does not include the correlation between Xi and the other X
variables. The standardized regression weight, or beta weight, β, indicates the re-
lationship between a predictor and the outcome, after recognizing the correlations
among the other predictor variables in the equation. High absolute values (e.g.,
|β| > 0.30) indicate better predictive value for standardized regression coefficients.
In an unstandardized metric (i.e., B), the weight represents the change in an
outcome variable that can be expected when a predictor variable is changed by
one unit. We could focus on the unstandardized regression weights for at least
two reasons. The first is when we want to assess the significance of a predictor
variable. To do this, we examine the ratio of the unstandardized regression weight
over the standard error associated with a specific variable, forming a critical ratio
that is interpreted as a t-test. If the p-value associated with the t-test is less than
a designated alpha level (e.g., 0.05 or 0.01), then the variable is said to be a significant
predictor of the outcome variable. The second reason to focus on unstandardized
regression coefficients is if we want to compare regression weights with those from
another sample or form a prediction equation for possible use in another sample.
The unstandardized regression weights retain the original metric and provide a
better basis for comparing across different samples that may have different stan-
dardized weights even when unstandardized weights are similar (Pedhazur, 1997).
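The critical ratio just described can be sketched in numpy on simulated data (N and p below mirror the chapter's example, but the data and coefficients are made up): each unstandardized weight B is divided by its standard error, and with df = 522 the t distribution is close enough to normal that a two-sided p-value can be approximated from the normal tail.

```python
import math
import numpy as np

# Simulated data, for illustration only.
rng = np.random.default_rng(1)
N, p = 527, 4
X = rng.normal(size=(N, p))
Y = 2.7 + X @ np.array([0.2, -0.3, 0.45, -0.5]) + rng.normal(size=N)

X1 = np.column_stack([np.ones(N), X])
B, *_ = np.linalg.lstsq(X1, Y, rcond=None)           # unstandardized weights

resid = Y - X1 @ B
df = N - p - 1
s2 = (resid @ resid) / df                            # residual variance
SE = np.sqrt(s2 * np.diag(np.linalg.inv(X1.T @ X1))) # standard errors

t = B / SE                                           # critical ratios, one per weight
# Normal-tail approximation to the two-sided p-value (reasonable at df = 522).
p_values = np.array([math.erfc(abs(ti) / math.sqrt(2)) for ti in t])
```

A statistics package would report exact t-based p-values; the normal approximation is used here only to keep the sketch self-contained.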
Regression weights can also provide a measure of ES. To keep the micro-level
ES in the same metric as the macro-level R2, we could square the standardized
coefficients. Much like values for bivariate r², values of β² could be interpreted
as small, medium, and large micro-level ESs when equal to 0.01, 0.06, and 0.13,
respectively.
Descriptive Statistics
Notice from the descriptive statistics of Table 4.1 that all the variables are relatively
normally distributed. Skewness values are within an acceptable range of −1.0 to
+1.0, except for PROSB, which shows slight skewness. Kurtosis values are all
reasonable (i.e., below 2.0), indicating that the data are not all piled up at specific
values.
TABLE 4.1
Descriptive Statistics for 4 IVs and the DV, Stage of Condom Use
TABLE 4.2
Coefficient Alpha and Test-Retest Reliability Coefficients
Time Points
TABLE 4.3
Correlation Coefficients Within Time B, N = 527
TABLE 4.4
Summary of Macro-Level Standard MR Output
Analysis of Variance
different from zero. The multivariate effect size, R2, is large (i.e., >0.26), as is the
adjusted R2 (see below) that corrects for sample size and the number of predictors.
Using Equation 4.3, presented earlier, we could verify the value for the F-test by
showing that F(4, 522) = (R²/p)/[(1 − R²)/(N − p − 1)] = (0.286/4)/(0.714/522) =
0.0715/0.001368 = 52.27.
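This verification is a one-line computation from the values reported in Table 4.4:

```python
# Re-derive the macro-level F-test from R^2, p, and N alone,
# using F = (R^2/p) / [(1 - R^2)/(N - p - 1)].
R2, p, N = 0.286, 4, 527

F = (R2 / p) / ((1 - R2) / (N - p - 1))
print(round(F, 2))  # -> 52.27
```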
Note that Table 4.4 also presents a value for adjusted R2 that provides an
indication of the true value of R2 in the population. The value for adjusted R2 is
very close to the unadjusted value because of the large sample size and the relatively
small number of predictors. The adjusted R2 value would be markedly less than the
unadjusted value with small sample sizes and a large number of predictors. For our
example, the value for adjusted R2 could be verified by solving for Equation 4.4:
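The display for Equation 4.4 is not reproduced here; assuming the usual shrinkage formula, adjusted R² = 1 − (1 − R²)(N − 1)/(N − p − 1), the reported value can be recovered directly:

```python
# Adjusted R^2 via the standard shrinkage formula (assumed to be Equation 4.4):
# adj R^2 = 1 - (1 - R^2)(N - 1)/(N - p - 1).
R2, p, N = 0.286, 4, 527

adj_R2 = 1 - (1 - R2) * (N - 1) / (N - p - 1)
print(round(adj_R2, 4))  # -> 0.2805
```

With N = 527 and only four predictors, the adjustment shaves off very little, exactly as the text notes.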
Table 4.5 presents a micro-level summary of the regression coefficients for the
standard MR. Examining the unstandardized weights (i.e., parameter estimates)
and their associated t-values, we see that all the predictors are significantly related
to stage of readiness to use condoms (i.e., STAGEB). Combining Equations 4.1
TABLE 4.5
Summary of Micro-Level Standard MR Output
Parameter Estimates
TABLE 4.6
Step 1 of Macro-Level Hierarchical MR Output
and 4.2, we could write the prediction equation for this example as: Y = A +
B1(PROSB) + B2(CONSB) + B3(CONSEFFB) + B4(PSYSXB) + E. Insert-
ing the unstandardized parameter estimate values from Table 4.5, we have the
prediction equation: Y = 2.705 + 0.188(PROSB) − 0.326(CONSB) + 0.447
(CONSEFFB) − 0.484(PSYSXB) + E.
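Written as a small function, the fitted equation can generate a predicted score for any respondent; the predictor values passed in below are hypothetical, chosen only to illustrate the arithmetic.

```python
# Prediction equation built from the unstandardized estimates in Table 4.5.
def predict_stageb(prosb, consb, conseffb, psysxb):
    """Predicted stage of readiness to use condoms."""
    return (2.705 + 0.188 * prosb - 0.326 * consb
            + 0.447 * conseffb - 0.484 * psysxb)

# Hypothetical respondent (values made up for illustration).
y_hat = predict_stageb(prosb=3.0, consb=2.0, conseffb=4.0, psysxb=2.0)
```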
Examining the standardized estimates, the strongest predictor of STAGEB is
self-efficacy for condom use (i.e., CONSEFFB), which has a standardized beta
weight of 0.37. The variable PROSB (i.e., perceptions of the advantages of condom
use) is least related to stage of readiness, with a beta weight of 0.11.
TABLE 4.7
Step 1 of Micro-Level Hierarchical MR Output
Parameter Estimates
TABLE 4.8
Step 2 of Macro-Level Hierarchical MR Output
At the final step, Model 3, the variable PSYSXB is added so that at this point
there are the full set of four predictors in the equation. Not surprisingly, the statistics
look the same as those from the standard MR on these same variables. This shows
that the order of entry into the equation makes a difference only during the interim
steps when there are varying numbers of variables included. When all variables
have been entered, the significance tests (i.e., F and t), effect sizes (e.g., R2), and
TABLE 4.9
Step 2 of Micro-Level Hierarchical MR Output
Parameter Estimates
TABLE 4.10
Step 3 of Macro-Level Hierarchical MR Output
parameter estimates (i.e., B and β) are the same regardless of their ordering. See
Tables 4.10 and 4.11.
TABLE 4.11
Step 3 of Micro-Level Hierarchical MR Output
Parameter Estimates
TABLE 4.12
Step 1 of Macro-Level Stepwise MR Output
TABLE 4.13
Step 1 of Micro-Level Stepwise MR Output
Parameter Estimates
seem like a reasonable choice for adding at Step 2. However, a glance at the
correlations among the IVs reveals that CONSB is moderately correlated (i.e., r =
-0.49) with CONSEFFB, the variable that entered first. The other IV, PROSB, also
has a somewhat moderate correlation (i.e., r = 0.36) with CONSEFFB, though
PSYSXB does not have much overlap with CONSEFFB (i.e., r = 0.24). Thus, the
computer selects PSYSXB to enter second into the Stepwise MR equation. Tables
4.14 and 4.15 show results for the second step.
Tables 4.16 and 4.17 present results at Step 3 when CONSB is added to the
Stepwise regression analysis. Results for Step 4 are given in Tables 4.18 and 4.19,
when the final variable, PROSB, is added.
TABLE 4.14
Step 2 of Macro-Level Stepwise MR Output
TABLE 4.15
Step 2 of Micro-Level Stepwise MR Output
Parameter Estimates
TABLE 4.16
Step 3 of Macro-Level Stepwise MR Output
TABLE 4.17
Step 3 of Micro-Level Stepwise MR Output
Parameter Estimates
TABLE 4.18
Step 4 of Macro-Level Stepwise MR Output
Source   df   Sum of Squares   Mean Square   F-Value   Prob. > F
TABLE 4.19
Step 4 of Micro-Level Stepwise MR Output
Parameter Estimates
Note in Table 4.19 that this final stepwise model is the same as the last step
in hierarchical and standard MR, because all the variables were included in the
equation for each MR. If only a subset of variables were significant in the stepwise
MR, the final equation would have fewer variables than in hierarchical and standard
MR and the final output would not be the same for stepwise MR. In each form
of MR (standard, hierarchical, and stepwise), all four predictor variables were
entered into the model, providing the same results for the final output.
Table 4.20 provides a summary of the four steps in the stepwise MR for this
example with R-Squares, F-Values, and probability values for each step.
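The F-values reported at each step can be recovered (up to the table's rounding) from the R-squares alone, using an F-to-enter test: F = (partial R²/1)/[(1 − model R²)/(N − k − 1)], where k is the number of predictors in the model at that step. A short check against the values reported in Table 4.20:

```python
# Recompute the stepwise F-values from the R-squares in Table 4.20.
N = 527
partial_r2 = [0.2001, 0.0470, 0.0283, 0.0106]
model_r2 = [0.2001, 0.2472, 0.2755, 0.2860]
reported_f = [131.37, 32.73, 20.42, 7.73]

computed_f = [
    pr2 / ((1 - mr2) / (N - k - 1))
    for k, (pr2, mr2) in enumerate(zip(partial_r2, model_r2), start=1)
]
```

Small discrepancies (a few hundredths) arise because the table reports R-squares to only four decimal places.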
Notice that in all three forms of MR (standard, hierarchical, and stepwise), the
contribution of PSYSXB seems to be larger in the presence of other variables than
the simple correlation between PSYSXB and STAGEB (i.e., r = -0.10). This
suggests that PSYSXB may be what is called a suppressor variable that facilitates
relationships within an analysis. Though discussion of this kind of variable is
beyond the scope of this book, interested readers are encouraged to examine this
in more detail elsewhere (Cohen, Cohen, West, & Aiken, 2003; Tabachnick &
Fidell, 2001; Grimm & Yarnold, 1995).
Figure 4.2 depicts the four predictors of stage of condom use, with the standard-
ized coefficients found from the final step in standard, hierarchical, and stepwise
TABLE 4.20
Summary of Micro-Level Stepwise MR Output
Step                      1          2          3          4
Variable Entered          CONSEFFB   PSYSXB     CONSB      PROSB
Number of Variables In    1          2          3          4
Partial R-Square          0.2001     0.0470     0.0283     0.0106
Model R-Square            0.2001     0.2472     0.2755     0.2860
Adjusted R-Square         0.1940     0.2414     0.2700     0.2805
F-Value                   131.3700   32.7300    20.4200    7.7300
Pr > F                    <0.0001    <0.0001    <0.0001    0.0056
MR analyses conducted on these data. Though it is not always the case, the re-
sults were the same because all variables were retained in all three analyses. For
ease of presentation, values for the correlations among the predictor variables and
prediction error are not indicated in Figure 4.2.
These results suggested that almost 30% of the variance in stage of condom use
is associated with the four predictors. Beginning with the variables that contributed
the most, with standardized regression coefficients (β) given in parentheses, stage of
condom use is associated with greater perceived self-efficacy for using condoms
(β = 0.37), less positive psychosexual functioning (β = −0.26), fewer perceived
disadvantages (i.e., cons) of condom use (β = −0.20), and greater perceived ad-
vantages (i.e., pros) of condom use (β = 0.11). Ideally, these results should be
replicated in a larger, more diverse, and random sample.
SUMMARY
Table 4.21 summarizes how the main multiplicity, background, central, and inter-
pretation themes apply to MR.
REFERENCES
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. San Diego, CA: Academic
Press.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Cohen, J., Cohen, P., West, S.G., & Aiken, L.S. (2003). Applied multiple regression/correlation analysis
for behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Goldman, J. A., & Harlow, L. L. (1993). Self-perception variables that mediate AIDS preventive
behavior in college students. Health Psychology, 12, 489-498.
Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral
Research, 26, 449-510.
Grimm, L. G., & Yarnold, P. R. (1995). Reading and understanding multivariate statistics (Ch. 2,
pp. 19-64). Washington, DC: APA.
Harlow, L. L., Quina, K., Morokoff, P. J., Rose, J. S., & Grimley, D. (1993). HIV risk in women: A
multifaceted model. Journal of Applied Biobehavioral Research, 1, 3-38.
Harlow, L., Rose, J., Morokoff, P., Quina, K., Mayer, K., Mitchell, K. & Schnoll, R. (1998). Women
HIV sexual risk takers: Related behaviors, interpersonal issues & attitudes. Women's Health:
Research on Gender, Behavior and Policy, 4, 407-39.
Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psycho-
logical Measurement, 56, 746-759.
Marcus, B. H., Eaton, C. A., Rossi, J. S., & Harlow, L. L. (1994). Self-efficacy, decision-making and
stages of change: A model of physical exercise. Journal of Applied Social Psychology, 24,489-508.
Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3rd ed.).
Fort Worth, TX: Harcourt Brace College Publishers.
Prochaska, J. O., Redding, C. A., Harlow, L. L., Rossi, J. S., & Velicer, W. F. (1994a). The Transtheo-
retical model and HIV prevention: A review. Health Education Quarterly, 21, 45-60.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994b). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Prochaska, J. O., & Velicer, W. F. (1997). The transtheoretical model of health behavior change.
American Journal of Health Promotion, 12, 38-48.
SAS (1999). Statistical Analysis Software, Release 8.1. Cary, NC: SAS Institute Inc.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.: Ch. 5, pp. 111-176).
Boston: Allyn and Bacon.
5
Analysis of Covariance
Themes Applied to Analysis of Covariance (ANCOVA)
ANCOVA can be viewed as a type of partial correlation analysis, where the focus is on the relationship between
the IVs and the DV with the effect of the covariate(s) partialed out of the DV.
ANCOVA requires the following pattern of correlations:
(5.1)
(5.2)
(5.3)
Equation 5.3 shows that, when the DV is adjusted for the effects of the covariate,
this adjusted Y score can be modeled just like the unadjusted Y in the ANOVA
model (though we use up an extra df for each covariate that is added). Thus,
ANCOVA can be viewed as an ANOVA on Y scores in which the relationships
between the covariates and the DV are partialed out of the DV.
Similar to what is done with MR, we examine a regression line between two
continuous variables. Whereas in MR the regression is between an IV and the
DV, in ANCOVA the regression is between a covariate, C, and the DV. Of course
a covariate is an IV, much like an IV in MR, though it is not usually the main
focus in an ANCOVA where one or more categorical IVs are often examined as
the main focus for explaining the variance in an outcome variable, Y. Another
distinction is that in MR we think of adding extra continuous variables to improve
η² = (SSeffect)/(SStotal) (5.4)
Like R², η² is a proportion ranging from 0 to 1. We will find out more about
η² in the next section, which examines macro-level interpretation of ANCOVA
results.
Significance Test
As with ANOVA and MR, the main overall index of fit in ANCOVA is the
F-test. If the F-test is significant in ANCOVA (e.g., p < 0.05), we can conclude
that there are significant mean differences between at least two groups, after taking
into account one or more covariates.
a. Follow-up planned comparisons (e.g., post hoc, simple effects, etc.) pro-
vide a specific examination of which groups are showing differences. Tukey
(1953) tests of an honestly significant difference (HSD) between a pair of
means provide some protection for overall error rate. Though I tend to con-
duct Tukey tests using a stringent alpha level of 0.01, this may be too conser-
vative. As with the choice of ES guidelines, I leave the selection of a p-value
size to the researcher's good judgment, something we have to rely on more
often than we would like to think in science.
Descriptive Statistics
Table 5.1 presents descriptive statistics for the variables used in this analysis of
covariance example. From the frequencies presented for the IV, STAGEA, we see
that there are 527 participants who contributed data for these analyses. Individuals
TABLE 5.1
ANCOVA Example Descriptive Statistics
STAGEA   Frequency   Percent   Cumulative Frequency   Cumulative Percent
are unevenly distributed across the five stages of condom use, with most in the con-
templation stage where they are considering possible condom use. The next most
endorsed category is the precontemplation stage where 151 of the 527 individuals
are not even considering the use of condoms. The least endorsed category is the
action stage where only 20 individuals (i.e., 4%) report that they have consistently
used condoms for the last 6 months or less. Having unequal cell sizes for the five
stages is not optimal for ANCOVA, nor is it preferable to have naturally occur-
ring, intact stages where participants are not randomly assigned. Still, the analysis
can provide descriptive information on whether perceived disadvantages to using
condoms differ after 6 months for individuals at various stages of readiness to use
condoms, while controlling for initial perceived cons of condom use.
From the means, standard deviations, skewness, and kurtosis values, it appears
that all variables are relatively normally distributed, with low levels of endorsement
on the one to five point scales. All three variables exhibit slight skewness (i.e., see
values close to 1.0). Still, the means do not reach ceiling or floor levels and are
all greater than their standard deviations, suggesting reasonably even distributions
for the variables.
Correlations
As shown in Table 5.2, there is a relatively strong correlation (r = 0.61) between
the covariate (i.e., CONSA) and the DV (CONSB), as is needed. The size of
this correlation indicates reasonable test-retest reliability over 6 months on the
perceived disadvantages to using condoms. The ordinal variables, STAGEA and
STAGEB correlate moderately high (i.e., r = 0.67), also suggesting fairly stable
test-retest reliability over a 6-month period and relatively static stages of change
in this naturalistic study.
TABLE 5.2
Pearson Correlation Coefficients (N = 527)
Finally, there is also a relatively small, though not zero, significant correla-
tion (i.e., r = -0.30) between the IV (STAGEA) and the covariate (CONSA).
Remember that the assumption of homogeneity of regressions assumes that the
slopes between the covariate and DV are similar across all levels of the IV. This as-
sumption implies that the covariate and IV are not correlated. Even the relatively
small correlation (i.e., r = —0.30) found here could lead to possible problems
with heterogeneity of regressions when conducting an ANCOVA on these data
later on. A possible violation of the assumption of homogeneity of regressions
will be examined more closely in the next section.
TABLE 5.3
Testing for Homogeneity of Regressions
Source   df   Sum of Squares   Mean Square   F-Value   Prob. > F
TABLE 5.4
ANOVA Macro-Level Results
The GLM Procedure: Class Level Information
Class Levels Values
STAGEA   5   1 2 3 4 5
Number of observations 527
Dependent Variable: CONSB
TABLE 5.5
Micro-Level Tukey Tests for ANOVA
*Mean differences that are indicated with an asterisk are significant. Thus, STAGE 5 is significantly
different from STAGES 1, 2 & 3.
Tukey planned comparison tests to assess whether there are any significant differ-
ences in perceived disadvantages to condom use, across the five stages of condom
use.
Table 5.5 presents the micro-level Tukey tests for the significant macro-level
F-test in the ANOVA just conducted. Notice that there are significant mean differ-
ences between STAGE 5, and stages 1, 2, and 3. Thus, individuals in the mainte-
nance stage of condom use (i.e., regularly using a condom for 6 months or longer)
report having significantly fewer perceived disadvantages to using condoms than
TABLE 5.6
ANCOVA Macro-Level Results
The GLM Procedure: Class Level Information
Class Levels Values
STAGEA   5   1 2 3 4 5
Number of observations 527
Dependent Variable: CONSB
condom use and perceived cons of condom use 6 months later. The proportion
of shared variance between cons at t1 and cons at t2 is equal to R² = η² =
SSCONSA/SSTotal = 113.19/373.405 = 0.30, which indicates a very large univariate
or multivariate ES for the covariate. Another way of assessing the ES for the
covariate is to subtract the R² value from the ANOVA that included only the IV,
STAGEA, from the R² value for this ANCOVA that includes both STAGEA and
the covariate, CONSA. This also reveals that η²(STAGEA + CONSA) − η²(STAGEA) =
0.39 − 0.09 = 0.30 is the unique ES for CONSA.
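Both routes to the covariate's effect size are simple arithmetic on the reported sums of squares and η² values:

```python
# Route 1: shared variance between cons at t1 and t2, from the sums of squares.
ss_consa, ss_total = 113.19, 373.405
eta2_covariate = ss_consa / ss_total       # about 0.30

# Route 2: unique ES for CONSA as a difference of eta-squared values.
eta2_full, eta2_stagea = 0.39, 0.09
eta2_unique = eta2_full - eta2_stagea      # also 0.30
```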
Figure 5.1 depicts the covariate, CONSA, predicting the DV, CONSB, with the
IV, STAGEA, as the predictor to the adjusted CONSB that has removed the effects
of the covariate.
It is now instructive to view the micro-results (i.e., means and Tukey tests) for
the ANCOVA in Table 5.7. In Table 5.7, we see that the means and confidence
intervals on CONSA across the 5 stages of condom use reveal relatively low values
(i.e., close to 2 on a 5-point scale) with fairly small confidence intervals around
the mean for individuals within stages 1 and 2 and moderately large confidence
intervals for those in stages 3 to 5. This suggests that there is wider variation for
those moving toward using condoms.
In contrast to the ANOVA micro-results, however, no significant differences
are revealed with Tukey tests on CONSB, after controlling for levels of CONSA
6 months earlier. This can be seen in Table 5.7, where none of the p values for
TABLE 5.7
Micro-Level Follow-up Tukey Tests for ANCOVA
the least squares means is less than our stringent alpha level of 0.01, although the
difference between stages 1 and 5 comes close (i.e., p = 0.0101).
Because this is not an experimental design, it is difficult to come to definitive
conclusions regarding the findings. It may be that the initial ANOVA finding of
a (small to) medium ES (i.e., 0.09) was spurious and largely due to other con-
founds. Alternatively, it may be that there is truly a modest ES between STAGEA
and CONSB, although the degree of correlation between the IV and covariate
(i.e., r = —0.30: see correlation matrix presented earlier) was enough to cause
some slight instability in the findings. These findings should be followed up in fu-
ture work to understand better the nature of the links between the stages of condom
use and the perceived disadvantages of condom use.
TABLE 5.8
Multiplicity, Background, Central, and Interpretation Themes Applied to ANCOVA
Themes ANCOVA
SUMMARY
REFERENCES
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. San Diego, CA: Academic
Press.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (2004). Multivariate data analysis (6th ed.).
Prentice Hall.
Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3rd ed.).
Belmont, CA: Wadsworth Publishing.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Rutherford, A. (2001). Introducing ANOVA and ANCOVA: A GLM approach. Thousand Oaks, CA:
Sage Publications.
SAS (1999). The SAS system for windows. Cary, NC: SAS Institute Inc.
Tabachnick, B.G., & Fidell, L.S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn and
Bacon.
Tukey, J.W. (1953). The problem of multiple comparisons. Unpublished manuscript, Princeton Univer-
sity (mimeo).
III
Matrices
6
Matrices and Multivariate Methods
This chapter focuses on matrices and how they relate to multivariate methods.
As such, it is not a chapter on a specific method, but rather foundational infor-
mation to help in understanding the calculations involved in many multivariate
methods [e.g., multivariate analysis of variance (MANOVA), discriminant func-
tion analysis (DFA), canonical correlation (CC)]. Although a great deal could be
written about matrices as is evident in several excellent sources (Harville, 1997;
Namboordiri, 1984; Schott, 1997; Tabachnick & Fidell, 2001; Tatsuoka, 1971),
in this chapter we focus on very basic information and calculations. The initial
framework of presenting material within the 10 questions outlined at the end of
the chapter on background themes will be modified somewhat to accommodate
the topic of matrices.
Matrices can be viewed as cartons for carrying the many numbers involved
in multivariate analyses. Just as scalar equations (e.g., Y = mx + b) organize
numerical information into compact symbolic units, matrices serve this role for
multivariate methods. There are similarities between single numbers and matrices,
some of which are touched on later in this chapter. For example, the concept of
adding or subtracting numbers has a direct parallel with matrices. We can always
add or subtract matrices, as long as they have the same size (i.e., the same number
of rows and the same number of columns). And, just as with individual numbers,
the order of the numbers or matrices does not matter for addition, although it does
matter for subtraction. Thus, when adding two numbers or two matrices, it does not
matter which one comes first, whereas when subtracting two numbers or matrices
the order is important. Another similarity between single numbers and matrices
is the concept of dividing a number or matrix by itself. When dividing a single
number by itself, the number 1 results. When dividing a matrix by itself, a matrix
with 1s along the diagonal and 0s everywhere else (labeled an identity matrix) is
the result. We will see more about this when discussing matrix division.
There are also clear differences between calculations involving single numbers
and those involving matrices. Matrix multiplication and division are two kinds
of calculations that are more complex and seemingly very different from the way
these operations work on single numbers. Calculations with matrices also
allow for concepts not even considered when using scalars. For example, we can
calculate values such as a determinant, a trace, and eigenvalues of a matrix, all of
which provide some indication of the variance of a matrix. These matrix concepts
are discussed in more detail shortly.
There are several different types of matrices, each with different properties that
are often seen in statistics. For each one, I mention the size and the physical
characteristics. This includes making note of the number of rows and columns in
a specific matrix, the shape (e.g., rectangular or square), and whether the matrix
is symmetric (i.e., having the same elements in the upper right triangular portion
as in the lower left triangular portion of a square matrix).
a. Scalar. This is the smallest matrix. It has one row and one column and is
a single (1 x 1) number. Several examples of important scalars used in statistics
include the mean, the number of variables (p), the sample size (N), and the sample
size minus one (N — 1).
b. Data matrix. X is a (N x p) rectangular and nonsymmetrical matrix. A data
matrix is used to store the information collected from N participants in the rows
and on p variables stored in the columns. Thus, a data matrix with information from
100 participants and 20 variables would be a rectangular, nonsymmetric matrix
with 100 rows and 20 columns.
c. Vector. A vector is usually a column (p x 1) or row (1 x p) from a larger
matrix. For example, one column of a data matrix would provide values on one
variable from all the N participants. On the other hand, one row of a data matrix
would yield information on all the variables from one participant. A mean (M)
vector can be created that contains the means of the p variables across all N
participants (or across participants within a group). Mean vectors can then be used
to create deviation scores if subtracted from original X scores (i.e., X — M).
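The mean vector and deviation scores can be sketched with a toy data matrix (all values below are made up for illustration):

```python
import numpy as np

# A toy (N x p) data matrix: 5 participants (rows), 3 variables (columns).
X = np.array([[4., 2., 7.],
              [5., 1., 6.],
              [3., 3., 8.],
              [6., 2., 5.],
              [2., 2., 9.]])

M = X.mean(axis=0)   # mean vector: one mean per variable, across participants
D = X - M            # deviation scores, X - M (broadcast across the rows)
```

By construction, every column of D sums to zero, which is what makes deviation scores useful for building the variance-related matrices that follow.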
d. Sum of squares and cross-products matrix. SSCP is a (p x p) square,
symmetrical matrix and contains the numerators of the variance-covariance, Σ,
matrix. Sums of squares (SS) are listed along the diagonal of the SSCP matrix
and give the squared distance each score is from its mean. In the off-diagonals
of the SSCP matrix, the cross products of the distances for pairs of variables are
provided. We could create a SSCP matrix from calculating the squared deviations
and cross products of an X matrix. In matrix format, SSCP = D'D,
where D = a deviation score matrix, X — M, and D' is the transpose of this deviation
score matrix. A transpose of a matrix is formed by converting each column of a
matrix to a corresponding row of the transposed matrix. Thus, the first column
would become the first row; the second column would become the second row,
and so on in a transposed matrix. This also could be visualized by toppling over
a matrix on its right side (with the first row becoming the first column with the
numbers in reverse order) and then flipping this from front to back (thus shifting
the numbers to their correct position). The resulting SSCP matrix would have the
SS for each variable along the diagonal and the cross products for each pair of
variables in the corresponding off-diagonal cell (e.g., the cross product between
variable 4 and variable 3 would be in the 4th row and the 3rd column). Though
the concept of a SSCP matrix may sound a bit foreign, recognize that SS show
up in other areas that examine the numerator of a variance term [e.g., analysis
of variance (ANOVA) source table, where SS are divided by degrees of freedom
to yield between-groups or within-groups variance]. The reason we do not see
the cross product (i.e., CP) portion as often as SS is probably due to less use of
multivariate methods that examine several continuous variables that may covary
together as they predict one or more outcome variables. Let's turn to another matrix
that builds on SS and CP values in multivariate methods.
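The SSCP computation just described is two lines of numpy on a toy data matrix (values made up): form the deviation scores D and premultiply by the transpose.

```python
import numpy as np

# Toy data matrix: 4 participants, 2 variables.
X = np.array([[4., 2.],
              [5., 1.],
              [3., 3.],
              [6., 2.]])

D = X - X.mean(axis=0)   # deviation scores
SSCP = D.T @ D           # SS on the diagonal, cross products off it

# The (1,1) entry equals the familiar sum of squared deviations for variable 1.
ss_var1 = ((X[:, 0] - X[:, 0].mean()) ** 2).sum()
```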
e. Variance-covariance matrix. Σ is a (p x p) square and symmetric matrix
from the population that has the variances of the p variables along the diagonal and
covariances in the off-diagonals. In a sample, the variance-covariance matrix is
referred to as S. Σ and S are sometimes called unstandardized correlation matrices
88 CHAPTER 6
and show how much two variables vary together using the original metric of the
variables. Thus, values can range from zero to infinity for the variances along
the diagonal, and from minus infinity to plus infinity for the covariances in the
off-diagonals. A variance-covariance matrix can also be viewed as one step up
from a SSCP matrix. That is, a sample S matrix is formed by placing the sum of
squares divided by N − 1 along the diagonal for each variable, and placing the
cross-products (i.e., the sum of X minus the mean of X times Y minus the mean of
Y), divided by N − 1 in the off-diagonals. Thus, the sample variance-covariance
matrix equals: S = SSCP/(N − 1) = D'D/(N − 1).
The SSCP and S matrices are used in calculations for MANOVA and DFA (see
Chapters 7 and 8), just as the SS and variances (between and within groups) are
important for calculating ANOVA.
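The "one step up from SSCP" relationship is easy to confirm numerically; numpy's built-in covariance routine (with `rowvar=False`, and its default N − 1 divisor) gives the same matrix. Toy data, made up for illustration:

```python
import numpy as np

X = np.array([[4., 2.],
              [5., 1.],
              [3., 3.],
              [6., 2.]])

N = X.shape[0]
D = X - X.mean(axis=0)
S = (D.T @ D) / (N - 1)          # sample variance-covariance matrix

S_np = np.cov(X, rowvar=False)   # numpy's version, for comparison
```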
f. Correlation matrix. R is a (p x p) square and symmetrical matrix. Ones
are placed along the diagonal, with the off-diagonals showing the magnitude and
direction of relationship between pairs of variables, using a standardized metric
ranging from —1 to 1. Values close to the absolute value of 1 indicate strong
relationships (i.e., variables vary together, rising and falling in similar patterns), and
values close to zero indicate little to no relationship. A correlation, r, is calculated
by dividing the covariance of X and Y by the square root of the product of the
variances of X and Y: r = cov(X, Y)/√[var(X) var(Y)]. (6.3)
When there are more than 2 variables, it is helpful to store correlations (as in
equation 6.3) in an R matrix, where each cell represents the correlation between
the corresponding variables [e.g., r(2, 1) is the correlation between variable 1 and
variable 2]. There is a distinction between a correlation and covariance matrix. The
correlation matrix tells us about the interpretable nature of the relationship between
two variables whereas the covariance matrix tells us about how two variables vary
using their original metric (e.g., inches, IQ points, etc.). Covariances are not easily
interpretable as to the size or magnitude of the relationship. A correlation matrix
is often used in multiple regression (see Chapter 4), CC (see Chapter 10), factor
analysis and principal components analysis (see Chapter 11).
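As a numerical sketch (using a hypothetical S matrix), standardizing a variance-covariance matrix into a correlation matrix simply divides each covariance by the product of the two standard deviations:

```python
import numpy as np

# Hypothetical variance-covariance matrix for two variables.
S = np.array([[2.50, 0.75],
              [0.75, 0.50]])

# Divide each element by the square root of the product of the
# corresponding variances (i.e., by the two standard deviations).
sd = np.sqrt(np.diag(S))
R = S / np.outer(sd, sd)
print(np.round(R, 2))  # ones on the diagonal, r = 0.67 off the diagonal
```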
g. Diagonal matrix. A diagonal matrix is a square, symmetric matrix with
values on the diagonal and zeros on the off-diagonals. A common diagonal matrix
is an identity matrix, I, which is the matrix parallel of the scalar, 1.0. As we will see
later, I is also useful in verifying that we have found the inverse (i.e., divisor) of a
matrix used in matrix division (i.e., A-1 A = A A-1 = I, where A-1 is the inverse
of the matrix A). Thus, if we divide a matrix by itself (i.e., multiply a matrix by its
inverse), we will get an identity matrix, much like dividing a scalar by itself yields
the number, 1.
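This inverse-as-divisor idea can be verified in a few lines of numpy; the matrix A below is an arbitrary invertible example:

```python
import numpy as np

# An arbitrary small invertible matrix (illustrative values).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# The inverse plays the role of a divisor in matrix "division."
A_inv = np.linalg.inv(A)

# A matrix times its own inverse yields the identity matrix I,
# just as a scalar divided by itself yields 1.
I = A_inv @ A
print(np.round(I))  # the identity matrix (within rounding)
```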
element in a matrix. The resulting matrix has the same size but changed elements:
c. Adding matrices requires that each matrix be the same size, that is, con-
formable. We can add two (2 x 2) matrices or two (4 x 3) matrices, but we cannot
add a (2 x 2) matrix to a (4 x 3) matrix. Although the size is important, the
order of operations is not important in adding matrices; that is, adding matrices
A + B = B + A. To add matrices, simply add corresponding elements in the two
matrices:
element in the 1st row of the first matrix times each element from the 2nd column
of the second matrix.
kinds of symmetric and positive definite (e.g., SSCP) matrices we often see
in statistics.
Having verified that we have the correct inverse for the divisor matrix, A, we
can now proceed to perform the division B/A = (B) A-1:
The central themes of variance, covariance and linear combinations most certainly
apply to matrices. First, the most central matrices used in multivariate methods
involve some form of variances and covariances. For example, the SSCP matrix has
the numerators of the variances (i.e., the sums of squares) along the diagonal and the
numerators of the covariances (i.e., the cross products) along the off-diagonals. As
shown previously, the variance-covariance matrix, S, holds the variances along the
diagonal and covariances in the off-diagonal. These matrices are directly analyzed
in MANOVA and DFA (see Chapters 7 and 8, respectively) where we are interested
in the ratio of between- over within-groups' variance-covariance matrices, just as
we are interested in this ratio at a scalar level in ANOVA.
Linear combinations are also centrally linked to matrices in that we often need
to combine multiple variables that are housed in matrices. These new (linear)
combinations, which are simply composites of the original variables, often become
the focal point in several multivariate methods [e.g., MANOVA, DFA, CC, and
principal components analysis (PCA)]. As with many multivariate statistics, we
are interested in analyzing the variance in a set of variables or linear combinations
(see Chapter 2 regarding linear combinations).
The variances of linear combinations are called eigenvalues, which are actu-
ally just a redistribution of the original variances of the variables. There are usually
p eigenvalues for each p x p matrix, where the sum of the eigenvalues is equal to
the sum of the variances of the original matrix. Often we choose to examine only a
portion of the original variance. This portion usually corresponds to the variances
of the biggest linear combinations of the original variables. Thus, we would want
to identify the largest eigenvalues, indicating that the corresponding linear com-
binations were retaining a good bit of the variance of the original variables. The
linear combinations are formed to maximize the amount of information or variance
that is taken from each of the individual variables to form a composite variable.
The specific amount of variance that is taken from each variable is called an
eigenvector weight. It is somewhat similar to an unstandardized multiple regres-
sion weight in telling us how much of a variable corresponds to the overall linear
combination. The eigenvalues, in turn, allow calculation of a determinant for any
p × p matrix, because the determinant equals the product of the eigenvalues.
We can use an analogy with sandboxes to help describe the process with eigen-
values and eigenvector weights. Imagine that the variance for each variable is
contained in its own separate sandbox. We may want to form combinations of
several variables to examine relationships between sets of variables. To do this,
we can visualize a whole row of distinct sandboxes representing the variances for
each of the variables with a whole row of empty and separate sandboxes behind
them waiting to hold the variances of the new linear combinations that will be
formed. Using information from eigenvector weights, we take a large (eigenvector
weight) portion of sand from each of the first row of sandboxes and place it in the
first back-row linear combination sandbox. The amount of sand in the first linear
combination sandbox is indicated by its eigenvalue (i.e., the amount of variance
in a linear combination). The process continues with more sand being drawn from
the first row of sandboxes (i.e., the original variables' variance), which is placed in
subsequent independent (i.e., orthogonal) back-row linear combination sandboxes.
Remaining linear combination sandboxes contain less and less of the original sand
because most of the information is placed in the first few linear combinations.
Knowing that an eigenvalue is a variance for a linear combination, it is helpful to
mention a single number that summarizes both eigenvalues and the variances in a
matrix. This value is called a trace and is discussed next.
The sum of the variances or the sum of the eigenvalues is labeled a trace of a
matrix. Thus, a trace tells us something about how much variance we have to ana-
lyze in a matrix, just as with the set of eigenvalues of a matrix. We can write this as:
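In symbols, tr(S) = s11 + s22 + · · · + spp = λ1 + λ2 + · · · + λp. A quick numpy check on a hypothetical S matrix confirms that summing the eigenvalues recovers the trace:

```python
import numpy as np

# Hypothetical variance-covariance matrix.
S = np.array([[2.50, 0.75],
              [0.75, 0.50]])

# Trace: the sum of the diagonal elements (the variances).
trace = np.trace(S)

# Eigenvalues redistribute the same variance across linear combinations,
# so their sum equals the trace.
eigenvalues = np.linalg.eigvalsh(S)
print(trace, eigenvalues.sum())  # both 3.0 (within floating-point rounding)
```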
variable. We would want this to be true, i.e., have greater variance between means
than scores, in a study in which the groups are expected to reveal very different
scores on the dependent variables.
d. How is the concept of orthogonality related to the concept of a determi-
nant?
Orthogonality plays a role in the discussion of matrices. Total orthogonality
in a correlation matrix would result in the largest attainable determinant
(i.e., 1.0). Total collinearity in a correlation matrix would result in
the smallest possible determinant (i.e., 0). The former would indicate
that each of the variables provides very different information. In contrast, the latter
matrix has completely redundant information and thus a determinant of zero.
Knowledge of the degree of relatedness among variables, as indicated by a deter-
minant, often is used in behavioral science to understand the underlying structure
of a data matrix. When a matrix has a determinant of zero, we cannot find an
inverse for this matrix because anything divided by zero is undefined mathemati-
cally. Thus, it would not be useful to study variables that are completely redundant
because they would result in a determinant of zero.
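A two-variable numpy sketch makes the two extremes concrete: a fully orthogonal correlation matrix (an identity matrix) has determinant 1.0, whereas a perfectly redundant one has determinant 0 and no inverse:

```python
import numpy as np

# Fully orthogonal: no correlation between the variables.
R_orth = np.eye(2)

# Fully collinear: the two variables are perfectly redundant.
R_coll = np.array([[1.0, 1.0],
                   [1.0, 1.0]])

d_orth = np.linalg.det(R_orth)  # 1.0 -> maximum generalized variance
d_coll = np.linalg.det(R_coll)  # 0.0 -> singular, no inverse exists
print(d_orth, d_coll)
```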
e. What is the relationship of generalized variance to orthogonality?
Generalized variance is a general estimate of how unique each measure is.
Orthogonality occurs when measures are completely independent or unrelated.
When measures are orthogonal, the generalized variance will be at its largest
because the variables are completely unrelated.
We could then calculate the means for STAGE and CONSEFF and insert them
into a row mean vector that we duplicate for each participant. Then, a matrix of
means would be formed from concatenating the N identical rows of means for
the p variables. The mean for STAGE is formed by finding the average of the first
column of X [i.e., (4 + 3 + 1 + 5 + 2)/5 = 15/5 = 3]; the mean for CONSEFF
is the average of the second column of X, and is equal to 2.
We might now want to calculate a deviation score matrix, D, to form the SSCP
matrix, (i.e., D' D = SSCP).
Thus, the above matrix shows the sum of squares for STAGE is 10 and the sum
of squares for CONSEFF is 2. The off-diagonals are identical and indicate the
cross-product between STAGE and CONSEFF.
We can now form a variance-covariance matrix, S, by multiplying the SSCP
matrix by 1/(N − 1). Remember that it does not matter whether we multiply the
constant, 1/(N − 1), before or after the SSCP matrix because it will yield the same
matrix. Notice, below, that the variances for STAGE and CONSEFF (i.e., 2.5 and
0.5, respectively) are along the diagonal, and the covariance between STAGE and
CONSEFF (i.e., 0.75) is in the off-diagonal. Remember further that we cannot
interpret the magnitude of the relationship between STAGE and CONSEFF by
examining the covariance, although we can interpret the standardized covariance,
or correlation, shortly when we calculate the correlation matrix, R.
We can now calculate the correlation matrix, R, by dividing the elements in the
S matrix by the square root of the product of respective variances.
From the correlation matrix, we are free to interpret the magnitude and direction
of the relationship between STAGE and CONSEFF as a moderately strong and
positive correlation (i.e., r = 0.67).
Having calculated a number of important matrices with these data, it is instruc-
tive to perform several additional calculations to highlight some of the other matrix
concepts. For example, we could calculate the determinant or generalized variance
of the R matrix by the following:
determinant of R = (1.0)(1.0) - (0.67)(0.67) = 1 - 0.4489 = 0.5511.
This suggests that there is some moderately sized generalized variance between
the variable STAGE and the variable CONSEFF. Remember that if the two variables
were completely redundant, the determinant would be zero. If the two variables
were completely orthogonal, the determinant would equal 1.0. Thus, the value 0.55
indicates that the two variables in the current R matrix are approximately midway
between completely redundant and completely orthogonal.
It also would be a good exercise to calculate the inverse of our R matrix, remem-
bering that this is formed from dividing the adjoint of R by the determinant of R.
Finally, it would be helpful to also know the eigenvalues for the R matrix and
verify that the sum of the eigenvalues is equal to the trace of R, and also show that
the product of the eigenvalues is equal to the determinant of R. Using the PROC
FACTOR routine in the SAS computer program, the two eigenvalues for the R
matrix are 1.67 and 0.33. We can now show that the sum of the eigenvalues is
equal to the trace of R.
We can conclude by showing that the product of the eigenvalues equals the
determinant of the matrix.
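These checks can also be reproduced outside SAS. A numpy sketch with R = [[1, 0.67], [0.67, 1]] recovers the eigenvalues reported above and verifies both identities:

```python
import numpy as np

# Correlation matrix from the worked example (r = 0.67).
R = np.array([[1.00, 0.67],
              [0.67, 1.00]])

# For a 2 x 2 correlation matrix the eigenvalues are 1 + r and 1 - r,
# here 1.67 and 0.33, matching the PROC FACTOR results.
eigenvalues = np.linalg.eigvalsh(R)

print(eigenvalues.sum())   # trace of R (2.0)
print(eigenvalues.prod())  # determinant of R (0.5511)
```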
SUMMARY
TABLE 6.2
Summary of Matrix Calculations
Calculation Description
Adding a constant Adding a single number to each element of a matrix (e.g., adding
10 points to all students' exam scores)
Subtracting a constant Subtracting a single number from each element of a matrix (e.g.,
subtracting the mean to form deviation scores)
Multiplying by a constant Multiplying each element in a matrix by a single number (e.g.,
multiplying an S matrix by "N − 1" to yield a SSCP matrix)
Dividing by a constant Dividing each element of a matrix by a single number (e.g., dividing
the SSCP matrix by "N − 1" to yield an S matrix)
Adding matrices Adding each element in one matrix to the corresponding element in
another matrix of the same size (e.g., adding matrices of exam scores
from 2 portions of a test to yield total scores)
Subtracting matrices Subtracting each element in one matrix from the corresponding
element in another matrix of the same size (e.g., subtracting a matrix
of means from an X matrix to yield a deviation matrix, D)
Multiplying matrices Summing the products of the elements of one row with the elements of
a column (e.g., multiplying D' by D to yield SSCP)
Dividing matrices Multiplying a matrix by the inverse of the divisor matrix (e.g., dividing
a matrix by its inverse to yield an I matrix)
REFERENCES
Harville, D. A. (1997). Matrix algebra from a statistician's perspective. New York: Springer.
Namboodiri, K. (1984). Matrix algebra: An introduction. Beverly Hills, CA: Sage Publications.
Prochaska, J. O., Redding, C. A., Harlow, L. L., Rossi, J. S., & Velicer, W. F. (1994). The transtheoretical
model and HIV prevention: A review. Health Education Quarterly, 21, 45-60.
Prochaska, J. O., & Velicer, W. F. (1997). The transtheoretical model of health behavior change (invited
paper). American Journal of Health Promotion, 12, 38-48.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Schott, J. R. (1997). Matrix analysis for statistics. New York: Wiley.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.: Appendix A,
pp. 908-917). Boston: Allyn and Bacon.
Tatsuoka, M. (1971). Multivariate analysis: Techniques for educational and psychological research. New
York: Wiley.
IV
Multivariate Group
Methods
7
Multivariate Analysis
of Variance
Themes Applied to Multivariate
Analysis of Variance (MANOVA)
MANOVA is the first multivariate method we cover that allows for multiple de-
pendent variables. The previous presentation on intermediate methods as well as
the topic of matrices discussed in Chapter 6 help in forming a base from which
to discuss the rigorous capabilities of MANOVA. As we shall see shortly, back-
ground, central, and multiplicity themes also apply to MANOVA, similar to yet
extending from what we saw with the methods of multiple regression (MR) and
analysis of covariance (ANCOVA).
more realistic appraisal of group differences than does ANOVA. MANOVA also
can be extended to incorporate one or more covariates, essentially becoming an
ANCOVA that allows for two or more (continuous) DVs [i.e., multivariate analy-
sis of covariance (MANCOVA)]. MANOVA is somewhat similar to discriminant
function analysis (DFA) and logistic regression (LR), which will be discussed in
upcoming chapters (8 and 9, respectively), in that all three methods include at least
one major categorical variable. In MANOVA, the major categorical variable is on
the independent side, whereas with DFA and LR, the DV is categorical.
MANOVA differs from purely correlational methods such as MR and other
correlational methods discussed later [i.e., canonical correlation (CC), principal
components analysis (PCA), and factor analysis (FA)] in that with MANOVA we
are very interested in assessing the differing means between groups, whereas with
the other methods the focus is not on the means but on correlations or weights
between variables (see Chapters 10 and 11, respectively).
because the scores at each time point depend on the previous time point with the
same (within-group) sample providing repeated measures across time.
c. MANOVA can be used in a nonexperimental design to assess differences
between two or more intact groups, on two or more DVs (e.g., examine differences
between men and women on several substance use variables). In this example, the
IV would be gender with several DVs (e.g., alcohol use, marijuana use, hard drug
use). Even if we found significant differences between gender groups, it would
be impossible to attribute causality, especially because we did not manipulate the
IV and most likely did not build in adequate control for confounding variables.
Nonetheless, this form of MANOVA is often used with results interpreted more
descriptively than inferentially.
d = (M1 − M2)/pooled s (7.1)
(7.2)
(7.3)
Looking in the master table of Kraemer and Thiemann (1987), we find that
v = 133. Thus, we would need an approximate sample size of:
With two groups, we should try to have at least 68 participants per group
to detect a medium effect size with 80% power (i.e., that would correctly
reject the null hypothesis of no effects 80% of the time), and with only a 5%
chance of making a Type I error. With more than one outcome variable used
in MANOVA, it would be advisable to include even more than the minimum
suggested sample size per group.
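A rough check on these numbers is possible with the standard normal-approximation formula for a two-group mean comparison, n per group ≈ 2(z for 1 − α/2 plus z for power)² / d². The sketch below uses Python's standard library; it is an approximation, not Kraemer and Thiemann's exact table method, which is slightly more conservative:

```python
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-group mean comparison,
    using the normal approximation (not the exact t-based tables)."""
    z = NormalDist().inv_cdf
    return 2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2

# Medium effect (d = 0.50), 80% power, two-tailed alpha = 0.05:
# the approximation gives about 63 per group; the exact tables
# cited in the text give about 68.
print(round(n_per_group(0.50)))  # 63
```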
After estimating sample size and collecting data to test hypotheses of group dif-
ferences on two or more DVs, it also would be helpful to calculate basic descriptive
statistics to show the means and standard deviations for the variables involved. It
also would be helpful to calculate reliability coefficients for all variables and to
check whether assumptions were met. These are the same as those for MR and
include normality, linearity, and homoscedasticity. These assumptions could be at
least preliminarily assessed with an examination of skewness, kurtosis, and
bivariate scatter plots. If covariates are included (i.e., a MANCOVA is conducted),
then, similar to ANCOVA, the assumption of homogeneity of regressions must be
met; that is, there should be no interaction between a covariate and a grouping
variable.
(7.4)
where Yi is a continuous DV, yi is the grand mean of the ith DV, τ is the treatment
effect, and E is error variance.
For MANOVA, we also have to include the following to recognize that we are
forming linear combinations of the dependent variables:
(7.5)
where Vi is the ith linear combination, bi is the ith eigenvector weight, and Yi is
the ith DV.
For a MANOVA, we can form one or more linear combinations, where the
number is determined by:
number of linear combinations = the smaller of p and k − 1 (7.6)
where p is the number of DVs, and k is the number of groups or levels of the IV.
With only two groups, we will form only one (i.e., 2—1 = 1) linear combination
no matter how many DVs we have in our design.
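The rule in equation 7.6 (take the smaller of p and k − 1) is simple enough to state as a one-line helper; this sketch just encodes that rule:

```python
def n_linear_combinations(p, k):
    """Number of linear combinations formed in MANOVA:
    the smaller of p (number of DVs) and k - 1 (groups minus one)."""
    return min(p, k - 1)

# Two groups yield one linear combination no matter how many DVs:
print(n_linear_combinations(4, 2))  # 1
# Five groups with four DVs yield four:
print(n_linear_combinations(4, 5))  # 4
```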
In MANOVA, even though we are forming one or more linear combinations,
each with a specific set of eigenvector weights (i.e., the b values in equation 7.5),
we do not tend to focus on these combination scores or weights until we get to
DFA in the next chapter. For now, we also need to know how we are modeling
group differences in MANOVA. Similar to ANOVA, we form a ratio of between-
group variance over within-group variance except that, with MANOVA, we are
now forming a ratio of variance-covariance (i.e., S) matrices. The between-group
variance-covariance matrix could be labeled B, but to distinguish it from the un-
standardized B regression weight in MR, this matrix is often labeled as H for the
"hypothesis" matrix; the within-group variance-covariance matrix is often labeled
E to represent error variance (Harris, 2001).
A ratio of variance matrices is then formed that is analogous to that found with
the ratio of variance scalars in ANOVA, namely E-1 H (equation 7.7). Because
the ratio involves matrices, we need to form an inverse of the divisor (E)
matrix and then multiply this inverse by the dividend (H) matrix. As we see later
in the chapter, one of the challenges in conducting a MANOVA is summarizing
the essence of this ratio of matrices with a single number that can be assessed
for significance with an F-test. Several methods are suggested to summarize this
matrix, including determinants, traces, and eigenvalues, all of which focus on
variance information (see Chapter 6).
As with ANOVA and ANCOVA, MANOVA results should be interpreted first at the
macro-level. Similar to ANOVA and ANCOVA, we are first interested in whether
there is a significant macro-level effect and the size of the effect, and second
whether there are significant micro-level differences between pairs of groups. In
addition, MANOVA is concerned about which DVs are showing significant differ-
ences across groups, adding a third, mid-level layer to the interpretation of results.
Several macro-assessment summary indices have been offered to summarize
results for MANOVA. Probably the most widely used macro-assessment summary
index is Wilks's (1932) lambda, which uses determinants to summarize the variance
in the ratio of matrices formed in MANOVA. Wilks suggested that the determinant
of the within-groups variance-covariance matrix over the determinant of the total
(i.e., within plus between) variance-covariance matrix indicates how much of the
variation and covariation between the grouping variable(s) and the continuous vari-
ables was unexplained. Thus, one minus Wilks's lambda is a measure
of the shared or explained variance between grouping and continuous variables.
Two other macro-assessment summary indices incorporate the trace of a variance-
covariance matrix to summarize group difference matrices. Hotelling's trace is
simply the sum of the diagonal elements of the matrix formed from the ratio of the
between-groups over the within-groups variance-covariance matrix (i.e., E-1 H).
Pillai's trace is the sum of the diagonal elements of the between-groups variance-
covariance matrix over the total (i.e., between plus within) variance-covariance
matrix (i.e., [H + E]-1 H). A fourth macro-assessment summary is Roy's greatest
characteristic root (GCR) (Harris, 2001). The GCR is actually the largest
eigenvalue from the E-1 H matrix, providing a single number that gives the variance of
the largest linear combination from this matrix.
Below, I delineate further how to summarize macro-level information from
MANOVA (and MANCOVA).
Significance Test
Each of the four main macro summary indices just described has an associated F-
test for assessing whether group differences are significantly different from chance:
i. The most common macro summary index is Wilks's (1932) lambda, and
its associated F-test. Wilks's lambda shows the amount of variance in the
linear combination of DVs that is not explained by the IVs. Therefore, it is
preferable to get low values for Wilks's lambda (closer to zero than 1). If the
associated F-statistic is significant, we can conclude that there are significant
differences between at least two groups on the linear combination of DVs.
Wilks's lambda can be calculated as follows:
Λ = |E| / |H + E| (7.8)
where | | stands for the determinant of the matrix inside the vertical bars.
ii. The second macro summary index is the Hotelling-Lawley trace and the
associated F-test. This is formed by summing up the diagonal elements in
the E"1 H matrix (see equation 7.7):
Hotelling-Lawley trace = tr[E-1 H] = sum of the eigenvalues of E-1 H (7.9)
iii. A third macro-summary index is Pillai's trace and its associated F-test.
Pillai's trace is the sum of the diagonal values of the matrix formed from
the ratio of the H over the E + H matrices, tr[(H + E)-1 H] (equation 7.10).
Pillai's trace has the advantage of being the most robust of the four summary
indices when there are less-than-ideal conditions, such as unequal sample
size or heterogeneity of variances. Pillai's trace also can be interpreted as
the proportion of variance in the linear combination of DVs that is explained
by the IV(s). Thus, it is intuitively meaningful.
iv. A fourth macro-summary index is called Roy's largest root or the GCR.
Thus, this can be simply represented as the largest eigenvalue of the
E-1 H matrix (equation 7.11).
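All four indices can be computed from the H and E matrices. The numpy sketch below uses small hypothetical H and E matrices (the values are invented for illustration, not taken from the chapter example):

```python
import numpy as np

# Hypothetical between-groups (H) and within-groups (E) SSCP matrices.
H = np.array([[8.0, 2.0],
              [2.0, 4.0]])
E = np.array([[12.0, 3.0],
              [3.0, 10.0]])

# Eigenvalues of E^-1 H underlie all four macro summary indices.
eigs = np.sort(np.linalg.eigvals(np.linalg.inv(E) @ H).real)

wilks = np.linalg.det(E) / np.linalg.det(H + E)  # unexplained variance
hotelling = eigs.sum()                           # tr[E^-1 H]
pillai = np.trace(np.linalg.inv(H + E) @ H)      # tr[(H + E)^-1 H]
roy = eigs[-1]                                   # largest eigenvalue
print(round(wilks, 3), round(hotelling, 3), round(pillai, 3), round(roy, 3))
```

Note that Wilks's lambda also equals the product of 1/(1 + λ) across the eigenvalues of E-1 H, which is what ties all four indices to the same variance information.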
Effect Size
A common multivariate effect size for MANOVA is eta-squared:
η² = (1 − Λ) (7.12)
c. As with significance tests, two levels (mid- and micro-level) of effect sizes
can be examined after finding a significant macro-level effect in MANOVA:
i. A shared variance midlevel effect size (e.g., r2 = sum of squares between
groups divided by the total sum of squares) could be calculated for each
DV to assess the proportion of variance in common between that specific
continuous variable and the grouping variable(s). Cohen's (1992) guidelines
for univariate effects would apply here: 0.01 for a small effect, 0.06 for a
medium effect, and about 0.13 or more for a large effect.
ii. For MANOVA, Cohen's d can provide a micro-level effect size for the differ-
ence between a pair of means (e.g., Cohen, 1988, 1992), just as with ANOVA
and ANCOVA. This is easily calculated with the formula in equation 7.1.
The same values suggested for ANCOVA also apply for follow-up ANOVAs
on each DV. Thus, a standardized difference of 0.20 is a small effect, a d of
0.50 is a medium effect, and 0.80 represents a large effect.
d. Just as with univariate ANOVAs, or with intermediate ANCOVAs, we can
graph the means in MANOVA for each group to provide a qualitative examina-
tion of specific differences between groups. Many computer programs allow this
to be easily accomplished.
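Both effect sizes used in this section reduce to one-line formulas. A small Python sketch, with made-up means and a made-up lambda value for illustration:

```python
def cohens_d(m1, m2, sd_pooled):
    """Cohen's d: standardized difference between two means (equation 7.1)."""
    return (m1 - m2) / sd_pooled

def multivariate_eta_squared(wilks_lambda):
    """Multivariate effect size: eta-squared = 1 - Wilks's lambda (7.12)."""
    return 1 - wilks_lambda

# Hypothetical values: a 1-point mean difference with pooled SD of 2
# is a medium effect; a Wilks's lambda of 0.76 leaves 24% explained.
print(cohens_d(3.0, 2.0, 2.0))                   # 0.5
print(round(multivariate_eta_squared(0.76), 2))  # 0.24
```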
TABLE 7.1
MANOVA Example Descriptive Frequencies
STAGEA Frequency Percent Cumulative Frequency Cumulative Percent
(PROSB), cons of condom use (CONSB), self-efficacy for condom use (CON-
SEFFB), and psychosexual functioning (PSYSXB) assessed 6 months later. As
with the MR and ANCOVA examples, this application draws on the theories of the
transtheoretical model (e.g., Prochaska et al., 1994) and the multifaceted model of
HIV risk (Harlow et al., 1993).
A caution before beginning analyses is that this application of MANOVA is
not ideal as the levels of the IV are not manipulated but are intact stages in which
people fall depending on the length of time for which they have used or considered
using condoms. Still, it is worthwhile to examine the steps needed for a MANOVA
as well as provide a brief interpretation of the output.
Output is provided below for several sets of analyses: A, descriptive statistics; B,
correlations; C, MANOVA; D, ANOVAs; E, Tukey tests of HSDs between groups.
Descriptive Statistics
Table 7.1 presents the frequencies of participants at each of the five levels of the IV,
STAGEA. Note that there is an imbalance in the number of participants per group.
Most participants are in STAGEA 2 (contemplating condom use) or STAGEA 1
(precontemplation: not even considering condom use). Such unequal sample sizes
across groups can reduce the power of a MANOVA (e.g., Kraemer & Thiemann,
1987). Still, we proceed with caution, as this is a worthwhile, real-world research
area where results would be important, if only used descriptively.
Table 7.2 provides means, standard deviations, ranges, skewness, and kurtoses
for the DVs. Notice that the means for PSYSXB and PROSB are fairly high,
TABLE 7.2
MANOVA Example Descriptive Means, SDs, Range, Skewness, and Kurtosis
TABLE 7.3
Test-Retest Correlations for PSYSX (N = 527)
Correlations
Tables 7.3-7.6 provide the test-retest correlations for the four DVs. Note that the
6-month (A to B) test-retest reliability (correlation) coefficient is acceptable (i.e.,
0.66) for psychosexual functioning (see Table 7.3).
In Table 7.4, we see that similar to what we found for PSYSX, the 6-month
test-retest reliability (correlation) coefficient for pros of condom use is reasonable
(i.e., r = 0.60).
The 6-month test-retest reliability coefficient for the cons of condom use, pre-
sented in Table 7.5, is also adequate (i.e., 0.61).
Similar to the other variables' reliability coefficients, CONSEFF shows suffi-
cient test-retest correlation (i.e., 0.62) over 6 months (see Table 7.6).
TABLE 7.4
Test-Retest Correlations for PROS (N = 527)
TABLE 7.5
Test-Retest Correlations for CONS (N = 527)
Table 7.7 presents the test-retest reliability (correlation) coefficient for STAGE,
the IV. Stage of condom use shows moderate to high test-retest reliability (i.e.,
0.67), suggesting fairly stable measurement over six months.
Table 7.8 provides the correlations among the IV and DVs. An examination
of the correlations among the variables does not indicate collinearity (i.e., rs <
0.70 - 0.90).
MANOVA
Table 7.9 gives the macro-level results from the MANOVA. The F-tests for all four
MANOVA summary indices are significant, indicating that there is evidence for
group differences on a linear combination of the DVs. One minus Wilks's lambda,
which is equal to η², a multivariate effect size, is moderately large (i.e., 0.24).
ANOVAS
After finding significant macro-level results with MANOVA, we can proceed to
conduct follow-up analyses at the micro level. Four ANOVAs are conducted, one
for each DV, with results given in Tables 7.10 to 7.13.
From Table 7.10, we see that the F-test for the follow-up ANOVA on the DV,
PSYSXB, is not significant, nor is the R2 value very high (i.e., 0.01). Thus, it
appears that psychosexual functioning is not significantly different across the five
stages of condom use.
TABLE 7.6
Test-Retest Correlations for CONSEFF (N = 527)
TABLE 7.7
Test-Retest Correlations for STAGE (N = 527)
TABLE 7.8
Correlations Among DVs and IV (N = 527)
TABLE 7.9
Macro-Level Results for MANOVA
MANOVA Test Criteria and F Approximations for
the Hypothesis of No Overall STAGEA Effect
H = Type III SSCP Matrix for STAGEA
E = Error SSCP Matrix
Statistic Value F-Value Num df Den df Pr > F
TABLE 7.10
Micro-Level ANOVA Results for Psychosexual Functioning
Source df Sum of Squares Mean Square F-Value Prob > F
TABLE 7.11
Micro-Level ANOVA Results for Pros of Condom Use
Class Levels Values
STAGEA 5 1 2 3 4 5
Number of observations 527
Source df Sum of Squares Mean Square F-Value Prob > F
TABLE 7.12
Micro-Level ANOVA Results for Cons of Condom Use
Class Levels Values
STAGEA 5 1 2 3 4 5
Number of observations 527
Dependent Variable: CONSB
Source df Sum of Squares Mean Square F-Value Prob > F
TABLE 7.13
Micro-Level ANOVA Results for Condom Self-Efficacy
Source df Type I SS Mean Square F-Value Prob > F
STAGEA 4 107.47610 26.86902 23.78 <0.0001
FIG. 7.1. Depiction of follow-up ANOVA results in the MANOVA
example with IV = STAGEA and DVs = PSYSXB, PROSB, CONSB,
and CONSEFFB. NS = no significant differences; ***p < 0.001.
TABLE 7.14
Micro-Level Tukey Tests for ANOVA on Psychosexual Functioning
Alpha 0.01
Error Degrees of Freedom 522
Error Mean Square 0.558284
Critical Value of Studentized Range 4.62683
TABLE 7.15
Micro-Level Tukey Tests for ANOVA on Pros of Condom Use
Alpha 0.01
Error Degrees of Freedom 522
Error Mean Square 0.620823
Critical Value of Studentized Range 4.62683
Table 7.11 reveals that the F-test for the follow-up ANOVA on the DV, PROSB,
is significant [F(4, 522) = 7.42, p < 0.0001], with a small to moderate effect size
(i.e., R2 = 0.05). Following the presentation of the ANOVAs, it will be worthwhile
to examine Tukey tests between pairs of groups to assess which stages are showing
significant differences for PROSB.
Table 7.12 shows that, as with the previous variable, the F-test for the follow-
up ANOVA on CONSB is significant [F(4, 522) = 12.49, p < 0.0001], with a
moderate effect size (i.e., R2 = 0.09). Thus, it will be worth investigating follow-up
Tukey tests to assess which stages are showing significant differences in CONSB.
The follow-up ANOVA for CONSEFFB (see Table 7.13) reveals a significant
F-test [F(4, 522) = 23.78, p < 0.0001], with a large effect size (i.e., R2 = 0.15).
We will be interested in determining which stages show significant differences on
CONSEFFB when conducting Tukey tests on pairs of means, which we turn to
shortly.
TABLE 7.16
Micro-Level Tukey Tests for ANOVA on Cons of Condom Use
Alpha 0.01
Error Degrees of Freedom 522
Error Mean Square 0.652847
Critical Value of Studentized Range 4.62683
Comparisons significant at the 0.01 level are indicated by ***.
STAGEA Comparison Difference Between Means Simultaneous 99% Confidence Limits Sig. at 0.01
Figure 7.1 depicts the overall ANOVA results with the stages of the categorical
IV preceding (by 6 months) the four continuous DVs for this MANOVA example.
In the micro-level results that follow, we examine whether there are significant
differences across the stages for each of the DVs.
TABLE 7.17
Micro-Level Tukey Tests for ANOVA on Condom Self-Efficacy
Table 7.15 delineates the Tukey tests conducted on pairs of means for the DV
PROSB, showing that individuals in STAGEA 5 (maintaining condom use for 6
months or longer) perceive significantly (p < 0.01) more advantages or PROS to
using condoms than individuals in either STAGEA 2 (contemplating condom use)
or STAGEA 1 (precontemplation, or not even considering condoms).
In Table 7.16, Tukey tests between pairs of groups indicate three significant
differences for CONS (p < 0.01). Individuals in STAGEA 5 (maintenance) perceive
significantly fewer disadvantages or CONS to using condoms than individuals in
STAGEA 1 (precontemplation), STAGEA 2 (contemplation), or STAGEA 3
(preparation).
Table 7.17 shows that Tukey tests for CONSEFFB reveal significant differences
(p < 0.01) between six pairs of groups. Individuals in STAGEA 5 (maintenance)
TABLE 7.18
Least-Squares Means for the Four DVs over the Five Stages of the IV
TABLE 7.19
Multiplicity, Background, Central, and Interpretation Themes Applied
to MANOVA
Themes MANOVA
report significantly greater self-efficacy for using condoms than individuals in
STAGEA 3 (preparation), STAGEA 2 (contemplation), or STAGEA 1 (precon-
templation). Further, individuals in STAGEA 1 (precontemplation) show signifi-
cantly less condom self-efficacy than individuals in STAGEA 2 (contemplation),
STAGEA 3 (preparation), and STAGEA 4 (action).
Least-squares means across the five stages of condom use, for each of the four
DVs, are presented in Table 7.18. For the first DV, there is a decreasing (but
nonsignificant) pattern of psychosexual functioning when moving from STAGEA 1 to
STAGEA 4, although psychosexual functioning increases slightly in STAGEA 5.
With the DV, PROSB, there is generally an increasing trend across the stages, al-
though it dips slightly during the action STAGEA 4. CONSB shows a clear linear
decrease in scores when moving from STAGEA 1 to STAGEA 5, suggesting that the
disadvantages to using condoms become much less salient the longer an individual
uses condoms. Finally, there is a linearly increasing trend in condom self-efficacy as
one moves through the stages of condom use from precontemplation (STAGEA 1)
to maintenance (STAGEA 5). This indicates that individuals feel much more ef-
ficacious about their likelihood of using condoms in the future the longer they
actually use condoms.
SUMMARY
REFERENCES
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston: Allyn and
Bacon.
Tukey, J. W. (1953). The problem of multiple comparisons. Unpublished manuscript, Princeton Uni-
versity (mimeo).
Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471-494.
8
Discriminant Function
Analysis
Themes Applied to Discriminant
Function Analysis (DFA)
DFA is the first method we discuss that uses a categorical outcome or de-
pendent variable (DV). It is a rigorous multivariate method and shares much
in common with multivariate analysis of variance (MANOVA) (Chapter 7), in
terms of its mathematics, though the focus for research questions and interpre-
tation may be more similar to multiple regression (MR) (see Chapter 4) and to
some extent logistic regression (LR) (see Chapter 9). Interested readers can in-
vestigate other descriptions of DFA in several excellent texts (e.g., Grimm &
Yarnold, 1995; Huberty, 1994; Johnson & Wichern, 2002; Tabachnick & Fidell,
2001). We approach DFA with the same 10 questions we used for other methods
thus far.
DFA is a multivariate method that uses several usually continuous independent vari-
ables (IVs) and one categorical DV. It is mathematically equivalent to MANOVA
in that there are one or more categorical variables on one side and two or more
continuous variables on the other side. The main difference is that DFA uses
the continuous (discriminating) variables as predictors of the categorical group
membership DV. So, the focus in DFA is reversed between "independent" and
"dependent" variables compared with MANOVA. DFA can be classified as either
a group-difference, or a prediction, statistical analysis. It is mainly used in exam-
ining what variables differentiate between groups most clearly. In this sense, the
The multiplicity themes for DFA are very similar to those for MANOVA because
the two methods are mathematically the same. Thus, it is important to evaluate
multiple theories and previous empirical studies when considering a study using
DFA. It is also important to carefully consider multiple continuous predictor vari-
ables as well as a relevant categorical outcome variable that has clear categories.
Ideally, a researcher should plan to have approximately equal numbers of partici-
pants across each of the categories of the DV, although this is not always possible.
Using equal category sizes ensures greater robustness of the results. As with most
statistical methods, it would be helpful to minimize any redundancy, either in the
choice of IVs or in the nature of the categories for the DV. This would help avoid
collinearity problems and help ensure clear discrimination among the groups or
categories of the DV.
When possible, it is also a good idea to use longitudinal data, possibly collect-
ing both the predictor variables and the DV at an initial time point and again at
one or more follow-up time points. This would allow examination of direction of
prediction between the IVs and DV to help determine which come first. If a re-
searcher is also able to collect data on multiple samples, this allows an opportunity
to investigate cross-validation and the generalizability of one's findings.
The highly uneven sample sizes across the groups (i.e., N = 151, 204, 70, 20, and
82, for STAGEA 1-5, respectively) could serve to lower the power and robustness
of a study with these categories.
We also would want to examine descriptive statistics (e.g., means or frequencies,
standard deviations, skewness, and kurtosis) as well as correlations among all the
variables. The reliability of the variables also should be checked, most likely
using either internal consistency or test-retest coefficients. Lastly, we would want
to check the assumptions of normality, homoscedasticity, and linearity. As with
other methods, a cursory assessment of assumptions could be accomplished by
examining skewness, kurtosis, and scatter plots.
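The cursory assessment just described can be sketched numerically. The small functions below compute sample skewness and excess kurtosis from their standard (population-style) formulas; the function names and the toy scores are invented for illustration and do not come from the book's data.

```python
import math

# A minimal sketch of the cursory distributional check described in the text:
# sample skewness and excess kurtosis from raw scores.
def skewness(x):
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / n)
    return sum(((v - m) / s) ** 3 for v in x) / n

def kurtosis(x):
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / n)
    return sum(((v - m) / s) ** 4 for v in x) / n - 3  # excess kurtosis

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical item scores
print(skewness(scores), kurtosis(scores))
```

Values near zero on both indices are consistent with approximate normality; markedly nonzero values would flag a variable for closer inspection.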
The number of discriminant functions that can be formed equals the minimum of
p and k − 1, where for DFA, p is now the number of independent or predictor
variables, and k is the number of groups or levels of the categorical dependent
variable. For example, when using an outcome with five categories or groups
(e.g., stage of readiness to use condoms) and six predictor variables, we could
form four linear combination variables (i.e., the minimum of [6, 5 − 1] = 4).
Each of the linear
combinations would be orthogonal to each other and would weight the variables
slightly differently. Further, not all the linear combinations would be equally effec-
tive in differentiating among the groups. We see this more clearly when discussing
macro- and micro-assessment later in the chapter.
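The counting rule above can be stated compactly. The sketch below is ours (the function name is invented), but the arithmetic follows the min(p, k − 1) rule described in the text:

```python
# Number of discriminant functions (linear combinations) DFA can form:
# the minimum of p (number of predictors) and k - 1 (groups minus one).
def n_discriminant_functions(p, k):
    return min(p, k - 1)

# The chapter's example: six predictors and a five-category outcome.
print(n_discriminant_functions(6, 5))  # 4 discriminant functions
```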
In contrast to MANOVA that did not focus on these linear combinations, but
rather on the means, DFA is very much focused on the weighted combinations,
labeled discriminant functions or discriminant scores. Some computer programs
also label these as canonical variates.
Just as with MANOVA, for DFA we focus on the ratio of between-group vari-
ance over within-group variance matrices, E⁻¹H, where, again, E⁻¹ represents
the inverse of the error or within-group matrix and H represents the hypothesis or
between-group matrix.
Later we discuss how we evaluate this ratio of matrices at a macro-level and
then follow this up with micro-level assessment.
Macro-level assessment in DFA begins with a single macro-level F-test, followed
by one or more mid-level F-tests, one for each linear combination.
Significance Test
For DFA, just as with MANOVA, we can summarize the variance in the E⁻¹H
matrix by any of the four macro-summary indices presented previously (see
Chapter 7):
Significance F-Tests
After examining and finding a significant macro-level F-test in DFA, we would
want to examine the mid-level F-tests, where there will be as many of these F-tests
as there are linear combinations. When the first mid-level F-test is significant,
this indicates that at least the first linear combination is significantly related to the
grouping variable. If a second mid-level F-test is significant, this indicates that there
is a second linear combination score or discriminant function that significantly
differentiates the groups of the categorical outcome variable, and so on until all
the F-tests have been examined. Notice that the first mid-level F-test is actually
the same as the macro-level F-test. This is because, with DFA, the whole set of
linear combinations is examined in the first F-test; if it is significant, we conclude
that at least the first linear combination is significant, although literally it is all the
discriminant functions that are being tested in the first step.
Effect Size
A second point of assessment at the mid-level has to do with the eigenvalues (i.e.,
discriminant criteria) for each of the linear combinations or discriminant functions.
We would like to take note of the size of each eigenvalue, remembering that this
tells us about the variance of a discriminant function. The mid-level F-tests inform
us about the statistical significance of these eigenvalues or discriminant criteria.
We also can form a ratio of discriminant function variance over total variance to
find how much of the available discriminatory power is attributed to a specific
discriminant function. Thus, we can calculate:
IDPᵢ = λᵢ / Σλ (8.3)

where λᵢ refers to the ith eigenvalue or discriminant criterion, and Σλ refers to
the sum of all the discriminant criteria. The sum of all the IDP values adds up to
1.0 and thus cannot be interpreted as a proportion of shared variance such as with
an ES. We could, however, convert this index to a shared-variance effect size by
multiplying each IDP by the macro-level η² (i.e., 1 − Λ). I would like to label
this new, mid-level effect-size ratio as the "Bowker index" (BI) because a former
graduate student, Diane Bowker, first suggested this to me in a multivariate class
that I teach (D. Bowker, personal communication, March 22, 1994). Thus,
BIᵢ = IDPᵢ × η² (8.4)

where BIᵢ indicates the proportion of shared variance between the categorical
outcome and the ith specific linear combination of the continuous predictors, IDPᵢ
refers to the index of discriminatory power for a specific ith eigenvalue, and η²
refers to the macro-level effect size, 1 − Λ. The sum of the BIs will equal the total
η², with the first BI most likely being much larger than the remaining values:

ΣBIᵢ = η² (8.5)
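The bookkeeping in equations 8.3 through 8.5 can be checked numerically. The eigenvalues and η² below are invented for illustration (they are not the book's values); only the arithmetic follows the text.

```python
# Hypothetical discriminant criteria (eigenvalues) and macro-level effect size.
eigenvalues = [0.45, 0.03, 0.01, 0.001]
eta_squared = 0.32  # 1 - Wilks's lambda (hypothetical)

total = sum(eigenvalues)
idp = [lam / total for lam in eigenvalues]  # eq. 8.3: index of discriminatory power
bi = [p * eta_squared for p in idp]         # eq. 8.4: Bowker index per function

# eq. 8.5: the IDPs sum to 1.0, so the BIs sum to the total eta-squared.
assert abs(sum(idp) - 1.0) < 1e-9
assert abs(sum(bi) - eta_squared) < 1e-9
```

As the text notes, the first function typically dominates: here the first BI absorbs most of the total η².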
DFA provides both unstandardized and standardized discriminant weights, with
the latter used for interpreting the relative contribution of each of the continuous
variables in separating the groups. In contrast to other methods we have discussed
so far, however, there is a third set of weights for DFA. This
third set, called discriminant or canonical loadings, is the most interpretable and
shows the correlation between a predictor variable and the linear combination or
discriminant function. These loadings can be interpreted like part-whole correla-
tions with high absolute values indicating greater discriminatory power. Although
significance tests are not usually provided for standardized discriminant weights
or loadings, we can interpret them as we would do with correlations with those
greater than | 0.30 | being worthwhile to interpret.
Effect Size
Although not always calculated, we could square discriminant loadings (or
standardized discriminant weights) to get the proportion of shared variance be-
tween a variable and the underlying linear combination (i.e., discriminant func-
tion). These effect sizes could be interpreted with univariate, micro-level guide-
lines of 0.01, 0.06, and 0.13 for small, medium, and large ESs, respectively (Cohen,
1992).
TABLE 8.1
Macro-Level Results for the Follow-up DFA
TABLE 8.2
Mid-Level Results for the Follow-up DFA
V Likelihood Ratio Approximate F-Value Num. df Den. df Pr. > F
Pooled Within Canonical Structure [i.e., Discriminant Loadings] for Follow-up DFA
Variable V1 V2 V3 V4
TABLE 8.4
Micro-Level Unstandardized Discriminant Weights for the Follow-up DFA
Variable V1 V2 V3 V4
Discriminant loadings can be interpreted as meaningful when greater than or
equal to |0.30|. With this criterion, three of
the four variables have loadings on the first discriminant function (i.e., V1) that
are worth examining. As with the follow-up ANOVAs in the MANOVA chap-
ter, the continuous variable that shows the biggest differences across the STAGE
groups is also the variable (i.e., condom self-efficacy B) that has the highest dis-
criminant loading (i.e., 0.79). The next highest loading is associated with the cons
(i.e., −0.56), followed by the pros of condom use (i.e., 0.43). Thus, this first
discriminant function is largely focused on condom self-efficacy and, to a lesser
degree, the pros and cons of condom use. [Note that we do not have to examine
loadings for the other three functions (i.e., V2 to V4) as these were not signifi-
cant in the mid-level assessment.] If we checked back to the MANOVA chapter, we
would realize that these same three variables (CONSEFFB, CONSB, and PROSB)
were the measures that showed significant differences across the stages of condom
use, showing a link between high (absolute value) discriminant loadings and group
differences.
Given the unstandardized discriminant weights (i.e., raw canonical coefficients)
in Table 8.4, we could form a discriminant function score from the weights in the first column.
This could be useful in a future prediction study if we knew the scores for the four
continuous variables. Remember, however, that we cannot interpret the importance
or magnitude of the relationship between a continuous variable and its discriminant
function with unstandardized weights. This is best handled with discriminant load-
ings (see above) or, when not available, with standardized discriminant weights.
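As an illustration of how such a score would be formed, the sketch below uses invented weights and an invented case; the actual Table 8.4 coefficients are not reproduced here.

```python
# Hypothetical unstandardized (raw canonical) weights and intercept.
weights = {"PSYSXB": 0.10, "PROSB": 0.35, "CONSB": -0.50, "CONSEFFB": 0.80}
constant = -1.20

def discriminant_score(case):
    # V1 = constant + sum over predictors of weight * raw score
    return constant + sum(w * case[var] for var, w in weights.items())

case = {"PSYSXB": 3.0, "PROSB": 4.0, "CONSB": 2.0, "CONSEFFB": 3.5}
v1 = discriminant_score(case)  # the case's score on the first function
```

Note that, as the text cautions, these raw weights serve prediction, not interpretation; the loadings carry the interpretive weight.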
It is also sometimes informative to examine the group centroids (see Table 8.5),
which are listed as the class means of the canonical variables (i.e., discrim-
inant functions). These are simply the means of the Vᵢ scores for each of
the five STAGEA groups. For the first discriminant function (labeled as V1 in
Table 8.5), which is the only function we would want to interpret given the pattern
of significance, there is more separation between the stages than with the other
three discriminant functions. This is because the analyses revealed that the sec-
ond, third, and fourth discriminant functions were not significantly separating the
TABLE 8.5
Group Centroids for the Follow-up DFA Discriminant Functions
Class Means (i.e., Group Centroids) on Canonical Variables (i.e., Discriminant Functions)
STAGEA V1 V2 V3 V4
groups. Later in the stand-alone example of DFA, we present a graph of the class
means or centroids for the first two canonical variates, in which, similar to this
example, only the first one is significant.
It is sometimes useful to examine the pattern of predicted to actual group (i.e.,
STAGE) designation. In Table 8.6, participants are numbered from 1 to N, al-
though only the first seven cases are presented here to conserve space. Both the
actual and predicted STAGE classifications are given, and when these are disparate
it is indicated by an asterisk. We can see that there is considerable misclassification
(i.e., 5 of the first 7 cases), although we know that classification is at least signifi-
cantly greater than chance because the F-test associated with Wilks's lambda was
significant.
Next, we examine the percentages of correct classification, for each stage and
also overall, presented along the diagonal of the classification table. These
ideally should be greater than (prior) chance (i.e., 1/5 = 0.20, or 20%; see the
bottom row of priors, each 0.20).
We can see from the classification grid in Table 8.7 that the discriminant func-
tions were reasonably accurate in classifying individuals into Stages 1, 4, and 5
TABLE 8.6
Individual Classification Results for the Follow-up DFA
* Misclassified observation
TABLE 8.7
Classification Table for Actual and Predicted Stages in the Follow-up DFA
From
STAGEA 1 2 3 4 5 Total
1 84 20 18 11 18 151
55.63 13.25 11.92 7.28 11.92 100%
2 70 27 31 35 41 204
34.31 13.24 15.20 17.16 20.10 100%
3 23 13 10 9 15 70
32.86 18.57 14.29 12.86 21.43 100%
4 2 1 1 9 7 20
10.00 5.00 5.00 45.00 35.00 100%
5 4 4 8 21 45 82
4.88 4.88 9.76 25.61 54.88 100%
Total 183 65 68 85 126 527
34.72 12.33 12.90 16.13 23.91 100%
Priors 0.2 0.2 0.2 0.2 0.2
Error Count Estimates for STAGEA (for Follow-up DFA)
1 2 3 4 5 Total
(i.e., 56%, 45%, and 55% correct classification for these respective stages). The
contemplation and preparation stages show much less accuracy, in fact less than
chance (i.e., < 20%). The overall percentage of correct classification is found by
subtracting the proportion of total error count estimates from 1 and multiplying by
100. Thus, the current discriminant function resulted in 37% correct classification
into STAGEA, which is greater than a 20% chance classification.
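The arithmetic behind Table 8.7 can be reworked by hand. The counts below are taken from the table; the final step assumes (plausibly, given the equal priors of 0.20) that the overall error count estimate averages the per-class error rates.

```python
# Rows are actual stages 1-5; columns are predicted stages 1-5 (Table 8.7).
counts = [
    [84, 20, 18, 11, 18],  # Stage 1, n = 151
    [70, 27, 31, 35, 41],  # Stage 2, n = 204
    [23, 13, 10,  9, 15],  # Stage 3, n = 70
    [ 2,  1,  1,  9,  7],  # Stage 4, n = 20
    [ 4,  4,  8, 21, 45],  # Stage 5, n = 82
]

# Per-class correct classification: diagonal count over row total.
per_class_correct = [row[i] / sum(row) for i, row in enumerate(counts)]

# With equal priors, average the per-class error rates, then subtract from 1.
overall_correct = 1 - sum(1 - p for p in per_class_correct) / len(counts)
print(round(overall_correct * 100))  # reproduces the 37% figure in the text
```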
We now turn to an example of a stand-alone DFA with the continuous variables
(i.e., psychosexual functioning, pros of condom use, cons of condom use, and
condom self-efficacy) at time 1 serving as predictors or discriminating variables,
and the stage of condom use, measured 6 months later, serving as the categorical
outcome. The interpretation of results is similar to the follow-up form of DFA,
although the focus is now more on a predictive model than a follow-up to a group-
difference model.
TABLE 8.8
Descriptive Frequencies for Stand-Alone DFA Example
STAGES   Freq.   %   Cumulative Frequency   Cumulative Percent
TABLE 8.9
Descriptive Means, SDs, Range, Skewness, and Kurtosis for Stand-Alone DFA
TABLE 8.10
Pearson Correlation Coefficients (N = 527) Prob > |r| under H0: Rho = 0
The macro-level test for the stand-alone DFA is significant (see Table 8.11), with
η² equal to 0.32, indicating a large multivariate ES. We would now want to examine
which of the individual discriminant functions had significant discriminant
criteria or eigenvalues.
Similar to the MANOVA example, there are four discriminant functions, each
with their respective eigenvalues and significance tests (see Table 8.12). From
the macro-level results presented here, we can see that only the first discriminant
TABLE 8.11
Macro-Level Results for Stand-Alone DFA
The DISCRIM Procedure for the Stand-Alone DFA
Observations 527 DF Total 526
Variables 4 DF Within Classes 522
Classes 5 DF Between Classes 4
Class Level Information
STAGEA Freq. Weight Proportion Prior Prob.
1 = Precontemplation 154 154.000 0.292 0.200
2 = Contemplation 185 185.000 0.351 0.200
3 = Preparation 79 79.000 0.150 0.200
4 = Action 22 22.000 0.042 0.200
5 = Maintenance 87 87.000 0.165 0.200
TABLE 8.12
Mid-Level Results for Stand-Alone DFA
Canonical Discriminant Analysis (for Stand-Alone DFA)
V   Canonical [discriminant] Correlation   Adj. Canonical [discriminant] Correlation   Approximate Standard Error   Squared Canonical [discriminant] Correlation
V   Likelihood Ratio   Approximate F Value   Num DF   Den DF   Pr > F
1 0.67668094 13.52 16 1586.2 <0.0001
2 0.97156341 1.68 9 1265.7 0.0897
3 0.99589808 0.54 4 1042 0.7094
4 0.99996457 0.02 1 522 0.8919
Test of H0: The canonical correlations in the current row and all that follow are zero (for Stand-alone
DFA)
function is significant (see the likelihood ratio and F tests in Table 8.12), though the
second one approached significance. Consistent with this result, the first eigenvalue
is capturing the bulk of the available variance, with the remaining ones being fairly
small. The predominance of the first discriminant function is also evident by its
large ES (see BI = 0.30), whereas the ESs for the remaining functions are trivial.
Notice also that the first F-test is actually the same as the macro-level test we saw
initially. This is because it includes information on all of the eigenvalues, although
it is interpreted as providing information on just the first discriminant function.
We would now want to examine the weights for this first function.
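The macro-level effect size quoted for this analysis can be recovered directly from the first likelihood ratio (Wilks's Λ) in Table 8.12:

```python
# The first-row likelihood ratio in Table 8.12 is Wilks's lambda for the
# full set of discriminant functions; eta-squared is its complement.
wilks_lambda = 0.67668094
eta_squared = 1 - wilks_lambda
print(round(eta_squared, 2))  # 0.32, the large multivariate ES noted earlier
```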
Table 8.13 presents the micro-level results for the stand-alone DFA. As in the
previous example, the variable with the highest discriminant loading (i.e., see
pooled within canonical structure) is CONSEFFA, which correlates 0.84 with
the first discriminant function. In contrast to the findings with the follow-up
TABLE 8.13
Micro-Level Discriminant Loadings for the Stand-Alone DFA
Pooled Within Canonical Structure [i.e., Discriminant Loadings] for Stand-alone DFA
Variable V1 V2 V3 V4
DFA, here PROSA has the next highest loading (i.e., 0.42), followed by CONSA
(i.e., −0.38). As before, PSYSXA does not contribute very much to this discrimi-
nant function (i.e., the loading is < |0.30|). These findings would tell us that the
best predictor of STAGE at time 2 is an individual's level of condom self-efficacy
6 months earlier. Knowing both the perceived pros and cons of condom use would
also contribute to the prediction of stage of condom use 6 months later. In both the
follow-up and stand-alone DFA, knowing an individual's level of psychosexual
functioning does not help in predicting which stage of condom use the person will
be in after six months.
Figure 8.1 depicts the stand-alone DFA results with loadings attached to each
predictor for the first discriminant function, V1.
Table 8.14 presents the unstandardized discriminant weights for all four discrim-
inant functions (i.e., V1 to V4), although only the first column of weights should
TABLE 8.14
Micro-Level Unstandardized Results
Raw Canonical Coefficients [i.e., Unstandardized Discriminant Weights]
Variable V1 V2 V3 V4
Table 8.15 gives the group centroids for the four discriminant functions. As
with the other results, we would interpret those just for the only significant (first)
function. Notice that the function means for V1 are more spread apart than those
for the other (nonsignificant) functions (i.e., V2 to V4). This is consistent with the
F-test results that showed that only the first function significantly discriminated
across the stage groups.
Figure 8.2 visually depicts the discriminating power of the first function by
comparing the group centroids for the first discriminant function, V1, with those
for the second discriminant function, V2. Notice that the bars representing the
centroids for V1 are more spread apart than those for V2.
Table 8.16 presents the individual classification results (labeled: posterior prob-
ability of membership in STAGEB from the stand-alone DFA) for the first seven par-
ticipants. This lists the stages of condom use from which they started, the stages at
which the DFA predicted them, and the probabilities of being classified into each of
the five stages. As we did with the follow-up DFA, we can scan the first few columns
to see how well the first discriminant function did in classifying individuals
TABLE 8.15
Group Centroids for Stand-Alone DFA Discriminant Functions
Class Means [i.e., Group Centroids] on Discriminant Functions for Stand-alone DFA
STAGEB V1 V2 V3 V4
FIG. 8.2. Plot of group centroids for first two discriminant functions.
into the correct stage group. We can see by the number of asterisks, indicating
an incorrect classification, that there are many individuals who were not correctly
classified into their initial stage category.
Table 8.17 provides the summary classification table for the stand-alone DFA.
The stage with the highest percentage of correct classification is maintenance (i.e.,
68% of the individuals who were initially listed as in Stage 5 were subsequently
classified as Stage 5 based on the discriminant function results). Stage 1 (precon-
templation) also fares well with 56% correct classification. In contrast to the DFA
TABLE 8.16
Individual Classification Results for Stand-Alone DFA
Posterior Probability of Membership in STAGEB (for Stand-alone DFA)
Obs.   From STAGEB   Classified into STAGEB   1   2   3   4   5
* Misclassified observation
TABLE 8.17
Classification Table for Actual and Predicted Stages in Stand-Alone DFA
From
STAGEB 1 2 3 4 5 Total
1 86 13 15 27 13 154
55.84 8.44 9.74 17.53 8.44 100%
2 67 26 29 28 35 185
36.22 14.05 15.68 15.14 18.92 100%
3 12 8 23 11 25 79
15.19 10.13 29.11 13.92 31.65 100%
4 3 3 4 3 9 22
13.64 13.64 18.18 13.64 40.91 100%
5 3 1 10 14 59 87
3.45 1.15 11.49 16.09 67.82 100%
Total 171 51 81 83 141 527
32.45 9.68 15.37 15.75 26.76 100%
Priors 0.2 0.2 0.2 0.2 0.2
Error Count Estimates for STAGEB (for Stand-Alone DFA)
1 2 3 4 5 Total
results with the follow-up to the MANOVA example from the previous chapter,
individuals in the preparation stage were classified better than chance (i.e., 29%
is > 20%). However, the action stage did not do well at all for the current analysis,
nor did contemplation, both of which had only 14% correct classification. Thus, if
an individual started out in either of these stages, there would be only a chance-level
possibility of being correctly classified into the original stage.
From the error count estimates at the bottom of Table 8.17 we see that overall,
there was 36% correct classification (i.e., 100 − 64) for this discriminant function
predicting membership in STAGEB. This is very similar to what we found earlier
(i.e., 37%) with STAGEA in the follow-up DFA results. Thus, we can conclude that
the discriminant function did reasonably well in predicting stage of condom use
6 months later, although the variable psychosexual functioning did not contribute
very much and probably could be dropped from future research.
TABLE 8.18
Multiplicity, Background, Central, and Interpretation Themes Applied to DFA
SUMMARY
REFERENCES
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Grimm, L. G., & Yarnold, P. R. (1995). Reading and understanding multivariate statistics. Washington,
DC: APA.
Harlow, L. L., Quina, K., Morokoff, P. J., Rose, J. S., & Grimley, D. (1993). HIV risk in women: A
multifaceted model. Journal of Applied Biobehavioral Research, 7, 3-38.
Huberty, C. (1994). Applied discriminant analysis. New York: Wiley & Sons.
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). Englewood
Cliffs, NJ: Prentice Hall.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.) (Chapter 11: pp.
456-516). Boston: Allyn and Bacon.
9
Logistic Regression
LR is a multivariate prediction method that is most likely to use all or some cate-
gorical predictors to explain a categorical, usually dichotomous, outcome. As we
will see, it bears similarities and differences to other prediction methods such as
multiple regression (MR) and discriminant function analysis (DFA) (see Chap-
ters 4 and 8, respectively). Several excellent sources can be consulted regarding
logistic regression. Hosmer and Lemeshow (1989) have written what is commonly
believed to be the definitive text on LR. Other well-written manuscripts explain
the essentials of LR in easy to understand language (e.g., Menard, 1995; Rose,
Chassin, Presson & Sherman, 2000; Tabachnick & Fidell [Ch. 12], 2001; Wright,
1995). We now turn to a delineation of how the set of 10 questions and themes
pertain to the method of LR.
Assumptions such as linearity, homoscedasticity, and normality are usually required
for DFA. LR, in contrast, does not require these assumptions, although outcome
categories must be exclusive and exhaustive so that each participant is classified
into one, and only one, of the outcome categories. As for objectives and output,
DFA usually is focused on the correlational weights for prediction and the
percentage of correct classification of individuals into group membership.
LR is used, particularly in health research, to
assess the odds or likelihood of disease given certain characteristics or symptoms.
Further, LR usually requires larger samples due to using maximum-likelihood
estimation (Wright, 1995).
LR is very similar to DFA in the multiplicity themes entailed. As with most multi-
variate methods, we would like to consider multiple theories and empirical studies
before conducting a LR analysis. As with DFA, LR requires the use of a categorical
outcome with two or more response categories. We also want to include multiple,
reliable IVs to adequately predict membership into one of the outcome categories.
Unlike DFA, LR is more likely to include a set of interactions between some of
the IVs to enhance prediction. This practice makes it even more important to try
to limit collinearity among the IVs because interactions with variables that are
already highly related can exacerbate problems with estimation. We also would
want to have a large sample to ensure stability and power in our findings. If it
were feasible, it would be helpful to collect data from multiple time points to pro-
vide some evidence of the direction of causation between IVs and the categorical
outcome.
Background themes for LR are somewhat similar to those for MR and DFA. We
would start by examining a reasonably large (N × p) data matrix that has several IVs
and a single (categorical) DV (with two or more groups). To enhance the power of
the results, Aldrich and Nelson (1984) recommend a ratio of 50 participants per IV,
probably requiring a larger sample size than either MR or DFA. Our LR example
(presented later in the chapter) exceeds these guidelines because we use the same
sample (i.e., N = 527 women at risk for HIV) and variables (i.e., five-category
outcome: stage of condom use; and four predictors: psychosexual functioning, pros
of condom use, cons of condom use, and self-efficacy of condom use) as with DFA
and MR. Thus, our sample appears sufficiently large (i.e., N > 200) to use LR.
We also would want to begin a LR analysis by first examining descriptive
statistics (e.g., means or frequencies, standard deviations, skewness, and kurtosis),
reliability, and correlations among all the variables. Because we are using the same
variables as in the stand-alone DFA example presented in Chapter 8, we do not
repeat these background analyses in this chapter. Note that the same variables were
used in the MR example presented in Chapter 4, although for MR, all the variables
were measured at time B, whereas for DFA and LR the IVs are measured at time
A and the DV is measured 6 months later at time B.
Y = X′β + E (9.1)
be the highest stage of maintenance in our example near the end of this chapter
(see section on What Is an Example of Applying LR to a Research Question?).
Significance Test
For LR, we calculate the log-likelihood (LL) by:

LL = Σᵢ [Yᵢ ln(Ŷᵢ) + (1 − Yᵢ) ln(1 − Ŷᵢ)] (9.4)

where i = 1 to N, and Yᵢ is the actual outcome score (i.e., 1 or 0) that is multiplied
by the natural log of the predicted score, Ŷᵢ, as defined in equation 9.3; this is
then added to (1 − Yᵢ) times the natural log of 1 minus the predicted score. The
LL is usually calculated for a specific model, M, with a set of p predictors and an
intercept, and is compared with the LL for a model (I) that includes only an
intercept parameter and no predictors. A chi-square significance test can then be
calculated as:

χ² = 2[LL(M) − LL(I)] (9.5)
Effect Size
For LR, we do not usually get a traditional R² value accompanying a macro-
level significance test. Several indices (e.g., Somer's D, gamma, tau-a, and c;
and McFadden's ρ²) are provided with some computer packages (e.g., SAS and
SPSS, respectively). The values differ in how they deal with pairs of outcome
categories, and the range of values for the index, with McFadden's ρ² being the
most conservative, and c usually having the largest values (i.e., a range of 0.5 to
1.0). To calculate an effect size (ES) that can be interpreted as the proportion of
shared variance between a set of predictors and a categorical outcome, it may be
best to average several indices. This maintains the spirit of multiplicity, recognizing
that we often need multiple measures or methods to estimate a single underlying
concept (e.g., shared variance). This average shared variance index can be evaluated
with Cohen's (1992) guidelines for a multivariate ES (i.e., 0.02, 0.13, and 0.26 for
small, medium, and large effect sizes, respectively). A formula for one of the
R²-like indices, McFadden's ρ², is given below:

ρ² = 1 − [LL(M) / LL(I)] (9.6)

where LL(M) is the log-likelihood for a specific model with p predictors, and
LL(I) is the log-likelihood for a model that includes only an intercept parameter
or constant.
When we get a significant χ² test and a reasonable ES, it is worthwhile to
examine specific micro-level findings to assess which variables are significant in
predicting membership in the outcome.
a. Logistic regression weights and significance tests. The initial weights that
come out of an LR analysis can be evaluated for significance much as is
done in MR. The significance test for these weights is a Wald test that is inter-
preted as a z- or t-test and is simply the ratio of the LR coefficient divided by its
standard error. Some computer programs use a χ² test to assess the significance of
the weights.
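The Wald test can be sketched as follows; the coefficient and standard error below are hypothetical, not the chapter's estimates:

```python
import math

def wald_z(coef, se):
    """Wald statistic: the LR coefficient divided by its standard error."""
    return coef / se

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

z = wald_z(0.196, 0.05)      # hypothetical weight and SE; z = 3.92
p = two_sided_p(z)           # well below a 0.01 alpha level
chi_square_1df = z ** 2      # programs reporting a Wald chi-square use z squared
```

Squaring z shows why the z-test and the 1-df χ² version reported by some programs are equivalent.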
b. Odds ratios. It is useful to examine the odds of falling into an outcome
category given a one-unit change in a specific predictor. These odds ratios are
calculated for each predictor and are helpful when interpreting which IVs provide
relevant information in predicting membership in the outcome variable. Larger
values for an odds ratio associated with an IV indicate that participants with high
scores on that IV have greater odds of falling into the baseline reference category.
The reference category is usually the highest category (i.e., the one coded with a
1 versus a 0 in a dichotomous outcome). Because some computer routines (e.g.,
SAS, 1999) use the smallest category as a reference, it is useful to reverse the
ordering to descending so as to be consistent with other programs and interpretations (e.g.,
focus on the likelihood of falling into the fifth stage of condom use, maintenance,
instead of the first stage, precontemplation). To aid in interpretation, we use this
reversed, descending order of the outcome categories in the example presented
later.
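The odds-ratio computation itself is just exp(b). In the sketch below the weights are invented to reproduce the approximate magnitudes discussed later in the chapter (about half, three-quarters, 1.4 times, and almost twice); they are not the published estimates:

```python
import math

def odds_ratio(coef):
    """Odds ratio for a one-unit increase in a predictor: exp(b)."""
    return math.exp(coef)

# Hypothetical LR weights for the four predictors (illustrative only)
coefs = {"PSYSXA": -0.69, "PROSA": 0.34, "CONSA": -0.27, "CONSEFFA": 0.64}
ors = {name: odds_ratio(b) for name, b in coefs.items()}
# e.g., exp(-0.69) is about 0.50: half the odds of falling in the reference
# (maintenance) category; exp(0.64) is about 1.90: nearly twice the odds
```

Note that an odds ratio below 1.0 (a negative weight) means higher predictor scores are associated with lower odds of membership in the reference category.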
its relevance to the phenomenon under study, and its ability to provide relatively
generalizable results.
Output is provided below for LR using the same variables and time points as in
the DFA stand-alone example (and the same variables, though different time point
for the IVs in the standard MR example). As before, the four predictor IVs are
psychosexual functioning, pros of condom use, cons of condom use, and con-
dom self-efficacy. The outcome is stage of condom use. To include a longitudinal
aspect, the IVs are measured at the first time point (i.e., indicated by the letter
A at the end of each variable name), and the outcome is measured 6 months
later (i.e., indicated by the letter B at the end of the outcome name, STAGEB).
There is no need to repeat the preliminary descriptives and correlations that were
presented earlier (see stand-alone example in the DFA chapter). As with previ-
ous examples, the LR analysis draws on both the transtheoretical model (e.g.,
Prochaska et al., 1994) and the multifaceted model of HIV risk (Harlow et al.,
1993).
Several LR analyses were conducted where the nature of the categorical DV
changed for each analysis. In the first analysis, the outcome has five levels (i.e., the
five stages of condom use). For the last four analyses, four dichotomous outcome
variables are created to compare stages 2 to 5, respectively, with the first stage,
precontemplation (coded 0 in each dichotomy). Because there are varying num-
bers of participants in each of the five stages, the sample sizes vary for the four
analyses with a dichotomous outcome (see calculations, below). These four LRs
with dichotomous outcomes can be viewed as follow-up analyses to the initial LR
to determine with which stage levels the predictors were most significantly related.
Given the multiple analyses, it is probably wise to use a more conservative alpha
level such as 0.01 for the four follow-up analyses, with the conventional 0.05 alpha
level for the initial analysis.
a. The first LR analysis uses the same five-level (stage of condom use at time
B: STAGEB) DV as was used in MR and in DFA.
b. The second LR analysis uses a dichotomous STAGE2B DV, with precontem-
plators coded 0 and individuals in the second stage, contemplators, coded 1.
c. The third analysis uses a dichotomous STAGE3B DV, with precontemplation
coded as 0 and individuals in the third stage, preparation, coded as 1.
d. The fourth LR analysis again codes a dichotomous STAGE4B DV as 0 for
precontemplation, but a 1 is now given for those in the fourth stage, action,
of condom use.
e. The fifth LR analysis codes a dichotomous STAGE5B DV, with precontemplation
again coded 0 and those in the fifth stage, maintenance, coded 1.
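The recoding described in this list can be sketched in Python; the stage vector below is invented for illustration:

```python
def make_stage_dummy(stages, target_stage):
    """Code the target stage 1 and precontemplation (stage 1) 0;
    participants in any other stage are dropped (None) for that analysis."""
    out = []
    for s in stages:
        if s == target_stage:
            out.append(1)
        elif s == 1:
            out.append(0)
        else:
            out.append(None)  # excluded, so sample sizes vary by analysis
    return out

stageb = [1, 2, 3, 4, 5, 1, 2]               # toy five-level stage scores
stage2b = make_stage_dummy(stageb, 2)         # contemplation vs. precontemplation
n_used = sum(v is not None for v in stage2b)  # only 4 of 7 kept in this follow-up
```

Dropping the middle stages is what makes the follow-up samples smaller than the full N, which is why power becomes a concern in the later analyses.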
TABLE 9.1
Frequencies for STAGEB for LR Example

Ordered Value    Stage               STAGEB    Frequency
1                Maintenance         5         87
2                Action              4         22
3                Preparation         3         79
4                Contemplation       2         185
5                Precontemplation    1         154
TABLE 9.2
Initial Test of Proportional Odds Assumption for Five-Stage DV
stages (i.e., precontemplation and maintenance) was more easily predicted than
for, say, the second (i.e., contemplation) stage of condom use. Although the lack
of proportionality is not optimal, we will proceed cautiously with interpreting the
output from this LR analysis.
In Table 9.3, notice that the likelihood ratio is significant [χ²(4) =
183.2448, p < 0.0001], indicating that the model with both the intercept and four
predictors (i.e., Model M) is better than the model with the intercept only (i.e.,
Model I).
Table 9.4 presents four macro-level indices that can be interpreted as a correla-
tion between the set of IVs and the DV. A macro-level effect size can be calculated
by squaring any of the indices in the far right column of Table 9.4. Because these
produce discrepant values (i.e., 0.24, 0.25, 0.13, and 0.56, respectively, for squared
values of Somer's D, gamma, tau-a, and c), it is also useful to calculate McFadden's
ρ² (see equation 9.6):
TABLE 9.3
Macro-Level LR Results for Five-Stage DV
TABLE 9.4
Macro-Level Indices for LR with Five-Stage DV
is substantially smaller than the multivariate effect size found with a DFA (i.e.,
η² = 1 − Λ = 0.32) or MR (i.e., R² = 0.29) of these same data (see DFA and MR
chapters). This is probably because LR is not as powerful as either DFA or MR.
Given the range of values for the macro-level effect sizes from Somer's D, gamma,
tau-a, c, and McFadden's ρ², as suggested earlier it might be useful to average
these to arrive at a more balanced assessment of the proportion of shared variance
between the linear combination of predictors and the outcome. The average of
these five indices provides a moderately large ES, a value that is more in line with
that found from DFA and MR:
Thus, we could conclude that the four predictors for this LR analysis explain
approximately 25% of the variance in stage of condom use.
An inspection of the significance tests (i.e., chi-square) and probabilities for
the LR parameter estimates (see Table 9.5) reveals that all four of the predictors
are significantly predicting membership in the outcome variable, StageB (at the
conservative alpha level of .01 or better). Thus, it is helpful to examine the odds
ratios for each predictor.
The pattern of odds ratios in Table 9.6 indicates that individuals who score high
in psychosexual functioning are only half as likely to fall in the maintenance stage
TABLE 9.5
Micro-Level LR Results for Five-Stage DV
TABLE 9.6
Micro-Level Odds Ratio Estimates for LR with Five-Stage DV
Effect    Point Estimate    95% Wald Confidence Limits
of condom use (than in earlier stages). Similarly, but to a slightly larger degree,
individuals who score high on the cons of condom use are about three-quarters as
likely to be staged in maintenance (and thus are more likely to be classified into
an earlier stage). Conversely, individuals who score high in condom self-efficacy
are almost twice as likely to be in the maintenance stage of condom use, and
those scoring high on the pros of condom use are 1.4 times more likely to fall in
the maintenance category as in one of the earlier stages. Although the nature of
these values is not the same as that of the loadings in DFA or the standardized
regression coefficients in MR, the interpretation of the findings is similar. Condom
self-efficacy appears to be the most salient predictor of who will have a high stage
(i.e., maintenance) of condom use. Figure 9.1 depicts the LR prediction model for
this five-stage DV with odds ratios provided for each IV.
We now turn to the first follow-up LR analysis that explores whether the
same four IVs (psychosexual functioning, pros, cons, and condom self-efficacy)
can predict a dichotomous outcome comparing contemplators (score = 1) to
TABLE 9.7
Frequencies for STAGE2B for LR Example (DV: 1 = Contemplation
vs. 0 = Precontemplation)

Stage               STAGE2B    Frequency
Contemplation       1          185
Precontemplation    0          154
precontemplators (i.e., score = 0). Given the initial significance of the propor-
tional odds (i.e., score) test earlier, it is likely that LR results will vary for the
four follow-up analyses, depending on the stages compared in the dichotomous
outcomes.
As noted earlier, this value tends to be more conservative than the four previous
ES estimates, making it worthwhile to compute an average of all five indices (i.e.,
Somer's D, gamma, tau-a, c, and McFadden's ρ²) to get a more robust estimate
TABLE 9.8
Macro-Level LR Results for STAGE2B Example
(DV: 1 = Contemplation vs. 0 = Precontemplation)
(i.e., the average of 0.11, 0.11, 0.03, 0.44, and 0.07 = 0.15). This represents a
medium multivariate effect size by Cohen's (1992) standards and indicates that it
would be worthwhile to further explore the nature of prediction by examining the
micro-level weights (both unstandardized and odds ratios).
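Using the five values just reported, the averaging step is simply:

```python
def average_effect_size(indices):
    """Average squared macro-level indices plus McFadden's rho-squared
    into a single shared-variance estimate."""
    return sum(indices) / len(indices)

# (Somer's D)^2, (gamma)^2, (tau-a)^2, (c)^2, and McFadden's rho^2
es = average_effect_size([0.11, 0.11, 0.03, 0.44, 0.07])  # = 0.152, about 0.15

# Cohen's (1992) multivariate benchmarks: 0.02 small, 0.13 medium, 0.26 large
is_at_least_medium = es >= 0.13
```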
Table 9.10 gives the micro-level results for the parameter estimates for predict-
ing STAGE2B. Note that the predictor CONSA is not significant, and the pre-
dictor PROSA is not significant at the conservative 0.01 level set for follow-up
analyses. The other two predictors, psychosexual functioning and condom self-
efficacy, are significant. Thus, we would interpret only the odds ratio estimates for
these latter two predictors (see Figure 9.2). The findings with this dichotomous
DV contrasting contemplators with precontemplators are similar in nature, if not
the exact magnitude, as those from the five-level stage DV. In the current analysis,
an individual with high condom self-efficacy is 1.36 times more likely to fall in
a higher stage of condom use. Conversely, those with a high psychosexual func-
tioning score are just a little more than half as likely to fall into a higher stage of
TABLE 9.9
Macro-Level LR Indices for STAGE2B Example
(DV: 1 = Contemplation vs. 0 = Precontemplation)
TABLE 9.10
Micro-Level LR Results for STAGE2B Example (DV: 1 = Contemplation vs.
0 = Precontemplation)
condom use. It is also interesting to note that the values for the two nonsignifi-
cant odds ratios (i.e., for PROSA and CONSA) are similar across the LR analyses
conducted with the five-level and two-level DVs, respectively. Because PROSA
and CONSA were significant predictors in the initial analysis that included the
full sample of individuals at all five stages, we could surmise that there was most
likely not enough power to find significance for these two predictors in the current
analysis that included less than two-thirds of the original sample. This lack of
power is apt to be even more of a concern for subsequent analyses that include
even smaller proportions of the full sample. Of course, we cannot rule out the pos-
sibility that PROSA and CONSA simply do not do well in differentiating between
precontemplators and contemplators of condom use. Thus, although macro- and
micro-level decisions are somewhat similar across the two ways of scoring the
categorical DV, there is much less shared variance and fewer significant predictors
TABLE 9.11
Frequencies for STAGE3B for LR Example (DV: 1 = Preparation vs.
0 = Precontemplation)

Response Profile

Stage               STAGE3B    Frequency
Preparation         1          79
Precontemplation    0          154
with this follow-up comparing those who are thinking about using condoms versus
those who are not even considering condoms.
We now explore the nature of prediction when using the same four IVs (psy-
chosexual functioning, pros, cons, and condom self-efficacy) to predict membership
in a dichotomous DV where individuals in Stage 3 (preparation) are coded "1" and
individuals in Stage 1 (precontemplation) are coded "0".
TABLE 9.12
Macro-Level LR Results for STAGE3B Example (DV: 1 = Preparation vs.
0 = Precontemplation)
TABLE 9.13
Macro-Level LR Indices for STAGE3B Example (DV: 1 = Preparation vs.
0 = Precontemplation)
we can conclude that a model (M) with the four predictors and the intercept
is better than a model (I) with only the intercept for this analysis comparing
those in the preparation stage of condom use with those in the precontemplation
stage. From the values in this table, we can also calculate McFadden's ρ² = 1 −
237.973/298.430 = 0.20, which represents a medium-large multivariate ES.
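As a quick check of this arithmetic (assuming, as in SAS output, that 237.973 and 298.430 are −2 log-likelihood values, which leaves the ratio unchanged):

```python
def mcfadden_rho2(ll_model, ll_intercept):
    """McFadden's rho-squared: 1 - LL(M)/LL(I). The ratio is the same
    whether LL or -2LL values are supplied, as long as both match."""
    return 1.0 - ll_model / ll_intercept

rho2 = mcfadden_rho2(237.973, 298.430)  # about 0.20, a medium-large ES
```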
Table 9.13 gives the values for the other four macro-level indices for this LR.
As in previous LR analyses, we can calculate an estimate of the macro-level ES
by squaring the values for Somer's D, gamma, tau-a, and c and averaging them
with the value for McFadden's ρ². This average ES can be calculated as:
p = 0.0003). The estimates of shared variance between the four predictors and
the outcome dichotomy (i.e., action versus precontemplation) produced a large ES
when averaging the values of (Somer's D)², (gamma)², (tau-a)², (c)², and McFad-
den's ρ², respectively:
This represents a very large effect size, with more than half of the variance in
STAGE5B membership (i.e., maintenance versus precontemplation) shared with
the set of four predictors. It is certainly appropriate to examine micro-level findings.
Figure 9.5 shows only two significant predictors at the 0.01 level. The odds
ratio for condom self-efficacy indicates that an individual who scores high on
this predictor is more than five times as likely to be in the maintenance stage
of condom use as in the precontemplation stage. Similar to what was found in
previous analyses, those who score high in psychosexual functioning are highly
unlikely to be staged in maintenance.
Extrapolating from all five of the LR analyses, it appears that individuals who
are high in condom self-efficacy are most likely to be in the maintenance stage
TABLE 9.14
Multiplicity, Background, Central, and Interpretation Themes Applied to LR
SUMMARY
A summary of the various themes for LR is presented in Table 9.14.
REFERENCES
Aldrich, J. H., & Nelson, F. D. (1984). Linear probability, logit, and probit models. Beverly Hills, CA:
Sage.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Harlow, L. L., Quina, K., Morokoff, P. J., Rose, J. S., & Grimley, D. (1993). HIV risk in women:
A multifaceted model. Journal of Applied Biobehavioral Research, 1, 3-38.
Hosmer, D. W., & Lemeshow, S. (1989). Applied logistic regression. New York: Wiley.
Menard, S. (1995). Applied logistic regression analysis (Sage University Paper 106 in the Series:
Quantitative Applications in the Social Sciences). Thousand Oaks, CA: Sage Publications.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Rose, J. S., Chassin, L., Presson, C. C., & Sherman, S. J. (2000). Prospective predictors of smoking
cessation: A logistic regression application. In J. S. Rose, L. Chassin, C. C. Presson, & S. J. Sherman
(Eds.), Multivariate applications in substance use research (Chapter 10: pp. 289-317). Mahwah,
NJ: Lawrence Erlbaum Associates.
SAS (1999). Statistical Analysis Software, Release 8.1. Cary, NC: SAS Institute Inc.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.) (Chapter 12: pp. 517-
581). Boston: Allyn and Bacon.
Wright, R. E. (1995). Logistic regression. In L. G. Grimm & P. R. Yarnold (Eds.), Reading and
understanding multivariate statistics (pp. 217-244). Washington, DC: American Psychological
Association.
V
Multivariate Correlation
Methods with Continuous
Variables
10
Canonical Correlation
on the right). The focus is on correlations and weights, as opposed to group means
or classification. The main question is how are the best linear combinations of the
IVs related to the best linear combinations of the DVs? In CC, there are several
layers of analysis:
If the number of IVs is equal to p and the number of DVs is equal to q, then
the number of canonical variates (i.e., linear combinations) on each side, and
hence the number of cross-side correlations (i.e., canonical correlations), is equal
to the minimum of p or q. For example, with three substance use variables on
the left and two personality variables on the right, there would be two canonical
variates to explain the variance on each side. The first step would be to explore
whether canonical correlations between the two pairs of canonical variates were
significantly different from zero. Next, the correlations (i.e., canonical loadings)
between the three substance use variables and their two canonical variates would be
examined, followed by a similar examination of the canonical loadings for the two
personality variables on their two canonical variates. Finally, it would be helpful
to see how the three substance use variables related to each of the two personality
canonical variates and how the two personality variables related to each of the
two substance use canonical variates. Figure 10.1 depicts the first two layers of
analysis for this example. To examine redundancy, add lines from each X to each
W, and from each Y to each V.
Follow-up MRs, one for each DV, are depicted in Figure 10.2 for this example.
Notice that all three predictors (i.e., alcohol, marijuana, and hard drug use) are
used to predict the first DV, distress, and then the second DV, self-esteem. These
follow-up analyses could be conducted with a more conservative alpha level (e.g.,
p < 0.01) to help control the overall Type I error rate when conducting multiple
analyses. However, researchers interested in increasing the power of their study
(and controlling Type II error) may choose the more conventional alpha level of
p < 0.05 for both the CC and follow-up MRs.
CC is similar to MR in that multiple IVs are allowed. CC differs from MR in
that MR allows only a single DV, whereas CC allows two or more DVs. CC is
similar to discriminant function analysis (DFA) and logistic regression (LR) in
that multiple IVs are allowed with all three methods. CC is different from both
DFA and LR in that the latter two methods usually have only a single categorical
Ryy^-1 Ryx Rxx^-1 Rxy (10.1)
The four submatrices needed for this analysis (i.e., Ryy^-1, Ryx, Rxx^-1, and
Rxy) derive from partitioning the original larger matrix of correlations among the
DVs and IVs into four sections. The first is the inverse of the q by q matrix
of correlations among the DVs. Second, we examine the correlations among the q
DVs and the p IVs. Third, we take the inverse of the matrix of correlations among
the p IVs. Finally, we multiply the three previous matrices by the p by q matrix of
correlations among the IVs and DVs.
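A minimal pure-Python sketch of this computation for p = q = 2, using invented correlation submatrices: the eigenvalues of the product Ryy^-1 Ryx Rxx^-1 Rxy are the squared canonical correlations.

```python
def inv2(m):
    """Inverse of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat2(m, n):
    """Product of two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eig2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic equation."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = (tr * tr - 4 * det) ** 0.5
    return sorted([(tr + disc) / 2, (tr - disc) / 2], reverse=True)

# Hypothetical correlation submatrices for p = 2 IVs and q = 2 DVs
Rxx = [[1.0, 0.3], [0.3, 1.0]]   # correlations among the IVs
Ryy = [[1.0, 0.4], [0.4, 1.0]]   # correlations among the DVs
Rxy = [[0.5, 0.2], [0.1, 0.3]]   # IV-by-DV correlations
Ryx = [[0.5, 0.1], [0.2, 0.3]]   # transpose of Rxy

# Eigenvalues of Ryy^-1 Ryx Rxx^-1 Rxy: the squared canonical correlations
M = mat2(mat2(inv2(Ryy), Ryx), mat2(inv2(Rxx), Rxy))
squared_ccs = eig2(M)            # min(p, q) = 2 values, each between 0 and 1
```

With these toy values the two squared canonical correlations come out near 0.25 and 0.09, i.e., canonical correlations of roughly 0.50 and 0.29.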
The nature of this model is further explained, below, when discussing ratios of
(co-)variances.
Thus, this is not too different from the formula for a correlation that examines
the ratio of covariance of X and Y over the square root of the product of the
variance of X and the variance of Y:
r = Cov(X, Y) / [Var(X) × Var(Y)]^1/2
Similarly, it is not too far afield from the ratio of between-group variance over
within-group variance that is examined with group difference methods such as
ANOVA, ANCOVA, and MANOVA:
between-group variance / within-group variance
The main difference between the ratio examined in CC and those with the other
methods is that CC includes information on p multiple independent variables and
q multiple dependent variables. As mentioned earlier, we can examine several
linear combinations of the IVs and DVs, equal to whichever is smaller, p or q. This
is not too different from what we found in MANOVA or DFA, with a couple of
exceptions. First, we do not often have grouping variables in CC that necessitate
forming dummy variables for k − 1 of the categories. Thus, we can form linear
combinations for the smaller of the subset of vectors of data from the p IVs and from
the q DVs. Second, because there are two full sets of variables, we want to have
two sets of linear combinations, summarizing the IVs and the DVs, respectively.
Thus, the total number of linear combinations in CC is equal to:
2 × [the minimum of p and q]
they cannot be interpreted as proportion of variance terms but rather the proportion
of available canonical variance attributable to that specific pair of variates. Most
often, the first proportion reveals that the bulk of the variable information is in the
first pair of canonical variates so that the others may not be worth examining. If a
pair of canonical variates are significantly related and explain a large proportion
of the variance in the variables, move to the next layer to see which variables are
substantially related to each canonical variate.
analysis will be conducted, with output provided for each: A, correlations among
the p IVs and q DVs; B, a macro-level assessment of CC; C, midlevel assessment
of the canonical correlations among the pairs of canonical variates; D, micro-level
assessment of canonical loadings for both IVs and DVs; E, micro-level assess-
ment of the (redundancy) relationships among variables on one side and canonical
variates on the other side; and F, follow-up MRs, one for each DV, to attempt to
examine the directional ordering of the variables.
TABLE 10.1
(Rxx) Pearson Correlations (Among Xs) (N = 527)
TABLE 10.2
(Ryx) Pearson Correlations (Among Ys and Xs) (N = 527)
(e.g., 2 weeks), these values may well have been even larger and more in line with
conventional standards for reliability (i.e., values > 0.70).
In Table 10.4 we present the Ryy portion of the matrix for use in CC analyses.
This provides the intercorrelations among all of the Y or DVs, which are the same
as the X or IVs except that the Ys are measured 6 months later. As with the Rxx
portion of the matrix presented earlier, we would want to scan the correlations in
the Ryy portion to check for collinearity within these variables (on the right, i.e.,
the DVs). Because none of the values reaches the 0.70 to 0.90 range, there is no
reason to suspect collinearity.
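The collinearity scan can be sketched as follows; the matrix and variable names here are invented for illustration (none of the actual Ryy values reaches the cutoff):

```python
def flag_collinear(corr, names, threshold=0.70):
    """Return variable pairs whose absolute correlation meets the threshold."""
    flags = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i][j]) >= threshold:
                flags.append((names[i], names[j], corr[i][j]))
    return flags

# Hypothetical 3 x 3 correlation matrix; one pair exceeds the 0.70 cutoff
corr = [[1.00, 0.45, 0.82],
        [0.45, 1.00, 0.30],
        [0.82, 0.30, 1.00]]
flags = flag_collinear(corr, ["A", "B", "C"])
```

An empty result, as with the Ryy matrix in Table 10.4, indicates no collinearity concern among the DVs.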
TABLE 10.3
(Rxy) Pearson Correlations (Among Xs and Ys) (N = 527)
TABLE 10.4
(Ryy) Pearson Correlations (Among Ys) (N = 527)
Each of these four portions of the larger Rcc matrix is thus ready to be used in
CC analyses, applying equation 10.1, where:
A Macro-Level Assessment of CC
The macro-level assessment in Table 10.5 reveals a small Wilks's lambda (0.11)
and a large F(25, 1922) = 63.23, p < 0.0001. A multivariate effect size can easily
be calculated by subtracting Wilks's lambda from 1 to yield η² = 1 − Λ = 1 − 0.11 = 0.89.
This is very large by most standards, although in this case it represents a form of
reliability coefficient for the whole set of linear combinations across a 6-month
period. This indicates an impressive degree of stability from the initial time period
to a follow-up assessment 6 months later.
TABLE 10.5
Macro-Level Assessment of Canonical Correlation Example
TABLE 10.6
Mid-Level Assessment of Canonical Correlation Example
Though all pairs are significant, we may only want to interpret the first one
or two pairs that explain the bulk of the shared variance. Figure 10.3 depicts our
canonical correlation example with lines for the loadings shown for only the first
two pairs of canonical variates.
TABLE 10.7
Micro-Level Assessment of Canonical Correlation Example
Table 10.7) parallel those for time 1 in terms of which variables are associated
with which canonical variates.
TABLE 10.8
Redundancy Assessment for Canonical Correlation Example
Table 10.10 provides micro-level results for the MR with STAGEB as the DV.
All variables significantly predict stage (see t-values and associated p-values,
all < 0.01). Findings suggest that stage may well be a relevant outcome variable,
with other variables as potential predictors or causal agents.
Tables 10.11 and 10.12, respectively, present macro-level and micro-level re-
sults for the second follow-up MR with PSYSXB as the DV. Notice that relatively
little variance is shared (i.e., R2 = 0.09) between psychosexual functioning and
TABLE 10.9
Macro-Level Results for First Follow-Up MR: DV = STAGEB
Source    df    Sum of Squares    Mean Square    F-Value    Prob. > F
TABLE 10.10
Micro-Level Results for First Follow-Up MR: DV = STAGEB
Parameter Estimates
TABLE 10.11
Macro-Level Results for Second Follow-Up MR: DV = PSYSXB
Source    df    Sum of Squares    Mean Square    F-Value    Prob. > F
TABLE 10.12
Micro-Level Results for Second Follow-Up MR: DV = PSYSXB
Parameter Estimates
TABLE 10.13
Macro-Level Results for Third Follow-Up MR: DV = PROSB
Source    df    Sum of Squares    Mean Square    F-Value    Prob. > F
the other four variables across six months. Further, one of the predictors, pros,
does not significantly predict psychosexual functioning. These findings suggest
that psychosexual functioning most likely is not an outcome variable, but if re-
gression coefficients going from psychosexual functioning at time 1 to other vari-
ables at time 2 are significant, it may be a good predictor or potentially causal
variable.
Macro-level and micro-level results from the third follow-up MR, with PROSB
as the outcome, are presented in Tables 10.13 and 10.14, respectively. These MR
results are not impressive with only 8% shared variance between the set of predic-
tors and PROSB 6 months later. Further, there are two nonsignificant predictors
(i.e., psychosexual functioning and cons are not significantly related to pros). As
with the previous MR, this suggests that pros is most likely not a meaningful out-
come or at least that there is no evidence that psychosexual functioning and cons
precede pros.
TABLE 10.14
Micro-Level Results for Third Follow-Up MR: DV = PROSB
Parameter Estimates
TABLE 10.15
Macro-Level Results for Fourth Follow-Up MR: DV = CONSB
Source    df    Sum of Squares    Mean Square    F-Value    Prob. > F
Tables 10.15 and 10.16 present macro-level and micro-level results, respec-
tively, for the fourth follow-up MR, with CONSB as the DV. There is a reasonable
proportion of shared variance (i.e., R² = 0.23; F(4, 522) = 39.74, p < 0.0001) be-
tween cons at time 2 and the other variables at time 1, but pros is not a significant predictor.
These results suggest that cons may serve as an outcome, with psychosexual func-
tioning, condom self-efficacy, and stage potentially serving as causal predictors of
cons measured 6 months later.
Still, it is worth holding back on this speculation until viewing the results from
the last MR where condom self-efficacy is hypothesized as an outcome and the
other variables are posited as predictors.
Tables 10.17 and 10.18 present the follow-up MR results for the fifth DV,
CONSEFFB. The macro-level MR results reveal substantial shared variance
(i.e., R² = 0.29; F(4, 522) = 53.98, p < 0.0001) between the IVs and condom
TABLE 10.16
Micro-Level Results for Fourth Follow-Up MR: DV = CONSB
Parameter Estimates
TABLE 10.17
Macro-Level Results for Fifth Follow-Up MR: DV = CONSEFFB
Source    df    Sum of Squares    Mean Square    F-Value    Prob. > F
TABLE 10.18
Micro-Level Results for Fifth Follow-Up MR: DV = CONSEFFB
Parameter Estimates
self-efficacy, with all predictors significantly related to the outcome. This provides
some evidence that condom self-efficacy may well serve as an outcome variable
with the remaining variables (i.e., psychosexual functioning, pros, cons, and stage)
serving as potentially causal predictors.
Given all the MR results, it is conceivable that both condom self-efficacy and
stage are mediators or outcomes with the other variables serving as potential causal
predictors.
SUMMARY
A summary of the multiplicity, background, central and interpretation themes for
CC is presented in Table 10.19.
TABLE 10.19
Multiplicity, Background, Central, and Interpretation Themes Applied
to Canonical Correlation
Themes Canonical Correlation
REFERENCES
Campbell, K. T., & Taylor, D. L. (1996). Canonical correlation analysis as a general linear model: A
heuristic lesson for teachers and students. Journal of Experimental Education, 64, 157-171.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. San Diego, CA: Academic
Press.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation
analysis for behavioral sciences (3rd ed.: Chapter 16, pp. 608-628). Mahwah, NJ: Lawrence
Erlbaum Associates.
Fan, X. (1997). Canonical correlation analysis and structural equation modeling: What do they have
in common? Structural Equation Modeling, 4, 65-79.
Harlow, L. L., Quina, K., Morokoff, P. J., Rose, J. S., & Grimley, D. (1993). HIV risk in women: A
multifaceted model. Journal of Applied Biobehavioral Research, 1, 3-38.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore, C.,
Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
PCA and FA are exploratory multivariate methods that delineate the underlying
dimensions in a large set of variables or individuals. Although we consider them
multivariate methods, as they indeed handle multiple variables, both methods an-
alyze only a single set of variables. Unlike the other methods discussed in this
book, there is not the usual distinction between independent and dependent vari-
ables. Still, one of our central themes in multivariate methods is that of explaining
the variance and covariance within and across sets of variables. To maintain this
pervasive theme, we can consider the dimensions as a set of underlying indepen-
dent variables (IVs) from which the actual measured (dependent) variables (DVs)
emanate. We now elucidate how the 10 questions and themes relate to both PCA
and FA.
a single set of continuous variables and determine the number and nature of the
underlying dimensions that organize these variables. We are, in essence, trying
to find a few cogent dimensions that pull together the nature of the variables.
PCA and FA locate these dimensions by noting which variables are interrelated.
A main difference is that PCA uses all the variance in the variables and treats
it as true variance when finding the underlying dimensions or components. FA
recognizes that there is measurement error or unique variance in the variables
that should be separated from the true variance or factor variance, before finding
the underlying dimensions or factors. PCA is more mathematically precise. FA is
more conceptually realistic. Both PCA and FA solutions can be rotated to increase
interpretability. As we see later in the chapter, the two major rotation methods are
Varimax (orthogonal, i.e., uncorrelated) and oblique (correlated; e.g., Promax).
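Although the book's analyses are run in SAS, the Varimax idea can be sketched in a few lines of Python using the standard SVD-based algorithm. This is a generic sketch, not code from the book, and the loading matrix below is hypothetical.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a p x k loading matrix (standard algorithm)."""
    p, k = L.shape
    T = np.eye(k)                    # accumulated orthogonal rotation matrix
    var_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T                   # current rotated loadings
        # SVD-based update of the rotation toward the Varimax criterion
        U, s, Vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr * (Lr ** 2).sum(axis=0) / p))
        T = U @ Vt
        if s.sum() - var_old < tol:  # stop when the criterion stabilizes
            break
        var_old = s.sum()
    return L @ T, T

# Hypothetical unrotated loadings for four variables on two dimensions
L = np.array([[0.7, 0.5], [0.6, 0.4], [0.5, -0.6], [0.4, -0.5]])
L_rot, T = varimax(L)
print(np.round(L_rot, 2))            # rotated loadings
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loading size across dimensions shifts toward simple structure.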
PCA and FA differ from most multivariate methods in that only a single set
of measured variables is analyzed. PCA and FA are similar to other correlation
methods that focus on the nature of the relationship among variables [e.g., multi-
ple regression (MR), canonical correlation (CC)]. In contrast to group-difference
methods [e.g., analysis of covariance (ANCOVA), multivariate analysis of vari-
ance (MANOVA)], PCA and FA do not focus on the means for a set of variables.
With PCA and FA, as with several other methods [e.g., MR, discriminant function
analysis (DFA), logistic regression (LR), CC], we are much more interested in the pattern of relationships among the variables than in any group means.
a. PCA (or to some degree FA) can be used to transform a set of corre-
lated variables into the same number of uncorrelated linear combinations or
components. These newly formed components include all the original variance
in the variables, with the first component claiming most of the variance, and the
remaining ones taking less and less of the variance. In effect, PCA retains all the
information from the original variables while making the new components com-
pletely orthogonal, and thus more mathematically viable. There is no guarantee,
however, that these mathematically elegant and orthogonal components will be
conceptually interpretable. Still, these components can be used in other analyses
such as MR or MANOVA to avoid possible collinearity problems.
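The claim that PCA retains all the variance while producing orthogonal components can be checked numerically. The following is a small Python sketch on simulated (hypothetical) data, not the book's SAS output: the eigenvalues sum to p, and the component scores are mutually uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
f = rng.normal(size=n)                      # a common source of correlation
X = np.column_stack([f + rng.normal(scale=0.5, size=n) for _ in range(4)])

Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize the variables
R = (Z.T @ Z) / n                           # correlation matrix (1s on diagonal)
eigvals, B = np.linalg.eigh(R)              # eigenvalues, eigenvector weights
order = np.argsort(eigvals)[::-1]           # largest component first
eigvals, B = eigvals[order], B[:, order]

V = Z @ B                                   # component scores (Equation 11.1)

# All variance is retained: the eigenvalues sum to p = 4 (within rounding),
# and the components are orthogonal (off-diagonal correlations near zero).
print(np.round(eigvals, 2))
C = np.corrcoef(V, rowvar=False)
print(np.round(np.abs(C - np.eye(4)).max(), 6))
```

The first eigenvalue absorbs most of the shared variance, and any of the columns of V could be entered into MR or MANOVA without collinearity concerns.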
b. PCA or FA can be used to reduce a large set of correlated variables to
a few orthogonal underlying dimensions. These dimensions can then be rotated
further, either orthogonally or obliquely (discussed later), to improve interpreta-
tion. For example, a researcher may be interested in reducing information from a
100-item inventory to a set of 10 underlying dimensions that explain much of the
variance in the original variables. FA would recognize that there was most likely
some measurement error or uniqueness in each item that is not analyzed when
forming the dimensions or factors. The factors would involve only the variance in
each item that overlaps with the other items.
c. FA (and PCA, potentially) can be used for theory testing to assess the
conceptual nature of underlying dimensions in a set of variables. This usually
involves having strong theory to suggest the nature of the variables and underlying
dimensions. For example, the 12 subscales of the Wechsler intelligence tests are
often delineated into verbal and performance dimensions.
Collinearity, as with other methods, should still be investigated. Variables within a dimension are
often viewed as similar ways of expressing the same dimension and thus can ex-
hibit substantial correlation (e.g., |0.30| to |0.70|). However, if correlation exceeds
| 0.90 |, there could be problems associated with collinearity (e.g., instability of
the weights or loadings). If collinearity is suspected, consider collapsing the two
variables involved into an averaged or summed composite or even dropping one
of the variables.
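A simple screen for this kind of collinearity can be sketched in Python (the data and variable names here are hypothetical, not the book's dataset): flag any variable pair whose correlation exceeds the |0.90| rule of thumb.

```python
import numpy as np

def collinear_pairs(X, names, threshold=0.90):
    """Return variable pairs whose absolute correlation exceeds the threshold."""
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    return [(names[i], names[j], round(R[i, j], 2))
            for i in range(p) for j in range(i + 1, p)
            if abs(R[i, j]) > threshold]

rng = np.random.default_rng(1)
a = rng.normal(size=100)
X = np.column_stack([a,
                     a + rng.normal(scale=0.1, size=100),  # near-duplicate of a
                     rng.normal(size=100)])                # unrelated variable
print(collinear_pairs(X, ["v1", "v2", "v3"]))              # v1/v2 flagged
```

A flagged pair would then be a candidate for averaging, summing, or dropping one member, as described above.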
As with most multivariate methods, we would like to have access to a large
(N x p) data matrix with multiple continuous variables. Unlike many methods,
there would be less emphasis on meeting statistical assumptions, particularly when
making only descriptive summaries of the data. Certainly, inferences beyond a
specific sample would be strengthened when meeting linear model assumptions
(i.e., normality, linearity, and homoscedasticity) in a large and relevant sample.
Finally, the reliability of each variable is also a concern. Ideally, we would like
internal consistency coefficients to be 0.70 or higher, but test-retest reliability,
especially over long time periods (e.g., 6 to 12 months) may not be quite as high.
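The internal consistency coefficient usually meant here is Cronbach's alpha, which is easy to compute. Below is a hedged Python sketch on simulated item scores (not the book's data); alpha of 0.70 or higher would be considered adequate.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: rows are people, columns are items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(2)
true_score = rng.normal(size=300)
items = np.column_stack([true_score + rng.normal(scale=0.7, size=300)
                         for _ in range(5)])       # 5 roughly parallel items
print(round(cronbach_alpha(items), 2))             # well above 0.70 here
```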
For PCA, each new dimension is modeled as a weighted linear combination of the original variables:

V = XB     (11.1)

where V holds the component scores, X holds the standardized measured variables, and B holds the eigenvector weights. The variance-covariance matrix of the components is then

S_C = B'S_X B     (11.2)

so that as much variance as possible is preserved in the first few components. Note that all the variance in the variables is
retained and redistributed in PCA.
For FA, the underlying dimension is not modeled, but rather the original X
variable is modeled as a function of the underlying dimension times a (factor
loading) weight plus some uniqueness:
X = LF + E     (11.3)
As we saw with the descriptions of the equations for PCA and FA, variance is
examined differently for the two methods. In PCA, we examine all the variance
and do not even consider the possibility of error variance in the variables. Thus,
PCA uses the matrix of correlations among the variables as the initial starting
point for analysis. The p standardized variances (i.e., 1s) along the diagonal are
redistributed among the new components. This method assumes that the variables
are perfectly reliable and that all the variance in the variables is worth retaining.
In FA, we recognize that each variable has a portion of true-score variance (e.g.,
Lord & Novick, 1968) as well as some portion that is not shared with the other
variables loading on a factor. The focus of FA is on the portion of the variance
in a variable that is shared in common with the other variables and thus is called
common factor variance. In FA, then, we use a matrix of correlations as the start-
ing point, except that instead of 1s along the diagonal, we insert communalities,
which are estimates of the shared variance between a specific variable and all the
remaining variables. Remembering back to MR, we see that a squared multiple
correlation (SMC or R2) between a variable and the remaining variables provides
a measure of shared variance. Another estimate of communality is the absolute
value of the largest correlation within a factor (Gorsuch, 1983). These estimated
values are often inserted along the diagonal in FA to reflect the fact that we are
analyzing only the portion of variance in the original variables that is held in com-
mon among the variables. The diagonal matrix of unique or error variance holds
the remaining variance so that when adding this matrix to the correlation matrix
with SMCs along the diagonal we get the full R matrix of correlations among the
variables (i.e., with Is along the diagonals).
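The SMC communality estimates just described can be computed directly from the inverse of the correlation matrix, using the standard identity SMC_i = 1 - 1/(R^-1)_ii. A short Python sketch with a small hypothetical R:

```python
import numpy as np

# Hypothetical 3 x 3 correlation matrix
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])

# Squared multiple correlation of each variable with all the others
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
print(np.round(smc, 3))

# Reduced matrix for FA: replace the 1s on the diagonal with the SMCs
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
```

The first SMC, for example, equals the R-squared from regressing variable 1 on variables 2 and 3, confirming that the diagonal of the reduced matrix holds only shared variance.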
Covariance plays a central role in that variables must have some covariance if
there are underlying dimensions that explain the relatedness among the variables.
One rule of thumb is to make sure there are at least several correlations of
|0.30| or more to ensure the presence of one or more dimensions. If all the variables
were completely orthogonal (i.e., correlations were equal to zero), it would not be
possible to describe the set of p variables with a smaller set of q dimensions.
Finally, linear combinations are viewed differently between PCA and FA. In
PCA, the linear combination of interest is the new component score that is a
function of the original variables and a set of eigenvector weights. In FA, the
linear combination that we focus on is the original X variable, which is seen as a
weighted function of an underlying factor plus some uniqueness or measurement
error.
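The FA view of linear combinations implies that the loadings and uniquenesses should reproduce the correlation matrix as R = LL' + Psi. A small Python sketch with hypothetical one-factor loadings makes this concrete:

```python
import numpy as np

# Hypothetical loadings of three variables on a single factor
L = np.array([[0.8],
              [0.7],
              [0.6]])
Psi = np.diag(1.0 - (L ** 2).ravel())   # unique variance per variable

R = L @ L.T + Psi                       # implied correlation matrix
print(np.round(R, 2))
# Each off-diagonal entry is a product of loadings, e.g. r12 = 0.8 * 0.7 = 0.56
```

This is exactly the sense in which the measured X variables are weighted functions of an underlying factor plus uniqueness.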
a. Similar to several other methods (e.g., MR, DFA, LR, CC), PCA and FA
focus on weights attached to specific variables to get a microperspective. Just as
with DFA and CC, the most interpretable weight is a (component or factor) loading
or structure coefficient. Unlike most applications of DFA and CC, the loadings are
often rotated in PCA and FA to increase the interpretability of the dimensions.
There are several kinds of rotation procedures, but the most common are Varimax,
which rotates dimensions orthogonally, and oblique (e.g., Promax), which allows
dimensions to be correlated. While most computer programs use the Varimax
orthogonal rotation as a default option, it is often useful to consider an oblique
rotation. This is especially true if we expect the dimensions to be related. In either
case, we usually strive to rotate the weights so that each dimension has several
variables that load highly with the remaining variables loading close to zero. This
pattern is labeled a "simple structure" (Thurstone, 1935), which is strived for but
not always achieved. In any structure, whether simple or not, loadings range from
-1 to +1 and show how correlated a variable is with an underlying dimension
(i.e., component or factor). For both PCA and FA, we use the same criterion as
with other methods that rely on loadings: variables with loadings of |0.30| or
greater are interpreted as playing a meaningful part in the dimension. We
also would like to try to describe the nature of each dimension by noting the kind
of variables that highly load on the components and factors. As with other methods
that focus on weights, the sign attached to a loading informs us about the nature
of the relationship. A positive value indicates that a variable is very similar to the
underlying dimension. A negative loading suggests that the higher the score on
the respective variable, the lower the score on the dimension on which the variable
loads. Thus, variables could be evaluated with several guidelines (see below).
b. Those with loadings >|0.30| would be retained as marker variables for a
dimension, with ideally three or more marker variables per dimension.
c. Variables with loadings <|0.30| on all dimensions could be discarded.
This would not necessarily mean the variables are unreliable. It could be the
variables do not have enough in common with the other variables. If this is the
case, more variables addressing the same content could be included in a future
study to help anchor the additional dimension.
d. Those with loadings >|0.30| on more than one dimension would be
labeled as complex variables. Complex variables most likely would be discarded
because it would not be clear which dimension the variable was describing.
e. Variables that had positive and high loadings would be most consistent
with the direction and nature of a dimension.
f. Those with negative and high loadings would indicate variables that are
inversely related to an underlying dimension.
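These guidelines can be applied mechanically to any loading matrix. The sketch below (hypothetical loadings and variable names, not the book's results) classifies each variable using the |0.30| rule from guidelines b through d:

```python
import numpy as np

def classify(loadings, names, cut=0.30):
    """Label each variable as marker, complex, or discard candidate."""
    out = {}
    for name, row in zip(names, np.abs(np.asarray(loadings))):
        hits = int((row >= cut).sum())
        if hits == 0:
            out[name] = "discard"    # loads on no dimension (guideline c)
        elif hits == 1:
            out[name] = "marker"     # marker variable (guideline b)
        else:
            out[name] = "complex"    # loads on several dimensions (guideline d)
    return out

L = [[0.72, 0.10],    # marker for dimension 1
     [-0.65, 0.05],   # marker with a negative loading (guideline f)
     [0.45, 0.52],    # complex variable
     [0.12, 0.08]]    # candidate for discarding
print(classify(L, ["x1", "x2", "x3", "x4"]))
```

The sign of each retained loading would then be read as in guidelines e and f.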
If statistical assumptions are met and a large, representative, and ideally random
sample is analyzed, then generalizations beyond the immediate sample become
more credible. Lykken's (1968) emphasis on constructive replication is relevant
here. Lykken argues that results from a single study are much more convincing
when replicated with separate, independent samples, different items to anchor each
of the major dimensions, and different methods. Thus, exploratory PCA or FA
results would be more compelling if replicated in a separate sample or if the same
factor structure were found using different items for each of the main constructs.
Further, confirmatory methods such as confirmatory factor analysis (CFA) should
be considered. CFA is a subset of structural equation modeling in which several
latent factors are posited, with each of them having hypothesized loadings on
several salient variables. In CFA, the number and nature of the factors is known
at the beginning of the study. The goal is to assess how well the hypothesized
structure fits the data. Though the topic of CFA is beyond the scope of this book,
several excellent sources describe this useful methodology and the larger method
of structural equation modeling (e.g., Bentler, 2000; Byrne, 2001; Loehlin, 2004;
Raykov & Marcoulides, 2000; Schumacker & Lomax, 2004).
If follow-up results appear encouraging, there would be greater verisimilitude
for the underlying dimensions that could be used to summarize scores on a measur-
ing instrument or even used in a predictive framework, such as structural equation
modeling.
TABLE 11.1
Descriptive Statistics on the Variables in the PCA and FA Example
The only notable departures from normality involve pros (PROSA), which show some negative skewness
(i.e., most people report a high level of perceived advantages of condom use),
and condom self-efficacy (CONSEFFA), which shows some negative kurtosis
(i.e., there is a platykurtic distribution, so that
there are approximately an equal number of people who report low, medium, and
high levels of condom self-efficacy). Still, there does not appear to be enough
nonnormality to warrant making transformations of the data.
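Decisions like this one rest on sample skewness and kurtosis. A small Python sketch (simulated scores, not the study's data) computes both; values near zero indicate no need for transformation.

```python
import numpy as np

def skew_kurtosis(x):
    """Sample skewness and excess kurtosis (0, 0 for a normal distribution)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean(), (z ** 4).mean() - 3.0

rng = np.random.default_rng(3)
scores = rng.normal(size=1000)          # hypothetical, roughly normal scores
s, k = skew_kurtosis(scores)
print(round(s, 2), round(k, 2))         # both near 0 for normal data
```

Markedly negative skewness (as with pros here) or negative kurtosis (as with condom self-efficacy) would show up directly in these two statistics.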
TABLE 11.2
Pearson Correlation Coefficients (Prob > |r| under H0: Rho = 0)
Among PROSA, CONSA, CONSEFFA, and PSYSXA
Figure 11.2 presents the plot of the eigenvalues (i.e., scree plot: Cattell, 1966)
for these data. The eigenvalues from Table 11.3 are plotted along the vertical axis
and the dimensions (up to the number of variables analyzed) are listed along the
horizontal axis. The steep drop in the plot of the first two eigenvalues, followed by
a shallower slope in the plot for the remaining eigenvalues also suggests that two
dimensions would adequately represent the data.
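The eigenvalue-based reasoning behind a scree plot can be illustrated in Python with a hypothetical correlation matrix (not the study's data) that has a clear two-dimension structure; both the drop in the scree and the eigenvalues-greater-than-one rule point to two dimensions.

```python
import numpy as np

# Hypothetical correlation matrix: two pairs of related variables
R = np.array([[1.0, 0.6, 0.1, 0.1],
              [0.6, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.6],
              [0.1, 0.1, 0.6, 1.0]])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending eigenvalues
print(np.round(eigvals, 2))                      # [1.8 1.4 0.4 0.4]

n_keep = int((eigvals > 1.0).sum())              # eigenvalues-over-one rule
print(n_keep)                                    # -> 2
```

Plotting the sorted eigenvalues against their rank reproduces the scree: a steep drop through the first two values, then a shallow slope.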
TABLE 11.3
Principal Component Loadings for the Example
TABLE 11.4
Micro-Assessment of PCA with Orthogonal, Varimax Rotation
Another consideration is that the two dimensions should have theoretical rele-
vance to justify retaining them. This is best evaluated by examining the pattern of
loadings on the two retained dimensions. Table 11.4 presents component loadings
that have been rotated orthogonally, assuming that the two components are uncor-
related. Table 11.5 presents component loadings that have been rotated obliquely,
also providing the degree of correlation between the two dimensions. Both the
orthogonal (Varimax) and oblique (Promax) solutions indicate two relatively un-
correlated (i.e., r = -0.16) components with near simple structure. Factor 1 has
high loadings for psychosocial variables hypothesized in the MMOHR (Harlow
et al., 1993). Factor 2 has high loadings for condom use variables from the
TABLE 11.5
Micro-Assessment of PCA with Oblique, Promax Rotation
Inter-Factor Correlations
Factor 1 Factor 2
TABLE 11.7
Micro-Assessment of FA with Orthogonal Rotation
transtheoretical model (Prochaska et al., 1994). Thus, we could conclude that the
two components have both theoretical and empirical support.
Macro- and Micro-Level Assessment of FA
While it is not as clear-cut as with PCA, there appears to be evidence for two factors
with a FA of these same eight variables. Eigenvalues are not easily interpreted with
FA (see Table 11.6), but the scree plot (see Figure 11.3) starts to form an elbow
after the second factor. This suggests that the remaining factors would not be worth
examining. Thus, similar to PCA, based on the scree plot we move forward and
examine two dimensions at the micro-level.
As with the PCA micro-results, loadings for the orthogonal and oblique solu-
tions are similar (see Tables 11.7 & 11.8, respectively). This is most likely due to the
relatively small correlation between the dimensions (i.e., r = -0.23). Compared
TABLE 11.8
Micro-Assessment of FA with Oblique, Promax Rotation
Inter-Factor Correlations
Factor 1 Factor 2
with PCA loadings, however, FA loadings are slightly lower in magnitude for the
marker variables and somewhat higher otherwise. Still, results appear comparable
across PCA and FA (e.g., Velicer & Jackson, 1990), probably due to fairly reliable
variables and relatively uncorrelated dimensions. Figure 11.4 depicts the loadings
and factor correlation for the oblique FA solution. Note that factor loadings are
given only for marker variables on their respective factors.
SUMMARY
REFERENCES
Bentler, P. M. (2000). EQS6: Structural equations program manual. Encino, CA: Multivariate Soft-
ware, Inc.
Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and
programming. Mahwah, NJ: Lawrence Erlbaum Associates.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1,
245-266.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Gardner, H. (1999). Intelligence reframed: Multiple intelligences for the 21st century. New York: Basic
Books.
Gorsuch, R. L. (1983). Factor Analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns.
Psychological Bulletin, 103, 265-275.
Guttman, L. (1954). Some necessary conditions for common factor analysis. Psychometrika, 19, 149-
161.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30,
179-185.
Harlow, L. L., Quina, K., Morokoff, P. J., Rose, J. S., & Grimley, D. (1993). HIV risk in women: A
multifaceted model. Journal of Applied Biobehavioral Research, 1, 3-38.
Kaiser, H. F. (1970). A second generation Little Jiffy. Psychometrika, 35, 401-415.
Loehlin, J. C. (2004). Latent variable models: An introduction to factor, path, and structural equation
analysis (4th ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-
Wesley Publishing Company, Inc.
Lykken, D. T. (1968). Statistical significance in psychological research. Psychological Bulletin, 70,
151-159.
Matthews, D. J. (1988). Gardner's Multiple Intelligence theory: An evaluation of relevant research
literature and a consideration of its application to gifted education. Roeper Review, 11, 100-104.
McDonald, R. P. (1985). Factor analysis and related methods. Hillsdale, NJ: Erlbaum.
Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift's electric factor analysis machine.
Understanding Statistics, 2, 13-43.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Raykov, T., & Marcoulides, G. A. (2000). A first course in structural equation modeling. Mahwah, NJ:
Lawrence Erlbaum Associates.
Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural equation modeling (2nd
ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Spearman, C. (1904). "General intelligence," objectively determined and measured. American Journal
of Psychology, 15, 201-292.
Thurstone, L. L. (1935). The vectors of the mind. Chicago: University of Chicago Press.
Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations.
Psychometrika, 41, 321-327.
Velicer, W. F., & Jackson, D. N. (1990). Component analysis vs. common factor analysis: Some issues
in selecting an appropriate procedure. Multivariate Behavioral Research, 25, 1-28.
Wechsler, D. (1975). Intelligence defined and undefined: A relativistic appraisal. American Psycholo-
gist, 30, 135-139.
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of
components to retain. Psychological Bulletin, 99, 432-442.
VI
Summary
12
Integration of Multivariate
Methods
Consider each of the 10 themes, below, for the multivariate methods covered.
Correlational methods include MR, CC, PCA-FA, and to some degree DFA, with
each of these methods focusing on weights that can be interpreted in a correlational
metric. These methods focus little, if at all, on mean differences, although DFA may
involve group centroids that are means of the linear combinations (i.e., discriminant
functions). MR, CC, PCA, and FA use predominantly continuous variables, with
DFA also including a categorical outcome in addition to continuous independent
variables (IVs), although categorical IVs are allowed.
Methods with one or more major categorical variables include ANCOVA,
MANOVA, DFA, and LR. Whereas ANCOVA and MANOVA have one or
more categorical IVs, DFA and LR have a single categorical outcome. LR also
allows categorical IVs, whereas in DFA the IVs are usually continuous. There
is a single continuous outcome variable in ANCOVA, with MANOVA allowing
two or more continuous outcomes. The last two methods focus on mean differ-
ences between groups, whereas DFA and LR focus more on interpreting weights
between a set of IVs and a single categorical outcome. These four methods (AN-
COVA, MANOVA, DFA, and LR), involving one or more categorical variables,
each relates one or more IVs to one or more dependent variables (DVs).
All the methods covered allow both macro- and micro-assessment of the effect
(i.e., shared variance) between IVs and DVs, but the IVs in PCA and FA are
underlying dimensions.
We now turn to a discussion of multivariate methods and research questions.
of measurement error. FA analyzes only the common variance among the variables,
separating out the unique or measurement error variance.
We turn now to a discussion of the multivariate themes across the methods.
Note: + = multiple, MR = multiple regression, ANCOVA = analysis of covariance, MANOVA = multivariate analysis of variance,
DFA = discriminant function analysis, LR = logistic regression, CC = canonical correlation, PCA = principal components analysis,
FA = factor analysis, IV = independent variable, DV = dependent variable, Linear Combos = linear combinations, Min = minimum,
Comps. = components, Pract. Applications = practical applications.
DFA and LR, for example, can predict a categorical outcome such as death versus survival of an illness. In these cases, results are sometimes used to assist
practitioners with diagnosis or risk assessment for certain diseases.
We now consider background themes for each of the multivariate methods
discussed in this book.
| Background Theme | MR | ANCOVA | MANOVA | DFA | LR | CC | PCA | FA |
|---|---|---|---|---|---|---|---|---|
| Sample Size | 5-50 per IV | 20+ per group (k) | 20+ per DV | 5-50 per IV | 5-50 per IV | 5-50 per vble | 100-200+ | 100-200+ |
| Cont Vbles | Usually All | DV & Cov | DVs | IVs | IVs OK | Usually All | Usually All | Usually All |
| Categ Vbles | Not Likely | Yes for IV(s) | Yes for IV(s) | Yes for DV(s) | Yes for DV | Not Likely | Not Likely | Not Likely |
| Moderator(s) | May | May | May | May | May | May | Not Likely | Not Likely |
| Mediator(s) | May | May | Not Likely | Not Likely | May | Not Likely | Not Likely | Not Likely |
| Descr Freqs | Not Likely | Yes | Yes | Yes | Yes | Not Likely | Not Likely | Not Likely |
| Means & SDs | May | Yes | Yes | May | May | May | Not Likely | Not Likely |
| Linearity | Yes | Yes | Yes | Yes | May | Yes | Yes | Yes |
| Normality | Yes | Yes | Yes | Yes | May | Yes | May | May |
| Homoscedas. | Yes | Yes | Yes | Yes | May | Yes | May | May |
| Homog. Regr | No | Yes | May | No | No | No | No | No |
| Method Type | Prediction | Group Diff. | Group Diff. | Prediction | Prediction | Correlation | Corr Structure | Corr Structure |

Note: MR = multiple regression, ANCOVA = analysis of covariance, MANOVA = multivariate analysis of variance, DFA = discriminant function analysis, LR = logistic regression, CC = canonical correlation, PCA = principal components analysis, FA = factor analysis, IV = independent variable, k = number of groups, DV = dependent variable, Vble = variable, Cont = continuous, Categ = categorical, Descr Freqs = descriptive frequencies, SDs = standard deviations, Homoscedas. = homoscedasticity, Homog. Regr = homogeneity of regressions, Diff. = differences, Corr Structure = correlational structure.
Note: MR = multiple regression, ANCOVA = analysis of covariance, MANOVA = multivariate analysis of variance, DFA = discriminant function analysis, LR = logistic regression, CC = canonical correlation, PCA = principal components analysis, FA = factor analysis, IV = X = independent variable, DV = Y = dependent variable, V = linear combination for Xs, W = linear combination for Ys, Cov. = covariance, [σ²(x)] = variance of x, BG = between groups, WG = within groups, E⁻¹H = BG variance-covariance hypothesis matrix over WG variance-covariance error matrix, A = intercept, B & b = unstandardized weight, M = mean, T = treatment effect, E = error, R = correlation matrix, R⁻¹ = inverse of a correlation matrix, F = factor, L = factor loading, Ŷ = e^(A + B1X1 + B2X2 + B3X3 + B4X4) / [1 + e^(A + B1X1 + B2X2 + B3X3 + B4X4)] for LR.
In the next two sections, we summarize macro- and micro-level assessment for
the multivariate methods discussed in this book.
Note: IV = independent variable, DV = dependent variable, Cov. = covariance, MR = multiple regression, ANCOVA = analysis of covariance, MANOVA = multivariate analysis of variance, DFA = discriminant function analysis, LR = logistic regression, CC = canonical correlation, PCA = principal components analysis, FA = factor analysis, ANOVAs = analyses of variance, R² = η² = percent of shared variance between Xs and Ys.
Most of the methods interpret loadings or similar weights of at least |0.30|
at the micro-level, also allowing squared values that can be interpreted as
small, medium, and large ESs for values of 0.01, 0.06, and 0.13, respectively.
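These benchmarks can be applied mechanically. The following is a small Python sketch (a hypothetical helper, not from the book) that squares a loading to obtain shared variance and attaches the corresponding effect-size label:

```python
def effect_size_label(loading):
    """Label the shared variance implied by a loading with ES benchmarks."""
    r2 = loading ** 2          # squared loading = proportion of shared variance
    if r2 >= 0.13:
        return "large"
    if r2 >= 0.06:
        return "medium"
    if r2 >= 0.01:
        return "small"
    return "negligible"

print(effect_size_label(0.30))   # 0.09 shared variance -> "medium"
```

Note that the |0.30| loading cutoff for interpretation corresponds to 9% shared variance, comfortably above the "medium" benchmark.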
Throughout this book, we have examined applications on a single data set (see
accompanying CD) collected from 527 women at risk for HIV. Each of the exam-
ples relied on the theoretical frameworks of the transtheoretical model (Prochaska
et al., 1994a, 1994b) and the multifaceted model of HIV risk (Harlow et al., 1993,
1998). In most of the examples (for MR, MANOVA, DFA, and LR), we analyzed
the relationships between psychosexual functioning, the pros and cons of condom
use, and condom self-efficacy, on the one hand, and stages of condom use on the
other hand. For ANCOVA, we examined the cons of condom use at the initial
time point as a covariate, with the second time point providing data for the DV. As
with the MANOVA example, the five stages of condom use (1, precontemplation;
2, contemplation; 3, preparation; 4, action; and 5, maintenance) served as levels
of the IV. For CC, we analyzed the relationship among all five variables (psycho-
sexual functioning, pros, cons, condom self-efficacy, and stage of condom use) at
two different time points, collected 6 months apart.
Analyses from each of these applications showed that there was significant
shared variance among the variables, particularly with condom self-efficacy and
stages of condom use, with psychosexual functioning having less in common with
stages, and the pros and cons falling somewhere in between.
For PCA and FA, we analyzed three transtheoretical model variables (pros,
cons, and condom self-efficacy) with five multifaceted model of HIV risk vari-
ables (psychosexual functioning, meaninglessness, stress, demoralization, and
powerlessness). These analyses resulted in two dimensions (i.e., for the trans-
theoretical model and multifaceted model of HIV risk variables, respectively) to
explain the pattern of correlations among the variables.
It is hoped that the presentation of various themes that cut across all the meth-
ods, with theoretically anchored applications for each method, provided a useful
framework for understanding the essence of multivariate methods. It is up to the
imagination and energy of the reader to further explore how to apply these methods
to a wide range of phenomena, generating far-reaching implications and a strong
knowledge base in the fields in which the multivariate methods are applied.
REFERENCES
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. San Diego, CA: Academic
Press.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Gorsuch, R. L. (1983). Factor Analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral
Research, 26, 499-510.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns.
Psychological Bulletin, 103, 265-275.
Harlow, L. L., Quina, K., Morokoff, P. J., Rose, J. S., & Grimley, D. (1993). HIV risk in women: A
multifaceted model. Journal of Applied Biobehavioral Research, 1, 3-38.
Harlow, L., Rose, J., Morokoff, P., Quina, K., Mayer, K., Mitchell, K., & Schnoll, R. (1998). Women HIV
sexual risk takers: Related behaviors, interpersonal issues & attitudes. Women's Health: Research
on Gender, Behavior and Policy, 4, 407-439.
Prochaska, J. O., Redding, C. A., Harlow, L. L., Rossi, J. S., & Velicer, W. F. (1994a). The Transtheo-
retical model and HIV prevention: A review. Health Education Quarterly, 21, 45-60.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., Fiore,
C., Harlow, L. L., Redding, C. A., Rosenbloom, D., & Rossi, S. R. (1994b). Stages of change and
decisional balance for 12 problem behaviors. Health Psychology, 13, 39-46.
Author Index
A c
Abelson, R. P., 5, 6, 8, 11, 12, 25 Campbell, D. T., 14, 27
Aiken, L. S., 4, 9, 16, 23, 24, 26, 33, 35, 39, 44, Campbell, K. T., 177, 797
45, 46, 47, 59, 61, 177, 797 Carmer, S. G., 24, 25
Aldrich, J. H., 154, 173 Cattell, R. B., 206, 210, 216
Allison, P. D., 33, 39 Chassin, L., 152, 775
Alsup, R., 30, 39 Cohen, J., 4, 6, 9, 16, 22, 23, 24, 25, 26, 33,
Alwin, D. R, 15, 25 35, 39, 44, 45,46, 47, 48, 59, 67, 67, 71,
Anastasi, A., 12, 25 74, 80, 108, 109, 113, 115, 727, 134, 136,
Anderson, R. E., 65, 80 750, 157, 161, 165, 773, 177, 181, 797,
APA Task Force on Statistical Inference, 6, 9, 229, 230,232
21,22,27 Cohen, P., 4, 9, 16, 23, 24, 26, 33, 35, 39,
44, 45, 46, 47, 59, 67, 177, 797
Collins, L. M., 13, 26, 30, 33, 39
B Collyer, C. E., 14, 26
Comrey, A. L., 7, 9, 202, 276, 225,
Baron, R. M., 30, 39 232
Bentler, P. M., 6, 8, 15, 25, 208, 276 Cook, T. D., 14, 27
Berkson, J., 5, 8 Cudeck, R., 14, 26
Black, W. C, 65, 80
Bock, R. D., 114, 727
Boomsma, A., 6, 8 D
Brandt, U., 30, 40
Britt, D. W., 37, 39 Delaney, H. D., 15,26
Browne, M. W., 14, 26 Devlin, K., 28, 37, 39
Bullock, H. E., 13, 25 Diener, E., 15,26
Byrne, B. M., 30, 39, 208, 276 Dwyer, J. H., 30,40
E Henkel, R. E., 6, 9
Hershberger, S. L., 4, 13, 15, 26, 27
Eaton, C. A., 51, 61 Horn, J. L., 13, 26, 206, 216
Embretson, S. E., 12, 26 Hosmer, D. W., 16,26, 36,39, 152, 773
Enders, C. K., 33, 39 Huberty, C., 129, 750
Hunter, J. E., 12, 27
Hwang, H., 177,198
F
T
R
Tabachnick, B. G., 4, 6, 9, 15, 27, 33, 35, 40, 46,
Raju, N. S., 5, 6, 9, 21,27 59, 62, 65, 81, 85, 102, 105, 725, 129, 757,
Rakowski, W., 62, 87, 702, 128, 151, 173, 797, 152, 153, 773, 177, 795
277, 232 Takane, Y., 177, 798
D
c
Data, 28-29
Canonical correlation (CC), 16, 177-198 Analysis from 527 women, 29, 51, 72-73, 154,
assumptions, 197 160, 231-232
G M
Principal components analysis (PCA) and Factor Significance test, 21-22, 229-230
Analysis (FA), 16, 19-20, 24, 36, 93, 95, See also specific methods
130, 199-217, 221-231 Debate, 5-6, 21
assumptions, 208 Similarities and differences
background themes, 202-203 See specific methods
central themes, 204-205 Statistical tables
example, 208-215 webpage address, 48
macro-assessment, 205-206 Structural equation modeling, 180, 185,
eigenvalues, number of, 206 205, 208
interpretability of factors, 206 Sum of squares and cross products matrix (SSCP)
number of dimensions, 206 See Matrices
percentage of variance, 205
scree plot, 206
micro-assessment, 206-207 T
loadings, 206-207
model, 203-204 Themes, 10-25
multiplicity themes, 202 Theory, 10-11
next steps, 207-208 Trace, 23, 86, 94-96, 100-102, 111-112
significance test, 205-206 Hotelling-Lawley trace, 112, 119, 126, 134,
similarities and differences, 199-201 138, 144, 183, 188
simple structure, 207 Pillai's trace, 112-113, 119, 126, 134, 138, 144,
SMC (squared multiple correlation), 183, 188
204-205 Tukey tests, 5, 24, 70, 72, 74, 114, 116,
what is PCA and FA, 199-201 229-230
when to use PCA and FA, 201-202 See also Comparisons between means
Type I error, 5, 21, 24, 72, 109, 114,
178, 184
Type II error, 5, 21, 72, 114, 185
Q
Questions to ask for multivariate methods, 37-39 V
Variable
R covariate, 31
dependent, 30-31
Ratios, 228 endogenous, 30
of covariances, 18-20, 22, 96 exogenous, 30-31
of variances, 18-20, 22, 96 independent, 30-31
Roy's greatest characteristic root (GCR), 112-113, mediating, or intervening, 30-31
134, 183 moderator, 30-31
See also MANOVA, macro-assessment Variance, 18
Variance-covariance matrix
See Matrices
s
Sample W
multiple samples, 14-15, 17
sample size, 7-8, 45-46, 48, 86, 109, 132, 154,
160, 181, 202, 225-226
SAS, 51,72, 100, 157, 155-156, 158, 178, 180, 184-185, 197,
Shared variance, 19-20, 22-23, 25, 47-48, 60, 69, 201, 203, 205-207, 222, 229-230
115, 134-136, 156-157, 162, 180, 183, Wilks's lambda, 23, 112-113, 118, 133-134, 183,
189, 204-205, 222, 227, 229-231 229