Multivariate Analysis 6th Edition Afifi
Practical Multivariate
Analysis
Sixth Edition
Abdelmonem Afifi
Susanne May
Robin A. Donatello
Virginia A. Clark
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made
to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all
materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future
reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized
in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying,
microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Preface
Authors
4 Data visualization
4.1 Introduction
4.2 Univariate data
4.3 Bivariate data
4.4 Multivariate data
4.5 Discussion of computer programs
4.6 What to watch out for
4.7 Summary
4.8 Problems
5 Data screening and transformations
5.1 Transformations, assessing normality and independence
5.2 Common transformations
5.3 Selecting appropriate transformations
5.4 Assessing independence
5.5 Discussion of computer programs
5.6 Summary
5.7 Problems
II Regression Analysis
7 Simple regression and correlation
7.1 Chapter outline
7.2 When are regression and correlation used?
7.3 Data example
7.4 Regression methods: fixed-X case
7.5 Regression and correlation: variable-X case
7.6 Interpretation: fixed-X case
7.7 Interpretation: variable-X case
7.8 Other available computer output
7.9 Robustness and transformations for regression
7.10 Other types of regression
7.11 Special applications of regression
7.12 Discussion of computer programs
7.13 What to watch out for
7.14 Summary
7.15 Problems
Appendix A
A.1 Data sets and how to obtain them
A.2 Chemical companies’ financial data
A.3 Depression study data
A.4 Financial performance cluster–analysis data
A.5 Lung cancer survival data
A.6 Lung function data
A.7 Parental HIV data
A.8 Northridge earthquake data
A.9 School data
A.10 Mice data
Bibliography
Index
Preface
The first edition of this book appeared in 1984 under the title “Computer Aided Multivariate
Analysis.” The title was chosen in order to distinguish it from other books that were more theoretically
oriented. By the time we published the fifth edition in 2012, it was impossible to think of a book on
multivariate analysis for scientists and applied researchers that is not computer oriented. We therefore
decided at that time to change the title to “Practical Multivariate Analysis” to better characterize
the nature of the book. Today, we are pleased to present the sixth edition.
We wrote this book for investigators, specifically behavioral scientists, biomedical scientists,
and industrial or academic researchers, who wish to perform multivariate statistical analyses and
understand the results. We expect the readers to be able to perform the analyses and understand the
results, but also expect them to know when to ask for help from an expert on the subject. The book can
be used either as a self-guided textbook or as a text in an applied course in multivariate analysis.
In addition, we believe that the book can be helpful to many statisticians who have been trained
in conventional mathematical statistics and who are now working as statistical consultants and need to
explain multivariate statistical concepts to clients with a limited background in mathematics.
We do not present mathematical derivations of the techniques; rather we rely on geometric and
graphical arguments and on examples to illustrate them. The mathematical level has been deliberately
kept low. While the derivations of the techniques are referenced, we concentrate on applications
to real-life problems, which we feel are the ‘fun’ part of multivariate analysis. To this end, we
assume that the reader will use a packaged software program to perform the analysis. We discuss
specifically how each of four popular and comprehensive software packages can be used for this
purpose. These packages are R, SAS, SPSS, and Stata. The book can be used, however, in conjunction
with all other software packages since our presentation explains the output of most standard
statistical programs.
We assume that the reader has taken a basic course in statistics that includes tests of hypotheses
and covers one-way analysis of variance.
Part Two covers regression analysis. Chapter 7 deals with simple linear regression and is included
for review purposes to introduce our notation and to provide a more complete discussion
of outliers and diagnostics than is found in some elementary texts. Chapters 8-10 are concerned
with multiple linear regression. Multiple linear regression is used very heavily in practice and provides
the foundation for understanding many concepts relating to residual analysis, transformations,
choice of variables, missing values, dummy variables, and multicollinearity. Since these concepts
are essential to a good grasp of multivariate analysis, we thought it useful to include these chapters
in the book.
Chapters 11-18 might be considered the heart of multivariate analysis. They include chapters on
discriminant analysis, logistic regression analysis, survival analysis, principal components analysis,
factor analysis, cluster analysis, log-linear analysis, and correlated outcomes regression. The
multivariate analyses have been discussed more as separate techniques than as special cases of some
general framework. The advantage of this approach is that it allows us to concentrate on explaining
how to analyze a certain type of data from readily available computer programs to answer realistic
questions. It also enables the reader to approach each chapter independently. We did include
interspersed discussions of how the different analyses relate to each other in an effort to describe the ‘big
picture’ of multivariate analysis.
Acknowledgements
We would like to express our appreciation to our colleagues and former students and staff who
helped us over the years, both in the planning and preparation of the various editions. These include
our colleagues Drs. Carol Aneshensel, Roger Detels, Robert Elashoff, Ralph Frerichs, Mary Ann
Hill, and Roberta Madison. Our former students include Drs. Stella Grosser, Luohua Jiang, Jack
Lee, Steven Lewis, Tim Morgan, Leanne Streja, and David Zhang. Our former staff includes Ms.
Dorothy Breininger, Jackie Champion, and Anne Eiseman. In addition, we would like to thank Ms.
Meike Jantzen and Mr. Jack Fogliasso for their help with the references and typesetting.
We also thank Rob Calver and Lara Spieker from CRC Press for their very capable assistance
in the preparation of the sixth edition.
We especially appreciate the efforts of the staff of the UCLA Institute for Digital Research and
Education in putting together the UCLA web site of examples from the book (referenced above).
Our deep gratitude goes to our spouses, Marianne Afifi, Bruce Jacobson, Ian Donatello, and
Welden Clark, for their patience and encouragement throughout the stages of conception, writing,
and production of the book. Special thanks go to Welden Clark for his expert assistance and
troubleshooting of earlier electronic versions of the manuscript.
Abdelmonem Afifi
Susanne May
Robin A. Donatello
Virginia A. Clark
Authors
Abdelmonem Afifi, Ph.D., has been Professor of Biostatistics in the School of Public Health,
University of California, Los Angeles (UCLA) since 1965, and served as the Dean of the School from
1985 until 2000. His research includes multivariate and multilevel data analysis, handling missing
observations in regression and discriminant analyses, meta-analysis, and model selection. Over
the years, he taught well-attended courses in biostatistics for public health students and clinical
research physicians, and doctoral-level courses in multivariate statistics and multilevel modeling. He
has authored many publications in statistics and health-related fields, including two widely used
books (with multiple editions) on multivariate analysis. He received several prestigious awards for
excellence in teaching and research.
Susanne May, Ph.D., is a Professor in the Department of Biostatistics at the University of
Washington in Seattle. Her areas of expertise and interest include clinical trials, survival analysis, and
longitudinal data analysis. She has more than 20 years of experience as a statistical collaborator
and consultant on health-related research projects. In addition to a number of methodological and
applied publications, she is a coauthor (with Drs. Hosmer and Lemeshow) of Applied Survival
Analysis: Regression Modeling of Time-to-Event Data. Dr. May has taught courses on introductory
statistics, clinical trials, and survival analysis.
Virginia A. Clark, Ph.D., was professor emerita of Biostatistics and Biomathematics at UCLA.
For 27 years, she taught courses in multivariate analysis and survival analysis, among others. In
addition to this book, she is coauthor of four books on survival analysis, linear models and analysis
of variance, and survey research, as well as an introductory book on biostatistics. She published
extensively in statistical and health science journals.
Part I
Chapter 1
What is multivariate analysis?
framework can be tested to determine if they are consistent with the data. An example of such model
testing is given in Aneshensel and Frerichs (1982).
Data from the first time period of the depression study are described in Chapter 3. Only a subset
of the factors measured on a subsample of the respondents is included in this book’s web site in
order to keep the data set easily comprehensible. These data are used several times in subsequent
chapters to illustrate some of the multivariate techniques presented in this book.
Logistic regression
An online movie streaming service has classified movies into two distinct groups according to
whether they have a high or low proportion of the viewing audience when shown. The company
also records data on features such as the length of the movie, the genre, and the characteristics
of the actors. An analyst would use logistic regression because some of the data do not meet the
assumptions for statistical inference used in discriminant function analysis, but they do meet the
assumptions for logistic regression. From logistic regression we derive an equation to estimate the
probability of capturing a high proportion of the target audience.
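As a rough illustration of the equation mentioned above, the following Python sketch shows how a fitted logistic equation turns movie features into an estimated probability. The function name, the two features, and the coefficient values are all hypothetical and chosen only for illustration; they are not estimates from any data set in the book.

```python
import math

def logistic_probability(intercept, coefficients, features):
    """Estimated probability of the 'high audience share' group."""
    # Linear predictor: intercept plus the weighted sum of the features.
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    # The logistic function maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients for (length in hours, 1 if comedy else 0).
# These numbers are made up for illustration only.
HYPOTHETICAL_INTERCEPT = -1.0
HYPOTHETICAL_COEFFICIENTS = [0.5, 0.8]

# A two-hour comedy under the hypothetical equation.
prob = logistic_probability(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFFICIENTS, [2.0, 1.0])
print(round(prob, 3))
```

In practice the coefficients would be estimated from the data by one of the packages discussed in the book rather than typed in by hand.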
Poisson regression
In a health survey, middle school students were asked how many visits they made to the dentist in
the last year. The investigators are concerned that many students in this community are not receiving
adequate dental care. They want to determine what characterizes how frequently students go to the
dentist so that they can design a program to improve utilization of dental care. Visits per year are
count data and Poisson regression analysis provides a good tool for analyzing this type of data.
Poisson regression is covered in the logistic regression chapter.
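To make the idea of modeling count data concrete, here is a minimal Python sketch of the usual Poisson regression form, in which the logarithm of the expected count is a linear function of the predictors. The predictors (dental insurance and grade level) and the coefficient values are assumptions invented for this illustration, not results from the survey described above.

```python
import math

def expected_visits(intercept, coefficients, features):
    """Expected number of visits under a log-linear (Poisson) model."""
    # The linear predictor is on the log scale, so exponentiate it.
    log_mean = intercept + sum(b * x for b, x in zip(coefficients, features))
    return math.exp(log_mean)

def poisson_pmf(k, mean):
    """Probability of observing exactly k events for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Hypothetical coefficients for (1 if insured else 0, grade level minus 6).
# These values are made up for illustration only.
mean_visits = expected_visits(0.1, [0.6, -0.05], [1, 2])

# Under this hypothetical model, the chance that such a student made no visits.
p_no_visit = poisson_pmf(0, mean_visits)
print(round(mean_visits, 3), round(p_no_visit, 3))
```

A statistical package would estimate the intercept and coefficients from the observed counts; the sketch only shows how fitted values are interpreted.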
Survival analysis
An administrator of a large health maintenance organization (HMO) has collected data for a number
of years on length of employment in years for their physicians who are either family practitioners or
internists. Some of the physicians are still employed, but many have left. For those still employed,
the administrator can only know that their ultimate length of employment will be greater than their
current length of employment. The administrator wishes to describe the distribution of length of
employment for each type of physician, determine the possible effects of factors such as gender and
location of work, and test whether or not the length of employment is the same for two specialties.
Survival analysis, or event history analysis (as it is often called by behavioral scientists), can be used
to analyze the distribution of time to an event such as quitting work, having a relapse of a disease,
or dying of cancer.
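One common way to describe the distribution of time to an event when some observations are censored is the Kaplan-Meier (product-limit) estimate. The Python sketch below is a minimal version of that estimate; the employment durations are invented for illustration, with physicians who are still employed treated as censored.

```python
def kaplan_meier(durations, left_job):
    """Return [(time, estimated survival probability)] at each departure time.

    durations: years of employment observed so far for each physician.
    left_job:  True if the physician left (event), False if still employed
               (censored, so the true duration exceeds the observed one).
    """
    # Distinct times at which at least one departure was observed.
    event_times = sorted({t for t, e in zip(durations, left_job) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Everyone whose observed duration reaches t is still at risk at t.
        at_risk = sum(1 for d in durations if d >= t)
        departures = sum(1 for d, e in zip(durations, left_job) if d == t and e)
        # Multiply by the conditional probability of staying beyond time t.
        survival *= 1.0 - departures / at_risk
        curve.append((t, survival))
    return curve

# Made-up data: seven physicians, three of them still employed (censored).
years = [2, 3, 3, 5, 8, 8, 10]
left = [True, True, False, True, False, True, False]
curve = kaplan_meier(years, left)
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored physicians contribute to the number at risk up to their current length of employment but never count as departures, which is how the estimate uses the partial information described above.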
Factor analysis
An investigator has asked each respondent in a survey whether he or she strongly agrees, agrees, is
undecided, disagrees, or strongly disagrees with 15 statements concerning attitudes toward inflation.
As a first step, the investigator will do a factor analysis on the resulting data to determine which
statements belong together in sets that are uncorrelated with other sets. The particular statements
that form a single set will be examined to obtain a better understanding of attitudes toward inflation.
Scores derived from each set or factor will be used in subsequent analyses to predict consumer
spending.
Cluster analysis
Investigators have made numerous measurements on a sample of patients who have been classified
as being depressed. They wish to determine, on the basis of their measurements, whether these
patients can be classified by type of depression. That is, is it possible to determine distinct types of
depressed patients by performing a cluster analysis on patient scores on various tests?
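As a small illustration of what such a grouping might look like, the Python sketch below uses k-means, which is only one of several clustering methods and is chosen here as an assumption rather than as the method prescribed by the text. The patient scores, the number of clusters, and the starting centers are all made up.

```python
def nearest(point, centers):
    """Index of the center closest to the point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    return distances.index(min(distances))

def k_means(points, centers, iterations=10):
    """Alternate between assigning points and recomputing cluster means."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # New center is the mean of its group; keep the old one if the group is empty.
        centers = [
            tuple(sum(v) / len(g) for v in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    labels = [nearest(p, centers) for p in points]
    return centers, labels

# Made-up scores on two tests for six patients.
scores = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (8.5, 7.5), (7.8, 8.3)]
centers, labels = k_means(scores, centers=[scores[0], scores[3]])
print(labels)
```

With real data the number of clusters is rarely known in advance, and results can depend on the starting centers and on how the measurements are scaled.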