Social Research Methods/Statistical Analysis


Statistics is the applied branch of mathematics especially appropriate for a variety of research analyses.

Descriptive Statistics

Descriptive statistics are used to summarize data under study. Some descriptive statistics summarize the distribution of attributes on a single variable; others summarize the associations between variables.

Descriptive statistics summarizing the relationships between variables are called measures of association.

Many measures of association are based on a proportionate reduction of error (PRE) model. This model compares (1) the number of errors we would make in attempting to guess the attributes of a given variable for each of the cases under study if we knew nothing but the distribution of attributes on that variable, and (2) the number of errors we would make if we knew the joint distribution overall and were told, for each case, the attribute of one variable each time we were asked to guess the attribute of the other. These measures include lambda, which is appropriate for the analysis of two nominal variables; gamma, which is appropriate for the analysis of two ordinal variables; and Pearson's product-moment correlation, which is appropriate for the analysis of two interval or ratio variables.

Regression analysis represents the relationships between variables in the form of equations, which can be used to predict the values of a dependent variable on the basis of values of one or more independent variables.

Regression equations are computed on the basis of a regression line: the geometric line representing, with the least amount of discrepancy, the actual location of points in a scattergram.

Types of regression analysis include linear regression analysis, multiple regression analysis, partial regression analysis, and curvilinear regression analysis.

Inferential Statistics

Inferential statistics are used to estimate the generalizability of findings arrived at through the analysis of a sample to the larger population from which the sample has been selected. Some inferential statistics estimate the single-variable characteristics of the population; others, tests of statistical significance, estimate the relationships between variables in the population.

Inferences about some characteristic of a population must indicate a confidence interval and a confidence level. Computations of confidence levels and intervals are based on probability theory and assume that conventional probability-sampling techniques have been employed in the study.

Inferences about the generalizability, to a population, of the associations discovered between variables in a sample involve tests of statistical significance, which estimate the likelihood that an association as large as the observed one could result from normal sampling error if no such association exists between the variables in the larger population. Tests of statistical significance are also based on probability theory and assume that conventional probability-sampling techniques have been employed in the study.

The level of significance of an observed association is reported in the form of the probability that the association could have been produced merely by sampling error. To say that an association is significant at the .05 level is to say that an association as large as the observed one could not be expected to result from sampling error more than 5 times out of 100.

Social researchers tend to use a particular set of levels of significance in connection with tests of statistical significance: .05, .01, and .001. This is merely a convention, however.

A frequently used test of statistical significance in tabular data is chi-square.

The t-test is a frequently used test of statistical significance for comparing means.

Statistical significance must not be confused with substantive significance, the latter meaning that an observed association is strong, important, meaningful, or worth writing home to your mother about.

Tests of statistical significance, strictly speaking, make assumptions about data and methods that are almost never satisfied completely by real social research. Despite this, the tests can serve a useful function in the analysis and interpretation of data.

Other Multivariate Techniques

Path analysis is a method of presenting graphically the networks of causal relationships among several variables. It illustrates the primary paths through which independent variables cause dependent ones. Path coefficients represent the partial relationships between variables.
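As a rough illustration of how path coefficients express partial relationships, the sketch below assumes a hypothetical three-variable model in which education affects income both directly and indirectly through occupational prestige. The data, the variable names, and the helper functions pearson_r and two_predictor_paths are assumptions made for this example only. Each path coefficient is estimated as a standardized partial regression coefficient, which is one common way of computing them, not necessarily the only one.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def two_predictor_paths(r_y1, r_y2, r_12):
    """Standardized partial regression coefficients of y on predictors 1 and 2."""
    denom = 1 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom

# Hypothetical data for eight people (illustrative values only).
education = [10, 12, 12, 14, 16, 16, 18, 20]  # years of schooling
prestige = [30, 35, 42, 40, 55, 50, 62, 70]   # occupational prestige score
income = [22, 28, 30, 35, 48, 45, 60, 66]     # income in thousands

r_ep = pearson_r(education, prestige)
r_ei = pearson_r(education, income)
r_pi = pearson_r(prestige, income)

# Prestige has a single cause here, so its path coefficient equals the correlation.
p_prestige_education = r_ep
# Income has two causes, so each path is a partial (controlled) coefficient.
p_income_education, p_income_prestige = two_predictor_paths(r_ei, r_pi, r_ep)

direct = p_income_education
indirect = p_prestige_education * p_income_prestige

print(f"education -> prestige: {p_prestige_education:.3f}")
print(f"education -> income (direct): {direct:.3f}")
print(f"prestige -> income: {p_income_prestige:.3f}")
print(f"education -> income (indirect, via prestige): {indirect:.3f}")
print(f"direct + indirect: {direct + indirect:.3f}; correlation: {r_ei:.3f}")
```

In this simple model the direct and indirect paths from education to income add up to the overall correlation between the two variables, which is why the last line prints two matching figures.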

Time-series analysis is an analysis of changes in a variable (such as crime rates) over time.
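A minimal sketch of what analyzing change over time can look like in practice follows. The crime rates are hypothetical figures invented for illustration, and the two helpers (year_over_year_change and moving_average) are just one simple way to describe a series; real time-series work often uses more elaborate models.

```python
# Hypothetical crime rates per 100,000 residents, by year (illustrative values only).
crime_rate = {
    2016: 412.0, 2017: 398.0, 2018: 405.0,
    2019: 380.0, 2020: 371.0, 2021: 389.0,
}

def year_over_year_change(series):
    """Change from each year to the next, keyed by the later year."""
    years = sorted(series)
    return {later: series[later] - series[earlier]
            for earlier, later in zip(years, years[1:])}

def moving_average(values, window):
    """Mean of each run of `window` consecutive values, to smooth short-term swings."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

changes = year_over_year_change(crime_rate)
ordered_rates = [crime_rate[year] for year in sorted(crime_rate)]
smoothed = moving_average(ordered_rates, window=3)

for year, change in changes.items():
    print(f"{year}: {change:+.1f}")
print("3-year moving average:", [round(value, 1) for value in smoothed])
```

The year-to-year differences show short-term movement, while the moving average gives a smoother view of the longer-run direction of the variable.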

Factor analysis, feasible only with a computer, is an analytic method of discovering the general dimensions represented by a collection of actual variables. These general dimensions, or factors, are calculated hypothetical dimensions that are not perfectly represented by any of the empirical variables under study but are highly associated with groups of empirical variables. A factor loading indicates the degree of association between a given empirical variable and a given factor.

Analysis of variance (ANOVA) is based on comparing variations between and within groups and determining whether between-group differences could reasonably have occurred in simple random sampling or whether they likely represent a genuine relationship between the variables involved.

Discriminant analysis seeks to account for variation in some dependent variable. It results in an equation that scores people on the basis of a hypothetical dimension and allows us to predict their values on the dependent variable.

Log-linear models offer a method for analyzing complex relationships among several nominal variables having more than two attributes each.

Geographic Information Systems (GIS) map quantitative data that describe geographic units for a graphic display.

Key Terms that are important for understanding statistical analyses:

Analysis of variance (ANOVA): Method of analysis in which cases under study are combined into groups representing an independent variable, and the extent to which the groups differ from one another is analyzed in terms of some dependent variable. Then, the extent to which the groups differ is compared with the standard of random distribution.

Descriptive statistics: Statistical computations describing either the characteristics of a sample or the relationship among variables in a sample. Descriptive statistics merely summarize a set of sample observations, whereas inferential statistics move beyond the description of specific observations to make inferences about the larger population from which the sample observations were drawn.

Discriminant analysis: Method of analysis similar to multiple regression, except that the dependent variable can be nominal.

Factor analysis: A complex algebraic method for determining the general dimensions or factors that exist within a set of concrete observations.

Geographic Information Systems (GIS): Analytic technique in which researchers map quantitative data that describe geographic units in a graphic display.

Inferential statistics: The body of statistical computations relevant to making inferences from findings based on sample observations to some larger population.

Level of significance: In the context of tests of statistical significance, the degree of likelihood that an observed, empirical relationship could be attributable to sampling error. A relationship is significant at the .05 level if the likelihood of its being only a function of sampling error is no greater than 5 out of 100.

Linear regression analysis: A form of statistical analysis that seeks the equation for the straight line that best describes the relationship between two ratio variables.

Log-linear analysis: Data-analysis technique based on specifying models that describe the interrelationships among variables and then comparing expected and observed table-cell frequencies.

Multiple regression analysis: A form of statistical analysis that seeks the equation representing the impact of two or more independent variables on a single dependent variable.

Nonsampling error: Those imperfections of data quality that are a result of factors other than sampling error. Examples include misunderstandings of questions by respondents, erroneous recordings by interviewers and coders, and keypunch errors.

Partial regression analysis: A form of regression analysis in which the effects of one or more variables are held constant, similar to the logic of the elaboration model.

Path analysis: A form of multivariate analysis in which the causal relationships among variables are presented in a graphic format.

Proportionate reduction of error (PRE): A logical model for assessing the strength of a relationship by asking how much knowing values on one variable would reduce our errors in guessing values on the other. For example, if we know how much education people have, we can improve our ability to estimate how much they earn, thus indicating there is a relationship between the two variables.

Regression analysis: A method of data analysis in which the relationships among variables are represented in the form of an equation, called a regression equation.

Statistical significance: A general term referring to the likelihood that relationships observed in a sample could be attributed to sampling error alone.

Tests of statistical significance: A class of statistical computations that indicate the likelihood that the relationship observed between variables in a sample can be attributed to sampling error only.

Time-series analysis: An analysis of changes in a variable (such as crime rates) over time.

Curvilinear regression analysis: A form of regression
analysis that allows relationships among variables to be
expressed with curved geometric lines instead of straight

