Descriptive Statistics and Graphs: Statistics For Psychology
3 Description of Data: Statistics and Graphical Description
Outline
Introduction
Descriptive Statistics: Describing Data
Measures of Central Tendency
Properties of Measures of Central Tendency
Measures of Variability
Graphical Representation of Data
Descriptive Statistics: Describing Data
Descriptive statistics provide information about:
Center of the data
Variability of the data
Distribution of the data
They are used to understand the sample, not to make inferences about the population.
Measures of Central Tendency
Mean
The most popular measure of central tendency
Obtained by dividing the sum of the scores (X) by the number of scores (n)
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$
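As a quick check of the worked example, the same calculation can be sketched in base R. The variable names (`scores`, `mean_by_hand`, `mean_builtin`) are chosen here for illustration only.

```r
# Scores from the worked example
scores <- c(1, 2, 3, 4, 5)

# Mean by definition: sum of the scores divided by the number of scores
mean_by_hand <- sum(scores) / length(scores)

# Built-in equivalent
mean_builtin <- mean(scores)

print(mean_by_hand)   # 3
print(mean_builtin)   # 3
```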
Median and Mode
Median: The median is the point above which and below which 50 percent of the ordered data points lie.
$$\text{Median location} = \frac{n + 1}{2} = \frac{5 + 1}{2} = \frac{6}{2} = 3$$
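The median location can be turned into the median itself by sorting the data and reading off the value at that position. The following is a minimal base R sketch for an odd number of scores; the data order and variable names are illustrative assumptions.

```r
# Five scores, deliberately unsorted
scores <- c(4, 1, 5, 2, 3)
n <- length(scores)

# Median location: (n + 1) / 2
median_location <- (n + 1) / 2

# Arrange the scores in order, then pick the value at the median location
# (this direct indexing only works when n is odd)
sorted_scores <- sort(scores)
median_by_hand <- sorted_scores[median_location]

# Built-in equivalent
median_builtin <- median(scores)

print(median_location)  # 3
print(median_by_hand)   # 3
```

When n is even the location is fractional (for example 3.5 for n = 6), and the median is conventionally taken as the average of the two neighboring values; `median()` handles both cases.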
$$S_X^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
Sample variance is the average of the squared deviations from the mean.
With the denominator n - 1 in place of n, the variance is an unbiased estimator of the population variance.
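The difference between the two denominators can be seen in a short base R sketch. The scores and variable names are illustrative assumptions; note that R's built-in `var()` uses the n - 1 denominator.

```r
scores <- c(1, 2, 3, 4, 5)
n <- length(scores)

# Squared deviations from the mean: 4 1 0 1 4
squared_dev <- (scores - mean(scores))^2

# Average squared deviation (denominator n)
var_n <- sum(squared_dev) / n

# Estimator of the population variance (denominator n - 1)
var_n_minus_1 <- sum(squared_dev) / (n - 1)

# Built-in var() uses the n - 1 denominator
var_builtin <- var(scores)

print(var_n)           # 2
print(var_n_minus_1)   # 2.5
```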
Standard Deviation (S)
Standard deviation is the positive square root of the variance.
$$S_X = \sqrt{S_X^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
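A brief base R sketch of this formula, assuming the n - 1 denominator and illustrative scores, compares the hand calculation with the built-in `sd()`:

```r
scores <- c(1, 2, 3, 4, 5)
n <- length(scores)

# Positive square root of the variance (n - 1 denominator)
sd_by_hand <- sqrt(sum((scores - mean(scores))^2) / (n - 1))

# Built-in sd() also uses the n - 1 denominator
sd_builtin <- sd(scores)

print(sd_by_hand)   # square root of 2.5, about 1.58
print(sd_builtin)
```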
The Graphical Representation of Data
Stem and Leaf Graph
The stems are listed in the left-hand column, and the leaves for each stem are listed in the row to its right.
Data: 22, 25, 32, 43, 46, 49, 55, 55, 55
Stem | Leaves
1    | 0
2    | 2 5
3    | 2
4    | 3 6 9
5    | 5 5 5
6    | 0
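One way to build such a display is to split each value into its tens digit (stem) and units digit (leaf). The following base R sketch uses the data listed above; the variable names are illustrative, and only stems that actually occur in the data are printed.

```r
x <- c(22, 25, 32, 43, 46, 49, 55, 55, 55)

stems  <- x %/% 10   # tens digit
leaves <- x %% 10    # units digit

# Group the leaves by their stem
stem_leaf <- split(leaves, stems)

# Print one row per stem: stem on the left, leaves on the right
for (s in names(stem_leaf)) {
  cat(s, "|", paste(stem_leaf[[s]], collapse = " "), "\n")
}

# Base R also provides stem(), which prints a similar display;
# its default scaling may group the stems differently
stem(x)
```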
Box-whisker Plot (Box Plot)
The box plot uses quartiles as its basis.
The data are divided into three areas: the lower 25 percent, the middle 50 percent, and the upper 25 percent.
The upper and lower 25 percent of the data are represented by whiskers, and the middle 50 percent is represented by a box.
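The quartiles that bound the box can be computed directly. This base R sketch reuses the stem and leaf data as an illustrative sample and relies on the default method of `quantile()`; other quartile conventions can give slightly different values.

```r
x <- c(22, 25, 32, 43, 46, 49, 55, 55, 55)

# First quartile, median, and third quartile
q <- quantile(x, probs = c(0.25, 0.50, 0.75))

box_lower  <- q[["25%"]]   # lower edge of the box
box_median <- q[["50%"]]   # line inside the box
box_upper  <- q[["75%"]]   # upper edge of the box

print(q)

# Draw the box plot; the whiskers cover the data outside the box
boxplot(x, horizontal = TRUE, main = "Box plot of the example data")
```

In R's `boxplot()`, the whiskers by default extend only to the most extreme data point within 1.5 times the box length, and points beyond that are drawn individually.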
Histograms
Histograms represent how frequently values occur in the data.
The figure shows the histogram for data sampled from a normally distributed population.
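A histogram of this kind can be sketched in base R as follows. The sample size, seed, and number of breaks are assumptions made for illustration and are not taken from the original figure.

```r
set.seed(1)                          # for reproducibility
x <- rnorm(1000, mean = 0, sd = 1)   # hypothetical sample from a normal population

# hist() draws the plot and returns the bin breaks and counts
h <- hist(x, breaks = 30,
          main = "Histogram of a normal sample",
          xlab = "Value")

# Every observation falls into exactly one bin
total_count <- sum(h$counts)
print(total_count)
```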
Kernel Density Plots
The kernel density estimator (KDE) is a nonparametric method of estimating the probability density function (pdf) of a continuous random variable.
$$\hat{f}_h(X) = \frac{1}{n} \sum_{i=1}^{n} K_h(X - X_i)$$
Steps:
(i) Choose a kernel
(ii) Construct a kernel function for each data point
(iii) Add all the individual functions and divide by n
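The three steps can be sketched by hand in base R. This is a minimal illustration, assuming a Gaussian kernel and the rule-of-thumb bandwidth from `bw.nrd0()`; the data and the names `kernel_h` and `kde` are hypothetical choices, not part of the original slides.

```r
x <- c(22, 25, 32, 43, 46, 49, 55, 55, 55)
n <- length(x)
h <- bw.nrd0(x)   # bandwidth (assumption: rule-of-thumb default)

# (i) Choose a kernel: Gaussian, scaled so that K_h(u) = K(u / h) / h
kernel_h <- function(u) dnorm(u / h) / h

# (ii) and (iii): evaluate one kernel function per data point,
# add them up, and divide by n
kde <- function(x0) {
  sapply(x0, function(p) sum(kernel_h(p - x)) / n)
}

# Evaluate the estimate on a grid and plot it
grid <- seq(min(x) - 3 * h, max(x) + 3 * h, length.out = 200)
plot(grid, kde(grid), type = "l",
     xlab = "x", ylab = "Density", main = "Kernel density estimate")
rug(x)   # mark the data points on the axis

# A density should integrate to approximately 1
total_area <- integrate(kde, min(x) - 10 * h, max(x) + 10 * h)$value
print(total_area)
```

In practice the built-in `density()` function computes a comparable estimate, and `plot(density(x))` draws it directly.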
The ggplot2 and Lattice: Data Visualization with R
install.packages("ggplot2")
library(ggplot2)
qplot(x, geom = "histogram", binwidth = 1)
qplot(x, y, data = , color = , shape = , size = , alpha = , geom = ,
      method = , formula = , facets = , xlim = , ylim = ,
      xlab = , ylab = , main = , sub = )
Each argument that is used needs to be specified.
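For a concrete, filled-in call, the following sketch draws a histogram with ggplot2. Because `qplot()` is deprecated as of ggplot2 3.4.0, the sketch uses the equivalent `ggplot()` interface instead of the template above. It assumes ggplot2 is already installed, and the data frame `df` and its column `score` are hypothetical.

```r
library(ggplot2)   # assumes ggplot2 is installed

set.seed(1)
# Hypothetical data: 200 rounded scores from a normal population
df <- data.frame(score = round(rnorm(200, mean = 50, sd = 10)))

# Histogram with a bin width of 1, plus axis labels and a title
p <- ggplot(df, aes(x = score)) +
  geom_histogram(binwidth = 1) +
  labs(x = "Score", y = "Count", title = "Histogram of hypothetical scores")

print(p)
```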