
Experimental Psychology Reviewer


MODULE 1: Introduction to Experimental Psychology

Science

★ Lt. scientia (knowledge)
★ Connotes content (what we know) and process (a systematic way of gathering data, noting relationships, and offering explanations).

Methodology
★ Consists of the scientific techniques we use to collect and evaluate data.

Data
★ Are the facts we gather using scientific methods.

Nonscientific inference

STEREOTYPING
★ Is a cognitive process in that it involves associating a characteristic with a group, but it can also involve, lead to, or serve to justify an affective reaction toward people from other groups.

GAMBLER’S FALLACY
★ Is the belief that the probability of an outcome after a series of outcomes is not the same as the probability of a single outcome, even when the events are independent.

OVERCONFIDENCE BIAS
★ Is the tendency people have to be more confident in their own abilities, such as driving, teaching, or spelling, than is objectively reasonable.
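A minimal sketch of why the gambler's fallacy is a fallacy, assuming a fair coin and independent flips (the function name is made up for illustration and is not part of the reviewer). It lists every equally likely sequence of flips and checks how often heads follows a streak of heads.

```python
from itertools import product

def prob_heads_after_streak(streak_length):
    """Exact P(next flip is heads | the previous `streak_length` flips were heads).

    Assumes a fair coin with independent flips, so every sequence of
    streak_length + 1 flips is equally likely.
    """
    sequences = list(product("HT", repeat=streak_length + 1))
    # Keep only the sequences that begin with the required streak of heads
    after_streak = [s for s in sequences
                    if all(flip == "H" for flip in s[:streak_length])]
    # Of those, count the ones where the final flip is also heads
    heads_next = [s for s in after_streak if s[-1] == "H"]
    return len(heads_next) / len(after_streak)

for streak in (0, 3, 8):
    print(streak, prob_heads_after_streak(streak))
```

For any streak length the result matches the probability of a single flip, which is the point the definition makes: a run of heads does not make tails "due".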

The need for Scientific Methodology

COMMONSENSE PSYCHOLOGY (Heider, 1958)
★ The kind of everyday, nonscientific data gathering that shapes our expectations and beliefs and directs our behavior towards others
o Commonsense beliefs about our behavior are derived from the data we collect from our own experience and what we have learned from others.
o Absence makes the heart grow fonder – Out of sight, out of mind.
o Opposites attract – Birds of a feather flock together.

Nonscientific sources of data

CONFIRMATION BIAS
★ Is a type of cognitive bias that involves favoring information that confirms your previously existing beliefs or biases.
★ E.g., lunacy, Friday the 13th, horoscope, born in February.
★ Sources we are more likely to believe: people who are popular, attractive, high in status, seemingly expert and confident

The characteristics of a modern science

THE SCIENTIFIC MENTALITY
★ “Behavior must follow a natural order; therefore, it can be predicted”
§ There are specifiable reasons for the way people behave, and these reasons can be discovered through research (determinism)

GATHERING EMPIRICAL DATA
★ Data that are observable or experienced
§ Can be verified or disproved through investigation

SEEKING GENERAL PRINCIPLES
✦ LAWS – when principles have the generality to be applied in all situations
✦ THEORIES – general principle or set of rules that can be used to predict new examples of behavior (it can explain many but not all)

GOOD THINKING
★ Being open to new ideas even when they contradict our prior beliefs and attitudes
✦ PARSIMONY (Occam’s razor) – the simplest explanation is preferred until it is ruled out by conflicting data
SELF-CORRECTION
★ “WEIGHT OF EVIDENCE” APPROACH – the more evidence that accumulates to support a particular explanation or theory, the more confidence we have that the theory is correct.

PUBLICIZING RESULTS
★ Scientific papers -> scientific journals

REPLICATION
★ We can replicate research findings of others by setting up the same or similar conditions and observing whether or not the outcome is the same.

Objectives of psychological science

DESCRIPTION
★ Initial step in understanding any phenomenon
★ Descriptive research (case study, field study)

PREDICTION
★ Refers to the capacity for knowing in advance when certain behaviors would be expected to occur.
★ Research designs such as correlational and quasi-experimental designs

EXPLANATION
★ Includes knowledge of the conditions that reliably reproduce the occurrence of the behavior
★ The use of experimental research design (true experiments -> cause and effect)

CONTROL
★ Refers to the application of what has been learned about behavior
★ Testing the effects of specified conditions on behavior and changing behavior

Basic and Applied Research
★ BASIC RESEARCH - Research designed to test theories or to explain psychological phenomena in humans and animals
★ APPLIED RESEARCH - Research that is designed to solve real-world problems (like helping patients to deal with grief or improving employee morale)

Tools of psychological science

OBSERVATION
★ Systematic noting and recording of events

MEASUREMENT
★ Is assigning numerical values to objects or events or their characteristics according to conventional rules.

EXPERIMENTATION
★ Is a process undertaken to show that certain kinds of events are predictable under certain, specifiable situations.
★ Predictions must be testable
§ Must be objective (not biased), ethical

Inductive and deductive reasoning
Reasoning and logic represent vital components of the research process.
★ DEDUCTIVE REASONING (THEORY TESTING) - Uses logic that moves from the general statement to the specific
★ INDUCTIVE REASONING (THEORY BUILDING) - Uses logic that is launched from a specific case or occurrence and moves to inferences concerning the general
Scientific Explanation in Psychological Science

IDENTIFYING ANTECEDENT CONDITIONS
★ Are the circumstances that come before the event or behavior that we want to explain

COMPARING TREATMENT CONDITIONS
★ SUBJECTS (research participants)
★ TREATMENT – a specific set of antecedent conditions created by the experimenter and presented to subjects to test its effect on behavior

THE PSYCHOLOGY EXPERIMENT
★ Is a controlled procedure in which at least two different treatment conditions are applied to subjects (at least two different treatments so as to compare behavior).

ESTABLISHING CAUSE AND EFFECT
★ CAUSE AND EFFECT RELATIONSHIP – the relation between a particular behavior and a set of antecedents that always precedes it – whereas other antecedents do not – so that the set is inferred to cause the behavior.

NECESSARY vs. SUFFICIENT CONDITIONS
★ Psychologists generally look for the sufficient conditions rather than looking for the ultimate causes of behavior

MODULE 2: Research Ethics, Correlation and Quasi-Experimental Designs

ETHICS
★ It is a body of principles of right, proper or good conduct.
★ Responsible research is aimed at advancing our understanding of feelings, thoughts and behaviors in ways that will benefit humanity.
★ The researcher is legally responsible for what happens to research participants of the study.

GENERAL ETHICAL PRINCIPLE
★ The psychologist/researcher should decide whether his or her research is potentially valuable for psychological science and human welfare.
★ “Whether we work with animals or humans, we must always consider their SAFETY and WELFARE”

THE PARTICIPANT AT MINIMAL RISK
★ The first major consideration for implementing the general principle is the decision as to whether the participant will be a “subject at risk” or a “subject at minimal risk” according to recognized standards.

SUBJECT AT RISK
★ Any individual who may be exposed to the possibility of injury, including physical, psychological, or social injury, as a consequence of participation as a subject in any research.

THE RESEARCHER’S ETHICAL EVALUATION
★ The researcher carefully assesses the ethical acceptability of the research. If there is any question here, the researcher should seek ethical advice from colleagues or, if necessary, from the relevant university committee or professional commission.

RESPONSIBILITY
★ It is the major investigator who is responsible for all aspects of the research, including the ethical treatment of participants by all who are collaborating in the project.

INFORMED CONSENT
o A subject’s voluntary agreement to participate in a research project after the nature and purpose of the study have been explained.
o Individuals must give their consent freely (without force, duress, or coercion)
o They are free to drop out of the experiment at any time.
o Full explanation of the procedures
o Researchers must make clear the potential risks and benefits of the experiment
o All data will remain private and confidential.
o Obtained in writing
o Minor (consent from parent or legal guardian)

TYPES OF CONSENT

Direct Consent
★ This is the most preferred because agreement is obtained directly from the person to be involved in the study.

Substitute Consent
★ Or third-party consent. It is given by someone other than the person to be involved in the study. Substitute consent may be obtained when it is determined that persons do not have the capacity to make the decision or are dependent on others for their welfare.

ELEMENTS OF CONSENT

Capacity
★ This is the person’s ability to acquire and retain knowledge. The ability to evaluate the information obtained and subsequently make a choice based on this evaluation is integral to the element of capacity.

Information
★ The researcher’s duty is to see that the information given to the potential subject is designed to be fully understood and is fully understood.

Voluntariness
★ Voluntary consent is concerned with each individual’s ability to exercise free power of choice without the intervention of force, fraud, deceit, duress or other forms of constraint or coercion.

INFORMED CONSENT FOR RECORDING VOICES AND IMAGES IN RESEARCH
★ Psychologists obtain informed consent from research participants prior to recording their voices or images for data collection unless
(1) the research consists solely of naturalistic observations in public places, and it is not anticipated that the recording will be used in a manner that could cause personal identification or harm, or
(2) the research design includes deception, and consent for the use of the recording is obtained during debriefing.

OFFERING INDUCEMENTS FOR RESEARCH PARTICIPATION
★ Psychologists make reasonable efforts to avoid offering excessive or inappropriate financial or other inducements for research participation when such inducements are likely to coerce participation.
★ When offering professional services as an inducement for research participation, psychologists clarify the nature of the services, as well as the risks, obligations and limitations.

DECEPTION AND DEBRIEFING
★ The relationship between researchers and participants should be as open and honest as possible.
★ In some studies, if participants are given an accurate description of what the research is about, the research would be pointless because its validity would be destroyed. If the researcher believes that deception is justified by the prospective value of the research, and if alternative procedures are not available, then the participant must be provided with a thorough explanation as soon as possible.

FREEDOM FROM COERCION
★ It should be made clear that the participant is free to decline to participate in or to withdraw from the research at any time. Subtle forms of coercion should be avoided, such as a participant feeling a special obligation to please a professor.

PROTECTION OF PARTICIPANTS
★ If there is any risk at all, be it physical or mental discomfort, harm or danger, the researcher must inform the participants of those risks. The researcher should realize that participation in any research may produce at least some degree of stress. Thus, the participant should be carefully assured of his or her safety.
CONFIDENTIALITY
★ Any information obtained about the participant must be kept confidential, unless the participant agrees otherwise.

SCIENTIFIC FRAUD AND PLAGIARISM
★ Fraud in science is typically thought of as falsifying or fabricating data; clearly, fraud is unethical. The peer review process, replication and scrutiny by colleagues help hold fraud in check.
★ Plagiarism, representing someone else’s work as your own, is a serious breach of ethics and is also considered a type of fraud.

REPORTING RESEARCH RESULTS
★ Psychologists do not fabricate data.
★ If psychologists discover significant errors in their published data, they take reasonable steps to correct such errors in a correction, retraction, erratum or other appropriate publication means.

PUBLICATION CREDIT
★ Psychologists take responsibility and credit, including authorship credit, only for work they have actually performed or to which they have substantially contributed.
★ Principal authorship and other publication credits accurately reflect the relative scientific or professional contributions of the individuals involved, regardless of their relative status. Mere possession of an institutional position, such as department chair, does not justify authorship credit. Minor contributions to the research or to the writing for publications are acknowledged appropriately, such as in footnotes or in an introductory statement.
★ Except under exceptional circumstances, a student is listed as principal author on any multiple-authored article that is substantially based on the student's doctoral dissertation. Faculty advisors discuss publication credit with students as early as feasible and throughout the research and publication process as appropriate.

DUPLICATE PUBLICATION OF DATA
★ Psychologists do not publish, as original data, data that have been previously published. This does not preclude republishing data when they are accompanied by proper acknowledgment.

RESEARCH WITH ANIMAL SUBJECTS
★ Researchers have a responsibility to promote animal welfare whenever they use animal subjects. Animals must receive adequate physical care to stay healthy and comfortable.

HARLOW'S MONKEY EXPERIMENT
In the 1950s, Harry Harlow of the University of Wisconsin tested infant dependency using rhesus monkeys in his experiments rather than human babies. The monkey was removed from its actual mother, which was replaced with two “mothers,” one made of cloth and one made of wire. The cloth “mother” served no purpose other than its comforting feel, whereas the wire “mother” fed the monkey through a bottle. The monkey spent the majority of its day next to the cloth “mother” and only around one hour a day next to the wire “mother,” despite the association between the wire model and food. Harlow also used intimidation to show that the monkey found the cloth “mother” to be superior. He would scare the infants and watch as the monkey ran towards the cloth model. Harlow also conducted experiments which isolated monkeys from other monkeys in order to show that those who did not learn to be part of the group at a young age were unable to assimilate and mate when they got older. Harlow’s experiments ceased in 1985 due to APA rules against the mistreatment of animals as well as humans.

ALTERNATIVES TO EXPERIMENTATION: CORRELATIONAL AND QUASI-EXPERIMENTAL DESIGNS

TWO CATEGORIES OF NONEXPERIMENTAL RESEARCH METHODS
★ Correlational Design
★ Quasi-Experimental Designs
o Both tend to be high in external validity
o Both methods rely on statistical data analysis
INTERNAL VALIDITY
★ The certainty that the changes in behavior observed across treatment conditions in the experiment were actually caused by the independent variable

EXTERNAL VALIDITY
★ How well the findings of an experiment generalize or apply to people and settings that were not tested directly

CORRELATIONAL DESIGNS
★ Used to establish relationships among preexisting behaviors and can be used to predict one set of behaviors from others (e.g. predicting your college grades from your entrance exam)
★ Can show relationships between sets of antecedent conditions and behavioral effects (e.g. relationship between smoking and lung cancer)
★ Antecedents are preexisting; conditions are not manipulated or controlled by researchers
★ Harder to establish a cause-and-effect relationship
★ One that is designed to determine the correlation (degree of relationship) between two traits, behaviors, or events; when two things are correlated, changes in one are associated with changes in another
★ Used often to explore behaviors that are not yet understood
★ Asking how well the measures go together
★ Once the correlation is known, predictions can be made; higher correlation means more accurate predictions
★ Researcher measures events without attempting to alter the antecedent conditions in any way
★ SIMPLE CORRELATIONS: relationships between pairs of scores from each subject; the Pearson Product Moment Correlation Coefficient (r) is used to compute them; when r is computed, the outcome can only be a POSITIVE, NEGATIVE, OR NO RELATIONSHIP; uses a General Linear Model for statistical formulas (assumes the direction of the relationship between X and Y generally stays the same)
★ Values of a correlation coefficient can only vary between -1.00 and +1.00
★ SCATTERPLOTS (or SCATTERGRAPHS): visual representations of the scores belonging to each subject in the study; dots show if the pattern is positive, negative, or non-existent
★ REGRESSION LINES: lines of best fit drawn on the scatterplot; the direction of the line corresponds to the direction of the relationship; the line represents the mathematical equation that describes the linear relationship between two scores
★ POSITIVE CORRELATION: value of r is positive; also called a direct relationship
★ NEGATIVE CORRELATION: value of r is negative; aka inverse relationship
★ The direction of the relationship (positive or negative) does not affect the ability to predict scores (e.g. you could predict vocabulary just as well from a negative correlation as you could from a positive one)
★ If r is near zero, there is NO linear RELATIONSHIP
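A minimal sketch of how a Pearson r could be computed from pairs of scores, to show where the sign and the -1.00 to +1.00 range come from. The function name and all the scores are hypothetical, invented only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores x[i], y[i]."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviation of each score from its own mean
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    ss_x = sum(dx * dx for dx in dev_x)
    ss_y = sum(dy * dy for dy in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sum_products / sqrt(ss_x * ss_y)

# Hypothetical scores for five subjects
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 68]   # rises with hours: positive (direct)
errors_made = [9, 8, 6, 5, 2]       # falls with hours: negative (inverse)

r_positive = pearson_r(hours_studied, exam_score)
r_negative = pearson_r(hours_studied, errors_made)
print(round(r_positive, 2), round(r_negative, 2))
```

Both values are far from zero, so either one would support prediction about equally well; only the direction differs.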
★ Correlation coefficients can be affected by a nonlinear trend, range truncation, and outliers
★ RANGE TRUNCATION: artificial restriction of the range of values of X or Y (e.g. shoe size increases as children age, but the correlation shrinks when only a narrow age range is sampled)
★ OUTLIERS: extreme scores that can affect correlation coefficients

CORRELATION DOES NOT MEAN CAUSATION
o Causal direction between two variables cannot be determined by simple correlations
★ BIDIRECTIONAL CAUSATION: behaviors affecting each other
★ THIRD VARIABLE PROBLEM: a third agent making the two behaviors seem like they are related

★ COEFFICIENT OF DETERMINATION (r2): estimates the amount of variability in scores on one variable that can be explained by the other variable
o e.g. firm handshake and first impressions experiment (positivity); r=.56, r2=.31
o 31% of differences in positivity scores can be accounted for by the firmness of handshake
o Cohen (1988): anything over .25 is considered a strong association

★ LINEAR REGRESSION ANALYSIS: when two behaviors are strongly related, the researcher estimates a score on one of the measured behaviors from a score on the other
o e.g. hours of watching TV and vocabulary test scores were correlated -> substitute a person's viewing time into the equation for the regression line so we can estimate their score on the vocabulary test
o The stronger the correlation, the better the prediction

★ MULTIPLE CORRELATION (R): a measure predicted by multiple other measured behaviors; tests the relationship of several predictor variables (X1, X2, X3...) with a criterion variable; similar to r, but R allows us to use information provided by two or more measured behaviors to predict another measured behavior when we have that info available; R2 can be used the same way as r2
★ MULTIPLE REGRESSION ANALYSIS: used when you want to predict the score on one behavior from the scores on the others when two or more related behaviors are correlated
★ THIRD VARIABLE: another agent that may cause the two behaviors to appear related (e.g. amount of TV watched, age, and vocabulary; TV time and vocabulary are each age-related, so age is a third variable that can make them appear related to each other)
★ PARTIAL CORRELATION: allows the statistical influence of one measured variable to be held constant while computing the correlation between the other two (e.g. if age is a third variable that is largely responsible, controlling the contribution of age should decrease the correlation between TV time and vocabulary -> a weaker correlation after controlling for age suggests that age was the factor making the correlation so high)
★ CAUSAL MODELING: creating and testing models that may suggest cause and effect relationships among behaviors
★ CAUSAL MODELING TOOLS IN CORRELATION-BASED DESIGNS
o Path Analysis
o Cross-Lagged Panel
★ FACTOR ANALYSIS: Determines subsets of correlated variables within a larger set of variables
★ PATH ANALYSIS: Uses beta weights from multiple regression analysis to generate possible direction of cause and effect from correlated variables
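A rough sketch of three calculations named in this section: the coefficient of determination, prediction from a regression line, and a partial correlation. Only r = .56 comes from the handshake example in the text. The TV and vocabulary numbers, the function names, and the three correlations passed to `partial_r` are hypothetical. The partial correlation uses the standard first-order formula, which the reviewer itself does not give.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sp_xy / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z held constant."""
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5

# Coefficient of determination for the handshake example (r = .56)
r_squared = round(0.56 ** 2, 2)
print(r_squared)

# Hypothetical data: daily TV hours and vocabulary test scores
tv_hours = [1, 2, 3, 4, 5]
vocab = [48, 46, 44, 42, 40]
slope, intercept = fit_line(tv_hours, vocab)
# Substitute a new viewing time into the regression equation
predicted_vocab = slope * 6 + intercept
print(slope, intercept, predicted_vocab)

# Hypothetical correlations: TV-vocabulary .48, TV-age .80, vocabulary-age .60
print(partial_r(0.48, 0.80, 0.60))
```

With these made-up correlations, the TV-vocabulary relationship drops to about zero once age is held constant, which is the pattern the third-variable example describes.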
★ CROSS-LAGGED PANEL: Measures the same pair of variables at two different points in time; looks at a pattern of correlations across time for possible direction of cause and effect

QUASI-EXPERIMENTAL DESIGNS
★ Conditions cannot be manipulated or controlled
★ Also known as “natural experiment”. Quasi means “seeming like”: these designs seem like real experiments but lack one or more essential elements, such as manipulation of antecedents or random assignment to treatment conditions
★ Subjects are selected on the basis of preexisting conditions
★ Used to compare behavioral differences associated with different types of subjects (e.g. normal or schizophrenic children), naturally occurring situations (raised in a one- or two-parent home), or unusual events (surviving a hurricane) that cannot be manipulated by an experimenter
★ Can increase understanding of biological, environmental, cognitive, and genetic attributes.
★ It often allows the researcher more systematic control over the situation compared to other nonexperimental designs (phenomenology, case and field studies, etc.)
★ It is used whenever subjects cannot be assigned at random to receive different experimental manipulations or treatments (e.g. effect of lighting on worker productivity in two different companies; differences could be from workers’ abilities or the lighting)
★ Goal of a quasi-experiment is to compare different groups of subjects, looking for differences between them, or looking for changes over time in the same group of subjects
★ AN IMPORTANT DIFFERENCE BETWEEN EXPERIMENTS AND QUASI-EXPERIMENTS IS THE AMOUNT OF CONTROL THE RESEARCHER HAS OVER SUBJECTS WHO RECEIVE TREATMENT

EX POST FACTO: Explores characteristics, behaviors or effects of naturally occurring events in preexisting groups of subjects.
NONEQUIVALENT GROUPS: Compares the effects of different treatment conditions on preexisting groups of subjects.
LONGITUDINAL: Investigates changes across time by measuring behavior of the same group of subjects at different points of time.
CROSS-SECTIONAL: Investigates changes across time by comparing groups of subjects already at different stages at a single point of time
PRETEST/POSTTEST: Explores the effect of an event (or treatment) by comparing behavior before and after the event (or treatment)

MODULE 3: Formulating Hypothesis, Basics of Experimentation and Controlling Extraneous Variables

Formulating Hypothesis

Hypothesis
★ Thesis, or main idea of an experiment
★ It is a statement about a predicted relationship between at least two variables

Nonexperimental hypothesis
★ Statement of your predictions of how events, traits or behaviors might be related; not a statement about cause and effect.

Experimental Hypothesis
★ Tentative explanation of an event or behavior. It is a statement that predicts the effects of specified antecedent conditions on a measured behavior.

Characteristics of an Experimental Hypothesis
o Synthetic statement - one that can be demonstrated to be either true or false.
o Nonsynthetic, analytic statement - one that is always true (to be avoided)
o Nonsynthetic, contradictory statement - one with elements that oppose each other, so it is always false (to be avoided)
o Testable statement - the means for manipulating antecedent conditions and measuring the resulting behavior must exist
o Falsifiable statement - disprovable by research findings
o Fruitful - leads to new studies or research
o Parsimonious statement - simple explanation

Process of Formulating Hypothesis
★ DEDUCTION – is the process of reasoning from general principles to predictions about specific instances. Through deduction, we may rigorously test the implications of a theory
★ INDUCTION – is the process of reasoning from specific cases to more general principles. Through induction, we may devise general principles, or theories, that may be used to organize, explain and predict behavior until more satisfactory principles are found

Ways of finding hypothesis
o Building on Prior Research
o Serendipity and the Windfall Hypothesis
o Intuition
o When All Else Fails…

BUILDING ON PRIOR RESEARCH
o The most useful way of finding a hypothesis is by working from research that has already been done
o Prior experimental research is an excellent source of hypotheses.
o Reading more of the previous research and studies
o READ, READ, READ!

SERENDIPITY AND THE WINDFALL HYPOTHESIS
★ SERENDIPITY – is the knack of finding things that are not being sought
o A discovery was made when none was intended
o It can be useful in generating new hypotheses only when we are open to new possibilities
o It is not just a matter of luck; it is also a matter of knowing enough to use an opportunity.

★ INTUITION – knowing without reasoning.
o It is probably closest to phenomenology
o Intuition is more accurate when it comes from experts. Good hunches are really an unconscious result of our own expertise in an area (Simon, 1967).
o Intuition should not destroy objectivity

WHEN ALL ELSE FAILS…
★ Pick a psychology journal and READ. Narrow down your interest into one or two broad topics. Locate the latest research based on your interest.
★ Try observation. Forming hypothesis -> antecedent conditions
★ Try to turn your attention to a real-world problem. Once the cause can be determined, a solution often suggests itself.
★ Set realistic goals for yourself. SMART

POINTS TO REMEMBER IN HYPOTHESIS TESTING
★ When predicting a relationship: A two-tailed test is used when we predict a relationship, but do not predict the direction in which scores will change. A one-tailed test is used when we predict the direction in which scores will change.
★ Experimental hypotheses describe the predicted relationship we may or may not find.
★ Statistical hypotheses describe the population parameters the sample data represent if the predicted relationship does or does not exist.
★ The alternative hypothesis (Ha) describes the population parameters the sample data represent if the predicted relationship exists.
★ The null hypothesis (Ho) describes the population parameters the sample data represent if the predicted relationship does not exist.
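A rough sketch of how the one-tailed or two-tailed choice feeds into the decision about the null hypothesis, using the p-value and alpha rule that the next points describe. It assumes a z-test with an already computed z statistic. The function names and the z value of 1.70 are hypothetical.

```python
from math import erfc, sqrt

def z_test_p_value(z, tails=2):
    """p-value for an observed z statistic, assuming the null hypothesis is true.

    tails=2: a relationship is predicted but not its direction.
    tails=1: the predicted direction corresponds to a positive z.
    """
    if tails == 2:
        return erfc(abs(z) / sqrt(2))
    if tails == 1:
        return 0.5 * erfc(z / sqrt(2))
    raise ValueError("tails must be 1 or 2")

def decide(p_value, alpha=0.05):
    """Reject Ho when the p-value falls below alpha; otherwise retain it."""
    return "reject Ho" if p_value < alpha else "retain Ho"

z_obtained = 1.70  # hypothetical result
for tails in (2, 1):
    p = z_test_p_value(z_obtained, tails)
    print(tails, round(p, 4), decide(p))
```

The same z leads to different decisions at alpha = 0.05: it is retained under a two-tailed test but rejected under a one-tailed test, which is why the direction has to be predicted before the data are examined.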
★ The P (p-value) is the probability of obtaining sample data at least as extreme as those observed if the null hypothesis is true.
★ The α (alpha value) is the threshold below which the p-value is considered so small that we decide to reject the null hypothesis. Common values are 0.05 and 0.01.
★ The decision to reject the null hypothesis and accept the alternative hypothesis is made if the p-value is less than α; otherwise the null hypothesis is retained (z-test).
★ On the other hand, for a t-test or F-test, the computed (obtained) t or F must be greater than or equal to the critical value for the alpha level set.

Basics of Experimentation
★ The Psychology Experiment - When an experiment is well conducted, it is high in internal validity

Main Features of Psychology Experiment
★ We manipulate the antecedent conditions to create at least two different treatment conditions
★ At least two treatments are required so that we can make statements about the impact of different sets of antecedents.
★ We expose subjects to different treatment conditions so that we can measure the effects of these conditions on behavior.
★ We record the responses or behaviors and compare them using statistics.

Independent Variables
★ Or simply IV, is the dimension that the experimenter intentionally manipulates
★ It is the antecedent the experimenter chooses to vary
★ Are sometimes aspects of the physical environment that can be brought under the experimenter’s direct control.
o E.g. Lighting (bright or dim); noise levels (soft or loud); difficulty (easy or hard); psychological states (introvert versus extrovert)
o At least two different conditions are required -> levels of IV
o E.g. blue, black, green (levels of IV) -> IV = color

Dependent Variable
★ It is being measured to know whether changes in the levels of IV have altered behavior.
★ Is the particular behavior we expect to change because of our experimental treatment.
★ In an experiment, we are testing effects of IV on the DV.
★ Since we manipulate the IV and measure its effects on the DV, dependent variables are sometimes called dependent measures.

Other terms used to describe IVs and DVs

Conceptual and Operational definitions
★ CONCEPTUAL DEFINITION - Definition that is used in everyday language

OPERATIONAL DEFINITION
★ Definition that is used in carrying out the experiment
★ Specifies the precise meaning of a variable within an experiment
★ Two kinds of operational definitions: EXPERIMENTAL and MEASURED

Experimental operational definitions
★ Explain the precise meaning of IVs
★ These definitions describe exactly what was done to create the various treatment conditions of the experiment.
★ Includes all the steps that were followed to set up each value of the IV.

Measured operational definitions
★ Are operational definitions of the dependent variable
★ Describe exactly what procedures we follow to assess the impact of different treatment conditions.
★ Include exact descriptions of the specific behaviors or responses recorded and explain how those responses are scored.
o E.g. “scores on the Culture Fair Intelligence Test” not simply “scores on an intelligence test”

Scales/Levels of Measurement
✦ NOMINAL – Classifies items into two or more distinct categories that can be named
✦ ORDINAL – The magnitude of each value is measured in the form of ranks
✦ INTERVAL – measures magnitude or quantitative size using measures with equal intervals between the values
✦ RATIO – measures magnitude or quantitative size using measures with equal intervals between all values and a true zero point.

Evaluating Operational Definitions

Reliability
★ Means consistency and dependability
★ Good operational definitions are reliable

TYPES OF RELIABILITY
★ INTERRATER RELIABILITY – the degree of agreement among different observers or raters
★ TEST-RETEST RELIABILITY – consistency between an individual’s scores on the same test taken at two or more different times
★ INTERITEM RELIABILITY – the degree to which different items measuring the same variable attain consistent results (split-half reliability and Cronbach’s alpha)

Validity
★ Refers to the principle of actually studying the variables that we intend to study.
★ It measures what it intends or purports to measure
★ It checks truthfulness

TYPES OF VALIDITY
★ FACE VALIDITY – the degree to which a manipulation or measurement technique is self-evident
★ CONTENT VALIDITY - the degree to which the content of a measure reflects the content of what is being measured
★ PREDICTIVE VALIDITY – the degree to which a measuring instrument yields information allowing prediction of actual behavior or performance
★ CONCURRENT VALIDITY – the degree to which scores on the measuring instrument correlate with another known standard for measuring the variable being studied.
★ CONSTRUCT VALIDITY – the degree to which an operational definition accurately represents the construct it is intended to manipulate or measure

Internal Validity
★ Is the degree to which a researcher is able to state a causal relationship between antecedent conditions and the subsequent observed behavior.
★ An experiment is internally valid if we can be sure that the changes in behavior that occurred across treatment conditions were caused by the IV.
★ Is one of the most important concepts in experimentation.
Three important concepts that are tied to the problem of internal validity: EXTRANEOUS VARIABLES, CONFOUNDING and THREATS TO INTERNAL VALIDITY

1. EXTRANEOUS VARIABLES - they are factors that are not the focus of the experiment but can influence the findings.

2. CONFOUNDING – an error that occurs when the value of an extraneous variable changes systematically along with the IV in an experiment; an alternative explanation for the findings that threatens internal validity.

3. CLASSIC THREATS TO INTERNAL VALIDITY

1. History - the specific events which occur between the first and second measurement.
2. Maturation - the processes within subjects which act as a function of the passage of time, e.g., if the project lasts a few years, most participants may improve their performance regardless of treatment.
3. Testing - the effects of taking a test on the outcomes of taking a second test.
4. Instrumentation - the changes in the instrument, observers, or scorers which may produce changes in outcomes.
5. Statistical regression - It is also known as regression to the mean. This threat is caused by the selection of subjects on the basis of extreme scores or characteristics. Give me forty worst students and I guarantee that they will show immediate improvement right after my treatment.
6. Selection of subjects - the biases which may result in selection of comparison groups. Randomization (random assignment) of group membership is a counter-attack against this threat. However, when the sample size is small, randomization may lead to Simpson’s paradox.
7. Simpson’s paradox - refers to a situation where you believe you understand the direction of a relationship between two variables, but when you consider an additional variable, that direction appears to reverse.
8. Experimental mortality - the loss of subjects. For example, a Web-based instruction project entitled Eruditio started with 161 subjects and only 95 of them completed the entire module. Those who stayed in the project all the way to the end may be more motivated to learn and thus achieved higher performance.
9. Selection-maturation interaction - the selection of comparison groups and maturation interacting, which may lead to confounded outcomes and the erroneous interpretation that the treatment caused the effect.
10. John Henry effect - John Henry was a worker who outperformed a machine under an experimental setting because he was aware that his performance was compared with that of a machine.

Controlling Extraneous Variables
★ One of the major goals in setting up experiments is to avoid confounding by controlling extraneous variables.
★ The IV should be the only thing that changes systematically across the conditions of the experiment
★ EVs that can threaten an experiment’s internal validity: physical, social, personality and context variables

Physical Variables
★ Aspects of testing conditions that need to be controlled
★ Day of the week, testing room, noise, distractions
★ Techniques for controlling physical variables
o Elimination
o Constancy
o Balancing

Elimination
★ A technique to control extraneous variables by removing them from the experiment
★ What necessary measures or procedures can you take to eliminate noise?
Constancy of Condition
★ A control procedure used to avoid confounding
★ Keeping all aspects of the treatment conditions (nearly) identical except for the independent variable that is being manipulated
o E.g., weather, lighting conditions, paint on the walls

Balancing
★ A technique used to control the impact of extraneous variables by distributing their effects equally among treatment conditions

Number of EVs to be Controlled
You can set up a reasonably good experiment by taking these precautions:
1. Eliminate extraneous variables whenever you can
2. Keep treatment conditions as similar as possible
3. Balance out the effects of other variables
4. Be sure to assign individual subjects to treatment conditions at random
o As long as there is no systematic change in an extraneous variable, things are fine
o The more extraneous variables we control, the more we increase internal validity
o In a well-constructed experiment, we are confident that the IV, not EVs, caused the effects

Social Variables
★ The qualities of the relationships between subjects and experimenters that can influence the results of an experiment
★ Two principal social variables, demand characteristics and experimenter bias, can be controlled through single-blind and double-blind experiments

Demand Characteristics
★ The aspects of the experimental situation itself that demand or elicit particular behaviors
★ Can lead to distorted data by compelling subjects to produce responses that conform to what subjects believe is expected of them from the experiment
★ Participants should understand the nature and purpose of the experiment but not the exact hypothesis.

Controlling Social EVs: Single-Blind Experiments
★ An experiment in which subjects are not told which of the treatment conditions they are in
★ We can disclose some but not all information about the experiment to subjects.
★ We do not tell them the treatment condition they are in
Placebo effect

Cover Stories
★ A plausible but false explanation of the procedures in an experiment told to disguise the actual research hypothesis so that the subject will not guess what it is.
★ It involves deception and debriefing afterward

Experimenter Bias
★ Any behavior of the experimenter that can create confounding in an experiment
★ An experimenter’s demeanor can be a confounding variable
★ Experimenters might also treat subjects differently depending on what they expect from them.
★ Experimenter’s errors in recording data

Rosenthal Effect
★ Expectations can alter the behavior of others, even animals.
★ The phenomenon of experimenters treating subjects differently depending on what they expect from the subjects
★ Also, Pygmalion effect

Controlling Social EVs: Double Blind Experiment
★ An experiment in which neither the subjects nor the experimenter know which treatment subjects are in
o Used to control experimenter bias


o Being aware of them (experimenter effects)
o Following a set of written directions, timing all phases of the experiment, being as consistent as possible
o Make observations as objective as possible
o Minimize personal contact with the subjects to avoid unintentional biases
o Standardize the testing and scoring procedures
o Avoid giving subjects extraneous clues

Personality Variables
★ The personal characteristics that an experimenter or volunteer subject brings to the experimental setting
- Social desirability

Context Variables
★ Extraneous variables stemming from procedures created by the environment, or context, of the research setting
★ Include subject recruitment, selection and assignment procedures, as well as typical problems encountered in research on a university population
★ Two basic kinds of context variables: (1) those occurring when subjects select their own experiment; (2) those produced when experimenters select their own subjects
★ SOME FOLKLORE ABOUT THE SUBJECTS - Students who sign up late in the term might be less motivated than those who volunteer early
★ Practice Effect - Change in subjects’ performance resulting from practice
★ Fatigue Effect - Change in performance caused by fatigue, boredom or irritation

MODULE 4: Between-Subjects Design (Basic and Factorial)

Between-Subjects Factorial Design

What is a factor?
★ A factor is an independent variable.

★ More Than One Independent Variable

What is a factorial design?
★ A factorial design contains more than one independent variable.
★ The effect of psychotherapy (IV1) and antidepressant drugs (IV2) on depression (DV).
★ A two-factor experiment is the simplest factorial design.

What information can a factorial design provide?
★ A factorial design can provide information about both treatment and interaction effects.

What is a main effect? What determines the number of main effects in an experiment?
★ A main effect is the action of a single IV on the DV. There can be as many main effects as independent variables.

Provide an example of a main effect in a hypothetical study of exercise and depression.
★ An experimenter studies the effects of exercise intensity (IV1) and duration (IV2) on depression (DV). If exercise intensity or duration separately reduced depression, these would constitute main effects.

How do we determine whether we have main effects in our experiment?
★ Perform an appropriate statistical test.

In a 2 x 3 x 3 study, how many IVs and treatment conditions are there?
★ There are 3 independent variables and 18 treatment conditions.
Provide an example of a 2 x 3 x 3 study.
★ The independent variables were the perpetrator’s gender (male or female), relationship to the child (parent, step-parent, or parent’s partner), and severity of the abuse (neurological damage, broken bones, or bruising). The dependent variable was sentence length.

What is an interaction?
★ An interaction is the joint effect of two or more IVs on the DV. When there is an interaction, the effect of one IV is different across levels of the other IV.

Provide an example of an interaction.
★ If the antidepressant Paxil produced greater reductions in depression in the Cognitive Behavior Therapy (CBT) condition than the Waiting List condition, this would illustrate an interaction between drug and psychotherapy.

What is a higher-order interaction?
★ A higher-order interaction is an interaction among three or more IVs. Interpretation can be difficult when more than three IVs interact in an experiment.

Provide an example of a higher-order interaction.
★ A previous hypothetical study examined the effect of a perpetrator’s gender (male or female), relationship to the child (parent, step-parent, or parent’s partner), and severity of the abuse (neurological damage, broken bones, or bruising) on sentencing.
★ There would be a higher-order interaction if the perpetrator’s gender, relationship to the child, and severity of abuse jointly determined sentence length.

How many interactions are possible in a study with three IVs?
★ Assign letters (A, B, C) to the independent variables. Identify all unique two- and three-treatment combinations. For three independent variables, these include AB, AC, BC, and ABC. ABC is the higher-order interaction.

How does an interaction affect the interpretation of our results?
★ An interaction qualifies a main effect, warning us that there may be limits or exceptions to the effect of an IV on the DV. When there is an interaction, we must consider both IVs, because the effects of one factor will depend on the levels of the other factor.

Explain the factor-labeling method.
★ The factor-labeling method lists the two factors in parentheses after the numerical notation. For example, 2 x 2 (Type of Name x Length of Name).

★ Laying Out a Factorial Design

Explain the factor and levels method.
★ This method lists the two factors and their respective levels after the numerical notation. For example, 2 x 2 (Type of Name: given, nickname x Length of Name: short, long).

What advantage does the factor and levels method have over the factor-labeling method?
★ The factor and levels method provides more detailed information about the design than the factor-labeling method.

Why use a factorial design instead of two separate univariate experiments?
★ A factorial design is more efficient since it combines several one-factor experiments and allows us to study interactions.
★ A factorial design can achieve greater external validity since it can better recreate the complexity of the multivariate environment.

★ A Research Example

Why should we keep between-subjects designs simple?
★ Practical limitations include:
o number of subjects
o time
o interpretability of results

★ Choosing a Between-Subjects Design

PPT: Between Subjects Factorial Design

Factorial Designs
★ Designs in which we study two or more independent variables at the same time
★ The independent variables in these designs are called FACTORS and each factor will have two or more values or LEVELS
★ The simplest factorial design has only two factors and is called a TWO-FACTOR EXPERIMENT.

Main Effect
★ Is the action of a single independent variable in an experiment.
★ Simply a change in behavior associated with a change in the VALUE OF A SINGLE IV within the experiment
★ How much did the change in this one independent variable change subjects’ behavior?
★ The main effect might or might not be large enough to be statistically significant

Interaction
★ A factorial design allows us to test for relationships between the effects of different IVs
★ Interaction is present if the effect of one IV changes across the levels of another IV
★ The presence of an interaction between two factors also tells us that the main effects of one factor will be altered by the other factor
★ Interaction qualifies the main effects
★ In an experiment with three IVs, it is also possible that any two factors, but not the third, could interact just as they might in a two-factor experiment.
★ You can translate your thinking about an experiment into a simple diagram called a DESIGN MATRIX
★ Factorial designs are described with SHORTHAND NOTATION
★ If we are told that an experiment has a 3 x 2 x 4 design, we know it has 3 factors because 3 numbers are given.
★ There are 24 treatment conditions (product)
★ HIGHER-ORDER INTERACTIONS involve more than two variables at a time

Practical Limitations of Using Factorial Design
★ Often requires many subjects
★ Can be time-consuming
★ Require more complicated statistical procedures
★ It provides valuable information that other types of experiments cannot.

PPT. BASIC BETWEEN-SUBJECTS DESIGN

The Experimental Design
★ Is the general structure of the experiment – the experimenter’s plan for testing the hypothesis

THREE ASPECTS IN DECIDING FOR EXPERIMENTAL DESIGN
• The number of IVs in the hypothesis
• The number of treatment conditions
• Whether the same or different subjects are used in each of the treatment conditions
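As a rough illustration of the shorthand notation for factorial designs, the sketch below counts the factors, multiplies the numbers of levels to get the treatment conditions, and lists the possible interactions. It uses the 2 x 3 x 3 sentencing example from Module 4. The language (Python), the function name describe_design and the factor letters A, B, C are assumptions made for illustration only; they are not part of the reviewer.

```python
from itertools import combinations, product
from math import prod


def describe_design(levels):
    """Summarize a factorial design.

    `levels` maps each factor name to the list of its levels.
    The factor names used below (A, B, C) are illustrative labels.
    """
    names = list(levels)
    # Every treatment condition is one combination of levels, one per factor.
    conditions = list(product(*levels.values()))
    # Interactions: every combination of two or more factors (AB, AC, BC, ABC).
    interactions = [
        "".join(combo)
        for size in range(2, len(names) + 1)
        for combo in combinations(names, size)
    ]
    return {
        "notation": " x ".join(str(len(v)) for v in levels.values()),
        "n_factors": len(names),
        "n_conditions": prod(len(v) for v in levels.values()),
        "main_effects": names,  # one possible main effect per factor
        "interactions": interactions,
        "conditions": conditions,
    }


design = describe_design({
    "A": ["male", "female"],                                   # perpetrator's gender
    "B": ["parent", "step-parent", "parent's partner"],        # relationship to child
    "C": ["neurological damage", "broken bones", "bruising"],  # severity of abuse
})
print(design["notation"], "->", design["n_conditions"], "treatment conditions")
print("Possible interactions:", design["interactions"])
```

Passing three factors with 3, 2 and 4 levels gives the 24 treatment conditions of the 3 x 2 x 4 design mentioned in the notes.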

Between Subjects Designs


★ A design in which different subjects take part in each condition of the experiment

Selecting and Recruitment of Subjects
★ The more the sample resembles the whole population, the more likely it is that the behavior of the sample mirrors that of the population
★ When we random sample, every individual in the population has an equal chance of being selected (ideal, rarely achieved).
★ Use sampling procedures

How do you recruit subjects?
In desperate times, you..

Practical Limits
★ Recruiting subjects from a single class or location results in convenience sampling
★ Convenience sampling → very low in representativeness
★ Representativeness – the extent to which the sample responses we observe and measure reflect those we would obtain if we could sample the entire population
★ People who volunteer to participate in research might be somewhat different from those who do not

Methods for Encouraging Prospective Subjects
★ Make your appeal interesting, nonthreatening and meaningful
★ Emphasize that participating in research can help others and point out that lots of people do it
★ Give token gifts
★ Always assess whether you should ask for volunteers publicly or privately

How Many Subjects?
★ Too small samples can lead to erroneous results
★ We might hesitate to make great claims for our IV on the basis of few subjects
★ If individuals in the population are all very similar to one another on the dependent variable, small samples are adequate.
★ When individuals are likely to be quite different, larger samples are needed
★ We get slightly different responses from different subjects in an experiment because of individual differences
Effect size is important in research reports as an estimate of the magnitude of treatment effects found in an experiment after the data are analyzed statistically.
★ A larger sample is no guarantee that an experiment will turn out as you expect

Practical Considerations in Determining the Number of Subjects
★ If the experiment requires lengthy individual testing sessions, it might not be feasible to run large numbers of subjects
★ As a general rule, it is advisable to have at least 20 subjects in each treatment group; however, most researchers are more comfortable if there are 30 subjects in each group.
★ Smaller numbers make it very difficult to detect an effect of the IV unless the effect is huge

One Independent Variable: Two-Groups Design
• The simplest experiments are those in which there is only one IV.
• Remember: Experiment must have at least two treatment conditions
• Two-groups design (two separate groups of subjects)
• Two variations: two-independent-groups design and two-matched-groups design

Two Independent Groups
• Subjects are placed in each of the two treatment conditions through random assignment
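A minimal sketch of random assignment for a two-independent-groups design is shown below. It shuffles the subject list and deals subjects out to the conditions in turn, so every subject has an equal chance of landing in either condition and the groups end up the same size. The language (Python), the function name randomly_assign, the condition labels and the seed are all assumptions made for illustration; the notes themselves only mention a coin or a random number table.

```python
import random


def randomly_assign(subjects, conditions=("experimental", "control"), seed=None):
    """Randomly assign subjects to conditions with (near-)equal group sizes.

    `seed` only makes the illustration repeatable; omit it in real use.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # unbiased random order of subjects
    groups = {condition: [] for condition in conditions}
    # Deal the shuffled subjects out to the conditions one at a time.
    for index, subject in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(subject)
    return groups


# Hypothetical subject IDs 1..20, split into two groups of 10.
groups = randomly_assign(range(1, 21), seed=1)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Shuffling once and dealing out keeps the group sizes equal, which a separate coin toss per subject would not guarantee.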
• In an experiment, subjects are always randomly assigned to treatment conditions

Difference between Random Selection and Random Assignment
• Random selection refers to the process of randomly selecting individuals from a population to be involved in a study.
• Random assignment refers to the process of randomly assigning the individuals in a study to either a treatment group or a control group.

Random Assignment
• Means that every subject has an equal chance of being placed in any of the treatment conditions.
• We use the same kinds of unbiased procedures for assigning subjects to groups that are used in random selection of subjects
• If subjects are not randomly assigned to treatment groups, confounding can occur
• May use a coin or a random number table → eliminate bias
• Nonrandom selection affects the external validity of an experiment and random assignment is critical to internal validity

Experimental Group-Control Group Design
★ Experimental condition-control condition
★ Experimental condition – we apply a particular value of our IV to subjects and measure the DV (experimental group)
★ Control condition – is used to determine the value of the DV without an experimental manipulation of the IV (control group)
★ In the control condition, we carry out exactly the same procedures that are followed in the experimental condition, except for the experimental manipulation
★ Sometimes the control condition is a “no treatment” condition
★ Researchers must be on the lookout for potential confounds
★ The behavior of the control group subjects should be controlled so that they are not inadvertently engaging in behaviors that would affect the result of the experiment

Two Experimental Groups Design
• Can be used to look at behavioral differences that occur when subjects are exposed to two different values or levels of the IV.
• E.g., the effect of a music video with a high versus a low level of violence on aggressiveness; whether 15 minutes of aerobic exercise is better than 10 minutes
• One IV, two treatment conditions

Two-Matched-Groups Design
• Randomization does not guarantee that treatment groups will be comparable on all the relevant extraneous subject variables.
• An experiment with two treatment conditions and with subjects who are matched on a subject variable thought to be highly related to the DV.
• Uses matching to equate the groups on a variable that will probably affect the DV.

Matching Before and After an Experiment
★ PRECISION MATCHING – creating pairs whose subjects have identical scores on matching variables
★ RANGE MATCHING – creating pairs of subjects whose scores on the matching variable fall within a previously specified range of scores
★ RANK-ORDERED MATCHING – creating matched pairs by placing subjects in order of their scores on the matching variable; subjects with adjacent scores become pairs

When to Use Two Matched Groups
• Eliminate sources of confounding
• When we match, it is essential that we match on the basis of an EV that is highly related to the DV of the experiment

Multiple-Groups Design
★ A between-subjects design with one independent variable, in which there are more than two treatment conditions
★ The most commonly used multiple-groups design is the multiple-independent-groups design

Assigning Subjects
• Random Number Table
• Block Randomization
o Can be used to ensure that each condition has an equal number of subjects

★ Choosing Treatments
e.g., 5, 10, 15 milligrams; 15, 30, 45 minutes; 75, 50, 100-watt light bulb
★ Always think in terms of the hypothesis you are testing
★ What will I gain by adding these extra conditions?
★ Select the simplest design that will make an adequate test of your hypothesis

Multiple-Groups Design: Practical Limits
- A Review of Experimental Literature
- Pilot Study

Difference between Between Subjects and Within Subjects Design

MODULE 5: Within-Subjects Designs

1. How does a within-subjects experiment differ from a between-subjects experiment?

In a within-subjects design, each subject serves in more than one experimental condition. In a between-subjects design, different subjects serve in each experimental condition.
2. What are the advantages and disadvantages of using a within-subjects design?

★ Advantages
(1) Within-subjects designs require fewer subjects than between-subjects designs because we use the same subjects in each treatment condition.
(2) Within-subjects designs can reduce training time when we can train the same subject for several conditions instead of one.
(3) Within-subjects designs control for extraneous subject variables since they compare each subject against himself or herself.

★ Disadvantages
(1) Within-subjects designs may require that subjects spend excessive time in an experiment due to multiple treatment conditions and recalibrating equipment, resulting in inaccurate data.
(2) Within-subjects designs may be impossible or useless due to carryover effects.
(3) Within-subjects designs may be confounded by order effects.

3. Sample study using complete counterbalancing.

Hypothesis of the Study: Children who play with weaponlike toys (for example, toy guns and knives) become more aggressive.

The independent variable is type of toy (weaponlike or non-weaponlike). The dependent variable is aggression, defined by play actions (e.g., banging, hitting, pounding, and shooting) observed by a team of raters during a 20-minute time period. Each child plays with a weaponlike toy and a non-weaponlike toy for 20 minutes each. We can use complete counterbalancing to prevent confounding by order. Half the children start with a weaponlike toy and then play with a non-weaponlike toy; half start with a non-weaponlike toy and then play with a weaponlike toy.

4. Study this scenario to see why a within-subjects design is not a good choice for some experiments.

Mary, a researcher, is very excited about the within-subjects approach. “Now I’ll never need to run large numbers of subjects again,” she says. What has she forgotten?

Mary has forgotten that within-subjects designs may be impossible, useless, confounded by order effects, or impractical when excessive subject time spent in an experiment makes data inaccurate.
★ Impossible: in a study comparing Type A and Type B personality blood pressure changes in response to provocation, a subject cannot participate in both the Type A and Type B personality conditions.
★ Useless: in a study in which we ask subjects to use two different strategies to learn the same list of words, they are likely to recall the list more easily in the second condition.
★ Order effects: in a study in which we ask subjects to rate four television commercials, the first commercial might receive a higher rating than the others due to its novelty; later commercials might receive lower ratings than they deserve because the subjects have tuned them out.
★ Impractical: in a study of perceptual judgment, a subject might spend several hours in each condition. An experiment with three levels might require that a subject participate for six hours, which could result in unacceptable levels of progressive error.

5. What requirements must be met to make the within-subjects approach feasible?

A within-subjects approach is feasible when subjects can participate in all treatment conditions, the conditions do not seriously interfere with each other, and the experimenter can control order effects.

6. Study this scenario: You are planning an experiment on anagrams (scrambled words). You want to test whether different scramble patterns lead to different solution rates. For instance, the letter order 54321 might be easier to solve than 41352 (12345 represents the actual word). You want to use the same words in all conditions so that the type of word will not be a confounding variable. People solve anagrams at different rates, so you are thinking about using a within-subjects design.

a. If you use a within-subjects design for this experiment, will you have to worry about order effects? Why or why not?

Yes, a researcher will have to control order effects due to practice effects and fatigue effects. Repeated
exposure to the same scrambled word should make anagram solution easier. In contrast, fatigue could make anagram solution harder during later trials than earlier ones.

b. There are four counterbalancing techniques (reverse counterbalancing, block randomization, complete, and partial) for handling order effects discussed in this chapter. Which would help you most in this experiment? Why?

- Subject-by-subject counterbalancing controls progressive error for each subject by presenting all treatment conditions more than once. Reverse counterbalancing and block randomization are two subject-by-subject counterbalancing methods.

★ In reverse counterbalancing, we present all treatment conditions twice, in an initial order and then in reverse order. For example, for conditions A and B, each subject would receive the sequence ABBA. Reverse counterbalancing only controls linear progressive error (progressive error that changes monotonically).
★ In block randomization, we present several blocks containing each treatment to each subject; the treatments in each block are given in random order. For example, if you decided to present four treatments (ABCD) five times, each subject might receive the following blocks: BCDA DBAC ACDB CABD BADC. Block randomization can control both linear and nonlinear progressive error.

- Across-subjects counterbalancing distributes progressive error across subjects in an experiment. The effects of progressive error should be the same for each experimental condition since they are averaged across subjects. Complete and partial counterbalancing are the two main across-subjects counterbalancing methods.

★ In complete counterbalancing, we randomly assign an equal number of subjects to every possible treatment order. For example, for conditions A and B, we randomly assign half the subjects to sequence AB and half to BA. Complete counterbalancing controls for linear and nonlinear progressive error, and symmetrical carryover effects.
★ In partial counterbalancing, we randomly assign an equal number of subjects to a subset of possible treatment sequences. For example, if an experiment has 120 possible sequences (five treatment conditions) and only 30 subjects, we could randomly assign subjects to 30 of these sequences. Partial counterbalancing controls for linear progressive error.

Complete counterbalancing would be the best technique for controlling progressive error if sufficient subjects are available. Since it only presents each condition once, it is less likely to produce practice effects than reverse counterbalancing and block randomization, which present each condition several times. Since it controls both linear and nonlinear progressive error, complete counterbalancing is superior to partial counterbalancing, which only controls linear progressive error.

c. What are carryover effects? Would they be a problem in this experiment? How would you handle them?

★ Carryover effects are the persistence of treatment effects after an experimental condition ends.

They are a serious problem in this experiment, since the same words will be used in each scramble condition. The solution of the first anagrams should improve performance in later conditions. For example, the word “zebra” should be easier to unscramble in subsequent anagrams because of its distinctive “z.”

We can reduce the problem of carryover effects by using different sets of words in each of the scramble conditions. We will have to control word distinctiveness, length, and frequency to prevent these factors from confounding the experiment. If sufficient subjects are available, complete counterbalancing can effectively control symmetrical carryover effects. However, if the carryover effects are asymmetrical, a between-subjects design will be required.

7. Study this scenario: A television commercial showed people tasting and choosing between two colas. One was labeled R; the second was labeled Q. The majority of people said they liked cola R better than cola Q. Given what you know about experimental design, would you accept the ad’s claim that cola R tastes better than cola
Q? Why or why not? How might you change the conditions to determine whether there is a treatment
procedures to get more acceptable data? effect.

★ A within-subjects factorial design assigns subjects to


all levels of two or more independent variables.
I would not accept the ad’s claim because the
commercial showed confounding by cola label. Since the ★ A mixed design is an experiment where there is at
label (R or Q) systematically changed with type of cola, least one between-subjects and one within subjects
the design lacked internal validity. variable.
We could obtain more acceptable data if we removed Advantages:
the labels, randomly assigned subjects to order AB or BA, and asked the subjects to drink water between conditions to minimize carryover effects.

8. Explain why (when it is used as a factor in a design) order is always a between-subjects factor.

When order is manipulated as an independent variable, it is always a between-subjects factor because each subject can never participate in more than one order. For example, if there are three treatments and we expose a subject to ABC and then to CBA, the subject has really been exposed to one new order (ABCCBA).

PPT: 11 WITHIN-SUBJECTS DESIGN

★ In a within-subjects experiment, subjects are assigned to more than one treatment condition.

★ Power is an experiment’s ability to detect the independent variable’s effect on the dependent variable.

★ Statistical power is desirable when it allows us to detect practically significant differences between the experimental conditions. Theoretically, there is a point of diminishing returns where excessive power detects meaningless differences between treatment conditions. For example, in a study of treatments to lower blood pressure, a difference of 0.1 mm Hg, while statistically significant, would not affect patient health or life expectancy.

★ In a within-subjects experiment, researchers measure subjects on the dependent variable after each treatment.

Subjects participate in more than one treatment condition and serve as their own control. We compare their performance on the dependent variable across treatment conditions.

Advantages:

• use fewer subjects
• save time on training
• greater statistical power
• more complete record of subjects’ performance

Disadvantages:

• subjects participate longer
• resetting equipment may consume time
• treatment conditions may interfere with each other
• treatment order may confound results

When can’t we use a within-subjects design?

★ We can’t use a within-subjects design when one treatment condition precludes another due to interference.

★ Order effects are positive (practice) and negative (fatigue) performance changes due to a condition’s position in a series of treatments.

★ The term progressive error encompasses both positive and negative order effects.

★ Counterbalancing is a method of controlling order effects by distributing progressive error across different treatment conditions.

★ Two major counterbalancing strategies are subject-by-subject counterbalancing, which controls progressive error for each subject, and across-subjects counterbalancing, which distributes progressive error across all subjects.

★ A fatigue effect is a form of progressive error where performance declines on the DV due to tiredness, boredom, or irritation.
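
As a minimal sketch of why counterbalancing works, the Python below assumes two treatments with identical true scores and a hypothetical linear practice effect of 2 points per position. All numbers and function names are invented for illustration and do not come from any study. Holding the order constant at AB makes B look better than A, while splitting subjects evenly between AB and BA spreads the progressive error equally over both conditions.

```python
from statistics import mean

TRUE_SCORE = {"A": 10.0, "B": 10.0}   # assumed: no real treatment difference
PRACTICE_PER_POSITION = 2.0           # assumed linear progressive error


def observed_scores(order):
    """Return {treatment: score} for one subject tested in the given order."""
    return {
        treatment: TRUE_SCORE[treatment] + PRACTICE_PER_POSITION * position
        for position, treatment in enumerate(order)
    }


def condition_means(orders):
    """Average each treatment's observed score over all subjects' orders."""
    scores = [observed_scores(order) for order in orders]
    return {t: mean(s[t] for s in scores) for t in TRUE_SCORE}


fixed_order = condition_means(["AB"] * 4)            # every subject gets AB
counterbalanced = condition_means(["AB", "BA"] * 2)  # half AB, half BA

print(fixed_order)       # practice inflates B only: A = 10, B = 12
print(counterbalanced)   # progressive error is shared: A = 11, B = 11
```

With the fixed order, the 2-point gap is an order effect, not a treatment effect. After counterbalancing, both conditions carry the same amount of progressive error, so any remaining difference could be attributed to the treatments.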
★ Subject performance on the dependent variable may improve across the conditions of a within-subjects experiment, and these positive changes are called practice effects.

★ Practice effects may be due to relaxation, increased familiarity with the equipment or task, development of problem-solving strategies, or discovery of the purpose of the experiment.

Why can’t we eliminate or hold order effects constant in a within-subjects experiment?

★ We can’t eliminate order effects because there is an order as soon as we present two or more treatments. Holding order constant (always assigning subjects to the sequence ABC) would confound the experiment.

★ Subject-by-subject counterbalancing controls progressive error for each subject by presenting all treatment conditions more than once.

★ Two subject-by-subject counterbalancing techniques are reverse counterbalancing and block randomization.

★ In reverse counterbalancing, we administer treatments twice in a mirror-image sequence, for example, ABBA. When progressive error is linear, it progressively changes across the experiment so that A and B have the same amount of progressive error.

★ Nonlinear progressive error, which can be curvilinear (inverted-U) or nonmonotonic (changes direction), cannot be graphed as a straight line.

★ Reverse counterbalancing only controls for linear progressive error. When progressive error does not change in a straight line, this method actually confounds the experiment.

★ Block randomization is a subject-by-subject counterbalancing technique where researchers assign each subject to several complete blocks of treatments.

★ A block consists of a random sequence of all treatments, so that each block presents the treatments in a different order.

What is a problem with subject-by-subject counterbalancing?

Since subject-by-subject counterbalancing presents each treatment several times, this can result in long-duration, expensive, or boring procedures. This problem is compounded as the experimenter increases the number of treatments.

★ Across-subjects counterbalancing techniques present each treatment once and control progressive error by distributing it across all subjects.

★ Two techniques are complete and partial counterbalancing.

★ Complete counterbalancing uses all possible treatment sequences an equal number of times. Researchers randomly assign each subject to one of these sequences.

★ Partial counterbalancing is a form of across-subjects counterbalancing where we present only some of the possible orders (N! for N treatments).

★ Two partial counterbalancing techniques are randomized partial and Latin square counterbalancing.

When is a within-subjects design superior to a between-subjects design?

A within-subjects design is usually preferable when you need to control large individual differences or have a small number of subjects. However, it may not be feasible if the experiment is long or there is a risk of asymmetrical carryover.

Topic 2
Within-Subjects Designs: Small N

Small N designs are within-subjects experiments that study the behavior of one or very few subjects.

The ABA design, also called a reversal design, is used in most small N designs. The ABA approach constitutes a family that includes the AB, ABA, ABAB, and ABABA designs, and they may be used to study experimental treatments that can be reversed (withdrawn). They emphasize the importance of returning to baseline to verify the effects of the independent variable.

Unlike an ABA design, a multiple baseline design never ends a treatment to return to baseline. This design should be used when we do not want to reverse a treatment or when we want to test a treatment across multiple settings, patients, or behaviors.
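
The logic of returning to baseline in an ABA design can be sketched in a few lines of Python. The session scores, the phase labels, and the `reversal_supported` helper with its tolerance are all hypothetical, chosen only to illustrate the reasoning. Real small N data are usually judged by visual inspection rather than by a fixed rule like this one.

```python
from statistics import mean

# Hypothetical counts of a target behavior per session in each phase.
phases = {
    "A1 (baseline)": [12, 11, 13, 12],
    "B (treatment)": [6, 5, 4, 5],
    "A2 (return to baseline)": [11, 12, 12, 13],
}

phase_means = {name: mean(scores) for name, scores in phases.items()}


def reversal_supported(a1, b, a2, tolerance=2.0):
    """Crude check (an assumption, not a standard test): behavior changes
    during B and comes back to within `tolerance` of the first baseline."""
    changed_in_b = abs(b - a1) > tolerance
    returned_to_baseline = abs(a2 - a1) <= tolerance
    return changed_in_b and returned_to_baseline


a1, b, a2 = phase_means.values()
print(phase_means)
print(reversal_supported(a1, b, a2))
```

If the second baseline had stayed near the treatment level instead, the same check would fail, which mirrors the point that a behavior that does not reverse leaves the cause of the change in doubt.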
When should researchers use small N and large N designs?

The choice of design depends on both practical and methodological issues and on comparing the pros and cons of both designs. We cannot conclude that either small N or large N studies always have greater generality, since all other things are rarely equal. Replication is needed to establish both the internal validity and external validity of small N studies.

What are the advantages and disadvantages of small N versus large N designs?

★ Advantages

(1) Uses a smaller number of subjects, which reduces the cost of experimentation.

(2) Allows study of clinical disorders when there are insufficient subjects for a large N study.

(3) Provides better control of extraneous variables, especially subject variables.

(4) Provides a more complete record of subject performance due to each subject’s participation in all experimental conditions.

(5) Provides a more precise assessment of subject performance because data only come from one or a few subjects instead of many subjects.

(6) Can achieve greater statistical power due to control over individual differences and examination of individual records instead of pooled data.

(7) Data analysis may not require statistical tests.

★ Disadvantages

(1) External validity may be lower if subjects do not resemble the population.

(2) Researchers may be unable to exactly recreate the original baseline conditions, especially outside the laboratory.

(3) Some small N designs are not appropriate when the target behavior does not return to baseline.

What is an ABA design? Why is it really a family of designs?

★ In an ABA design (reversal design) we present a baseline condition (A) in which we only measure the target behavior, present an experimental condition (B) and measure its effect, and then return to the baseline condition (A) to verify that the change in the target behavior is associated with the experimental condition. This design may be used in both small N and large N experiments.

The ABA design is actually a family of designs because researchers use variations like AB, ABAB, and ABABA designs.

What do we mean by a reversal (or withdrawal) design?

★ ABA designs are called reversal or withdrawal designs because we reinstate the baseline condition (A) after presenting the experimental condition (B). This return to baseline is needed to link the change in target behavior to our manipulation of the independent variable. We may only use ABA designs when the treatment conditions are reversible.

Explain how a baseline condition in a small N experiment is similar to a control group in a large N experiment.

★ We measure the dependent variable without manipulating an independent variable in both the baseline condition (small N experiment) and control group (large N experiment). Both the baseline condition and control group allow us to determine whether the independent variable affected the dependent variable by providing a check for confounding by extraneous variables like history threat and maturation threat.

Study this: One student is still looking for shortcuts. He says, “Running through the baseline condition of an experiment twice is silly. I’ll just run through A and B and draw my conclusions from that.”

What would you say to him to convince him that carrying out the entire ABA design would be a better idea?
★ We need to return to baseline to rule out confounding by extraneous variables like history threat and maturation threat. We are more confident that the independent variable influenced the dependent variable when the target behavior reverses after we remove the experimental intervention.

The failure of the target behavior to return to its baseline level challenges the causal relationship between the treatment and change in the dependent variable. This could be due to confounding by an extraneous variable, carryover by the independent variable, or failure to exactly recreate the original baseline condition (A).

PPT: 12 Within-Subjects Designs: Small N

★ A large N design compares the performance of groups of subjects.

★ A small N design studies one or two subjects, often using variations of the ABA reversal design.

★ Aggregate effects are the pooled findings from many subjects.

Why do small N researchers challenge large N experiments?

They argue that large N studies ignore individual subject responses to the IV and instead report aggregate results or trends. When subjects vary greatly in their response to the IV, this can create the appearance of no difference between the groups.

Why and who would use small N designs?

★ A clinical psychologist could use a small N design to test a treatment when there are insufficient subjects to conduct a large N study and when she wants to avoid the ethical problem of an untreated control group.

★ Animal researchers prefer small N designs to minimize the acquisition and maintenance cost, training time, and possible sacrifice of their animal subjects.

★ Small N designs have been most extensively used in operant conditioning research.

★ B. F. Skinner examined the continuous behavior of individual subjects in preference to analyzing discrete measurements from separate groups of subjects.

★ Sir Ronald Fisher’s (1935) creation of the analysis of variance allowed inferential testing of large N data.

Baselines

★ In both large and small N designs, baselines are control conditions that allow us to measure behavior without the influence of the IV.

How did Kazdin explain the decision of many clinical researchers to end without a return to baseline?

★ It would be ethically indefensible to cause a patient to relapse by returning to baseline after treatment appeared to improve behavior.

This is most important when relapse threatens the health or safety of the patient or others, as in self-injurious, suicidal, or homicidal behavior.

What price do researchers pay when they can’t return to baseline?

★ They can’t rule out the possibility that the patient’s clinical improvement was caused by an extraneous variable.

Multiple Baseline Design

★ In a multiple baseline design, a series of baselines and treatments are compared within the same subject, and once treatments are administered, they are not withdrawn.

★ This approach could also be used to evaluate the effect of a treatment administered to different individuals after baselines of different lengths.

★ A researcher can evaluate the effects of a treatment on two or more behaviors or on the same behavior in different settings.

★ In a multiple baseline design, an experimenter never withdraws treatments after administering them.

How do researchers analyze data from small N experiments?
★ Researchers often visually inspect changes in the dependent variable across treatment conditions. The independent variable’s effect is often apparent.

★ They may also use statistics to analyze small N data.

Why is statistical analysis of small N data controversial?

★ Critics are concerned about generalizing from a single subject to a population. Unless 50 measurements are taken during each baseline and treatment phase, important assumptions underlying inferential tests may be violated.

Changing Criterion Designs

★ In changing criterion designs, the criteria for reinforcement are incrementally increased as participants succeed.

★ For example, initially, a subject might receive a reward for 30 minutes of daily exercise, later, for 45 minutes, and finally, for 60 minutes.

★ Reinforcement for successive approximations of the target behavior is central to athletic training, behavior modification, and biofeedback and neurofeedback.

Discrete Trials Designs

★ A discrete trials design is a small N design without baselines used in psychophysical research.

★ Instead, the impact of different levels of the independent variable is averaged across 100s to 1000s of trials.

★ A discrete trials design has no baselines and administers the levels of the independent variable 100s to 1000s of times to each subject.

★ The large number of data points produced by 100s to 1000s of trials provides a very reliable measurement of the effect of the independent variable.

★ The similarity of human sensory systems allows researchers to generalize from a small number of subjects.

When to use Large N and Small N Designs

★ When studying a clinical subject (a self-injurious child) or when very few subjects are available, a small N design is appropriate.

★ A large N design would be desirable when we have sufficient subjects and want to increase generalizability.

The generalizability of a large N study depends on how we select our sample, since a seriously biased sample will not represent the population.

★ The generalizability of a small N study depends on repeated successful replications with different subjects.

Why doesn’t a large N study always have greater generality than a small N study?

★ If a large N study’s sample is biased, we will be unable to generalize its findings to a larger population. Also, if it is poorly controlled, there will be no valid findings to generalize.

★ In contrast, a well-controlled small N experiment using a single subject might be successfully replicated across sufficient subjects to generalize its results to the population from which they were drawn.
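
The small N objection to aggregate results, raised earlier in these notes, can be made concrete with a short Python sketch. The change scores below are invented: half of the hypothetical treated subjects improve by 5 points and half worsen by 5, so the group mean suggests no effect even though every individual responded to the IV.

```python
from statistics import mean

# Hypothetical change scores (treatment minus baseline) for eight subjects.
changes = [+5, +5, +5, +5, -5, -5, -5, -5]

group_mean = mean(changes)                       # what a pooled summary reports
responders = sum(1 for c in changes if c != 0)   # what individual records show

print(group_mean)   # 0: the aggregate hides the opposing responses
print(responders)   # 8: every subject changed
```

This is only an extreme illustration. With real data the cancellation would be partial, but the same reasoning explains why individual records can reveal effects that pooled data obscure.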
