HCI Testing
Formative Evaluation
Summative Evaluation
Evaluation of the user interface after it has been developed.
Typically performed only once, at the end of development. Rarely used in practice.
Not very formal.
Data is used in the next major release.
Formative Evaluation
Evaluation of the user interface as it is being developed.
Begins as early as possible in the development cycle.
Typically, formative evaluation appears as part of prototyping.
Extremely formal and well organized.
Formative Evaluation
Performed several times.
o An average of three major cycles, each followed by iterative redesign, per version released.
o The first major cycle produces the most data.
o Following cycles should produce less data, if you did it right.
Subjective Data
o Opinions, generally of the user.
o Sometimes this is a hypothesis that leads to additional experiments.
Quantitative Data
o Numeric.
o Performance metrics, opinion ratings (e.g., Likert scales).
o Analyzed statistically.
o Tells you that something is wrong.
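Quantitative analysis of opinion ratings can be as simple as comparing summary statistics across evaluation cycles. A minimal sketch in Python, using only the standard library; the ratings below are hypothetical 5-point Likert responses invented for illustration:

```python
from statistics import mean, stdev

# Hypothetical 5-point Likert ratings ("the interface was easy to use")
# from two formative-evaluation cycles; the numbers are illustrative only.
cycle1 = [2, 3, 2, 4, 3, 2, 3]
cycle2 = [4, 4, 3, 5, 4, 4, 5]

def summarize(ratings):
    """Return the mean and standard deviation of a set of Likert ratings."""
    return mean(ratings), stdev(ratings)

m1, s1 = summarize(cycle1)
m2, s2 = summarize(cycle2)
print(f"cycle 1: mean={m1:.2f} sd={s1:.2f}")
print(f"cycle 2: mean={m2:.2f} sd={s2:.2f}")
```

A rising mean between cycles suggests the redesign helped; a formal comparison (e.g., a t-test) would be needed before claiming statistical significance.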
Qualitative Data
o Non-numeric.
o User opinions, views, or lists of problems/observations.
o Tells you what is wrong.
Experiment Design
Subject selection
Experiment Design
Task Development
o What tasks do you want the subjects to perform using your
interface?
o What do you want to observe for each task?
o What do you think will happen?
o Benchmarks?
o What determines success or failure?
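The benchmark and success/failure questions above can be made concrete by scoring each trial against a pass criterion. A small sketch under assumed data: the trial records and the 120-second benchmark are hypothetical, chosen only to illustrate the bookkeeping:

```python
# Hypothetical trial records: (subject, task, completed, seconds).
# The benchmark (finish within 120 s) is an illustrative assumption.
trials = [
    ("S1", "create account", True, 95),
    ("S2", "create account", True, 140),
    ("S3", "create account", False, 180),
    ("S4", "create account", True, 110),
]

BENCHMARK_SECONDS = 120

def success_rate(trials, limit):
    """Fraction of trials completed within the time benchmark."""
    passed = sum(1 for _, _, done, t in trials if done and t <= limit)
    return passed / len(trials)

print(f"success rate: {success_rate(trials, BENCHMARK_SECONDS):.0%}")
```

Deciding the pass criterion before running subjects keeps the success/failure judgment from drifting between trials.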
Experiment Design
Protocol & Procedures
o What can you say to the user without contaminating the experiment?
o What steps are needed to eliminate bias?
o You want every subject to undergo the same experiment.
o Do you need consent forms (IRB)?
Experiment Trials
Calculate Method Effectiveness
Sears, A. (1997). Heuristic Walkthroughs: Finding the Problems Without the Noise. International Journal of Human-Computer Interaction, 9(3), 213-234.
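Sears' measure combines how many of the real problems a method finds (thoroughness) with how many of its reported issues are real (validity). A sketch of that calculation, assuming the three counts are already known; the numbers in the example are invented for illustration:

```python
def effectiveness(real_found, real_existing, reported):
    """Method effectiveness as thoroughness x validity (after Sears, 1997).

    real_found    -- real problems the method identified
    real_existing -- real problems known to exist in the interface
    reported      -- total issues the method reported (real + false alarms)
    """
    thoroughness = real_found / real_existing
    validity = real_found / reported
    return thoroughness * validity

# Illustrative counts only: 12 of 20 known problems found,
# among 16 reported issues (so 4 were false alarms).
print(f"effectiveness = {effectiveness(12, 20, 16):.2f}")
```

A method that reports many false alarms scores low even if it finds most real problems, which is the "noise" the paper's title refers to.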
Experiment Trials
Pilot Study
o An initial run of a study (e.g. an experiment, survey, or
interview) for the purpose of verifying that the test itself is
well-formulated. For instance, a colleague or friend can be
asked to participate in a user test to check whether the test
script is clear, the tasks are not too simple or too hard, and
that the data collected can be meaningfully analyzed.
o (see http://www.usabilityfirst.com/ )
Data Collection
Collect more than enough data.
o More is better!
Data Analysis
Use more than one method.
All data should lead to the same point.
o Your different types of data should support each other.
Remember:
o Quantitative data tells you something is wrong.
o Qualitative data tells you what is wrong.
o Experts tell you how to fix it.
Conclusions
The data should support your conclusions.
o Method Effectiveness Measure
Redesign
Redesign should be supported by data findings.
Set up the next experiment.
o Sometimes it is best to keep the same experiment.
o Sometimes you have to change the experiment.
o Is there a flaw in the experiment or the interface?