
Data Collection:: Professor Tarek Tawfik Amin


Data Collection: Methods and Instruments

Basic Research Competency Program for Research Coordinators


August 2015, MEDC, Faculty Of Medicine, Cairo University, Cairo, Egypt.

Professor Tarek Tawfik Amin


Epidemiology and Public Health, Faculty of Medicine, Cairo University
Geneva Foundation for Medical Education and Training
Asian Pacific Organization for Cancer Prevention
International Osteoporosis Foundation
Wiley Innovative Panel
amin55@myway.com dramin55@gmail.com
Objectives

By the end of this session, research coordinators will be able to:

1- Identify the research data plan and its importance.
2- Differentiate between primary and secondary data sources
3- Recognize the different data collection methods, tools and
techniques and the demerits of each.
4- Recognize the importance of reliability and validity of an
instrument.
Introduction
• Data can be defined as the quantitative or qualitative value of a
variable (e.g. numbers, images, words, figures, facts or ideas).

• It is the lowest unit of information from which other
measurements and analyses can be done.

• Data is one of the most important and vital aspects of any
research study.
Before starting
• Accurate and systematic data collection is
critical to conducting scientific research.

• Data collection allows the researcher to collect information
about study objects/subjects/participants.

• It includes document review, observation,
questioning, measuring, or a combination of
different methods.
Factors to be Considered Before Collection of Data (plan)
o Objectives and scope of the enquiry (research
question).
o Sources of information (type, accessibility).
o Quantitative expression (measurement/scale).
o Techniques of data collection.
o Unit of collection.
Data collection plan
o Research question and research hypothesis
o Identify types of measurements and variables
o Types of data
o Instruments, scales and methods
o Written forms: 1- Data collection permissions, 2- Operational procedures
o Pilot testing
o Revise
o Implementation
Sources of Data
o Internal sources
o External sources
   - Primary data
   - Secondary data
Internal vs. External Sources of Data

Internal
o Many institutions and departments hold information about their
regular functions, for their own internal purposes.
o When this information is used in a survey, it is called an internal
source of data.
o Examples: routine surveillance, hospital records.

External
o Information collected from an outside source.
o Such data are either primary or secondary.
o This type of information can be collected by census or sampling.
I- Primary Data
• Data collected from first-hand experience are known as primary
data. They are more reliable and authentic, and have not been published
anywhere.

• Primary data have not been changed or altered by human
beings; therefore their validity is greater than that of secondary data.
Methods of collecting primary data
o Direct personal investigation (interviewing)
o Indirect oral investigation
o Investigation through observation
o Measurements
o Lab. results
o Case studies
o Experimentation
Primary Data

Merits
o Targeted issues are addressed
o Data interpretation is better
o High accuracy of data
o Addressing specific research issues
o Greater control

Demerits
o Cost
o Time
o More personnel / resources
o Inaccurate feedback
o Requires training and skill, and is laborious
II- Secondary Data
• Data that have already been collected by others.
• Journals, periodicals, research publications, official
records, etc.
• May be available in published or unpublished
form.
• Resorted to when primary data sources/methods are
infeasible or inaccessible.
Method of collection: secondary data
o Published: international, government, corporation, institutional
o Unpublished
Secondary Data

Merits
o Quick and cheap
o Wider geographical area
o Longer orientation period
o Leading to primary data

Demerits
o Not fulfilling specific research needs
o Poor accuracy
o Not up to date
o Poor accessibility in some cases
Primary vs. secondary data

Primary data                     Secondary data
Real time                        Past data
Sure about the sources           Not sure about sources
Can answer research question     Refining the research problem
Costly and time-consuming        Cheap and quick
Can avoid bias                   Bias can't be ruled out
More flexible                    Less flexible
Data Sources for Health Research
Primary or secondary sources

• Birth and death records
• Medical records at physician offices, hospitals,
nursing homes, etc.
• Medical databases within various agencies,
universities, and institutions.
• Physical exams and laboratory testing
• Disease registries
• Self-report measures: interviews and questionnaires
Research Data: Considerations

Data collection vs. data analysis


• Poor data collection and management can render a
perfectly executed trial useless
• Bad data practices carry resource and ethical costs
• Good practices:
– What are data?
– How are they represented?
– How are they stored for retrieval and use?
Considerations in collecting clinical data:
Data are Surrogates

• Data (specimens) are all that remain after the
active phase of a clinical trial
• Data are about the objects and events in the
trial
• Understanding how the data are captured
and recorded affects interpretation
• Improper collection or interpretation lead to
Indirect nature of Data

Hematocrit:
– an indicator of oxygen carrying capacity -depends on chemical
alterations of hemoglobin and concentration of hydrogen ions
– an indicator of blood or bone marrow health -falsely normal in the
setting of an acute hemorrhage.

Mortality:
– patients who died from unrelated causes
– accounting for patients who are lost to follow-up
Considerations Cont’:
Objectivity, Subjectivity and Reproducibility
Objectivity: the degree to which recorded data
may be influenced by the individual thought of
the observer
• Data reported by subjects (such as symptoms of a disease)
can be objectively recorded (statements themselves are
subjective)
• Objective observations (such as signs of a disease) can be made
by outside observers
Episode of bleeding reported by a subject ≠episode witnessed by a
researcher
Objectivity is a process by which the observation is represented
in the data
– Human observer is never completely free of influences
Controlling of subjectivity:
• Use an unbiased device to record information
• Employ rating systems and train the observers
• Consider limits to data precision

Reproducibility: corroborating findings in a
subsequent study requires knowing how observations were
made (metadata)
The concept of METADATA
“Data about the data” (e.g., method of collection,
relationship of data to the events in the research
protocol, etc.)
- Temporal metadata require particular attention
- Understand the implications of a time (e.g., if a
blood specimen is drawn to measure a drug level,
we must know the time that the specimen was
drawn and the time the drug was administered)
- Need to choose when to measure and how often
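The drug-level example above can be sketched in a few lines of Python. This is only an illustration of why both timestamps must be stored with the measurement; the record fields, values and the function name `hours_since_dose` are hypothetical and not part of the original slides.

```python
from datetime import datetime

def hours_since_dose(administered_at: datetime, drawn_at: datetime) -> float:
    """Elapsed hours between drug administration and specimen draw."""
    if drawn_at < administered_at:
        raise ValueError("specimen drawn before the drug was administered")
    return (drawn_at - administered_at).total_seconds() / 3600

# Hypothetical record: the measured value plus its temporal metadata.
record = {
    "drug_level_mg_per_l": 12.4,
    "administered_at": datetime(2015, 8, 10, 8, 0),
    "drawn_at": datetime(2015, 8, 10, 14, 30),
}

elapsed = hours_since_dose(record["administered_at"], record["drawn_at"])
print(f"Level {record['drug_level_mg_per_l']} mg/L, drawn {elapsed:.1f} h after dose")
```

Without the two timestamps, the same numeric level could not be interpreted as a peak or a trough value.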
Types of Data
I- Quantitative data - measurements that can be manipulated
mathematically
• Precision - body temperature, serum chloride , absolute
eosinophil count.
II- Qualitative data - conceptual entities rather than numeric
values (subject gender and race, signs and symptoms,
diagnoses)
• May represent concepts that relate to quantitative data
[“blood pressure” is numeric, but the procedures themselves
are qualitative]
III- Ordinal data look like numbers (e.g., urine protein
measurements “0”, “1+”, “2+”, etc.)
IV- Signal data - quantitative in nature but are treated as
qualitative (e.g., electrocardiogram tracings)
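As a small illustration of ordinal data, the sketch below stores urine protein grades as ordered labels. The labels "3+" and "4+" and the helper names are assumptions added for the example; the slide only lists "0", "1+", "2+", etc.

```python
# Ordered categories: position in the list carries the rank.
# "3+" and "4+" are assumed continuations of the slide's "etc.".
URINE_PROTEIN_LEVELS = ["0", "1+", "2+", "3+", "4+"]

def rank(level: str) -> int:
    """Return the ordinal rank of a grade (0 = lowest)."""
    return URINE_PROTEIN_LEVELS.index(level)

def worsened(before: str, after: str) -> bool:
    """True if the second grade is higher than the first."""
    return rank(after) > rank(before)

print(worsened("1+", "2+"))  # ranks can be compared
# Differences and averages of ranks are not meaningful:
# the step from "0" to "1+" need not equal the step from "2+" to "3+".
```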
Data Standards

A. Support data usefulness and exchange
B. Standards for quantitative data: units of
measurements.
C. Standards for qualitative data: controlled
terminologies (ICD-10)
D. Standards for data format
Primary Research Methods and Techniques

Primary research
o Quantitative data
   - Surveys: personal interview (intercepts); mail; in-house,
     self-administered; telephone, fax, e-mail, Web
   - Experiments: mechanical observation; simulation
o Qualitative data
   - Focus groups
   - Individual in-depth interviews
   - Human observation
   - Case studies
Differentiation between data collection
techniques and tools.

Techniques                       Tools
Using available data             Data compilation sheet
Observation                      Checklist, eye, watch, scales,
                                 microscope, pen and paper
Interviewing                     Schedule, agenda, questionnaire, recorder
Self-administered questionnaire  Questionnaire
1- Self-reported data

• Some information can be gathered only by asking people
questions (i.e. not easily observable)
• Self-report measures are estimates of true scores

True score + Measurement error = Survey response
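The relationship above can be illustrated with a short simulation. It assumes, purely for the sketch, that measurement error is random, normally distributed and centered on zero; the slide itself does not specify an error model, and systematic errors such as social desirability would not average out this way.

```python
import random
import statistics

def simulate_responses(true_score, error_sd, n, seed=0):
    """Survey response = true score + random measurement error."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n)]

# Hypothetical values: true score of 50, error standard deviation of 5.
responses = simulate_responses(true_score=50.0, error_sd=5.0, n=10_000)
mean_response = statistics.mean(responses)

# Single responses scatter around the true score;
# with purely random error the average lies close to it.
print(round(mean_response, 2))
```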


Pitfalls of self-reported information

Susceptible to the respondent's:
A. Mood
B. Motivation
C. Memory
D. Understanding

Also susceptible to:
o Context: circumstances of the interview
o Social desirability: choosing answers that
are viewed favorably
Common Types of Questions
• Open-ended
What health conditions do you have?
• Closed-ended
Which of the following conditions do you currently
have? Say yes or no to each.
- Diabetes?
- Asthma?
- Hypertension?
Common Types of Questions
I- Response options
- Nominal – unordered response categories (e.g.
male, female)
- Ordinal – ordered response categories
(e.g. excellent, good, fair, poor)
II- Type of information
- Factual – objectively verifiable facts and
events
- Subjective – knowledge, perceptions,
feelings, judgment
Data collection Methods
I- Document Review
A qualitative (sometimes quantitative) research
project may require review of documents such
as:
– Course syllabi
– Faculty journals
– Meeting minutes
– Strategic plans
– Newspapers
Depending on the research question, the
researcher might utilize:
– Rating scale
– Checklist
– Content analysis
– Matrix analysis
II- Self-reported data

Self-reported data collection methods (survey)

Administration                       Computerized              Paper-based
Interviewer administration (human)   In person, telephone      In person, telephone
Self-administration                  Web, smartphone, tablet   Paper
Interactive voice response           Telephone, Web            Not applicable

Pros (computerized)
1- Faster data availability
2- Can handle complex skip patterns
3- Can be tailored to severity of symptoms or situation
(computerized adaptive testing)

Pros (interviewer administration)
1- Answer respondent questions, probe for adequate answers
2- Administer to illiterate/low reading level
3- Easier to reach poor, homeless, etc.
4- Build rapport
5- People feel more anonymous
6- Can use visual aids

Cons (computerized)
1- Data can get lost if system crashes
2- Requires power source

Cons (interviewer administration)
1- Expensive
2- Longer data collection period
3- Interviewer presence/technique can bias results
A-Personal Interview
Interviews consist of collecting data by asking questions.
• Data can be collected by listening to individuals, recording,
filming their responses, or a combination of methods.
There are four types of interview:
• Structured interview
• Semi-structured interview
• In-depth interview, and
• Focused group discussion
In structured interviews, the questions as well as
their order are fixed in advance.
• Your additional intervention consists of giving
more explanation to clarify your question (if
needed), and asking your respondent to provide
more explanation if the answer they provide is
vague (probing).

Semi-structured and in-depth interviews


• Semi-structured interviews include a number of
planned questions, but the interviewer has more
freedom to modify the wording and order of questions.
• In-depth interview is less formal and the least
structured, in which the wording and questions are not
predetermined. This type of interview is more
appropriate to collect complex information with a
higher proportion of opinion-based information.
Interview

Pros
• Collects complete information with greater understanding.
• It is more personal than questionnaires, with higher response rates.
• It allows more control over the order and flow of questions.
• Necessary changes in the interview schedule based on initial results can be
made (which is not possible in the case of a questionnaire study/survey).

Cons
• Data analysis is demanding, especially when there is a lot of qualitative data.
• Interviewing can be tiresome for large numbers of participants.
• Risk of bias is high due to fatigue and to becoming too involved with
interviewees.
B- Telephone Interview

Pros
1- Lower costs
2- Can ensure uniform data collection
3- Shorter data collection period
4- Cell phones are the best way to reach transient people

Cons
1- Omits persons without phones
2- Phone accessibility
3- Needs a complex statistical framework
4- Cannot use visual aids
5- Many of us do not answer our phone
2- Paper and pen, self-administered

Pros
1- Anonymity
2- Can use longer, more complex response categories
3- Can use visual aids
4- Consistent across respondents
5- Covers large geographic area
6- Length easy to see (plus or minus?)

Cons
1- Good reading and writing skills required
2- Cannot have complex skip patterns
3- No quality control
4- Similar cost and response rates to other methods
3- Web, smartphone-administered survey

Pros
1- Anonymity
2- Better for sensitive items
3- Timely data
4- Lower cost
5- Can use long list of response categories
6- Can use visual aids
7- Any time/location
8- Covers large geographic area
9- Can use complex skip patterns

Cons
1- Varying degrees of computer skills, access, connection speeds,
configurations
2- Challenge to verify informed consent
3- Concern about multiple responses from same person
4- Difficult to track non-responders
5- Could be a biased sample
Focus Group Discussion
• Focus group is a structured discussion with the purpose of stimulating
conversation around a specific topic.
• Focus group discussion is led by a facilitator who poses questions and
the participants give their thoughts and opinions.
• Focus group discussion gives us the possibility to cross check one
individual’s opinion with other opinions gathered.
• A well organized and facilitated FGD is more than a question and
answer session.
• In a group situation, members tend to be more open and the dynamics
within the group and interaction can enrich the quality and quantity of
information needed.
FGD: practical issues
The ideal size of the Focus groups:
• 8-10 participants
• 1 Facilitator
• 1 Note-taker
Preparation for the Focus Group
• Identifying the purpose of the discussion
• Identifying the participants
• Develop the questions
Running the Focus Group
1) Opening the Discussion
2) Managing the discussion
3) Closing the focus group
4) Follow-up after the focus group
III- Observation
OBSERVATION is a technique that involves systematically
selecting, watching and recording behavior and characteristics of
living beings, objects or phenomena.

• Without training, our observations will heavily reflect our
personal choices of what to focus on and what to remember.

• You need to heighten your sensitivity to details that you
would normally ignore and at the same time to be able to
focus on phenomena of true interest to your study.
Types of observation
Observation of human behavior
• Participant observation: The observer takes part in the
situation he or she observes
– Example: a doctor hospitalized with a broken hip, who now observes
hospital procedures ‘from within’

• Non-participant observation: The observer
watches the situation, openly or concealed,
but does not participate
Open
– (e.g., ‘shadowing’ a health worker with his/her permission
during routine activities)
Concealed
– (e.g., ‘mystery clients’ trying to obtain antibiotics without
medical prescription)
Observations of objects
– For example, the presence or absence of operating room
hand-washing facilities and their state of cleanliness
1. General observation may be used as the starting point
to become familiar with the setting and the new context.
2. Focused observation may be used to evaluate
whether people really do what they say they do.
3. Access the unspoken knowledge of subjects, that
is, the subconscious knowledge that they would not be
able to verbalize in an interview setting.
4. Compare a phenomenon and its specific
components in greater detail.
Dimensions of observation
1. Space (physical places)
2. Actors (people involved)
3. Activities (the set of related acts people do)
4. Object (the physical things that are present)
5. Time (the sequencing that takes place over time)
6. Goal (the things people are trying to accomplish)
7. Feeling (the emotions felt and expressed)
(Spradley 1979)
Mixed Methods in data collection

Integrating or combining qualitative and quantitative methods to
draw on strengths of each.
Reasons for using mixed methods
View problems from multiple perspectives
Contextualize information
Develop more complete understanding of problem
Challenges
Teamwork, resources, sample size, interpretation
Basic Mixed Methods Designs

Qualitative → Quantitative: qualitative used to develop outcome
measure or intervention
Quantitative → Qualitative: qualitative used to explain
quantitative outcomes in-depth
Concurrent: qualitative used to understand participants'
experiences with intervention/describe process
Effect of data collection methods on response

• Multiple methods increase response rates
• Aural vs. visual (interviewer vs. self response)
   - Aural responses are more positive
   - Aural respondents give more agreeable answers
• Questions are often tailored to mode
   - Yes/No popular with telephone; long list of check
   boxes popular for web
   - Long scales often used for self-administered; shorter
   scales for telephone
   - Vast array of visual/graphic choices available for
   computerized surveys
Techniques of data collection
(advantages and disadvantages)

Technique: Records and registries
Advantages
1. Inexpensive
2. Permit examination of past trends
Disadvantages
1. Not always accessible
2. Ethical issues
3. Incomplete and imprecise

Technique: Observation
Advantages
A. More detailed information
B. Facts not mentioned by questioning
C. Test reliability
Disadvantages
A. Ethical issues
B. Observer bias
C. Data collector may influence results
D. Needs training
Techniques of data collection
(advantages and disadvantages)

Technique: Personal interviewing
Advantages
I. Suitable for illiterates
II. Permits clarification
III. High response rate
Disadvantages
I. Interviewer may influence results
II. Less accurate recording than observation
III. Needs trained personnel

Technique: Self-administered questionnaire
Advantages
1. Less expensive
2. Permits anonymity
3. Less personnel
4. Eliminates bias
Disadvantages
1. Not suitable for illiterates
2. Low response rate
3. Problem of misunderstanding
Techniques of data collection
(advantages and disadvantages)

Technique: Focus group discussion
Advantages
Collection of in-depth information and exploration
Disadvantages
1. Interviewer may influence results
2. Open-ended questions
3. Domination
4. Non-response

Technique: Measuring scale
Advantages
o Precision
o Eliminates bias
Disadvantages
o Training
o Validity and accuracy
Psychometry
- Quantitative methods to statistically assess the
reliability and validity of survey instruments; also a
way to establish scoring mechanisms
- Enables users to combine a set of items and come up
with a single score(e.g. level of depression or physical
functioning)
To what extent does each item measure the underlying construct?

Classical Test Theory (old science)
- Requires the use of EVERY item in a set

Modern Test Theory (current standard)
- Focuses on the contribution of each individual item
- Differential item functioning (DIF) detects error in a set
related to subgroups of people; identifies items that introduce
bias
Computerized Adaptive Testing
Combines item response theory (IRT) and computer technology:
- Question selected based on person's response to previous questions
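The idea of choosing each question from the previous answers can be sketched as follows. This is a deliberately simplified illustration, not a production CAT engine: it assumes a one-parameter (Rasch) model, picks the unasked item whose difficulty is closest to the current ability estimate, and updates ability with a crude fixed step instead of a proper statistical estimate. All names and numbers are hypothetical.

```python
import math

def prob_correct(ability, difficulty):
    """Rasch model: probability of a positive/correct response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def next_item(ability, difficulties, asked):
    """Pick the unasked item whose difficulty is closest to the ability estimate."""
    candidates = [i for i in range(len(difficulties)) if i not in asked]
    return min(candidates, key=lambda i: abs(difficulties[i] - ability))

def run_cat(difficulties, answer_fn, n_questions, step=0.5):
    """Ask n_questions adaptively; return final ability estimate and item order."""
    ability, asked = 0.0, []
    for _ in range(n_questions):
        item = next_item(ability, difficulties, asked)
        asked.append(item)
        # Crude update (assumption): step up after a correct answer, down otherwise.
        ability += step if answer_fn(item) else -step
    return ability, asked

item_difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
# Hypothetical respondent who answers correctly when difficulty <= 0.8.
final_ability, order = run_cat(
    item_difficulties, lambda i: item_difficulties[i] <= 0.8, n_questions=3
)
print(order, final_ability)
```

Under the Rasch model an item is most informative when its difficulty matches the respondent's ability (where `prob_correct` is 0.5), which is why the closest-difficulty rule is a common starting point.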
Requirements of Measurement in Research
Definitions
CONSTRUCT: A theoretical concept
MEASUREMENT: A system of defining the level of a construct
OPERATIONAL DEFINITION: The method used for examining some domain
Examples
1. Depression
A. Hamilton Depression Rating Scale
B. Beck Depression Inventory
2. Tremor
A. Judge rated spirals
B. Computer evaluated spirals
3. Heart Disease
A. Cholesterol
B. C-Reactive Protein
Validity: How well does the measure reflect the construct?
1. Construct
   A. Face
   B. Content
2. Criterion-related
   A. Convergent
   B. Divergent

Reliability: Consistency of measurement
1. Internal Consistency
2. Inter-Rater
3. Test-Retest
Reliability

Reliability is defined as the extent to which a questionnaire,
test, observation or any measurement procedure produces
the same results on repeated trials.
In short, it is the stability or consistency of scores over time
or across raters.
Aspects of reliability: equivalence, stability and
internal consistency (homogeneity)

Equivalence: the amount of agreement between two or more
instruments that are administered at nearly the same point in time.
Measured through a parallel forms procedure in which one administers
alternative forms of the same measure to either the same group or a
different group of respondents.

The higher the degree of correlation between the two forms, the more
equivalent they are. Seldom implemented, as it is difficult to verify
that two tests are indeed parallel (i.e., have equal means, variances,
and correlations with other measures).

Equivalence is also demonstrated by assessing inter-rater reliability,
which refers to the consistency with which observers or raters make
judgments.
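One way to quantify inter-rater consistency for categorical judgments is sketched below. The slides do not name a statistic; percent agreement and Cohen's kappa (agreement corrected for chance) are used here as common choices, and the two raters' judgments are made-up illustration data.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Share of cases on which the two raters give the same judgment."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Agreement corrected for the agreement expected by chance.

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(r1)
    observed = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    expected = sum(c1[c] * c2[c] for c in set(r1) | set(r2)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical judgments by two raters on the same eight cases.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, kappa)
```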
Aspects of reliability: equivalence, stability and
internal consistency (homogeneity)

Stability occurs when the same or similar scores are obtained with
repeated testing with the same group of respondents. In other words, the
scores are consistent from one time to the next. Stability is assessed
through a test-retest procedure that involves administering the same
measurement instrument to the same individuals under the same conditions
after some period of time. Test-retest reliability is estimated with
correlations between the scores at Time 1 and those at Time 2.

Assumptions:
1- The characteristic that is measured does not change
over the time period.
2- The time period is long enough that the respondents'
memories of taking the test at Time 1 do not influence their
scores at the second test administration.
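The test-retest estimate described above can be sketched as a correlation between two administrations. The Pearson coefficient is assumed here as the correlation measure, and the scores are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores of the same five respondents at Time 1 and Time 2.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 14, 19, 21]

test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 2))
```

A value close to 1 indicates that respondents kept roughly the same relative position from one administration to the next.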
Aspects of reliability: equivalence, stability and
internal consistency (homogeneity)

Internal consistency: the extent to which items on the test or
instrument are measuring the same thing. If the individual items are
highly correlated with each other you can be highly confident in the
reliability of the entire scale. Internal consistency is estimated via
the split-half reliability index, the coefficient alpha (Cronbach, 1951)
index or the Kuder-Richardson formula 20 (KR-20) (Kuder & Richardson,
1937) index.

The split-half estimate entails dividing up the test into two
parts (e.g., odd/even items or first half of the items/second half of
the items), administering the two forms to the same group of
individuals and correlating the responses. Coefficient alpha
and KR-20 both represent

Specifically, coefficient alpha is used during scale development with
items that have several response options (i.e., 1 = strongly disagree to
5 = strongly agree), whereas KR-20 is used to estimate reliability for
dichotomous (i.e., Yes/No; True/False) response scales.

The more items you have in your scale to measure the
construct of interest, the more reliable your scale will
become. But at what cost?
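Coefficient alpha and KR-20 can be sketched with the standard textbook formulas, using population variances throughout. The response data below are invented, and the function names are illustrative only.

```python
import statistics

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_var = sum(statistics.pvariance(col) for col in zip(*rows))
    total_var = statistics.pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)

def kr20(rows):
    """KR-20 for dichotomous (0/1) items."""
    k, n = len(rows[0]), len(rows)
    pq = 0.0
    for col in zip(*rows):
        p = sum(col) / n          # proportion answering 1
        pq += p * (1 - p)
    total_var = statistics.pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - pq / total_var)

# Hypothetical data: five respondents, three items each.
likert = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1], [3, 3, 3]]
binary = [[1, 1, 1], [0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 0, 1]]

alpha_likert = cronbach_alpha(likert)
kr20_binary = kr20(binary)
print(round(alpha_likert, 2), round(kr20_binary, 2))
```

For 0/1 items the variance of an item equals p(1 - p), so `cronbach_alpha(binary)` gives the same value as `kr20(binary)`.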
Validity of an instrument
The extent to which the instrument
measures what it purports to
measure:
1- content validity,
2- face validity,
3- criterion-related validity (or
predictive validity),
4- construct validity,
5- factorial validity,
6- concurrent validity,
7- convergent validity and divergent

Content validity pertains to the degree to
which the instrument fully assesses or
measures the construct
of interest. The development of a content
valid instrument is typically achieved by a
rational analysis of the instrument by
raters (ideally 3 to 5) familiar with the
construct of interest.
Specifically, raters will review all of the
items for readability, clarity and
comprehensiveness.
Validity of an instrument
Face validity is a component of
content validity and is established
when an individual reviewing the
instrument concludes that it
measures the characteristic or trait
of interest.
Validity of an instrument
Criterion-related validity is assessed when one is interested in
determining the relationship of
scores on a test to a specific
criterion.
Validity of an instrument
Construct validity is the degree to which an instrument
measures the trait or theoretical
construct that it is intended to
measure. Construct validity is very
much an ongoing process as one
refines a theory, if necessary, in
order to make predictions about
test scores in various settings and
situations.
Thank you
