Foundation Course on Research Methods
Objective: The aim of the course is to create a background and awareness of the nature of the
research process and inquiry. It will expose the student to the methodological problems
encountered in interdisciplinary research. The course will provide a comprehension of basic
principles of research design and strategy, including an understanding of how to formulate and
execute researchable problems.
UNIT-1
Introduction: Research – definition, scope and objectives, types, approaches, significance;
scientific investigation. The research process – the broad problem area, preliminary data
collection, problem selection and definition, theoretical framework, hypothesis development
and elements of research design. Experimental design – the laboratory experiment, variables,
validity, types of experimental designs.
UNIT-2
Data measurement, collection, processing and analysis
Measurement – measurement in research, operational definition, measurement scales, scaling,
scaling techniques, reliability and validity.
Data collection – sources of data; data collection methods: interviewing, questionnaires, other
methods of data collection. Sampling: Introduction - Need and purpose of sampling,
population and sample, population frame, sampling with and without replacement, population
parameters.
Sampling theory – sampling distributions, parameter estimation, hypothesis testing. Sampling
designs – probability and non-probability sampling
UNIT-3
Data processing and analysis – review of statistical data analysis, hypothesis formulation and
testing, parametric and non-parametric tests
UNIT-4
Interpretation and Report Writing
Interpretation – meaning, need, technique.
Report writing – the research proposal, the report, integral parts of the report, steps involved in
report writing, types of reports, oral presentation, conclusions.
Reading List
1. Singleton, R.A. Jr., and Straits, B.C. (1999). Approaches to Social Research. Oxford
University Press, New York.
2. Moore, D.S. (1999). The Basic Practice of Statistics. W.H. Freeman, New York.
3. De Vaus, D.A. (1995). Surveys in Social Research. Allen & Unwin, Sydney, NSW.
4. Foddy, W. (1994). Constructing Questions for Interviews and Questionnaires. Cambridge
University Press, Cambridge.
5. Scarbrough, E., and Tanenbaum, E. (1998). Research Strategies in the Social Sciences.
Oxford University Press, Oxford.
UNIT I
INTRODUCTION TO RESEARCH METHODS
Research can be defined as the systematic and objective identification, collection, analysis, and
dissemination of information for the purpose of assisting management in decision making
related to the identification and solution of problems (and opportunities).
1. Identification: involves defining the research problem (or opportunity) and determining
the information that is needed to address it.
2. Collection: data must be obtained from relevant sources.
3. Analysis: data are analyzed, interpreted, and inferences are drawn.
4. Dissemination of information: the findings, implications, and recommendations are
provided in a format that makes this information actionable and directly useful as an input into
decision making.
Types of research
1. Problem identification research. The goal is to identify existing or potential problems not
apparent on the surface. Examples include market potential, market share, market
characteristics, sales analysis, short-range forecasting, long-range forecasting, and business
trends research.
2. Problem solution research. The goal is to solve specific marketing problems such as
segmentation, product, pricing, promotion, and distribution research.
Steps of the research process
1. Problem definition: defining the research problem to be addressed is the most important
step because all other steps will be based on this definition.
2. Developing an approach to the problem: development of a broad specification of how
the problem will be addressed allows the researcher to break the problem into salient issues and
manageable pieces.
3. Research design formulation: a framework for conducting the research project that
specifies the procedures necessary for obtaining the required information. It details the
statistical methodology needed to solve the problem and thus the data requirements needed
from data collection.
4. Fieldwork or data collection: a field force (personal interviewing, phone, mail, or
electronic surveys) gathers project data. While seemingly trivial in nature, to obtain meaningful
results field workers must be accurate and thorough in data collection.
5. Data preparation and analysis: the editing, coding, transcription, and verification of data
allow researchers to derive meaning from the data.
6. Report preparation and presentation: the findings are communicated to the client. The
report should address the specific research questions identified in the problem definition;
describe the approach, the research design, data collection, and the data analysis procedures
adopted; and present the results and major findings.
The management decision problem asks what the decision maker needs to do, whereas the
research problem entails determining what information is needed and how it can be obtained in
the most feasible way.
Management decision problems and the corresponding research problems:
1. Management decision problem: Should the price be cut in response to a price cut by a
competitor?
Research problem: Determine the buyer behavior at various price levels.
2. Management decision problem: Should product 'X' be introduced in the market?
Research problem: Assess the probable market size and share for product 'X'.
3. Management decision problem: What should be done to increase the relative market share
of product 'Y'?
Research problem: Determine the strengths and weaknesses of 'Y' vis-à-vis those of the
competitors.
RESEARCH DESIGN
Research design may be defined as a framework or blueprint for conducting the research
project. It specifies the precise details of the procedures necessary for obtaining the required
information. It is important to have a good research design in order to ensure that the project
is conducted effectively and efficiently.
Exploratory research is used in situations where the problem may have to be defined
more precisely, relevant courses of action identified, hypotheses formulated, or additional
insights gained before an approach can be developed. Conclusive research would be used to test
specific hypotheses, examine specific relationships, or make predictions.
While both descriptive and causal research are classified as conclusive research, they differ in
terms of their objectives. Descriptive research is used to describe something, usually market
characteristics or functions. Causal research is used to obtain evidence regarding cause-and-
effect relationships.
Exploratory research is typically used to provide structure and insight into the research
problem. For example, using focus groups to determine key factors related to the use of your
product.
Descriptive research begins with the structure already defined and proceeds to actual
data collection in order to describe some market variable. For example, determining the average
age of purchasers of your product. Causal research also proceeds from a pre-established
structure but attempts to infer causal relationships between variables as opposed to describing
variables. For example, determining if increased advertising spending has led to an increase in
sales.
Descriptive research is marked by the prior formulation of specific hypotheses, and the design
requires a clear specification of the six Ws of the research:
1. Who: who should be considered?
2. Where: where should the respondents be contacted to obtain the required information?
3. When: when should the information be obtained from the respondents?
4. What: what information should be obtained from the respondents?
5. Why: why are we obtaining information from the respondents?
6. Way: the way in which we are going to obtain information from the respondents.
Causal research is appropriate to use when the purposes are to understand which variables
are the cause and which variables are the effect, and to determine the nature of the functional
relationship between the causal variables and the effect to be predicted.
Descriptive research can be classified into cross-sectional and longitudinal designs.
Cross-sectional designs involve the collection of information from a sample population at
a single point in time, whereas in a longitudinal design a fixed sample population is measured
repeatedly over several points in time. Longitudinal data tend to be more informative than
cross-sectional data because they measure changes in behavior, allow large amounts of data to
be collected, and are more accurate.
B. Depth Interview
One-on-one interviews that probe and elicit detailed answers to questions, often using
nondirective techniques to uncover hidden motivations.
Advantages
– No group pressure
– Respondent is focus of attention and feels important
– Respondent is highly aware and active
– Long time period encourages revealing new information
– Can probe to reveal feelings and motivations
– Discussion is flexible and can explore tangential issues
Disadvantages
– Much more expensive than focus groups
– Do not get the same degree of client involvement; clients do not want to observe single
interviews
– Are physically exhausting for the moderator, which reduces the number of people that can
be interviewed in a given time period.
– Moderators do not have other group members to help obtain reactions and to stimulate
discussion.
C. Focus Group
Goal of focus group research: to learn and understand what people have to say and why; to
find out how participants feel about a product, concept, idea, or organization; how it fits into
their lives; and their emotional involvement with it.
• May be conducted alone or as part of a broader project
• May be used to define issues or to confirm findings from survey research
• Independent variables are variables or alternatives that are manipulated and whose effects
are measured and compared, e.g., price levels.
• Test units are individuals, organizations, or other entities whose response to the
independent variables or treatments is being examined, e.g., consumers or stores.
• Dependent variables are the variables which measure the effect of the independent
variables on the test units, e.g., sales, profits, and market shares.
• Extraneous variables are all variables other than the independent variables that affect the
response of the test units, e.g., store size, store location, and competitive effort.
Validity in Experimentation
Internal validity refers to whether the manipulation of the
independent variables or treatments actually caused the observed effects on the dependent
variables. Control of extraneous variables is a necessary condition for establishing internal
validity. External validity refers to whether the cause-and-effect relationships found in the
experiment can be generalized. To what populations, settings, times, independent variables and
dependent variables can the results be projected?
UNIT 2
Measurement and Scaling
Measurement means assigning numbers or other symbols to characteristics of objects
according to certain pre-specified rules. Scaling involves creating a continuum upon which
measured objects are located.
Nominal Scale
• The numbers serve only as labels or tags for identifying and classifying objects.
• When used for identification, there is a strict one-to-one correspondence between the
numbers and the objects.
• The numbers do not reflect the amount of the characteristic possessed by the objects.
• The only permissible operation on the numbers in a nominal scale is counting.
• Only a limited number of statistics, all of which are based on frequency counts, are
permissible, e.g., percentages, and mode.
Ordinal Scale
• A ranking scale in which numbers are assigned to objects to indicate the relative extent to
which the objects possess some characteristic.
• Can determine whether an object has more or less of a characteristic than some other
object, but not how much more or less.
• Any series of numbers can be assigned that preserves the ordered relationships between
the objects.
• In addition to the counting operation allowable for nominal scale data, ordinal scales
permit the use of statistics based on centiles, e.g., percentile, quartile, median.
Interval Scale
• Numerically equal distances on the scale represent equal values in the characteristic being
measured.
• It permits comparison of the differences between objects.
• The location of the zero point is not fixed. Both the zero point and the units of
measurement are arbitrary.
• Any positive linear transformation of the form y = a + bx will preserve the properties of
the scale.
• It is not meaningful to take ratios of scale values.
• Statistical techniques that may be used include all of those that can be applied to nominal
and ordinal data, and in addition the arithmetic mean, standard deviation, and other statistics
commonly used in research.
Ratio Scale
• Possesses all the properties of the nominal, ordinal, and interval scales.
• It has an absolute zero point.
• It is meaningful to compute ratios of scale values.
• Only proportionate transformations of the form y = bx, where b is a positive constant, are
allowed.
• All statistical techniques can be applied to ratio data.
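The interval-scale caveat above (ratios are not meaningful under a transformation y = a + bx) can be illustrated with temperature scales; the sketch below is not from the original text:

```python
# Fahrenheit is a positive linear transformation of Celsius: y = 32 + 1.8x,
# so both are interval scales (the zero point is arbitrary).
c1, c2 = 10.0, 20.0
f1, f2 = 32 + 1.8 * c1, 32 + 1.8 * c2   # 50 F and 68 F

print(c2 / c1)        # 2.0 -- but 20 C is not "twice as hot" as 10 C...
print(f2 / f1)        # ...because the same ratio in Fahrenheit is 1.36
print(f2 - f1)        # differences, however, are preserved: 18 = 1.8 * 10
```

The ratio changes under the transformation, while differences scale consistently, which is exactly why means and standard deviations are permissible on interval data but ratios of scale values are not.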
Comparative scales involve the direct comparison of stimulus objects. Comparative scale data
must be interpreted in relative terms and have only ordinal or rank order properties.
In noncomparative scales, each object is scaled independently of the others in the stimulus set.
The resulting data are generally assumed to be interval or ratio scaled.
Relative Advantages of Comparative Scales
• Small differences between stimulus objects can be detected.
• Same known reference points for all respondents.
• Easily understood and applied.
• Tend to reduce halo or carryover effects from one judgment to another.
Paired Comparison Scaling: A respondent is presented with two objects and asked to select
one according to some criterion. The data obtained are ordinal in nature. Paired comparison
scaling is the most widely used comparative scaling technique. With n objects, n(n - 1)/2
paired comparisons are required. Under the assumption of transitivity, it is possible to convert
paired comparison data to a rank order.
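As a sketch of the conversion just described, pairwise preference counts can be tallied into a rank order; the brands and preferences below are invented for illustration:

```python
# Hypothetical paired-comparison data for 4 brands: each pair appears once,
# mapped to the brand the respondent preferred.
brands = ["A", "B", "C", "D"]
preferred = {("A", "B"): "A", ("A", "C"): "A", ("A", "D"): "D",
             ("B", "C"): "B", ("B", "D"): "D", ("C", "D"): "D"}

# With n objects, n(n - 1)/2 paired comparisons are required.
n = len(brands)
assert len(preferred) == n * (n - 1) // 2

# Under transitivity, counting wins converts the paired data to a rank order.
wins = {b: 0 for b in brands}
for winner in preferred.values():
    wins[winner] += 1

ranking = sorted(brands, key=lambda b: wins[b], reverse=True)
print(ranking)   # ['D', 'A', 'B', 'C'] -- D won 3 pairs, A won 2, B 1, C 0
```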
Rank Order Scaling: The respondents are required to rank various objects on the basis of
their preferences.
Constant Sum Scaling: Respondents allocate a constant sum of units, such as 100 points to
attributes of a product to reflect their importance.
If an attribute is unimportant, the respondent assigns it zero points. If an attribute is twice as
important as some other attribute, it receives twice as many points. The sum of all the points is
100. Hence, the name of the scale.
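A minimal sketch of how constant sum allocations translate into relative importance weights; the attribute names and point values are made up:

```python
# 100 points allocated across attributes of a product (hypothetical data).
allocation = {"price": 40, "quality": 40, "design": 20}
assert sum(allocation.values()) == 100   # the defining constraint of the scale

# Points convert directly into relative importance weights.
weights = {attr: pts / 100 for attr, pts in allocation.items()}
ratio = weights["price"] / weights["design"]
print(ratio)   # price received twice as many points, so it is twice as important
```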
Continuous Rating Scale: Respondents rate the objects by placing a mark at the appropriate
position on a line that runs from one extreme of the criterion variable to the other. The form of
the continuous scale may vary considerably.
Itemized Rating Scales: The respondents are provided with a scale that has a number or brief
description associated with each category. The categories are ordered in terms of scale position,
and the respondents are required to select the specified category that best describes the object
being rated. The commonly used itemized rating scales are the Likert, semantic differential, and
Stapel scales.
Likert Scale: The Likert scale requires the respondents to indicate a degree of agreement or
disagreement with each of a series of statements about the stimulus objects.
The semantic differential is a seven-point rating scale with end points associated with bipolar
labels that have semantic meaning. The Stapel scale is a unipolar rating scale with ten
categories numbered from -5 to +5, without a neutral point (zero). This scale is usually
presented vertically.
Measurement Accuracy
The true score model provides a framework for understanding the accuracy of measurement.
XO = XT + XS + XR
Where
XO = the observed score or measurement
XT = the true score of the characteristic
XS = systematic error
XR = random error
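A small simulation of the true score model, with assumed values for the true score and errors, shows why random error averages out over repeated measurements while systematic error does not:

```python
import random

random.seed(1)                       # reproducible illustration
XT = 70                              # true score (assumed)
XS = 3                               # systematic error: a constant bias

# Observed score X_O = X_T + X_S + X_R, with random error X_R ~ N(0, 2)
observed = [XT + XS + random.gauss(0, 2) for _ in range(10_000)]

# Across many measurements, the random error averages toward zero,
# so the remaining deviation from the true score is the systematic error.
bias = sum(observed) / len(observed) - XT
print(round(bias, 1))                # close to XS = 3
```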
Reliability can be defined as the extent to which measures are free from random error,
XR. If XR = 0, the measure is perfectly reliable. In test-retest reliability, respondents are
administered identical sets of scale items at two different times and the degree of similarity
between the two measurements is determined. In alternative-forms reliability, two equivalent
forms of the scale are constructed and the same respondents are measured at two different
times, with a different form being used each time. Internal consistency reliability determines
the extent to which different parts of a summated scale are consistent in what they indicate
about the characteristic being measured. In split-half reliability, the items on the scale are
divided into two halves and the resulting half scores are correlated. The coefficient alpha, or
Cronbach's alpha, is the average of all possible split-half coefficients resulting from different
ways of splitting the scale items. This coefficient varies from 0 to 1, and a value of 0.6 or less
generally indicates unsatisfactory internal consistency reliability.
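Coefficient alpha can be computed directly from its variance-based definition. The Likert-style responses below are invented for illustration:

```python
from statistics import variance

# Rows = respondents, columns = items of a summated (Likert-type) scale.
scores = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]

k = len(scores[0])                       # number of items
items = list(zip(*scores))               # scores grouped by item
totals = [sum(row) for row in scores]    # summated scale score per respondent

# Cronbach's alpha = (k / (k - 1)) * (1 - sum of item variances / total variance)
alpha = (k / (k - 1)) * (1 - sum(variance(i) for i in items) / variance(totals))
print(round(alpha, 3))                   # well above the 0.6 threshold
```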
Validity: The validity of a scale may be defined as the extent to which differences in observed
scale scores reflect true differences among objects on the characteristic being measured, rather
than systematic or random error. Perfect validity requires that there be no measurement error
(XO = XT, XR = 0, XS = 0).Content validity is a subjective but systematic evaluation of how
well the content of a scale represents the measurement task at hand. Criterion validity reflects
whether a scale performs as expected in relation to other variables selected (criterion variables)
as meaningful criteria.
SAMPLING DESIGN
An element is the object about which or from which the information is desired, e.g., the
respondent. A sampling unit is an element, or a unit containing the element, that is available
for selection at some stage of the sampling process.
– Extent refers to the geographical boundaries of the target population.
– Time is the time period under consideration.
Convenience Sampling
Convenience sampling attempts to obtain a sample of convenient elements. Often, respondents
are selected because they happen to be in the right place at the right time.
– use of students, and members of social organizations
– mall intercept interviews without qualifying the respondents
– department stores using charge account lists
– “people on the street” interviews
Judgmental sampling is a form of convenience sampling in which the population elements are
selected based on the judgment of the researcher.
– test markets
– purchase engineers selected in industrial research
– bellwether precincts selected in voting behavior research
– expert witnesses used in court
Quota sampling may be viewed as two-stage restricted judgmental sampling.
The first stage consists of developing control categories, or quotas, of population elements.
In the second stage, sample elements are selected based on convenience or judgment.
Systematic Sampling: The sample is chosen by selecting a random starting point and then
picking every ith element in succession from the sampling frame. The sampling interval, i, is
determined by dividing the population size N by the sample size n and rounding to the nearest
integer. When the ordering of the elements is related to the characteristic of interest, systematic
sampling increases the representativeness of the sample. If the ordering of the elements
produces a cyclical pattern, systematic sampling may decrease the representativeness of the
sample. For example, there are 100,000 elements in the population and a sample of 1,000 is
desired. In this case the sampling interval, i, is 100. A random number between 1 and 100 is
selected. If, for example, this number is 23, the sample consists of elements 23, 123, 223, 323,
423, 523, and so on.
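The numerical example above (N = 100,000, n = 1,000, interval i = 100) can be sketched as:

```python
import random

N, n = 100_000, 1_000
i = N // n                           # sampling interval = 100

random.seed(0)                       # for a reproducible illustration
start = random.randint(1, i)         # random starting point between 1 and i
sample = list(range(start, N + 1, i))

print(len(sample))                   # 1000 elements: start, start+100, ...
```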
Stratified Sampling
A two-step process in which the population is partitioned into subpopulations, or strata. The
strata should be mutually exclusive and collectively exhaustive in that every population
element should be assigned to one and only one stratum and no population elements should be
omitted. Next, elements are selected from each stratum by a random procedure, usually SRS.
A major objective of stratified sampling is to increase precision without increasing cost.
The elements within a stratum should be as homogeneous as possible, but the elements in
different strata should be as heterogeneous as possible. The stratification variables should also
be closely related to the characteristic of interest.
Finally, the variables should decrease the cost of the stratification process by being easy to
measure and apply.
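A proportionate stratified sample (SRS within each stratum, with stratum sample sizes proportional to stratum population sizes) might be sketched as follows; the strata and their sizes are assumed for illustration:

```python
import random

random.seed(7)
strata = {
    "urban": [("urban", i) for i in range(600)],   # 600 population elements
    "rural": [("rural", i) for i in range(400)],   # 400 population elements
}
n = 100                              # overall sample size
N = sum(len(s) for s in strata.values())

sample = []
for name, elements in strata.items():
    k = round(n * len(elements) / N)               # proportionate: 60 and 40
    sample.extend(random.sample(elements, k))      # SRS within the stratum

print(len(sample))                   # 100
```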
Cluster Sampling: The target population is first divided into mutually exclusive and
collectively exhaustive subpopulations, or clusters. then a random sample of clusters is
selected, based on a probability sampling technique such as SRS. For each selected cluster,
either all the elements are included in the sample (one-stage) or a sample of elements is drawn
probabilistically (two-stage). elements within a cluster should be as heterogeneous as possible,
but clusters themselves should be as homogeneous as possible. Ideally, each cluster should be
a small-scale representation of the population.
In probability proportionate to size sampling, the clusters are sampled with probability
proportional to size. In the second stage, the probability of selecting a sampling unit in a
selected cluster varies inversely with the size of the cluster.
Finite Population Correction: The finite population correction (fpc) is a correction for
overestimation of the variance of a population parameter, e.g., a mean or proportion, when the
sample size is 10% or more of the population size.
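The correction itself is the factor sqrt((N - n) / (N - 1)) applied to the standard error; a quick sketch with assumed numbers:

```python
import math

N, n, sigma = 1_000, 200, 50         # assumed values; n/N = 20%, so fpc applies

fpc = math.sqrt((N - n) / (N - 1))   # finite population correction factor
se_uncorrected = sigma / math.sqrt(n)
se = se_uncorrected * fpc            # corrected (smaller) standard error

print(round(fpc, 3), round(se, 3))   # the fpc shrinks the estimated variance
```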
Precision level: When estimating a population parameter by using a sample statistic, the
precision level is the desired size of the estimating interval. This is the maximum permissible
difference between the sample statistic and the population parameter.
Confidence interval: The confidence interval is the range into which the true population
parameter will fall, assuming a given level of confidence.
Confidence level: The confidence level is the probability that a confidence interval will include
the population parameter.
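For a mean with known standard error, the interval is the sample mean plus or minus z times the standard error; a sketch with assumed numbers at a 95% confidence level (z = 1.96):

```python
import math

xbar, s, n = 1600, 150, 400          # assumed sample mean, SD, and size
z = 1.96                             # z value for a 95% confidence level

se = s / math.sqrt(n)                # standard error of the mean = 7.5
lower, upper = xbar - z * se, xbar + z * se
print(lower, upper)                  # roughly 1585.3 to 1614.7
```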
Population parameters and their corresponding sample statistics are conventionally denoted:
Proportion: population π, sample p
Variance: population σ², sample s²
Standard deviation: population σ, sample s
Mean: population μ, sample x̄
Size: population N, sample n
Skewness of a Distribution
[Figure: (a) a symmetric distribution and (b) a skewed distribution, with the positions of the
mean, median, and mode marked.]
Parametric tests assume that the variables of interest are measured on at least an interval scale.
Nonparametric tests assume that the variables are measured on a nominal or ordinal scale.
These tests can be further classified based on whether one or two or more samples are involved.
The samples are independent if they are drawn randomly from different populations. For the
purpose of analysis, data pertaining to different groups of respondents, e.g., males and females,
are generally treated as independent samples. The samples are paired when the data for the two
samples relate to the same group of respondents.
Z-Test example: The mean life of a sample of 400 fluorescent light bulbs produced by a
company is found to be 1,600 hours with a standard deviation of 150 hours. Test the
hypothesis that the mean life of bulbs in general is higher than the mean life of 1,570 hours at
the 0.01 level of significance.
Standard error = 150/√400 = 7.5
Z = (1600 - 1570)/7.5 = 30/7.5 = 4
Critical value = 1570 + 2.33 × 7.5 ≈ 1587.5
Since the sample mean (1600) exceeds the critical value, the null hypothesis is rejected.
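The arithmetic of this Z-test can be checked directly (2.33 is the one-tailed critical z at the 0.01 level, from a normal table):

```python
import math

n, xbar, s, mu0 = 400, 1600, 150, 1570
z_crit = 2.33                        # one-tailed critical z, alpha = 0.01

se = s / math.sqrt(n)                # 150 / 20 = 7.5
z = (xbar - mu0) / se                # 30 / 7.5 = 4.0
cv = mu0 + z_crit * se               # critical sample mean, about 1587.5

print(z, xbar > cv)                  # z = 4 > 2.33: reject the null hypothesis
```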
The t statistic assumes that the variable is normally distributed, the mean is known (or
assumed to be known), and the population variance is estimated from the sample. Assume that
the random variable X is normally distributed, with mean μ and unknown population variance
that is estimated by the sample variance s². Then t = (x̄ - μ)/(s/√n) is distributed as t with
n - 1 degrees of freedom.
The t distribution is similar to the normal distribution in appearance. Both distributions are
bell-shaped and symmetric. As the number of degrees of freedom increases, the t distribution
approaches the normal distribution.
The test was developed by W.S. Gosset, who wrote under the pseudonym "Student"; hence it
is known as Student's t test. It is preferable when n < 60, and certainly when n < 30. When
you have scores from a particular group of people and you want to estimate what the variance
would be for people in general who are like the ones you have scores from, use (n - 1). To
calculate the variance of a sample, when estimating the variance of its population, use (n - 1)
in order to provide an unbiased estimate of the population variance.
Degrees of Freedom: The number you divide by (the number of scores minus 1) to get the
estimated population variance is called the degrees of freedom. The degrees of freedom is the
number of scores in a sample that are "free to vary".
Imagine a very simple situation in which the individual scores that make up a distribution are
3, 4, 5, 6, and 7. If you are asked to tell what the first score is without having seen it, the best
you could do is a wild guess, because the first score could be any number. If you are told the
first score (3) and then asked to give the second, it too could be any number. The same is true
of the third and fourth scores; each of them has complete "freedom" to vary. But if you know
those first four scores (3, 4, 5, and 6) and you know the mean of the distribution (5), then the
last score can only be 7. If, instead of the mean and 3, 4, 5, and 6, you were given the mean
and 3, 5, 6, and 7, the missing score could only be 4.
In the t test, because the known sample mean is used to replace the unknown population mean
in calculating the estimated standard deviation, one degree of freedom is lost. For each
parameter you estimate, you lose one degree of freedom.
Degrees of freedom is a measure of how much precision an estimate of variation has. A
general rule is that the degrees of freedom decrease when you have to estimate more
parameters.
Type I Error occurs when the sample results lead to the rejection of the null hypothesis when
it is in fact true. The probability of Type I error (α) is also called the level of significance.
Type II Error occurs when, based on the sample results, the null hypothesis is not rejected
when it is in fact false. The probability of Type II error is denoted by β. Unlike α, which is
specified by the researcher, the magnitude of β depends on the actual value of the population
parameter (proportion). The power of a test is the probability (1 - β) of rejecting the null
hypothesis when it is false and should be rejected. Although β is unknown, it is related to α.
An extremely low value of α (e.g., α = 0.001) will result in intolerably high β errors. So it is
necessary to balance the two types of errors.
Example: Odometers measure automobile mileage. How close to the truth is the number that is
registered? Suppose 12 cars travel exactly 10 miles (measured beforehand) and the following
mileage figures were recorded by the odometers:
9.8, 10.1, 10.3, 10.2, 9.9, 10.4, 10.0, 9.9, 10.3, 10.0, 10.1, 10.2
Using the .01 level of significance, determine if you can trust your odometer.
State the research question.
– Are odometers accurate?
State the statistical hypotheses.
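Working the odometer example through (3.106 is the two-tailed critical t for df = 11 at the .01 level, from a t table):

```python
from statistics import mean, stdev
import math

readings = [9.8, 10.1, 10.3, 10.2, 9.9, 10.4, 10.0, 9.9, 10.3, 10.0, 10.1, 10.2]
mu0 = 10.0                           # the trips were exactly 10 miles

n = len(readings)
xbar = mean(readings)                # sample mean = 10.1
se = stdev(readings) / math.sqrt(n)  # estimated standard error
t = (xbar - mu0) / se                # about 1.86

t_crit = 3.106                       # two-tailed critical t, df = 11, alpha = .01
print(round(t, 2), abs(t) > t_crit)  # |t| < 3.106: fail to reject H0
```

Since the computed t does not exceed the critical value, at the .01 level there is no evidence that the odometers are biased.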
You can estimate a population mean based on confidence intervals rather than statistical
hypothesis tests.
A confidence interval is an interval of a certain width, which we feel “confident” will contain
the population mean. You are not determining whether the sample mean differs significantly
from the population mean. Instead, you are estimating the population mean based on knowing
the sample mean. If the primary concern is whether an effect is present, use a hypothesis test.
You should consider using a confidence interval whenever a hypothesis test leads you to reject
the null hypothesis, in order to determine the possible size of the effect.
The t test for a single sample is used when you know the population mean but not its
variance, and you have a single sample of scores. In most research, you do not even know the
population's mean; and, in most research situations, you have not one but two sets of scores.
For a single sample:
t = (X̄ - μ_hyp) / s_X̄
For paired samples, where D is the difference between paired scores:
t = (D̄ - D̄_hyp) / s_D̄
In one large factory, 100 employees were judged to be highly successful and another 100
marginally successful. All workers were asked, “Which do you find more important to you
personally, the money you are able to take home or the satisfaction you feel from doing the
job?” In the first group, 49% found the money more important, but in the second group 53%
responded that way. Test the null hypothesis that job performance and job motivation are
independent using the .01 level of significance.
State the research hypothesis.
– Are job performance and job motivation independent?
State the statistical hypotheses.
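This can be worked as a 2×2 chi-square test of independence (6.635 is the critical value for df = 1 at the .01 level, from a chi-square table):

```python
# Observed counts: rows = performance group, columns = (money, satisfaction).
observed = [[49, 51],    # highly successful
            [53, 47]]    # marginally successful

row = [sum(r) for r in observed]
col = [sum(c) for c in zip(*observed)]
total = sum(row)

# Chi-square = sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi2 = sum((observed[i][j] - row[i] * col[j] / total) ** 2
           / (row[i] * col[j] / total)
           for i in range(2) for j in range(2))

chi2_crit = 6.635                    # df = 1, alpha = .01 (table value)
print(round(chi2, 2), chi2 > chi2_crit)   # 0.32 < 6.635: fail to reject
```

The small chi-square value means the 49% vs. 53% difference is well within chance variation, so independence of job performance and job motivation is not rejected at the .01 level.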
• Road accidents in a country vary from day to day. In a sample, the following information
was obtained:
Readers. A report should be written for a specific reader or readers: the marketing managers
who will use the results.
Easy to follow. The report should be easy to follow. It should be structured logically and
written clearly.
Presentable and professional appearance. The looks of a report are important.
Objective. Objectivity is a virtue that should guide report writing. The rule is, "Tell it like it
is."
Reinforce text with tables and graphs. It is important to reinforce key information in the text
with tables, graphs, pictures, maps, and other visual devices.
Terse. A report should be terse and concise. Yet, brevity should not be achieved at the expense
of completeness.
A line chart connects a series of data points using continuous lines. This is an attractive
way of illustrating trends and changes over time. Several series can be compared on the same
chart, and forecasts, interpolations, and extrapolations can be shown. A stratum chart is a set of
line charts in which the data are successively aggregated over the series. Areas between the line
charts display the magnitudes of the relevant variables.
Oral Presentation
The key to an effective presentation is preparation. A written script or detailed outline should
be prepared following the format of the written report. The presentation must be geared to the
audience. The presentation should be rehearsed several times before it is made to the
management. Visual aids, such as tables and graphs, should be displayed with a variety of
media. It is important to maintain eye contact and interact with the audience during the
presentation.