Research Methodology
A sample design is a definite plan for obtaining a sample from a given population. It refers to the technique or procedure the researcher adopts in selecting items for the sample. A sample design also specifies the number of items to be included in the sample, i.e., the size of the sample. Hence, the sample design is determined before the collection of data. Among the various sample design techniques, the researcher should choose one that is reliable and appropriate for the research study.
A good sample is one which satisfies all or most of the following conditions:
A well-defined population reduces the probability of including respondents who do not fit the research objective of the company. For example, if the population is defined as all women above the age of 20, the researcher may end up taking the opinions of a large number of women who cannot afford to buy a microwave oven.
Once the definition of the population is clear, the researcher should decide on the sampling frame. A sampling frame is the list of elements from which the sample may be drawn. Continuing with the microwave oven example, an ideal sampling frame would be a database that contains all the households that have a monthly income above Rs. 20,000. However, in practice it is difficult to get an exhaustive sampling frame that exactly fits the requirements of a particular research study. In general, researchers use easily available sampling frames like telephone directories and lists of credit card and mobile phone users. Various private players provide databases developed along various demographic and economic variables. Sometimes, maps and aerial pictures are also used as sampling frames. Whatever the case, an ideal sampling frame is one that covers the entire population and lists the names of its elements only once.
A sampling frame error occurs when the sampling frame does not accurately represent the total population or when some elements of the population are missing. Another drawback in the sampling frame is over-representation. A telephone directory can be over-represented by names/households that have two or more connections.
A sampling unit is a basic unit that contains a single element or a group of elements of the population to be sampled. In this case, a household becomes the sampling unit and all women above the age of 20 years living in that particular house become the sampling elements. If it is possible to identify the exact target audience of the business research, every individual element would be a sampling unit. This would be a case of a primary sampling unit. However, a more convenient means of sampling would be to select households as the sampling unit and interview all females above 20 years who cook. This would be a case of a secondary sampling unit.
The sampling method outlines the way in which the sample units are to be selected. The choice
of the sampling method is influenced by the objectives of the business research, availability of
financial resources, time constraints, and the nature of the problem to be investigated. All
sampling methods can be grouped under two distinct heads, that is, probability and non-
probability sampling.
The sample size plays a crucial role in the sampling process. There are various ways of classifying the techniques used in determining the sample size. Two distinctions that hold primary importance are whether the technique deals with fixed or sequential sampling and whether its logic is based on traditional or Bayesian methods. In non-probability sampling procedures, the allocation of budget, rules of thumb, the number of subgroups to be analysed, the importance of the decision, the number of variables, the nature of the analysis, incidence rates, and completion rates play a major role in sample size determination. In the case of probability sampling, however, formulas are used to calculate the sample size after the levels of acceptable error and confidence are specified. The details of the various techniques used to determine the sample size will be explained at the end of the chapter.
In this step, the specifications and decisions regarding the implementation of the research process are outlined. Suppose blocks in a city are the sampling units and households are the sampling elements. This step outlines the modus operandi of the sampling plan in identifying houses based on specified characteristics. It covers issues such as how the interviewer is to take a systematic sample of houses, what the interviewer should do when a house is vacant, and what the recontact procedure is for respondents who were unavailable. All these and many other questions need to be answered for the smooth functioning of the research process. These are guidelines that help the researcher at every step of the process. As the interviewers and their co-workers will be on field duty most of the time, a proper specification of the sampling plan makes their work easier, and they will not have to revert to their seniors when faced with operational problems.
This is the final step in the sampling process, where the actual selection of the sample elements is carried out. At this stage, it is necessary that the interviewers stick to the rules outlined for the smooth implementation of the business research. This step involves implementing the sampling plan to select the sample required for the survey.
First, you need to understand the difference between a population and a sample, and identify the
target population of your research.
The population is the entire group that you want to draw conclusions about.
The sample is the specific group of individuals that you will collect data from.
The population can be defined in terms of geographical location, age, income, and many other
characteristics.
It can be very broad or quite narrow: maybe you want to make inferences about the whole adult
population of your country; maybe your research focuses on customers of a certain company,
patients with a specific health condition, or students in a single school.
If the population is very large, demographically mixed, and geographically dispersed, it might be
difficult to gain access to a representative sample.
Sampling frame
The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it
should include the entire target population (and nobody who is not part of that population).
Example (sampling frame): You are doing research on working conditions at Company X. Your population is all 1000 employees of the company. Your sampling frame is the company’s HR database, which lists the names and contact details of every employee.
Sample size
The number of individuals you should include in your sample depends on various factors,
including the size and variability of the population and your research design. There are
different sample size calculators and formulas depending on what you want to achieve
with statistical analysis.
Simple random sampling: One of the best probability sampling techniques for saving time and resources is the simple random sampling method. It is a reliable method of obtaining information in which every single member of a population is chosen randomly, merely by chance. Each individual has the same probability of being chosen to be a part of the sample.
For example, in an organization of 500 employees, if the HR team decides on conducting team
building activities, it is highly likely that they would prefer picking chits out of a bowl. In this
case, each of the 500 employees has an equal opportunity of being selected.
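As an illustration, here is a minimal sketch of simple random sampling using Python's standard library; the employee IDs and the sample size of 25 are hypothetical stand-ins for the chits in the bowl.

```python
import random

# Hypothetical roster: 500 employee IDs standing in for the chits in the bowl.
employees = [f"EMP-{i:03d}" for i in range(1, 501)]

# Simple random sampling: every employee has the same probability of being picked.
team_building_group = random.sample(employees, k=25)

print(team_building_group)
```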
Cluster sampling: Cluster sampling is a method where the researchers divide the entire
population into sections or clusters that represent a population. Clusters are identified and
included in a sample based on demographic parameters like age, sex, location, etc. This makes
it very simple for a survey creator to derive effective inference from the feedback.
For example, if the United States government wishes to evaluate the number of immigrants
living in the Mainland US, they can divide it into clusters based on states such as California,
Texas, Florida, Massachusetts, Colorado, Hawaii, etc. This way of conducting a survey will be
more effective as the results will be organized into states and provide insightful immigration
data.
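The following sketch illustrates one-stage cluster sampling in Python, assuming hypothetical clusters and units: whole clusters (states) are drawn at random, and every unit inside a selected cluster enters the sample.

```python
import random

# Hypothetical clusters keyed by state; each value lists the sampling units in that cluster.
clusters = {
    "California": ["CA-unit-1", "CA-unit-2", "CA-unit-3"],
    "Texas": ["TX-unit-1", "TX-unit-2"],
    "Florida": ["FL-unit-1", "FL-unit-2", "FL-unit-3"],
    "Massachusetts": ["MA-unit-1"],
    "Colorado": ["CO-unit-1", "CO-unit-2"],
    "Hawaii": ["HI-unit-1"],
}

# One-stage cluster sampling: randomly pick whole clusters, then survey every unit inside them.
chosen_states = random.sample(list(clusters), k=2)
sample = [unit for state in chosen_states for unit in clusters[state]]

print(chosen_states)
print(sample)
```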
Systematic sampling: Researchers use the systematic sampling method to choose the sample members of a population at regular intervals. It requires selecting a starting point and a fixed sampling interval that is repeated through the list. Because this sampling method follows a predefined pattern, it is among the least time-consuming techniques.
For example, a researcher intends to collect a systematic sample of 500 people in a population of 5000. He/she numbers each element of the population from 1 to 5000 and chooses every 10th individual to be a part of the sample (total population / sample size = 5000/500 = 10).
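A minimal sketch of the 5000/500 example in Python follows; the random starting point within the first interval is a common refinement and is assumed here rather than taken from the text.

```python
import random

population = list(range(1, 5001))          # elements numbered 1 to 5000, as in the example
sample_size = 500
interval = len(population) // sample_size  # 5000 / 500 = 10

# Systematic sampling: pick a random start within the first interval, then every 10th element.
start = random.randint(0, interval - 1)
systematic_sample = population[start::interval]

print(len(systematic_sample))              # 500
```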
Stratified random sampling: Stratified random sampling is a method in which the researcher divides the population into smaller groups that don’t overlap but together represent the entire population. While sampling, these groups can be organized, and a sample can then be drawn from each group separately.
For example, a researcher looking to analyze the characteristics of people belonging to different annual income divisions will create strata (groups) according to the annual family income, e.g., less than $20,000, $21,000 – $30,000, $31,000 – $40,000, $41,000 – $50,000, etc. By doing this, the researcher can draw conclusions about the characteristics of people belonging to different income groups. Marketers can analyze which income groups to target and which ones to eliminate to create a roadmap that would bear fruitful results.
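Below is a small sketch of proportionate stratified random sampling; the stratum sizes and respondent labels are hypothetical, and each stratum contributes to the sample in proportion to its share of the population.

```python
import random

# Hypothetical strata: respondents grouped by annual family income (non-overlapping groups).
strata = {
    "less than $20,000":  [f"A-{i}" for i in range(200)],
    "$21,000 - $30,000":  [f"B-{i}" for i in range(300)],
    "$31,000 - $40,000":  [f"C-{i}" for i in range(250)],
    "$41,000 - $50,000":  [f"D-{i}" for i in range(250)],
}

total = sum(len(members) for members in strata.values())
sample_size = 100

# Proportionate stratified sampling: draw from every stratum in proportion to its size.
sample = []
for income_group, members in strata.items():
    n = round(sample_size * len(members) / total)
    sample.extend(random.sample(members, n))

print(len(sample))   # 100 in this hypothetical split
```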
Reduce Sample Bias: Using the probability sampling method, the bias in the sample derived
from a population is negligible to non-existent. The selection of the sample mainly depicts the
understanding and the inference of the researcher. Probability sampling leads to higher
quality data collection as the sample appropriately represents the population.
Diverse Population: When the population is vast and diverse, it is essential to have adequate representation so that the data is not skewed towards one demographic. For example, if Square would like to understand the people that could make use of its point-of-sale devices, a survey conducted on a sample of people across the US from different industries and socio-economic backgrounds helps.
Create an Accurate Sample: Probability sampling helps the researchers plan and create an
accurate sample. This helps to obtain well-defined data.
Types of non-probability sampling with examples
The non-probability method is a sampling method that involves a collection of feedback based on a researcher’s or statistician’s sample selection capabilities and not on a fixed selection process. In most situations, the output of a survey conducted with a non-probability sample leads to skewed results, which may not represent the desired target population. However, there are situations, such as the preliminary stages of research or cost constraints, where non-probability sampling is much more useful than the other type.
Four types of non-probability sampling explain the purpose of this sampling method in a better
manner:
A population specification error would occur if XYZ Company does not understand the specific
types of consumers who should be included in the sample. For example, if XYZ creates a
population of people between the ages of 15 and 25 years old, many of those consumers do not
make the purchasing decision about a video streaming service because they may not work full-
time. On the other hand, if XYZ put together a sample of working adults who make purchase
decisions, the consumers in this group may not watch 10 hours of video programming each
week.
Before we jump into sample size determination, let’s take a look at the terms you should know:
1. Population size: Population size is how many people fit your demographic. For example, you
want to get information on doctors residing in North America. Your population size is the total
number of doctors in North America. Don’t worry! Your population size doesn’t always have
to be that big. Smaller population sizes can still give you accurate results as long as you know
who you’re trying to represent.
2. Confidence level: The confidence level tells you how sure you can be that your results reflect the population. It is expressed as a percentage and is tied to the confidence interval. For example, a 90% confidence level means that if you repeated the sampling many times, about 90% of the resulting intervals would capture the true population value.
3. The margin of error (confidence interval): When it comes to surveys, there’s no way to be 100% accurate. The margin of error tells you how far from the population mean you are willing to allow your data to fall; it describes how close you can reasonably expect a survey result to be to the real population value. If you need help with this figure, you can use an online margin of error calculator.
4. Standard deviation: Standard deviation is the measure of the dispersion of a data set from its mean. It measures the absolute variability of a distribution: the higher the dispersion or variability, the greater the standard deviation. For example, once you have sent out your survey, how much variance do you expect in the responses? That variation in responses is the standard deviation. A short sketch combining these terms into a sample size calculation follows this list.
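As a rough illustration of how these terms combine, the sketch below uses Cochran's formula with a finite-population correction; the population figure, the 95% confidence level (z = 1.96), the 5% margin of error and the conservative variability of p = 0.5 are all assumptions made for the example, not values from the text.

```python
import math

def required_sample_size(population_size, z=1.96, margin_of_error=0.05, p=0.5):
    """Cochran's formula with a finite-population correction.

    z               : z-score for the chosen confidence level (1.96 is roughly 95%)
    margin_of_error : acceptable margin of error (confidence interval half-width)
    p               : expected proportion/variability; 0.5 is the most conservative choice
    """
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    n = n0 / (1 + (n0 - 1) / population_size)   # adjust for a finite population
    return math.ceil(n)

# Hypothetical figure for the number of doctors in the target region.
print(required_sample_size(population_size=100_000))   # about 383
```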
Measurement: Measurement is the process of observing and recording the observations that are collected as part of research. The observations may be recorded as numbers or other symbols assigned to characteristics of objects according to certain prescribed rules. The respondents’ characteristics may be feelings, attitudes, opinions, etc. For example, you may assign ‘1’ for male and ‘2’ for female respondents. In response to a question on whether he/she is using the ATM provided by a particular bank branch, the respondent may say ‘yes’ or ‘no’, and you may wish to assign the number ‘1’ for the response yes and ‘2’ for the response no. We assign numbers to these characteristics mainly because the numbers facilitate further statistical analysis of the data obtained.
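A small sketch of such coding rules in Python is given below; the variable names and the sample response are hypothetical, but the codes mirror the examples above (1 for male, 2 for female; 1 for yes, 2 for no).

```python
# Hypothetical coding rules, applied uniformly to every respondent's answers.
GENDER_CODES = {"Male": 1, "Female": 2}
ATM_USE_CODES = {"yes": 1, "no": 2}

raw_response = {"gender": "Female", "uses_atm": "yes"}

coded_response = {
    "gender": GENDER_CODES[raw_response["gender"]],
    "uses_atm": ATM_USE_CODES[raw_response["uses_atm"]],
}

print(coded_response)   # {'gender': 2, 'uses_atm': 1}
```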
The most important aspect of measurement is the specification of rules for assigning numbers to
characteristics. The rules for assigning numbers should be standardised and applied uniformly.
This must not change over time or objects.
For example, we may assign strongly agree as ‘1’, agree as ‘2’, disagree as ‘3’, and strongly disagree as ‘4’. Each response can therefore be coded as 1, 2, 3 or 4. There are four levels of measurement scales, or methods of assigning numbers: (a) Nominal scale, (b) Ordinal scale, (c) Interval scale, and (d) Ratio scale.
a) Nominal Scale is the crudest among all measurement scales, but it is also the simplest scale. In this scale the different scores on a measurement simply indicate different categories. The nominal scale does not express any values or relationships between variables. For example, labelling men as ‘1’ and women as ‘2’, which is the most common way of labelling gender for data recording purposes, does not mean women are ‘twice something or other’ than men, nor does it suggest that men are somehow ‘better’ than women. Another example of a nominal scale is to classify respondents’ income into three groups: the highest income as group 1, the middle income as group 2, and the low income as group 3. The nominal scale is often referred to as a categorical scale. The assigned numbers have no arithmetic properties and act only as labels. The only statistical operation that can be performed on nominal scales is a frequency count, and we cannot determine any average other than the mode.
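The sketch below, with made-up gender codes, shows the only operations that are meaningful on nominal data: a frequency count and the mode.

```python
from collections import Counter
from statistics import mode

# Hypothetical nominal data: 1 = male, 2 = female (the codes are labels, nothing more).
gender_codes = [1, 2, 2, 1, 2, 2, 1, 2]

print(Counter(gender_codes))   # frequency count per category
print(mode(gender_codes))      # 2, the modal (most frequent) category
# An arithmetic mean of these codes (1.625 here) would be meaningless for nominal data.
```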
In designing and developing a questionnaire, it is important that the response categories must
include all possible responses. In order to have an exhaustive number of responses, you might
have to include a category such as ‘others’, ‘uncertain’, ‘don’t know’, or ‘can’t remember’ so
that the respondents will not distort their information by forcing their responses in one of the
categories provided. Also, you should be careful and be sure that the categories provided are
mutually exclusive so that they do not overlap or get duplicated in any way.
b) Ordinal Scale involves the ranking of items along the continuum of the characteristic being scaled. In this scale, the items are classified according to whether they have more or less of a characteristic. For example, you may wish to ask TV viewers to rank the TV channels according to their preference, and the responses may look as given below:
Doordarshan-1 1
Star Plus 2
NDTV News 3
Aaj Tak TV 4
The main characteristic of the ordinal scale is that the categories have a logical or ordered
relationship. This type of scale permits the measurement of degrees of difference, (that is, ‘more’
or ‘less’) but not the specific amount of differences (that is, how much ‘more’ or ‘less’). This
scale is very common in marketing, satisfaction and attitudinal research.
Another example is that a fast food home delivery shop may wish to ask its customers to rate its delivery service (say, from ‘Excellent’ to ‘Poor’). Suppose respondent X gave the response ‘Excellent’ and respondent Y gave the response ‘Good’. We may say that respondent X thought the service provided was better than respondent Y thought it to be. But we do not know how much better, and we cannot even say that both respondents have the same understanding of what constitutes ‘good service’.
In marketing research, ordinal scales are used to measure relative attitudes, opinions, and
preferences. Here we rank the attitudes, opinions and preferences from best to worst or from
worst to best. However, the amount of difference between the ranks cannot be found out. Using
ordinal scale data, we can perform statistical analysis like Median and Mode, but not the Mean.
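As a quick illustration with made-up ranking data, the sketch below computes the median and mode, which are valid for ordinal scales, and notes why the mean is not.

```python
from statistics import median, mode

# Hypothetical ranks given by nine viewers to one TV channel (1 = most preferred).
ranks = [1, 2, 2, 3, 1, 4, 2, 3, 2]

print(median(ranks))   # 2, a valid summary for ordinal data
print(mode(ranks))     # 2, also valid
# The arithmetic mean of ranks is not meaningful, because the distance between
# rank 1 and rank 2 need not equal the distance between rank 3 and rank 4.
```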
c) Interval Scale is a scale in which the numbers are used to rank attributes such that numerically equal distances on the scale represent equal distances in the characteristic being measured. An interval scale contains all the information of an ordinal scale, but it also allows one to compare the difference/distance between attributes. For example, the difference between ‘1’ and ‘2’ is equal to the difference between ‘3’ and ‘4’, and the difference between ‘2’ and ‘4’ is twice the difference between ‘1’ and ‘2’. However, in an interval scale the zero point is arbitrary and is not a true zero. This, of course, has implications for the type of data manipulation and analysis we can carry out on data collected in this form. It is possible to add or subtract a constant to all of the scale values without affecting the form of the scale, but one cannot multiply or divide the values. Measuring temperature is an example of an interval scale: we cannot say 40°C is twice as hot as 20°C, because 0°C does not mean that there is no temperature; it is only a relative point on the Centigrade scale. Due to the lack of an absolute zero point, the interval scale does not allow the conclusion that 40°C is twice as hot as 20°C.
Interval scales may be either in numeric or semantic formats. The following are two more examples of interval scales, one in a numeric format and another in a semantic format.
d) Ratio Scale is the highest level of measurement scale. It has the properties of an interval scale together with a fixed (absolute) zero point. The absolute zero point allows us to construct a meaningful ratio. Examples of ratio scales include weights, lengths and times. In marketing research, most counts are ratio scales. For example, the number of customers of a bank’s ATM in the last three months is a ratio scale, because you can compare it with the count for the previous three months. Ratio scales permit the researcher to compare both differences in scores and the relative magnitude of scores. For example, the difference between 10 and 15 minutes is the same as the difference between 25 and 30 minutes, and 30 minutes is twice as long as 15 minutes. Most
financial research that deals with rupee values utilizes ratio scales. However, for most
behavioural research, interval scales are typically the highest form of measurement. Most
statistical data analysis procedures do not distinguish between the interval and ratio properties of
the measurement scales and it is sufficient to say that all the statistical operations that can be
performed on interval scale can also be performed on ratio scales.
Now you must be wondering why you should know the level of measurement. Knowing the level
of measurement helps you to decide on how to interpret the data. For example, when you know
that a measure is nominal then you know that the numerical values are just short codes for longer
textual names. Also, knowing the level of measurement helps you to decide what statistical
analysis is appropriate on the values that were assigned. For example, if you know that a measure
is nominal, then you would not compute the mean of the data values or perform a t-test on the data. (The t-test will be discussed in Unit 16 of the course.) It is important to recognise that there is a
hierarchy implied in the levels of measurement. At lower levels of measurement, assumptions
tend to be less restrictive and data analyses tend to be less sensitive. At each level up the
hierarchy, the current level includes all the qualities of the one below it and adds something new.
In general, it is desirable to have a higher level of measurement (that is, interval or ratio) rather
than a lower one (that is, nominal or ordinal).
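The idea that each level up the hierarchy permits more statistics can be sketched as below; the helper function and the sample values are illustrative only.

```python
from statistics import mean, median, mode

def summarise(values, level):
    """Return only the summary statistics that the measurement level supports."""
    summary = {"mode": mode(values)}                  # valid at every level
    if level in ("ordinal", "interval", "ratio"):
        summary["median"] = median(values)
    if level in ("interval", "ratio"):
        summary["mean"] = mean(values)
    return summary

print(summarise([1, 2, 2, 1, 2], "nominal"))          # mode only
print(summarise([1, 2, 2, 3, 4], "ordinal"))          # mode and median
print(summarise([20, 25, 40, 40, 15], "ratio"))       # mode, median and mean
```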
Comparative Scales
For comparing two or more variables, a comparative scale is used by the respondents. Following
are the different types of comparative scaling techniques:
Paired Comparison
A paired comparison presents two variables (or objects) from which the respondent needs to select one. This technique is mainly used during product testing, to give consumers a comparative analysis of the two major products in the market.
To compare more than two objects say comparing P, Q and R, one can first compare P with Q
and then the superior one (i.e., one with a higher percentage) with R.
For example, A market survey was conducted to find out consumer’s preference for the network
service provider brands, A and B. The outcome of the survey was as follows:
Brand ‘A’ = 57%
Brand ‘B’ = 43%
Thus, it is evident that the consumers prefer brand ‘A’ over brand ‘B’.
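A tiny sketch of tallying paired-comparison ballots follows; the 57/43 split is taken from the example above, while the ballot list itself is a hypothetical reconstruction.

```python
from collections import Counter

# Hypothetical paired-comparison ballots: each respondent picks brand A or brand B.
choices = ["A"] * 57 + ["B"] * 43

counts = Counter(choices)
total = sum(counts.values())
for brand, votes in counts.most_common():
    print(f"Brand '{brand}' = {votes / total:.0%}")
# The winner (brand A here) could then be paired against a third brand, as described above.
```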
Rank Order
In rank order scaling the respondent needs to rank or arrange the given objects according to his
or her preference.
For example, A soap manufacturing company conducted a rank order scaling to find out the
orderly preference of the consumers. It asked the respondents to rank the following brands in the
sequence of their choice:
Brand V 4
Brand X 2
Brand Y 1
Brand Z 3
The above scaling shows that soap ‘Y’ is the most preferred brand, followed by soap ‘X’, then
soap ‘Z’ and the least preferred one is the soap ‘V’.
Constant Sum
It is a scaling technique in which a constant sum of units, like dollars, points, chits, or chips, is allocated by the respondents to the features or attributes of a particular product or service according to their importance.
For example, The respondents belonging to 3 different segments were asked to allocate 50 points
to the following attributes of a cosmetic product ‘P’:
Attribute: Segment 1, Segment 2, Segment 3
Finish: 11, 8, 9
Skin Friendly: 11, 12, 12
Fragrance: 7, 11, 8
Packaging: 9, 8, 10
Price: 12, 11, 11
From the above constant sum scaling analysis, we can see that:
Segment 1 values product ‘P’ mainly for its competitive price.
Segments 2 and 3, on the other hand, prefer the product because it is skin-friendly.
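To show how such conclusions can be read off the allocations, here is a short sketch; the point values come from the table above, while the code itself is only an illustrative way of finding each segment's top attribute.

```python
# Constant-sum allocations (50 points per segment) for product 'P', as in the example above.
allocations = {
    "Finish":        {"Segment 1": 11, "Segment 2": 8,  "Segment 3": 9},
    "Skin Friendly": {"Segment 1": 11, "Segment 2": 12, "Segment 3": 12},
    "Fragrance":     {"Segment 1": 7,  "Segment 2": 11, "Segment 3": 8},
    "Packaging":     {"Segment 1": 9,  "Segment 2": 8,  "Segment 3": 10},
    "Price":         {"Segment 1": 12, "Segment 2": 11, "Segment 3": 11},
}

# For each segment, find the attribute that received the most points.
for segment in ("Segment 1", "Segment 2", "Segment 3"):
    top = max(allocations, key=lambda attr: allocations[attr][segment])
    print(segment, "gives the most points to:", top)
```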
Q-Sort Scaling
Q-sort scaling is a technique used for sorting the most appropriate objects out of a large number of given variables. It emphasizes ranking the given objects in descending order to form similar piles based on specific attributes.
It is suitable when the number of objects is not less than 60 and not more than 140, with the most appropriate range being 60 to 90.
For example, The marketing manager of a garment manufacturing company sorts the most
efficient marketing executives based on their past performance, sales revenue generation,
dedication and growth.
The Q-sort scaling was performed on 60 executives, and the marketing head created three piles based on their efficiency as follows:
In the above diagram, the initials of the employees are used to denote their names.
Non-Comparative Scales
Continuous Rating Scale: It is a graphical rating scale where the respondents are free to place the object at a position of their choice. It is done by selecting and marking a point along a vertical or horizontal line which ranges between two extreme criteria.
For example, A mattress manufacturing company used a continuous rating scale to find out the
level of customer satisfaction for its new comfy bedding. The response can be taken in the
following different ways (stated as versions here):
The above diagram shows a non-comparative analysis of one particular product, i.e. comfy bedding. Thus, it is clear that the customers are quite satisfied with the product and its features.
Itemized scale is another essential technique under the non-comparative scales. It requires the respondents to choose a particular category from among the various given categories. Each category is briefly defined by the researchers to facilitate such selection.
The three most commonly used itemized rating scales are as follows:
Likert Scale: In the Likert scale, the researcher provides some statements and asks the respondents to mark their level of agreement or disagreement with these statements by selecting any one of the options from the five given alternatives.
For example, a shoe manufacturing company adopted the Likert scale technique for its new sports shoe range named Z sports shoes. The purpose is to know the agreement or disagreement of the respondents.
For this, the researcher asked the respondents to circle a number representing the most
suitable answer according to them, in the following representation:
1 – Strongly Disagree
2 – Disagree
3 – Neither Agree Nor Disagree
4 – Agree
5 – Strongly Agree
Z sports shoes look too trendy: 1 2 3 4 5
I will definitely recommend Z sports shoes to friends, family and colleagues: 1 2 3 4 5
The above illustration will help the company understand what the customers think about its products, and whether there is any need for improvement.
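A brief sketch of summarising such Likert responses is given below; the individual scores are invented, and reporting a mean of Likert codes is a common convention even though the data are strictly ordinal.

```python
from statistics import mean

# Hypothetical Likert responses (1 = strongly disagree ... 5 = strongly agree).
responses = {
    "Z sports shoes look too trendy":             [4, 5, 3, 4, 2, 4],
    "I will definitely recommend Z sports shoes": [5, 4, 4, 5, 3, 4],
}

for statement, scores in responses.items():
    agree_share = sum(s >= 4 for s in scores) / len(scores)
    print(f"{statement}: mean = {mean(scores):.1f}, agree or strongly agree = {agree_share:.0%}")
```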
Semantic Differential Scale: This itemized rating scale asks the respondents to rate an object on a set of bipolar adjective pairs (for example, ‘superior quality’ versus ‘inferior quality’). For example, a well-known brand of watches carried out semantic differential scaling to understand the customers’ attitude towards its product. The pictorial representation of this technique is as follows:
From the above diagram, we can analyze that the customer finds the product of superior quality;
however, the brand needs to focus more on the styling of its watches.
Stapel Scale: A Stapel scale is an itemized rating scale which measures the response, perception or attitude of the respondents towards a particular object through a unipolar rating. The range of a Stapel scale is between -5 and +5, excluding 0, thus comprising 10 units.
For example, a tours and travels company asked the respondents to rate their holiday package in terms of value for money and user-friendliness of the interface as follows:
With the help of the above scale, we can say that the company needs to improve its package in
terms of value for money. However, the decisive point is that the interface is quite user-friendly
for the customers.
Comparison Chart
Meaning: Primary data refers to first-hand data gathered by the researcher himself, whereas secondary data means data collected by someone else earlier.
Primary data is data originated for the first time by the researcher through direct efforts and experience, specifically for the purpose of addressing the research problem. It is also known as first-hand or raw data. Primary data collection is quite expensive, as the research is conducted by the organisation or agency itself, which requires resources like investment and manpower. The data collection is under the direct control and supervision of the investigator.
The data can be collected through various methods like surveys, observations, physical testing,
mailed questionnaires, questionnaire filled and sent by enumerators, personal interviews,
telephonic interviews, focus groups, case studies, etc.
Secondary data implies second-hand information which is already collected and recorded by any
person other than the user for a purpose, not relating to the current research problem. It is the
readily available form of data collected from various sources like censuses, government
publications, internal records of the organisation, reports, books, journal articles, websites and so
on.
Secondary data offers several advantages: it is easily available and saves the time and cost of the researcher. But there are some disadvantages associated with it; as the data is gathered for purposes other than the problem in mind, its usefulness may be limited in a number of ways, such as relevance and accuracy.
Moreover, the objective and the method adopted for acquiring data may not be suitable to the
current situation. Therefore, before using secondary data, these factors should be kept in mind.
The fundamental differences between primary and secondary data are discussed in the following
points:
1. The term primary data refers to the data originated by the researcher for the first time. Secondary data is already existing data, collected earlier by other investigators, agencies and organisations.
2. Primary data is real-time data, whereas secondary data relates to the past.
3. Primary data is collected for addressing the problem at hand while secondary data is
collected for purposes other than the problem at hand.
4. Primary data collection is a very involved process. On the other hand, secondary data
collection process is rapid and easy.
5. Primary data collection sources include surveys, observations, experiments,
questionnaire, personal interview, etc. On the contrary, secondary data collection sources
are government publications, websites, books, journal articles, internal records etc.
6. Primary data collection requires a large amount of resources like time, cost and
manpower. Conversely, secondary data is relatively inexpensive and quickly available.
7. Primary data is always specific to the researcher’s needs, and the researcher controls the quality of the research. In contrast, secondary data is neither specific to the researcher’s needs, nor does the researcher have control over the data quality.
8. Primary data is available in the raw form whereas secondary data is the refined form of
primary data. It can also be said that secondary data is obtained when statistical methods
are applied to the primary data.
9. Data collected through primary sources are more reliable and accurate as compared to the
secondary sources.
Primary data collection methods are different ways in which primary data can be collected. It
explains the tools used in collecting primary data, some of which are highlighted below:
1. Interviews
An interview is a method of data collection that involves two parties: the interviewer (the researcher(s) asking questions and collecting data) and the interviewee (the subject or respondent being asked questions). The questions and responses during an interview may be oral or written, as the case may be.
Interviews can be carried out in 2 ways, namely; in-person interviews and telephonic interviews.
An in-person interview requires an interviewer or a group of interviewers to ask questions from
the interviewee in a face-to-face fashion.
It can be direct or indirect, structured or unstructured, focused or unfocused, etc. Some of the tools used in carrying out in-person interviews include a notepad or recording device to take note of the conversation, which is very important given how forgetful human memory can be.
On the other hand, telephonic interviews are carried out over the phone through ordinary voice
calls or video calls. The 2 parties involved may decide to use video calls like Skype to carry out
interviews.
A mobile phone, Laptop, Tablet, or desktop computer with an internet connection is required for
this.
Pros
Cons
It is more time-consuming.
It is expensive.
The interviewer may be biased.
Surveys and questionnaires are 2 similar tools used in collecting primary data. They are a group of questions typed or written down and sent to the sample under study for responses.
After giving the required responses, the survey is returned to the researcher to record. It is advisable to conduct a pilot study in which the questionnaires are filled in by experts, in order to assess the weaknesses of the questions or techniques used.
There are 2 main types of surveys used for data collection, namely; online and offline
surveys. Online surveys are carried out using internet-enabled devices like mobile phones, PCs,
Tablets, etc.
They can be shared with respondents through email, websites, or social media. Offline surveys,
on the other hand, do not require an internet connection for them to be carried out.
The most common type of offline survey is a paper-based survey. However, there are also offline survey tools, such as Formplus, that allow a survey to be filled in on a mobile device without access to an internet connection.
This kind of survey is called an online-offline survey because it can be filled in offline but requires an internet connection to be submitted.
Pros
Cons
3. Observation
The observation method is mostly used in studies related to behavioral science. The researcher
uses observation as a scientific tool and method of data collection. Observation as a data
collection tool is usually systematically planned and subjected to checks and controls.
A controlled and uncontrolled approach signifies whether the research took place in a natural
setting or according to some pre-arranged plans. If an observation is done in a natural setting, it
is uncontrolled but becomes controlled if done in a laboratory.
Before employing a new teacher, academic institutions sometimes ask for a sample teaching
class to test the teacher’s ability. The evaluator joins the class and observes the teaching, making
him or her a participant.
The evaluator may also decide to observe from outside the class, becoming a non-participant. An evaluator may also be asked to stay in the class disguised as a student, to carry out a disguised observation.
Pros
Cons
4. Focus Groups
Focus groups are gatherings of 2 or more people with similar characteristics or who possess common traits. They seek open-ended thoughts and contributions from participants.
A focus group is a primary source of data collection because the data is collected directly from
the participant. It is commonly used for market research, where a group of market consumers
engages in a discussion with a research moderator.
It is slightly similar to interviews, but this involves discussions and interactions rather than
questions and answers. Focus groups are less formal and the participants are the ones who do
most of the talking, with moderators there to oversee the process.
Pros
It incurs a lower cost compared to interviews, because the moderator does not have to discuss with each participant individually.
It takes less time too.
Cons
Response bias is a problem in this case, because a participant might hold back a sincere opinion out of concern for what other members will think.
Group thinking does not clearly mirror individual opinions.
5. Experiments
An experiment is a structured study where the researchers attempt to understand the causes, effects, and processes involved in a particular phenomenon. This data collection method is usually
controlled by the researcher, who determines which subject is used, how they are grouped, and
the treatment they receive.
During the first stage of the experiment, the researcher selects the subjects to be considered. Then, actions are carried out on these subjects, and the primary data, consisting of the actions and reactions, is recorded by the researcher.
The data is then analyzed and a conclusion is drawn from the result of the analysis.
Although experiments can be used to collect different types of primary data, it is mostly used for
data collection in the laboratory.
Pros
It is usually objective since the data recorded are the results of a process.
Non-response bias is eliminated.
Cons
One-on-one interviews
Interviews are one of the most common qualitative data-collection methods, and they’re a
great approach when you need to gather highly personalized information. Informal,
conversational interviews are ideal for open-ended questions that allow you to gain rich,
detailed context.
Open-ended surveys and questionnaires
Open-ended surveys and questionnaires allow participants to answer freely at length, rather
than choosing from a set number of responses. For example, you might ask an open-ended
question like “Why don’t you eat ABC brand pizza?”
You would then provide space for people to answer narratively, rather than simply giving
them a specific selection of responses to choose from — like “I’m a vegan,” “It’s too
expensive,” or “I don’t like pizza.”
Focus groups
Focus groups are similar to interviews, except that you conduct them in a group format. You
might use a focus group when one-on-one interviews are too difficult or time-consuming to
schedule.
They’re also helpful when you need to gather data on a specific group of people. For example,
if you want to get feedback on a new marketing campaign from a number of demographically
similar people in your target market or allow people to share their views on a new product,
focus groups are a good way to go.
Observation
Observation is a method in which a data collector observes subjects in the course of their
regular routines, takes detailed field notes, and/or records subjects via video or audio.
Case studies
In the case study method, you analyze a combination of multiple qualitative data sources to
draw inferences and come to conclusions.
What is a Questionnaire?
A questionnaire is a research instrument that consists of a set of questions or other types of
prompts that aims to collect information from a respondent. A research questionnaire is typically
a mix of close-ended questions and open-ended questions.
Open-ended, long-form questions offer the respondent the ability to elaborate on their thoughts.
Research questionnaires were developed in 1838 by the Statistical Society of London.
The data collected from a data collection questionnaire can be both qualitative as well
as quantitative in nature. A questionnaire may or may not be delivered in the form of a survey,
but a survey always consists of a questionnaire.
With a survey questionnaire, you can gather a lot of data in less time.
There is less chance of any bias creeping in if you have a standard set of questions for your target audience. You can apply logic to questions based on the respondents’ answers, but the questionnaire will remain standard for a group of respondents that fall in the same segment.
Surveying with online survey software is quick and cost-effective. It offers you a rich set of features to design, distribute, and analyze the response data.
It can be customized to reflect your brand voice. Thus, it can be used to reinforce your brand
image.
The responses can be compared with historical data to understand the shift in respondents’ choices and experiences.
Respondents can answer the questionnaire without revealing their identity. Also, much survey software complies with major data security and privacy regulations.
You can use multiple question types in a questionnaire. Using various question types can help
increase responses to your research questionnaire as they tend to keep participants more engaged.
Well-designed templates, such as customer satisfaction survey templates, are commonly used because they yield better insights and support decision-making.
Online Questionnaire: In this type, respondents are sent the questionnaire via email or other
online mediums. This method is generally cost-effective and time-efficient. Respondents can
also answer at leisure. Without the pressure to respond immediately, responses may be more
accurate. The disadvantage, however, is that respondents can easily ignore these
questionnaires. Read more about online surveys.
Telephone Questionnaire: A researcher makes a phone call to a respondent to collect
responses directly. Responses are quick once you have a respondent on the phone. However, a
lot of times, the respondents hesitate to give out much information over the phone. It is also an
expensive way of conducting research. You’re usually not able to collect as many responses as
other types of questionnaires, so your sample may not represent the broader population.
In-House Questionnaire: This type is used by a researcher who visits the respondent’s home
or workplace. The advantage of this method is that the respondent is in a comfortable and
natural environment, and in-depth data can be collected. The disadvantage, though, is that it is
expensive and slow to conduct.
Mail Questionnaire: These are becoming obsolete but are still being used in some market
research studies. This method involves a researcher sending a physical data collection
questionnaire request to a respondent that can be filled in and sent back. The advantage of this
method is that respondents can complete this on their own time to answer truthfully and
entirely. The disadvantage is that this method is expensive and time-consuming. There is also
a high risk of not collecting enough responses to make actionable insights from the data.
Researchers always hope that the responses received for a survey questionnaire yield usable data. If the questionnaire is too complicated, there is a fair chance that the respondent might get confused and will drop out or answer inaccurately.
As a survey creator, you may want to pre-test the survey by administering it to a focus group during development. You can try out a few different questionnaire designs to determine which resonates best with your target audience. Pre-testing is a good practice, as it helps the survey creator identify at an early stage whether any changes are required in the survey.
2. Keep it simple:
The words or phrases you use while writing the questionnaire must be easy to understand. If the
questions are unclear, the respondents may simply choose any answer and skew the data you
collect.
For efficient market research, researchers need a representative sample, collected using one of the many sampling techniques. It is imperative to plan and define these target respondents based on the required demographics.
Always save personal questions for last. Sensitive questions may cause respondents to drop off before completing the questionnaire. If these questions come at the end, the respondent has had time to become more comfortable with the interview and is more likely to answer personal or demographic questions.
Questionnaire vs. survey:
What is it? A questionnaire is the instrument of data collection, while a survey is the process of collecting and analyzing that data.
Time and cost: A questionnaire is fast and cost-effective, while a survey is much slower and more expensive.
Sources of secondary data include books, personal sources, journals, newspapers, websites, government records, etc. Secondary data is known to be more readily available than primary data, and using these sources requires very little research effort or manpower.
With the advent of electronic media and the internet, secondary data sources have become more
easily accessible. Some of these sources are highlighted below.
Books
Books are one of the most traditional ways of collecting data. Today, there are books available
for all topics you can think of. When carrying out research, all you have to do is look for a book
on the topic being researched, then select from the available repository of books in that area.
Books, when carefully chosen, are an authentic source of data and can be useful in preparing a literature review.
Published Sources
There are a variety of published sources available for different research topics. The authenticity
of the data generated from these sources depends majorly on the writer and publishing company.
Published sources may be printed or electronic as the case may be. They may be paid or free
depending on the writer and publishing company’s decision.
Unpublished Sources
Unpublished data may not be readily available and easily accessible compared to published sources. It only becomes accessible if the researcher shares it with another researcher, who is then not allowed to share it with a third party.
For example, the product management team of an organization may need data on customer
feedback to assess what customers think about their product and improvement suggestions. They
will need to collect the data from the customer service department, which primarily collected the
data to improve customer service.
Journal
Journals are gradually becoming more important than books these days where data collection is concerned. This is because journals are updated regularly with new publications on a periodic basis, therefore giving up-to-date information.
Also, journals are usually more specific when it comes to research. For example, we can have a
journal on, “Secondary data collection for quantitative data” while a book will simply be titled,
“Secondary data collection”.
Newspapers
In most cases, the information passed through a newspaper is usually very reliable. Hence,
making it one of the most authentic sources of collecting secondary data.
The kind of data commonly shared in newspapers is usually more political, economic, and
educational than scientific. Therefore, newspapers may not be the best source for scientific data
collection.
Websites
The information shared on websites is mostly not regulated and as such may not be trusted
compared to other sources. However, there are some regulated websites that only share authentic
data and can be trusted by researchers.
Most of these are government websites or those of private organizations that are paid data collectors.
Blogs
Blogs are one of the most common online sources for data and may even be less authentic than
websites. These days, practically everyone owns a blog, and a lot of people use these blogs to
drive traffic to their website or make money through paid ads.
Therefore, they cannot always be trusted. For example, a blogger may write good things about a
product because he or she was paid to do so by the manufacturer even though these things are not
true.
Diaries
They are personal records and as such are rarely used for data collection by researchers. Also, diaries are usually private, although these days some people share public diaries containing specific events in their lives.
A common example is Anne Frank’s diary, which contains a first-hand record of life under Nazi occupation.
Government Records
Government records are a very important and authentic source of secondary data. They contain
information useful in marketing, management, humanities, and social science research.
Some of these records include; census data, health records, education institute records, etc. They
are usually collected to aid proper planning, allocation of funds, and prioritizing of projects.
Podcasts
Podcasts are gradually becoming very common these days, and a lot of people listen to them as an alternative to radio. They are more or less like online radio stations and are becoming increasingly popular.
Information is usually shared during podcasts, and listeners can use it as a source of data
collection.