RESEARCH METHODOLOGY LAB
MODULE 1
INTRODUCTION TO SPSS
SPSS Statistics is a software package used for interactive or batched statistical analysis. Long
produced by SPSS Inc., it was acquired by IBM in 2009. The current versions (2015) are named
IBM SPSS Statistics.
The software name originally stood for Statistical Package for the Social Sciences (SPSS),
reflecting the original market, although the software is now popular in other fields as well,
including the health sciences and marketing.
1.2 ABOUT SPSS:
SPSS is a widely used program for statistical analysis in social science. It is also used by market
researchers, health researchers, survey companies, government, education researchers, marketing
organizations, data miners and others. The original SPSS manual (Bent & Hull, 1970) has been
described as one of "sociology's most influential books" for allowing ordinary researchers to do
their own statistical analysis. In addition to statistical analysis, data management (case selection,
file reshaping, creating derived data) and data documentation (a metadata dictionary is stored in
the data file) are features of the base software.
SPSS Statistics places constraints on internal file structure, data types, data processing, and
matching files, which together considerably simplify programming. SPSS datasets have a two-
dimensional table structure, where the rows typically represent cases (such as individuals or
households) and the columns represent measurements (such as age, sex, or household income).
Only two data types are defined: numeric and text (or "string"). All data processing occurs
sequentially case-by-case through the file (dataset). Files can be matched one-to-one and one-to-
many, but not many-to-many. In addition to that cases-by-variables structure and processing,
there is a separate Matrix session where one can process data as matrices using matrix and linear
algebra operations.
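As a rough illustration of the one-to-many match described above, the following Python sketch (not SPSS syntax) attaches one household record to every individual who belongs to that household. The records, the field names, and the helper match_one_to_many are hypothetical and chosen only for illustration.

```python
# Hypothetical "table" file: exactly one row per household, keyed by household id.
households = {
    101: {"income": 52000},
    102: {"income": 38000},
}

# Hypothetical "case" file: many individuals may share one household id.
individuals = [
    {"person": 1, "household": 101, "age": 34},
    {"person": 2, "household": 101, "age": 8},
    {"person": 3, "household": 102, "age": 51},
]


def match_one_to_many(cases, table, key):
    """Attach the single matching table row to every case that shares its key."""
    merged = []
    for case in cases:  # processed sequentially, case by case
        extra = table.get(case[key], {})
        merged.append({**case, **extra})
    return merged


matched = match_one_to_many(individuals, households, "household")
for row in matched:
    print(row)
```

Each individual keeps its own row and simply gains the household-level column, which is why a one-to-many match never changes the number of cases.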
Atisha Jain 41314901718
Several variants of SPSS Statistics exist. SPSS Statistics Grad Packs are highly discounted
versions sold only to students. SPSS Statistics Server is a version of SPSS Statistics with a
client/server architecture. Add-on packages can enhance the base software with additional
features (examples include Complex Samples, which can adjust for clustered and stratified
samples, and Custom Tables, which can create publication-ready tables). SPSS Statistics is
available under either an annual or a monthly subscription license.
SPSS Statistics can read and write data from ASCII text files (including hierarchical files), other
statistics packages, spreadsheets and databases. SPSS Statistics can read and write to external
relational database tables via ODBC and SQL.
1.3 FEATURES OF SPSS:
Completely redesigned web reports: Version 23 brings with it the new Web Report with a lot
more interactivity. And because it is web based, you don’t have to worry about the recipient
having a copy of SPSS.
Compare Subgroups Plot: Another notable change in this release is the large number of new
programmability plug-ins in the menus. IBM has written these for you, so you don’t have to
know any Python. In fact, you don’t need to know where they came from, except that you
have to select Install Python when you install Version 23. As an example, there is a plot in
the Graphs menu that automatically chooses an appropriate graphic based on the Level of
Measurement of the variables.
Split into Files: Another of the Python plug-ins. It makes it easy to create a separate file
for each category of a categorical variable. For instance, you may want to create a file for new
customers and a separate file for established customers.
Create Dummy Variables: Another Python plug-in. This one creates true/false (0/1) variables
for each category in a categorical variable, which is how categorical predictors must be coded
for Regression. Many people have been doing this manually for years, but this plug-in makes it
easier.
Styling Output: This Version 22 feature is a recent addition that has not gotten enough
attention and that you may not be using yet. You can conditionally format your pivot tables.
For instance, all percentages above 80 percent could be highlighted.
Generalized Spatial Association Rule (GSAR): One of the new Geo Spatial Modeling Wizard
options allows you to build a Time Series model using geo mapping information. The idea is to
map events taking place in space over slices of time. For instance, a lot of urban crime is at night,
but suburban breaking-and-entering crimes tend to happen during the workday.
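To make the Create Dummy Variables feature described above more concrete, here is a minimal Python sketch of the same idea. It is not the SPSS plug-in itself: the function make_dummies, the variable name customer_type, and the sample values are assumptions made only for illustration.

```python
def make_dummies(values, prefix):
    """Return one 0/1 column per category found in `values`."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical categorical variable with two categories.
customer_type = ["new", "established", "new", "established", "new"]

dummies = make_dummies(customer_type, "customer")
for name, column in dummies.items():
    print(name, column)
# customer_established [0, 1, 0, 1, 0]
# customer_new [1, 0, 1, 0, 1]
```

In a regression, one of these columns is normally left out and serves as the reference category.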
1.5 ELEMENTS OF SPSS:
Name: is the variable's machine readable name. This is the name used to refer to the variable
in SPSS's underlying code and, if no "Label" is defined, the name that will appear at the top of
the column in the "Data View."
Type: indicates the type of data that can be stored in the variable's column. The most
frequently used types are "String" (for text) and "Numeric." SPSS uses the type to know what
rules can be applied to a specific variable. It won't do arithmetic on a string variable, for
example.
Label: sets the name that will be displayed at the top of the column in the Data Editor, allowing
for a human readable representation of the variable name.
Values: sets names given to coded values (e.g. if the variable contains survey responses where
a "0" represents "no" and "1" represents a "yes" this field can be used to tell SPSS to display the
text values instead of the numerical raw data).
Measure: sets the statistical level of measurement. SPSS distinguishes between "Scale"
(variables that represent a continuous scale like population or temperature), "Ordinal" (variables
that can be rank ordered but do not represent precisely measured values), and "Nominal"
(variables that cannot be ranked such as those that represent labels or classifications).
Role: is used by some SPSS dialogues to distinguish between the variable's intended usage in
some predictive applications (e.g. regression, clustering, and classification). For most dialogues
the role won't be significant.
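The Variable View properties listed above can be pictured as a small record per variable. The Python sketch below is only an illustration under stated assumptions: the class Variable, the variable "smoker", and its label are hypothetical, and the code shows just how value labels turn coded data into readable text.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Hypothetical stand-in for one row of Variable View."""
    name: str
    type: str = "Numeric"
    label: str = ""
    values: dict = field(default_factory=dict)  # coded value -> value label
    measure: str = "Scale"
    role: str = "Input"

    def display(self, raw):
        """Return the value label for a coded value, or the raw value if unlabeled."""
        return self.values.get(raw, raw)


# Survey item coded 0 = "no", 1 = "yes", as in the Values example above.
smoker = Variable(
    name="smoker",
    label="Do you smoke?",
    values={0: "no", 1: "yes"},
    measure="Nominal",
)

raw_data = [0, 1, 1, 0]
shown = [smoker.display(v) for v in raw_data]
print(shown)  # ['no', 'yes', 'yes', 'no']
```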
1.6 SCALE OF MEASUREMENT:
The types of scale of measurement are:
Nominal variables
Ordinal variables
Interval variables
Ratio variables
NOMINAL:
A variable can be treated as nominal when its values represent categories with no intrinsic
ranking. For example, the department of the company in which an employee works.
ORDINAL:
A variable can be treated as ordinal when its values represent categories with some intrinsic
ranking. For example, levels of service satisfaction from highly dissatisfied to highly satisfied.
INTERVAL:
The interval scale is a quantitative measurement scale on which the difference between two
values is meaningful. It is the third level of measurement. In other words, values are
measured on an actual numeric scale rather than in a relative manner, but the zero point is
arbitrary (for example, temperature in degrees Celsius).
RATIO:
Ratio scales are the most informative level of measurement because they tell us about the
order, they tell us the exact difference between units, and they also have an absolute zero,
which allows a wide range of both descriptive and inferential statistics to be applied.
Everything stated above about interval data applies to ratio scales; in addition, ratio scales
have a clearly defined zero.
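A short worked example may help separate interval from ratio scales. The Python sketch below uses temperature, with Celsius standing in for an interval scale (arbitrary zero) and Kelvin for a ratio scale (absolute zero); the two temperature readings are made up for illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (absolute zero is -273.15 degrees Celsius)."""
    return celsius + 273.15


# Two illustrative readings on the interval (Celsius) scale.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are meaningful only when zero is absolute.
ratio_c = b_c / a_c  # 2.0, but 20 degrees Celsius is not "twice as hot"
ratio_k = b_k / a_k  # about 1.035, the physically meaningful ratio

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

The difference is the same on both scales, while the ratio changes with the choice of zero, which is why statements such as "twice as much" require a ratio scale.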
The data editor has tabs for switching between Data View and Variable View. For now,
make sure you're in Data View.
Columns of cells are called variables. Each variable has a unique name (for example, “gender”),
which is shown in the column header.
Rows of cells are called cases. Oftentimes, each respondent in a study is represented as a
single case.
In SPSS, values refer to cell contents.
The status bar may give useful information on the data.
In the bottom-left corner we find tabs for switching between Variable View and Data View.
For now, select Variable View.
In Variable View, variables are shown as rows of cells.
The first column shows the variable name for each variable.
The fifth column may or may not contain a variable label. This describes the exact meaning
of each variable.
The sixth column shows value labels: descriptions of the meaning of one, many or
all values that a variable may contain.
1.8 BENEFIT OF SPSS:
While it is true that a spreadsheet program offers more control over how the data is
organized, this can also be a drawback. In contrast, you cannot move blocks of data around in
SPSS, because it enforces a fixed layout: a row represents one case, whereas a column denotes
one variable. SPSS makes data analysis quicker because the program
knows the location of the cases and variables. When using a spreadsheet, users must manually
define this relationship in every analysis.
SPSS is made specifically for analyzing statistical data and thus offers a wide range of
methods, graphs and charts. General-purpose programs may offer other procedures, such as
invoicing and accounting forms, but specialized programs are better suited to statistical
analysis. SPSS also comes with more techniques for screening or cleaning the data in
preparation for further analysis. Furthermore, ordinary spreadsheet programs may support only
basic data analysis immediately after installation, with extra plug-ins required for more
intricate techniques.
SPSS is designed to keep the output separate from the data itself. In fact, it stores
all results in a file that is separate from the data. In programs like Excel, by contrast,
the results of an analysis are placed in a worksheet, where there is a risk of overwriting
other information by accident.
1.9 USES OF SPSS:
SPSS is often used as a data collection tool by researchers. The data entry screen in SPSS looks
much like any other spreadsheet software. You can enter variables and quantitative data and save
the file as a data file. Furthermore, you can organize your data in SPSS by assigning properties to
different variables.
Data Output:
Once data is collected and entered into the data sheet in SPSS, you can create an output file from
the data. For example, you can create frequency distributions of your data to determine whether
your data set is normally distributed. The frequency distribution is displayed in an output file.
You can export items from the output file and place them into a research article you're writing.
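As a rough picture of what a frequency distribution contains, the Python sketch below bins the 20 final-exam marks that appear in the Case Summaries tables of this report. The bin width of 10 and the helper frequency_table are assumptions made for illustration; this is not how SPSS builds its Frequencies output internally.

```python
from collections import Counter

# Final marks of the 20 students listed in the Case Summaries tables.
final_marks = [72, 69, 65, 60, 70, 61, 73, 56, 61, 64, 57, 55,
               62, 58, 67, 63, 60, 59, 28, 59]


def frequency_table(values, width=10):
    """Count how many values fall in each bin [lower, lower + width)."""
    bins = Counter((v // width) * width for v in values)
    return dict(sorted(bins.items()))


freq = frequency_table(final_marks)
for lower, count in freq.items():
    print(f"{lower}-{lower + 9}: {'*' * count} ({count})")
```

The single mark in the 20-29 bin stands apart from the rest, which is the kind of departure from a bell shape that a frequency distribution makes visible.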
Statistical Tests:
The most obvious use for SPSS is to run statistical tests. SPSS has all of the
most widely used statistical tests built into the software. Therefore, you won't have to do any
calculations by hand. Once you run a statistical test, all associated outputs are
displayed in the output file.
MODULE 2
MANAGING DATA IN SPSS
2.1 FINDING OUT THE CASE SUMMARY
Case summaries are used to understand the nature of the data.
2. A dialogue box named “Summarize Cases” will appear. Add “Final Marks” to
the “Variables” column and “Gender” to the “Grouping Variables” column.
3. Then click “Statistics”, select “Mean” in the Statistics list, and press “Continue”.
In the output viewer, the case summaries appear, listing the marks obtained in the
final exam grouped by “Gender”, together with the “Mean”.
Case Summaries

What is your gender?   Case     Final Marks
Female                 1        72
                       2        69
                       3        65
                       4        60
                       5        70
                       6        61
                       7        73
                       8        56
                       9        61
                       10       64
                       11       57
                       12       55
                       Total    Mean  63.58
                                N     12
Male                   1        62
                       2        58
                       3        67
                       4        63
                       5        60
                       6        59
                       7        28
                       8        59
                       Total    Mean  57.00
                                N     8
Total                  Mean     60.95
                       N        20

a. Limited to first 100 cases.
Case Summaries

To which caste do you belong?   Case     Final Marks
SC                              1        69
                                2        61
                                Total    Mean  65.00
                                         N     2
ST                              1        67
                                2        57
                                Total    Mean  62.00
                                         N     2
OBC                             1        73
                                2        59
                                Total    Mean  66.00
                                         N     2
MINORITY                        1        62
                                2        64
                                Total    Mean  63.00
                                         N     2
GENERAL                         1        72
                                2        65
                                3        60
                                4        58
                                5        70
                                6        61
                                7        63
                                8        60
                                9        56
                                10       28
                                11       59
                                12       55
                                Total    Mean  58.92
                                         N     12
Total                           Mean     60.95
                                N        20

a. Limited to first 100 cases.
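The group means and counts in the Case Summaries output above can be cross-checked with a few lines of Python. This is only a verification sketch using the marks printed in the tables; the helper case_summary is a hypothetical name and the code is not SPSS syntax.

```python
from statistics import fmean

# Final marks copied from the two Case Summaries tables above.
marks_by_gender = {
    "Female": [72, 69, 65, 60, 70, 61, 73, 56, 61, 64, 57, 55],
    "Male": [62, 58, 67, 63, 60, 59, 28, 59],
}
marks_by_caste = {
    "SC": [69, 61],
    "ST": [67, 57],
    "OBC": [73, 59],
    "MINORITY": [62, 64],
    "GENERAL": [72, 65, 60, 58, 70, 61, 63, 60, 56, 28, 59, 55],
}


def case_summary(groups):
    """Return {group: (N, mean rounded to 2 decimals)} plus an overall 'Total' row."""
    summary = {
        name: (len(values), round(fmean(values), 2))
        for name, values in groups.items()
    }
    everyone = [v for values in groups.values() for v in values]
    summary["Total"] = (len(everyone), round(fmean(everyone), 2))
    return summary


gender_summary = case_summary(marks_by_gender)
caste_summary = case_summary(marks_by_caste)
print(gender_summary)
print(caste_summary)
```

Both groupings cover the same 20 students, so the overall row (N = 20, mean 60.95) is identical in the two summaries.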
2. In the pop-up window that opens, define the Target Variable as “Mean” in the top-left corner.
3. Select “Statistical” as the Function group and “Mean” under Functions and Special Variables.
5. Click OK.
The command has now been executed, and a new variable named Mean has been added to the Data
View and Variable View.
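The effect of computing a new variable with the MEAN function can be sketched in Python as follows. The steps above do not state which variables were averaged, so this sketch assumes, purely for illustration, a midterm and a final mark per case; the midterm values and the helper row_mean are hypothetical. Like the SPSS MEAN function, the helper averages only the values that are not missing.

```python
def row_mean(*scores):
    """Average the non-missing scores of one case; None marks a missing value."""
    valid = [s for s in scores if s is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical cases: final marks are taken from the report, midterm marks are made up.
cases = [
    {"midterm": 60, "final": 72},
    {"midterm": 55, "final": 69},
    {"midterm": None, "final": 65},  # missing midterm
]

# Add the new target variable "Mean" to every case.
for case in cases:
    case["Mean"] = row_mean(case["midterm"], case["final"])

for case in cases:
    print(case)
```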
MODULE 3
CODING AND RECODING IN SPSS
3.1 RECODING INTO DIFFERENT VARIABLES: OLD & NEW VALUE
1. Click Transform >> Recode into Different Variables.
FIG. 1
2. Move “Final Marks” to the “Numeric Variable -> Output Variable” box, define the Output
Variable name as “Grade”, and click on Change.
FIG. 2
5. Click on Continue >> OK.
6. The command has now been executed, and a new variable named Grade has been added to the
Data View and Variable View, showing grade A, B or C according to the marks of the final
exam.
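The recoding of Final Marks into a Grade variable can be sketched in Python as below. The report does not state the old-to-new value ranges that were entered, so the cut-offs used here (70 and above for A, 60 to 69 for B, below 60 for C) are assumptions, as is the helper name recode_grade.

```python
def recode_grade(mark, a_cutoff=70, b_cutoff=60):
    """Map a numeric mark to a grade; the cut-offs are illustrative assumptions."""
    if mark >= a_cutoff:
        return "A"
    if mark >= b_cutoff:
        return "B"
    return "C"


# First eight final marks from the Case Summaries output.
final_marks = [72, 69, 65, 60, 70, 61, 73, 56]

# The original variable is left untouched; the grades go into a new variable.
grades = [recode_grade(mark) for mark in final_marks]
print(list(zip(final_marks, grades)))
```

Writing the result to a separate list mirrors the point of recoding into a different variable: the original marks stay available for later analysis.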
6. Once “OK” is pressed, the column named MIDTERM reappears in the Data Editor
showing the new values.
MODULE 4
SELECTING, SORTING, AND ANALYSING THE DATA IN SPSS
4.1 SELECT CASES
1. Go to Data>>Select Cases from the drop down menu.
2. A dialogue box named “Select Cases” will appear. Select “If condition is
satisfied”.
3. Click “If”, type “Gender=1” in the expression box, and then press “Continue”.
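4. In Data View, the data appear with some changes: unselected cases are marked with a
diagonal line through the row number, while female students (Gender = 1) remain unmarked
because they are the selected cases.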
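The selection above can be pictured with a short Python sketch. The sample cases are hypothetical, the code 2 for male students is an assumption (the report only shows that Gender=1 selects female students), and select_cases is an illustrative helper rather than SPSS syntax. The flag column is named filter_$ after the filter variable that SPSS adds when cases are filtered.

```python
# Hypothetical cases; Gender 1 = Female (as in the steps above), 2 = Male (assumed).
cases = [
    {"id": 1, "Gender": 1, "final": 72},
    {"id": 2, "Gender": 2, "final": 62},
    {"id": 3, "Gender": 1, "final": 69},
    {"id": 4, "Gender": 2, "final": 58},
]


def select_cases(rows, condition):
    """Flag every case with 1 (selected) or 0 (filtered out) and return the selected ones."""
    for row in rows:
        row["filter_$"] = 1 if condition(row) else 0
    return [row for row in rows if row["filter_$"] == 1]


selected = select_cases(cases, lambda row: row["Gender"] == 1)
print([row["id"] for row in selected])  # [1, 3]
print(len(cases))                       # 4, the unselected cases are kept, only flagged
```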
4.2 CASE SUMMARIES
1. Go to Analyze>> Reports>> Case summaries.
Case Summaries

What is your gender?   Case     Final Marks
Female                 1        72
                       2        69
                       3        65
                       4        60
                       5        70
                       6        61
                       7        73
                       8        56
                       9        61
                       10       64
                       11       57
                       12       55
                       Total    N     12
                                Mean  63.58
Total                  N        12
                       Mean     63.58

a. Limited to first 100 cases.
4.3 SORT CASES
1. Go to Data>>Sort Cases from the drop down menu.
2. Select the variable by which you want to sort. Here we selected “What is your
gender”.
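Sorting cases by a variable can be sketched in Python as follows. The sample cases and the helper sort_cases are hypothetical, and the gender coding (1 = Female, 2 = Male) is an assumption; the sketch only illustrates ascending and descending order on one sort variable.

```python
# Hypothetical cases; Gender 1 = Female, 2 = Male (assumed coding).
cases = [
    {"id": 1, "Gender": 2, "final": 62},
    {"id": 2, "Gender": 1, "final": 72},
    {"id": 3, "Gender": 2, "final": 58},
    {"id": 4, "Gender": 1, "final": 69},
]


def sort_cases(rows, by, descending=False):
    """Return a new list of cases ordered by the variable `by`."""
    return sorted(rows, key=lambda row: row[by], reverse=descending)


ascending = sort_cases(cases, "Gender")
descending = sort_cases(cases, "Gender", descending=True)
print([row["id"] for row in ascending])   # [2, 4, 1, 3]
print([row["id"] for row in descending])  # [1, 3, 2, 4]
```

Python's sort is stable, so cases with the same gender keep their original relative order in both directions.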