Regression Analysis - Notes
Q. From the above output you can see the various attributes of the dataset.
The 'target' column holds the dependent values (housing prices), and the rest of the columns
are the independent variables that influence the target values.
Let's find the relation between 'housing price' and 'average number of rooms per
dwelling' using statsmodels.
Assign the values of column "RM" (average number of rooms per dwelling) to variable
X.
Similarly, assign the values of the 'target' (housing price) column to variable Y.
Sample code: values = data_frame['attribute_name']
Ans:
X = dataset['RM']
Y = dataset['target']
Q. import statsmodels.api as sm
Ans:
import statsmodels.api as sm
Q. Initialise an OLS model with Y and X and fit it
Ans:
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()
Q. Print the summary report of the fitted model
Ans:
print(fittedModel.summary())
Q. From the summary report, note down the R-squared value and assign it to the variable
'r_squared' in the cell below
Ans.
###Start code here
r_squared = fittedModel.rsquared
###End code(approx 1 line)
with open("output.txt", "w") as text_file:
    text_file.write("rsquared= %f\n" % r_squared)
Q. Print the value of r_squared
Ans. print(r_squared)
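Note: to see what the R-squared number measures, it can be computed by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses the centered definition, which applies when the model includes an intercept. The arrays y and y_hat are made-up values, not output from the housing model.

```python
import numpy as np

# Hypothetical observed values and model predictions (illustration only)
y = np.array([3.0, 5.0, 7.0, 10.0])
y_hat = np.array([3.5, 4.5, 7.5, 9.5])

# Residual sum of squares: variation the model fails to explain
ss_res = np.sum((y - y_hat) ** 2)

# Total sum of squares around the mean of y (centered definition)
ss_tot = np.sum((y - y.mean()) ** 2)

r_squared_manual = 1 - ss_res / ss_tot
print(r_squared_manual)
```

A value near 1 means the predictions account for most of the variation in y; a value near 0 means they do little better than predicting the mean.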
----------------------------
Q. Create a dataframe named 'X' that includes all the feature columns by
dropping the target column.
Assign the 'target' column to variable Y.
Ans.
X = dataset.drop('target', axis = 1)
Y = dataset['target']
Q.
Now the dataframe X has just the features that influence the target.
Print the correlation matrix for dataframe X. Use the '.corr()' function to compute the
correlation matrix.
From the correlation matrix, note down the correlation value between 'CRIM' and
'PTRATIO' and assign it to variable 'corr_value'.
Ans.
###Start code here
#print correlation matrix for X
print(X.corr())
corr_value = X['CRIM'].corr(X['PTRATIO'])
print(corr_value)
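Note: the value can also be read straight out of the correlation matrix with .loc, which gives the same number as calling .corr() on the two columns. The sketch below checks this on a small made-up dataframe; the column names match the dataset, but the values and the names df, corr_matrix, from_matrix and direct are assumptions for illustration.

```python
import pandas as pd

# Small hypothetical dataframe (values are invented, not from the dataset)
df = pd.DataFrame({
    "CRIM": [0.1, 0.5, 1.2, 3.4, 8.0],
    "PTRATIO": [15.0, 16.5, 18.0, 19.5, 21.0],
    "RM": [7.1, 6.8, 6.2, 5.9, 5.4],
})

# Pairwise Pearson correlations between all columns
corr_matrix = df.corr()

# Two equivalent ways to get a single pair's correlation
from_matrix = corr_matrix.loc["CRIM", "PTRATIO"]
direct = df["CRIM"].corr(df["PTRATIO"])

print(corr_matrix)
print(from_matrix, direct)
```

The matrix is symmetric with ones on the diagonal, so corr_matrix.loc["PTRATIO", "CRIM"] returns the same value.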
Q. Fit an OLS model on all the features in X and print the summary report
Ans.
###Start code here
import statsmodels.api as sm
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()
print(fittedModel.summary())
Q. From the summary report, note down the R-squared value and assign it to the variable
'r_squared'
Ans.
###Start code here
r_squared = fittedModel.rsquared
###End code(approx 1 line)
with open("output.txt", "w") as text_file:
    text_file.write("corr= %f\n" % corr_value)
    text_file.write("rsquared= %f\n" % r_squared)
Q. Print the value of r_squared
Ans.
print(r_squared)