Multiple Regression
Linear Regression
● Multiple linear regression differs from simple linear regression in that it can handle multiple input features
● It is a simple algorithm, originally developed in the field of statistics, studied as a model for understanding the relationship between input and output variables
● It is a linear model - it assumes a linear relationship between the input variables (X) and the output variable (y)
● It is used to predict continuous values (e.g., weight, price...)
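The line equation behind the model can be shown with a tiny sketch (all coefficient values below are made up for illustration):

```python
import numpy as np

# Line equation for multiple regression: y_hat = b0 + b1*x1 + ... + bn*xn
# All coefficient values below are made up for illustration
bias = 2.0                       # b0, the intercept
weights = np.array([0.5, -1.2])  # b1, b2
x = np.array([3.0, 1.0])         # a single observation with two features

y_hat = bias + np.dot(weights, x)  # 2.0 + 0.5*3.0 + (-1.2)*1.0 = 2.3
```

Training the model means finding the `bias` and `weights` values that fit the data best.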
Assumptions
● Linear regression makes several assumptions about the data: a linear relationship between inputs and output, little or no multicollinearity among the features, and normally distributed, homoscedastic residuals
Take-home point
● Training a linear regression model means calculating the best coefficients for the
line equation formula
● The best coefficients can be obtained with gradient descent
○ An iterative optimization algorithm that calculates the derivatives wrt. each coefficient and updates the coefficients as it goes
○ One additional parameter - the learning rate - specifies the rate at which the coefficients are updated
■ A high learning rate can lead to "missing" the best values
■ A low learning rate can lead to slow optimization
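The effect of the learning rate can be illustrated on a toy one-dimensional problem (a hypothetical sketch minimizing f(w) = (w - 3)^2, not part of the original material):

```python
def gradient_descent(lr, n_iterations=50, start=0.0):
    """Minimizes f(w) = (w - 3)**2 by gradient descent; the derivative is 2*(w - 3)."""
    w = start
    for _ in range(n_iterations):
        w -= lr * 2 * (w - 3)
    return w

print(gradient_descent(lr=0.1))   # converges very close to the optimum w = 3
print(gradient_descent(lr=0.01))  # much slower - still far from 3 after 50 steps
print(gradient_descent(lr=1.1))   # too high - each step overshoots and w diverges
```

The same trade-off appears when training the regression model below: too high a rate overshoots the best coefficients, too low a rate needs many more iterations.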
Math behind
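A sketch of the loss and its gradients, consistent with the derivative code under Implementation:

```latex
\hat{y}_i = b + \mathbf{w}^\top \mathbf{x}_i,
\qquad
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2

\frac{\partial\, \mathrm{MSE}}{\partial \mathbf{w}} = \frac{2}{n}\, X^\top (\hat{\mathbf{y}} - \mathbf{y}),
\qquad
\frac{\partial\, \mathrm{MSE}}{\partial b} = \frac{2}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)

\mathbf{w} \leftarrow \mathbf{w} - \alpha \, \frac{\partial\, \mathrm{MSE}}{\partial \mathbf{w}},
\qquad
b \leftarrow b - \alpha \, \frac{\partial\, \mathrm{MSE}}{\partial b}
```

where n is the number of training rows and α is the learning rate.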
Implementation
@staticmethod
def _mean_squared_error(y, y_hat):
    '''Private method, used to evaluate loss at each iteration.'''
    return np.mean((y - y_hat) ** 2)

# Inside the training loop - calculate derivatives of MSE wrt. each coefficient
partial_w = (1 / X.shape[0]) * (2 * np.dot(X.T, (y_hat - y)))
partial_d = (1 / X.shape[0]) * (2 * np.sum(y_hat - y))
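The snippets above fit together into a complete class; here is one possible sketch (the constructor parameters `learning_rate` and `n_iterations` are assumptions, not necessarily the original API):

```python
import numpy as np

class LinearRegression:
    '''Multiple linear regression trained with gradient descent (sketch).'''

    def __init__(self, learning_rate=0.01, n_iterations=10000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights, self.bias = None, None
        self.loss = []

    @staticmethod
    def _mean_squared_error(y, y_hat):
        '''Private method, used to evaluate loss at each iteration.'''
        return np.mean((y - y_hat) ** 2)

    def fit(self, X, y):
        '''Runs gradient descent to find the best coefficients.'''
        self.weights = np.zeros(X.shape[1])
        self.bias = 0
        for _ in range(self.n_iterations):
            # Current predictions with the line equation
            y_hat = np.dot(X, self.weights) + self.bias
            self.loss.append(self._mean_squared_error(y, y_hat))
            # Calculate derivatives wrt. weights and bias
            partial_w = (1 / X.shape[0]) * (2 * np.dot(X.T, (y_hat - y)))
            partial_d = (1 / X.shape[0]) * (2 * np.sum(y_hat - y))
            # Update the coefficients in the direction that lowers the loss
            self.weights -= self.learning_rate * partial_w
            self.bias -= self.learning_rate * partial_d

    def predict(self, X):
        '''Applies the learned line equation to new data.'''
        return np.dot(X, self.weights) + self.bias
```

The `loss` list is filled during `fit()`, which is what makes the per-iteration loss plots later in the section possible.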
Testing
● You can test the implementation on the Diabetes dataset built into scikit-learn:
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

data = load_diabetes()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
● You can now initialize and train the model, and afterwards make predictions:
In [5]:
model = LinearRegression()
model.fit(X_train, y_train)
preds = model.predict(X_test)
152.2631135652031
Loss Evaluation
● You can now visualize loss at each iteration; training several models with different learning rates makes the effect of that parameter visible:
In [16]:
import matplotlib.pyplot as plt

xs = np.arange(len(model.loss))
plt.plot(xs, model.loss, lw=3)
● You can also evaluate the loss on the test set:
model._mean_squared_error(y_test, preds)
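The per-learning-rate comparison can be sketched end to end with a small standalone helper (the helper name, synthetic data, and learning-rate values below are all illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

def train_and_track_loss(X, y, learning_rate, n_iterations=500):
    '''Gradient-descent linear regression; returns the MSE at every iteration.'''
    weights, bias = np.zeros(X.shape[1]), 0.0
    loss = []
    for _ in range(n_iterations):
        y_hat = np.dot(X, weights) + bias
        loss.append(np.mean((y - y_hat) ** 2))
        weights -= learning_rate * (2 / X.shape[0]) * np.dot(X.T, (y_hat - y))
        bias -= learning_rate * (2 / X.shape[0]) * np.sum(y_hat - y)
    return loss

# Synthetic data, made up for illustration
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0

# One loss curve per learning rate - higher rates converge in fewer iterations
for lr in (0.5, 0.1, 0.001):
    plt.plot(train_and_track_loss(X, y, lr), label=f'learning rate = {lr}')
plt.xlabel('Iteration'); plt.ylabel('MSE'); plt.legend()
plt.savefig('loss_per_lr.png')
```

On this data the curve for 0.001 is still visibly descending after 500 iterations, while the higher rates flatten out almost immediately.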
● Finally, you can compare the error with the model from scikit-learn:
from sklearn.linear_model import LinearRegression  # note: shadows the custom class above
from sklearn.metrics import mean_squared_error

lr_model = LinearRegression()
lr_model.fit(X_train, y_train)
lr_preds = lr_model.predict(X_test)
mean_squared_error(y_test, lr_preds)