Multiple Linear Regression (Boston Housing Dataset)


Configuration

Data Preprocessing

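from sklearn import datasets
from sklearn import linear_model
from sklearn.metrics import mean_squared_error
import pandas as pd

# Note: load_boston() was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this snippet requires scikit-learn < 1.2.
boston = datasets.load_boston()

# Features: one column per predictor, named after the dataset's feature names.
X = pd.DataFrame(boston.data, columns=boston.feature_names)

# Target: house price as a named Series.
y = pd.Series(boston.target, name='PRICE')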

Load the Boston Housing dataset from scikit-learn. The 13 features in X are used to predict the house price y.


Analysis

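linear_regression = linear_model.LinearRegression()
linear_regression.fit(X=X, y=y)  # X is already a DataFrame
prediction = linear_regression.predict(X=X)

a = linear_regression.intercept_  # intercept (scalar)
b = linear_regression.coef_       # one coefficient per feature (array)

print("a : %.2f" % a)
# b is an array, so "%.2f" % b would raise a TypeError; print each coefficient.
for name, coef in zip(X.columns, b):
    print("b (%s) : %.2f" % (name, coef))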

$y=a+b_1x_1+b_2x_2+…+b_{13}x_{13}$
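To make the equation concrete, here is a minimal sketch of how a fitted intercept and coefficient vector reproduce the prediction as `a + X @ b`. It uses a small synthetic dataset and NumPy's least-squares solver rather than the Boston data or scikit-learn; the data and the names `true_a`, `true_b`, and `design` are assumptions made for illustration.

```python
import numpy as np

# Hypothetical, noise-free data (not the Boston data): 6 samples, 2 features.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])
true_a = 1.5
true_b = np.array([2.0, -0.5])
y = true_a + X @ true_b

# Prepend a column of ones so the intercept is estimated with the slopes.
design = np.column_stack([np.ones(len(X)), X])
params, *_ = np.linalg.lstsq(design, y, rcond=None)
a, b = params[0], params[1:]

# y_hat = a + b_1*x_1 + b_2*x_2, written as a dot product.
y_hat = a + X @ b

print("a : %.2f" % a)
for i, coef in enumerate(b, start=1):
    print("b_%d : %.2f" % (i, coef))
```

Because the synthetic target has no noise, the solver should recover the intercept and slopes that generated it. With real data the fit is only approximate.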



Performance Evaluation

Residual Calculation

residuals = y - prediction
residuals.describe()
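A useful sanity check on the summary statistics: for a least-squares fit that includes an intercept, the residuals average to zero (up to floating-point error) and are uncorrelated with every feature. The sketch below illustrates this on synthetic data with NumPy; the data, the seed, and the coefficients are assumptions and have nothing to do with the Boston dataset.

```python
import numpy as np

# Hypothetical data: 50 samples, 3 features, linear signal plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 4.0 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=50)

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(len(X)), X])
params, *_ = np.linalg.lstsq(design, y, rcond=None)

residuals = y - design @ params

# Mean of the residuals and their dot product with each design column.
print("mean residual :", residuals.mean())
print("design^T r    :", design.T @ residuals)
```

If the mean reported by `describe()` is far from zero, the model was probably fitted without an intercept or evaluated on different data than it was trained on.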


Coefficient of Determination Calculation

SSE = (residuals**2).sum()
SST = ((y-y.mean())**2).sum()
r2 = 1 - SSE/SST

print("R Squared : %.3f" %r2)

R Squared : 0.741
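The calculation above can be checked by hand on a tiny example. The following sketch wraps the same SSE/SST formula in a helper, here called `r_squared` (a hypothetical name), and applies it to four made-up values where SSE = 1 and SST = 20.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / SST."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sse = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    sst = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - sse / sst

# Made-up values: residuals are +/-0.5, so SSE = 4 * 0.25 = 1.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 6.5, 9.5]
print("R Squared : %.3f" % r_squared(y_true, y_pred))  # R Squared : 0.950

# Always predicting the mean gives SSE == SST, so R squared is 0.
baseline = r_squared(y_true, [6.0] * 4)
```

The baseline case shows how to read the value: R squared measures how much better the model does than always predicting the mean of y.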

MSE Calculation

# score() returns R squared for regressors.
score = linear_regression.score(X=X, y=y)
# mean_squared_error expects (y_true, y_pred).
MSE = mean_squared_error(y, prediction)

print("Score : %.3f" %score)
print("MSE : %.2f" %MSE)

Score : 0.741
MSE : 21.89
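The score matches the hand-computed R squared because `LinearRegression.score` returns the coefficient of determination. MSE is related to the same quantities: it is SSE divided by the number of samples, so R squared can also be written as 1 - MSE / Var(y). A small sketch with made-up values (not the Boston data) shows the relationship:

```python
import numpy as np

# Made-up values for illustration only.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 6.5, 9.5])

n = len(y_true)
sse = np.sum((y_true - y_pred) ** 2)
mse = sse / n  # mean squared error = SSE / n

# np.var defaults to the population variance (ddof=0), i.e. SST / n.
r2_from_mse = 1.0 - mse / np.var(y_true)

print("MSE : %.2f" % mse)
print("R Squared : %.3f" % r2_from_mse)
```

Unlike R squared, MSE is expressed in squared units of the target, so it is only comparable across models fitted to the same target variable.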

This post is licensed under CC BY 4.0 by the author.
