Open Source Your Knowledge, Become a Contributor

Technology knowledge has to be shared and made accessible for free. Join the movement.

Using the house prices dataset, we implement a linear regression model that predicts the price y of a house using a set of features X:

import numpy as np
from sklearn import datasets
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
np.random.seed(42) # constant seed for reproducibility
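# note: load_boston was removed in scikit-learn 1.2; this code needs an older release
houses = datasets.load_boston()
# hold out the last fifth of the samples for testing
split = 4 * len(houses.data) // 5
X_train, X_test = houses.data[:split], houses.data[split:]
y_train, y_test = houses.target[:split], houses.target[split:]
# linear regression works better with normalized features
scaler = StandardScaler().fit(X_train)  # fit on the training data only
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
# "squared_loss" was renamed "squared_error" in scikit-learn 1.0
predictor = SGDRegressor(loss="squared_loss").fit(X_train, y_train)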
mse = mean_squared_error(y_test, predictor.predict(X_test))
print("Test Mean Squared Error: ${:,.2f}".format(mse * 1000))
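To make the two key steps above more concrete (scaling features with statistics taken from the training split only, then fitting a linear model by stochastic gradient descent on the squared error), here is a minimal sketch in plain Python. It is not scikit-learn's implementation: the helper names (`standardize_fit`, `standardize_apply`, `sgd_fit`, `predict`, `mse`), the learning rate, the epoch count, and the synthetic data are all assumptions chosen for illustration.

```python
import random

def standardize_fit(rows):
    """Compute per-feature mean and standard deviation from the given rows."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(var ** 0.5 or 1.0)  # fall back to 1.0 for constant features
    return means, stds

def standardize_apply(rows, means, stds):
    """Scale rows with previously computed statistics."""
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def sgd_fit(X, y, lr=0.01, epochs=200, seed=42):
    """Fit weights w and intercept b by SGD on the squared error (hypothetical settings)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            err = b + sum(wj * xj for wj, xj in zip(w, X[i])) - y[i]
            # gradient of 0.5 * err**2 with respect to w is err * x, and err for b
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b

def predict(X, w, b):
    return [b + sum(wj * xj for wj, xj in zip(w, r)) for r in X]

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic, noise-free data with features on very different scales (an assumption,
# standing in for the house prices dataset).
data_rng = random.Random(0)
X = [[data_rng.uniform(0, 1), data_rng.uniform(0, 1000)] for _ in range(200)]
y = [3 * x1 + 0.01 * x2 + 5 for x1, x2 in X]

# Same 4/5 train, 1/5 test split as in the snippet above.
split = 4 * len(X) // 5
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Statistics come from the training split only and are reused for the test split.
means, stds = standardize_fit(X_train)
X_train = standardize_apply(X_train, means, stds)
X_test = standardize_apply(X_test, means, stds)

w, b = sgd_fit(X_train, y_train)
test_mse = mse(y_test, predict(X_test, w, b))
print("Test Mean Squared Error: {:.6f}".format(test_mse))
```

Without the scaling step, the second feature (values up to 1000) would make the same learning rate diverge, which is one practical reason the features are normalized before running SGD.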