How to implement logistic regression in NumPy
Published on Aug. 22, 2023, 12:16 p.m.
To implement logistic regression in NumPy, define the input matrix X and the label vector y, then fit the model coefficients with gradient descent on the logistic (log) loss. Here is an example code snippet:
import numpy as np
# Define the input matrix X and output vector y
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([0, 0, 1, 1])
# Define the sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
# Calculate the coefficients of the logistic regression model
w = np.zeros(X.shape[1])
alpha = 0.1
num_iterations = 1000
for i in range(num_iterations):
    z = X.dot(w)
    h = sigmoid(z)
    gradient = X.T.dot(h - y) / y.size
    w -= alpha * gradient
# Print the coefficients
print(w)
In this code, we define the input matrix X as a 4x2 NumPy array, where each row is one sample with two input features. We also define the output vector y as a 1-dimensional NumPy array of binary labels (0 or 1), one per row of X.
We then define the sigmoid function, which maps the output of the linear model, X.dot(w), to a probability between 0 and 1.
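For inputs with a large magnitude, np.exp(-z) can overflow and trigger runtime warnings. The following is a minimal sketch of a numerically safer variant; the name stable_sigmoid is hypothetical and not part of the original snippet, and it assumes the input is a NumPy array.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that never calls np.exp on a large positive number."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) is at most 1, so it cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)).
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

probs = stable_sigmoid(np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0]))
print(probs)
```

Both branches compute the same function as 1 / (1 + np.exp(-z)); they only differ in which form is evaluated, so the sketch could be swapped in for sigmoid without changing the results.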
We initialize the coefficients w to zero and then run gradient descent. On each of the num_iterations steps, we compute the predicted probabilities h, form the gradient of the average log loss, X.T.dot(h - y) / y.size, and move w against that gradient by the learning rate alpha.
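One way to convince yourself that the gradient expression matches the loss being minimized is to compare it with a finite-difference estimate. The sketch below does that on the same X and y; the helper names loss and gradient, the test point w, and the step size eps are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(w, X, y):
    # Average log loss (binary cross-entropy).
    h = sigmoid(X.dot(w))
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

def gradient(w, X, y):
    # Analytic gradient of the average log loss.
    return X.T.dot(sigmoid(X.dot(w)) - y) / y.size

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]], dtype=float)
y = np.array([0, 0, 1, 1], dtype=float)

# Central finite differences at an arbitrary test point.
w = np.array([0.3, -0.2])
eps = 1e-6
numeric = np.array([
    (loss(w + eps * e, X, y) - loss(w - eps * e, X, y)) / (2 * eps)
    for e in np.eye(w.size)
])
max_diff = np.max(np.abs(numeric - gradient(w, X, y)))
print("max difference:", max_diff)

# Track the loss while running the same update rule as the article.
w = np.zeros(X.shape[1])
losses = []
for _ in range(1000):
    losses.append(loss(w, X, y))
    w -= 0.1 * gradient(w, X, y)
print("first loss:", losses[0], "last loss:", losses[-1])
```

A very small difference between the two gradients, together with a loss that falls over the iterations, suggests the update rule is implemented consistently.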
Finally, we print out the coefficients of the logistic regression model.
Keep in mind that the snippet above fits no intercept term. To include one, add a column of 1s to X so that the corresponding coefficient acts as the intercept. On a real data set you would also split the data into training and testing sets. Additionally, you can use regularization to reduce overfitting by adding an L1 or L2 penalty term to the cost function used for optimization.
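As a rough sketch of two of these suggestions, the code below prepends a column of ones for the intercept and adds an L2 penalty to the gradient. The helper names add_intercept and l2_gradient, the penalty strength lam, and the choice to leave the intercept unpenalized are all assumptions made for illustration, not part of the original snippet.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def add_intercept(X):
    """Prepend a column of ones so the first coefficient acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def l2_gradient(w, X, y, lam):
    """Gradient of: mean log loss + (lam / 2) * ||w[1:]||^2."""
    grad = X.T.dot(sigmoid(X.dot(w)) - y) / y.size
    penalty = lam * w      # new array, w itself is not modified
    penalty[0] = 0.0       # assumption: the intercept is not penalized
    return grad + penalty

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]], dtype=float)
y = np.array([0, 0, 1, 1], dtype=float)
Xb = add_intercept(X)

lam = 0.1     # assumed penalty strength
alpha = 0.1   # same learning rate as the article's snippet
w = np.zeros(Xb.shape[1])
for _ in range(1000):
    w -= alpha * l2_gradient(w, Xb, y, lam)

probs = sigmoid(Xb.dot(w))
print(w)       # w[0] is the intercept, w[1:] are the feature weights
print(probs)   # predicted probabilities for the four training rows
```

Setting lam to 0 recovers the unregularized gradient, so the same function covers both cases.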
Here’s an example implementation of logistic regression in NumPy that includes model saving and loading:
import numpy as np
import pickle
class LogisticRegression:
    def __init__(self, lr=0.01, num_iter=100000, fit_intercept=True, verbose=False):
        self.lr = lr                        # learning rate for gradient descent
        self.num_iter = num_iter            # number of gradient descent steps
        self.fit_intercept = fit_intercept  # whether to prepend a column of ones
        self.verbose = verbose              # whether to print the loss periodically

    def __add_intercept(self, X):
        # Prepend a column of ones so theta[0] acts as the intercept
        intercept = np.ones((X.shape[0], 1))
        return np.concatenate((intercept, X), axis=1)

    def __sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def __loss(self, h, y):
        # Average log loss (binary cross-entropy)
        return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

    def fit(self, X, y):
        if self.fit_intercept:
            X = self.__add_intercept(X)
        # Start from all-zero weights
        self.theta = np.zeros(X.shape[1])
        for i in range(self.num_iter):
            z = np.dot(X, self.theta)
            h = self.__sigmoid(z)
            gradient = np.dot(X.T, (h - y)) / y.size
            self.theta -= self.lr * gradient
            # Report the loss every 10,000 iterations
            if self.verbose and i % 10000 == 0:
                z = np.dot(X, self.theta)
                h = self.__sigmoid(z)
                print(f'loss: {self.__loss(h, y)}')

    def predict_prob(self, X):
        if self.fit_intercept:
            X = self.__add_intercept(X)
        return self.__sigmoid(np.dot(X, self.theta))

    def predict(self, X, threshold=0.5):
        # Boolean array: True where the predicted probability reaches the threshold
        return self.predict_prob(X) >= threshold

    def save_model(self, filepath):
        # Serialize the whole object, including theta and the hyperparameters
        with open(filepath, 'wb') as f:
            pickle.dump(self, f)

    @staticmethod
    def load_model(filepath):
        with open(filepath, 'rb') as f:
            return pickle.load(f)
In this implementation, we define a LogisticRegression class that supports the fit(), predict(), predict_prob(), save_model(), and load_model() methods.
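The class above pickles the whole object. Since unpickling a file from an untrusted source can execute arbitrary code, a lighter alternative is to store only the learned parameters. The sketch below shows one possible way to do that with np.savez; the helper names save_params and load_params, the file name, and the theta values are hypothetical and used only for illustration.

```python
import os
import tempfile
import numpy as np

# Illustrative values only, not the result of an actual training run.
theta = np.array([-1.5, 0.8, 0.3])

def save_params(filepath, theta, fit_intercept=True):
    """Store the weight vector and the intercept flag in one .npz archive."""
    np.savez(filepath, theta=theta, fit_intercept=fit_intercept)

def load_params(filepath):
    """Read the weight vector and the intercept flag back from the archive."""
    with np.load(filepath) as data:
        return data["theta"], bool(data["fit_intercept"])

path = os.path.join(tempfile.mkdtemp(), "logreg_params.npz")
save_params(path, theta)
loaded_theta, loaded_flag = load_params(path)
print(loaded_theta, loaded_flag)
```

With this approach, restoring a model means constructing a new LogisticRegression instance and assigning the loaded array to its theta attribute, rather than unpickling an object.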
The fit() method trains the logistic regression model using gradient descent, and the predict() and `predict