How to implement logistic regression in NumPy

Published on Aug. 22, 2023, 12:16 p.m.

To implement logistic regression in NumPy, define the input matrix X and output vector y, then fit the model's coefficients with gradient descent. Here is an example code snippet:

import numpy as np

# Define the input matrix X and output vector y
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([0, 0, 1, 1])

# Define the sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Fit the coefficients with batch gradient descent
w = np.zeros(X.shape[1])  # initialize the weights to zero
alpha = 0.1               # learning rate
num_iterations = 1000     # number of gradient descent steps
for i in range(num_iterations):
    z = X.dot(w)                        # linear combination of inputs and weights
    h = sigmoid(z)                      # predicted probabilities
    gradient = X.T.dot(h - y) / y.size  # gradient of the log loss
    w -= alpha * gradient               # step toward lower loss

# Print the coefficients
print(w)

In this code, we define the input matrix X as a 4x2 NumPy array, with each row representing one example's pair of input features. We also define the output vector y as a 1-dimensional NumPy array of binary class labels (0 or 1).

We then define the sigmoid function, which maps the linear combination z = Xw to a probability between 0 and 1: sigmoid(0) = 0.5, and large positive or negative inputs map close to 1 or 0, respectively.

We initialize the coefficients w to zero and then use gradient descent to update them: on each of the specified number of iterations, we compute the predicted probabilities, form the gradient of the log loss, and move the weights a small step (the learning rate alpha) against that gradient.

Finally, we print out the coefficients of the logistic regression model.
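
Once training finishes, the same sigmoid turns new inputs into class probabilities, and thresholding at 0.5 turns those into hard labels. Here is a minimal sketch; the new data points are made up for illustration:

# Predict probabilities and class labels for unseen inputs
X_new = np.array([[1.5, 2.5], [3.5, 4.5]])
probs = sigmoid(X_new.dot(w))        # probabilities in (0, 1)
labels = (probs >= 0.5).astype(int)  # hard 0/1 predictions
print(probs, labels)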

Keep in mind that you might need to preprocess the data set: add a column of 1's to X to represent the intercept term of the logistic regression model, and split the data into training and testing sets. Additionally, you can use regularization to prevent overfitting by adding an L1 or L2 penalty term to the cost function used for optimization, as sketched below.
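
Here is one minimal way to do both, extending the snippet above; the 75/25 split ratio and the penalty strength lam are arbitrary illustrative choices:

# Add a column of 1's so the first weight acts as the intercept
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# Simple random train/test split (75% / 25%)
rng = np.random.default_rng(0)
idx = rng.permutation(X_b.shape[0])
split = int(0.75 * len(idx))
X_train, X_test = X_b[idx[:split]], X_b[idx[split:]]
y_train, y_test = y[idx[:split]], y[idx[split:]]

# Gradient descent with an L2 penalty; the intercept weight w[0]
# is conventionally left unpenalized
w = np.zeros(X_train.shape[1])
alpha, lam = 0.1, 0.01
for i in range(1000):
    h = sigmoid(X_train.dot(w))
    gradient = X_train.T.dot(h - y_train) / y_train.size
    gradient[1:] += lam * w[1:] / y_train.size
    w -= alpha * gradient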

Here’s an example implementation of logistic regression in NumPy that includes model saving and loading:

import numpy as np
import pickle

class LogisticRegression:

    def __init__(self, lr=0.01, num_iter=100000, fit_intercept=True, verbose=False):
        self.lr = lr
        self.num_iter = num_iter
        self.fit_intercept = fit_intercept
        self.verbose = verbose

    def __add_intercept(self, X):
        # Prepend a column of 1's so the first weight acts as the intercept
        intercept = np.ones((X.shape[0], 1))
        return np.concatenate((intercept, X), axis=1)

    def __sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def __loss(self, h, y):
        # Binary cross-entropy (log loss); clip to avoid log(0)
        h = np.clip(h, 1e-15, 1 - 1e-15)
        return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

    def fit(self, X, y):
        if self.fit_intercept:
            X = self.__add_intercept(X)

        self.theta = np.zeros(X.shape[1])

        for i in range(self.num_iter):
            z = np.dot(X, self.theta)
            h = self.__sigmoid(z)
            gradient = np.dot(X.T, (h - y)) / y.size
            self.theta -= self.lr * gradient

            if self.verbose and i % 10000 == 0:
                z = np.dot(X, self.theta)
                h = self.__sigmoid(z)
                print(f'iteration {i}, loss: {self.__loss(h, y)}')

    def predict_prob(self, X):
        if self.fit_intercept:
            X = self.__add_intercept(X)

        return self.__sigmoid(np.dot(X, self.theta))

    def predict(self, X, threshold=0.5):
        # Convert probabilities to hard 0/1 labels
        return (self.predict_prob(X) >= threshold).astype(int)

    def save_model(self, filepath):
        with open(filepath, 'wb') as f:
            pickle.dump(self, f)

    @staticmethod
    def load_model(filepath):
        with open(filepath, 'rb') as f:
            return pickle.load(f)

In this implementation, we define a LogisticRegression class that supports the fit(), predict(), predict_prob(), save_model(), and load_model() methods.

The fit() method trains the logistic regression model using gradient descent; predict_prob() returns the predicted probabilities, and predict() converts them to 0/1 class labels using a configurable threshold. The save_model() and load_model() methods use pickle to write the trained model to disk and read it back, so a model can be trained once and reused later.
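
For completeness, here is a short usage sketch of the class above; the file name model.pkl is an arbitrary choice:

import numpy as np

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([0, 0, 1, 1])

# Train and persist the model
model = LogisticRegression(lr=0.1, num_iter=10000)
model.fit(X, y)
model.save_model('model.pkl')

# Load it back and predict
restored = LogisticRegression.load_model('model.pkl')
print(restored.predict(X))       # hard 0/1 labels
print(restored.predict_prob(X))  # predicted probabilities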