How to train a logistic regression model using scikit-learn in Python

Published on Aug. 22, 2023, 12:16 p.m.

To train a logistic regression model using scikit-learn in Python, you can use the LogisticRegression class from the scikit-learn library. Here’s an example code snippet:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Load the data
X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 0, 1])

# Create the logistic regression model
model = LogisticRegression()

# Fit the model on the data
model.fit(X, y)

# Predict the output for a new input
test_input = np.array([[7, 8]])
output = model.predict(test_input)
print(output)

In this code, we first load some sample data into X and y. The X variable should contain the training feature matrix (one row per sample, one column per feature), and y should contain the corresponding binary class labels (0 or 1) for each sample.
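The shape requirement described above can be checked directly. The following is a minimal sketch using the same sample arrays; the name single_sample is introduced here only for illustration. It also shows how a single sample stored as a flat array can be reshaped into the one-row, two-dimensional form that scikit-learn estimators expect.

```python
import numpy as np

# Same sample data as in the example above
X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 0, 1])

# X is 2-D: (number of samples, number of features)
print(X.shape)
# y is 1-D: one label per sample
print(y.shape)

# A single new sample stored as a flat array (hypothetical name) ...
single_sample = np.array([7, 8])
# ... must be reshaped to one row before being passed to predict()
test_input = single_sample.reshape(1, -1)
print(test_input.shape)
```

The number of rows in X has to match the length of y, and any input passed to predict() needs the same number of columns as the training data.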

We then create an instance of the LogisticRegression class and fit the model on the data using the fit() method.

Finally, we can use the trained model to make predictions for new inputs using the predict() method.
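Besides hard class labels, a fitted LogisticRegression model can also return class probabilities through predict_proba(). The snippet below is a small sketch that reuses the sample data from the example above; with only three training samples the probabilities themselves are not meaningful, so it only illustrates the call and the shape of the result.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same sample data as in the example above
X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 0, 1])

model = LogisticRegression()
model.fit(X, y)

test_input = np.array([[7, 8]])

# One row per input sample, one column per class, in the order of model.classes_
proba = model.predict_proba(test_input)
print(model.classes_)
print(proba)

# predict() returns the class with the highest predicted probability
predicted = model.predict(test_input)
print(predicted)
```

Each row of the probability array sums to 1, and the columns follow the order given by model.classes_.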

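This is a minimal example. LogisticRegression also accepts parameters such as C (the inverse of the regularization strength), solver, and max_iter, and a fitted model provides predict_proba() for class probabilities and score() for mean accuracy on a labeled dataset.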
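As one possible illustration of training and evaluation together, the sketch below generates a synthetic dataset with make_classification, holds out part of it with train_test_split, and reports accuracy on the held-out part. The dataset size, the value of C, and the random_state values are arbitrary assumptions chosen for the example, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (sizes chosen arbitrarily for illustration)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Hold out 25% of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# C is the inverse regularization strength; smaller values regularize more strongly
model = LogisticRegression(C=0.5, max_iter=200)
model.fit(X_train, y_train)

# score() returns the mean accuracy on the given data and labels
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Evaluating on data the model has not seen during fit() gives a more honest estimate of performance than scoring on the training set.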

How to save and load machine learning models

A trained model can be saved to disk and loaded again later with Python’s built-in pickle module or with the joblib library. Here’s how to use each of them:

  1. Using pickle to save and load a model:
import pickle

# Assuming you already have a trained scikit-learn model called 'model'
# Save the model
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load the model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)
  2. Using joblib to save and load a model:
import joblib

# Assuming you already have a trained scikit-learn model called 'model'
# Save the model
joblib.dump(model, 'model.joblib')

# Load the model
loaded_model = joblib.load('model.joblib')

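Both pickle and joblib can serialize a trained model to a file and restore it later. joblib is generally more efficient for objects that hold large NumPy arrays, which is common for fitted scikit-learn estimators. In either case, load the file with the same scikit-learn version that was used to save it, and only load files from sources you trust, because unpickling can execute arbitrary code. PyTorch and TensorFlow have their own methods for saving and loading models, which are described in their respective documentation.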
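A quick way to confirm that a saved model was restored correctly is to compare its predictions with those of the original. The following is a minimal sketch using pickle and the sample data from the first example; it writes to a temporary directory (an assumption made so that the example leaves no files behind) rather than to a fixed path.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

# Train a small model on the sample data from the first example
X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 0, 1])
model = LogisticRegression()
model.fit(X, y)

# Save and reload the model inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        loaded_model = pickle.load(f)

# The restored model should produce the same predictions as the original
same_predictions = np.array_equal(model.predict(X), loaded_model.predict(X))
print(same_predictions)
```

The same check works with joblib by replacing the pickle calls with joblib.dump(model, path) and joblib.load(path).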