How to train an image classification model using scikit-learn in Python

Published on Aug. 22, 2023, 12:16 p.m.

To train an image classification model using scikit-learn in Python, you first extract a fixed-length feature vector from each image with a method such as HOG or SIFT, and then fit a classifier such as RandomForestClassifier or SVC on those vectors. Here’s an example code snippet:

import os
from glob import glob

from joblib import dump, load
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


def extract_features(image_path, size=(128, 128)):
    # Read as grayscale and resize so that every HOG vector has the same length.
    # The 128x128 target size is an arbitrary choice for this example.
    image = resize(imread(image_path, as_gray=True), size)
    return hog(image, orientations=8, pixels_per_cell=(16, 16),
               cells_per_block=(1, 1))


# Load the image data and extract features
image_paths = sorted(glob('path/to/image/folder/*'))
image_features = []
image_labels = []
for image_path in image_paths:
    image_features.append(extract_features(image_path))
    # Label each image from its file name (assumes names contain 'cat' or 'dog')
    image_labels.append('cat' if 'cat' in os.path.basename(image_path) else 'dog')

# Create the classifier model (swap in SVC() here to use an SVM instead)
classifier = RandomForestClassifier(n_estimators=100)

# Fit the model on the data
classifier.fit(image_features, image_labels)

# Save the model
dump(classifier, 'image_classifier.joblib')

# Load the model
loaded_classifier = load('image_classifier.joblib')

# Predict the output for a new input, using the same feature extraction
input_features = [extract_features('path/to/test/image.jpg')]
predicted_label = loaded_classifier.predict(input_features)
print(predicted_label)

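In this code, we first load the image data and extract a HOG feature vector for each image using the scikit-image library. We then create an instance of the RandomForestClassifier class (an SVC instance could be used in exactly the same way) and fit the model on the extracted features using the fit() method. Note that the classifier expects every feature vector to have the same length, so all images must have the same dimensions when HOG is applied.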
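The snippet above fits the classifier on every image and never measures how well it performs. Below is a minimal sketch of how a held-out split and accuracy_score could be used to compare the two classifiers. The random arrays X and y are stand-ins for real HOG vectors and labels, and the 128-feature length is an assumption made only for this illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): 200 fake "feature vectors" of length 128.
# In the real workflow these would be the HOG vectors and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 128))
y = np.where(X[:, 0] > 0, 'cat', 'dog')  # label depends on the first feature

# Hold out 25% of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Fit both classifiers the same way and record held-out accuracy
models = {
    'random_forest': RandomForestClassifier(n_estimators=100, random_state=0),
    'svm': SVC(kernel='linear'),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(name, scores[name])
```

Accuracy on held-out images is a more honest estimate than accuracy on the training images, which a random forest can fit almost perfectly.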

Next, we save the trained model to disk using the dump() function from the joblib module.
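As a quick sanity check, the save and load round trip can be verified by comparing predictions before and after. This is only a sketch: the training data is random stand-in data rather than image features, and the model is written to a temporary directory chosen for this example.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.ensemble import RandomForestClassifier

# Stand-in training data (assumption), not real image features
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 16))
y = np.array(['cat', 'dog'] * 20)

classifier = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'image_classifier.joblib')
    dump(classifier, model_path)
    loaded_classifier = load(model_path)

# The reloaded model should reproduce the original predictions
same = np.array_equal(classifier.predict(X), loaded_classifier.predict(X))
print(same)
```

Because joblib files are pickle-based, they should only be loaded from sources you trust, ideally with the same scikit-learn version that created them.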

Finally, we load the saved model using the load() function from the joblib module. To make a prediction for a new image, we extract its features in exactly the same way as for the training data and pass them to the predict() method of the loaded classifier.
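One way to guarantee that training and prediction share the same feature extraction is to put it in a single helper. The sketch below uses a hypothetical extract_hog function with an assumed 128x128 target size, and random arrays stand in for real grayscale images.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize


def extract_hog(image, size=(128, 128)):
    """Resize a grayscale image and return its HOG feature vector."""
    resized = resize(image, size)
    return hog(resized, orientations=8, pixels_per_cell=(16, 16),
               cells_per_block=(1, 1))


# Stand-in grayscale images of different sizes (assumption)
rng = np.random.default_rng(0)
train_image = rng.random((200, 300))
new_image = rng.random((96, 64))

# Both vectors have 8 x 8 cells x 8 orientations = 512 features
train_fd = extract_hog(train_image)
new_fd = extract_hog(new_image)
print(train_fd.shape, new_fd.shape)
```

Without the resize step, images of different sizes would produce HOG vectors of different lengths, and predict() would reject any input whose length differs from the training vectors.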