How to Perform OLS Regression in Python

Published on Aug. 22, 2023, 12:16 p.m.

The following example shows how to fit an ordinary least squares (OLS) regression in Python using the statsmodels library:

1. Import required libraries and load data

import pandas as pd
import statsmodels.api as sm

data = pd.read_csv('data.csv')
Y = data['target_variable']
X = data[['predictor_variable_1', 'predictor_variable_2', 'predictor_variable_3']]


2. Add a constant to X

X = sm.add_constant(X)


3. Create the OLS model and fit the data

model = sm.OLS(Y, X).fit()


4. View the results

print(model.summary())



In this example, `Y` is the dependent variable and `X` holds the independent variables. `sm.add_constant()` adds a column of ones to `X`; `sm.OLS()` does not include an intercept by default, so this column is what lets the model estimate one. `sm.OLS(Y, X)` creates the regression model, with the dependent variable passed first, and the `fit()` method estimates the coefficients from the data and returns a results object. Calling `summary()` on that object returns a table of the regression results, including the coefficient estimates, standard errors, p-values, and R-squared.

You can use the regression results to analyze the relationship between the dependent and independent variables, evaluate the significance of the independent variables, and make predictions on new data.

Note that other libraries, such as scikit-learn, also support linear regression: scikit-learn's `LinearRegression` class estimates the model coefficients by ordinary least squares and can predict values for new data.
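For comparison, a minimal scikit-learn sketch might look like the following. The synthetic data and its true coefficients (1.0, 2.0, -0.5) are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): y = 1 + 2*x1 - 0.5*x2 plus noise
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=100)

# fit_intercept=True is the default, so no column of ones is added by hand
reg = LinearRegression()
reg.fit(X, y)

print(reg.intercept_)  # estimated intercept
print(reg.coef_)       # one estimated coefficient per column of X

# Predict for one new observation (one row, two features)
preds = reg.predict(np.array([[0.5, 1.0]]))
print(preds)
```

Unlike statsmodels, `LinearRegression` fits the intercept by default and does not produce a summary table with standard errors or p-values, so it tends to suit prediction tasks more than statistical inference.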
