How to drop a duplicate column in pandas?

Published on Aug. 22, 2023, 12:15 p.m.

To drop columns with duplicate names in pandas, you can build a boolean mask with the column index's duplicated() method and select the remaining columns with .loc. Here’s an example of how to do this:

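import pandas as pd

# A dict cannot hold two identical keys, so the repeated name is passed via columns=
df = pd.DataFrame([[1, 4, 4],
                   [2, 5, 5],
                   [3, 6, 6]],
                  columns=['A', 'B', 'B'])

# Keep only the first occurrence of each column name
if df.columns.duplicated().any():
    df = df.loc[:, ~df.columns.duplicated()]

print(df)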

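In this example, we’re checking whether the DataFrame has any duplicated column names using df.columns.duplicated(), which returns a boolean array that is True for every repeat of a name that has already appeared. If there are duplicates, we invert that mask with ~ and pass it to df.loc[:, ...], so only the columns not flagged as duplicates are selected.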
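To make the mask easier to picture, here is a minimal sketch of duplicated() on a standalone index. The labels in cols are hypothetical, and the keep argument is shown only to illustrate which occurrence gets flagged:

```python
import pandas as pd

# Hypothetical column labels with one repeated name
cols = pd.Index(['A', 'B', 'B'])

mask_first = cols.duplicated()            # default keep='first': flag later repeats
mask_last = cols.duplicated(keep='last')  # flag every occurrence except the last
mask_all = cols.duplicated(keep=False)    # flag every occurrence of a repeated name

print(mask_first.tolist())  # [False, False, True]
print(mask_last.tolist())   # [False, True, False]
print(mask_all.tolist())    # [False, True, True]
```

With the default setting, inverting the mask keeps the first column of each name; keep='last' would keep the last one instead, and keep=False would drop every column whose name is repeated.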

Selecting with df.loc[:, ~df.columns.duplicated()] keeps the first occurrence of each column name, drops any later column with the same name, and preserves the original order of the remaining columns.
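Note that df.columns.duplicated() compares column names only. If the goal is instead to drop columns that have different names but identical values, one possible sketch is to transpose the DataFrame and apply DataFrame.duplicated() to the resulting rows. The variable names value_mask and deduped are illustrative, and this assumes the frame is small enough that transposing it is acceptable:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [4, 5, 6]})

# Transpose so each column becomes a row, then flag rows equal to an earlier row
value_mask = df.T.duplicated().to_numpy()

# Keep the columns whose values have not been seen before
deduped = df.loc[:, ~value_mask]

print(deduped.columns.tolist())  # ['A', 'B']
```

Transposing a DataFrame with mixed dtypes converts the values to object dtype, so this approach can be slow on large frames.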
