How to count the number of unique values in a column of a Pandas DataFrame?

Published on Aug. 22, 2023, 12:18 p.m.

To count the number of unique values in a column of a Pandas DataFrame, call the nunique() method on that column. Here’s an example:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'x']}) # create a DataFrame with columns A and B
num_unique = df['B'].nunique() # count the number of unique values in column 'B'
print(num_unique)

Output:

2

In the example above, num_unique stores the number of unique values in column ‘B’ of the DataFrame. The nunique() function returns the count of distinct values in a column, so in this case there are 2 unique values (‘x’ and ‘y’) in column ‘B’.
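One detail worth knowing about the count described above is how missing values are treated. The sketch below uses a hypothetical Series (not part of the original example) that contains a NaN, and relies on the dropna parameter of nunique(), which defaults to True:

```python
import numpy as np
import pandas as pd

# Hypothetical column containing one missing value
s = pd.Series(['x', 'y', 'x', np.nan])

count_default = s.nunique()               # NaN is ignored by default
count_with_nan = s.nunique(dropna=False)  # NaN is counted as its own value

print(count_default)   # 2
print(count_with_nan)  # 3
```

If missing entries should count as a separate category, passing dropna=False is the simplest way to include them.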

You can also call nunique() on the entire DataFrame. It returns one count per column, and summing those counts gives the total of the per-column unique counts:

num_unique = df.nunique().sum() # sum the per-column unique counts (3 for 'A' + 2 for 'B')
print(num_unique)

Output:

5

In this example, num_unique stores the sum of the per-column counts: column ‘A’ has 3 unique values and column ‘B’ has 2, giving 5. A value that appears in more than one column is counted once for each column it appears in.
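Note that df.nunique().sum() adds up per-column counts, which is not the same as counting distinct values pooled across columns or counting distinct rows. The sketch below shows the three results side by side. The teams DataFrame and its column names are hypothetical, chosen so that the columns share some values:

```python
import pandas as pd

# Hypothetical DataFrame whose two columns share the values 'b' and 'c'
teams = pd.DataFrame({'home': ['a', 'b', 'c'], 'away': ['b', 'c', 'd']})

# Sum of per-column counts: 3 ('home') + 3 ('away')
per_column_total = teams.nunique().sum()

# Distinct values after pooling both columns into one Series: a, b, c, d
pooled = pd.Series(teams.to_numpy().ravel()).nunique()

# Distinct (home, away) row combinations
distinct_rows = len(teams.drop_duplicates())

print(per_column_total)  # 6
print(pooled)            # 4
print(distinct_rows)     # 3
```

Which of the three is appropriate depends on whether the question is about each column separately, about the values regardless of column, or about whole rows.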

So, to count the unique values in a single column, call nunique() on that column. Calling nunique() on the whole DataFrame returns one count per column, which you can inspect individually or sum.
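For comparison, the same single-column count can be reached in other ways. This is a small sketch using the DataFrame from the first example. The three results agree here because column 'B' has no missing values, and no claim is made about their relative speed:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'x']})

via_nunique = df['B'].nunique()                 # direct count
via_unique = len(df['B'].unique())              # length of the array of distinct values
via_value_counts = len(df['B'].value_counts())  # number of rows in the frequency table

print(via_nunique, via_unique, via_value_counts)  # 2 2 2
```

nunique() is the most direct of the three, since it returns the count without building an intermediate result that you then have to measure.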