How to count the number of unique values in a column of a Pandas DataFrame?
Published on Aug. 22, 2023, 12:18 p.m.
To count the number of unique values in a column of a Pandas DataFrame, select the column and call its nunique() method. Here's an example:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': ['x', 'y', 'x']}) # create a DataFrame with columns A and B
num_unique = df['B'].nunique() # count the number of unique values in column 'B'
print(num_unique)
Output:
2
In the example above, num_unique stores the number of unique values in column 'B' of the DataFrame. The nunique() method returns the count of distinct values in a column, so in this case there are 2 unique values ('x' and 'y') in column 'B'.
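One detail worth knowing is how missing values are treated: by default nunique() skips NaN, and passing dropna=False counts NaN as a value of its own. The snippet below is a small sketch of that behavior; the Series s is a made-up example and is not part of the DataFrame above.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value (illustration only)
s = pd.Series(['x', 'y', 'x', np.nan])

count_default = s.nunique()               # NaN is ignored by default (dropna=True)
count_with_nan = s.nunique(dropna=False)  # NaN is counted as one more distinct value

print(count_default)   # 2
print(count_with_nan)  # 3
```

If the column may contain missing data, decide which of the two counts you actually want before relying on the default.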
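You can also call nunique() on the entire DataFrame. This returns the number of unique values in each column, and adding sum() totals those per-column counts:
num_unique = df.nunique().sum() # add up the per-column unique counts (3 for 'A' + 2 for 'B')
print(num_unique)
Output:
5
In this example, num_unique stores the sum of the unique counts of every column: 3 for column 'A' plus 2 for column 'B'. Note that a value appearing in more than one column is counted once per column, so this total is not necessarily the number of distinct values in the DataFrame as a whole.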
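Summing per-column counts and counting the distinct values of the whole DataFrame can give different answers when columns share values. Here is a minimal sketch of the difference, assuming a hypothetical DataFrame df2 whose two columns overlap; flattening the values and passing them to pd.unique() is just one possible way to get the overall count.

```python
import pandas as pd

# Hypothetical DataFrame whose columns share the values 'x' and 'y'
df2 = pd.DataFrame({'A': ['x', 'y', 'z'], 'B': ['x', 'y', 'x']})

per_column = df2.nunique()           # Series of counts: A -> 3, B -> 2
total_of_counts = per_column.sum()   # 3 + 2 = 5, shared values counted once per column

# Flatten all cells into one 1-D array, then count the distinct values overall
distinct_overall = len(pd.unique(df2.to_numpy().ravel()))

print(total_of_counts)   # 5
print(distinct_overall)  # 3
```

Use the per-column Series when you care about each column separately, and the flattened count when you want distinct values regardless of which column they are in.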
So, nunique() is the most direct way to count unique values in Pandas: call it on a single column to get one number, or on the whole DataFrame to get a count for each column.