How to group data in pandas by a specific column?
Published on Aug. 22, 2023, 12:17 p.m.
To group data in pandas by a specific column, call the DataFrame's groupby()
method with the column name, then select the column you want to summarize and
apply an aggregation to it. Here's an example:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
        'Score': [85, 75, 90, 95, 80, 70],
        'Subject': ['Math', 'Math', 'Math', 'English', 'English', 'English']}
df = pd.DataFrame(data)
# Group the data by the 'Name' column and calculate the mean score for each group
grouped = df.groupby('Name')['Score'].mean()
print(grouped)
This will output:
Name
Alice      90.0
Bob        77.5
Charlie    80.0
Name: Score, dtype: float64
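The last line of that output shows that the result is a pandas Series indexed by the group key. As a minimal sketch of working with it, the snippet below looks up one group by label and converts the result back into a regular DataFrame. The variable names alice_mean and as_frame are only illustrative and are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Score': [85, 75, 90, 95, 80, 70],
    'Subject': ['Math', 'Math', 'Math', 'English', 'English', 'English'],
})

grouped = df.groupby('Name')['Score'].mean()

# The result is a Series indexed by the group key, so a single
# group's value can be looked up by its label.
alice_mean = grouped.loc['Alice']
print(alice_mean)  # 90.0

# reset_index() moves the group key out of the index and back into
# an ordinary column, giving a DataFrame with 'Name' and 'Score'.
as_frame = grouped.reset_index()
print(as_frame)
```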
In the example above, we grouped the data by the 'Name' column and calculated the mean score for each group. You can call other aggregation methods such as sum(), min(), and max() on the result of groupby() to perform other per-group computations.
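As a rough illustration of those other aggregations, the sketch below reuses the same data and applies sum(), min(), and max() to the grouped 'Score' column. It also shows agg(), which computes several aggregations in one call. The variable names (by_name, total, lowest, highest, summary) are illustrative choices, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob', 'Charlie'],
    'Score': [85, 75, 90, 95, 80, 70],
    'Subject': ['Math', 'Math', 'Math', 'English', 'English', 'English'],
})

# Group once and reuse the grouped 'Score' column.
by_name = df.groupby('Name')['Score']

total = by_name.sum()    # total score per name
lowest = by_name.min()   # lowest score per name
highest = by_name.max()  # highest score per name

# agg() accepts a list of aggregation names and returns a DataFrame
# with one column per aggregation.
summary = by_name.agg(['sum', 'min', 'max'])
print(summary)
```

Grouping once and reusing the grouped object avoids repeating the groupby() call for each statistic.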
I hope this helps! Let me know if you have any other questions.