How to calculate the mean, median, mode, and sum of a column in a Pandas DataFrame?
Published on Aug. 22, 2023, 12:19 p.m.
To calculate the mean, median, mode, and sum of a column in a Pandas DataFrame, you can use built-in pandas methods. Here is an example:
import pandas as pd
# Create a sample DataFrame with columns 'A' and 'B'
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
# Calculate the mean, median, mode, and sum of the 'B' column
mean_val = df['B'].mean()
median_val = df['B'].median()
mode_val = df['B'].mode().values[0]
sum_val = df['B'].sum()
# Print the statistics
print('Mean:', mean_val)
print('Median:', median_val)
print('Mode:', mode_val)
print('Sum:', sum_val)
This code will output the following statistics for the ‘B’ column:
Mean: 5.0
Median: 5.0
Mode: 4
Sum: 15
In this example, we first created a DataFrame with columns ‘A’ and ‘B’. We then calculated the mean, median, mode (using the .mode() method), and sum of the ‘B’ column using the appropriate pandas methods.
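Note that .mode() returns a pandas Series rather than a single value. If multiple values share the highest frequency, the Series contains all of them, sorted in ascending order, and .values[0] (or .iloc[0]) picks the first, which is the smallest. In the example above, every value in ‘B’ occurs exactly once, so all three values are modes and 4 is the one reported.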
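The tie case is easier to see with repeated values. The following is a minimal sketch using a hypothetical Series `s` that is not part of the example above. Here 5 and 6 both occur twice, so `.mode()` returns a Series holding both:

```python
import pandas as pd

# Hypothetical data: 5 and 6 are tied for the highest frequency
s = pd.Series([4, 5, 5, 6, 6])

# .mode() returns a Series with every tied value, sorted ascending
modes = s.mode()
print(modes.tolist())

# Take the first (smallest) mode as a single scalar
first_mode = modes.iloc[0]
print(first_mode)
```

If you need every mode rather than just one, keep the whole Series (or call `.tolist()` on it) instead of indexing into it.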
You can also compute these statistics on several columns at once by selecting them with a list of column names, for example df[['A', 'B']].mean(), and then calling the method on the result, which returns one value per column.
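As a rough sketch of the multi-column case, the snippet below reuses the same sample data. The variable names `cols` and `summary` are illustrative only. Selecting with a list of names gives a DataFrame, and each method then works column by column:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
cols = ['A', 'B']

# Each call returns a Series indexed by column name
means = df[cols].mean()
medians = df[cols].median()
sums = df[cols].sum()

# .mode() on a DataFrame returns a DataFrame, one column of modes per input column
modes = df[cols].mode()

# .agg() collects several statistics into a single table
summary = df[cols].agg(['mean', 'median', 'sum'])
print(summary)
```

Using `.agg()` with a list of function names is convenient when you want the statistics side by side. Mode is kept separate here because it can return more than one value per column.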