How to calculate the sum of a column in a Pandas DataFrame?

Published on Aug. 22, 2023, 12:18 p.m.

To calculate the sum of a column in a Pandas DataFrame, select the column and call its sum() method. Here’s an example:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}) # create a DataFrame with columns A and B
sum_col = df['B'].sum() # calculate the sum of column 'B'
print(sum_col)

Output:

15

In the example above, df['B'] selects column ‘B’ as a Series, and sum() adds up its values (4 + 5 + 6), so sum_col holds 15.
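
It can also help to know how sum() treats missing values. The sketch below uses a made-up column containing a NaN (the data and the variable names df_nan, total_default and total_strict are illustrative, not part of the example above) to show the default behaviour and the skipna parameter:

```python
import numpy as np
import pandas as pd

# Hypothetical data: column 'B' contains one missing value
df_nan = pd.DataFrame({'B': [4.0, np.nan, 6.0]})

# By default (skipna=True) missing values are ignored
total_default = df_nan['B'].sum()

# With skipna=False a single NaN makes the whole result NaN
total_strict = df_nan['B'].sum(skipna=False)

print(total_default)
print(total_strict)
```

Output:

10.0
nan

Whether skipping missing values is the right choice depends on the data; passing skipna=False is one way to make gaps visible instead of silently ignoring them.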

You can also calculate the sum of every value in the DataFrame by calling sum() on the DataFrame itself and then calling sum() again on the result:

sum_all = df.sum().sum() # calculate the sum of all values in the DataFrame
print(sum_all)

Output:

21

In this example, sum_all stores the sum of all values in the DataFrame. The first call to sum() returns a Series of per-column totals (6 for ‘A’ and 15 for ‘B’), and the second call adds those totals together to give 21.
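
To make the intermediate step visible, here is a small sketch using the same DataFrame. The names col_sums, row_sums and subset_total are chosen for illustration only; axis=1 is the standard DataFrame.sum argument for summing across columns within each row:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# One total per column, returned as a Series indexed by column name
col_sums = df.sum()

# One total per row: axis=1 adds the values across columns
row_sums = df.sum(axis=1)

# Restrict the grand total to chosen columns by selecting them first
subset_total = df[['A', 'B']].sum().sum()

print(col_sums['A'], col_sums['B'])
print(row_sums.tolist())
print(subset_total)
```

Output:

6 15
[5, 7, 9]
21

Selecting a list of columns before summing is a simple way to total only part of a wider DataFrame.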

Overall, sum() covers both cases: call it on a single column to get that column's total, or call it twice on the DataFrame to get the total of all values.
