How to fill null values in a column of a Pandas DataFrame?
Published on Aug. 22, 2023, 12:19 p.m.
To fill null (or missing) values in a column of a Pandas DataFrame, you can use the fillna() method. Here is an example:
import pandas as pd
# Create a sample DataFrame with a column of numbers and some null values
data = {'numbers': [1, 2, None, 4, None]}
df = pd.DataFrame(data)
# Fill the null values in the 'numbers' column with the mean of the column
df['numbers'] = df['numbers'].fillna(df['numbers'].mean())
# Print the updated DataFrame
print(df)
This code will output the following DataFrame with the null values replaced by the mean of the ‘numbers’ column:
    numbers
0  1.000000
1  2.000000
2  2.333333
3  4.000000
4  2.333333
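In this example, we first created a DataFrame with a column of numbers and some null values. We then used the fillna() method to fill the null values in the ‘numbers’ column with the mean of the column. Because mean() skips null values by default, the mean is computed from 1, 2, and 4 only, which gives about 2.33.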
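To confirm that the fill did what was expected, one option is to count the null values before and after. The following is a minimal sketch that reuses the sample data from the example above; the variable names nulls_before and nulls_after are chosen here for illustration only.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'numbers': [1, 2, None, 4, None]})

# Count the null values before filling
nulls_before = int(df['numbers'].isna().sum())

# Fill with the column mean; mean() skips null values by default
df['numbers'] = df['numbers'].fillna(df['numbers'].mean())

# Count the null values after filling
nulls_after = int(df['numbers'].isna().sum())

print(nulls_before, nulls_after)  # 2 0
```

A count of zero after the fill shows that no nulls remain in the column.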
You can also use other methods to fill null values, such as the ffill() or bfill() methods to forward-fill or backward-fill values respectively. Additionally, you can pass a value or a dictionary to the fillna() method to fill null values with a specific value or different values for different columns.
# Each statement below is an independent alternative; apply one to a column that still contains nulls
# Forward-fill: copy the last valid value forward (leading nulls stay null)
df['numbers'] = df['numbers'].ffill()
# Backward-fill: copy the next valid value backward (trailing nulls stay null)
df['numbers'] = df['numbers'].bfill()
# Fill missing values with a specific value (e.g. 0)
df['numbers'] = df['numbers'].fillna(0)
# Use a dictionary to fill missing values in different columns with different values
# (assumes the DataFrame also has a 'letters' column)
df = df.fillna({'numbers': 0, 'letters': 'N/A'})
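To compare these options side by side, the sketch below applies each one to a fresh copy of the data instead of overwriting the same column. The 'letters' column and its values are hypothetical, added only so that the dictionary form has a second column to fill.

```python
import pandas as pd

# Hypothetical sample data; the 'letters' column is added for illustration
df = pd.DataFrame({
    'numbers': [1, 2, None, 4, None],
    'letters': ['a', None, 'c', None, 'e'],
})

# Each call returns a new object, so df itself is left unchanged
forward = df['numbers'].ffill()
backward = df['numbers'].bfill()
constant = df['numbers'].fillna(0)
per_column = df.fillna({'numbers': 0, 'letters': 'N/A'})

print(forward.tolist())                # [1.0, 2.0, 2.0, 4.0, 4.0]
print(backward.tolist())               # [1.0, 2.0, 4.0, 4.0, nan]
print(constant.tolist())               # [1.0, 2.0, 0.0, 4.0, 0.0]
print(per_column['letters'].tolist())  # ['a', 'N/A', 'c', 'N/A', 'e']
```

Note that the backward-fill leaves the last value null, because there is no later value to copy from. A forward-fill has the same limitation for nulls at the start of a column.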
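In any case, choose the fill method that suits your data. A mean or median is a common choice for numeric columns where row order does not matter, a forward-fill or backward-fill suits ordered data such as time series, and a constant is useful when filled values should remain easy to identify.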