How to split columns in pandas?

Published on Aug. 22, 2023, 12:18 p.m.

To split a string column in pandas, you can use the Series.str.split() method. It splits each value on a delimiter and, when called with expand=True, returns a DataFrame with one column per piece.

Here is an example of how to split a column in pandas:

import pandas as pd

# Create a DataFrame with a column to split
df = pd.DataFrame({'name': ['John Smith', 'Mary Johnson', 'Frank Miller']})

# Split the name column into first and last name columns
df[['first_name', 'last_name']] = df['name'].str.split(' ', expand=True)

# Print the updated DataFrame
print(df)

In this code, we create a pandas DataFrame df with a single column called name. We then call df['name'].str.split(' ', expand=True) to split each name on the space character ' '. The expand=True argument tells pandas to return the pieces as separate columns instead of a list in each row. Finally, we assign the result to df[['first_name', 'last_name']] to add the new columns to the DataFrame. The assignment works here because every name contains exactly one space, so the split produces exactly two columns.
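To make the effect of expand=True concrete, here is a minimal sketch that compares the two return shapes on the same names. The variable names as_lists and as_frame are made up for illustration.

```python
import pandas as pd

names = pd.Series(['John Smith', 'Mary Johnson', 'Frank Miller'], name='name')

# Without expand, the result is a Series whose rows are Python lists
as_lists = names.str.split(' ')

# With expand=True, the result is a DataFrame with integer column labels 0, 1
as_frame = names.str.split(' ', expand=True)

print(as_lists.iloc[0])   # ['John', 'Smith']
print(as_frame.shape)     # (3, 2)
```

The list form is handy when you only need one piece (for example as_lists.str[0]), while the expanded form is what you want when assigning to several new columns at once.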

If the delimiter is different, you can simply replace the space character with the desired delimiter. For example, if the fields are separated by a comma followed by a space, you would pass ', ' to .str.split():

# Create a DataFrame with a column to split using a comma delimiter
df = pd.DataFrame({'address': ['123 Main St, Anytown, USA', '456 Maple Ave, Smallville, USA', '789 Oak St, Bigtown, USA']})

# Split the address column into street, city, and state columns
df[['street', 'city', 'state']] = df['address'].str.split(', ', expand=True)

# Print the updated DataFrame
print(df)

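In this code, we create a pandas DataFrame df with a single column called address. We call df['address'].str.split(', ', expand=True) to split the column into three columns: street, city, and state. Note that we pass ', ' (a comma followed by a space) as the delimiter instead of just a space, so the pieces carry no leading whitespace. The split must produce exactly three columns to match the three names on the left-hand side; otherwise pandas raises a ValueError.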
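When rows do not all contain the same number of delimiters, one option is to cap the number of splits with the n parameter of str.split(). The sketch below uses made-up addresses and a hypothetical column name rest to show the idea; it is one way to handle uneven data, not the only one.

```python
import pandas as pd

# Hypothetical data: the first row has two delimiters, the second only one
df = pd.DataFrame({'address': ['123 Main St, Anytown, USA', '456 Maple Ave, Smallville']})

# n=1 stops after the first delimiter, so every row yields exactly two pieces
df[['street', 'rest']] = df['address'].str.split(', ', n=1, expand=True)

print(df[['street', 'rest']])
```

Here the street is separated cleanly and everything after the first delimiter stays together in rest, which can be split again later if needed.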
