How to split columns in pandas?
Published on Aug. 22, 2023, 12:18 p.m.
To split a string column in pandas, you can use the str.split() method. This method splits each string on a delimiter. By default it returns a Series of lists; with expand=True it returns a DataFrame in which each piece gets its own column.
Here is an example of how to split a column in pandas:
import pandas as pd
# Create a DataFrame with a column to split
df = pd.DataFrame({'name': ['John Smith', 'Mary Johnson', 'Frank Miller']})
# Split the name column into first and last name columns
df[['first_name', 'last_name']] = df['name'].str.split(' ', expand=True)
# Print the updated DataFrame
print(df)
In this code, we create a pandas DataFrame df with a single column called name. We then call df['name'].str.split(' ', expand=True) to split the name column into two columns, first_name and last_name, based on the space character ' '. The expand=True argument tells pandas to expand the results into separate columns. Finally, we assign the result to df[['first_name', 'last_name']] to add the new columns to the DataFrame.
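To make the effect of expand=True concrete, the following minimal sketch compares the two return shapes on the same data. The variable names as_lists and as_columns are illustrative only and do not appear in the example above.

```python
import pandas as pd

df = pd.DataFrame({'name': ['John Smith', 'Mary Johnson', 'Frank Miller']})

# Without expand=True, the result is a Series whose values are Python lists
as_lists = df['name'].str.split(' ')

# With expand=True, the result is a DataFrame with integer column labels 0, 1, ...
as_columns = df['name'].str.split(' ', expand=True)

print(as_lists)
print(as_columns)
```

The integer labels are why the result is usually assigned straight to a list of new column names, as in the example above.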
If the delimiter is different, you can simply replace the space character with the desired delimiter. For example, if the parts are separated by a comma followed by a space, you would pass ', ' to .str.split():
# Create a DataFrame with a column to split using a comma delimiter
# (each address contains two ', ' separators, so it splits into three parts)
df = pd.DataFrame({'address': ['123 Main St, Anytown, USA', '456 Maple Ave, Smallville, USA', '789 Oak St, Bigtown, USA']})
# Split the address column into street, city, and state columns
df[['street', 'city', 'state']] = df['address'].str.split(', ', expand=True)
# Print the updated DataFrame
print(df)
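In this code, we create a pandas DataFrame df with a single column called address. We call df['address'].str.split(', ', expand=True) to split the column into three columns: street, city, and state. Note that we pass ', ' (a comma followed by a space) as the delimiter instead of just a space. The number of columns produced by the split must match the number of column names on the left-hand side; otherwise pandas raises a ValueError.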
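When some rows contain more delimiters than others, one option is the n parameter of str.split(), which caps the number of splits per row. The sketch below assumes a hypothetical location column in which one value has an extra comma; the data and the column names city and region are invented for illustration.

```python
import pandas as pd

# Hypothetical data: the second value contains one more ', ' than the first
df = pd.DataFrame({'location': ['Anytown, USA', 'Springfield, Illinois, USA']})

# n=1 splits only at the first ', ', so every row yields exactly two parts
df[['city', 'region']] = df['location'].str.split(', ', n=1, expand=True)

print(df)
```

Here everything after the first delimiter stays together in region, so the two-column assignment works for both rows.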