How to shuffle or randomize the order of rows in a Pandas DataFrame

Published on Aug. 22, 2023, 12:14 p.m.

To shuffle or randomize the order of rows in a Pandas DataFrame, you can use the DataFrame.sample method. Here is an example:

import pandas as pd

# create a DataFrame
df = pd.DataFrame({'Col1': ['A', 'B', 'C', 'D'], 'Col2': [1, 2, 3, 4]})

# shuffle the DataFrame
shuffled_df = df.sample(frac=1).reset_index(drop=True)

# print the shuffled DataFrame
print(shuffled_df)

This prints the rows of the original DataFrame in a random order, so the output differs from run to run. In the sample method, the frac parameter specifies the fraction of rows to return in the random sample. Setting frac=1 returns all the rows, and because sample draws without replacement by default, each row appears exactly once. Calling reset_index(drop=True) then replaces the shuffled index labels with a fresh index starting at 0 and discards the old labels instead of keeping them as a new column.
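If the shuffle needs to be repeatable, for instance in a test or a train/test split, sample also accepts a random_state seed. The following is a minimal sketch of that idea; the seed value 42 and the variable names are arbitrary choices for illustration, not something the example above requires.

```python
import pandas as pd

df = pd.DataFrame({'Col1': ['A', 'B', 'C', 'D'], 'Col2': [1, 2, 3, 4]})

# random_state fixes the seed, so the same order comes back on every call
shuffled_a = df.sample(frac=1, random_state=42).reset_index(drop=True)
shuffled_b = df.sample(frac=1, random_state=42).reset_index(drop=True)
print(shuffled_a.equals(shuffled_b))  # True

# Without reset_index, each row keeps its original index label,
# which can be useful for tracing rows back to their source position
shuffled_keep = df.sample(frac=1, random_state=42)
print(shuffled_keep)
```

Keeping the original labels (the second variant) is a reasonable choice when the shuffled rows have to be matched against the unshuffled DataFrame later; resetting the index is simpler when only the new order matters.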
