How to shuffle data in Python, NumPy, and Pandas

Published on Aug. 22, 2023, 12:17 p.m.

Python's built-in random module shuffles lists in place, while NumPy and Pandas provide their own functions for shuffling arrays and DataFrame rows.

Here is an example of how to shuffle data in Python using the random module:

import random

data = [1, 2, 3, 4, 5]
random.shuffle(data)
print(data)

This code defines a list and calls random.shuffle() to shuffle it in place. Because the function modifies the original list and returns None, the code prints data itself rather than the function's return value.
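When the original order has to be preserved, or the shuffle has to be repeatable, a variation along the following lines may help. It is a minimal sketch: the variable names and the seed value 42 are arbitrary choices for illustration, not part of the example above.

```python
import random

data = [1, 2, 3, 4, 5]

# random.sample with k=len(data) returns a new list in random order
# and leaves the original list untouched.
shuffled = random.sample(data, k=len(data))

# A dedicated, seeded Random instance gives the same order on every run
# without touching the global random state. The seed 42 is arbitrary.
rng = random.Random(42)
repeatable = data.copy()
rng.shuffle(repeatable)

print(data)        # still [1, 2, 3, 4, 5]
print(shuffled)    # a new list in random order
print(repeatable)  # the same order on every run with this seed
```

Seeding is mainly useful for tests and reproducible experiments; leave it out when a different order is wanted on each run.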

If you’re working with NumPy arrays, you can use the np.random.permutation() function, which returns a shuffled copy and leaves the original array unchanged:

import numpy as np

data = np.array([1, 2, 3, 4, 5])
shuffled_data = np.random.permutation(data)
print(shuffled_data)

This code creates a NumPy array and uses the np.random.permutation() function to generate a random permutation of the array. The shuffled data is assigned to a new variable and printed to the console.
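NumPy also offers the Generator interface, created with np.random.default_rng(), which the NumPy documentation recommends for new code. The sketch below assumes that interface; the seed value and the features/labels arrays are made up for illustration. It shows a shuffled copy, an in-place shuffle, and one way to shuffle two arrays in the same order.

```python
import numpy as np

# The seed 0 is arbitrary; omit it to get a different order on each run.
rng = np.random.default_rng(seed=0)

data = np.array([1, 2, 3, 4, 5])

# permutation() returns a shuffled copy; data itself is unchanged.
shuffled_copy = rng.permutation(data)

# shuffle() reorders the array in place and returns None.
in_place = data.copy()
rng.shuffle(in_place)

# To keep two arrays aligned, shuffle an index array and apply it to both.
features = np.arange(10).reshape(5, 2)  # hypothetical 5 rows of 2 features
labels = np.array([0, 1, 2, 3, 4])      # hypothetical label per row
order = rng.permutation(len(features))
features_shuffled = features[order]
labels_shuffled = labels[order]

print(shuffled_copy)
print(in_place)
print(features_shuffled)
print(labels_shuffled)
```

For arrays with more than one dimension, both permutation() and shuffle() reorder along the first axis by default, so the rows move as units and the contents of each row stay together.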

Similarly, if you’re working with Pandas DataFrames, you can use the sample() method to shuffle the rows of the DataFrame:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
shuffled_df = df.sample(frac=1)
print(shuffled_df)

This code creates a Pandas DataFrame and uses the sample() method with frac=1, which returns all of the rows in random order. The shuffled DataFrame is assigned to a new variable and printed to the console; the original df is unchanged, and each row keeps its original index label.
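Two common follow-ups are making the shuffle reproducible and renumbering the index afterwards. The sketch below assumes the same small DataFrame; the random_state value 42 is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# random_state fixes the seed so the same row order comes back on every run.
shuffled_df = df.sample(frac=1, random_state=42)

# sample() keeps the original index labels; reset_index(drop=True)
# replaces them with 0..n-1 and discards the old labels.
reindexed_df = shuffled_df.reset_index(drop=True)

print(shuffled_df)
print(reindexed_df)
```

Resetting the index matters mostly when later code relies on positional labels, for example when slicing the shuffled rows with .loc into training and test sets.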
