How to read data from compressed files in Jupyter Notebook

Published on Aug. 22, 2023, 12:16 p.m.

To read data from compressed files in a Jupyter Notebook, you can use Python's standard-library modules gzip, zipfile, and tarfile. They ship with Python, so nothing extra needs to be installed.

Here is an example of reading a gzipped file with the gzip module. Opening in 'rb' mode returns the decompressed content as bytes:

import gzip
with gzip.open('compressed_data.gz', 'rb') as f:
    file_content = f.read()
    # file_content is a bytes object; decode it (e.g. file_content.decode('utf-8')) or process it as needed
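
If the compressed file holds text, such as a CSV, it may be more convenient to open it in text mode and iterate over lines instead of loading raw bytes. The following is a minimal sketch, not a definitive recipe: the file name sample_data.csv.gz and its contents are hypothetical, and the file is created in a temporary directory only so the example runs on its own.

```python
import gzip
import os
import tempfile

# Hypothetical sample file, written here only so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_data.csv.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as f:
    f.write("name,score\nalice,90\nbob,85\n")

# Mode "rt" decompresses and decodes to str, so each line arrives as text.
with gzip.open(sample_path, "rt", encoding="utf-8") as f:
    rows = [line.rstrip("\n").split(",") for line in f]

print(rows)
```

Iterating over the file object reads one line at a time, which helps when the decompressed data is too large to hold in memory comfortably.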

Similarly, for a ZIP archive, the zipfile module can extract the contents to a directory, after which you can read the extracted files as usual:

import zipfile
with zipfile.ZipFile('compressed_data.zip', 'r') as zip_ref:
    zip_ref.extractall('./unzipped_data')
    # process the unzipped data as needed
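
Extracting to disk is not always necessary. As a hedged sketch, a single member of a ZIP archive can be read directly with ZipFile.open. The archive name sample_data.zip and the member notes.txt are hypothetical and are created here only to keep the example runnable.

```python
import os
import tempfile
import zipfile

# Hypothetical sample archive, built here only so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_data.zip")
with zipfile.ZipFile(sample_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("notes.txt", "first line\nsecond line\n")

with zipfile.ZipFile(sample_path, "r") as zf:
    # namelist() shows what the archive contains before anything is read.
    member_names = zf.namelist()
    # open() returns a binary file-like object for one member, without extracting it.
    with zf.open("notes.txt") as member:
        text = member.read().decode("utf-8")

print(member_names)
print(text)
```

Listing the members first is a cheap way to check the archive's layout before deciding what to read.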

You can use the tarfile module to extract tarballs, including gzip-compressed ones, as shown below:

import tarfile
with tarfile.open('compressed_data.tar.gz') as tar:
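    # extractall() trusts the paths stored in the archive, so only extract archives from sources you trust
    tar.extractall('./untarred_data')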
    # process the untarred data as needed
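
A tarball member can also be read in place with TarFile.extractfile, which avoids writing anything to disk. The sketch below assumes a hypothetical archive sample_data.tar.gz containing a single file data.csv. Both are created in a temporary directory purely for illustration.

```python
import io
import os
import tarfile
import tempfile

# Hypothetical sample tarball, built here only so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_data.tar.gz")
payload = b"id,value\n1,10\n2,20\n"
with tarfile.open(sample_path, "w:gz") as tar:
    info = tarfile.TarInfo(name="data.csv")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

with tarfile.open(sample_path, "r:gz") as tar:
    member_names = tar.getnames()
    # extractfile() returns a binary file-like object for regular files
    # (and None for members such as directories).
    extracted = tar.extractfile("data.csv")
    text = extracted.read().decode("utf-8")

print(member_names)
print(text)
```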
