Fixing the Python error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 2892: invalid continuation byte

Published on Aug. 22, 2023, 12:11 p.m.

Read the file in binary mode

You can avoid the error by opening the file in binary mode:

for line in open(your_file_path, 'rb'):
    print(line)  # each line is a bytes object, not a str

'rb' opens the file in binary mode, so each line comes back as a bytes object and Python does not try to decode it as UTF-8.
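Binary mode only postpones the decoding step: the bytes still have to be turned into text at some point. The following is a minimal sketch of one way to do that, decoding each line as UTF-8 and substituting the replacement character U+FFFD for any byte that cannot be decoded. The helper name `read_lines_binary` and the temporary demo file are hypothetical and exist only for illustration.

```python
import os
import tempfile


def read_lines_binary(path, encoding="utf-8"):
    """Read a file in binary mode and decode each line.

    Bytes that are not valid in `encoding` become U+FFFD instead of
    raising UnicodeDecodeError.
    """
    lines = []
    with open(path, "rb") as f:
        for raw in f:  # raw is a bytes object
            lines.append(raw.decode(encoding, errors="replace"))
    return lines


# Demo: write a Latin-1 encoded line (0xe9 is 'é' in Latin-1) to a temp file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write("caf\xe9 au lait\n".encode("latin-1"))
    demo_path = tmp.name

demo_lines = read_lines_binary(demo_path)
os.remove(demo_path)
print(demo_lines)
```

Replacing bytes loses information, so this suits quick inspection better than data processing. If the real encoding of the file is known, decoding with that encoding is the better choice.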

I ran into this error while reading a dataset downloaded from Kaggle:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 183: invalid continuation byte

Passing an explicit encoding to pandas fixed it:

import pandas as pd
pd.read_csv('top50.csv', encoding='ISO-8859-1')

# m_cols is assumed to be a list of column names defined earlier
pd.read_csv('ml-100k/u.item', sep='|', names=m_cols, encoding='latin-1')
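Why does switching the encoding help? In ISO-8859-1 (also called Latin-1) every byte value maps to a character, so decoding never fails, whereas in UTF-8 a byte such as 0xe9 starts a multi-byte sequence and must be followed by continuation bytes. The sketch below shows this on a made-up byte string using only the standard library. The sample bytes are an assumption for illustration and are not taken from the datasets above.

```python
import codecs

# Hypothetical sample: "café au lait" encoded as Latin-1, so 'é' is the single byte 0xe9.
raw = b"caf\xe9 au lait"

try:
    raw.decode("utf-8")
    utf8_ok = True
    reason = None
except UnicodeDecodeError as exc:
    # 0xe9 announces a 3-byte UTF-8 sequence, but the next byte is a plain space.
    utf8_ok = False
    reason = exc.reason

# Latin-1 maps each byte to exactly one character, so this cannot raise.
text = raw.decode("latin-1")

# 'latin-1' and 'ISO-8859-1' are two names for the same Python codec.
same_codec = codecs.lookup("latin-1").name == codecs.lookup("ISO-8859-1").name

print(utf8_ok, reason, text, same_codec)
```

Note that Latin-1 decoding always succeeds even when the file was written in some other encoding, in which case the text may come out with wrong characters. It is worth checking a few non-ASCII values after loading.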
