Fixing the Python error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 2892: invalid continuation byte
Published on Aug. 22, 2023, 12:11 p.m.
Read the file in binary mode
You could resolve the problem with:
for line in open(your_file_path, 'rb'):
'rb' opens the file in binary mode, so each line comes back as a bytes object and Python never attempts to decode it as UTF-8.
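Since binary mode only postpones decoding, here is a minimal sketch of turning the bytes into text afterwards. It assumes a small throwaway sample file created just for the demo (the file contents and the names `raw_line` and `decoded_lines` are illustrative, not from the original post) and uses `errors="replace"` so that undecodable bytes become U+FFFD instead of raising.

```python
import os
import tempfile

# Hypothetical sample file: "café" encoded as Latin-1, so byte 0xe9 is not valid UTF-8 here.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write("café au lait\nplain ascii\n".encode("latin-1"))
    your_file_path = tmp.name

decoded_lines = []
with open(your_file_path, "rb") as f:  # binary mode: lines are bytes, nothing is decoded yet
    for raw_line in f:
        # Decode explicitly; errors="replace" swaps each undecodable byte for U+FFFD.
        decoded_lines.append(raw_line.decode("utf-8", errors="replace").rstrip("\n"))

os.remove(your_file_path)  # clean up the demo file
print(decoded_lines)
```

This keeps the loop from crashing, but the replaced characters are lost, so it suits cases where a few damaged characters are acceptable.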
Specify the encoding
I was using a dataset downloaded from Kaggle, and reading it raised this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 183: invalid continuation byte
So this is how I fixed it.
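import pandas as pd
# ISO-8859-1 (alias latin-1) maps every byte to a character, so decoding cannot fail
pd.read_csv('top50.csv', encoding='ISO-8859-1')
# Same fix for a pipe-separated file; m_cols is assumed to be a list of column names defined earlier
pd.read_csv('ml-100k/u.item', sep='|', names=m_cols, encoding='latin-1')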
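To see why passing `latin-1` makes the error go away, here is a small standard-library sketch. The sample bytes and the helper `decode_with_fallback` are hypothetical, made up for illustration. It reproduces the error on a Latin-1 byte string, then tries UTF-8 first and falls back to Latin-1.

```python
# Hypothetical sample: "Señor café menu" encoded as Latin-1 (0xf1 = ñ, 0xe9 = é).
raw = b"Se\xf1or caf\xe9 menu"

# Reproduce the error: in UTF-8, 0xf1 starts a multi-byte sequence, but "o" is not a continuation byte.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc


def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Try each encoding in order and return (text, encoding_used)."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Only reachable if the caller passes a list without a catch-all such as latin-1.
    raise ValueError("none of the given encodings could decode the data")


text, used = decode_with_fallback(raw)
print(utf8_error)
print(text, used)
```

One caveat: Latin-1 never raises, so if the file was actually written in another encoding (for example cp1252), the result decodes without error but may contain wrong characters. It is worth checking a few non-ASCII values after loading.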
References:
https://stackoverflow.com/questions/19699367/for-line-in-results-in-unicodedecodeerror-utf-8-codec-cant-decode-byte
https://grabthiscode.com/whatever/unicodedecodeerror-utf-8-codec-cant-decode-byte-0xe9-in-position-2892-invalid-continuation-byte