Python: How to translate a UTF-8 string containing Unicode-escaped characters ("Ok\u00c9" to "Oké")
I'm trying to fix a string I'm getting in my Python script.
I'm calling an API, but it returns a UTF-8 string that still contains Unicode escape sequences.
Something like "Ok\u00c9" should be "Oké".
I tried converting it, but every attempt either raises an error or gives back the same string. Can someone help me fix this in Python 3?
print('\u00c9'.encode().decode('unicode-escape'))
>> é
print('Ok\u00c9'.encode().decode('unicode-escape'))
>> should print 'Oké'
>> but gives an error
Hope you guys know the solution. Thanks in advance!
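For reference, a minimal sketch of how Python 3 treats these escapes. It assumes the API sends the escape as literal backslash text (the `raw` value and the `decode_escapes` helper are hypothetical, not taken from the question). Note that U+00C9 is É, while é is U+00E9.

```python
import json
import unicodedata

# In Python 3 source code, '\u00c9' is already a single character,
# so a literal like this needs no further decoding.
literal = 'Ok\u00c9'
print(len(literal), unicodedata.name(literal[-1]))


def decode_escapes(raw: str) -> str:
    """Turn literal backslash-u escapes in ASCII-only text into characters."""
    return raw.encode('ascii').decode('unicode-escape')


# Hypothetical payload: the escape arrives as six separate characters.
raw = 'Ok\\u00e9'
print(len(raw))
print(decode_escapes(raw))

# If the payload is JSON, the JSON parser resolves such escapes itself.
print(json.loads('"' + raw + '"'))
```

The `unicode-escape` route is only safe for ASCII-only input, which is why the sketch encodes with `'ascii'`. If the response is JSON, parsing it with `json.loads` is usually the simpler option.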
Solution 1:[1]
I've found the problem: the encoding/decoding step was wrong. The text came in with Windows-1252 encoding.
I used
import chardet
chardet.detect(var3.encode())
to detect the proper encoding, and then did a
var3 = 'OK\u00c9'.encode('utf8').decode('Windows-1252').encode('utf8').decode('utf8')
conversion to eventually get it in the right format!
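As a hedged illustration of this kind of encoding mismatch, here is a minimal sketch using only the standard library. It assumes the text was originally UTF-8 and was decoded as Windows-1252 somewhere upstream. The `fix_mojibake` helper is a hypothetical name, and the right direction of the round trip depends on what the API actually returned.

```python
def fix_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mistakenly decoded as Windows-1252.

    Assumption: every character in `text` maps back to a single
    Windows-1252 byte; otherwise encode() raises UnicodeEncodeError.
    """
    return text.encode('cp1252').decode('utf-8')


# Simulate the damage: correct text, encoded as UTF-8, read as Windows-1252.
garbled = 'Ok\u00e9'.encode('utf-8').decode('cp1252')
print(garbled)

# Reverse the wrong step to recover the original text.
print(fix_mojibake(garbled))
```

Applied to a string that is already correct, the opposite order (`encode('utf8')` followed by `decode('Windows-1252')`) introduces this garbling instead of removing it, so it is worth checking the raw bytes before choosing a direction.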
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Igor D |