'CSV file with non-English characters error in postman and angular http request

I have uploaded a CSV file with some german words. I can access the file via this link . When I try to get the file via postman, the response is not fully utf converted. enter image description here

After downloading the file, I open the file with excel or notepad++ I get the expected result. enter image description here

Again I try to get the file via angular HttpClient. I get this response in Chrome's network preview section. enter image description here

But when I console.log the response, Now the response is incorrect. enter image description here

I try to add Accept: application/csv and Accept-Charset: UTF-8 header with postman call, But get same result.

I make a stackblitz sample with angular.



Solution 1:[1]

you can change the "responseType" to "arraybuffer" and then decode the array with whatever encoding you need. In this case the iso-8859-2

look at the example.

Since a byte can only have 256 distinct values, and the set of all the world's languages ??is more than 256 characters, UTF-8 needs to use more than one byte to represent some characters. In general, accented characters like the word “bênção” are represented as 2 bytes in UTF-8 (as opposed to just 1 in ISO 8859-1):

             b |    ê   |  n |    ç   |    ã   |  o
ISO 8859-1: 62 |   EA   | 6E |   E7   |   E3   | 6F
     UTF-8: 62 | C3  AA | 6E | C3  A7 | C3  A3 | 6F

This is a problem when you try to read text written in one encoding as if it were another encoding: if the text was written as UTF-8 but read as ISO 8859-1, it appears as “bênção”; the opposite appears as “b?n??o”

example in JavaScript:

var bytesUtf8 = new Uint8Array([0x62, 0xc3, 0xaa, 0x6e, 0xc3, 0xa7, 0xc3, 0xa3, 0x6f]);
var bytesIso = new Uint8Array([0x62, 0xea, 0x6e, 0xe7, 0xe3, 0x6f]);
var decIso = new TextDecoder("iso-8859-1")
var decUtf8 = new TextDecoder("utf-8")
var decBytesIsoCorrect = decIso.decode(bytesIso)
var decBytesUtf8Correct = decUtf8.decode(bytesUtf8)
var decBytesIsoInUtf8 = decUtf8.decode(bytesIso)
var decBytesUtf8InIso = decIso.decode(bytesUtf8)
console.log('iso > iso', decBytesIsoCorrect)
console.log('utf-8 > utf-8', decBytesUtf8Correct)
console.log('iso > utf-8', decBytesIsoInUtf8)
console.log('utf-8 > iso', decBytesUtf8InIso)

note: can be used iso-8859-n

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1