Tibetan characters in data.frames cannot be displayed in R console even after locale is set (Chinese is fine, Tibetan in matrices is fine)
In R, Tibetan characters do not display properly when they are inside data.frames. They are printed as <U+XXXX> escapes instead:
> Sys.setlocale("LC_CTYPE", "Tibetan")
[1] "Tibetan_China.950"
> data.frame(a="བོད་")
a
1 <U+0F56><U+0F7C><U+0F51><U+0F0B>
> Sys.setlocale("LC_CTYPE", "Dzongkha_Bhutan.950")
[1] "Dzongkha_Bhutan.950"
> data.frame(a="འབྲུག་ཡུལ་")
a
1 <U+0F60><U+0F56><U+0FB2><U+0F74><U+0F42><U+0F0B><U+0F61><U+0F74><U+0F63><U+0F0B>
Chinese displays correctly once the locale is set, following the instructions here:
> Sys.setlocale("LC_CTYPE", "Chinese")
[1] "Chinese (Simplified)_China.936"
> data.frame(a="中文")
a
1 中文
Tibetan characters are also fine in matrices:
> matrix("བོད")
[,1]
[1,] "བོད"
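The escaped output above looks like a display issue rather than data corruption. A minimal sketch to check this, assuming the string is written with \u escapes so the example does not depend on the console or file encoding (the variable names `x`, `df` and `code_points` are just for illustration):

```r
# Build the string from \u escapes: the same four code points
# that appear in the escaped data.frame output (U+0F56 U+0F7C U+0F51 U+0F0B).
x <- "\u0f56\u0f7c\u0f51\u0f0b"
df <- data.frame(a = x, stringsAsFactors = FALSE)

# The stored value is unchanged: same code points, same string.
code_points <- utf8ToInt(df$a)
sprintf("U+%04X", code_points)
identical(df$a, x)

# Printing the column as a matrix avoids print.data.frame.
# Assumption: the escaping only happens on the data.frame printing path,
# as the matrix example above suggests.
print(as.matrix(df), quote = FALSE)
```

If `identical(df$a, x)` is TRUE, the text is stored correctly and only the data.frame print method is escaping it.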
Could anyone shed some light on this issue? I'm using Windows 10. Thanks!
Solution 1:[1]
This appears to be fixed in R 4.2.0:
> Sys.setlocale("LC_CTYPE", "Tibetan")
[1] "Tibetan_China.utf8"
>
> data.frame(a="བོད་")
a
1 བོད་
Not a satisfactory answer at the time I posted this, but certainly works well now!
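For anyone who has to support both older and newer installations, here is a rough sketch of how a script could check whether the session is in the fixed situation and otherwise fall back to matrix printing. `show_df` is a hypothetical helper name, and the fallback is an assumption based on the matrix behaviour reported in the question, not something tested on every Windows setup:

```r
# TRUE when the session's native encoding is UTF-8. R 4.2.0 and later
# use UTF-8 natively on recent Windows 10 builds, which is why the
# ".utf8" locale shows up in the output above.
native_utf8 <- isTRUE(l10n_info()[["UTF-8"]])
new_enough  <- getRversion() >= "4.2.0"

# Hypothetical helper: print normally when UTF-8 is native,
# otherwise print the data.frame as a character matrix.
show_df <- function(df) {
  if (native_utf8) {
    print(df)
  } else {
    # Assumption: matrix printing does not escape the characters
    # on older setups, as reported in the question.
    print(as.matrix(df), quote = FALSE)
  }
  invisible(df)
}

show_df(data.frame(a = "\u0f56\u0f7c\u0f51\u0f0b", stringsAsFactors = FALSE))
```

Checking `l10n_info()` rather than only the R version seems safer, since a new R version on an old Windows build may still run with a non-UTF-8 code page.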
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | WavesWashSands |