Tibetan characters in data.frames cannot be displayed in R console even after locale is set (Chinese is fine, Tibetan in matrices is fine)

In R, Tibetan characters do not display properly when they are located in data.frames:

> Sys.setlocale("LC_CTYPE", "Tibetan")
[1] "Tibetan_China.950"
> data.frame(a="བོད་")
                                 a
1 <U+0F56><U+0F7C><U+0F51><U+0F0B>
> Sys.setlocale("LC_CTYPE", "Dzongkha_Bhutan.950")
[1] "Dzongkha_Bhutan.950"
> data.frame(a="འབྲུག་ཡུལ་")
                                                                                 a
1 <U+0F60><U+0F56><U+0FB2><U+0F74><U+0F42><U+0F0B><U+0F61><U+0F74><U+0F63><U+0F0B>

Chinese displays fine after setting the locale in the same way:

> Sys.setlocale("LC_CTYPE", "Chinese")
[1] "Chinese (Simplified)_China.936"
> data.frame(a="中文")
     a
1 中文

Tibetan characters are also fine in matrices:

> matrix("བོད")
     [,1]
[1,] "བོད"
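Since matrices print correctly, one possible stopgap is to print the data frame through as.matrix() instead of relying on the default data.frame printing. This is only a sketch: the object names `df` and `m` are made up for illustration, and whether it avoids the <U+XXXX> escapes on every pre-4.2.0 Windows setup is an assumption, not something verified here.

```r
# Hypothetical example data; the column name `a` follows the examples above.
df <- data.frame(a = "བོད", stringsAsFactors = FALSE)

# Convert to a character matrix and print it without quotes, so the
# matrix print path (which displayed Tibetan correctly above) is used
# instead of the data.frame print path.
m <- as.matrix(df)
print(m, quote = FALSE)
```

The data frame itself is unchanged; only the way it is displayed differs.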

Could anyone shed some light on this issue? I'm using Windows 10. Thanks!



Solution 1:[1]

This appears to be fixed in R 4.2.0:

> Sys.setlocale("LC_CTYPE", "Tibetan")
[1] "Tibetan_China.utf8"
> 
> data.frame(a="བོད་")
    a
1 བོད་

Not a satisfactory answer at the time I posted this, but certainly works well now!
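
Note that the locale string in this session ends in ".utf8" rather than ".950". To check whether a given session is in the same situation, something like the following may help. It is a minimal sketch using only base R; the variable names are illustrative, and treating "R >= 4.2.0 plus a UTF-8 locale" as the relevant condition is my reading of the output above, not a documented guarantee.

```r
# R 4.2.0 is the version reported as working in this answer.
r_is_recent <- getRversion() >= "4.2.0"

# l10n_info() reports whether the current locale uses UTF-8.
utf8_session <- isTRUE(l10n_info()[["UTF-8"]])

if (r_is_recent && utf8_session) {
  message("R >= 4.2.0 in a UTF-8 locale: Tibetan text in data frames should print.")
} else {
  message("Older R or non-UTF-8 locale: data frames may show <U+XXXX> escapes.")
}
```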

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 WavesWashSands