ASP.NET MVC @ outputs wrong Unicode string

 <meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
 <meta name="description" content="@("Du lịch chữa bệnh ở Thái Lan là hình thức du lịch đẳng cấp kết hợp du lịch, spa và chữa bệnh")" />

Why does it output the wrong Unicode string?

 <meta name="description" content="Du lịch chữa bệnh ở Th&#225;i Lan l&#224; h&#236;nh thức du lịch đẳng cấp kết hợp du lịch, spa v&#224; chữa bệnh" />

I've tried new HtmlString("Du lịch...") and Html.Raw("Du..."), with no luck so far.

What's wrong with that? Any advice would be appreciated. I'm using ASP.NET MVC 5.0.

Without @, it works fine, just as expected.

Another thread reports the same result but has no answer: Disable encoding of unicode characters in ASP.NET-MVC3



Solution 1:[1]

You can configure the default encoding behaviour in your ConfigureServices method (this applies to ASP.NET Core, not to classic ASP.NET MVC 5):

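using System.Text.Encodings.Web;
using System.Text.Unicode;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.WebEncoders;

public void ConfigureServices(IServiceCollection services)
{
    // Allow every Unicode range so non-ASCII characters are written as-is
    // (HTML-sensitive characters such as < and & are still encoded).
    services.Configure<WebEncoderOptions>(options =>
    {
        options.TextEncoderSettings = new TextEncoderSettings(UnicodeRanges.All);
    });
}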

This renders Unicode characters unencoded on the HTML page. Source: https://github.com/aspnet/HttpAbstractions/issues/315
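To see what this setting changes, here is a minimal sketch that compares the default encoder with one built from UnicodeRanges.All, using the System.Text.Encodings.Web types directly. The class name EncoderComparison and the sample string are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;

public static class EncoderComparison
{
    // Encodes with the default settings, which allow only Basic Latin through.
    public static string EncodeDefault(string text) => HtmlEncoder.Default.Encode(text);

    // Encodes with every Unicode range allowed, mirroring
    // new TextEncoderSettings(UnicodeRanges.All) from the answer above.
    public static string EncodePermissive(string text) =>
        HtmlEncoder.Create(new TextEncoderSettings(UnicodeRanges.All)).Encode(text);

    public static void Main()
    {
        string sample = "Th\u00E1i Lan"; // "Thái Lan"

        // Non-ASCII letters come out as character references.
        Console.WriteLine(EncodeDefault(sample));

        // The same letters are written unchanged.
        Console.WriteLine(EncodePermissive(sample));
    }
}
```

Both outputs describe the same text to a browser; only the bytes on the wire differ.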

Solution 2:[2]

This happens because Razor HTML-encodes output by default. Try this:

content="@Html.Raw("Du lịch chữa bệnh ở Thái Lan là hình thức du lịch đẳng cấp kết hợp du lịch, spa và chữa bệnh")"

@Html.Raw("") renders the input string as raw HTML, without encoding it.

Solution 3:[3]

The string is not wrong. &#225; will be read by any web browser as the character á. In terms of the information set a browser will parse into the DOM, the two lines:

<meta name="description" content="Du lịch chữa bệnh ở Th&#225;i Lan l&#224; h&#236;nh thức du lịch đẳng cấp kết hợp du lịch, spa v&#224; chữa bệnh" />
<meta name="description" content="Du lịch chữa bệnh ở Thái Lan là hình thức du lịch đẳng cấp kết hợp du lịch, spa và chữa bệnh" />

are absolutely identical.
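One way to check this equivalence yourself is to decode the character references and compare the result with the literal text. The sketch below uses System.Net.WebUtility.HtmlDecode as a stand-in for a browser's attribute-value parsing; the class name CharacterReferenceDemo is a hypothetical name chosen for illustration.

```csharp
using System;
using System.Net;

public static class CharacterReferenceDemo
{
    // Resolves HTML character references such as &#225; into the characters they name.
    public static string Decode(string encoded) => WebUtility.HtmlDecode(encoded);

    public static void Main()
    {
        // A fragment of the attribute value as written by the default encoder.
        string encoded = "Th&#225;i Lan l&#224; h&#236;nh";

        // The same fragment written with literal characters:
        // U+00E1 (á), U+00E0 (à), U+00EC (ì).
        string literal = "Th\u00E1i Lan l\u00E0 h\u00ECnh";

        // Prints True: both forms decode to the same string.
        Console.WriteLine(Decode(encoded) == literal);
    }
}
```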

It's a bit weird that ASP.NET's default HTML encoder chooses to write &#...; character references for characters in the range U+0080 to U+00FF but not other Unicode characters, but in general an HTML encoder can choose to encode any character it fancies and the output will still be correct.

nemesv's answer in the question you linked shows how you can change this behaviour by overriding encoderType, but there is almost never any reason to care. Any other tool that consumes your output and does not treat &#225; and á in an attribute value as the same character, is broken.

Solution 4:[4]

Try this in web.config (ASP.NET MVC 5), inside the <system.web> element:

 <globalization fileEncoding="utf-8" />

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Volodymyr
Solution 2 titan61
Solution 3 bobince
Solution 4 Nguyễn Nhân