Category "unicode"

JavaScript: how to check if character is RTL?

How can I programmatically check if the browser treats some character as RTL in JavaScript? Maybe creating some transparent DIV and looking at where text is pl

Standard character set for Outlook 2010 .msg file

I need to find out which character set is used when I save an email from Outlook 2010 in non-Unicode format. When saving, you can choose between .msg file and

Regex to match word delimiters in multilingual text

I have a text box where a user can enter text in any language, and I need to split that text into words so that I can pass those words into hunspell spel

Why does QStringLiteral return a garbled string

I'm writing a Chinese application and embed some of its strings in the source file. To reduce runtime overhead (well, actually this is premature optimization,

UnicodeDecodeError, invalid continuation byte

Why is the below item failing? Why does it succeed with "latin-1" codec? o = "a test of \xe9 char" #I want this to remain a string as this is what I am receivin

What's the difference between ASCII and Unicode?

What's the exact difference between Unicode and ASCII? ASCII has a total of 128 characters (256 in the extended set). Is there any size specification for Unic

HTMLParser.HTMLParser().unescape() doesn't work

I would like to convert HTML entities back to their human-readable format, e.g. '&pound;' to '£', '&deg;' to '°' etc. I've read several posts r