VS Code terminal unable to provide UTF-8 input
I'm hopeful that this question is a duplicate, but I have searched a lot at this point, and I've not come across anything helpful, so I am dubious.
The following simple Java code works perfectly in Eclipse, which has its own home-built terminal, and on any *nix system (such as Macs) (1):
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Class name is a placeholder; the original snippet showed only the two methods.
public class LambdaTest {
    public static void main(String[] args) throws Exception {
        System.setProperty("file.encoding", "UTF-8");
        System.out.println("File encoding is " + System.getProperty("file.encoding"));

        // Wrap stdin and stdout so both are explicitly treated as UTF-8.
        Scanner console = new Scanner(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        PrintStream sysout = new PrintStream(System.out, true, StandardCharsets.UTF_8);

        sysout.println("\nTests:");
        sysout.println("Can we print lambda using unicode? Let's see: \u03BB");
        sysout.print("Copy and paste the lambda above here: ");
        String line = console.nextLine();

        sysout.println("\nThe input consisted of " + line.length() + " character(s).");
        sysout.println("The input was: " + line);
        if (line.equals("λ")) {
            sysout.println("lambda (λ) was detected with .equals!");
        } else {
            sysout.println("lambda (λ) was not detected with .equals.");
        }

        // Compare the UTF-16 code units of the literal and of the input.
        printHexOfString("Hex of λ", "λ");
        printHexOfString("Hex of input", line);
    }

    // Prints each char of str as unpadded hex, concatenated with no separator.
    private static void printHexOfString(String title, String str) {
        StringBuilder stringBuilder = new StringBuilder();
        char[] charArray = str.toCharArray();
        for (char c : charArray) {
            String charToHex = Integer.toHexString(c);
            stringBuilder.append(charToHex);
        }
        System.out.println(title + ": " + stringBuilder.toString());
    }
}
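A side note on the System.setProperty("file.encoding", "UTF-8") call: as far as I know, the JVM fixes its default charset at startup, so changing the property at runtime only changes what System.getProperty reports. The sketch below is a minimal way to check that; the class name FileEncodingProbe and the choice of UTF-16 as the override value are placeholders for illustration, not part of the original code.

```java
import java.nio.charset.Charset;

class FileEncodingProbe {
    // Returns the JVM default charset before and after overriding the
    // "file.encoding" property at runtime (UTF-16 is an arbitrary test value).
    static Charset[] probe() {
        Charset before = Charset.defaultCharset();
        String saved = System.getProperty("file.encoding");
        System.setProperty("file.encoding", "UTF-16");
        Charset after = Charset.defaultCharset();
        // Put the property back so the probe has no lasting side effect.
        if (saved != null) {
            System.setProperty("file.encoding", saved);
        } else {
            System.clearProperty("file.encoding");
        }
        return new Charset[] { before, after };
    }

    public static void main(String[] args) {
        Charset[] result = probe();
        System.out.println("Default charset before: " + result[0]);
        System.out.println("Default charset after:  " + result[1]);
    }
}
```

If both lines print the same charset, the property change had no effect on the running JVM. Passing -Dfile.encoding=UTF-8 on the java command line is the usual way to set it before startup.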
Eclipse Joy
When I type a lambda, I receive the expected output:
File encoding is UTF-8
Tests:
Can we print lambda using unicode? Let's see: λ
Copy and paste the lambda above here: λ
The input consisted of 1 character(s).
The input was: λ
lambda (λ) was detected with .equals!
Hex of λ: 3bb
Hex of input: 3bb
Hooray!
VS Code Troubles
However, all is not so nice in VS Code, which uses Windows PowerShell. I ran chcp 65001 in the terminal at the bottom of VS Code before anything else. (Note: prior to my command, chcp by itself reports "Active code page: 437", and after I type my command, I get the expected "Active code page: 65001".)
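chcp only changes the console's code page; the JVM picks its own charsets separately. To compare the two environments, a small diagnostic like the sketch below could be run in both Eclipse and the VS Code terminal. The class name EncodingReport is a placeholder, and the property names are ones I believe various JDK versions expose; several are version-specific or internal, so a value of null just means that JDK does not set it.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

class EncodingReport {
    // Collects the charset-related settings the running JVM reports.
    // Properties that are not set on this JDK show up as the string "null".
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        String[] keys = {
            "file.encoding", "native.encoding", "stdout.encoding",
            "sun.stdout.encoding", "sun.jnu.encoding"
        };
        for (String key : keys) {
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Differences between the two reports would at least narrow down whether the problem sits in the JVM's configuration or in the terminal itself.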
There, running the same code, and inputting the same lambda, I get:
File encoding is UTF-8
Tests:
Can we print lambda using unicode? Let's see: λ
Copy and paste the lambda above here: λ
The input consisted of 1 character(s).
The input was:
lambda (λ) was not detected with .equals.
Hex of λ: cebb
Hex of input: 0
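The "Hex of λ: cebb" line differs from the 3bb seen in Eclipse. One possible reading, which the post does not confirm, is that the λ literal in the source file was stored as UTF-8 (bytes 0xCE 0xBB) but decoded with a single-byte charset, for example by a compiler run without -encoding UTF-8. The sketch below reproduces that effect in isolation; the class name MisdecodedLambda is a placeholder.

```java
import java.nio.charset.StandardCharsets;

class MisdecodedLambda {
    // Same formatting as printHexOfString in the question: unpadded hex per char.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(Integer.toHexString(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+03BB encodes to the two bytes 0xCE 0xBB in UTF-8.
        byte[] utf8 = "\u03BB".getBytes(StandardCharsets.UTF_8);

        // Decoding those bytes with a single-byte charset yields two chars.
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        String right = new String(utf8, StandardCharsets.UTF_8);

        System.out.println("Decoded as ISO-8859-1: " + hex(wrong)); // cebb
        System.out.println("Decoded as UTF-8:      " + hex(right)); // 3bb
    }
}
```

If this is what happened, the comparison against "λ" would fail in that build even for correct input, independently of the input problem shown by "Hex of input: 0".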
What I need
I'm not sure where to go from here. I need the lambda symbol to be read, processed, and written to the console consistently. I don't need the rest of UTF-8, I only need the one symbol, so any hacky solution that gains me a lambda will be well received.
Bonus points if the solution doesn't break everything in Eclipse and in Unix (where the original code already works fine).
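One part of this can be made independent of the environment: the comparison itself. A minimal sketch, assuming only that the check should not depend on how the source file is encoded, is to write the lambda as the escape \u03BB (as the print statement already does) and to report input as padded code points so a NUL or a mis-decoded pair is easy to spot. The names LambdaCheck, isLambda, and codePoints are hypothetical helpers, and this does not fix the terminal input itself.

```java
class LambdaCheck {
    // Written as an escape so the constant survives any source-file encoding.
    static final String LAMBDA = "\u03BB";

    // True when the input, ignoring surrounding whitespace, is exactly one lambda.
    static boolean isLambda(String input) {
        return LAMBDA.equals(input.trim());
    }

    // Formats each code point as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(isLambda(" \u03BB "));          // true
        System.out.println(codePoints("\u03BB"));          // U+03BB
        System.out.println(codePoints("\u00CE\u00BB"));    // U+00CE U+00BB
    }
}
```

Because the helper uses no platform-specific calls, it should behave the same in Eclipse, on Unix, and in the VS Code terminal.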
(1) As an aside, I tried all 6 character encodings in StandardCharsets for my PrintStream and InputStreamReader, and while the numbers changed, the lambda was never detected (so it never output "String is λ"), and the last line printed was never a lambda.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow