encoding - Error reading UTF-8 file in Java


I am trying to read in sentences from a file that contains Unicode characters. When I print out the string, for some reason the Unicode characters are messed up.

This is the code I have:

    public static String readSentence(String resourceName) {
        String sentence = null;
        try {
            InputStream refStream = ClassLoader
                    .getSystemResourceAsStream(resourceName);
            BufferedReader br = new BufferedReader(new InputStreamReader(
                    refStream, Charset.forName("UTF-8")));
            sentence = br.readLine();
        } catch (IOException e) {
            throw new RuntimeException("Cannot read sentence: " + resourceName);
        }
        return sentence.trim();
    }

The problem is probably in the way the string is being output.

I suggest you confirm that you are reading the Unicode characters correctly by doing this:

    for (char ch : sentence.toCharArray()) {
        System.err.println("char '" + ch + "' is Unicode code point " + ((int) ch));
    }

and see whether the Unicode code points are correct for the characters that are being messed up. If they are correct, the problem is on the output side; if not, it is on the input side.
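
If the code points turn out to be right, one common cause on the output side is that System.out may encode text with a platform-dependent charset rather than UTF-8. Here is a minimal sketch of a workaround, assuming that whatever receives the output (console, file, pipe) really expects UTF-8. The class name Utf8Output and the helper utf8Stream are made up for illustration and are not part of the original code:

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Output {

    // Hypothetical helper: wraps a stream in a PrintStream that always
    // encodes characters as UTF-8, regardless of the platform default.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8,
            // so this should not happen in practice.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // Sample text with non-ASCII characters, written as escapes so the
        // source file's own encoding does not affect the experiment.
        out.println("caf\u00e9 \u65e5\u672c\u8a9e");
    }
}
```

If the output is still garbled after this, the terminal itself may be using a different encoding or a font without the needed glyphs, which is outside what Java can fix.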

