
Tag: utf-8

Encoding Problem while saving a txt file in utf-8

The following line should write a ü to test.txt encoded in UTF-8. At least this is what I expect it to do. But if I open the file in a text editor, the editor shows and the editor states that it would read the file as UTF-8. I even tried two editors and both show the same unexpected result. A
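The question's own line of code is not included in this excerpt, so the following is only a minimal sketch of one way to write a ü as UTF-8 and check the bytes that land on disk. The class name `Utf8WriteSketch` and the helper `writeUtf8` are hypothetical. The literal is written as the escape `\u00fc` on the assumption that the source file's own encoding might be part of the problem.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8WriteSketch {
    // Hypothetical helper: writes text to the given path, always encoding it as UTF-8.
    static void writeUtf8(Path path, String text) throws IOException {
        Files.write(path, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("test", ".txt");
        // "\u00fc" is ü; the escape avoids depending on the encoding of this source file.
        writeUtf8(path, "\u00fc");
        for (byte b : Files.readAllBytes(path)) {
            System.out.printf("%02X ", b); // prints "C3 BC " for a UTF-8 encoded ü
        }
        Files.delete(path);
    }
}
```

If the bytes on disk are C3 BC, the file itself is correct UTF-8 and the problem lies in how it is being viewed. If more than two bytes appear, the string was probably already damaged before it was written.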

Java: Reading from getResourceAsStream gets too many bytes

I’m trying to read a binary file, using getResourceAsStream. The problem is I get too many bytes back. The file is 56374 bytes long, according to ls, but when I read it in my code, I consistently get 85194 bytes. I get the same result with similar code: If I run the code without the resource, everything is fine, I

Java: how to undo conversion from UTF-8 to ISO-8859-1 [closed]

My UTF-8 strings have been converted to ISO-8859-1 strings in the following way:
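The conversion code itself is missing from this excerpt. As a sketch only, assuming the damage came from decoding UTF-8 bytes as ISO-8859-1, the round trip below shows how such a string can be restored. `MojibakeRepairSketch`, `misdecode`, and `repair` are hypothetical names, and `misdecode` merely simulates the assumed fault.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeRepairSketch {
    // Simulates the assumed fault: UTF-8 bytes wrongly decoded as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses it: recover the raw bytes via ISO-8859-1, then decode them as UTF-8.
    static String repair(String damaged) {
        return new String(damaged.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String damaged = misdecode("\u00fc");                 // ü becomes two characters
        System.out.println(damaged.length());                 // prints 2
        System.out.println(repair(damaged).equals("\u00fc")); // prints true
    }
}
```

This works because ISO-8859-1 maps every byte value to exactly one character, so no information is lost in the wrong decode. It would not hold if the conversion went through a different charset or replaced characters along the way.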

Different results reading file with Files.newBufferedReader() and constructing readers directly

It seems that Files.newBufferedReader() is more strict about UTF-8 than the naive alternative. If I create a file with a single byte 128 (so, not a valid UTF-8 character), it will happily be read if I construct a BufferedReader on an InputStreamReader on the result of Files.newInputStream(), but with Files.newBufferedReader() an exception is thrown. This code has this result: Is this documented?
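The code from the question is not part of this excerpt. The sketch below reproduces the described difference under the assumption that the file holds the single byte 0x80. `StrictVsLenientSketch` and its two helpers are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class StrictVsLenientSketch {
    // Lenient path: an InputStreamReader built from a Charset substitutes U+FFFD
    // for malformed input instead of failing.
    static String readLenient(Path path) throws IOException {
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(Files.newInputStream(path), StandardCharsets.UTF_8))) {
            return r.readLine();
        }
    }

    // Strict path: Files.newBufferedReader surfaces malformed input as an exception.
    static boolean strictReadFails(Path path) throws IOException {
        try (BufferedReader r = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            r.readLine();
            return false;
        } catch (MalformedInputException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("invalid", ".bin");
        Files.write(path, new byte[] {(byte) 0x80}); // a lone continuation byte, invalid UTF-8
        System.out.println(Integer.toHexString(readLenient(path).charAt(0))); // prints fffd
        System.out.println(strictReadFails(path));                            // prints true
        Files.delete(path);
    }
}
```

When the strict behavior is wanted with a hand-built reader, passing `StandardCharsets.UTF_8.newDecoder()` to the `InputStreamReader` constructor instead of the charset should give the same exception, since a fresh decoder reports malformed input by default.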

Opening CSV with UTF-8 BOM via Excel

I create a CSV file with data using Java, and I ran into the following well-known issue: Portuguese letters were displayed incorrectly in Excel (when opening the file by double-click). I solved this with UTF-16LE+BOM, but Excel started to recognize tabs as column separators instead of commas. So I looked for another solution and
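The rest of the question is cut off here. Going by the title, one option is a UTF-8 file that starts with a byte order mark, which Excel commonly uses as a hint to decode the file as UTF-8. The following is a minimal sketch of that idea, not the asker's code. `CsvBomSketch` and `writeCsvWithBom` are hypothetical names.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class CsvBomSketch {
    // Hypothetical helper: writes comma-separated lines as UTF-8, preceded by a BOM.
    static void writeCsvWithBom(Path path, List<String> lines) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(path, StandardCharsets.UTF_8)) {
            w.write('\uFEFF'); // U+FEFF is encoded in UTF-8 as the bytes EF BB BF
            for (String line : lines) {
                w.write(line);
                w.write("\r\n"); // CRLF line endings, as is usual for CSV
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("data", ".csv");
        writeCsvWithBom(path, List.of("nome,cidade", "Jo\u00e3o,S\u00e3o Paulo"));
        byte[] bytes = Files.readAllBytes(path);
        System.out.printf("%02X %02X %02X%n", bytes[0], bytes[1], bytes[2]); // prints EF BB BF
        Files.delete(path);
    }
}
```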

How to read write this in utf-8?

I was getting the error io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence. The solution is to read and write the file in UTF-8. My code is: I’m reading a URL and writing it to a file, DirectionResponse.xml, then reading DirectionResponse.xml and writing the same as *ppre_DirecionResponse.xml* for processing. How do I change this so that reading and writing is done

Unicode text through socket in java

I am facing a tiny issue (I believe) in socket programming. When sending text in non-English languages, I get garbled results. After a lot of research on Google, I made some corrections. I changed getBytes() to getBytes("UTF-8") and tried to send some Arabic text. When connecting sockets locally, it works fine. I see the Arabic text I expected. But when
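The excerpt ends before the failing case is described, so the cause is unknown. A frequent source of this kind of garbling is that only one side names the charset while the other falls back to the platform default. That is an assumption, not something the question confirms. The sketch below names UTF-8 on both the sending and the receiving side. `Utf8SocketTextSketch`, `sendLine`, and `receiveLine` are hypothetical names, and in-memory streams stand in for `socket.getOutputStream()` and `socket.getInputStream()`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class Utf8SocketTextSketch {
    // Sender side: encode explicitly as UTF-8 and terminate the message with a newline.
    static void sendLine(OutputStream out, String text) throws IOException {
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        w.write(text);
        w.write('\n');
        w.flush();
    }

    // Receiver side: decode explicitly as UTF-8 rather than with the platform default.
    static String receiveLine(InputStream in) throws IOException {
        BufferedReader r = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return r.readLine();
    }

    public static void main(String[] args) throws IOException {
        String arabic = "\u0645\u0631\u062d\u0628\u0627"; // the Arabic word for "hello"
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        sendLine(wire, arabic);
        String received = receiveLine(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(received.equals(arabic)); // prints true
    }
}
```

With a real socket the same two helpers would be called on the socket's streams. The key point is that the charset is stated on both ends, so the result does not depend on each machine's default encoding.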
