I was getting the error io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence.
The solution is to read and write the file in UTF-8.
My code is:
InputStream input = null;
OutputStream output = null;
OutputStreamWriter bufferedWriter = new OutputStreamWriter(output, "UTF8");
input = new URL(url).openStream();
output = new FileOutputStream("DirectionResponse.xml");
byte[] buffer = new byte[1024];
for (int length = 0; (length = input.read(buffer)) > 0;) {
    output.write(buffer, 0, length);
}
BufferedReader br = new BufferedReader(new FileReader("DirectionResponse.xml"));
FileWriter fstream = new FileWriter("ppre_DirectionResponse.xml");
BufferedWriter out = new BufferedWriter(fstream);
I’m reading a URL and writing its contents to a file, DirectionResponse.xml. Then I read DirectionResponse.xml and write the same content to ppre_DirectionResponse.xml for processing.
How do I change this so that reading and writing is done in UTF-8?
Answer
First, you need to call output.close() (or at least call output.flush()) before you reopen the file for input. That’s probably the main cause of your problems.
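One way to make sure the stream is closed before the file is reopened is try-with-resources. The following is only a sketch: the class name SaveStream and the helper saveBytes are made up for illustration, and a ByteArrayInputStream stands in for new URL(url).openStream() so that it runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SaveStream {

    // Hypothetical helper: copies every byte from input to the named file.
    // The bytes are not decoded, so the encoding sent by the server is kept unchanged.
    public static void saveBytes(InputStream input, String fileName) throws IOException {
        try (OutputStream output = new FileOutputStream(fileName)) {
            byte[] buffer = new byte[1024];
            int length;
            // read() returns -1 at end of stream
            while ((length = input.read(buffer)) != -1) {
                output.write(buffer, 0, length);
            }
        } // output is closed here, even if an exception was thrown
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for new URL(url).openStream(), so the sketch runs offline.
        byte[] sample = "<status>OK</status>".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("DirectionResponse", ".xml");
        try (InputStream input = new ByteArrayInputStream(sample)) {
            saveBytes(input, target.toString());
        }
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Because the copy works on raw bytes, no charset is involved at this stage; the encoding only matters once the file is read back as text.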
Then, you shouldn’t use FileReader or FileWriter like this, because the constructors used here take no charset and fall back to the platform-default encoding (which is often not UTF-8). From the docs for FileReader:
The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate.
You have the same problem when using a FileWriter. Replace this:
BufferedReader br = new BufferedReader(new FileReader("DirectionResponse.xml"));
with something like this:
BufferedReader br = new BufferedReader(new InputStreamReader( new FileInputStream("DirectionResponse.xml"), "UTF-8"));
and similarly for fstream: wrap a FileOutputStream in an OutputStreamWriter and pass "UTF-8" as the encoding.
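Put together, the read and write side might look like the sketch below. The class name Utf8Rewrite and the helper copyAsUtf8 are invented for illustration, and it uses StandardCharsets.UTF_8 (available since Java 7) instead of the "UTF-8" string, which avoids the checked UnsupportedEncodingException. The main method writes a small sample file first so that the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8Rewrite {

    // Hypothetical helper: reads source as UTF-8 text and writes it to target as UTF-8 text.
    public static void copyAsUtf8(String source, String target) throws IOException {
        try (BufferedReader br = new BufferedReader(
                 new InputStreamReader(new FileInputStream(source), StandardCharsets.UTF_8));
             BufferedWriter out = new BufferedWriter(
                 new OutputStreamWriter(new FileOutputStream(target), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Any per-line processing would go here.
                out.write(line);
                out.newLine(); // writes the platform line separator
            }
        } // both files are flushed and closed here
    }

    public static void main(String[] args) throws IOException {
        // Sample input with a non-ASCII character (u-umlaut), written as UTF-8 bytes.
        Path source = Files.createTempFile("DirectionResponse", ".xml");
        Path target = Files.createTempFile("ppre_DirectionResponse", ".xml");
        Files.write(source, "<name>Z\u00fcrich</name>\n".getBytes(StandardCharsets.UTF_8));

        copyAsUtf8(source.toString(), target.toString());

        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

Note that readLine() strips line terminators and newLine() writes the platform's own, so line endings in the output can differ from the input. If the file must be preserved byte for byte, copy it as bytes instead of decoding it.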