Kotlin gzip uncompress fail

I tried to simplify my Java gzip uncompress code by porting it to Kotlin, but after the change it seems to be broken.

Here is the Java code:

  public static byte[] uncompress(byte[] compressedBytes) {
    if (null == compressedBytes || compressedBytes.length == 0) {
      return null;
    }

    ByteArrayOutputStream out = null;
    ByteArrayInputStream in = null;
    GZIPInputStream gzipInputStream = null;

    try {
      out = new ByteArrayOutputStream();
      in = new ByteArrayInputStream(compressedBytes);
      gzipInputStream = new GZIPInputStream(in);
      byte[] buffer = new byte[256];
      int n = 0;

      while ((n = gzipInputStream.read(buffer)) >= 0) {
        out.write(buffer, 0, n);
      }

      return out.toByteArray();
    } catch (IOException ignore) {
    } finally {
      CloseableUtils.closeQuietly(gzipInputStream);
      CloseableUtils.closeQuietly(in);
      CloseableUtils.closeQuietly(out);
    }

    return null;
  }

This is my Kotlin code:

  payload = GZIPInputStream(payload.inputStream())
      .bufferedReader()
      .use { it.readText() }
      .toByteArray()

And I got this error.

com.google.protobuf.nano.InvalidProtocolBufferNanoException: While parsing a protocol message, the input ended unexpectedly in the middle of a field.  This could mean either than the input has been truncated or that an embedded message misreported its own length.

Is the decompression process being interrupted by the reader?

Answer

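readText(charset: Charset = Charsets.UTF_8) decodes the decompressed bytes as UTF-8 text. A protobuf payload is binary, so it will usually contain byte sequences that are not valid UTF-8. The decoder replaces each of those with the replacement character U+FFFD, and toByteArray() then encodes that character as different bytes. The array handed to the parser is no longer the original message, which is why protobuf reports that the input "ended unexpectedly in the middle of a field". The decompression itself is not interrupted.

Use readBytes() instead. It returns a ByteArray, which is the same type as byte[] on the JVM.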
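The effect can be reproduced without gzip or protobuf. The following is a minimal sketch: roundTripThroughText is a hypothetical helper name, and the sample bytes are made up for illustration. It decodes a byte array as UTF-8 and encodes it back, which is what readText() followed by toByteArray() does to the decompressed data.

```kotlin
// Hypothetical helper: decode bytes as UTF-8 text and encode the text back,
// mimicking readText().toByteArray() on the decompressed stream.
fun roundTripThroughText(bytes: ByteArray): ByteArray =
    String(bytes, Charsets.UTF_8).toByteArray(Charsets.UTF_8)

fun main() {
    // 0xFF and 0xFE never occur in well-formed UTF-8, so they cannot survive decoding.
    val binary = byteArrayOf(0x08, 0xFF.toByte(), 0xFE.toByte(), 0x01)
    val damaged = roundTripThroughText(binary)
    println("binary survives: " + binary.contentEquals(damaged))

    // Plain ASCII is valid UTF-8 and comes back unchanged, which is why
    // the bug can go unnoticed with text-only test data.
    val ascii = "hello".toByteArray(Charsets.UTF_8)
    println("ascii survives: " + ascii.contentEquals(roundTripThroughText(ascii)))
}
```

The binary array comes back with different contents, while the ASCII array is unchanged.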

Example:

GZIPInputStream(payload.inputStream())
      .buffered()
      .use { it.readBytes() }

Edit:

For reading bytes you shouldn't use a Reader at all. A Reader decodes bytes into text using a charset, UTF-8 by default, as the declaration of Kotlin's InputStream.bufferedReader shows:

public inline fun InputStream.bufferedReader(charset: Charset = Charsets.UTF_8): BufferedReader = reader(charset).buffered()

InputStream.readBytes() already copies the stream through an 8 KB buffer internally, so no extra buffering is needed:

public fun InputStream.readBytes(): ByteArray {
    val buffer = ByteArrayOutputStream(maxOf(DEFAULT_BUFFER_SIZE, this.available()))
    copyTo(buffer)
    return buffer.toByteArray()
}
// This copies with 8KB buffer automatically
// DEFAULT_BUFFER_SIZE = 8 * 1024
public fun InputStream.copyTo(out: OutputStream, bufferSize: Int = DEFAULT_BUFFER_SIZE): Long {
    var bytesCopied: Long = 0
    val buffer = ByteArray(bufferSize)
    var bytes = read(buffer)
    while (bytes >= 0) {
        out.write(buffer, 0, bytes)
        bytesCopied += bytes
        bytes = read(buffer)
    }
    return bytesCopied
}

So you just have to do:

GZIPInputStream(payload.inputStream()).use { it.readBytes() }
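For a drop-in replacement of the Java method from the question, the one-liner can be wrapped as follows. This is a sketch under assumptions: it keeps the Java version's behaviour of returning null for null or empty input and on IOException, and compress is a hypothetical helper that exists only to produce test input.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.IOException
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical helper, used here only to create gzip data for the demo.
fun compress(bytes: ByteArray): ByteArray {
    val out = ByteArrayOutputStream()
    // Closing the GZIPOutputStream writes the gzip trailer into `out`.
    GZIPOutputStream(out).use { it.write(bytes) }
    return out.toByteArray()
}

// Mirrors the Java method: null for null/empty input or when decompression fails.
fun uncompress(compressedBytes: ByteArray?): ByteArray? {
    if (compressedBytes == null || compressedBytes.isEmpty()) return null
    return try {
        // No Reader involved: the bytes are copied as-is.
        GZIPInputStream(compressedBytes.inputStream()).use { it.readBytes() }
    } catch (ignored: IOException) {
        null
    }
}

fun main() {
    val original = byteArrayOf(0x08, 0xFF.toByte(), 0xFE.toByte(), 0x01)
    val restored = uncompress(compress(original))
    println("round trip ok: " + (restored != null && original.contentEquals(restored)))
}
```

Swallowing the exception hides corrupt input from the caller, so letting the IOException propagate may be the better choice in new code.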


Source: stackoverflow