
Guava Resources.readLines() for Zip/Gzip files

I’ve found Resources.readLines() and Files.readLines() helpful for simplifying my code.
The problem is that I often read gzip-compressed text files, or text files inside zip archives, from URLs (HTTP and FTP).
Is there a way to use Guava’s methods to read from these URLs too? Or is that only possible with Java’s GZIPInputStream/ZipInputStream?

Answer

You can create your own ByteSources:

For GZip:

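import com.google.common.io.ByteSource;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Wraps another ByteSource and decompresses its bytes as gzip on every openStream() call.
public class GzippedByteSource extends ByteSource {
  private final ByteSource source;
  public GzippedByteSource(ByteSource gzippedSource) { source = gzippedSource; }
  @Override public InputStream openStream() throws IOException {
    return new GZIPInputStream(source.openStream());
  }
}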

Then use it:

Charset charset = ... ;
new GzippedByteSource(Resources.asByteSource(url)).toCharSource(charset).readLines();
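For comparison, here is a rough sketch of the plain-JDK route the question mentions, using only GZIPInputStream. The class name GzipLinesSketch and both method names are made up for this illustration, and the gzip() helper exists only to produce test data in memory.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of Guava or the JDK.
class GzipLinesSketch {
  /** Decompresses the stream as gzip, reads it line by line, and closes it when done. */
  static List<String> readGzippedLines(InputStream gzipped, Charset charset) throws IOException {
    List<String> lines = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new GZIPInputStream(gzipped), charset))) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }

  /** Demo-only helper: gzip-compresses a string in memory. */
  static byte[] gzip(String text, Charset charset) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(text.getBytes(charset));
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = gzip("first\nsecond\n", StandardCharsets.UTF_8);
    System.out.println(
        readGzippedLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
  }
}
```

For a remote file you would pass url.openStream() instead of the in-memory stream. The ByteSource version above does the same wrapping, but lets Guava open and close the stream for you.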

Here is the implementation for zip archives. It assumes that you read only one entry.

public static class ZipEntryByteSource extends ByteSource {
  private final ByteSource source;
  private final String entryName;
  public ZipEntryByteSource(ByteSource zipSource, String entryName) {
    this.source = zipSource;
    this.entryName = entryName;
  }
  @Override public InputStream openStream() throws IOException {
    final ZipInputStream in = new ZipInputStream(source.openStream());
    while (true) {
      final ZipEntry entry = in.getNextEntry();
      if (entry == null) {
        in.close();
        // entry is null on this branch, so report the name that was requested.
        throw new IOException("No entry named " + this.entryName);
      } else if (entry.getName().equals(this.entryName)) {
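        return new InputStream() {
          @Override
          public int read() throws IOException {
            return in.read();
          }

          // Delegate bulk reads as well; the inherited version calls read() once per byte.
          @Override
          public int read(byte[] b, int off, int len) throws IOException {
            return in.read(b, off, len);
          }

          @Override
          public void close() throws IOException {
            in.closeEntry();
            in.close();
          }
        };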
      } else {
        in.closeEntry();
      }
    }
  }
}

And you can use it like this:

Charset charset = ... ;
String entryName = ... ; // Name of the entry inside the zip file.
new ZipEntryByteSource(Resources.asByteSource(url), entryName).toCharSource(charset).readLines();
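
As a rough JDK-only sketch of the same single-entry lookup: ZipInputStream reports end-of-stream when the current entry ends, so a reader layered on top of it stops at the entry boundary. The class name ZipEntryLinesSketch and its methods are made up for this illustration, and the zip() helper only builds test data in memory.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of Guava or the JDK.
class ZipEntryLinesSketch {
  /** Scans the archive for entryName and returns that entry's lines; closes the stream. */
  static List<String> readZipEntryLines(InputStream zipped, String entryName, Charset charset)
      throws IOException {
    try (ZipInputStream zip = new ZipInputStream(zipped)) {
      ZipEntry entry;
      // getNextEntry() skips whatever is left of the previous entry.
      while ((entry = zip.getNextEntry()) != null) {
        if (entry.getName().equals(entryName)) {
          // Reads on zip return -1 at the end of this entry, so readLine() stops there.
          BufferedReader reader = new BufferedReader(new InputStreamReader(zip, charset));
          List<String> lines = new ArrayList<>();
          String line;
          while ((line = reader.readLine()) != null) {
            lines.add(line);
          }
          return lines;
        }
      }
      throw new IOException("No entry named " + entryName);
    }
  }

  /** Demo-only helper: builds a zip archive in memory from entry name to text content. */
  static byte[] zip(Map<String, String> entries, Charset charset) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream out = new ZipOutputStream(bytes)) {
      for (Map.Entry<String, String> e : entries.entrySet()) {
        out.putNextEntry(new ZipEntry(e.getKey()));
        out.write(e.getValue().getBytes(charset));
        out.closeEntry();
      }
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    Map<String, String> entries = new LinkedHashMap<>();
    entries.put("a.txt", "x\ny\n");
    entries.put("b.txt", "one\ntwo\n");
    byte[] data = zip(entries, StandardCharsets.UTF_8);
    System.out.println(
        readZipEntryLines(new ByteArrayInputStream(data), "b.txt", StandardCharsets.UTF_8));
  }
}
```

Because the archive is read as a stream, every entry before the requested one has to be skipped over first. That is fine for small downloads, but it costs more as the archive grows.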
User contributions licensed under: CC BY-SA