
ZipFile: Wrong values when reading

I am creating a zip file with one directory and a single compressed text file inside it.

Code to create the zip file


When writing the file, the uncompressed data is 23 bytes and the compressed data is 15 bytes. I am using every method in ZipEntry just to test whether I can retrieve all the values correctly when reading it back.

I read it back using the ZipFile class rather than ZipInputStream (because of a bug where getSize() always returns -1), using this code


I get this output


A lot of this output is wrong.

For the directory

1) Creation time and access time are null (even though I specified them in the write method)

2) Extra data (optional data) has the wrong encoding

For the file

1) Creation time and access time are null (even though I specified them in the write method)

2) The getSize() and getCompressedSize() methods return the wrong values. I set these values manually with setSize() and setCompressedSize() when creating the file. The values were 23 and 15, but reading returns 15 and 17

3) Extra data (optional data) has the wrong encoding

4) Since getSize() returns an incorrect size, it doesn't display the whole data [Hello World Hel]

With so many things going wrong, I thought I would post this as one question rather than multiple small ones, as they all seem related. I am a complete beginner at writing zip files, so any direction on where to go from here would be greatly appreciated.

If the size is unknown or incorrect, I can read the data of a zip entry into a buffer using a while loop, which is not a problem. But why would they even provide set and get size methods if they knew we would be doing this most of the time anyway? What's the point?
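The while-loop approach mentioned above can be sketched as follows. This is only a minimal illustration, not code from the question: the class name `ReadEntrySketch`, the entry name `dir/hello.txt`, and the sample text are assumptions chosen so that the payload is 23 bytes long.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ReadEntrySketch {

    // Drains a stream into a byte array without relying on any declared size.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Builds a tiny in-memory zip and returns the text of its first entry.
    static String roundTrip(String text) throws IOException {
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipBytes)) {
            zos.putNextEntry(new ZipEntry("dir/hello.txt"));
            zos.write(text.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes.toByteArray()))) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("archive has no entries");
            }
            // entry.getSize() may not be known yet; the loop does not need it.
            return new String(readFully(zis), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("Hello World Hello World"));
    }
}
```

For a ZipInputStream, `read` returns -1 at the end of the current entry rather than at the end of the archive, so the same helper can be called once per entry.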


Answer

After much research I was able to solve 70% of the problems. The others can't be solved, given how ZipOutputStream writes the data and how ZipFile reads it.

Problem 1: Incorrect values returned by getSize() & getCompressedSize()

1) During Writing

I was blind not to have seen this earlier, but ZipOutputStream already does the compression for us, and I was double compressing the data by running it through my own Deflater first. I removed that code and realized that you only need to specify these values when the entry's method is STORED. Otherwise they are computed for you from the data. After refactoring my zip writing code, this is how it looks
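As a rough illustration of that point (not the original listing), here is a minimal sketch that writes one DEFLATED entry without setting any sizes and one STORED entry with size, compressed size, and CRC set by hand. The class name `ZipWriteSketch` and the entry names are assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipWriteSketch {

    static void write(Path zipPath, byte[] data) throws IOException {
        try (OutputStream fileOut = Files.newOutputStream(zipPath);
             ZipOutputStream zos = new ZipOutputStream(fileOut)) {

            // Directory entry: the name ends with '/' and carries no data.
            zos.putNextEntry(new ZipEntry("docs/"));
            zos.closeEntry();

            // DEFLATED is the default method: write the raw bytes and let
            // ZipOutputStream compress them and record the sizes and CRC.
            zos.putNextEntry(new ZipEntry("docs/deflated.txt"));
            zos.write(data);
            zos.closeEntry();

            // STORED: nothing is compressed, so size, compressed size and
            // CRC-32 must be set before putNextEntry is called.
            CRC32 crc = new CRC32();
            crc.update(data);
            ZipEntry stored = new ZipEntry("docs/stored.txt");
            stored.setMethod(ZipEntry.STORED);
            stored.setSize(data.length);
            stored.setCompressedSize(data.length);
            stored.setCrc(crc.getValue());
            zos.putNextEntry(stored);
            zos.write(data);
            zos.closeEntry();
        }
    }

    // Reads the uncompressed size of one entry back through ZipFile.
    static long sizeOf(Path zipPath, String name) throws IOException {
        try (ZipFile zf = new ZipFile(zipPath.toFile())) {
            return zf.getEntry(name).getSize();
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = Files.createTempFile("sketch", ".zip");
        write(zip, "Hello World Hello World".getBytes(StandardCharsets.UTF_8));
        System.out.println("deflated size = " + sizeOf(zip, "docs/deflated.txt"));
        System.out.println("stored size   = " + sizeOf(zip, "docs/stored.txt"));
        Files.deleteIfExists(zip);
    }
}
```

Passing already-compressed bytes to `write` for a DEFLATED entry would compress them a second time, which is consistent with the 15 and 17 reported in the question.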


2) During Reading

To get the correct size and compressed size values, there are two approaches

-> If you read the file using the ZipFile class, the values come out correctly

-> If you use ZipInputStream, these values are computed only after you have read all the bytes from the entry. More info here
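The second approach can be seen in a small experiment. The sketch below is an assumption-laden illustration (class name `StreamSizeSketch`, entry name `hello.txt`, in-memory archive): it asks the same ZipEntry object for its size once right after `getNextEntry()` and once after the entry has been read to the end.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class StreamSizeSketch {

    // Returns {size before reading, size after reading} for the first entry.
    static long[] sizesBeforeAndAfter(byte[] data) throws IOException {
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipBytes)) {
            // DEFLATED entry with no sizes set: the sizes are written
            // after the compressed data, not in the leading header.
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write(data);
            zos.closeEntry();
        }
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes.toByteArray()))) {
            ZipEntry entry = zis.getNextEntry();
            if (entry == null) {
                throw new IOException("archive has no entries");
            }
            long before = entry.getSize(); // not known yet, reported as -1

            byte[] buffer = new byte[1024];
            while (zis.read(buffer) != -1) {
                // discard the data; only the side effect on 'entry' matters
            }
            long after = entry.getSize(); // filled in once the entry is drained
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        long[] sizes = sizesBeforeAndAfter(
                "Hello World Hello World".getBytes(StandardCharsets.UTF_8));
        System.out.println("before = " + sizes[0] + ", after = " + sizes[1]);
    }
}
```

This only applies to entries whose sizes were not known when their header was written. A STORED entry carries its sizes up front, so they are available immediately.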


Problem 2: Incorrect Extra data

This post pretty much explains everything.

Here is the code
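For background, and independent of the code referred to above: the ZIP format lays out the extra field as a sequence of blocks, each with a 2-byte little-endian header ID, a 2-byte little-endian length, and then the payload. The JDK may add blocks of its own (for example for timestamps), so printing the whole array as text can look garbled. The sketch below wraps a payload in such a block and picks it out again by ID. The class name `ExtraFieldSketch` and the header ID `0x7E57` are hypothetical values chosen for illustration, not registered identifiers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.attribute.FileTime;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ExtraFieldSketch {

    static final int HYPOTHETICAL_ID = 0x7E57; // arbitrary ID, assumption

    // Wraps a payload as: header ID (2 bytes LE), length (2 bytes LE), data.
    static byte[] block(int id, byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putShort((short) id)
                .putShort((short) payload.length)
                .put(payload)
                .array();
    }

    // Splits an extra field into header ID -> payload; stops on malformed data.
    static Map<Integer, byte[]> parseExtra(byte[] extra) {
        Map<Integer, byte[]> blocks = new LinkedHashMap<>();
        if (extra == null) {
            return blocks;
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;
            int len = buf.getShort() & 0xFFFF;
            if (len > buf.remaining()) {
                break;
            }
            byte[] payload = new byte[len];
            buf.get(payload);
            blocks.put(id, payload);
        }
        return blocks;
    }

    // Writes one entry with a custom block and a creation time, reads it back.
    static Map<Integer, byte[]> roundTrip(byte[] payload) throws IOException {
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipBytes)) {
            ZipEntry entry = new ZipEntry("hello.txt");
            entry.setExtra(block(HYPOTHETICAL_ID, payload));
            entry.setCreationTime(FileTime.fromMillis(System.currentTimeMillis()));
            zos.putNextEntry(entry);
            zos.write("Hello World Hello World".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes.toByteArray()))) {
            return parseExtra(zis.getNextEntry().getExtra());
        }
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, byte[]> blocks = roundTrip("note".getBytes(StandardCharsets.UTF_8));
        blocks.forEach((id, data) ->
                System.out.printf("id=0x%04X length=%d%n", id, data.length));
    }
}
```

Looking a payload up by its header ID avoids depending on where in the array the JDK places its own blocks.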


Unsolved Problems

There are still 3 values that return different results depending on which method you use to read the file. I made a table of my observations per entry.


Apparently, according to the bug report, this is expected behavior: ZipFile is random access and ZipInputStream is sequential, so they access the data differently.

From my observations, ZipInputStream returns the best results, so I will continue to use that.
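To compare the two readers on your own archive, something like the following can be used. It is a minimal sketch, not the code behind the observations above: the class name `CompareReadersSketch` and the output format are assumptions, and no particular result is claimed for the timestamp or comment fields.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

final class CompareReadersSketch {

    // One line of metadata per entry, in the same format for both readers.
    static String describe(ZipEntry e) {
        return e.getName()
                + " size=" + e.getSize()
                + " compressed=" + e.getCompressedSize()
                + " created=" + e.getCreationTime()
                + " accessed=" + e.getLastAccessTime()
                + " comment=" + e.getComment();
    }

    // Random access: ZipFile looks entries up in the archive's directory.
    static List<String> viaZipFile(Path zip) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                lines.add(describe(entries.nextElement()));
            }
        }
        return lines;
    }

    // Sequential: ZipInputStream walks the entries in file order.
    static List<String> viaZipInputStream(Path zip) throws IOException {
        List<String> lines = new ArrayList<>();
        try (InputStream in = Files.newInputStream(zip);
             ZipInputStream zis = new ZipInputStream(in)) {
            byte[] buffer = new byte[4096];
            ZipEntry e;
            while ((e = zis.getNextEntry()) != null) {
                // Drain the entry first so that sizes are filled in.
                while (zis.read(buffer) != -1) {
                    // data is discarded; only the metadata is of interest
                }
                lines.add(describe(e));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CompareReadersSketch <archive.zip>");
            return;
        }
        Path zip = Paths.get(args[0]);
        System.out.println("ZipFile:");
        viaZipFile(zip).forEach(System.out::println);
        System.out.println("ZipInputStream:");
        viaZipInputStream(zip).forEach(System.out::println);
    }
}
```

Printing both lists side by side makes it easy to see which fields differ for a given archive.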

User contributions licensed under: CC BY-SA