RFC 1952 section 2.3.1 specifies that GZIP headers must contain an OS (Operating System) field. This identifies the type of file system on which compression took place. This may be useful in determining end-of-line convention for text files. The currently defined values are as follows:
0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)
1 - Amiga
2 - VMS (or OpenVMS)
3 - Unix
4 - VM/CMS
5 - Atari TOS
6 - HPFS filesystem (OS/2, NT)
7 - Macintosh
8 - Z-System
9 - CP/M
10 - TOPS-20
11 - NTFS filesystem (NT)
12 - QDOS
13 - Acorn RISCOS
255 - unknown
However, Java’s GZIP serialisation instead writes a zero in all cases, as can be seen on line 193 of GZIPOutputStream.java. I’ve run tests on four different operating systems to confirm that no other code modifies this header after it is written.
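For anyone who wants to reproduce this, here is a minimal sketch that compresses a few bytes in memory and reads back the OS byte. It assumes the fixed header layout from RFC 1952 (ID1, ID2, CM, FLG, four bytes of MTIME, XFL, then OS), which puts the OS field at offset 9. The class name GzipOsByte and its helper methods are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the JDK.
final class GzipOsByte {
    // Offset of the OS field in the fixed-length GZIP header:
    // ID1(0) ID2(1) CM(2) FLG(3) MTIME(4..7) XFL(8) OS(9)
    static final int OS_OFFSET = 9;

    // Compress the given bytes in memory with the JDK's GZIPOutputStream.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Return the OS field as an unsigned value (0..255).
    static int osByte(byte[] gzipped) {
        return gzipped[OS_OFFSET] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] out = gzip("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("OS byte: " + osByte(out));
    }
}
```

The printed value depends on the JDK version rather than on the host operating system.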
Why is this value hard-coded?
As Elliott pointed out, setting it to a default value is fine as per section 2.3.1.2 of the same RFC you reference:
A compliant compressor must produce files with correct ID1, ID2, CM, CRC32, and ISIZE, but may set all the other fields in the fixed-length part of the header to default values (255 for OS, 0 for all others). The compressor must set all reserved bits to zero.
However, the value Java writes is still incorrect according to this very fragment: the default for the OS field is 255, not 0. This was a known bug in the JDK, tracked as JDK-8244706. It was fixed in Java version 16, early access build 16.
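If you are stuck on an older JDK and need the RFC default, one possible workaround is to patch the byte yourself after compressing. The sketch below is only an illustration under a few assumptions: the output is buffered in memory, the OS field sits at offset 9 of the fixed header, and no header CRC (FHCRC) is present, which is the case for streams written by GZIPOutputStream. The class name GzipOsPatch and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the JDK.
final class GzipOsPatch {
    static final int OS_OFFSET = 9;    // position of the OS field in the header
    static final int OS_UNKNOWN = 255; // RFC 1952 default value for OS

    // Compress in memory, then overwrite the OS field with 255.
    // The trailer CRC32 covers only the uncompressed data, so it stays valid.
    static byte[] gzipWithUnknownOs(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] out = buf.toByteArray();
        out[OS_OFFSET] = (byte) OS_UNKNOWN;
        return out;
    }

    // Decompress again to check that the patched stream is still readable.
    static byte[] gunzip(byte[] gzipped) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return in.readAllBytes(); // requires Java 9 or later
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] out = gzipWithUnknownOs("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("OS byte: " + (out[OS_OFFSET] & 0xFF));
        System.out.println("Round trip: " + new String(gunzip(out), StandardCharsets.UTF_8));
    }
}
```

This only makes sense when the whole compressed output fits in memory. For large streams you would need to intercept the first ten bytes as they are written instead.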