Why is the GZIP “os” header hard-coded to FAT in Java?

RFC 1952 section 2.3.1 specifies that GZIP headers must contain an OS flag:

OS (Operating System). This identifies the type of file system on which compression took place. This may be useful in determining end-of-line convention for text files. The currently defined values are as follows:

  0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)
  1 - Amiga
  2 - VMS (or OpenVMS)
  3 - Unix
  4 - VM/CMS
  5 - Atari TOS
  6 - HPFS filesystem (OS/2, NT)
  7 - Macintosh
  8 - Z-System
  9 - CP/M
 10 - TOPS-20
 11 - NTFS filesystem (NT)
 12 - QDOS
 13 - Acorn RISCOS
255 - unknown

However, Java’s GZIP implementation writes a zero (FAT) in all cases, as can be seen on line 193 of GZIPOutputStream.java. I’ve run tests on four different operating systems to confirm that no other code modifies this header after it is written.
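For anyone who wants to reproduce the observation, here is a minimal sketch that compresses a few bytes in memory and prints the OS byte. It relies on the fixed header layout from RFC 1952 section 2.3, where OS is the tenth byte (offset 9). The class name `GzipOsByte` and its helper methods are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the JDK.
class GzipOsByte {
    // Fixed header layout: ID1, ID2, CM, FLG, MTIME (4 bytes), XFL, OS.
    static final int OS_OFFSET = 9;

    // Compress the given bytes into an in-memory GZIP member.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Return the OS field as an unsigned value (0..255).
    static int osByte(byte[] gzipData) {
        return gzipData[OS_OFFSET] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] gz = compress("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("OS byte: " + osByte(gz));
    }
}
```

The printed value depends on the JDK that runs it, so no expected output is given here.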

Why is this value hard-coded?

Answer

As Elliott pointed out, setting it to a default value is fine as per section 2.3.1.2 of the same RFC you reference:

A compliant compressor must produce files with correct ID1, ID2, CM, CRC32, and ISIZE, but may set all the other fields in the fixed-length part of the header to default values (255 for OS, 0 for all others). The compressor must set all reserved bits to zero.

However, the value Java wrote is still incorrect according to this very fragment: the default for the OS flag is 255, not 0. This was a known bug in the JDK, tracked as JDK-8244706. It was fixed in Java 16 (early access build 16).
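If you are stuck on an older JDK and need the RFC default in the header, one possible workaround is to patch the byte after compression. The sketch below is only an illustration under stated assumptions: `GzipOsPatch` and its methods are hypothetical names, and it refuses to touch data whose FLG field has the FHCRC bit set, because changing a header byte would then invalidate the header CRC. The CRC32 in the trailer covers only the uncompressed data, so it is unaffected.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the JDK.
class GzipOsPatch {
    static final int FLG_OFFSET = 3;   // flags byte in the fixed header
    static final int OS_OFFSET = 9;    // OS byte in the fixed header
    static final int FHCRC = 0x02;     // FLG bit: header CRC16 present
    static final int OS_UNKNOWN = 255; // RFC 1952 default for OS

    // Return a copy of the GZIP data with the OS field set to 255.
    static byte[] withUnknownOs(byte[] gzipData) {
        if (gzipData.length < 10
                || (gzipData[0] & 0xFF) != 0x1f
                || (gzipData[1] & 0xFF) != 0x8b) {
            throw new IllegalArgumentException("not a GZIP header");
        }
        if ((gzipData[FLG_OFFSET] & FHCRC) != 0) {
            throw new IllegalArgumentException("header CRC present; patching would invalidate it");
        }
        byte[] patched = gzipData.clone();
        patched[OS_OFFSET] = (byte) OS_UNKNOWN;
        return patched;
    }

    // Compress bytes in memory with the JDK's GZIPOutputStream.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompress, to confirm the patched stream is still readable.
    static byte[] decompress(byte[] gzipData) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipData))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `GzipOsPatch.withUnknownOs(GzipOsPatch.compress(data))` yields a stream that still round-trips through `decompress`, since the reader does not interpret the OS byte.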



Source: stackoverflow