I’m trying to read a binary file in Java. I need methods to read unsigned 8-bit values, unsigned 16-bit values and unsigned 32-bit values. What would be the best way (fastest, nicest-looking code) to do this? I’ve done this in C++ and did something like this:
uint8_t *buffer; uint32_t value = buffer[0] | buffer[1] << 8 | buffer[2] << 16 | buffer[3] << 24;
But in Java this causes a problem if, for example, buffer[1] contains a value which has its sign bit set, as the result of a left-shift is an int (?). Instead of ORing in only 0xA5 at the specific place, it ORs in 0xFFFFA500 or something like that, which “damages” the two top bytes.
I have code right now which looks like this:
public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value = bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24);
    return value & 0x00000000FFFFFFFFL;
}
If I want to convert the four bytes 0x67 0xA5 0x72 0x50 the result is 0xFFFFA567 instead of 0x5072A567.
Edit: This works great:
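public long getUInt32() throws EOFException, IOException {
    byte[] bytes = getBytes(4);
    long value = bytes[0] & 0xFF;
    value |= (bytes[1] << 8) & 0xFFFF;
    value |= (bytes[2] << 16) & 0xFFFFFF;
    // The last mask must be a long (note the L): an int mask of 0xFFFFFFFF is -1 and
    // would let a set top bit in bytes[3] sign-extend into the upper half of value.
    value |= (bytes[3] << 24) & 0xFFFFFFFFL;
    return value;
}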
But isn’t there a better way to do this? 10 bit operations seem a “bit” much for a simple thing like this. (See what I did there?) =)
Answer
You’ve got the right idea; I don’t think there’s any obvious improvement. If you look at the java.io.DataInput.readInt spec, they have code for the same thing. They switch the order of << and &, but otherwise it’s standard.
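To make the mask-then-shift order concrete, here is a minimal sketch for little-endian input like yours. The class and method names (UnsignedLE, u8, u16, u32) are made up for illustration, and it assumes the bytes are already sitting in an array, as in the question. Masking each byte with 0xFF (0xFFL in the 32-bit case) before shifting throws away the sign extension up front, so no second mask is needed afterwards.

```java
// Hypothetical helper class; the names are illustrative, not from any library.
class UnsignedLE {
    // Unsigned 8-bit value: the mask drops the sign extension added when byte becomes int.
    static int u8(byte[] b, int off) {
        return b[off] & 0xFF;
    }

    // Unsigned 16-bit little-endian value; fits comfortably in an int.
    static int u16(byte[] b, int off) {
        return (b[off] & 0xFF) | (b[off + 1] & 0xFF) << 8;
    }

    // Unsigned 32-bit little-endian value; masking with a long keeps bit 31 from
    // being treated as a sign bit.
    static long u32(byte[] b, int off) {
        return (b[off] & 0xFFL)
             | (b[off + 1] & 0xFFL) << 8
             | (b[off + 2] & 0xFFL) << 16
             | (b[off + 3] & 0xFFL) << 24;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x67, (byte) 0xA5, 0x72, 0x50};
        System.out.println(Long.toHexString(u32(bytes, 0))); // prints 5072a567
    }
}
```

That is four masks, three shifts and three ORs for the 32-bit case, the same count as your version, but every line follows one pattern.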
There is no way to read an int in one go from a byte array, unless you use a memory-mapped region, which is way overkill for this.
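One caveat worth hedging here: as far as I know, java.nio.ByteBuffer can also wrap an ordinary byte[] without any memory mapping and lets you pick the byte order. It still combines the bytes internally, so I would not assume it is faster, but it hides the bit-twiddling. A rough sketch, where ByteBufferUnsigned and its methods are hypothetical names and the array is assumed to hold at least four (or two) bytes:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class; only ByteBuffer and ByteOrder are real JDK types.
class ByteBufferUnsigned {
    // Reads the first four bytes as a little-endian int, then widens it without sign extension.
    static long u32(byte[] bytes) {
        int raw = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
        return raw & 0xFFFFFFFFL; // Integer.toUnsignedLong(raw) does the same on Java 8+
    }

    // Reads the first two bytes as a little-endian short and masks it to 0..65535.
    static int u16(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x67, (byte) 0xA5, 0x72, 0x50};
        System.out.println(Long.toHexString(u32(bytes))); // prints 5072a567
    }
}
```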
Of course, you could use a DataInputStream directly instead of reading into a byte[] first:
DataInputStream d = new DataInputStream(new FileInputStream("myfile")); d.readInt();
DataInputStream reads big-endian values, the opposite of the byte order you are using, so you’ll also need some Integer.reverseBytes calls. It won’t be any faster, but it’s cleaner.
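Putting that together for all three widths might look something like the sketch below. LittleEndianReader and its method names are invented for this example; the JDK calls it relies on (readUnsignedByte, readShort, readInt, Short.reverseBytes, Integer.reverseBytes) are standard. It assumes the file is little-endian, as in your C++ snippet.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper; reads little-endian unsigned values from any InputStream.
class LittleEndianReader {
    private final DataInputStream in;

    LittleEndianReader(InputStream source) {
        this.in = new DataInputStream(source);
    }

    // A single byte has no byte order, so DataInputStream's own method is enough.
    int readUInt8() throws IOException {
        return in.readUnsignedByte();
    }

    // Read big-endian, swap the two bytes, then mask away the sign extension.
    int readUInt16() throws IOException {
        return Short.reverseBytes(in.readShort()) & 0xFFFF;
    }

    // Read big-endian, swap the four bytes, then widen to long without sign extension.
    long readUInt32() throws IOException {
        return Integer.reverseBytes(in.readInt()) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x67, (byte) 0xA5, 0x72, 0x50};
        LittleEndianReader r = new LittleEndianReader(new ByteArrayInputStream(data));
        System.out.println(Long.toHexString(r.readUInt32())); // prints 5072a567
    }
}
```

For a real file you would pass in a FileInputStream, ideally wrapped in a BufferedInputStream so each read does not hit the disk. DataInputStream throws EOFException on a short read, which matches the signature of your getUInt32.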