Skip to content
Advertisement

Why is UUID#compareTo incompatible with RFC 4122?

Overview

Javas UUID class implements Comparable. But the order it implements appears to be incompatible with the specificiation given in RFC 4122.

In particular, it is inconsistent with the natural order implied by its string representation (uuid1.toString().compareTo(uuid2.toString())), which lines up with the RFC.


Example

You can reproduce and observe the problem by using the following code:

UUID uuid1 = UUID.randomUUID();
UUID uuid2 = UUID.randomUUID();

Assert.assertEquals(
        Math.signum((int) uuid1.compareTo(uuid2)),
        Math.signum((int) uuid1.toString().compareTo(uuid2.toString())));

Details

My main problem with this is that almost all other tools and languages seem to be consistent and compatible with RFC 4122, but Java is not.

In my particular case, I am using postgresql 13 and order by a column that contains the UUID, e.g. myColumnd::UUID or myColumnd::text (using uuid_v4), but the order I obtain by this differs from the order obtained with Java.

Advertisement

Answer

Reason

What you are observing here is a known bug which will not be fixed anymore to preserve backwards compatibility.

For details, see JDK-7025832:

Though the bug is accurate that the compareTo implementation is not consistent with other implementations the Java UUID.compareTo() method must remain consistent among versions of Java. The compareTo() function is used primarily for sorting and the sort order of UUIDs must remain stable from version to version of Java.


Signed comparison

The underlying root problem is that Javas long type is a signed type but the reference implementation from RFC 4122, and implementations in other tools and languages, do the math with unsigned types.

This results in small differences in the outcome of the order, since the point where the numbers over-/underflow is different. E.g. Long.MAX_NUMBER is bigger than LONG.MAX_NUMBER + 1, but not for their unsigned counterparts.

The issue with Javas implementation was detected too late and now we have to live with this incompatibility.


Implementation Appendix

Here is the correct reference implementation from RFC 4122:

/* uuid_compare --  Compare two UUID's "lexically" and return */
#define CHECK(f1, f2) if (f1 != f2) return f1 < f2 ? -1 : 1;
int uuid_compare(uuid_t *u1, uuid_t *u2)
{
    int i;

    CHECK(u1->time_low, u2->time_low);
    CHECK(u1->time_mid, u2->time_mid);
    CHECK(u1->time_hi_and_version, u2->time_hi_and_version);
    CHECK(u1->clock_seq_hi_and_reserved, u2->clock_seq_hi_and_reserved);
    CHECK(u1->clock_seq_low, u2->clock_seq_low)

    for (i = 0; i < 6; i++) {
        if (u1->node[i] < u2->node[i])
            return -1;
        if (u1->node[i] > u2->node[i])
            return 1;
    }
    return 0;
}
#undef CHECK

defined on the struct

typedef struct {
    unsigned32  time_low;
    unsigned16  time_mid;
    unsigned16  time_hi_and_version;
    unsigned8   clock_seq_hi_and_reserved;
    unsigned8   clock_seq_low;
    byte        node[6];
} uuid_t;

as you see, they compare the nodes, which are byte, one by one (in the correct order).

Javas implementation however is this:

@Override
public int compareTo(UUID val) {
    // The ordering is intentionally set up so that the UUIDs
    // can simply be numerically compared as two numbers
    return (this.mostSigBits < val.mostSigBits ? -1 :
            (this.mostSigBits > val.mostSigBits ? 1 :
                (this.leastSigBits < val.leastSigBits ? -1 :
                (this.leastSigBits > val.leastSigBits ? 1 :
                0))));
}

based on the two (signed) longs:

private final long mostSigBits;
private final long leastSigBits;
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement