Overview
Javas UUID
class implements Comparable
. But the order it implements appears to be incompatible with the specificiation given in RFC 4122.
In particular, it is inconsistent with the natural order implied by its string representation (uuid1.toString().compareTo(uuid2.toString())
), which lines up with the RFC.
Example
You can reproduce and observe the problem by using the following code:
UUID uuid1 = UUID.randomUUID(); UUID uuid2 = UUID.randomUUID(); Assert.assertEquals( Math.signum((int) uuid1.compareTo(uuid2)), Math.signum((int) uuid1.toString().compareTo(uuid2.toString())));
Details
My main problem with this is that almost all other tools and languages seem to be consistent and compatible with RFC 4122, but Java is not.
In my particular case, I am using postgresql 13 and order by a column that contains the UUID, e.g. myColumnd::UUID
or myColumnd::text
(using uuid_v4
), but the order I obtain by this differs from the order obtained with Java.
Advertisement
Answer
Reason
What you are observing here is a known bug which will not be fixed anymore to preserve backwards compatibility.
For details, see JDK-7025832:
Though the bug is accurate that the
compareTo
implementation is not consistent with other implementations the JavaUUID.compareTo()
method must remain consistent among versions of Java. ThecompareTo()
function is used primarily for sorting and the sort order of UUIDs must remain stable from version to version of Java.
Signed comparison
The underlying root problem is that Javas long
type is a signed type but the reference implementation from RFC 4122, and implementations in other tools and languages, do the math with unsigned types.
This results in small differences in the outcome of the order, since the point where the numbers over-/underflow is different. E.g. Long.MAX_NUMBER
is bigger than LONG.MAX_NUMBER + 1
, but not for their unsigned counterparts.
The issue with Javas implementation was detected too late and now we have to live with this incompatibility.
Implementation Appendix
Here is the correct reference implementation from RFC 4122:
/* uuid_compare -- Compare two UUID's "lexically" and return */ #define CHECK(f1, f2) if (f1 != f2) return f1 < f2 ? -1 : 1; int uuid_compare(uuid_t *u1, uuid_t *u2) { int i; CHECK(u1->time_low, u2->time_low); CHECK(u1->time_mid, u2->time_mid); CHECK(u1->time_hi_and_version, u2->time_hi_and_version); CHECK(u1->clock_seq_hi_and_reserved, u2->clock_seq_hi_and_reserved); CHECK(u1->clock_seq_low, u2->clock_seq_low) for (i = 0; i < 6; i++) { if (u1->node[i] < u2->node[i]) return -1; if (u1->node[i] > u2->node[i]) return 1; } return 0; } #undef CHECK
defined on the struct
typedef struct { unsigned32 time_low; unsigned16 time_mid; unsigned16 time_hi_and_version; unsigned8 clock_seq_hi_and_reserved; unsigned8 clock_seq_low; byte node[6]; } uuid_t;
as you see, they compare the nodes, which are byte
, one by one (in the correct order).
Javas implementation however is this:
@Override public int compareTo(UUID val) { // The ordering is intentionally set up so that the UUIDs // can simply be numerically compared as two numbers return (this.mostSigBits < val.mostSigBits ? -1 : (this.mostSigBits > val.mostSigBits ? 1 : (this.leastSigBits < val.leastSigBits ? -1 : (this.leastSigBits > val.leastSigBits ? 1 : 0)))); }
based on the two (signed) longs:
private final long mostSigBits; private final long leastSigBits;