I’m tinkering with some code and I realized something I never knew. A normal binary search will return a random index in a data set for a key that occurs more than once. How can I modify this code below to return the first occurrence? Is this something people do?
//ripped from the JDK public static int binarySearchValue(InvertedContainer.InvertedIndex[] a, long key) { return bSearchVal(a, 0, a.length, key); } private static int bSearchVal(InvertedContainer.InvertedIndex[] a, int fromIndex, int toIndex, long key) { int low = fromIndex; int high = toIndex - 1; while (low <= high) { int mid = (low + high) >>> 1; long midVal = a[mid].val; if (midVal < key) low = mid + 1; else if (midVal > key) high = mid - 1; else return mid; // key found } return (low); // key not found. return insertion point }
Advertisement
Answer
Having found a matching value, you basically need to walk up the collection until you find an entry which doesn’t match.
You could potentially make it faster by fetching the index of a key immediately lower than the one you were looking for, then do a binary chop between the two – but I’d probably go for the simpler version, which is likely to be “efficient enough” unless you’ve got a really large number of equal entries.