Codility: absolute distinct count from an array

Question

So I took the Codility interview test yesterday and was informed today that I failed. Unfortunately, I wasn't given any other information by either Codility or the employer as to where I screwed up, so I would appreciate some help in knowing where I went wrong. I know Codility puts a lot of emphasis on how fast the program runs and

Accepted Answer

If the array is sorted, you can find duplicates by looking at neighbours. To compare absolute values, you need to start at both the start and the end. This avoids creating a new structure.

EDIT: IMHO HashMap/HashSet is O(log(log(n))) due to collisions; it is only O(1) if there is a perfect hash function. I would have thought that not creating objects would be much, much faster, but it appears to be only about 4x faster on my machine.

In summary, you can see that using a Set is simpler, clearer and easier to maintain. It is still very fast and would be the best solution in 98% of cases.

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public static void main(String[] args) throws Exception {
        for (int len : new int[]{100 * 1000 * 1000, 10 * 1000 * 1000, 1000 * 1000, 100 * 1000, 10 * 1000, 1000}) {
            // Random values in roughly (-1000, 1000), sorted as the problem requires.
            int[] nums = new int[len];
            for (int i = 0; i < len; i++)
                nums[i] = (int) (Math.random() * (Math.random() * 2001 - 1000));
            Arrays.sort(nums);

            long timeArray = 0;
            long timeSet = 0;
            int runs = len > 1000 * 1000 ? 10 : len >= 100 * 1000 ? 100 : 1000;
            for (int i = 0; i < runs; i++) {
                long time1 = System.nanoTime();
                int count = countDistinct(nums);
                long time2 = System.nanoTime();
                int count2 = countDistinctUsingSet(nums);
                long time3 = System.nanoTime();
                timeArray += time2 - time1;
                timeSet += time3 - time2;
                assert count == count2;
            }
            System.out.printf("For %,d numbers, using an array took %,d us on average, using a Set took %,d us on average, ratio=%.1f%n",
                    len, timeArray / 1000 / runs, timeSet / 1000 / runs, 1.0 * timeSet / timeArray);
        }
    }

    // Walks inwards from both ends of the sorted array, always consuming the
    // side with the larger absolute value. Expects a non-empty array.
    private static int countDistinct(int[] nums) {
        int lastLeft = Math.abs(nums[0]);
        int lastRight = Math.abs(nums[nums.length - 1]);
        int count = 0;
        for (int a = 1, b = nums.length - 2; a <= b;) {
            int left = Math.abs(nums[a]);
            int right = Math.abs(nums[b]);
            if (left == lastLeft) {
                // Duplicate of the previous left value: skip it.
                a++;
                lastLeft = left;
            } else if (right == lastRight) {
                // Duplicate of the previous right value: skip it.
                b--;
                lastRight = right;
            } else if (lastLeft == lastRight) {
                // Both ends hold the same absolute value: count it once.
                a++;
                b--;
                lastLeft = left;
                lastRight = right;
                count++;
            } else if (lastLeft > lastRight) {
                count++;
                a++;
                lastLeft = left;
            } else {
                count++;
                b--;
                lastRight = right;
            }
        }
        // The last value on each side has not been counted yet.
        count += (lastLeft == lastRight ? 1 : 2);
        return count;
    }

    private static int countDistinctUsingSet(int[] nums) {
        Set<Integer> s = new HashSet<>();
        for (int n : nums)
            s.add(Math.abs(n));
        int count = s.size();
        return count;
    }

prints

    For 100,000,000 numbers, using an array took 279,623 us on average, using a Set took 1,270,029 us on average, ratio=4.5
    For 10,000,000 numbers, using an array took 28,525 us on average, using a Set took 126,591 us on average, ratio=4.4
    For 1,000,000 numbers, using an array took 2,846 us on average, using a Set took 12,131 us on average, ratio=4.3
    For 100,000 numbers, using an array took 297 us on average, using a Set took 1,239 us on average, ratio=4.2
    For 10,000 numbers, using an array took 42 us on average, using a Set took 156 us on average, ratio=3.7
    For 1,000 numbers, using an array took 8 us on average, using a Set took 30 us on average, ratio=3.6

On @Kevin K's point, even Integer keys can collide: even though their hash values are unique, they can map to the same bucket because the capacity is limited.

    import java.util.HashMap;
    import java.util.Map;

    public static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) throws Exception {
        // Capacity 32 with load factor 2.0, so the table is not resized before 64 entries.
        Map<Integer, Integer> map = new HashMap<>(32, 2.0f);
        for (int i = 0; i < 10000 && map.size() < 32 * 2; i++) {
            // Keep only keys whose hash lands in bucket 0.
            if (hash(i) % 32 == 0)
                map.put(i, i);
        }
        System.out.println(map.keySet());
    }

prints

    [2032, 2002, 1972, 1942, 1913, 1883, 1853, 1823, 1763, 1729, 1703, 1669, 1642, 1608, 1582, 1548, 1524, 1494, 1456, 1426, 1405, 1375, 1337, 1307, 1255, 1221, 1187, 1153, 1134, 1100, 1066, 1032, 1016, 986, 956, 926, 881, 851, 821, 791, 747, 713, 687, 653, 610, 576, 550, 516, 508, 478, 440, 410, 373, 343, 305, 275, 239, 205, 171, 137, 102, 68, 34, 0]

The values are in reverse order because the HashMap has degenerated into a linked list.
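
For readers who want a shorter version of the two-ended walk described at the start of this answer, here is a minimal sketch. It is not the code that was benchmarked above; the class name `AbsDistinctSketch` and the method names are made up for illustration. It assumes the input is already sorted in ascending order, and it widens to `long` before taking the absolute value so that `Integer.MIN_VALUE` does not overflow.

```java
import java.util.HashSet;
import java.util.Set;

class AbsDistinctSketch {

    // Counts distinct absolute values in an ascending-sorted array.
    // In a sorted array the largest remaining absolute value is always at
    // one of the two ends, so each pass counts it once and then skips every
    // copy of it on both sides.
    static int countAbsDistinct(int[] sorted) {
        int count = 0;
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            long left = Math.abs((long) sorted[lo]);
            long right = Math.abs((long) sorted[hi]);
            long current = Math.max(left, right);
            count++;
            // Skip all copies of the current value at the left end.
            while (lo <= hi && Math.abs((long) sorted[lo]) == current) lo++;
            // Skip all copies of the current value at the right end.
            while (lo <= hi && Math.abs((long) sorted[hi]) == current) hi--;
        }
        return count;
    }

    // Set-based reference used only to cross-check the sketch.
    static int countAbsDistinctUsingSet(int[] nums) {
        Set<Long> seen = new HashSet<>();
        for (int n : nums) seen.add(Math.abs((long) n));
        return seen.size();
    }

    public static void main(String[] args) {
        int[] sample = {-5, -3, -1, 0, 3, 6};
        // Absolute values are 5, 3, 1, 0, 3, 6, so both lines print 5.
        System.out.println(countAbsDistinct(sample));
        System.out.println(countAbsDistinctUsingSet(sample));
    }
}
```

Like the original array version, this allocates nothing and makes a single pass, so it should sit on the same side of the array-versus-Set trade-off; it has not been timed here.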

Answer