Need explanation for algorithm searching minimal large sum

Tags: , ,



I’m solving Codility questions as practice and couldn’t answer one of the questions. I found the answer on the Internet but I don’t get how this algorithm works. Could someone walk me through it step-by-step? Here is the question:

 /*
  You are given integers K, M and a non-empty zero-indexed array A consisting of N integers.
  Every element of the array is not greater than M.
    You should divide this array into K blocks of consecutive elements.
    The size of the block is any integer between 0 and N. Every element of the array should belong to some block.
    The sum of the block from X to Y equals A[X] + A[X + 1] + ... + A[Y]. The sum of empty block equals 0.
    The large sum is the maximal sum of any block.
    For example, you are given integers K = 3, M = 5 and array A such that:
      A[0] = 2
      A[1] = 1
      A[2] = 5
      A[3] = 1
      A[4] = 2
      A[5] = 2
      A[6] = 2
    The array can be divided, for example, into the following blocks:
    [2, 1, 5, 1, 2, 2, 2], [], [] with a large sum of 15;
    [2], [1, 5, 1, 2], [2, 2] with a large sum of 9;
    [2, 1, 5], [], [1, 2, 2, 2] with a large sum of 8;
    [2, 1], [5, 1], [2, 2, 2] with a large sum of 6.
    The goal is to minimize the large sum. In the above example, 6 is the minimal large sum.
    Write a function:
    class Solution { public int solution(int K, int M, int[] A); }
    that, given integers K, M and a non-empty zero-indexed array A consisting of N integers, returns the minimal large sum.
    For example, given K = 3, M = 5 and array A such that:
      A[0] = 2
      A[1] = 1
      A[2] = 5
      A[3] = 1
      A[4] = 2
      A[5] = 2
      A[6] = 2
    the function should return 6, as explained above. Assume that:
    N and K are integers within the range [1..100,000];
    M is an integer within the range [0..10,000];
    each element of array A is an integer within the range [0..M].
    Complexity:
    expected worst-case time complexity is O(N*log(N+M));
    expected worst-case space complexity is O(1), beyond input storage (not counting the storage required for input arguments).
    Elements of input arrays can be modified.
 */

And here is the solution I found with my comments about parts which I don’t understand:

      public static int solution(int K, int M, int[] A) {
    int lower = max(A);  // why lower is max?
    int upper = sum(A);  // why upper is sum?
    while (true) {
      int mid = (lower + upper) / 2;
      int blocks = calculateBlockCount(A, mid); // don't I have specified number of blocks? What blocks do? Don't get that.
      if (blocks < K) {
        upper = mid - 1;
      } else if (blocks > K) {
        lower = mid + 1;
      } else {
        return upper;
      }
    }
  }

  private static int calculateBlockCount(int[] array, int maxSum) {
    int count = 0;
    int sum = array[0];
    for (int i = 1; i < array.length; i++) {
      if (sum + array[i] > maxSum) {
        count++;
        sum = array[i];
      } else {
        sum += array[i];
      }
    }
    return count;
  }

  // returns sum of all elements in an array
  private static int sum(int[] input) {
    int sum = 0;
    for (int n : input) {
      sum += n;
    }
    return sum;
  }

  // returns max value in an array
  private static int max(int[] input) {
    int max = -1;
    for (int n : input) {
      if (n > max) {
        max = n;
      }
    }
    return max;
  }

Answer

So what the code does is using a form of binary search (How binary search works is explained quite nicely here, https://www.topcoder.com/community/data-science/data-science-tutorials/binary-search/. It also uses an example quite similar to your problem.). Where you search for the minimum sum every block needs to contain. In the example case, you need the divide the array in 3 parts

When doing a binary search you need to define 2 boundaries, where you are certain that your answer can be found in between. Here, the lower boundary is the maximum value in the array (lower). For the example, this is 5 (this is if you divide your array in 7 blocks). The upper boundary (upper) is 15, which is the sum of all the elements in the array (this is if you divide the array in 1 block.)

Now comes the search part: In solution() you start with your bounds and mid point (10 for the example). In calculateBlockCount you count (count ++ does that) how many blocks you can make if your sum is a maximum of 10 (your middle point/ or maxSum in calculateBlockCount).
For the example 10 (in the while loop) this is 2 blocks, now the code returns this (blocks) to solution. Then it checks whether is less or more than K, which is the number of blocks you want. If its less than K your mid point is high because you’re putting to many array elements in your blocks. If it’s more than K, than your mid point is too high and you’re putting too little array elements in your array. Now after the checking this, it halves the solution space (upper = mid-1). This happens every loop, it halves the solution space which makes it converge quite quickly.

Now you keep going through your while adjusting the mid, till this gives the amount blocks which was in your input K.

So to go though it step by step:

Mid =10 , calculateBlockCount returns 2 blocks
solution. 2 blocks < K so upper -> mid-1 =9, mid -> 7  (lower is 5)
Mid =7 , calculateBlockCount returns 2 blocks  
solution() 2 blocks < K so upper -> mid-1 =6, mid -> 5 (lower is 5, cast to int makes it 5)
Mid =5 , calculateBlockCount returns 4 blocks
solution() 4 blocks < K so lower -> mid+1 =6, mid -> 6  (lower is 6, upper is 6
Mid =6 , calculateBlockCount returns 3 blocks
So the function returns mid =6....

Hope this helps,

Gl learning to code 🙂



Source: stackoverflow