Skip to content

Divide dataset into n partition

Suppose I have an array a. I want to divide it in n partition. How can I perform the for function in Java? I tried this code but in some cases it is wrong.

public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int[] part = {0, 1, 2, 3};
        for (int i = 0; i < part.length; i++) {
            for (int j = ((a.length / part.length) * part[i]); j < ((a.length / part.length) * (part[i] + 1)); j++) {
                System.out.print(a[j] + " ");
            }
            System.out.println();
        }
    }
}

Output:

1 2 
3 4 
5 6 
7 8

Data 9 and 10 are missing. I do not need an equal size, but at least all data is well distributed. How to modify the for j function?

Answer

Here’s one possible solution (very naive approach, not benchmarked or optimised for performance):

int numOfPartitions = 4; 
double n = (double)a.length/numOfPartitions;
int start =0;
int partitionNum=1;
while(partitionNum<= numOfPartitions)
  {        
    int end = (int)java.lang.Math.ceil(n*partitionNum);
    for(int k = start;k<end;k++)
      {
        System.out.print(a[k] + " ");
      }
    start = end;
    ++partitionNum;
    System.out.println();
  }

First, see what would be the size of each partition (even if it is not integral). We will use it to slice up the array into sub-arrays.

Now, start slicing into sub-arrays from the first element of the input array. We use a strategy of taking the largest integer closest to (computed partition size * current partition number) as the upper bound. In this example, we have size = 10/4 = 2.5, so our indices to break into sub-arrays will be 3 (for 2.5), 5, 8 (for 7.5) and 10.

The first sub-array will take the elements from index 0 of input until index 2, since the first computed index is 3. The next sub-array starts at index 3 and goes till index 5, and so on.

This way, we end up with the sub-arrays having number of elements differing at most by 1 from each other.

Note: If we have more partitions than number of elements in input, we can just return the entire array.