Skip to content
Advertisement

Merge sets when two elements in common

This is the follow up of compare sets

I have

Set<Set<Node>> NestedSet = new HashSet<Set<Node>>();

[[Node[0], Node[1], Node[2]], [Node[0], Node[2], Node[6]], [Node[3], Node[4], Node[5]]  [Node[2], Node[6], Node[7]] ]

I want to merge the sets when there are two elements in common. For example 0,1,2 and 0,2,6 has two elements in common so merging them to form [0,1,2,6].

Again [0,1,2,6] and [2,6,7] has 2 and 6 common. so merging them and getting [0,1,2,6,7].

The final output should be :

[  [Node[0], Node[1], Node[2], Node[6], Node[7]], [Node[3], Node[4], Node[5]] ]

I tried like this :

         for (Set<Node> s1 : NestedSet ) {
                        Optional<Set<Node>> findFirst = result.stream().filter(p -> { HashSet<Node> temp = new HashSet<>(s1); 
                        temp.retainAll(p); 
                        return temp.size() == 2; }).findFirst(); 

                        if (findFirst.isPresent()){

                            findFirst.get().addAll(s1); 
                        }
                        else {
                            result.add(s1);
                        }    

                    }

But the result I got was :

[[Node[0], Node[1], Node[2], Node[6], Node[7]], [Node[3], Node[4], Node[5]], [Node[0], Node[2], Node[6], Node[7]]]

Any idea ? Is there any way to get the desired output?

Answer

Some considerations:

  • Each time you apply a merge, you have to restart the procedure and iterate over the modified collection. Because of this, the iteration order of the input set is important, if you want your code to be deterministic you may want to use collections that give guarantees over their iteration order (e.g. use LinkedHashSet (not HashSet) or List.
  • Your current code has side effects as it modifies the supplied sets when merging. In general I think it helps to abstain from creating side effects whenever possible.

The following code does what you want:

static <T> List<Set<T>> mergeSets(Collection<? extends Set<T>> unmergedSets) {
  final List<Set<T>> mergedSets = new ArrayList<>(unmergedSets);

  List<Integer> mergeCandidate = Collections.emptyList();
  do {
    mergeCandidate = findMergeCandidate(mergedSets);

    // apply the merge
    if (!mergeCandidate.isEmpty()) {
      // gather the sets to merge
      final Set<T> mergedSet = Sets.union(
          mergedSets.get(mergeCandidate.get(0)),
          mergedSets.get(mergeCandidate.get(1)));

      // removes both sets using their index, starts with the highest index
      mergedSets.remove(mergeCandidate.get(0).intValue());
      mergedSets.remove(mergeCandidate.get(1).intValue());

      // add the mergedSet
      mergedSets.add(mergedSet);
    }
  } while (!mergeCandidate.isEmpty());

  return mergedSets;
}

// O(n^2/2)
static <T> List<Integer> findMergeCandidate(List<Set<T>> sets) {
  for (int i = 0; i < sets.size(); i++) {
    for (int j = i + 1; j < sets.size(); j++) {
      if (Sets.intersection(sets.get(i), sets.get(j)).size() == 2) {
        return Arrays.asList(j, i);
      }
    }
  }
  return Collections.emptyList();
}

For testing this method I created two helper methods:

static Set<Integer> set(int... ints) {
  return new LinkedHashSet<>(Ints.asList(ints));
}

@SafeVarargs
static <T> Set<Set<T>> sets(Set<T>... sets) {
  return new LinkedHashSet<>(Arrays.asList(sets));
}

These helper methods allow to write very readable tests, for example (using the numbers from the question):

public static void main(String[] args) {
    // prints [[2, 6, 7, 0, 1]]
    System.out.println(mergeSets(sets(set(0, 1, 2, 6), set(2, 6, 7))));
    // prints [[3, 4, 5], [0, 2, 6, 1, 7]]
    System.out.println(
      mergeSets(sets(set(0, 1, 2), set(0, 2, 6), set(3, 4, 5), set(2, 6, 7))));
}
Advertisement