Group by multiple fields and filter by common value of a field


public class Employee {

    private int empid;
    private String empPFcode;
    private String collegeName;

    // constructor and getters omitted
}

Employee emp1 = new Employee(1334090, "220", "AB");
Employee emp2 = new Employee(1334091, "220", "AB");
Employee emp3 = new Employee(1334092, "220", "AC");
Employee emp4 = new Employee(1434091, "221", "DP");
Employee emp5 = new Employee(1434091, "221", "DP");
Employee emp6 = new Employee(1434092, "221", "DP");

I want to filter these Employee objects based on empPFcode. If all employees that share an empPFcode also share the same collegeName, they should be collected; otherwise those records should be skipped.

So my result would be like below.

Employee emp4 = new Employee(1434091, "221", "DP");
Employee emp5 = new Employee(1434091, "221", "DP");
Employee emp6 = new Employee(1434092, "221", "DP");

The employees with empPFcode "220" are skipped because their collegeName values differ.

I tried the logic below, but it does not filter properly.

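// "employees" is an assumed name for the source list
List<Employee> distinctElements = employees.stream()
        .filter(distinctByKeys(Employee::getEmpPFcode, Employee::getCollegeName))
        .collect(Collectors.toList());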

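public static <T> Predicate<T> distinctByKeys(Function<? super T, Object>... keyExtractors) {
    Map<Object, Boolean> uniqueMap = new ConcurrentHashMap<>();

    return t -> {
        // Build the composite key from every extractor
        final List<?> keys = Arrays.stream(keyExtractors)
                .map(ke -> ke.apply(t))
                .collect(Collectors.toList());

        // True only the first time this key combination is seen
        return uniqueMap.putIfAbsent(keys, Boolean.TRUE) == null;
    };
}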

I. Solution:

A cleaner and more readable solution is to build the set of empPFcode values that qualify ([221]), then filter the employee list by that set.

First you can use Collectors.groupingBy() to group by empPFcode, then you can use Collectors.mapping(Employee::getCollegeName, Collectors.toSet()) to get a set of collegeName values.

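// "employees" is an assumed name for the List<Employee> being filtered
Map<String, Set<String>> pairMap = employees.stream()
        .collect(Collectors.groupingBy(Employee::getEmpPFcode,
        Collectors.mapping(Employee::getCollegeName, Collectors.toSet())));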

will result in: {220=[AB, AC], 221=[DP]}

Then you can remove the entries that contain more than one collegeName:

pairMap.values().removeIf(v -> v.size() > 1); 

will result in: {221=[DP]}

The last step is filtering the employee list by the key set. You can use the java.util.Set.contains() method inside the filter:

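// "employees" is an assumed name for the List<Employee> being filtered
List<Employee> distinctElements = employees.stream()
        .filter(emp -> pairMap.keySet().contains(emp.getEmpPFcode()))
        .collect(Collectors.toList());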
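
Put together, the three steps above can be sketched as one runnable program. This is only a minimal sketch: the class name FilterByCommonCollege, the helper names sampleEmployees and filterBySingleCollege, and the Employee constructor and getters are assumptions added for illustration, since the question does not show them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class FilterByCommonCollege {

    // Assumed shape of Employee: the question only shows the three fields.
    static class Employee {
        private final int empid;
        private final String empPFcode;
        private final String collegeName;

        Employee(int empid, String empPFcode, String collegeName) {
            this.empid = empid;
            this.empPFcode = empPFcode;
            this.collegeName = collegeName;
        }

        int getEmpid() { return empid; }
        String getEmpPFcode() { return empPFcode; }
        String getCollegeName() { return collegeName; }
    }

    // The six employees from the question.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee(1334090, "220", "AB"),
                new Employee(1334091, "220", "AB"),
                new Employee(1334092, "220", "AC"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434092, "221", "DP"));
    }

    static List<Employee> filterBySingleCollege(List<Employee> employees) {
        // Step 1: empPFcode -> set of distinct collegeName values
        Map<String, Set<String>> pairMap = employees.stream()
                .collect(Collectors.groupingBy(Employee::getEmpPFcode,
                        Collectors.mapping(Employee::getCollegeName, Collectors.toSet())));

        // Step 2: drop every empPFcode that maps to more than one collegeName
        pairMap.values().removeIf(v -> v.size() > 1);

        // Step 3: keep only employees whose empPFcode survived step 2
        return employees.stream()
                .filter(emp -> pairMap.containsKey(emp.getEmpPFcode()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Employee e : filterBySingleCollege(sampleEmployees())) {
            System.out.println(e.getEmpid() + " " + e.getEmpPFcode() + " " + e.getCollegeName());
        }
        // prints:
        // 1434091 221 DP
        // 1434091 221 DP
        // 1434092 221 DP
    }
}
```

Because the last step filters the original list, the surviving employees keep their original order. Note that pairMap.containsKey(...) is equivalent to pairMap.keySet().contains(...).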

II. Solution:

If you nest Collectors.groupingBy() calls, you'll get a Map<String,Map<String,List<Employee>>>:

   220 = {AB=[...], AC=[...]}, 
   221 = {DP=[...]}

Then you can filter by the size of the inner map (Map<String,List<Employee>>) to eliminate the entries whose inner map has more than one key (AB=[...], AC=[...]).

You still have a Map<String,Map<String,List<Employee>>> and you only need List<Employee>. To extract the employee list from the nested map, you can use flatMap().

Try this:

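// "employees" is an assumed name for the List<Employee> being filtered
List<Employee> distinctElements = employees.stream()
                .collect(Collectors.groupingBy(Employee::getEmpPFcode, Collectors.groupingBy(Employee::getCollegeName)))
                .entrySet().stream().filter(e -> e.getValue().size() == 1).flatMap(m -> m.getValue().values().stream())
                .flatMap(List::stream).collect(Collectors.toList());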
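
For reference, the nested-grouping approach can be sketched as a self-contained program. Again this is only a sketch under assumptions: the class name NestedGroupingFilter, the helper names sampleEmployees and filterByNestedGrouping, and the Employee constructor and getters are hypothetical, as the question does not show them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class NestedGroupingFilter {

    // Assumed shape of Employee: the question only shows the three fields.
    static class Employee {
        private final int empid;
        private final String empPFcode;
        private final String collegeName;

        Employee(int empid, String empPFcode, String collegeName) {
            this.empid = empid;
            this.empPFcode = empPFcode;
            this.collegeName = collegeName;
        }

        int getEmpid() { return empid; }
        String getEmpPFcode() { return empPFcode; }
        String getCollegeName() { return collegeName; }
    }

    // The six employees from the question.
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee(1334090, "220", "AB"),
                new Employee(1334091, "220", "AB"),
                new Employee(1334092, "220", "AC"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434092, "221", "DP"));
    }

    static List<Employee> filterByNestedGrouping(List<Employee> employees) {
        return employees.stream()
                // empPFcode -> (collegeName -> employees)
                .collect(Collectors.groupingBy(Employee::getEmpPFcode,
                        Collectors.groupingBy(Employee::getCollegeName)))
                .values().stream()
                // keep only the empPFcode groups with exactly one collegeName
                .filter(byCollege -> byCollege.size() == 1)
                // unwrap inner map -> lists -> employees
                .flatMap(byCollege -> byCollege.values().stream())
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (Employee e : filterByNestedGrouping(sampleEmployees())) {
            System.out.println(e.getEmpid() + " " + e.getEmpPFcode() + " " + e.getCollegeName());
        }
    }
}
```

The sketch streams over values() rather than entrySet() because the outer key is not needed after grouping. When more than one empPFcode group survives the filter, the groups appear in the map's iteration order, which may differ from the order of the input list; within each group the employees keep their original order.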

Source: stackoverflow