Java Streams – group-by and return a Nested Map

My data looks like this:

unitId  time  value1 value2
 a      2021    10    11   
 a      2022    15    13
 b      2021    20    25
 b      2022    30    37

My goal is to put every unitId and its values into a nested map like this:

{
  'a': {'2021_value1': 10, '2021_value2': 11, '2022_value1': 15, '2022_value2': 13},
  'b': {'2021_value1': 20, '2021_value2': 25, '2022_value1': 30, '2022_value2': 37},
}

I have already figured out two ways to achieve that. Here is my code:

public class Unit {

    public String unitId;

    public Integer year;

    public Integer value1;

    public Integer value2;

    public static Unit of(String unitId, Integer year, Integer value1, Integer value2) {
        Unit unit = new Unit();
        unit.unitId = unitId;
        unit.year = year;
        unit.value1 = value1;
        unit.value2 = value2;
        return unit;
    }

}

and the test class:

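import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class UnitTest {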

    private static void printMap(Map<String, Map<String, Integer>> map) {
        map.forEach((k, v) -> {
            String vStr = v.entrySet().stream().map(a -> String.format("%s: %s", a.getKey(), a.getValue())).collect(Collectors.joining(", "));
            System.out.printf("%s: {%s}%n", k, vStr);
        });
    }

    public static void main(String[] args) {
        List<Unit> list = new ArrayList<>();
        list.add(Unit.of("a", 2021, 10, 11));
        list.add(Unit.of("a", 2022, 15, 13));
        list.add(Unit.of("b", 2021, 20, 25));
        list.add(Unit.of("b", 2022, 30, 37));

        Map<String, Map<String, Integer>> map1 = list.stream().collect(
            Collectors.groupingBy(
                x -> x.unitId,
                Collector.of(
                    HashMap::new,
                    (x, y) -> {
                        x.put(String.format("%s_%s", y.year, "value1"), y.value1);
                        x.put(String.format("%s_%s", y.year, "value2"), y.value2);
                    },
                    (x, y) -> {x.putAll(y); return x;}
                )
            )
        );

        Map<String, Map<String, Integer>> map2 = list.stream().collect(
            Collectors.groupingBy(
                x -> x.unitId,
                Collectors.collectingAndThen(
                    Collectors.toList(),
                    x -> x.stream()
                        .flatMap(y -> Stream.of(
                                    new AbstractMap.SimpleEntry<>(String.format("%s_%s", y.year, "value1"), y.value1),
                                    new AbstractMap.SimpleEntry<>(String.format("%s_%s", y.year, "value2"), y.value2)
                             ))
                        .collect(Collectors.toMap(
                                     AbstractMap.SimpleEntry::getKey, 
                                     AbstractMap.SimpleEntry::getValue)))
            )
        );
        printMap(map1);
        printMap(map2);
    }
}

The first one amounts to writing the processing by hand, and the second one uses temporary lists that may not be necessary. Is there a more direct or simpler way to do this, for example with the Collectors.toMap API or something else?

Answer

Is there any direct or simple way to do this, like use Collectors.toMap API or something else?

If you want to use only built-in collectors, you can try a combination of groupingBy() and teeing() (the latter is available since Java 12).

Collectors.teeing() expects three arguments: two downstream collectors and a merger function. Each element of the stream is passed to both collectors, and once both are done, the merger function combines their two results into one.
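To see teeing() in isolation before it is combined with groupingBy(), here is a minimal sketch. The class name TeeingSketch and the helper average() are made up for illustration; only Collectors.teeing(), summingInt() and counting() come from the JDK.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TeeingSketch {

    // Hypothetical helper: computes an average by running two collectors in a single pass.
    static double average(List<Integer> numbers) {
        return numbers.stream().collect(
            Collectors.teeing(
                Collectors.summingInt(Integer::intValue), // first downstream collector: the sum
                Collectors.counting(),                    // second downstream collector: the count
                // merger function: combines both results into one value
                (sum, count) -> count == 0 ? 0.0 : sum.doubleValue() / count));
    }

    public static void main(String[] args) {
        System.out.println(average(List.of(10, 15, 20, 30))); // prints 18.75
    }
}
```

The solution to the question follows the same pattern, only with different downstream collectors and a merger that combines two maps.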

In the solution below, toMap() serves as both downstream collectors of teeing(). Each of the two collectors is responsible for extracting one kind of value: the first collects value1, the second collects value2.

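The code might look like this (it assumes that Unit exposes the getters getUnitId(), getYear(), getValue1() and getValue2(), which the class in the question doesn't declare):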

public static void main(String[] args) {
    List<Unit> list =
        List.of(Unit.of("a", 2021, 10, 11),
                Unit.of("a", 2022, 15, 13),
                Unit.of("b", 2021, 20, 25),
                Unit.of("b", 2022, 30, 37));

    Map<String, Map<String, Integer>> map = list.stream()
        .collect(Collectors.groupingBy(Unit::getUnitId,
            Collectors.teeing(
                Collectors.toMap(
                    unit -> unit.getYear() + "_value1",
                    Unit::getValue1),
                Collectors.toMap(
                    unit -> unit.getYear() + "_value2",
                    Unit::getValue2),
                (values1, values2) -> {values1.putAll(values2); return values1;})
        ));

    printMap(map);
}

Output:

a: {2022_value2: 13, 2021_value1: 10, 2022_value1: 15, 2021_value2: 11}
b: {2022_value2: 37, 2021_value1: 20, 2022_value1: 30, 2021_value2: 25}

Note:

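  • If performance is a concern, Collector.of() would be slightly better: it fills a single map per group, whereas teeing() builds two intermediate maps per group and then merges them.
  • Both the code listed above and the code in the question assume that each combination of unitId and year is unique. With duplicates, the two-argument toMap() throws an IllegalStateException, while the put()-based Collector.of() silently overwrites the earlier value. If duplicates can occur, consider adding logic for resolving them.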
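One possible way to resolve duplicates is the three-argument overload of Collectors.toMap(), which takes a merge function. The sketch below is only an illustration: the class DuplicateKeySketch, its nested stand-in Unit class and the helper group() are made up, and the "last value wins" policy is an assumption (summing or keeping the first value would be equally valid, depending on the data).

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateKeySketch {

    // Minimal stand-in for the Unit class from the question (hypothetical, fields only).
    static class Unit {
        final String unitId;
        final int year;
        final int value1;
        final int value2;

        Unit(String unitId, int year, int value1, int value2) {
            this.unitId = unitId;
            this.year = year;
            this.value1 = value1;
            this.value2 = value2;
        }
    }

    // Groups by unitId; when a unitId/year pair repeats, the later value replaces the earlier one.
    static Map<String, Map<String, Integer>> group(List<Unit> units) {
        return units.stream().collect(Collectors.groupingBy(
            unit -> unit.unitId,
            Collectors.teeing(
                Collectors.toMap(
                    unit -> unit.year + "_value1",
                    unit -> unit.value1,
                    (oldValue, newValue) -> newValue), // assumed policy: last value wins
                Collectors.toMap(
                    unit -> unit.year + "_value2",
                    unit -> unit.value2,
                    (oldValue, newValue) -> newValue), // assumed policy: last value wins
                (values1, values2) -> { values1.putAll(values2); return values1; })));
    }

    public static void main(String[] args) {
        List<Unit> units = List.of(
            new Unit("a", 2021, 10, 11),
            new Unit("a", 2021, 99, 98), // same unitId and year as the previous element
            new Unit("b", 2022, 30, 37));
        System.out.println(group(units));
    }
}
```

With this merge function the duplicate no longer causes an exception, and the nested map for "a" ends up holding the values of the later element.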