
Java Stream API – Group by items of an object’s inner list

Is there a way to achieve the following example code, leveraging Java Stream API rather than having to create a HashMap and populate it inside double forEaches? I was trying to play around with groupingBy and flatMap but couldn’t find a way out.

Having a list of Movies, where each one has a list of genres (Strings)…

class Movie {
    List<String> genres;

    // Getter used by the snippets below
    List<String> getGenres() {
        return genres;
    }
}
List<Movie> movies = new ArrayList<>();

…I want to group all the movies by genre

Map<String, List<Movie>> moviesByGenre = new HashMap<>();
movies.stream()
        .forEach(movie -> movie.getGenres()
                .stream()
                .forEach(genre -> moviesByGenre
                        .computeIfAbsent(genre, k -> new ArrayList<>())
                        .add(movie)));

Answer

This one is tricky because you cannot define a key for each Movie since such an object can appear under multiple keys.

As far as I know, the best solution is essentially the same as yours:

Map<String, List<Movie>> groupedMovies = new HashMap<>();
movies.forEach(movie -> {
    movie.getGenres().forEach(genre ->
        groupedMovies.computeIfAbsent(genre, g -> new ArrayList<>()).add(movie)
    );
});

If you want to “convert” this snippet into a Stream pipeline, you have to start with what you have, which is the individual genres. Extract them from each Movie using flatMap, and apply distinct to avoid duplicates. Then use Collectors.toMap to get the desired output.

  • Key: Function.identity(), so that each unique genre becomes a key itself.
  • Value: another Stream over movies that keeps only the movies containing that genre and collects them into the list assigned to the key.

Map<String, List<Movie>> groupedMovies = movies.stream()
    .map(Movie::getGenres)
    .flatMap(List::stream)
    .distinct()
    .collect(Collectors.toMap(
            Function.identity(),
            genre -> movies.stream()
                           .filter(movie -> movie.getGenres().contains(genre))
                           .collect(Collectors.toList())));
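
For comparison, here is a minimal sketch of a single-pass variant along the lines the question was attempting with flatMap and groupingBy. It is not part of the original answer: the idea is to emit one (genre, movie) pair per genre and then group the pairs by genre. The class name GroupByGenreSketch, the title field and the sample data are assumptions added only to make the example self-contained.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByGenreSketch {

    // Hypothetical Movie: the title is an assumption added for readable output.
    static class Movie {
        private final String title;
        private final List<String> genres;

        Movie(String title, List<String> genres) {
            this.title = title;
            this.genres = genres;
        }

        String getTitle() { return title; }
        List<String> getGenres() { return genres; }
    }

    static Map<String, List<Movie>> groupByGenre(List<Movie> movies) {
        return movies.stream()
                // Emit one (genre, movie) pair for every genre of every movie.
                .flatMap(movie -> movie.getGenres().stream()
                        .map(genre -> new SimpleEntry<>(genre, movie)))
                // Group the pairs by genre, keeping only the movie of each pair.
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Movie> movies = Arrays.asList(
                new Movie("A", Arrays.asList("Action", "Comedy")),
                new Movie("B", Arrays.asList("Comedy", "Drama")));

        List<String> comedies = groupByGenre(movies).get("Comedy").stream()
                .map(Movie::getTitle)
                .collect(Collectors.toList());
        System.out.println(comedies); // prints [A, B]
    }
}
```

Unlike the toMap version, this walks the movie list only once, at the cost of allocating a temporary pair per genre. If a movie lists the same genre twice, it will appear twice under that key, just as in the computeIfAbsent loop.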

The procedural approach in the first snippet is faster and easier to read and understand: the toMap version traverses movies again for every distinct genre. I don’t recommend using the Stream API here.


A note: there is no point in calling forEach right after stream(). The sequence list.stream().forEach(...) can be written as list.forEach(...) instead.
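
A small sketch of that equivalence, assuming a plain sequential stream over a List. The class and method names are hypothetical and serve only as illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachSketch {

    // Visits the elements through an intermediate Stream.
    static List<String> viaStream(List<String> genres) {
        List<String> seen = new ArrayList<>();
        genres.stream().forEach(seen::add);
        return seen;
    }

    // Visits the same elements directly via Iterable.forEach.
    static List<String> viaIterable(List<String> genres) {
        List<String> seen = new ArrayList<>();
        genres.forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<String> genres = Arrays.asList("Action", "Comedy", "Drama");
        // Both calls collect the same elements for this sequential case.
        System.out.println(viaStream(genres).equals(viaIterable(genres)));
    }
}
```

The stream() call only earns its place once intermediate operations such as filter or map sit between it and the terminal forEach.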

User contributions licensed under: CC BY-SA