How to count array elements properly with multiple aggregation in Spring Data MongoDB?

Tags: , , ,



I need to create advanced aggregation using Spring Data MongoDB having model like that:

@Getter
@Setter
@Document
public class City {

  @Id
  @JsonSerialize(using = ToStringSerializer.class)
  private ObjectId id;

  private Address description;

  private String name;

  ...

}

@Getter
@Setter
@Document
public class Library {

  @Id
  @JsonSerialize(using = ToStringSerializer.class)
  private ObjectId id;

  private Address address;

  private String workingHours;

  @JsonSerialize(using = ToStringSerializer.class)
  private ObjectId cityId;

  ...

}

@Getter
@Setter
@Document
public class Book {

  @Id
  @JsonSerialize(using = ToStringSerializer.class)
  private ObjectId id;

  private Boolean published;

  private Boolean hidden;

  private String title;

  @JsonSerialize(using = ToStringSerializer.class)
  private ObjectId libraryId;

  ...

}

pom.xml

<dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-mongodb</artifactId>
</dependency>
<dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-mongodb</artifactId>
        <version>2.2.0</version>
</dependency>

Cities collection:

{ 
    "_id" : ObjectId("5f47878c95f47e209402fe15"), 
    "name" : "Warsaw",
    "description" : "Sample description"
}
{ 
    "_id" : ObjectId("5f4787918b343fff4f52c270"), 
    "name" : "Chicago",
    "description" : "Sample description"
}

Libraries collection:

{ 
    "_id" : ObjectId("5f45440ee89590218e83a697"), 
    "workingHours" : "8:00 PM - 8:00 AM",
    "address" : DBRef("addresses", ObjectId("5f4544198da452a5523e3d11")),
    "cityId": ObjectId("5f47878c95f47e209402fe15")
},
{ 
    "_id" : ObjectId("5f478725d1507323a80efa31"), 
    "workingHours" : "8:00 PM - 8:00 AM",
    "address" : DBRef("addresses", ObjectId("5f4787379e72f882e4d26912")),
    "cityId": ObjectId("5f47878c95f47e209402fe15")
},
{ 
    "_id" : ObjectId("5f47872f7c4872d4983961f5"), 
    "workingHours" : "8:00 PM - 8:00 AM",
    "address" : DBRef("addresses", ObjectId("5f47873d5ddedadb3d6ddd6e")),
    "cityId": ObjectId("5f4787918b343fff4f52c270")
}

Books collection:

{ 
    "_id" : ObjectId("5f454423be823729015661ed"), 
    "published": true,
    "hidden": false,
    "title": "The Hobbit, or There and Back Again"
    "libraryId": ObjectId("5f45440ee89590218e83a697")
},
{ 
    "_id" : ObjectId("5f45445b876d08649b88ed5a"), 
    "published": true,
    "hidden": false,
    "title": "Harry Potter and the Philosopher's Stone"
    "libraryId": ObjectId("5f45440ee89590218e83a697")
},
{ 
    "_id" : ObjectId("5f45446c7e33ca70363f629a"), 
    "published": true,
    "hidden": false,
    "title": "Harry Potter and the Cursed Child"
    "libraryId": ObjectId("5f45440ee89590218e83a697")
},
{ 
    "_id" : ObjectId("5f45447285f9b3e4cb8739ad"), 
    "published": true,
    "hidden": false,
    "title": "Fantastic Beasts and Where to Find Them"
    "libraryId": ObjectId("5f45440ee89590218e83a697")
},
{ 
    "_id" : ObjectId("5f45449fc121a20afa4fbb96"), 
    "published": false,
    "hidden": false,
    "title": "Universal Parks & Resorts"
    "libraryId": ObjectId("5f45440ee89590218e83a697")
},
{ 
    "_id" : ObjectId("5f4544a5f13839bbe89edb23"), 
    "published": false,
    "hidden": true,
    "title": "Ministry of Dawn"
    "libraryId": ObjectId("5f45440ee89590218e83a697")
}

Depending on the context of the user, I have to return cities with count of libraries and books in the city that can be filtered based on startsWith() or like() principle.

Assuming that I have 2 libraries in one city and 1 library in the other.

  1. I need to count libraries using lookup first and return librariesCount – it will be 2 and 1.
  2. I need to fetch / lookup books in every library, then count them as ‘booksCount’ and then multiply by librariesCount to get total amount of booksCount in the city (let’s call it cityBooksCount).

I came up with aggregation like this:

Criteria criteria = Criteria.where("_id");

MatchOperation matchOperation = Aggregation.match(criteria);
            
LookupOperation lookupOperation = LookupOperation.newLookup().from("libraries").localField("_id").foreignField("cityId").as("libraries");

UnwindOperation unwindOperation = Aggregation.unwind("libraries", true);

LookupOperation secondLookupOperation = LookupOperation.newLookup().
              from("books").
              localField("libraryIdArray").
              foreignField("libraryId").
              as("books");

UnwindOperation secondUnwindOperation = Aggregation.unwind("books", true);

AggregationOperation group = Aggregation.group("_id")
            .first("_id").as("id")
            .first("name").as("name")
            .first("description").as("description")
            .push("libraries").as("libraries")
            .push("books").as("books");

ProjectionOperation projectionOperation = Aggregation.project("id", "description", "name")              
.and(VariableOperators.mapItemsOf(ConditionalOperators.ifNull("libraries").then(Collections.emptyList()))
.as("library").andApply(aggregationOperationContext -> {
                  Document document = new Document();
                  document.append("id", "$$library._id");
                  return document;
              })).as("libraryIdArray")
.and(ConvertOperators.valueOf(ArrayOperators.Size.lengthOfArray(ConditionalOperators.ifNull("libraries").then(Collections.emptyList()))).convertToString()).as("librariesCount")        
.and(ConvertOperators.valueOf(ArrayOperators.Size.lengthOfArray(ConditionalOperators.ifNull("books").then(Collections.emptyList()))).convertToString()).as("cityBooksCount");

Aggregation aggregation = Aggregation.newAggregation(matchOperation, lookupOperation, unwindOperation, secondLookupOperation, secondUnwindOperation, group, projectionOperation);
            
mongoTemplate.aggregate(aggregation, "cities", Document.class).getRawResults().get("results");

Thanks to the help of one of the stackoverflow users I was able to obtain librariesCount in proper way. Unfortunately cityBooksCount always point to 0.

I’m not so familiar with MongoDB, but I know that $lookup operation is possible on array, so I’ve tried mapping libraries array to list of ObjectId, but it’s not working properly. Probably I’m doing something wrong, but I don’t know where is the problem. I get the proper amount of cities with other projected fields.

Can anyone tell me what I am doing wrong and how to correct it?

Thank you in advance.

Answer

This might be giving you the expected answer.

db.cities.aggregate([
  {
    "$lookup": {
      "from": "Libraries",
      "localField": "_id",
      "foreignField": "cityId",
      "as": "libraries"
    }
  },
  {
    $unwind: {
      path: "$libraries",
      preserveNullAndEmptyArrays: true
    }
  },
  {
    "$lookup": {
      "from": "Books",
      "localField": "libraries._id",
      "foreignField": "libraryId",
      "as": "books"
    }
  },
  {
    $unwind: {
      path: "$books",
      preserveNullAndEmptyArrays: true
    }
  },
  {
    $group: {
      _id: "$_id",
      name: {
        $first: "$name"
      },
      description: {
        $first: "$description"
      },
      libraries: {
        $push: "$libraries"
      },
      books: {
        $push: "$books"
      }
    }
  },
  {
    $project: {
      _id: 1,
      name: 1,
      description: 1,
      libraryCount: {
        $size: "$libraries"
      },
      bookCount: {
        $size: "$books"
      }
    }
  }
])

As we discussed, there are some slight changes. Hope, you understood how to convert the mongo query to spring data aggregation.



Source: stackoverflow