
Mongo toList() java.lang.OutOfMemoryError: Java heap space

I am trying to fetch some data from MongoDB, but my k8s pods keep hitting:

Terminating due to java.lang.OutOfMemoryError: Java heap space

According to the heap dump, this code seems to be causing the trouble:

try (CloseableIterator<DocumentAnnotation> iter =
         mongoTemplate.stream(query(criteria),
                              DocumentAnnotation.class,
                              ANNOTATIONS_COLLECTION_NAME)) {
    return StreamSupport.stream(
        Spliterators.spliteratorUnknownSize(iter, Spliterator.ORDERED), false)
                        .filter(annotation -> isAnnotationAcceptedByFilter(annotation))
                        .collect(Collectors.toList());
}

In general, it creates an iterator using the Mongo driver's streaming API and iterates through all annotations that the database returns for the given criteria. The MongoDB driver seems to read annotations in batches of 47427 items (at least, that is what I see in the heap dump). Most of them are rejected by the filter in Java and never returned to the client, but each such request still allocates about 100MB of RAM to hold the batch, and that is causing the problem.

Does anybody know if that bulk size is configurable?

Thanks

Answer

Based on what you have said in the comments, my opinion is that you have misdiagnosed the problem. The batch size (or “bulk size” as you called it) is not the problem, and changing the internal batch size for the Mongo driver won’t fix it. The real problem is that, even after filtering, the list you are creating from the stream is too large for the Java heap size that you are using.

There are two possible approaches to solving this:

  • Instead of putting the annotations into a List, iterate the stream and process the annotations as you get them.

  • Figure out a way to extract the annotations in batches. Then get a separate list of the annotations in each batch.
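
To make the two approaches concrete, here is a minimal sketch using only the JDK. The class and method names (AnnotationProcessing, processEach, processInBatches) are made up for illustration, and a plain Iterator stands in for the CloseableIterator from the question, so this assumes you can pass your iter and your isAnnotationAcceptedByFilter check in as the source and the predicate. It is not tied to any Mongo API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical helper, not part of Spring Data or the Mongo driver.
final class AnnotationProcessing {

    // Approach 1: handle each accepted item as it arrives.
    // Nothing is accumulated, so memory use does not grow with the result size.
    static <T> long processEach(Iterator<T> source,
                                Predicate<? super T> accept,
                                Consumer<? super T> handler) {
        long handled = 0;
        while (source.hasNext()) {
            T item = source.next();
            if (accept.test(item)) {
                handler.accept(item);
                handled++;
            }
        }
        return handled;
    }

    // Approach 2: hand over accepted items in lists of at most batchSize elements.
    // Only one batch is held by this method at any time.
    static <T> int processInBatches(Iterator<T> source,
                                    Predicate<? super T> accept,
                                    int batchSize,
                                    Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            T item = source.next();
            if (!accept.test(item)) {
                continue;
            }
            batch.add(item);
            if (batch.size() == batchSize) {
                handler.accept(batch);
                batches++;
                // Start a fresh list so the handler may keep the previous one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            handler.accept(batch); // final, possibly shorter batch
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in data; in the question this would be the Mongo iterator.
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        Predicate<Integer> isEven = x -> x % 2 == 0;

        long items = processEach(data.iterator(), isEven,
                x -> System.out.println("item " + x));
        int batches = processInBatches(data.iterator(), isEven, 2,
                b -> System.out.println("batch " + b));
        System.out.println(items + " items, " + batches + " batches");
    }
}
```

Either way, the handler has to finish with an item (or a batch) and let go of it before the next one arrives. If the handler just adds everything to another collection, you are back to the same heap problem.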

(In other circumstances, I would suggest trying to do the filtering in the MongoDB query itself. But that won’t help to solve your OOME problem.)

But if you need all of the annotations in memory at the same time in order to process them, then your only practical option will be to get more memory.

User contributions licensed under: CC BY-SA