Getting oldest n elements of a Java list with conditional logic applied



Java 8. I have a list of widgets (List<Widget>) where a widget looks like:

public class Widget {

  // lots of fields
  private Date createTime;

  // getters, setters & ctors
  
}

All createTime values will be in the past; none in the future. And the present doesn’t really exist. 😉

I don’t have any control over how the list of widgets gets created, or how many get created. But the list of widgets might be huge and so I need to use this list of widgets to figure out which ones I want to delete (deleteList). However, I need to apply some Stream API filters/magic to this list so that I wind up either with:

  1. if the number of widgets (numWidgets) is less than or equal to the number I want to retain (retentionSize), then I want to keep all the widgets and so I want my deleteList to be empty
  2. else, numWidgets > retentionSize, and so I want my deleteList to only contain the oldest numWidgets - retentionSize (I’ll provide an example below)

Again the algorithm here is:

if numWidgets > retentionSize
  i want oldest (numWidgets - retentionSize)

if numWidgets <= retentionSize
  i want an empty list

Examples:

  1. numWidgets is 83 and retentionSize is 90; since numWidgets <= retentionSize then deleteList is empty
  2. numWidgets is 107 and retentionSize is 90; since numWidgets > retentionSize then deleteList is the oldest numWidgets - retentionSize (107 – 90 = 17) widgets in the list

My best attempt thus far using the Stream API is:

// again I don't control how I get allWidgets
List<Widget> allWidgets = getSomehow();
int numWidgets = allWidgets.size();
List<Widget> oldest = (numWidgets > retentionSize)
  ? allWidgets.sort(Comparator.comparing(widget -> widget.getCreateTime()))
  : Collections.emptyList();

Several problems with this attempt:

  • its a compiler error as allWidgets.sort(...) doesn’t return a list; but even if it did…
  • this sorts the widgets in time ascending order whereas I believe I want descending
  • it doesn’t take the differential (numWidgets - retentionSize) into consideration

Can anyone help nudge me across the finish line here?

Answer

First, you should sort the stream, rather than the list. The first element should be newest.

Then skip the first retentionSize widgets. It doesn’t matter that you skipped more elements than there are in the stream. skip doesn’t care. Now, only the oldest (numWidgets - retentionSize) widgets remain in the stream.

List<Widget> allWidgets = getSomehow();
List<Widget> deleteList = allWidgets.stream()
                          .sorted(Comparator.comparing(Widget::getCreateTime).reversed())
                          .skip(retentionSize)
                          .collect(Collectors.toList());

This will sort the list every time, even if retentionSize is greater than the number of widgets. If this is causing performance problems, you can just add an explicit check for that:

List<Widget> deleteList;
if (numWidgets > retentionSize) {
    deleteList = allWidgets.stream()
                 .sorted(Comparator.comparing(Widget::getCreateTime).reversed())
                 .skip(retentionSize)
                 .collect(Collectors.toList());
} else {
    deleteList = List.of();
}

If you instead want everything but the elements in deleteList (as if you have deleted them), use limit:

List<Widget> allWidgets = getSomehow();
List<Widget> thingsNotInDeleteList = allWidgets.stream()
                          .sorted(Comparator.comparing(Widget::getCreateTime).reversed())
                          .limit(retentionSize)
                          .collect(Collectors.toList());


Source: stackoverflow