Data set got merged incorrectly/randomly?

Question

I have a .csv file that looks something like this:

    123,1-1-2020,[Apple]
    123,1-2-2020,[Apple]
    123,1-2-2020,[Beer]
    345,1-3-2020,[Bacon]
    345,1-4-2020,[Cheese]
    345,1-4-2020,[Sausage]
    345,1-5-2020,[Bacon]
    ...

Accepted Answer

First, you’re using double-brace syntax in the merge() method, which is highly discouraged. See e.g. this answer: What is Double Brace initialization in Java?

Second, your code is iterating the list too much; you can (and should) do it in a single iteration.

Before going on, we need to look at the Receipt class, which is not shown in the question, but we can infer that it looks like this:

    public class Receipt {
        private final String number;
        private final String date;
        private final Set<String> items;

        public Receipt(String number, String date, Set<String> items) {
            this.number = number;
            this.date = date;
            this.items = items;
        }

        public String getNumber() {
            return this.number;
        }

        public String getDate() {
            return this.date;
        }

        public Set<String> getItems() {
            return this.items;
        }
    }

The goal is to merge receipts that share number and date. To do that in a single iteration, we need a Map, keyed by the combination of number and date. There are 3 ways to do that:

1. Implement equals() and hashCode() to be based on only those two fields, ignoring the items field. This is the easiest and simplest solution, but requires modifying the Receipt class, and to define “equality” to not include the item list, which is a debatable decision, so let us not do this.

2. Implement a dedicated class with just the two fields, and implement equals() and hashCode(). This has the advantage of leaving Receipt alone, but does require a new class.

3. Use the Receipt class as the key, but give the Map class a custom Comparator that only compares the two fields. With Java 8, we can do this without creating a new class, so let us try that.
We’ll need to use a TreeMap in order to supply a custom Comparator.

So, we’ll create a TreeMap<Receipt, Set<String>>, using a custom Comparator, iterate the list once, merging the items as we go, then finally iterate the map to build the new list of merged receipts.

    protected static List<Receipt> convert(List<Receipt> list) {
        // The comparator looks only at date and number, so receipts that
        // share both are treated as the same key.
        Map<Receipt, Set<String>> map = new TreeMap<>(Comparator.comparing(Receipt::getDate)
                                                                .thenComparing(Receipt::getNumber));
        for (Receipt receipt : list) {
            map.merge(receipt, receipt.getItems(), (v1, v2) -> {
                // Combine into a fresh set instead of mutating either input.
                Set<String> newSet = new TreeSet<>(v1);
                newSet.addAll(v2);
                return newSet;
            });
        }
        List<Receipt> result = new ArrayList<>();
        for (Entry<Receipt, Set<String>> entry : map.entrySet()) {
            result.add(new Receipt(entry.getKey().getNumber(),
                                   entry.getKey().getDate(),
                                   entry.getValue()));
        }
        return result;
    }

It can of course also be done with Stream logic:

    protected static List<Receipt> convert(List<Receipt> list) {
        return list.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        () -> new TreeMap<>(Comparator.comparing(Receipt::getDate)
                                                      .thenComparing(Receipt::getNumber)),
                        Collectors.flatMapping(r -> r.getItems().stream(),
                                               Collectors.toCollection(TreeSet::new))))
                .entrySet().stream()
                .map(e -> new Receipt(e.getKey().getNumber(),
                                      e.getKey().getDate(),
                                      e.getValue()))
                .collect(Collectors.toList());
    }

Test

Uses the Java 9 of() methods.

    List<Receipt> list = List.of(
            new Receipt("123", "1-1-2020", Set.of("Apple")),
            new Receipt("123", "1-2-2020", Set.of("Apple")),
            new Receipt("123", "1-2-2020", Set.of("Beer")),
            new Receipt("345", "1-3-2020", Set.of("Bacon")),
            new Receipt("345", "1-4-2020", Set.of("Cheese")),
            new Receipt("345", "1-4-2020", Set.of("Sausage")),
            new Receipt("345", "1-5-2020", Set.of("Bacon")));
    List<Receipt> converted = convert(list);
    converted.forEach(r -> System.out.println(r.getNumber() + "," + r.getDate() + "," +
                                              r.getItems().size() + "," + r.getItems()));

Output

    123,1-1-2020,1,[Apple]
    123,1-2-2020,2,[Apple, Beer]
    345,1-3-2020,1,[Bacon]
    345,1-4-2020,2,[Cheese, Sausage]
    345,1-5-2020,1,[Bacon]
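
For comparison, here is a minimal sketch of the second option listed above (a dedicated key class), which this answer does not pursue. The names ReceiptKey and ReceiptKeyDemo are hypothetical, the nested Receipt is only a stand-in for the class inferred earlier, and a LinkedHashMap is an assumption made here so that merged receipts keep the order in which each number/date pair first appears. A plain HashMap would likely do if order does not matter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

class ReceiptKeyDemo {

    // Stand-in for the Receipt class inferred in the answer.
    static final class Receipt {
        private final String number;
        private final String date;
        private final Set<String> items;

        Receipt(String number, String date, Set<String> items) {
            this.number = number;
            this.date = date;
            this.items = items;
        }

        String getNumber() { return number; }
        String getDate() { return date; }
        Set<String> getItems() { return items; }
    }

    // Hypothetical key class: equality covers number and date only.
    static final class ReceiptKey {
        final String number;
        final String date;

        ReceiptKey(String number, String date) {
            this.number = number;
            this.date = date;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ReceiptKey)) return false;
            ReceiptKey other = (ReceiptKey) o;
            return number.equals(other.number) && date.equals(other.date);
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, date);
        }
    }

    static List<Receipt> convert(List<Receipt> list) {
        // Single pass: group items under their (number, date) key.
        Map<ReceiptKey, Set<String>> map = new LinkedHashMap<>();
        for (Receipt r : list) {
            map.computeIfAbsent(new ReceiptKey(r.getNumber(), r.getDate()),
                                k -> new TreeSet<>())
               .addAll(r.getItems());
        }
        // Rebuild one Receipt per key from the merged item sets.
        List<Receipt> result = new ArrayList<>();
        for (Map.Entry<ReceiptKey, Set<String>> e : map.entrySet()) {
            result.add(new Receipt(e.getKey().number, e.getKey().date, e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Receipt> list = List.of(
                new Receipt("123", "1-1-2020", Set.of("Apple")),
                new Receipt("123", "1-2-2020", Set.of("Apple")),
                new Receipt("123", "1-2-2020", Set.of("Beer")),
                new Receipt("345", "1-3-2020", Set.of("Bacon")));
        for (Receipt r : convert(list)) {
            System.out.println(r.getNumber() + "," + r.getDate() + "," + r.getItems());
        }
    }
}
```

Unlike the TreeMap approach, the key here follows the usual equals()/hashCode() contract, so it does not depend on a comparator that disagrees with Receipt's own notion of equality. The cost is the extra class.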

Answer