There is a list containing lines of the following form:
1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg ::: 2021-09-17T17:07:52Z
How do I remove duplicate lines while ignoring the date at the end of each line, that is, this part:
::: 2021-09-17T17:07:52Z
Only the first part of the string before the date is important:
1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg
Answer
This should work:
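import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String[] input = {
            "1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg ::: 2021-09-17T17:07:52Z",
            "1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d4.jpg ::: 2021-09-17T17:07:52Z",
            "1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg ::: 2021-09-17T17:07:00Z"
        };
        HashMap<String, String> outMap = new HashMap<>();
        List<String> keys = new LinkedList<>();
        for (String line : input) {
            // Key is everything before ":::" (assumes every line contains ":::")
            String key = line.substring(0, line.indexOf(":::"));
            // putIfAbsent returns null only when the key was not yet in the map
            String oldVal = outMap.putIfAbsent(key, line);
            if (oldVal == null) {
                keys.add(key); // remember first-seen order
            }
        }
        // Look up the saved lines in the order their keys first appeared
        List<String> collect = keys.stream()
                .map(key -> outMap.get(key))
                .collect(Collectors.toList());
        collect.forEach(System.out::println);
    }
}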
For each line, the part before the ::: is treated as a key. The HashMap remembers whether a line with that key has already been encountered: putIfAbsent stores only the first occurrence and returns null when the key was new.
A HashMap has one drawback, though: it does not preserve the order in which entries were inserted. To keep the output in input order, each key is added to the List<String> keys the first time it is seen.
This code prints:
1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg ::: 2021-09-17T17:07:52Z
1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d4.jpg ::: 2021-09-17T17:07:52Z
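As a possible alternative to the HashMap plus separate key list, a LinkedHashMap iterates in insertion order, so it can likely do both jobs at once. The following is only a sketch: the class name DedupIgnoringDate and the helper dedupe are hypothetical, and it assumes that a line without ":::" should be used as its own key and that whitespace around the path is not significant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupIgnoringDate {

    // Hypothetical helper: keeps the first line seen for each path, in input order.
    static List<String> dedupe(List<String> lines) {
        // LinkedHashMap iterates in insertion order, so no separate key list is needed.
        Map<String, String> firstByKey = new LinkedHashMap<>();
        for (String line : lines) {
            int sep = line.indexOf(":::");
            // Assumption: a line without ":::" is used as its own key.
            String key = (sep >= 0 ? line.substring(0, sep) : line).trim();
            // Stores the line only if this key has not been seen before.
            firstByKey.putIfAbsent(key, line);
        }
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg ::: 2021-09-17T17:07:52Z",
            "1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d4.jpg ::: 2021-09-17T17:07:52Z",
            "1/ce/a6/5a/1cea65ab9260df8d55fb29ce0df570d3.jpg ::: 2021-09-17T17:07:00Z");
        dedupe(input).forEach(System.out::println);
    }
}
```

With the same three input lines, this should print the same two lines as the original code. The guard on indexOf also avoids the exception that substring would throw for a line that has no ":::" separator.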