How can I use Java Stream to find the average of all values that share a key?




I’m having a lot of trouble averaging the values of a map in Java. My method takes in a text file and should find the average length of the words starting with each letter (case insensitive), going through all words in the text file.

For example, let’s say I have a text file that contains the following:

"Apple arrow are very common Because bees behave Cant you come home"

My method currently returns:

{A=5, a=8, B=7, b=10, c=10, C=5, v=4, h=4, y=3}

It groups the words by their first letter, but it adds up the word lengths instead of averaging them, and it is still case sensitive.

It should return:

{a=4.3, b=5.5, c=5.0, v=4.0, h=4.0, y=3}

This is what I have so far.

public static Map<String, Integer> findAverageLength(String filename) {

    Map<String, Integer> wordcount = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    try
    {
        Scanner in = new Scanner(new File(filename));
        List<String> wordList = new ArrayList<>();
        while (in.hasNext())
        {
            wordList.add(in.next());
        }

        wordcount = wordList.stream().collect(Collectors.toConcurrentMap(w -> w.substring(0, 1), w -> w.length(), Integer::sum));
        System.out.println(wordcount);
    }
    catch (IOException e)
    {
        System.out.println("File: " + filename + " not found");
    }

    return wordcount;
}

Answer

You are almost there.

You could try the following.

  • We group by the first character of the word, converted to lowercase. This lets us collect into a Map<Character, …>, where the key is the first letter of each word. A typical map entry would then look like

    a = [ Apple, arrow, are ]
    
  • Then, the average of each group of word lengths is calculated, using the averagingDouble method. A typical map entry would then look like

    a = 4.33333333
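
To make the two steps above concrete, here is a minimal sketch that performs them separately, so the intermediate grouping can be inspected. The class name GroupThenAverage and both method names are made up for illustration, and the TreeMap is only there to get sorted keys; it assumes every word is non-empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupThenAverage {

    // Step 1: group the words by their lowercased first letter.
    // Assumes no word is empty (charAt(0) would throw otherwise).
    static Map<Character, List<String>> groupByFirstLetter(String text) {
        return Arrays.stream(text.split(" "))
            .collect(Collectors.groupingBy(
                word -> Character.toLowerCase(word.charAt(0)),
                TreeMap::new,
                Collectors.toList()));
    }

    // Step 2: replace each group of words with the average of their lengths.
    static Map<Character, Double> averageLengths(Map<Character, List<String>> groups) {
        Map<Character, Double> averages = new TreeMap<>();
        groups.forEach((letter, words) -> averages.put(letter,
            words.stream().mapToInt(String::length).average().orElse(0.0)));
        return averages;
    }

    public static void main(String[] args) {
        Map<Character, List<String>> groups = groupByFirstLetter("Apple arrow are very");
        System.out.println(groups);   // prints {a=[Apple, arrow, are], v=[very]}
        System.out.println(averageLengths(groups));
    }
}
```

Passing a downstream collector to groupingBy, as in the answer's own code, folds both steps into one pass and never builds the intermediate lists.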
    

Here is the code:

// groupingBy and averagingDouble are static imports from
// java.util.stream.Collectors; str is the text to analyze,
// for example the contents of the file
Map<Character, Double> map = Arrays.stream(str.split(" "))
    .collect(groupingBy(word -> Character.toLowerCase(word.charAt(0)),
        averagingDouble(String::length)));

Note that, for brevity, I left out additional things like null checks, empty strings and Locales.
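
For completeness, here is one way the snippet could be wrapped into a self-contained method that also skips empty tokens. Treat it as a sketch under a few assumptions rather than a definitive version: the class name AverageWordLength and the helper averageLengthByFirstLetter are made up, the file is assumed to be UTF-8, averagingInt is swapped in for averagingDouble because String::length returns an int, and a TreeMap keeps the keys sorted as in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class AverageWordLength {

    // Averages word lengths per first letter, ignoring case.
    static Map<Character, Double> averageLengthByFirstLetter(String text) {
        return Arrays.stream(text.split("\\s+"))    // split on any run of whitespace
            .filter(word -> !word.isEmpty())         // leading whitespace yields one empty token
            .collect(Collectors.groupingBy(
                word -> Character.toLowerCase(word.charAt(0)),
                TreeMap::new,                        // sorted keys
                Collectors.averagingInt(String::length)));
    }

    // Hypothetical file-based wrapper with the same name as the question's method.
    // Assumes the file is UTF-8 encoded; the caller handles the IOException.
    static Map<Character, Double> findAverageLength(String filename) throws IOException {
        String text = new String(Files.readAllBytes(Paths.get(filename)), StandardCharsets.UTF_8);
        return averageLengthByFirstLetter(text);
    }

    public static void main(String[] args) {
        String sample = "Apple arrow are very common Because bees behave Cant you come home";
        System.out.println(averageLengthByFirstLetter(sample));
    }
}
```

Note that the return type becomes Map<Character, Double> rather than the Map<String, Integer> in the question, since an average is generally not a whole number.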

Also note that this code was heavily improved in response to comments from Olivier Grégoire and Holger.



Source: stackoverflow