How can I use Java Stream to find the average of all values that share a key?

I’m having a lot of trouble trying to average the values of a map in Java. My method takes in a text file and should compute the average length of the words starting with each letter (case insensitive), going through all the words in the text file.

For example, let’s say I have a text file that contains the following:

"Apple arrow are very common Because bees behave Cant you come home"

My method currently returns:

{A=5, a=8, B=7, b=10, c=10, C=5, v=4, h=4, y=3}

This happens because it looks at the first letter of each word and combines the word lengths, but it is still case sensitive.

It should return:

{a=4.3, b=5.5, c=5.0, v=4.0, h=4.0, y=3}

This is what I have so far.

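// Needs java.io.*, java.util.* and java.util.stream.Collectors.
public static Map<String, Integer> findAverageLength(String filename) {
    Map<String, Integer> wordcount = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    try {
        Scanner in = new Scanner(new File(filename));
        List<String> wordList = new ArrayList<>();
        while (in.hasNext()) {
            wordList.add(in.next());
        }
        // Sums the word lengths per first letter; the key is still case sensitive.
        wordcount = wordList.stream()
                .collect(Collectors.toMap(w -> w.substring(0, 1), w -> w.length(), Integer::sum));
    } catch (IOException e) {
        System.out.println("File: " + filename + " not found");
    }
    return wordcount;
}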


You are almost there.

You could try the following.

  • We group by the first character of the word, converted to lowercase. This lets us collect into a Map<Character, …>, where the key is the first letter of each word. A typical map entry would then look like

    a = [ Apple, arrow, are ]
  • Then, the average of each group of word lengths is calculated, using the averagingDouble method. A typical map entry would then look like

    a = 4.33333333

Here is the code:

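// groupingBy and averagingDouble are static imports from
// java.util.stream.Collectors; text is assumed to hold the file's contents
Map<Character, Double> map = Arrays.stream(text.split(" "))
    .collect(groupingBy(word -> Character.toLowerCase(word.charAt(0)),
        averagingDouble(String::length)));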
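
To make the two steps from the list above easier to follow, here is a minimal sketch that runs them separately on the sample sentence from the question. The class name GroupingSteps and both method names are made up for illustration, and every word is assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
public class GroupingSteps {

    // Step 1: group the words by their lowercased first letter.
    // Assumes every word is non-empty, otherwise charAt(0) throws.
    public static Map<Character, List<String>> groupByFirstLetter(String text) {
        return Arrays.stream(text.split(" "))
                .collect(Collectors.groupingBy(
                        word -> Character.toLowerCase(word.charAt(0)),
                        TreeMap::new,
                        Collectors.toList()));
    }

    // Step 2: replace each group of words with the average of their lengths.
    public static Map<Character, Double> averageLengths(Map<Character, List<String>> groups) {
        Map<Character, Double> averages = new TreeMap<>();
        groups.forEach((letter, words) -> averages.put(letter,
                words.stream().mapToInt(String::length).average().orElse(0.0)));
        return averages;
    }

    public static void main(String[] args) {
        String text = "Apple arrow are very common Because bees behave Cant you come home";
        Map<Character, List<String>> groups = groupByFirstLetter(text);
        System.out.println(groups.get('a')); // prints [Apple, arrow, are]
        System.out.println(averageLengths(groups));
    }
}
```

Passing averagingDouble as the downstream collector of groupingBy, as in the short version, does both steps in one pass and avoids building the intermediate lists.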

Note that, for brevity, I left out additional things like null checks, empty strings and Locales.
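
As a rough sketch of how some of those omissions could be filled in, the version below reads the words from a file, splits on any run of whitespace, and skips empty tokens. The class name AverageWordLength and the helper averageByFirstLetter are hypothetical, and the file is assumed to be UTF-8 text. Character.toLowerCase(char) takes no Locale, so locale-specific case rules are still not handled here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class, shown only to illustrate the idea.
public class AverageWordLength {

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Averages the word length per lowercased first letter for any stream of lines.
    public static Map<Character, Double> averageByFirstLetter(Stream<String> lines) {
        return lines
                .flatMap(WHITESPACE::splitAsStream)
                .filter(word -> !word.isEmpty()) // blank lines and leading whitespace yield empty tokens
                .collect(Collectors.groupingBy(
                        word -> Character.toLowerCase(word.charAt(0)),
                        TreeMap::new,
                        Collectors.averagingDouble(String::length)));
    }

    // Reads the file line by line; try-with-resources closes the underlying file.
    public static Map<Character, Double> findAverageLength(String filename) throws IOException {
        try (Stream<String> lines = Files.lines(Paths.get(filename))) {
            return averageByFirstLetter(lines);
        }
    }
}
```

Letting the IOException propagate, instead of printing a message and returning an empty map, leaves it to the caller to decide what a missing file should mean.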

Also note that this code was heavily improved in response to the comments of Olivier Grégoire and Holger below.

Source: stackoverflow