
How can I find the number of lines that contain a certain word in Java using Java Streams?

My method reads a text file, looks for the word “the” in each line, and counts how many lines contain it. The method works, but I only want lines that contain the word by itself, not lines where it appears only as a substring of a longer word.

For example, I wouldn’t want “therefore” to count: it contains “the”, but not as a word by itself.

I’m trying to find a way to limit the lines to those that contain “the” and have the length of the word be exactly 3 but I’m unable to do that.

Here is my method right now:

JavaScript

For example, if a text file contains these lines:

JavaScript

The method would return 4


Answer

Use regex to enforce word boundaries:

JavaScript

or for the general case:

JavaScript

Details:

  • \b means “word boundary” (written "\\b" inside a Java string literal)
  • (?i) means “ignore case”

Using word boundaries prevents "Therefore" from matching.
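As a rough illustration of the word-boundary approach described above, here is a minimal sketch. The class name WordLineCounter, the helper countLinesWithWord, and the sample lines are made up for this example and are not taken from the original question or answer; the file-reading overload assumes the text file can be read with Files.lines.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import java.util.stream.Stream;

class WordLineCounter {

    // Hypothetical helper: counts the lines that contain `word` as a whole word,
    // ignoring case. Pattern.quote keeps regex metacharacters in `word` literal.
    static long countLinesWithWord(Stream<String> lines, String word) {
        // (?i) = ignore case, \b = word boundary; the leading and trailing .*
        // are needed because String#matches must match the entire line.
        String regex = "(?i).*\\b" + Pattern.quote(word) + "\\b.*";
        return lines.filter(line -> line.matches(regex)).count();
    }

    // Same count, reading the lines from a file (the path is supplied by the caller).
    static long countLinesWithWord(Path file, String word) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return countLinesWithWord(lines, word);
        }
    }

    public static void main(String[] args) {
        // Illustrative sample data, not the lines from the original question.
        Stream<String> sample = Stream.of(
                "The cat sat down",        // counted: "The" is a whole word
                "Therefore it rained",     // not counted: "the" is only a prefix
                "I saw the dog",           // counted
                "Another line");           // not counted: "the" is inside "Another"
        System.out.println(countLinesWithWord(sample, "the")); // prints 2
    }
}
```

The try-with-resources block matters for the file version because the stream returned by Files.lines holds the file open until it is closed.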

Note that in Java, unlike many other languages, String#matches() must match the entire string (not just find a match within the string) to return true, hence the .* at either end of the regex.
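To make the difference between matching the whole string and finding a match inside it concrete, here is a small sketch. The class name MatchesVsFind and its two helper methods are hypothetical names chosen for this example; Matcher#find() is shown as an alternative that does not need the surrounding .* and is not part of the original answer.

```java
import java.util.regex.Pattern;

class MatchesVsFind {

    // Compiled once and reused; CASE_INSENSITIVE is equivalent to the inline (?i) flag.
    static final Pattern THE = Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

    // Matcher#find() looks for the pattern anywhere in the line.
    static boolean containsWordWithFind(String line) {
        return THE.matcher(line).find();
    }

    // String#matches() without the surrounding .* requires the whole line to be "the".
    static boolean wholeStringMatches(String line) {
        return line.matches("(?i)\\bthe\\b");
    }

    public static void main(String[] args) {
        String line = "Over the hill";
        System.out.println(wholeStringMatches(line));   // prints false
        System.out.println(containsWordWithFind(line)); // prints true
    }
}
```

In a stream pipeline, a filter such as line -> THE.matcher(line).find() could stand in for the matches() call, which also avoids recompiling the regex for every line.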

User contributions licensed under: CC BY-SA