
How to count the characters in a String that matched a given Regex

Given an arbitrary String that contains zero or more substrings matching a regular expression, how can I count the number of characters in that String that are part of those matching substrings?

Example:

Given a regex that matches any email address and the String:

"I have two email addresses: email@gmail.com and email@hotmail.com"

This would return the int value of 32 (the number of characters in "email@gmail.com" plus "email@hotmail.com").

I’m not being clear enough, it seems. Let’s pretend you want to set a limit to the number of characters in a tweet, but you want to allow people to include their email address in the tweet and have it count as zero characters.

Possible method signature of solution:

public int lengthOfSubStringsMatchingRegex(String input, String regex)

Answer

Just loop over the matches of your regex with Matcher.find(), and call length() on each matched substring (m.group()) to get its number of characters. Add them to your counter, and that’s it.

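import java.util.regex.Matcher;
import java.util.regex.Pattern;

public int lengthOfSubStringsMatchingRegex(String input, String regex) {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(input);

    // Sum the length of every non-overlapping match found in the input.
    int count = 0;
    while (m.find()) {
        count += m.group().length();
    }

    return count;
}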
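
To try the method on the example from the question, here is a minimal, self-contained sketch. The class name MatchLengthDemo and the SIMPLE_EMAIL pattern are assumptions made for illustration; the pattern is deliberately simplified and is not a real email validator (for instance, it would also swallow a period that directly follows an address).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchLengthDemo {
    // Hypothetical, deliberately simplified email pattern (an assumption, not a full validator).
    static final String SIMPLE_EMAIL = "[\\w.+-]+@[\\w-]+\\.[\\w.]+";

    // Same logic as the method above, made static so it can be called from main.
    static int lengthOfSubStringsMatchingRegex(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count += m.group().length();
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "I have two email addresses: email@gmail.com and email@hotmail.com";
        // 15 characters for the first address plus 17 for the second.
        System.out.println(lengthOfSubStringsMatchingRegex(input, SIMPLE_EMAIL)); // prints 32
    }
}
```

Note that length() counts UTF-16 code units, so characters outside the Basic Multilingual Plane count as two.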

As an alternative that is slightly less readable, you can use the match offsets directly:

count += m.end() - m.start();

start() returns the start index of the previous match.
end() returns the offset after the last character matched.
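
For the tweet scenario in the question, the offset-based count can be subtracted from the total length so that matched substrings count as zero characters. The following is only a sketch under stated assumptions: the TweetLength class, the effectiveLength helper, the simplified email pattern, and the 140-character limit are all hypothetical and not part of the original question or answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TweetLength {
    // Hypothetical helper: length of the input with every exempt match counted as zero characters.
    static int effectiveLength(String input, Pattern exempt) {
        Matcher m = exempt.matcher(input);
        int matched = 0;
        while (m.find()) {
            // end() is exclusive, so end() - start() is the length of the current match.
            matched += m.end() - m.start();
        }
        return input.length() - matched;
    }

    public static void main(String[] args) {
        // Assumed, simplified email pattern; not a full validator.
        Pattern simpleEmail = Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");
        String tweet = "I have two email addresses: email@gmail.com and email@hotmail.com";
        int limit = 140; // assumed limit, for illustration only

        int effective = effectiveLength(tweet, simpleEmail);
        System.out.println("Effective length: " + effective);
        System.out.println("Within limit: " + (effective <= limit));
    }
}
```

Compiling the Pattern once and passing it in avoids recompiling the regex for every tweet that is checked.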

User contributions licensed under: CC BY-SA